  1. May 2024
    1. Reviewer #2 (Public Review):

      Summary:

      This paper derives the first three functional gradients in the left and right hippocampus across two datasets. These gradient maps are then compared to dopamine receptor maps obtained with PET, associated with age, and linked to memory. Results reveal links between dopamine maps and gradient 2, age with gradients 1 and 2, and memory performance.

      Strengths:

      This paper investigates how hippocampal gradients relate to aging, memory, and dopamine receptors, which are interesting and important questions. A strength of the paper is that some of the findings were replicated in a separate sample.

      Weaknesses

      The paper would benefit from added clarification on the number of models/comparisons for each test. Furthermore, it would be helpful to clarify whether or not multiple comparison correction was performed and - if so - what type or - if not - to provide a justification. The manuscript would furthermore benefit from code sharing and clarifying which results did/did not replicate.
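
      As an illustration of what such a correction could look like, here is a minimal sketch of the Benjamini-Hochberg false discovery rate procedure. The p-values are hypothetical and the choice of Benjamini-Hochberg is an assumption made for the example; it is not taken from the manuscript.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values, e.g. one per gradient-by-variable model.
p_values = [0.001, 0.008, 0.039, 0.041, 0.20, 0.74]
q_values = benjamini_hochberg(p_values)
significant = [q <= 0.05 for q in q_values]

for p, q, s in zip(p_values, q_values, significant):
    print(f"p={p:.3f}  q={q:.4f}  significant={s}")
```

      In this toy case, two of the four tests with an uncorrected p below 0.05 no longer pass after correction, which is why reporting the number of comparisons matters.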

    2. Reviewer #3 (Public Review):

      Summary:

      In this study, the authors analyzed the complex functional organization of the hippocampus using two separate adult lifespan datasets. They investigated how individual variations in the detailed connectivity patterns within the hippocampus relate to behavioral and molecular traits. The findings confirm three overlapping hippocampal gradients and reveal that each is linked to established functional patterns in the cortex, the arrangement of dopamine receptors within the hippocampus, and differences in memory abilities among individuals. By employing multivariate data analysis techniques, they identified older adults who display a hippocampal gradient pattern resembling that of younger individuals and exhibit better memory performance compared to their age-matched peers. This underscores the behavioral importance of maintaining a specific functional organization within the hippocampus as people age.

      Strengths:

      The evidence supporting the conclusions is overall compelling, based on a unique dataset, rich set of carefully unpacked results, and an in-depth data analysis. Possible confounds are carefully considered and ruled out.

      Weaknesses:

      No major weaknesses. The transparency of the statistical analyses could be improved by explicitly (1) stating what tests and corrections (if any) were performed, and (2) justifying the chosen statistical approaches. Further, some of the findings related to the DA markers are borderline statistically significant and therefore perhaps less compelling. However, they line up nicely with results obtained in experimental animals, and I expect the small effect sizes to be largely related to the quality and specificity of the PET data rather than to the derived functional connectivity gradients.

    1. eLife assessment

      This study offers valuable insight into the remarkable resistance of tardigrades to ionizing radiation by showing that radiation treatment induces a suite of DNA repair proteins and by identifying a strongly induced tardigrade-specific DNA-binding protein that can reduce the number of double-strand breaks in human U2OS cells. The evidence of upregulation of repair proteins is convincing, and the case for a role of the newly identified protein in repair can be strengthened as genetic tools for tardigrades become better developed. The results will interest the fields of DNA repair and radiobiology as well as tardigrade biologists.

    2. Reviewer #3 (Public Review):

      Summary:

      This paper describes transcriptomes from three tardigrade species with or without treatment with ionizing radiation (IR). The authors show that IR produces numerous single-strand and double-strand breaks, as expected, and that these are substantially repaired within 4-8 hours. Treatment with IR induces strong upregulation of transcripts from numerous DNA repair proteins, and from the newly described protein TDR1, which has homologs in both the Hypsibioidea and Macrobiotoidea superfamilies. The authors show that TDR1 transcription produces newly translated TDR1 protein, which can bind DNA and co-localizes with DNA in the nucleus. At higher concentrations, TDR1 appears to form aggregates with DNA, which might be relevant to a possible function in DNA damage repair. When introduced into human U2OS cells treated with the radiomimetic drug bleomycin, TDR1 reduces the number of double-strand breaks as detected by gamma-H2AX spots. This paper will be of interest to the DNA repair field and to radiobiologists.

      Strengths:

      The paper is well-written and provides solid evidence of the upregulation of DNA repair enzymes after irradiation of tardigrades, as well as upregulation of the TDR1 protein. The reduction of gamma-H2AX spots in U2OS cells after expression of TDR1 supports a role in the DNA damage response.

      Weaknesses:

      Genetic tools are still being developed in tardigrades, so there is no mutant phenotype to support a DNA repair function for TDR1, but this may be available soon.

    3. Reviewer #4 (Public Review):

      In this study, Anoud et al. show convincing results on genes involved in the radio-resistance of tardigrades. With transcriptomics, they found many genes involved in DNA repair pathways to be overexpressed after ionizing radiation. In addition, they found RNF146, coding for a ubiquitin ligase, and genes of the AMNP family. Finally, they more deeply characterized one upregulated gene that they named TDR1 (Tardigrade DNA damage Response 1), which seems specific to tardigrades. With proteomics they verified these results. They show that TDR1 binds DNA in vitro and co-localizes with DNA in tardigrades. Because of the difficulties of carrying out reverse genetics in tardigrades, the authors showed that human cells expressing TDR1 had a reduced number of phospho-H2AX foci (indicating DNA damage) when treated with bleomycin. Based on these results, the authors suggested that TDR1 interacts with DNA and might regulate chromosomal organization and favor DNA repair.

      Strengths:

      The paper provides solid evidence of the upregulation of DNA repair enzymes after irradiation of tardigrades, as well as upregulation of the TDR1 protein.

      The reduction of gamma-H2AX spots in U2OS cells after expression of TDR1 supports a role in the DNA damage response.

      The shown interaction of TDR1 with DNA.

      Weaknesses:

      No reverse genetics to support a DNA repair function for TDR1, even if I recognize that these remain difficult to carry out in tardigrades.

      No pulsed-field gel electrophoresis to show DNA damage in tardigrades, which apparently remains challenging to perform in these animals.

      After revision, the manuscript gained in structure, and in precision.

      Overall, the manuscript provides valuable and convincing results contributing to our knowledge of tardigrade radio-resistance. While reverse genetics remains difficult to carry out in tardigrades, the authors used the alternative approach of investigating TDR1 function in vitro in human cells.

      This study illustrates integrative biology, as it combines a set of different methodologies including next-generation sequencing, transcriptomic and proteomic analyses, immunohistochemistry, immunolabelling, in vitro assays and SEM. In my view, the quality and importance of the results make it of interest to the fields of DNA repair, radiobiology, and radio-resistance.

    1. eLife assessment

      The manuscript presents a machine-learning method to predict protein hotspot residues. The validation is incomplete, and the comparison with other current methods such as FTMap rests on a misinterpretation of their results.

    2. Reviewer #1 (Public Review):

      Summary:

      The paper describes a program developed to identify PPI-hot spots using the free protein structure and compares it to FTMap and SPOTONE, two webservers that they consider as competitive approaches to the problem. On the positive side, I appreciate the effort in providing a new webserver that can be tested by the community but have two major concerns as follows.

      (1) The comparison to the FTMap program is wrong. The authors misinterpret the article they refer to, i.e., Zerbe et al. "Relationship between hot spot residues and ligand binding hot spots in protein-protein interfaces" J. Chem. Inf. Model. 52, 2236-2244, (2012). FTMap identifies hot spots that bind small molecular ligands. The Zerbe et al. article shows that such hot spots tend to interact with hot spot residues on the partner protein in a protein-protein complex (emphasis on "partner"). Thus, the hot spots identified by FTMap are not the hot spots defined by the authors. In fact, because the Zerbe paper considers the partner protein in a complex, the results cannot be compared to the results of Chen et al. This difference is missed by the authors, and hence the comparison to FTMap is invalid. I did not investigate the comparison to SPOTONE, and hence have no opinion.

      (2) Chen et al. use a number of usual features in a variety of simple machine-learning methods to identify hot spot residues. This approach has been used in the literature for more than a decade. Although the authors say that they were able to find only FTMap and SPOTONE as servers, there are dozens of papers that describe such a methodology. Some examples are given here: (Higa and Tozzi, 2009; Keskin, et al., 2005; Lise, et al., 2011; Tuncbag, et al., 2009; Xia, et al., 2010). There are certainly more papers. Thus, while I consider the web server as a potentially useful contribution, the paper does not provide a fundamentally novel approach.

      Higa, R.H. and Tozzi, C.L. Prediction of binding hot spot residues by using structural and evolutionary parameters. Genet Mol Biol 2009;32(3):626-633.

      Keskin, O., Ma, B.Y. and Nussinov, R. Hot regions in protein-protein interactions: The organization and contribution of structurally conserved hot spot residues. J Mol Biol 2005;345(5):1281-1294.

      Lise, S., et al. Predictions of Hot Spot Residues at Protein-Protein Interfaces Using Support Vector Machines. PLoS One 2011;6(2).

      Tuncbag, N., Gursoy, A. and Keskin, O. Identification of computational hot spots in protein interfaces: combining solvent accessibility and inter-residue potentials improves the accuracy. Bioinformatics 2009;25(12):1513-1520.

      Xia, J.F., et al. APIS: accurate prediction of hot spots in protein interfaces by combining protrusion index with solvent accessibility. BMC Bioinformatics 2010;11:174.

      Strengths:

      A new web server was developed for detecting protein-protein interaction hot spots.

      Weaknesses:

      The comparison to FTMap results is wrong. The method is not novel.

    1. eLife assessment

      This is an important study characterizing the unique expression of mouse Schlemm's canal endothelial cells (SECs), which function in the aqueous humor outflow pathway of the eye. The work convincingly identifies novel biomarkers for SECs and molecular markers for inner wall and outer wall SECs, followed by targeted RNA and protein expression validation in mouse eyes. Gene networks and pathways were analyzed for their potential contribution to glaucoma pathogenesis.

    2. Reviewer #1 (Public Review):

      Summary:

      Balasubramanian et al. characterized the cell types comprising mouse Schlemm's canal (SC) using bulk and single-cell RNA sequencing (scRNA-seq). The results identify expression patterns that delineate the SC inner and outer wall cells and two inner wall 'states'. Further analysis demonstrates expression patterns of glaucoma-associated genes and receptor-ligand pairs between SECs and neighboring trabecular meshwork.

      Strengths:

      While mouse SC has been profiled in previous scRNA-seq studies (van Zyl et al 2020, Thomson et al 2021), these data provide higher resolution of SC cell types, particularly endothelial cell (SEC) populations. SC is an important regulator of anterior chamber outflow and has important consequences for glaucoma.

      Weaknesses:

      (1) Since SC has previously been characterized in mouse, human, and other species by scRNA-seq in other studies, this study would benefit from more direct comparisons to published datasets. For example, Table 4 could be expanded to list the SC cell numbers profiled in each study. Expression patterns highlighted in this study could be independently verified by plotting in publicly available mouse SC datasets. Further, a comparison to human expression patterns would assess whether type-specific expression patterns are conserved. Alternatively, an integrated analysis could be performed. Indeed, the authors mention that an integrated analysis was attempted but the data is not shown. It is unclear if this was because of a lack of agreement between datasets or other reasons.

      (2) Figure 1 presents bulk RNA seq results comparing SEC, BEC, and LEC expression patterns. These populations were isolated using cell surface markers and enrichment by FACS. Since each EC population is derived from the same sample, the accuracy of this data hinges on the purity of enrichment. However, a reference is not given for this method and it is not clear how purity was validated. The authors later note that marker Emcn, which was used to identify BECs, is also expressed in SECs and LECs at lower levels. It should be demonstrated that these populations are clearly separated by flow cytometry.

      (3) Bulk RNA-seq analysis infers similarity from the number of DEGs between samples, however, this is not a robust indicator. A correlation analysis should be run to verify conclusions.

      (4) Figures 2-4 present three different datasets targeting the same tissue: 1) C57bl/6j scRNA-seq, 2) C57bl/6j snRNA-seq, 3) 129/sj scRNA-seq. Integrated analysis comparing datasets #1 to #2 and #3 is also presented. Integration methods are not described beyond 'normalization for cell numbers'. It is unclear if additional alignment methods were used. Integration across each of these datasets needs careful consideration, especially since different filtering methods were used (e.g. <20% mito in scRNA-seq and <5% in snRNA-seq). Improper integration could affect the ability to cluster or exaggerate differences between cell types/states. It would be useful to demonstrate the contribution of different samples and datasets to each cell type/state to verify that these are not driven by batch effects, mouse strain, or collection platform.

      (5) IW1 and IW2 are not well separated, and it is unclear if these represent truly different cell states. Figure 5b shows the staining of CCL21A and describes expression in the 'posterior portion' but in the image there are no DAPI+ nuclei in the anterior portion, suggesting the sampling in this section is different from Figure 5a. This would be improved by co-staining NPNT and CCL21A to demonstrate specificity.

      (6) The substructures observed within clusters in sc/snRNA-seq data suggest that overall profiling may still not be comprehensive. This should be noted in the discussion.
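
      The correlation analysis requested in point (3) could be sketched roughly as follows: instead of counting DEGs, compute a pairwise Pearson correlation between the expression profiles of the sorted populations. The expression values below are hypothetical and serve only to show the computation; they are not data from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical log-scaled expression of the same five genes in each population.
profiles = {
    "SEC": [5.1, 2.0, 7.3, 0.5, 3.2],
    "LEC": [4.8, 2.4, 6.9, 0.9, 3.0],
    "BEC": [1.2, 6.5, 2.1, 5.8, 3.1],
}

# Pairwise correlations, keyed by alphabetically ordered population names.
names = sorted(profiles)
corr = {}
for i, a in enumerate(names):
    for b in names[i + 1:]:
        corr[(a, b)] = pearson(profiles[a], profiles[b])

for (a, b), r in corr.items():
    print(f"{a} vs {b}: r = {r:.3f}")
```

      Unlike a DEG count, the coefficient does not depend on a significance threshold or on the statistical power of each comparison, so it gives a more direct measure of how similar two populations are.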

    1. eLife assessment

      This study reports single-cell RNA sequencing results of lung adenocarcinoma, comparing 4 treatment-naive and 5 post-neoadjuvant chemotherapy tumor samples. Of interest is the delineation of two macrophage subtypes: Anti-mac cells (CD45+CD11b+CD86+) and Pro-mac cells (CD45+CD11b+ARG+), with the proportion of Pro-mac/pro-tumorigenic cells significantly increasing in LUAD tissues after neoadjuvant chemotherapy. In terms of significance, the findings might be useful but only if robust statistical comparisons (currently missing) can be provided. As it stands, the level of supportive evidence is inadequate.

    2. Reviewer #1 (Public Review):

      Summary:

      This study reports single-cell RNA sequencing results of lung adenocarcinoma, comparing 4 treatment-naive and 5 post-neoadjuvant chemotherapy tumor samples.

      The authors claim that there is metabolic reprogramming in tumor cells as well as in stromal and immune cells after chemotherapy.

      The most significant findings concern the macrophages: there are more pro-tumorigenic cells, i.e. CD45+CD11b+ARG+ cells, after chemotherapy, whereas more anti-tumorigenic CD45+CD11b+CD86+ macrophages are found in the treatment-naive samples. The authors sorted each population and performed functional analyses.

      Strengths:

      Comparison of the treatment-naive and post-chemotherapy samples of lung adenocarcinoma.

      Weaknesses:

      (1) Lengthy descriptive clustering analysis, with indistinct direct comparisons between the treatment-naive and the post-chemotherapy samples.
      (2) No statistical analysis was performed for the comparison.
      (3) Difficult to match data to the text.
      (4) ARG1 is a cytosolic enzyme that can be detected by intracellular staining after fixation. It is unclear how the staining and sorting was performed to measure function of sorted cells.

    3. Reviewer #2 (Public Review):

      In this study, Huang et al. performed a scRNA-seq analysis of lung adenocarcinoma (LUAD) specimens from 9 human patients, including 5 who received neoadjuvant chemotherapy (NCT), and 4 without treatment (control). The new data was produced using 10x Genomics technology and comprises 83622 cells, of which 50055 and 33567 cells were derived from the NCT and control groups, respectively. Data was processed via the R Seurat package, and various downstream analyses were conducted, including CNV, GSVA, functional enrichment, cell-cell interaction, and pseudotime trajectory analyses. Additionally, the authors performed several experiments for in vitro and in vivo validation of their findings, such as immunohistochemistry, immunofluorescence, flow cytometry, and animal experiments.

      The study extensively discusses the heterogeneity of cell populations in LUAD, comparing the samples with and without chemotherapy. However, there are several shortcomings that diminish the quality of this paper:

      • The number of cells included in the dataset is limited, and the number of patients from different groups is low, which may reduce the attractiveness of the dataset for other researchers to reuse. Additionally, there is no metadata on patients' clinical characteristics, such as age, sex, history of smoking, etc., which would be valuable for future studies.
      • Several crucial details about the data analysis are missing: How many PCs were used for reduction? Which versions of Seurat/inferCNV/other packages were used? Why was monocle2 used and not monocle3 or other packages? Also, the authors use R version 3.6.1, and the current version is 4.3.2.
      • It seems that the authors may lack a fundamental understanding of scRNA-seq data processing and the functions of Seurat. For instance, they state, 'Next, we classified cell types through dimensional reduction and unsupervised clustering via the Seurat package.' However, dimensional reduction and unsupervised clustering are not methods for cell classification. Typically, cell types are classified using marker genes or other established methods.
        "Therefore, to identify subclusters within each of these nine major cell types, we performed principal component analysis" (Line 127). Principal component analysis is a method for dimensionality reduction, not cell clustering.
        The authors did not mention the normalization or scaling of the data, which are crucial steps in scRNA-seq data preprocessing.
      • Numerous style and grammar mistakes are present in the main text. For instance, certain sections of the methods are written in the present tense, suggesting that parts of a protocol were copied without text editing. Furthermore, some sections of the introduction are written in the past tense when the present tense would be more suitable. Clusters are inconsistently referred to by numbers or cell types, leading to confusion. Additionally, the authors frequently use the term "evolution" when describing trajectory analysis, which may not be appropriate. Overall, significant revisions to the main text are required.
      • Some figures are not mentioned in order or are not referenced in the text at all, such as Figure 5l (where it is also unclear how the authors selected the root cells). Additionally, many figures have text that is too small to be read without zooming in. Overall, the quality of the figures is inconsistent and sometimes very poor.
      • At times, the authors' statements are incomplete (e.g., Lines 67-69, Line 177, Line 629, Lines 646-648 and 678).
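
      To make concrete what the normalization and scaling steps mentioned above usually involve, here is a minimal Python sketch of the general idea: per-cell library-size normalization followed by a log transform, then per-gene z-scoring across cells. It is an illustration under stated assumptions, not the Seurat implementation and not the authors' pipeline; the count matrix is hypothetical, and the scale factor of 10,000 is simply a commonly used default.

```python
import math


def log_normalize(counts, scale_factor=10_000):
    """Library-size normalize each cell, then apply log(1 + x).

    counts: list of cells, each a list of raw counts per gene.
    Cells with zero total counts are assumed to have been filtered out.
    """
    normalized = []
    for cell in counts:
        total = sum(cell)
        normalized.append([math.log1p(c / total * scale_factor) for c in cell])
    return normalized


def scale_genes(matrix):
    """Z-score each gene (column) across cells; constant genes are set to 0."""
    n_cells, n_genes = len(matrix), len(matrix[0])
    scaled = [[0.0] * n_genes for _ in range(n_cells)]
    for g in range(n_genes):
        column = [row[g] for row in matrix]
        mean = sum(column) / n_cells
        sd = math.sqrt(sum((v - mean) ** 2 for v in column) / (n_cells - 1))
        for c in range(n_cells):
            scaled[c][g] = (column[c] - mean) / sd if sd > 0 else 0.0
    return scaled


# Hypothetical raw counts: 3 cells x 3 genes.
raw_counts = [
    [10, 0, 90],
    [5, 5, 40],
    [0, 20, 180],
]
normalized = log_normalize(raw_counts)
scaled = scale_genes(normalized)
```

      Only after these steps would dimensionality reduction (e.g. PCA) and clustering normally be run, with cell types then assigned from marker genes.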

      The results section lacks clarity on several points:
      • The authors state that "myofibroblasts exclusively originated from the control group". However, pathways up-regulated in myofibroblasts (such as glycolysis) were enhanced after chemotherapy, as indicated by GSVA score. Similarly, why are some clusters of TAMs from the control group associated with pathways enriched in chemotherapy group?
      • Further explanation is necessary regarding the distinctions between malignant and non-malignant cells, as well as regarding the upregulation of metabolism-related pathways in fibroblasts from the NCT group. Additionally, clarification is needed regarding why certain TAMs from the control group are associated with pathways enriched in the chemotherapy group.
      • In the section titled 'Chemo-driven Pro-mac and Anti-mac Metabolic Reprogramming Exerted Diametrically Opposite Effects on Tumor Cells': The markers selected to characterize the anti- and pro-macrophages are commonly employed for describing M1 or M2 polarization. It is uncertain whether this new classification into anti- and pro-macrophages is necessary. Additionally, it should be noted that pro-macrophages are anti-inflammatory, while anti-macrophages are pro-inflammatory, which could lead to confusion. M2 macrophages are already recognized for their role in stimulating tumor relapse after chemotherapy.
      • The authors suggest that there is "reprogramming of CD8+ cytotoxic cells" following chemotherapy (Line 409). It remains unclear whether they imply the reprogramming of other CD8+ T cells into cytotoxic cells. While it is indicated that cytotoxic cells from the control group differ from those in the NCT group and that NCT cytotoxic T cells exhibit higher cytotoxicity, the authors did not assess the expression of NK and NK-like T cell markers (aside from NKG7), which may possess greater cytotoxic potential than CD8+ cytotoxic cells. This could also elucidate why cytotoxic cells from the NCT and control groups are positioned on separate branches in trajectory analysis. Overall, with 22.5k T cells in the dataset, only 3 subtypes were identified, suggesting a need for improved cell annotations by the authors.

    1. eLife assessment

      The paper uses published data and a proposed cell-based model to understand how growth and death mechanisms lead to the observed data. This work provides an important insight into the early stages of tumour development. From the work provided here, the results are solid, showing a thorough analysis. However, the work has not fully specified the model, which can lead to some questions around the model's suitability.

    2. Reviewer #2 (Public Review):

      Summary:

      The article uses a cell-based model to investigate how mutations and cells spread throughout a tumour. The paper uses published data and the proposed model to understand how growth and death mechanisms lead to the observed data. This work provides an insight into the early stages of tumour development. From the work provided here, the results are solid, showing a thorough analysis. However, the work has not fully specified the model, which can lead to some questions around the model's suitability. The article is well-written and presents a very suitable and rigorous analysis to describe the data. The authors did a particularly nice job of discussing and choosing their "metrics of interest", though this is not the main aim of this work.

      Strengths:

      Due to the particularly nice and tractable cell-based model, the authors are able to perform a thorough analysis to compare the published data to that simulated with their model. They then used their computational model to investigate different growth mechanisms of volume growth and surface growth. With this approach, the authors are able to compare the metric of interest (here, the direction angle of a new mutant clone, the dispersion of mutants throughout the tumour) to quantify how the different growth models compare to the observed data. The authors have also used inference methods to identify model parameters based on the data observed. The authors performed a rigorous analysis and have chosen the metrics in an appropriate manner to compare the different growth mechanisms.

      Weaknesses:

      The work contained within this article considers a single cell-based model. While ideally, this is sufficient, results from simulated multi-cellular systems can often be sensitive to the model choice. Performing this work with various other standard models would strengthen the results significantly. This is, however, not an easy task.

      Context:

      Improved mechanistic understanding of the early developmental stages of tumours will further assist in disease treatment and quantification. Understanding how readily and quickly a tumour is evolving is key to understanding how it will develop and progress. This work provides a solid example as to how this can be achieved with data alongside simulated models.

    1. eLife assessment

      This useful study reports on the impact of antibiotic pressure on the genomic stability of the mc2155 strain of Mycobacterium smegmatis, a model for Mycobacterium tuberculosis. The study concludes that exposure to antibiotics did not lead to the emergence of new adaptive mutations in laboratory settings, contradicting the prevailing theory of antibiotic resistance development through drug-induced microevolution. While the genomic analysis provided detailed insights into the stability of M. smegmatis following exposure to standard TB treatment antibiotics, the evidence presented for antibiotic pressure not contributing to the occurrence of new adaptive mutations is still incomplete.

    2. Reviewer #1 (Public Review):

      In this manuscript, Molnar, Suranyi and colleagues have probed the genomic stability of Mycobacterium smegmatis in response to several anti-tuberculosis drugs as monotherapy and in combination. Unlike the study by Nyinoh and McFadden http://dx.doi.org/10.1002/ddr.21497 (which should be cited), the authors use a sub-lethal dose of antibiotic. While this is motivated by sound technical considerations, the biological and therapeutic rationale could be further elaborated. The results the authors obtain are in line with papers examining the genomic mutation rate in vitro and from patient samples in Mycobacterium tuberculosis, in vitro in Mycobacterium smegmatis and in vitro in Mycobacterium tuberculosis (although the study by HL David (PMID: 4991927) is not cited). The results are confirmatory of previous studies. It is therefore puzzling why the authors propose the opposite hypothesis in the paper (i.e., antibiotic exposure should increase mutation rates) merely to tear it down later. This straw-man style is entirely unnecessary. The results on the nucleotide pools are interesting, but the statistically significant data is difficult to identify as presented, and therefore the new biological insights are unclear. Finally, the authors show that a fluctuation assay generates mutations with higher frequencies than the genetic stability assays, confirming the well-known effect of phenotypic antibiotic resistance.

    3. Reviewer #2 (Public Review):

      In this study, the authors assess whether selective pressure from drug chemotherapy influences the emergence of drug resistance through the acquisition of genetic mutations or phenotypic tolerance. I commend the authors on their approach of utilizing the mutation accumulation (MA) assay as a means to answer this and whole genome sequencing of clones from the assay convincingly demonstrates low mutation rates in Mycobacteria when exposed to sub-inhibitory concentrations of antibiotics. Also, quantitative PCR highlighted the upregulation of DNA repair genes in Mycobacteria following drug treatment, implying the preservation of genomic integrity via specific repair pathways.

      Even though the findings stem from M. smegmatis exposure to antibiotics under in vitro conditions, this is still relevant in the context of the development of drug resistance, so I can see where the authors' train of thought was heading in exploring this. However, I think important experiments to perform to more fully support the conclusion that resistance is largely associated with phenotypic rather than genetic factors would have been to either sequence clones from the ciprofloxacin tolerance assay (to show absent or minimal genetic mutations) or to have tested the MIC of clones from the MA assay (to show an increase in MIC). There seems to be a disconnect in drawing these conclusions from experiments conducted under different conditions, or perhaps the authors can clarify why this was done. With regard to the sub-inhibitory drug concentration applied, there is significant variation in the viability as calculated by CFUs following the different treatments, and there is evidence that cell death greatly affects the calculation of mutation rate (PMCID: PMC5966242). For instance, the COMBO treatment led to 6% viability whilst the INH treatment led to 80% cell viability. Are there any adjustments made to take this into account? It would also be useful to the reader to include a supplementary table of the SNPs detected from the lineages of each treatment, to determine if at any point rifampicin treatment led to mutations in rpoB, isoniazid to katG mutations, etc. Overall, while this study is tantalizingly suggestive of phenotypic tolerance playing a leading role in drug resistance (and perhaps genetic mutations a subordinate role), a more substantial link is needed to clarify this.

    4. Reviewer #3 (Public Review):

      Summary:

      This manuscript describes how antibiotics influence genetic stability and survival in Mycobacterium smegmatis. Prolonged treatment with first-line antibiotics did not significantly impact mutation rates. Instead, adaptation to these drugs appears to be mediated by upregulation of DNA repair enzymes. While this study offers robust data, findings remain correlative and fall short of providing mechanistic insights.

      Strengths:

      The strength of this study is the use of genome-wide approaches to address the specific question of whether or not mycobacteria induce mutagenic potential upon antibiotic exposure.

      Weaknesses:

      The authors suggest that the upregulation of DNA repair enzymes ensures a low mutation rate under drug pressure. However, this suggestion is based on correlative data, and there is no mechanistic validation of their speculations in this study.

      Furthermore, as detailed below, some of the statements made by the authors are not substantiated by the data presented in the manuscript.

      Finally, some clarifications are needed for the methodologies employed in this study. Most importantly, reduced colony growth should be demonstrated on agar plates to indicate that the drug concentrations calculated from liquid culture growth can be applied to agar surface growth. Without such validations, the lack of induced mutation could simply be due to the fact that the drug concentrations used in this study were insufficient.

    1. eLife assessment

      This paper presents a valuable optimization algorithm for determining the spatio-temporal organization of chromatin. The algorithm identifies the polymer model that best fits population-averaged Hi-C data and makes predictions about the spatio-temporal organization of specific genomic loci such as the oncogenic Myc locus. While the algorithm will be of value to biologists and physicists working in the field of genome organization, the provided methodological details and evidence are incomplete to fully substantiate the conclusions. In particular, the following would be beneficial: analysis of single-cell data, the inclusion of loci beyond Myc, testing the dependence of results on the chosen parameters, providing more details on CTCF occupancy at loop anchors, and better substantiating the claim about predictions of single-cell heterogeneity.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors of this study aim to use an optimization algorithm approach, based on the established Nelder-Mead method, to infer polymer models that best match input bulk Hi-C contact data. The procedure infers the best parameters of a generic polymer model that combines loop-extrusion (LE) dynamics and compartmentalization of chromatin types driven by weak biochemical affinities. Using this and DNA FISH, the authors investigate the chromatin structure of the MYC locus in leukemia cells, showing that loop extrusion alone cannot explain local pathogenic chromatin rearrangements. Finally, they study the locus single-cell heterogeneity and time dynamics.

      Strengths:

      - The optimization method provides a fast computational tool that speeds up the parameter search of complex chromatin polymer models and is a good technical advancement.

      - The method is not restricted to short genomic regions, as in principle it can be applied genome-wide to any input Hi-C dataset, and could be potentially useful for testing predictions on chromatin structure.

      Weaknesses:

      (1) The optimization is based on the iterative comparison of simulated and Hi-C contact matrices using the Spearman correlation. However, the inferred set of the best-fit simulation parameters could sensitively depend on such a specific metric choice, questioning the robustness of the output polymer models. How do results change by using different correlation coefficients?

      (2) The best-fit contact threshold of 420nm seems quite a large value, considering that contact probabilities of pairs of loci at the mega-base scale are defined within 150nm (see, e.g., Bintu et al. Science (2018) and Takei et al. Science (2021)).

      (3) In their model, the authors consider the presence of LE anchor sites at Hi-C TAD boundaries. Do they correspond to real, experimentally found CTCF sites located at genomic positions, or are they just assumed? A track of CTCF peaks of the considered chromatin loci would be needed.

      (4) In the model, each TAD is assigned a specific energy affinity value. Do the different domain types (i.e., different colors) have a mutually attractive energy? If so, what is its value and how is it determined? The simulated contact maps (e.g., Figure 2C) seem to allow attractions between different blocks, yet this is unclear.

      (5) To substantiate the claim that the simulations can predict heterogeneity across single cells, the authors should perform additional analyses. For instance, they could plot the histograms (models vs. experiments) of the TAD2-TAD4 distance distributions and check whether the models can recapitulate the FISH-observed variance or standard deviation. They could also add other testable predictions, e.g., on gyration radius distributions, kurtosis, all-against-all comparison of single-molecule distance matrices, etc.

      (6) The authors state that loop extrusion is crucial for enhancer function only at large distances. How does that reconcile, e.g., with Mach et al. Nature Gen. (2022) where LE is found to constrain the dynamics of genomically close (150kb) chromatin loci?
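
      As an illustration of the robustness check raised in point (1), the following minimal sketch compares a rank-based (Spearman) and a linear (Pearson) correlation on the upper triangles of two contact matrices. The function names and the toy 4x4 matrices are assumptions made for this example only; they are not taken from the manuscript or its code, and real Hi-C maps would need normalization and distance stratification that are omitted here.

```python
import math

def pearson(x, y):
    """Pearson (linear) correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(v):
    """1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0.0] * len(v)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with v[order[i]].
        while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def upper_triangle(m):
    """Off-diagonal upper-triangle entries of a square matrix, row by row."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(i + 1, n)]

# Hypothetical toy data: an "experimental" map and a "simulated" map that is
# a monotone (squared) transform of it.
hic_toy = [
    [1.0, 0.5, 0.2, 0.1],
    [0.5, 1.0, 0.4, 0.2],
    [0.2, 0.4, 1.0, 0.6],
    [0.1, 0.2, 0.6, 1.0],
]
sim_toy = [[value ** 2 for value in row] for row in hic_toy]

hic_vals = upper_triangle(hic_toy)
sim_vals = upper_triangle(sim_toy)

print("Spearman:", spearman(hic_vals, sim_vals))
print("Pearson: ", pearson(hic_vals, sim_vals))
```

      Because the simulated map here is a monotone transform of the experimental one, the rank-based score is insensitive to the distortion while the linear score is not. This is one way a best-fit parameter set could depend on the choice of metric.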

    1. eLife assessment

      This valuable study presents a series of results aimed at uncovering the involvement of the endosomal sorting protein SNX4 in neurotransmitter release. While the evidence supporting the conclusions is solid, the molecular mechanisms remain unclear, and the study would significantly benefit from additional experiments to strengthen its findings. This paper will be of interest to cell biologists and neurobiologists.

    2. Reviewer #1 (Public Review):

      Summary:

      In the work: "Endosomal sorting protein SNX4 limits synaptic vesicle docking and release" Josse Poppinga and collaborators addressed the synaptic function of Sorting Nexin 4 (SNX4). Employing a newly developed in vitro KO model, with live imaging experiments, electrophysiological recordings, and ultrastructural analysis, the authors evaluate modifications in synaptic morphology and function upon loss of SNX4. The data demonstrate increased neurotransmitter release and alteration in synapse ultrastructure with a higher number of docked vesicles and shorter AZ. The evaluation of the presynaptic function of SNX4 is of relevance and tackles an open and yet unresolved question in the field of presynaptic physiology.

      Strengths:

      The sequential characterization of the cellular model is nicely conducted and the different techniques employed are appropriate for the morpho-functional analysis of the synaptic phenotype and the derived conclusions on SNX4 function at the presynaptic site. The authors succeeded in presenting a novel in vitro model that resulted in chronic deletion of SNX4 in neurons. A convincing sequence of experimental techniques is applied to the model to unravel the role of SNX4, whose functions in neuronal cells and at synapses are largely unknown. The understanding of the role of endosomal sorting at the presynaptic site is relevant and of high interest in the field of synaptic physiology and in the pathophysiology of the many described synaptopathies that broadly result in loss of synaptic fidelity and quality control at release sites.

      Weaknesses:

      The flow of the data presentation is mostly descriptive with several consistent morphological and functional modifications upon SNX4 loss. The paper would benefit from a wider characterization that would allow us to address the physiological roles of SNX4 at the synaptic site and speculate on the underlying molecular mechanisms. In addition, due to the described role of SNX4 in autophagy and the high interest in the regulation of synaptic autophagy in the field of synaptic physiology, an initial evaluation of the autophagy phenotype in the neuronal SNX4KO model is important, and should not be restricted to the discussion section.

    3. Reviewer #2 (Public Review):

      Summary:

      SNX4 is thought to mediate recycling from endosomes back to the plasma membrane in cells. In this study, the authors demonstrate the increases in the amounts of transmitter release and the number of docked vesicles by combining genetics, electrophysiology, and EM. They failed to find evidence for its role in synaptic vesicle cycling and endocytosis, which may be intuitively closer to the endosome function.

      Strengths:

      The electrophysiological data and EM data are, in principle, convincing, though there are several issues in the study.

      Weaknesses:

      It is unclear why the increase in the amounts of transmitter release and docked vesicles happened in the SNX4 KO mice. In other words, it is unclear how the endosomal sorting proteins in the end regulate or are connected to presynaptic, particularly the active zone function.

    4. Reviewer #3 (Public Review):

      Summary:

      The study aims to determine whether the endosomal protein SNX4 performs a role in neurotransmitter release and synaptic vesicle recycling. The authors exploited a newly generated conditional knockout mouse to allow them to interrogate the SNX4 function. A series of basic parameters were assessed, with an observed impact on neurotransmitter release and active zone morphology. The work is interesting, however as things currently stand, the work is descriptive with little mechanistic insight. There are a number of places where the data appear to be a little preliminary, and some of the conclusions require further validation.

      Strengths:

      The strengths of the work are the state-of-the-art methods to monitor presynaptic function.

      Weaknesses:

      The weaknesses are the fact that the work is largely descriptive, with no mechanistic insight into the role of SNX4. Further weaknesses are the absence of controls in some experiments and the design of specific experiments.

    1. Author response:

      The following is the authors’ response to the original reviews.

      We would like to thank all of the reviewers for their helpful comments and the effort they made in reading and evaluating our manuscript. In response to them, we have made major changes to the text and figures and performed substantial new experiments. These new data and changes to the text and figures have substantially strengthened the manuscript. We believe that the manuscript is now very strong in both its impact and scope and we hope that reviewers will find it suitable for publication in eLife.

      A point-by-point response to the reviewers' specific comments is provided below.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this report, Yu et al ascribe potential tumor suppressive functions to the non-core regions of RAG1/2 recombinases. Using a well-established BCR-ABL oncogene-driven system, the authors model the development of B cell acute lymphoblastic leukemia in mice and found that RAG mutants lacking non-core regions show accelerated leukemogenesis. They further report that the loss of non-core regions of RAG1/2 increases genomic instability, possibly caused by increased off-target recombination of aberrant RAG-induced breaks. The authors conclude that the non-core regions of RAG1 in particular not only increase the fidelity of VDJ recombination, but may also influence the recombination "range" of off-target joints, and that in the absence of the non-core regions, mutant RAG1/2 (termed cRAGs) catalyze high levels of off-target recombination leading to the development of aggressive leukemia.

      Strengths:

      The authors used a genetically defined oncogene-driven model to study the effect of RAG non-core regions on leukemogenesis. The animal studies were well performed and generally included a good number of mice. Therefore, the finding that cRAG expression led to the development of more aggressive BCR-ABL+ leukemia compared to fRAG is solid.

      Weaknesses:

      In general, I find the mechanistic explanation offered by the authors to explain how the non-core regions of RAG1/2 suppress leukemogenesis to be less convincing. My main concern is that cRAG1 and cRAG2 are overexpressed relative to fRAG1/2. This raises the possibility that the observed increased aggressiveness of cRAG tumors compared to fRAG tumors could be solely due to cRAG1/2 overexpression, rather than any intrinsic differences in the activity of cRAG1/2 vs fRAG1/2; and indeed, the authors allude to this possibility in Fig S8, where it was shown that elevated expression of RAG (i.e. fRAG) correlated with decreased survival in pediatric ALL. Although it doesn't mean the authors' assertions are incorrect, this potential caveat should nevertheless be discussed.

      We appreciate the valuable suggestions from the reviewer. BCR-ABL1+ B-ALL is characterized by halted early B-lineage differentiation. In BCR-ABL1+ B cells, RAG recombinases are highly expressed, leading to the inactivation of genes that encode essential transcription factors for B-lineage differentiation. This results in cells being trapped within the precursor compartment, thereby elevating RAG gene expression. Our interpretation of the data suggests that, in BCR-ABL1+ B-ALL mouse models, the high expression of both cRAG and fRAG and the deletion of the non-core regions influence the precision of RAG targeting within the genome. This causes more genomic damage in cRAG tumors than in fRAG tumors, consequently leading to the observed increased aggressiveness of cRAG tumors compared to fRAG tumors. We discussed the issues on Page 12, lines 295-307 in the revised manuscript.

      Some of the conclusions drawn were not supported by the data.

      (1) I'm not sure that the authors can conclude based on μHC expression that there is a loss of pre-BCR checkpoint in cRAG tumors. In fact, Fig. 2B showed that the differences are not statistically significant overall, and more importantly, μHC expression should be detectable in small pre-B cells (CD43-). This is also corroborated by the authors' analysis of VDJ rearrangements, showing that it has occurred at the H chain locus in cRAG cells.

      We appreciate the insightful comment from the reviewer. Upon reevaluation of the data presented in Fig. 2B, we identified and rectified certain errors. The revised analysis now shows that the differences in μHC expression are statistically significant. This significant expression of μHC in fRAG leukemic cells implies that these cells may progress further in differentiation, potentially acquiring an immune phenotype. These modifications have been incorporated into the manuscript on page 7, lines 153-156 in the revised manuscript.

      (2) The authors found a high degree of polyclonal VDJ rearrangements in fRAG tumor cells but a much more limited oligoclonal VDJ repertoire in cRAG tumors. They concluded that this explains why cRAG tumors are more aggressive because BCR-ABL induced leukemia requires secondary oncogenic hits, resulting in the outgrowth of a few dominant clones (Page 19, lines 381-398). I'm not sure this is necessarily a causal relationship since we don't know if the oligoclonality of cRAG tumors is due to selection based on oncogenic potential or if it may actually reflect a more restricted usage of different VDJ gene segments during rearrangement.

      Thank you for your insightful comments and questions regarding the relationship between the oligoclonality of V(D)J rearrangements and the aggressiveness of cRAG tumors. You raise an important point regarding whether the observed oligoclonality is a result of selective pressure favoring clones with specific oncogenic potential, or if it reflects inherent limitations in V(D)J segment usage during rearrangement in cRAG models. In our study, we observed a marked difference in the V(D)J rearrangement patterns between fRAG and cRAG tumor cells, with cRAG tumors exhibiting a more limited, oligoclonal repertoire. This observation led us to speculate that the aggressive nature of cRAG tumors might be linked to a selective advantage conferred by specific V(D)J rearrangements that cooperate with the BCR-ABL1 oncogene to drive leukemogenesis. However, we acknowledge that our current data do not definitively establish a causal relationship between oligoclonality and tumor aggressiveness. The restricted V(D)J repertoire in cRAG tumors could indeed be due to a more constrained rearrangement process, possibly influenced by the altered expression or function of RAG1/2 in the absence of non-core regions. This could limit the diversity of V(D)J rearrangements, leading to the emergence of a few dominant clones not necessarily because they have greater oncogenic potential, but because of a narrowed field of rearrangement possibilities.

      To address this question more thoroughly, future studies could examine the functional consequences of specific V(D)J rearrangements found in dominant cRAG tumor clones. This could include assessing the oncogenic potential of these rearrangements in isolation and in cooperation with BCR-ABL1, as well as exploring the mechanistic basis for the restricted V(D)J repertoire. Such studies would provide deeper insight into the interplay between RAG-mediated recombination, clonal selection, and leukemogenesis in BCR-ABL1+ B-ALL.

      We appreciate your feedback on this matter and agree that further investigation is required to unravel the precise relationship between V(D)J rearrangement diversity and leukemic progression in cRAG models. We have revised our discussion to reflect these considerations and to clarify the speculative nature of our conclusions regarding the link between oligoclonality and tumor aggressiveness. We added more discussion on this issue on Page 7, lines 166-170 in the revised manuscript.

      (3) What constitutes a cancer gene can be highly context- and tissue-dependent. Given that there is no additional information on how any putative cancer gene was disrupted (e.g., truncation of regulatory or coding regions), it is not possible to infer whether increased off-target cRAG activity really directly contributed to the increased aggressiveness of leukemia.

      We fully agree with the issues you raised. In Supplementary Table 3, we have presented data on off-target gene disruptions, specifically in introns, exons, downstream regions, promoters, 3' UTRs, and 5' UTRs. However, this dataset alone does not suffice to conclusively determine whether the increased off-target activity of cRAG directly influences the heightened aggressiveness of leukemia. To bridge this knowledge gap, our future research will extend to include both knockout and overexpression experiments targeting these off-target genes.

      (4) Fig. 6A, it seems that it is really the first four nucleotide (CACA) that determines fRAG binding and the first three (CAC) that determine cRAG binding, as opposed to five for fRAG and four for cRAG, as the author wrote (page 24, lines 493-497).

      We thank the reviewer for the insightful comment. In response, we have revised the text to accurately reflect the nucleotide sequences responsible for RAG binding and cleavage. Specifically, we now clarify that the first four nucleotides (CACA) are crucial for fRAG binding and cleavage, while the initial three nucleotides (CAC) are essential for cRAG binding and cleavage. These updates have been made on page 10, lines 242-245 of the revised manuscript.

      (5) Fig S3B, I don't really see why "significant variations in NHEJ" would necessarily equate "aberrant expression of DNA repair pathways in cRAG leukemic cells". This is purely speculative. Since it has been reported previously that alt-EJ/MMEJ can join off target RAG breaks, do the authors detect high levels of microhomology usage at break points in cRAG tumors?

      We appreciate the reviewer's comment. Currently, we have not observed microhomology usage at breakpoints in cRAG tumors. We plan to address this aspect in a future, more detailed study. Regarding the 'aberrant expression of DNA repair pathways in cRAG leukemic cells', we acknowledge that this is speculative. Therefore, we have carefully rephrased this to 'suggesting a potential aberrant expression of DNA repair pathways in cRAG leukemic cells.' This modification is reflected on page 12, lines 290-291 of the revised manuscript.

      (6) Fig. S7, CDKN2B inhibits CDK4/6 activation by cyclin D, but I don't think it has been shown to regulate CDK6 mRNA expression. The increase in CDK6 mRNA likely just reflects a more proliferative tumor but may have nothing to do with CDKN2B deletion in cRAG1 tumors.

      We fully concur with the reviewer's comment. We have deleted this inappropriate part from the text.

      Insufficient details in some figures. For instance, Fig. 1A, please include statistics in the plot showing a comparison of fRAG vs cRAG1, fRAG vs cRAG2, cRAG1 vs cRAG2. As of now, there's a single p-value (0.0425) stated in the main text and the legend but why is there only one p-value when fRAG is compared to cRAG1 or cRAG2? Similarly, the authors wrote "median survival days 11-26, 10-16, 11-21 days, P < 0.0023-0.0299, Fig. S2B." However, it is difficult for me to figure out what are the numbers referring to. For instance, is 11-26 referring to median survival of fRAG inoculated with three different concentrations of GFP+ leukemic cells or is 11-26 referring to median survival of fRAG, cRAG1, cRAG2 inoculated with 10^5 cells? It would be much clearer if the authors can provide the numbers for each pair-wise comparison, if not in the main text, then at least in the figure legend. In Fig. 5A-B, do the plots depict SVs in cRAG tumors or both cRAG and fRAG cells? Also in Fig. 5, why did 24 SVs give rise to 42 breakpoints, and not 48? Doesn't it take 2 breaks to accomplish rearrangement? In Fig. 6B-C, it is not clear how the recombination sizes were calculated. In the examples shown in Fig. 4, only cRAG1 tumors show intra-chromosomal joins (chr 12), while fRAG and cRAG2 tumors show exclusively inter-chromosomal joins.

      We appreciate the reviewer's feedback and have made the following revisions:

      (1) The text has been adjusted to rectify the previously mentioned error in the figure legends (page 1, lines 5-6).

      (2) We have clarified the intended message in the revised text (page 6, lines 129-130) and the figure legend (page 4-5, lines 107-113) for greater precision.

      (3) Figure 5A-B now presents an overview of all structural variants (SVs) identified in both cRAG and fRAG cells, offering a comprehensive comparison.

      (4) Among the analyzed SVs, 24 generated a total of 48 breakpoints, with 41 occurring within gene bodies and the remaining 7 in adjacent flanking sequences. This informs our exon-intron distribution profile analysis.

      (5) We have defined recombination sizes as ‘the DNA fragment size spanning the two breakpoints’ for clarity (page 10, lines 251-252).

      (6) All off-target recombinations identified in the genome-wide analyses of fRAG, cRAG1, and cRAG2 leukemic cells were determined to be intra-chromosomal joins, highlighting their specific nature within the genomic context.

      Insufficient details on certain reagents/methods. For instance, are the cRAG1/2 mice of the same genetic background as fRAG mice (C57BL/6 WT)? On Page 23, line 481, what is a cancer gene? How are they defined? In Fig. 3C, are the FACS plots gated on intact cells? Since apoptotic cells show high levels of gH2AX, I'm surprised that the fraction of gH2AX+ cells is so much lower in fRAG tumors compared to cRAG tumors. The in vitro VDJ assay shown in Fig 3B is not described in the Method section (although it is described in Fig S5b). Fig. 5A-B, do the plots depict SVs in cRAG tumors or both cRAG and fRAG cells?

      We are grateful for the reviewer's feedback and have incorporated their insights as follows:

      (1) We clarify that both cRAG1/2 and fRAG mice share the same genetic background, specifically the C57BL/6 WT strain, ensuring consistency across experimental models.

      (2) We define a 'cancer gene' as one harboring somatic mutations implicated in cancer. To support our analysis, we refer to the Catalogue Of Somatic Mutations In Cancer (COSMIC) at http://cancer.sanger.ac.uk/cosmic. COSMIC serves as the most extensive repository for understanding the role of somatic mutations in human cancers.

      (3) Upon thorough review of the raw data for γ-H2AX and the fluorescence-activated cell sorting (FACS) plots gated on intact cells, we propose that the observed discrepancies might stem from the limited sensitivity of the γ-H2AX flow cytometry detection method. This insight prompts our commitment to employing more efficient detection methodologies in forthcoming studies.

      (4) Detailed procedures for the in vitro V(D)J recombination assay have been included in the Methods section (page 15, lines 384-388) to enhance the manuscript's comprehensiveness and reproducibility.

      (5) The presented plots offer a comprehensive overview of structural variants (SVs) identified in both cRAG and fRAG cells, providing a holistic view of the genomic landscape across different models.

      Reviewer #3 (Public Review):

      Summary:

      In the manuscript, the authors summarized and introduced the correlation between the non-core regions of RAG1 and RAG2 in BCR-ABL1+ acute B lymphoblastic leukemia and off-target recombination, which has certain innovative and clinical significance.

      Recommendations For The Authors:

      Reviewer #1 (Recommendations For The Authors):

      I would suggest that the authors tone down some of their conclusions, which are not necessarily supported by their own data. In addition, there are some minor mistakes in figure assembly/presentation. For instance, I believe that the axes labels in Fig. 1E were flipped. BrdU should be on y-axis and 7-AAD on the x-axis. Fig. 3B, the y-axis contains a typo, it should be "CD90.1..." and not "D90.1...". In Fig. 5C, the numbers seem to be flipped, with 93% corresponding to cRAG1 and 100% to cRAG2 (compare with the description on page 23, lines 474-475). Fig. 5C, y-axis, "hybrid" is a typo. Page 3, line 59: The abbreviation of RSS has already been described earlier (p4, line 53).

      We thank the reviewer for these suggestions. We carefully checked the raw data and corrected these mistakes in the revised manuscript.

      Page 3, line 63: "signal" segment (commonly referred to as signal ends), not "signaling" segment.

      We have changed “signaling segment” to “signal ends” in the revised manuscript (page 3, lines 54-55).

      Page 3, lines 64-65: VDJ recombination promotes the development of both B and T cells, and aberrant recombination can cause both B and T cell lymphomas.

      The statement about the role of V(D)J recombination in B and T cell development and its link to lymphomagenesis is grounded in a substantial body of research. Theoretical frameworks and empirical studies delineate how aberrations in the recombination process can lead to genomic instability, potentially triggering oncogenic events. This connection is extensively documented in immunology and oncology literature, illustrating the critical balance between necessary genetic rearrangements for immune diversity and the risk of malignancy when these processes are dysregulated (Thomson et al., 2020; Mendes et al., 2014; Onozawa and Aplan, 2012).

      Page 4, line 72: "recombinant dispensability" is not a commonly used phrase. Do the authors mean to say that the non-core regions of RAG1/2 are not strictly required for VDJ recombination?

      We thank the reviewers for their insightful suggestion. We have revised the sentence to read, 'Although the non-core regions of RAG1/2 are not essential for V(D)J recombination, the evolutionary conservation of these regions suggests their potential significance in vivo, possibly affecting RAG activity and expression in both quantitative and qualitative manners.' This revision appears on page 3, lines 61-62, in the revised manuscript.

      Fig. 4. It would have been nice to show at least one more cRAG1 tumor Circos plot.

      We appreciate the reviewer's comment and concur with the suggestion. In future sequencing experiments, we will consider including additional replicates. However, due to time and financial constraints, the current sequencing effort was limited to a maximum of three replicates.

      Reviewer #3 (Recommendations For The Authors):

      In the manuscript, the authors summarized and introduced the correlation between the non-core regions of RAG1 and RAG2 in BCR-ABL1+ acute B lymphoblastic leukemia and off-target recombination, which has certain innovative and clinical significance. The following issues need to be addressed by the authors.

      (1) Authors should check and review extensively for improvements to the use of English.

      We thank the reviewer for their comment. With assistance from a native English speaker, we have carefully revised the manuscript to enhance its readability.

      (2) Authors should revise the conclusion so that the above can be clearly reviewed and summarized.

      The conclusion has been partially revised in the revised manuscript.

      (3) The article should state that the experiment was independently repeated three times.

      The experiment was repeated under the same conditions three times, and this information has been described in the Statistics section on page 19, lines 473-475 in the revised manuscript.

      (4) The article will be more convincing if it uses references in the last 5 years.

      We are grateful to the reviewer for their guidance in enhancing our manuscript. We have incorporated additional references from the past five years in the revised version.

      (5) Additional experiments are suggested to elucidate the molecular mechanisms related to off-target recombination.

      We thank the reviewer for this suggestion. In future experiments, we plan to perform ChIP-seq analysis to investigate the relationship between chromatin accessibility and off-target effects, as well as to examine the impact of knocking out and overexpressing off-target genes on cancer development and progression.

      (6) It is suggested to further analyze the effect of the absence of non-core RAG region on the differentiation and development of peripheral B cells in mice by flow analysis and expression of B1 and B2.

      Thank you very much for highlighting this crucial issue. FACS analysis was performed, revealing that leukemia cells in peripheral B cells in mice did not express CD5. The data are presented as follows:

      Author response image 1.

      (7) Fig3A should have three biological replicates and the molecular weight should be labeled on the right side of the strip.

      Thank you for this suggestion. The experiment was independently repeated three times, and the molecular weights have been labeled on the right side of the bands in the revised version.

      References:

      Mendes RD, Sarmento LM, Canté-Barrett K, Zuurbier L, Buijs-Gladdines JG, Póvoa V, Smits WK, Abecasis M, Yunes JA, Sonneveld E, Horstmann MA, Pieters R, Barata JT, Meijerink JP. 2014. PTEN microdeletions in T-cell acute lymphoblastic leukemia are caused by illegitimate RAG-mediated recombination events. Blood 124:567-578. doi:10.1182/blood-2014-03-562751

      Onozawa M, Aplan PD. 2012. Illegitimate V(D)J recombination involving nonantigen receptor loci in lymphoid malignancy. Genes Chromosomes Cancer 51:525-535. doi:10.1002/gcc.21942

      Thomson DW, Shahrin NH, Wang P, Wadham C, Shanmuganathan N, Scott HS, Dinger ME, Hughes TP, Schreiber AW, Branford S. 2020. Aberrant RAG-mediated recombination contributes to multiple structural rearrangements in lymphoid blast crisis of chronic myeloid leukemia. Leukemia 34:2051-2063. doi:10.1038/s41375-020-0751-y

    2. eLife assessment

      Using a set of animal models, this valuable paper shows tumor suppressive function of the non-core regions of RAG1/2 recombinases. The conclusions are supported by solid evidence.

    3. Reviewer #1 (Public Review):

      Summary:

      In this report, Yu et al ascribe potential tumor suppressive functions to the non-core regions of RAG1/2 recombinases. Using a well-established BCR-ABL oncogene-driven system, the authors model the development of B cell acute lymphoblastic leukemia in mice and found that RAG mutants lacking non-core regions show accelerated leukemogenesis. They further report that the loss of non-core regions of RAG1/2 increases genomic instability, possibly caused by increased off-target recombination of aberrant RAG-induced breaks. The authors conclude that the non-core regions of RAG1 in particular not only increase the fidelity of VDJ recombination, but may also influence the recombination "range" of off-target joints, and that in the absence of the non-core regions, mutant RAG1/2 (termed cRAGs) catalyze high levels of off-target recombination leading to the development of aggressive leukemia.

      Strengths:

      The authors used a genetically defined oncogene-driven model to study the effect that RAG non-core regions have on leukemogenesis. The animal studies were well performed and generally included a good number of mice. Therefore, the finding that cRAG expression led to the development of more aggressive BCR-ABL+ leukemia compared to fRAG is solid. The authors also present some nice analyses that characterize the (genomic) nature of the aggressive leukemias that develop in the absence of RAG non-core regions.

      Weaknesses:

      The paper relies on cRAG1/2 overexpression, an experimental limitation that needs to be taken into consideration when extrapolating the physiological relevance of the findings.

    1. Reviewer #2 (Public Review):

      Summary:

      The paper entitled "Goal-directed motor actions drive acetylcholine dynamics in sensory cortex" aims to characterize the dynamics of cholinergic signaling in sensory cortex during perceptual behavior. The authors showed that acetylcholine release in S1 was linked to goal-directed motor actions rather than sensory input or reward delivery, a pattern also observed in the auditory cortex (A1). This release was specifically associated with whisking and licking and was potentiated by training. The results contribute to a better understanding of neuromodulator actions. That said, several aspects of the manuscript could benefit from improved writing, data presentation, and statistical analysis.

      Strengths:

      The evidence provided clearly links ACh responses to different task-related events. Implementing two different tasks to show generality is appreciated. Important control analyses are included.

      Weaknesses:

      The quantification of ACh signal differences across different trial types or between expert and early-training mice is lacking. Although statistical significance is occasionally mentioned, significance is rarely indicated in the figures. For example, in Figures 5A and E, it is difficult to tell when p is < 0.05. Based on the sentence "small, but significant increase on Hits over False Alarm trials (Figure 5A, S Figure 4A)" there is indeed a time point where the difference is significant, and more details should be added (the time point and the p-value).

      For Figure 5D, it seems that there is no significant difference between Hit and False Alarm trials; however, for the trials with 1 or 2 licks there appears to be a difference. Is this due to a lack of power? Moreover, in Figure 5H the first licks also seem to differ.

      Linear regression: the coefficient of determination (R²) is absent in Figures 4E, F, and 6B, H, making it hard to evaluate the goodness of fit.
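
      For reference, the quantity requested here could be computed along the following lines. This is only a minimal sketch for a single-regressor least-squares fit; the function name `r_squared` and the numbers in the usage note are hypothetical illustrations, not the authors' data or analysis code.

```python
def r_squared(x, y):
    """Coefficient of determination for an ordinary least-squares line y = a + b*x.

    Hypothetical helper for illustration; x and y are equal-length lists of numbers.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Least-squares slope and intercept.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # R^2 = 1 - (residual sum of squares) / (total sum of squares).
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot
```

      Calling `r_squared([0, 1, 2, 3], [1, 2, 1, 2])` on these made-up values gives an R² of about 0.2, that is, a fitted line that explains only a small share of the variance. Reporting this value next to each regression panel would let readers judge the fit directly.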

      Similar comments apply to Figure 7: the lack of quantitative comparisons between the coefficients of first lick and other regressors, and between early and expert training, as well as the change in goodness of fit by removing a regressor.

      The writing of the introduction and discussion could be improved to enhance readability, and the manuscript could improve its discussion on orofacial movement and acetylcholine release by citing relevant studies demonstrating the association between neuronal activity and orofacial/body movements.

    2. eLife assessment

      This study provides important evidence that links acetylcholine responses in the sensory cortex to motor actions during perceptual tasks, rather than to rewards. The evidence for the association between acetylcholine responses and motor actions is solid, but does not demonstrate the causal link implied by the title and abstract. The manuscript would benefit from a more detailed description of results and methodologies. This study is of broad interest to the neuroscience field.

    3. Reviewer #1 (Public Review):

      Summary:

      This study aimed at gaining a better comprehension of the functional role of acetylcholine release within the sensory cortex. To this end, the authors measured the dynamics of cortical acetylcholine release using two-photon imaging of the GRAB-Ach3.0 fluorescent sensor, either in the mouse primary somatosensory cortex (S1), throughout the learning of a whisker-dependent object position discrimination task, or in the primary auditory cortex (A1) of mice engaged in a specific sound signal detection task.

      The illustrated results suggest that variations in acetylcholine release tend to be associated, in the primary sensory areas, with goal-directed actions (whisking in the case of the object position discrimination task, and more strongly with licking), rather than with sensory inputs or rewards. They also indicate that the variations in cholinergic signal specifically associated with licking increase with learning.

      Strengths:

      The impact of cholinergic inputs on cortical function has intrigued neuroscientists for many decades due to the complexity of their mode of action at the molecular and cellular levels.

      Being able to image the dynamics of cortical cholinergic release in vivo in mice engaged in goal-directed tasks has moved this field into a really exciting phase, where it becomes possible to draw links between specific behavioral features and local variations of cholinergic release in given cortical areas.

      This study is therefore particularly timely and provides a set of valuable and original data. Overall, the experiments were rigorously designed, and the illustrated quantifications and analyses follow high standards. This work therefore constitutes a valuable contribution to this field of research and could be of interest to a large audience.

      Weaknesses:

      Although the manuscript reports very interesting links between behavior and cortical cholinergic release, the study remains correlative and lacks experiments that would causally link cholinergic cortical inputs with motor actions and, more globally, gauge their impact on learning and execution of the tasks. Since the nature of the link between goal-directed motor actions and acetylcholine dynamics is not really clarified here, the word "drive" in the title of the paper, which may have a causal connotation, should be replaced (especially since acetylcholine-related signal fluctuations often seem to precede motor actions).

      As high-speed videography of the C2 whisker was performed during the object position discrimination task, it seems that the whisker curvature changes could have been quantified in addition to the whisker angle. This would allow appreciation of how acetylcholine-related signals vary according to both whisker-related motor output and sensory input, thereby providing clearer support for the assertion that acetylcholine levels are "related to motor actions rather than sensory inputs".

      The data set related to the auditory task is used here to support the claim that licks rather than rewards are linked to variations of fluorescence of the cholinergic sensor in sensory cortices. These data seem very interesting indeed but are shown here in a very incomplete manner (a figure illustrating the learning curves of the 6 recorded animals, and acetylcholine dynamics during the four types of trials would be very welcome). If the animals were placed on a treadmill and the locomotion measured, together with pupil size, during the task as in Gee et al., BioRxiv 2022, one could ask how these other motor activities are linked with acetylcholine dynamics in A1. By comparing the impact of goal-directed actions versus motor activities accompanying more global state transitions on acetylcholine dynamics, these data could provide a particularly valuable contribution to this study. They could in addition rule out potential confounding factors regarding the claim that cholinergic dynamics are here mainly linked to first licks.

      Coming back to the whisker-dependent object localization task, if cholinergic-related signals were recorded during the "no whisker sessions", analyzing these data would be very useful in the scope of this study. Indeed, during these sessions, the animals were not naive, since they had learned the task, but could no longer solve it; still, they most probably kept licking upon the pole-in and/or pole-out cues. In these sessions, the licking is fully dissociated from tactile sensory inputs, and for this reason it would be particularly interesting to see how the fluorescence varies with first licks. In addition, plotting these sessions in Figure 6C would be informative. Indeed, if the increase of cholinergic signals with performance comes progressively due to changes in the internal state of the animal and/or plasticity mechanisms, first-lick-related cholinergic signal variations could remain high despite the decrease of performance in these sessions.

      Finally, because the functional role of cortical cholinergic release is a hot topic, a few recent studies addressing this question with slightly different approaches in the visual cortex would be worth mentioning, at least in the discussion, as well as a recent study focusing on motor learning, which revealed an apparent decrease of acetylcholine dynamics associated with goal-directed motor actions upon learning.

    1. Reviewer #2 (Public Review):

      Summary:

      While many studies have explored the impacts of pathogens on hosts, the effect of hosts on pathogens has received less attention. In this manuscript, Wang et al. utilize Drosophila melanogaster and an opportunistic pathogen, Serratia marcescens, to explore how the host impacts pathogenicity. Beginning with an observation that larval presence and density impacted microbial growth in fly vials (which they assess qualitatively as the amount of 'slick' and quantitatively as microbial load/CFUs), the authors focus on the impact of axenic/germ-free larvae on an opportunistic pathogen, S. marcescens. Similar to their observations with general microbial load, they find that larvae reduce the presence of a pinkish slick of Sm, indicative of its secondary metabolite prodigiosin. The presence of larvae alters prodigiosin production, pathogen load, pathogen cellular morphology, and virulence, and this effect is through transcriptional and metabolic changes in the pathogen. Overall, they observe a loss of virulence factors/pathways and an increase in pathways contributing to growth. Given the important role the host plays in this lifestyle shift, the authors then examined host features that might influence these effects, focusing on the role of antimicrobial peptides (AMPs). The authors combine the use of synthetic AMPs and an AMP-deficient fly line and conclude that much of the larval inhibitory effect is due to their production of AMPs.

      Strengths:

      This is a very interesting question and the use of Drosophila-Serratia marcescens is a great model to explore these interactions and effects.

      The authors have an interesting and compelling phenotype and are asking a unique question on the impact of the host on the pathogen. The use of microbial transcriptomics and metabolomics is a strength, especially in order to assess these impacts on the pathogen level and at single-cell level to capture heterogeneity.

      Weaknesses:

      Overall, the writing style in the manuscript makes it difficult to fully understand and appreciate the data and its interpretation.

      The data on the role of AMPs would benefit from strengthening. Some of the arguments in the text of that section are also counterintuitive. The authors show that AMP larvae have a reduced impact on Sm as compared to wt larvae, but it seems less mild of an effect than that observed with wt excreta (assuming this is the same as secreta in Figure 7; the terminology should be corrected or harmonized). Higher doses of AMPs give a phenotype similar to wt larvae, but a lower dose (40 ng/ul) gives phenotypes more similar to controls. The authors argue that these data suggest AMPs are the factor responsible for much of the inhibition, but their data seem more to support that the effect is synergistic: you seem to still need larvae (or some not yet defined feature larvae make, although secreta/excreta was not sufficient) + AMPs to see similar effects as wt. Based on positioning and color scheme, I am guessing that AMP 40 ng/ul was used in Figures 7D-H, but I could not find this detail in the text, methods, or figure legend, and it should be indicated. This section does not seem to be well supported by the provided data, and this inconsistency greatly dampened this reviewer's enthusiasm for the paper.

    2. eLife assessment

      This valuable study examines the role of a host in conditions that shift pathogenicity of opportunistic microbes. The use of single-cell microbial transcriptomics and metabolomics to demonstrate the host's effects on pathogen dynamics is interesting and convincing. However, the connection to host antimicrobial peptides driving these effects is incomplete and would benefit from additional evidence and improved explanation in the text. This paper has the potential to be of broad interest to those working in host-microbe (microbiome and pathogen) interactions.

    3. Reviewer #1 (Public Review):

      Summary:

      In this work, Wang and colleagues used Drosophila-Serratia as a host-microbe model to investigate the impact of the host on gut bacteria. The authors showed that Drosophila larvae reduce S. marcescens abundance in the food likely due to a combination of mechanical force and secretion of antimicrobial peptides. S. marcescens exposed to Drosophila larvae lost virulence to flies and could promote larval growth similar to typical Drosophila gut commensals. These phenotypic changes were reflected in the transcriptome and metabolome of bacteria, suggesting that the host could drive the switch from pathogenicity to commensalism in bacteria. Further, the authors used single-cell bacterial RNA-seq to demonstrate the heterogeneity in gut bacterial populations.

      Strengths:

      This is a valuable work that addresses an important question of the effect of the host on its gut microbes. The authors could convincingly demonstrate that gut bacteria are strongly affected by the host with important consequences for both interacting partners. Moreover, the authors used state-of-the-art bacterial single-cell RNA-seq to reveal heterogeneity in host-associated commensal populations.

      Weaknesses:

      Some of the conclusions are not fully supported by the data.

      Specifically, in lines 142-143, the authors claim that larvae antagonize the pathogenicity of S. marcescens based on the survival data. I do not fully agree with this statement. An alternative possibility could be that, since there are fewer S. marcescens in larvae-processed food, flies receive a lower pathogen load and consequently survive. Can the authors rule this out?

      Also, the authors propose that Drosophila larvae induce a transition from pathogenicity to commensalism in S. marcescens and provide nice phenotypic and transcriptomic data supporting this claim. However, is it driven only by transcriptional changes? Considering high mutation rates in bacteria, it is possible that S. marcescens during growth in the presence of larvae acquired mutations causing all the observed phenotypic and transcriptional changes. To test this possibility, the authors could check how long S. marcescens maintains the traits it acquires during growth with Drosophila. If these traits persist after reculturing isolated bacteria, it is very likely they are caused by genome alterations, if not - likely it is a phenotypic switch driven by transcriptional changes.

    4. Reviewer #3 (Public Review):

      In this study, Wang and coworkers established a model of Drosophila-S. marcescens interactions and thoroughly examined host-microbe bidirectional interactions. They found that:

      (1) Drosophila larvae directly impact microbial aggregation and density;
      (2) Drosophila larvae affect microbial metabolism and cell wall morphology, as evidenced by reduced prodigiosin production and EPS production, respectively;
      (3) Drosophila larvae attenuate microbial virulence;
      (4) Drosophila larvae modulate the global transcription of microbes for adaptation to the host;
      (5) Microbial single-cell RNA sequencing (scRNA-seq) analysis revealed heterogeneity in microbial pathogenicity and growth;
      (6) AMPs are key factors controlling microbial virulence phenotypes.

      Taken together, they concluded that host immune factors such as AMPs are directly involved in the pathogen-to-commensal transition by altering microbial transcription.

      General comments:

      In general, this study is intriguing as it demonstrates that host immune effectors such as AMPs can serve as critical factors capable of modulating microbial transcription for host-microbe symbiosis. However, several important questions remain unanswered. One such question is: What is the mechanism by which AMPs modulate the pathogen-to-commensal transition? One hypothesis suggests that antimicrobial activity may influence microbial physiology, subsequently modulating transcription for the transition from pathogen to commensal. In this context, it is imperative to test various antibiotics with different modes of action (e.g., targeting the cell wall, transcription, or translation) at sub-lethal concentrations to determine whether sub-lethal doses of antimicrobial activity are sufficient to induce the pathogen-to-commensal transition.

    1. Author response:

      The authors express their gratitude to the reviewers for their insightful comments.

      Reviewer #1: We are uncertain about the reference to an overjudgement of the recovery of spermatogonial stem cells, as we did not draw any conclusions on this in the current study. Additionally, we have received feedback mentioning the multitude and diversity of datasets as both a strength and a weakness. However, we would appreciate clarification on which datasets may have been insufficiently reviewed and how our selection of highlights may have introduced bias to the interpretation and conclusion of the study. It is important to note that we did not select any patients/data; all patient data were incorporated into our results section. We acknowledge the need for clarification regarding our study population for the germ cell stainings. As stated in our Materials and Methods section, our current study population includes the cohort from our previous publication (Vereecke et al., 2020), supplemented by nine additional participants, totaling n=106 trans women. While Fig. 1C incorporates both previous and new data on germ cells, we understand the need to clarify this to avoid confusion. Additionally, we will include information on the Tanner stages of the trans women in our cohort (all G5), as well as details on the selection criteria for our controls and their Tanner stages. As briefly touched upon in the discussion, a marker such as delta-like homolog 1 would indeed be valuable to assess the presence of truly immature Leydig cells. Unfortunately, our attempts to optimize the immunofluorescence protocol for this marker were unsuccessful, resulting in a double staining instead of a triple staining for the Leydig cells. The suboptimal resolution of Fig. 1 will be addressed.

      Reviewer #2 raises concerns regarding the suitability of rejuvenated testicular tissue for research purposes. However, we emphasize that this tissue source holds significant value. Although adult testicular tissue is widely available (from prostate cancer patients or vasectomy reversal patients), we are especially looking for alternatives to the scarce prepubertal/pubertal tissue for research on in vitro spermatogenesis. While we acknowledge that transgender tissue with severe hyalinization or without spermatogonia may not be suitable for such research, the abundance of transgender tissue without these issues emphasizes the value of this tissue source.

    2. eLife assessment

      This important study presents new knowledge of the spermatogonial stem cell (SSC) niche in trans women after gender-affirming hormone therapy (GAHT). While the evidence supporting the claims is convincing, weaknesses identified by both reviewers should be addressed. The work will be of interest to researchers and clinicians working in the field of sexual medicine and andrology.

    3. Reviewer #1 (Public Review):

      Summary:

      This is a nice paper taking a broad range of aspects and endpoints into account. The effect of GAHT in girls has been nicely worked out. Changes in Sertoli and peritubular cells appear valid, less strong evidence is provided for Leydig cell development. The recovery of SSCs appears an overjudgement and should be rephrased. The multitude and diversity of datasets appear a strength and a weakness as some datasets were not sufficiently critically reviewed and a selection of highlights provides a certain bias to the interpretation and conclusion of the study.

      The authors need to indicate that the subset of data on SSCs has been reported previously (Human Reprod 36: 5-15 (2021)) and is simply re-incorporated in the present paper as Fig. 1C. There are sufficient new results to publish the remaining datasets as a separate paper. Authors could refer to the SSC data with reference to the previous publication.

      Strengths:

      The patient cohort is impressive and is nicely characterized. Here, histological endpoints and endocrine profiles were analyzed appropriately for most endpoints. The paper is well-written and has many new findings.

      Weaknesses:

      The patients and controls are poorly separated in regard to pubertal status. Here additional endpoints (e.g. Tanner status) would have been helpful especially as the individual patient history is unknown. Pre- and peri-puberty is a very rough differentiation. The characterization and evaluation of Leydig cells is the weakest histological endpoint. Here, additional markers may be required. Fig. 1 suffers from suboptimal micrograph quality.

    4. Reviewer #2 (Public Review):

      Summary:

      The study is devoted to an in-depth investigation of the spermatogonial stem cell (SSC) niche in trans women after gender-affirming hormone therapy (GAHT). Both the cellular structure and the functionality of the niche were studied. The authors clearly demonstrated that all cellular components of the SSC niche were affected by hormone therapy. Interestingly, signs of "rejuvenation" within the niche were also observed, indicating a possible reversal to an immature condition.

      Strengths:

      The findings are important for a better understanding of the hormonal regulation of the testis and the SSC niche, and provide some clues for using the biomaterials from these specific and even unique donors for biomedical research.

      Weaknesses:

      This study has some limitations. Many studies cannot be done using the testicular cells of trans women, since these cells differ significantly from adult male cells and, to a lesser degree, from prepubertal and pubertal cells. The authors themselves identify some of the limitations: this material is suitable only for studying prepubertal processes in the testis. However, the authors also report large variability in the data due to different hormonal therapy regimens and, apparently, age. Accordingly, not all material obtained from trans women can be used for studies of prepubertal processes.

    1. Author response:

      The following is the authors’ response to the original reviews.

      We would like to thank both Editors and reviewers for their valuable time, careful reading, and constructive comments. The comments have been highly valuable and useful for improving the quality of our study, as well as important in guiding the direction of our present and future research. In the revised manuscript, we have incorporated the necessary changes including additional experimental data as suggested; please find our detailed point-by-point response to the reviewers’ comments and the changes we have made in the manuscript as follows.

      Reviewer #1 (Public Review):

      In this work, the authors have explored how treating C. albicans fungal cells with EDTA affects their growth and virulence potential. They then explore the use of EDTA-treated yeast as a whole-cell vaccine in a mouse model of systemic infection. In general, the results of the paper are unsurprising. Treating yeast cells with EDTA affects their growth and the addition of metals rescues the phenotype. Because of the significant growth defects of the cells, they don't infect mice and you see reduced virulence. Injection with these cells effectively immunises the mice, in the same way that heat-killed yeast cells would. The data is fairly sound and mostly well-presented, and the paper is easy to follow. However, I feel the data is an incremental advance at best, and the immune analysis in the paper is very basic and descriptive.

      Strengths:

      Detailed analysis of EDTA-treated yeast cells

      Weaknesses:

      • Basic immune data with little advance in knowledge.

      • No comparison between their whole-cell vaccine and others tried in the field.

      • The data is largely unsurprising and not novel.

      Reply: Thank you so much for appreciating our effort to generate a whole-cell anti-fungal vaccine by treating C. albicans cells with EDTA. We also appreciate your comment that the manuscript is sound and well-presented. However, we are afraid that the respected reviewer assumed the CAET cells to be dead cells, whereas they only divide more slowly than the untreated cells. In the revised manuscript, we have presented additional evidence to show that CAET are live cells (Supp. Fig. 2) and, based on the new data, we expect a positive change in the reviewer’s opinion. Since CAET is a live strain, the data presented here is novel.

      Reviewer #2 (Public Review):

      Summary:

      Invasive fungal infections are very difficult to treat with limited drug options. With the increasing concern of drug resistance, developing an antifungal vaccine is a high priority. In this study, the authors studied the metal metabolism in Candida albicans by testing some chelators, including EDTA, to block the metal acquisition and metabolism by the fungus. Interestingly, they found that EDTA-treated yeast cells grew poorly in vitro and were non-pathogenic in vivo in a murine model. Mice immunized with EDTA-treated Candida (CAET) were protected against challenge with wild-type Candida cells. RNA-Seq analysis to survey the gene expression profile in response to EDTA treatment in vitro revealed upregulation of genes in metal homeostasis and downregulation of ribosome biogenesis. They also revealed an induction of both pro- and anti-inflammatory cytokines involved in Th1, Th2 and Th17 host immune response in response to CAET immunization. Overall, this is an interesting study with translational potential.

      Strengths:

      The main strength of the report is that the authors identified a potential whole-cell live vaccine strain that can provide full protection against candidiasis. Abundant data both on in vitro phenotype, gene expression profile, and host immune response have been presented.

      Weaknesses:

      A weakness is that the immune mechanism of CAET-mediated host protection remains unclear. The immune data is somewhat confusing. The authors only checked cytokines and chemokines in blood. The immune response in infected tissues and antibody response may be investigated.

      Reply: Thank you very much for appreciating our work and finding our strain to be a live whole-cell anti-fungal vaccine strain with translational potential. Since the current study focused on the identification and detailed characterizations of a non-genetically modified live-attenuated strain and determination of its safety and efficacy as a potential vaccine candidate in the preclinical model, we have excluded the possible immune mechanisms involving CAET. In a separate study, we are currently investigating both cellular and molecular mechanisms that provide protective immunity in CAET-vaccinated mice.

      Reviewer #3 (Public Review):

      Summary:

      The authors are trying to find a vaccine solution for invasive candidiasis.

      Strengths:

      The testing of the antifungal activity of EDTA on Candida is not new as many other papers have examined this effect. The novelty here is the use of this EDTA-treated strain as a vaccine to protect against a secondary challenge with wild-type Candida.

      Weaknesses:

      However, data presented in Figure 5 and Figure 6 are not convincing and need further experimental controls and analysis as the authors do not show a time-dependent effect on the CFU of their vaccine formulation. The methodology used is also an issue. As it stands, the impact is minor.

      Reply: Thank you so much for appreciating our efforts to develop a novel vaccine against fungal infections. We are extremely sorry for the lack of clarity in our writing related to Figs. 5 and 6; we have now modified the text and hope that the respected reviewer will find these convincing.

      Recommendations for the authors:

      Although the reviewers recognize the importance of the manuscript, they would like to see: 1) comparisons between their whole-cell vaccine and others tried in the field, 2) an investigation of the immune response in infected tissues and antibody response, and 3) more controls in Figures 5 and 6, and a time-dependent effect on the colony-forming units of their vaccine formulation. Please, address the questions and submit a revised version together with a rebuttal letter addressing point-by-point raised by each reviewer.

      Reply: (1) We are afraid that a comparative study of live and heat-killed cell vaccines would obscure the information presented here. This is the only non-genetically modified antifungal vaccine candidate; therefore, a comparison with a dead strain at present is unwarranted. We have now added supporting data to confirm that the survivability of C. albicans cells was unaffected at 6 hr of EDTA treatment (CAET, Supp. Fig. S2). (2) Since the current study focused on the identification and a detailed characterization of a non-genetically modified live attenuated strain and its safety and efficacy as a potential vaccine candidate in the preclinical model, we have excluded the possible immune mechanisms involving CAET. However, in a separate study, we are currently investigating both cellular and molecular mechanisms that provide protective immunity in CAET-vaccinated mice. (3) The results of Figs 5 and 6 were misinterpreted by the respected reviewer; please see the explanation below.

      Reviewer #1 (Recommendations For The Authors):

      Some specific comments/suggestions for the authors: (1) What was the viability of the yeast after EDTA treatment? Is the delayed growth response because many cells died and it takes a while for remaining viable cells to catch up? This is important to know because it may mean the dose given to mice is substantially different and that should be accounted for. Some PI staining of the cells after treatment would help.

      Reply: The growth curve assays (Fig. 1A and 1E) were initiated with OD600nm = 0.5 for each culture (~10^7 cells/mL) and the analyses suggested that the EDTA-treated C. albicans cells grew slower than the untreated cells. Fig. 1B and 1F further demonstrated that EDTA has minimal effect on the survival of the strain up to 8 hrs post-exposure. The proportional increase in cell number without and with metal chelators remained almost the same for this duration (0 – 8 hrs). Therefore, for subsequent analyses, 6 hr treatment was selected and such treated cells were considered as CAET, which were actively dividing live cells, albeit slower than untreated cells. As suggested and to strengthen our finding, time-dependent SYTOX Green and propidium iodide staining of C. albicans cells without and with EDTA treatment was carried out and analysed by flow cytometry and microscopy, respectively. Both analyses revealed that the percentage of dead cells up to 12 hrs without and with EDTA treatment remained the same. The new data has now been added in the revised version of the manuscript as Supplementary figure 2.

      Author response image 1.

      (2) In line with the above, what was the viability of the CAET cells after 3h in media? In the macrophage in vitro experiments, how do you know the reduced viability of the CAET cells is macrophage-specific? Did you run a control of CAET cells in media on their own to determine how CFU changed in macrophage-free conditions? Are the proliferation rates of untreated and CAET cells different? That would affect CFSE labelling and results. These experiments would work better with a GFP-expressing C. albicans strain, which is widely available. In the images in Figure 4c, it looks like there are more hyphae in CAET than untreated - was hyphal induction checked/measured? That's important to know because more hyphae usually means more clumping and this can affect CFU counts (giving the impression of less CFU when actually there is more). Because of all the issues above, I'm not fully convinced by the uptake/killing data.

      Reply: As explained in response 1, we used actively dividing WT and CAET cells, and equal numbers of these cells were CFSE-labelled. As can be seen in Fig. 4A, the rate of phagocytosis was the same in 1 hr of pre-culture, but at the subsequent time points the double-positive cells were reduced in the case of CAET cells, and that is due to fungal killing by macrophages. Fungal cells were released from the macrophages by warm water treatment and CFU was determined. Fig. 4B suggested that at 1 hr of co-culture, the CFU of both fungal cells (WT and CAET) were the same and the fungal clearance was observed at later time points. Thus, the reduced viability of CAET cells was macrophage-specific. EDTA has minimal effect on hyphal transition without and with the presence of serum, and the new data has now been provided in the revised version (Supplementary Fig. 3).

      Author response image 2.

      (3) Pooled data should be shown for all animal experiments.

      Reply: Thank you for the suggestion. Wherever it was meaningful, pooled data for the animal experiments have now been provided.

      (4) Immune cell counts/analysis in the kidney and bone marrow would be hugely helpful and more relevant to understanding immune responses following immunisation/infection. I think a more interesting analysis for the authors to consider would be to immunise with heat-killed yeast vs EDTA-treated yeast and see if there is a qualitative difference or better protection, i.e. is the EDTA-treated whole-cell vaccine superior to the heat-killed version? That is a better question to address. As it stands, the data in the paper is not surprising.

      Reply: The studies on cellular and molecular mechanisms underlying protective immunity in CAET-vaccinated mice are in progress in a separate study. This study mostly focused on the identification and detailed characterization of a non-genetically modified live-attenuated strain and its safety and efficacy as a potential vaccine candidate in a preclinical model. We are afraid that a comparison of a live cell (CAET) with a dead cell (heat-killed) will dilute the content of the manuscript and will not be meaningful. It is well accepted that the heat-killed C. albicans strain only provides partial, short-lived protection to re-challenge (Refs-PMIDs: 12146759 and 9916097); thus, it does not warrant any comparison with CAET.

      Reviewer #2 (Recommendations For The Authors):

      Overall, this is a highly interesting study. I have the following specific comments for clarification.

      (1) In the introduction, the authors mentioned other anti-candida vaccines that are mostly effective against Candida infection by inducing neutralizing antibodies. However, in their CAET vaccine candidate, they only checked the cellular immunity in blood and found a balanced immune response (both pro- and anti-inflammatory responses are induced). How about the antibody production in these mice? It is a bit surprising that both untreated Candida infection and CAET Candida infection produced similar immune activation based on Figure 6, yet the CAET immunization provides protection. Some innate cell recruitment is higher in untreated Ca infection than the CAET infected mice (Figure 5F). The overall results on immune response characterization did not seem to explain why the CAET infection led to host protection while untreated Ca infection cannot. Characterizing infected tissue immune cell differentiation and cytokine production may offer some additional insights.

      Reply: We agree with you that in this manuscript we have not provided any mechanistic study on the protective immunity in CAET-vaccinated mice. This will be demonstrated in a subsequent study.

      (2) In Figure 5, some critical data seem to be missing in panels B and C. The CFU and histopathological images for CAET-treated mice challenged by Ca should also be shown there for comparison. Although they did show some data in Figure 5E and Figure S4, it is necessary to have that data in 5B and 5C from the same experiment. Figure S4 is a very busy figure and the images are quite small. It may be necessary to use arrows to point out what information authors want to emphasize.

      Reply: Fig. 5B and 5C showed the data for mice that succumbed to infection. Since the other mice (saline control groups, CAET infected, CAET vaccinated, and re-challenged groups) survived, they were not sacrificed; therefore, the CFU data was not collected. In addition, we wanted to see the longevity of these surviving mice, and after 1 year of observation they were handed over to the animal house for clearance as per the institutional guidelines. However, Figure 5E and Figure S4 (now Fig. S6) included all the mice groups, as they were sacrificed at various time points irrespective of humane end points. As suggested, Fig. S6 has now been modified and fungal cells are denoted by yellow arrows.

      (3) EDTA-treated yeast cells showed poor growth but also had thicker cell walls with high chitin, glucan, and mannan levels. What leads to its clearance in vivo remains unclear, as usually, cells with thick cell wall structures and low metabolism are more resistant to stress, e.g., dormant cells. Macrophages were shown to contribute to CAET killing in a phagocytosis assay (Figure 4). Checking cytokines produced by macrophages during co-incubation may offer some insights. In all, additional discussion on what caused in vivo clearance would be helpful.

      Reply: Mechanistic study on the protective immune responses of CAET will be demonstrated in a separate study. As suggested, the discussion section now contains additional information emphasising the in vivo clearance of CAET cells in the 3rd paragraph of discussion section.

      (4) Long paragraphs in the discussion section could be divided into a bigger number of shorter paragraphs.

      Reply: Thank you for the suggestion, it has now been modified in the revised version (7 short paragraphs). To make it more comprehensive, some of the content has been removed.

      Reviewer #3 (Recommendations For The Authors):

      (1) It is unclear how many cells were treated with 250 micromolar of EDTA for 6 hours before preparing the inoculum. It seems that only the OD was measured before adding EDTA. This is not a very rigorous and reproducible method.

      Reply: In this manuscript, we have repeatedly used the same protocol to generate CAET cells for various analyses. The O.D.600nm = 0.5 culture is equivalent to 10^7 C. albicans cells per mL and this information has now been added in the revised manuscript.

      (2) Upon treatment with 250 micromolar of EDTA, cells were harvested and counted to prepare the inoculum (5x10^5) for injecting it in mice. However, it appears that CFU of the inoculum was not done. Based on data shown in Fig. 1B, 250 micromolar of EDTA does inhibit Candida cell replication. Thus, the authors may have counted dead cells and, thus, injected dead cells together with live cells for the CAET inoculum. Thus, mice receiving this inoculum may have been infected (and vaccinated) with a lower number of live Candida cells.

      Reply: Please see a similar response to reviewer #1. EDTA has minimal effect on the survival of C. albicans cells at 6 hr (also see supp. Fig. S2). We have already mentioned the CFU analysis of untreated and CAET cells in the methodology section related to inoculum preparation.

      (3) It is unclear if 6 hours of treatment with 250 micromolar of EDTA is enough to induce a block of Candida cell replication. In Figure 1B, the authors treated for 24h. The authors are encouraged to wash the cells after 6 hours of treatment and see if their cell division will recover upon removal of EDTA.

      Reply: Thank you for the suggestion. At 6 hr treatment, survivability of C. albicans cells was unaffected upon EDTA exposure. PI and SYTOX GREEN staining confirmed it (Supp. Fig. 2). Additionally, as suggested a rescue experiment was carried out by exogenous addition of divalent metals after 6 hr EDTA treatment and growth/CFU analyses were followed thereafter. A modified Fig. 1 A and B with new data has been provided.

      (4) The data shown in Figure 5A is extremely exciting. However, the number of mice in each group (n=6) is too low. Normally, 10 mice per group are used for virulence studies unless the authors provide a power analysis that 6 mice per group will be sufficient. Also, CFU data were only provided for Ca and saline-Ca groups (Fig. 5B) and not for the other groups. CFU data should be provided for all mice.

      Reply: Thank you for the suggestion and a statistical analysis of Fig. 5A was provided in the revised version. The rationale behind not including all mice groups in Fig. 5B is already explained in a response to reviewer #2.

      (5) It is unclear how the authors differentiate between CFU arising from CAET or from WT Candida.

      Reply: Since Fig. 5E demonstrated that no CAET cells were detected in the kidney beyond 10 days of inoculation, the fungal cells detected on the 3rd and 7th days in the re-challenged mice group (1CAET 2 Ca) came from the later-inoculated cells (brown colour).

      (6) Figure 5E: it is unclear if a 1 saline-2 saline (Figure legend) or if 1 saline-2 Ca (text) group was included. If the latter, where are the CFU? It is impossible that 1 saline-2 Ca mice have no CFU.

      Reply: Thank you so much for pointing this out. The legend has now been modified to include 1saline-2saline and 1CAET-2Ca.

      (7) It seems that CFU is significantly present in the kidney in the 1 CAET - 2 Ca group at day 7 but not at day 3. How is this possible? This is an extremely invasive model of infection, and the authors are challenging intravenously 500,000 live Candida cells. If by the 3rd day, the authors detect no CFU, then how is it possible that CFUs are arising on day 7?

      Reply: We do detect fungal cells on the 3rd day in the 1CAET 2 WT mice group (~2000 cells), albeit far fewer than on the 7th day (~11200 cells). A Log10 scale graph has now been provided for better representation.
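
      To illustrate why a Log10 axis helps here, the two approximate counts quoted in this reply can be converted as in the minimal sketch below. Only the two counts come from the response; the variable and function names are hypothetical and this is not the authors' analysis code.

```python
import math

# Approximate fungal cell counts quoted in the reply, keyed by day post-challenge.
cfu_by_day = {3: 2000, 7: 11200}


def to_log10(cfu):
    """Convert a positive CFU count to log10 units (hypothetical helper)."""
    if cfu <= 0:
        raise ValueError("log10 is undefined for non-positive counts")
    return math.log10(cfu)


# Log-transformed counts and the raw fold change between the two time points.
log_cfu = {day: to_log10(count) for day, count in cfu_by_day.items()}
fold_change = cfu_by_day[7] / cfu_by_day[3]

for day in sorted(log_cfu):
    print(f"day {day}: {cfu_by_day[day]} cells, log10 = {log_cfu[day]:.2f}")
print(f"day 7 / day 3 fold change: {fold_change:.1f}")
```

      On a linear axis scaled to the day 7 value, the day 3 count sits close to zero, whereas on the Log10 axis the two values differ by less than one unit and both are clearly visible.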

      (8) Most importantly, if the authors are not detecting CFU at day 3, then earlier time points (e.g. day 2, day 1, or even 12 hours post-challenge) must be analyzed. The authors should show that CFU from the organs is decreasing in a time-dependent manner. Also, all CFU should be shown as Log10.

      Reply: Please see the previous response.

      (9) Fig. 6: because it is unclear if the mice were challenged with the same inoculum of live Candida cells (untreated and treated with EDTA), the different cytokine profiles between the two groups could be simply due to the different inoculum sizes and not to the effect of EDTA on Ca.

      Reply: Please see the earlier response given to Reviewer #1.

    2. eLife assessment

      This study presents a useful strategy in which the authors devised a simple method to attenuate Candida albicans and deliver a live whole-cell vaccine in a mouse model of systemic candidiasis. The reviewers are not convinced about the completeness of the study: the strength of the evidence is incomplete and could be augmented with additional experiments to more fully characterize vaccine efficacy and host immune responses.

    3. Reviewer #2 (Public Review):

      Summary:

      Invasive fungal infections are very difficult to treat, and drug options are limited. With increasing concern about drug resistance, developing an antifungal vaccine is a high priority. In this study, the authors studied metal metabolism in Candida albicans by testing some chelators, including EDTA, to block metal acquisition and metabolism by the fungus. Interestingly, they found that EDTA-treated yeast cells grew poorly in vitro and were non-pathogenic in vivo in a murine model. Mice immunized with EDTA-treated Candida (CAET) were protected against challenge with wild-type Candida cells. RNA-Seq analysis surveying the gene expression profile in response to EDTA treatment in vitro revealed upregulation of genes in metal homeostasis and downregulation of ribosome biogenesis. They also revealed an induction of both pro- and anti-inflammatory cytokines involved in Th1, Th2 and Th17 host immune responses following CAET immunization. Overall, this is an interesting study with translational potential.

      Strengths:

      The main strength of the report is that the authors identified a potential whole-cell live vaccine strain that can provide full protection against candidiasis. Abundant data on in vitro phenotype, gene expression profile, and host immune response have been presented.

      Weaknesses:

      A weakness is that the immune mechanism of CAET-mediated host protection remains unclear. The immune data is somewhat confusing. The authors only checked cytokines and chemokines in blood. The immune response in infected tissues and the antibody response could also be investigated.

      Another potential concern is that live wild-type Candida cells treated with EDTA may still have a chance to evolve and become infectious, considering that these treated cells still proliferate in vivo. Some of the gene regulation profiles may be transient and subject to reversal, adding to the safety concern.

    4. Reviewer #3 (Public Review):

      Summary:

      The authors are trying to find a vaccine solution for invasive candidiasis.

      Strengths:

      The testing of the antifungal activity of EDTA on Candida is not new, as many other papers have examined this effect. The novelty here lies in the use of such an EDTA-treated strain as a vaccine to protect against a secondary challenge with wild-type Candida.

      Weaknesses:

      However, data presented in Fig. 5 and in Fig. 6 are not convincing and need further experimental controls and analysis as the authors do not show a time-dependent effect on the CFU of their vaccine formulation. Specific points are below.

      Methodology used is also an issue. As it stands, the impact is minor, if any.

      Comments on revised version:

      The data provided in the revised paper are simply not satisfactory and do not give confidence that a rigorous design and methodologies were used to obtain the results illustrated in this paper.

    1. eLife assessment

      This valuable study assesses through simulations how several known features of local cortical circuits - interneuron subtypes, their specific targeting of dendritic compartments, and certain brain rhythms - together affect the integration of synaptic inputs by a pyramidal cell into a spiking output signal. Employing several carefully considered simulation setups they convincingly demonstrate that beta rhythms are best suited to modulate and control dendritic Ca-spikes while gamma rhythms affect their coupling to somatic spiking, or how basal inputs are directly integrated into somatic spikes. However, the baseline setup may be idealized for the generation of the events in question and it would be beneficial if the similarity to the in-vivo activity regime was demonstrated further. The results will be relevant for neuroscientists studying local circuits or developing more abstract theories at the systems level.

    2. Reviewer #1 (Public Review):

      In this study, the authors explore the implications of two types of rhythmic inhibition - "gamma" (30-80 Hz) and "beta" (13-30 Hz) - for synaptic integration. They study this in a multi-compartmental model L5 pyramidal neuron with Poisson excitation and rhythmic inhibition (16 Hz and 64 Hz), applied either to the perisomatic or apical tuft regions in the neuron. They find that 64 Hz inhibition applied to the cell body is effective in phasic modulation of AP generation, while 16 Hz inhibition applied to the apical tufts is effective in phasic modulation of dendritic spikes (in addition to APs). Switching the location of the two kinds of rhythmic inhibition reduces the overall excitability, but it does not phasically modulate dendritic spikes and only weakly modulates somatic APs.
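
      As a rough illustration of what rhythmic input at the two frequencies named above looks like, the sketch below draws spike times from a sinusoidally modulated Poisson process by thinning. It is not the authors' model: the mean rate, modulation depth, duration, seed, and all function names are assumptions chosen for illustration, and only the 16 Hz and 64 Hz values come from the summary.

```python
import math
import random


def rhythmic_rate(t, freq_hz, mean_rate=10.0, depth=1.0):
    """Instantaneous rate (spikes/s) at time t (s) for a sinusoidally modulated input.

    mean_rate and depth (0 to 1) are illustrative assumptions.
    """
    return mean_rate * (1.0 + depth * math.sin(2.0 * math.pi * freq_hz * t))


def rhythmic_poisson_train(freq_hz, duration=1.0, mean_rate=10.0, depth=1.0, seed=0):
    """Sample spike times from an inhomogeneous Poisson process by thinning."""
    rng = random.Random(seed)
    peak = mean_rate * (1.0 + depth)  # upper bound on the instantaneous rate
    t, spikes = 0.0, []
    while True:
        # Candidate events arrive as a homogeneous Poisson process at the peak rate.
        t += rng.expovariate(peak)
        if t >= duration:
            break
        # Keep each candidate with probability rate(t) / peak.
        if rng.random() * peak < rhythmic_rate(t, freq_hz, mean_rate, depth):
            spikes.append(t)
    return spikes


beta_train = rhythmic_poisson_train(16.0)   # "beta" frequency used in the study
gamma_train = rhythmic_poisson_train(64.0)  # "gamma" frequency used in the study
beta_period_ms = 1000.0 / 16.0
gamma_period_ms = 1000.0 / 64.0
print(f"beta period: {beta_period_ms} ms, gamma period: {gamma_period_ms} ms")
```

      The cycle lengths (62.5 ms at 16 Hz versus 15.625 ms at 64 Hz) make the timescale argument concrete: a slow event such as a dendritic spike can fit within one beta cycle but spans several gamma cycles.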

      Strengths:

      The effect of the timescale of rhythmic inhibition on synaptic integration is an interesting question, since a) rhythmic spiking is most strongly evident in inhibitory populations, and b) rhythmic spiking is modulated by behavioral states and the sensory environment. The methods are clear and the data are well-presented. The study systematically explores the effect of two frequencies of rhythmic inhibition in a biophysically detailed model. The study considers not only idealized rhythmic inhibition but also the bursty kind that is observed in in-vivo conditions. Both distributed and clustered excitatory synaptic organization are simulated, which covers the two extremes of the spatial organization of excitatory inputs in-vivo.

      Weaknesses:

      SOM+ interneurons such as Martinotti cells target the apical tufts of pyramidals in the cortex. Since interneurons in general are strongly implicated in mediating rhythmic population activity over a range of timescales, it is quite appropriate to study the consequence of rhythmic inhibition provided by SOM+ interneurons for synaptic integration, including the phenomenon of dendritic spikes. However, using conclusions from a singular study (ref 22) to identify the beta band as the rhythm mediated by SOM+ is not very accurate. SOM+ interneurons have been implicated in regulating rhythms centered just below 30 Hz (refs 22, 21). It is a range that lies in the grey zone of the traditional definition of beta and gamma. However, it is significantly higher than the 16 Hz rhythms explored in this study. It thus remains unknown how a 25-30 Hz rhythmic inhibition (that has an experimentally suggested role for dendrite targeting SOM+ INs) in apical tufts regulates dendritic spikes.

      Distal dendritic inhibition has been previously shown to be more effective in controlling dendritic spikes. However, given the slow timescale of dendritic spikes, it can be hypothesized that high-frequency rhythmic inhibition would be ineffective in entraining the dendritic spikes either in distal or proximal location, as demonstrated by 4H and 5F, and vice versa. A computational study can take this further by exploring the robustness of this hypothesis. By sticking to a single-frequency definition of what constitutes Gamma (64 Hz) and Beta (16 Hz) inhibition, the current exploration does support the core hypothesis. However, given the temporal dynamics of dendritic spikes, it is valuable to learn, for example, the upper bound of "Beta" range (13-30Hz) inhibition that fails to phasically modulate them. In addition to the reason stated in the earlier paragraph, Alpha band activity (8-12 Hz), has been implicated (e.g. van Kerkoerle, 2014) in signaling of inter-areal feedback to the superficial layer in the cortex, potentially targeting apical tufts of pyramidals from multiple layers and resulting in alpha-range rhythmic inhibition. To make the findings significant, it might therefore be more pertinent to understand the consequences of ~10Hz rhythmic inhibition (in addition to the ~25-30 Hz Beta/Gamma) in the apical tufts for phasic modulation of dendritic spikes.

      The differential effect of Gamma and Beta range inhibition on basal and apical excitatory clusters is not convincing from the information provided. The basal cluster appears to overlap with perisomatic inhibitory synapses. The description in the methods does not have enough information to negate the visual perception (ln 979-81). With this understanding, it is not surprising that the correlation between excitation and APs is high (during the trough of gamma) for basal and not apical excitation. A more comparable scenario would be a more distal location of the basal excitatory cluster.

    3. Reviewer #2 (Public Review):

      Summary:

      The manuscript illustrates how spatial targeting (perisomatic vs distal, apical, and basal dendritic) and timing of inhibition are crucial to distinct effects on neuronal integration and show that beta and gamma oscillations differentially engage dendritic spiking mechanisms.

      Strengths:

      The strength of this study lies in the integrative biophysical modelling of a layer 5 pyramidal neuron by bringing together in vitro and in vivo observations.

      Weaknesses:

      The weaknesses are probably in some of the parameterizations of inhibitory synaptic dynamics. A unitary peak conductance of 1nS is very high for inhibitory synapses. This high value could invariably skew some of the network-level predictions. The authors could obtain specific parameters from the Neocortical Collaboration Portal (https://bbp.epfl.ch/nmc-portal/microcircuit.html), which is an incredible resource for cortical neurons and synapses.

    4. Reviewer #3 (Public Review):

      Summary:

      The authors consider several known aspects of PV and SOM interneurons and tie them together into a coherent single-cell model that demonstrates how the aspects interact. These aspects are:

      (1) While SOM interneurons target distal parts of pyramidal cell dendrites, PV interneurons target perisomatic regions.

      (2) SOM interneurons are associated with beta rhythms, PV interneurons with gamma rhythms.

      (3) Clustered excitation on dendrites can trigger various forms of dendritic spikes independent of somatic spikes.

      The main finding is that SOM and PV interneurons are not simply associated with beta and gamma frequencies respectively, but that their ability to modulate the activity of a pyramidal cell "works best" at their assigned frequencies. For example, distally targeting SOM interneurons are ideally placed to precisely modulate dendritic Ca-spikes when their firing is modulated at beta frequencies or timed relative to excitatory inputs. Outside those activity regimes, not only is modulation weakened, but overall firing is reduced.

      Strengths:

      I think the greatest strength is the model itself. While the various individual findings were largely known or strongly expected, the model provides a coherent and quantitative picture of how they come together and interact.

      The paper also powerfully demonstrates that an established view of "subtractive" vs. "divisive" inhibition may be too soma-focused and provide an incomplete picture in cells with dendritic nonlinearities giving rise to a separate, non-somatic all-or-nothing mechanism (Ca-spike).

      Weaknesses:

      While the authors overall did an admirable job of simulating the neuron in an in-vivo-like activity regime, I think it still provides an idealized picture that is optimized for the generation of the types of events the authors were interested in. That is not a problem per se - studying a mechanism under idealized conditions is a great advantage of simulation techniques - but this should be more clearly characterized. Specifics on this are very detailed and will follow in the comments to authors.

      What disappointed me a bit was the lack of a concise summary of what we learned beyond the fact that beta and gamma act differently on dendritic integration. The individual paragraphs of the discussion often are 80% summary of existing theories and only a single vague statement about how the results in this study relate. I think a summarizing schematic or similar would help immensely.

      Orthogonal to that, there were some points where the authors could have offered more depth on specific features. For example, the authors summarized that their "results suggest that the timescales of these rhythms align with the specialized impacts of SOM and PV interneurons on neuronal integration". Here they could go deeper and try to explain why SOM impact is specialized at slower time scales. (I think their results provide enough for a speculative outlook.)

      Beyond that, the authors invite the community to reappraise the role of gamma and beta in coding. This idea seems to be hindered by the fact that I cannot find a mention of a release of the model used in this work. The base pyramidal cell model is of course available from the original study, but it would be helpful for follow-up work to release the complete setup, including excitatory and inhibitory synapses and their activation in the different simulation paradigms used, as well as the related code.

      Impact:

      Individually, most results were at least qualitatively known or at least expected. However, demonstrating that beta-modulation of dendritic events and gamma-modulation of soma spiking can work together, at the same time and in the same model can lead to highly valuable follow-up work. For example, by studying how top-down excitation onto apical compartments and bottom-up excitation on basal compartments interacts with the various rhythms; or what the impact of silencing of SOM neurons by VIP interneuron activation entails. But this requires - again - public release of the model and the code controlling the simulation setups.

      Beyond that, the authors clearly demonstrated that a single compartment, i.e., only a soma-focused view is too simple, at least when beta is considered. Conversely, the authors were able to describe the impact of most things related to the apical dendrite on somatic spiking as "going through" the Ca-spike mechanism. Therefore, the setup may serve as the basis of constraining simplified two-compartment models in the future.

    1. eLife assessment

      This valuable paper presents convincing evidence that changing the constraint of how long to stop at an intermediate target significantly influences the degree of coarticulation of two sequential reaching movements, as well as their response to mechanical perturbations. Using an optimal-control framework, the authors offer a normative explanation of how both co-articulated and separated sequential movement can be understood as an optimal solution to the task requirements.

    2. Reviewer #1 (Public Review):

      Summary:

      In this paper, Kalidindi and Crevecoeur ask why sequential movements are sometimes coarticulated. To answer this question, first, they modified a standard optimal controller to perform consecutive reaches to two targets (T1 and T2). They investigated the optimal solution with and without a constraint on the endpoint's velocity at the via target (T1). They observed that the controller coarticulates the movements only when there is no constraint on the speed at the via-point. They characterized coarticulation in two ways: First, T2 affected the curvature of the first reach in unperturbed reaches. Second, T2 affected corrective movements in response to a mechanical perturbation of the first reach.

      Parallel to the modeling work, they ran the same experiment on human participants. The participants were instructed to either consider T1 as via point (go task) or to slow down in T1 and then continue to T2 (stop task). Mirroring the simulation results, they observed coarticulation only in the go task. Interestingly, in the go task, when the initial reach was occasionally perturbed, the long-latency feedback responses differed for different T2 targets, suggesting that the information about the final target was already present in the motor circuits that mediate the long-latency response. In summary, they conclude that coarticulation in sequential tasks depends on instruction, and when coarticulation happens, the corrections in earlier segments of movement reflect the entirety of the coarticulated sequence.

      Evaluation

      Among many strengths of this paper, most notably, the results and the experiment design are grounded in, and guided by the optimal control simulation. The methods and procedures are appropriate and standard. The results and methods are explained sufficiently and the paper is written clearly. The results on modulation of long-latency response based on future goals are interesting and of broad interest for future experiments on motor control in sequential movement. However, I find the authors' framing of these results, mostly in the introduction section, somewhat complicated.

      The current version of the introduction motivates the study by suggesting that "coarticulation and separation of sub-movement [in sequential movements] have been formulated as distinct hypotheses" and that this apparent distinction, which led to contradictory results, can be resolved by the Optimal Feedback Control (OFC) framework, in which task-optimized control gains control coarticulation. This framing seems complicated for two main reasons. First, the authors use chunking and coarticulation interchangeably. However, as originally proposed by (Miller 1956), the chunking of the sequence items may fully occur at an abstract level like working memory, with no motoric coarticulation of sequence elements at the level of motor execution. In this scenario, sequence production will be faster due to the proactive preparation of sequence elements. This simple dissociation between chunking and coarticulation may already explain the apparent contradiction between the previous works mentioned in the introduction section. Second, the authors propose the OFC as a novel approach for studying neural correlates of sequence production. While I agree that OFC simulations can be highly insightful as a normative model for understanding the importance of sequence elements, it is unclear to me how OFCs can generate new hypotheses regarding the neural implementation of sequential movements. For instance, if the control gains are summarizing the instruction of the task and the relevance of future targets, it is unclear in which brain areas, or how, these control gains are implemented. I believe the manuscript will benefit from making these points clearer in the introduction and the discussion sections.

    3. Reviewer #2 (Public Review):

      Summary:

      In this manuscript, the authors examine the question of whether discrete action sequences and coarticulated continuous sequential actions can be produced from the same controller, without having to derive separate control policies for each sequential movement. Using modeling and behavioral experiments, the authors demonstrate that this is indeed possible if the constraints of the policy are appropriately specified. These results are of interest to those interested in motor sequences, but it is unclear whether these findings can be interpreted to apply to the control of sequences more broadly (see weaknesses below).

      Strengths:

      The authors provide an interesting and novel extension of the stochastic optimal control model to demonstrate how different temporal constraints can lead to either individual or coarticulated movements. The authors use this model to make predictions about patterns of behavior (e.g., in response to perturbations), which they then demonstrate in human participants both by measuring movement kinematics as well as EMG. Together this work supports the authors' primary claims regarding how changes in task instructions (i.e., task constraints) can result in coarticulated or separated movement sequences and the extent to which the subsequent movement goal affects the planning and control of the previous movement.

      Weaknesses:

      I reviewed a prior version of this manuscript, and appreciate the authors addressing many of my previous comments. However, there are some concerns, particularly with regard to how the authors interpret their findings.

      (1) It would be helpful for the authors to discuss whether they think there is a fundamental distinction between a coarticulated sequence and a single movement passing through a via point (or equivalently, avoiding an obstacle). The notion of a coarticulated sequence brings with it the notion of sequential (sub)movements and temporal structure, whereas the latter can be treated as more of a constraint on the production of a single continuous movement. If I am interpreting the authors' findings correctly it seems they are suggesting that these are not truly different kinds of movements at the level of a control policy, but it would be helpful for the authors to clarify this claim.

      (2) The authors' model clearly shows that each subsequent target only influences the movement of one target back, but not earlier ones (page 7 lines 199-204). This stands in contrast to the paper they cite from Kashefi 2023, in which those authors clearly show that people account for at least 2 targets in the future when planning/executing the current movement. It would be useful to know whether this distinction arises because of a difference in experimental methodology, or because the model is not capturing something about human behavior.

      (3) In my prior review I raised a concern that the authors seem to be claiming that because they can use a single control policy for both coarticulated and separated movement sequences, there need not be any higher-level or explicit specification of whether the movements are sequential. While much of that language has been removed, it still appears in a few places (e.g., p. 13, lines 403-404). As previously noted, the authors' control policy can generate both types of movements as long as the proper constraints are provided to the model. However, these constraints must be specified somewhere (potentially explicitly, as the authors do by providing them as task instructions). Moreover, in typical sequence tasks, although some movements become coarticulated, people also tend to form chunks with distinct chunk boundaries, which presumably means that there is at least some specification of the sequential ordering of these chunks that must exist (otherwise the authors' model might suggest that people can coarticulate forever without needing to exhibit any chunk boundaries). Hence the authors should limit themselves to the narrow claim that a single control policy can lead to separated or coarticulated movements given an appropriate set of constraints, but acknowledge that their work cannot speak to where or how those constraints are specified in humans (i.e., that there could still be an explicit sequence representation guiding coarticulation).

    1. Author response:

      Reviewer #1

      […] it seems that the readout units are not operating in continuous time, and that interval discrimination relies in part on external information. Specifically, the readout units only look at the spike counts during the window delta_t_w.

      In the first version of the review, the reviewer implied that each readout unit only receives input during a small window around the interval it represents. However, this is not the case. The small window that is depicted in Fig. 16 is a sliding window that is used to compute the states (i.e., an estimate of the instantaneous firing rate) at each point in time. The fact that the readout units do indeed operate in continuous time is apparent from Fig. 2A, which shows the activity of all output units as a function of time: there is gradually changing activity with a peak at the represented interval. If each unit only received input during a window of a couple of milliseconds, there would be a single peak of activity at the represented interval, and near-zero activity at any other time.

      This misunderstanding has been cleared up in the current version of the review (see last paragraph of review #1).
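
      As an illustration of the sliding-window state estimate described above, the following minimal sketch computes a firing-rate estimate at successive time points from a list of spike times. It is not the authors' code: the function name, the causal (backward-looking) window, and the example window and step sizes are all assumptions chosen for illustration.

```python
def sliding_window_rates(spike_times_ms, window_ms, step_ms, t_end_ms):
    """Estimate firing rate (Hz) in a causal window that slides over a spike train.

    Hypothetical helper for illustration only. Each estimate counts the spikes
    in the half-open interval (t - window_ms, t] and converts the count to Hz.
    Returns a list of (t, rate_hz) pairs, one per window position.
    """
    rates = []
    t = window_ms
    while t <= t_end_ms:
        n_spikes = sum(1 for s in spike_times_ms if t - window_ms < s <= t)
        rates.append((t, 1000.0 * n_spikes / window_ms))
        t += step_ms
    return rates


# Toy spike train (ms); the values are arbitrary and not taken from the paper.
example_spikes = [5, 12, 18, 40]
example_rates = sliding_window_rates(example_spikes, window_ms=25, step_ms=25, t_end_ms=50)
```

      Because the window is moved across the whole trial, a readout unit that receives these states sees a value at every time step, not only near the interval it represents.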

      Stimulus onset occurs at 1500 ms in order to allow the network to stabilize. Ideally, this value should be randomized across trials to ensure performance generalizes across initial states.

      This is a valid point which we will address in the revision. However, we note that experimentation with different onset values did not change the dynamics of the network systematically in previous studies (i.e., Hass et al., 2022).

      Why does StDev saturate? Is that because subjective time saturates as well?

      Indeed, the two phenomena are closely related. In section “Deviations from the scalar property and the origin of Vierordt’s law”, we discuss that both are caused by the broadening of the tuning curves of the readout units (Fig 1A) as the longest time constants of the network are exceeded.

      In the discussion, it would be nice to explain that dopaminergic modulation of subjective timing is not as universally observed as the linear psychophysical law or the scalar property, and I believe somewhat controversial (e.g., Ward, ..., Balsam, 2009).

      We are thankful for this advice and will adapt the discussion accordingly in the revision. Still, we note that dopaminergic modulation of subjective timing is one of the more robust effects observed in several time perception experiments.

      Reviewer #2:

      (1) Lack of Empirical Data: […] The paper would benefit from quantitative and qualitative simulations of results from specific, large-sample studies to anchor the model's predictions in concrete empirical evidence.

      While it is correct that this study does not attempt to replicate a concrete empirical study, we note that we do compare the model's results with specific studies wherever possible. The comparison is done on the level of parameters of functional relationships: For the linear psychophysical law, we compare the slope and the indifference point of the model with those from experimental studies. For the scalar property, we compare the Weber fraction of the model to those computed from experiments. For dopaminergic modulation of subjective duration, no direct comparison with experimental data is possible, as the levels of modulation are estimated from in vitro experiments and cannot be directly compared with modulations in vivo. However, we discuss a range of qualitative observations in experiments that are reproduced (and explained) by the model.

      The above arguments notwithstanding, one can discuss whether the presentation of the experimental results and the comparison with the simulations is appropriate, and we do plan to extend this presentation in a revision.

      (2) Methodological Ambiguities: The training and testing procedures lack robust checks for generalization, leading to potential overfitting issues.

      It is correct that formal checks for generalization, such as cross-validation protocols, are missing, and we will include them in the revision. However, as we obtained a mechanistic understanding of how the model tells time, we are confident that our results are not due to overfitting.

      (3) Inadequate Visualization of Empirical Data: References to empirical data are vague and not directly visualized alongside model outputs. Future iterations should include empirical data, not general trends from psychophysics, in figures for a clear comparison.

      As mentioned above, the comparison between simulation and empirical data will be extended in a revision. However, we argue that the “general trends”, namely adherence of the model to the often-reported psychophysical regularities, are of greater importance compared to the replication of, e.g. one specific slope of the linear psychophysical law, which does vary a lot between experiments.

      (4) Limitations in Model Scope and Dynamics: […] Expanding the model limitations to consider isochronous pulse processing and the emergence of limit-cycle behaviors after prolonged stimulation would provide a more comprehensive understanding of the model's capabilities and limitations.

      The current research focuses on the estimation of a single duration rather than the processing of sequences of durations. Sequence processing is a vast field, and it has been argued that it comprises different mechanisms compared to duration estimation. Thus, we feel that including sequence processing would be beyond the scope of the already quite extensive paper. However, we will discuss a possible extension of the model to sequence processing in the revision.

      Additionally, the justification for using \(N_{Poisson}\) as a proxy for more connections is unclear and warrants a more direct approach.

      We considered different means to vary the noise input into the network, including changes in the number of connections. We ultimately chose to vary the firing rate of a fixed number of Poisson input neurons. As the sum of the firing rates of N independent Poisson neurons with the same f is simply N*f and the synaptic contributions from each spike also linearly add up, this is equivalent to adding more Poisson neurons and thus, more connections.
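
      The equivalence invoked here (scaling the rate of a fixed pool of independent Poisson inputs versus adding more inputs at the original rate) can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' simulation code: the function names, pool size, rates, and trial count are hypothetical, and it compares only pooled spike counts, not synaptic currents.

```python
import random


def poisson_spike_count(rate_hz, duration_s, rng):
    """Count spikes of a homogeneous Poisson process over duration_s seconds.

    Spikes are generated by accumulating exponential inter-spike intervals.
    """
    t, count = 0.0, 0
    while True:
        t += rng.expovariate(rate_hz)
        if t > duration_s:
            return count
        count += 1


def pooled_count(n_neurons, rate_hz, duration_s, rng):
    """Total spike count summed over n_neurons independent Poisson neurons."""
    return sum(poisson_spike_count(rate_hz, duration_s, rng) for _ in range(n_neurons))


def mean_and_variance(values):
    m = sum(values) / len(values)
    v = sum((x - m) ** 2 for x in values) / (len(values) - 1)
    return m, v


rng = random.Random(0)                     # fixed seed so the run is repeatable
n, f, k, T, trials = 10, 5.0, 3, 1.0, 1000  # arbitrary illustrative values

# Option A: keep n neurons, multiply each neuron's rate by k.
scaled_rate = [pooled_count(n, k * f, T, rng) for _ in range(trials)]
# Option B: keep rate f, use k times as many neurons.
more_neurons = [pooled_count(k * n, f, T, rng) for _ in range(trials)]

mean_a, var_a = mean_and_variance(scaled_rate)
mean_b, var_b = mean_and_variance(more_neurons)
expected = k * n * f * T                    # both options should give a Poisson count with this mean
```

      In both cases the pooled count is Poisson with mean and variance close to k*n*f*T, which is the sense in which the two manipulations are interchangeable for independent inputs with linearly summing synaptic contributions.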

      (5) Omissions and Redundancies: Certain omissions, such as the lack of a condition in Figure 7A or missing references to relevant models and reviews, detract from the paper's thoroughness.

      The reviewer refers to a condition where everything is ablated except NMDA. We will include such a condition in the revision. Regarding missing references, the reviewer requests including references that focus on sequence processing. While the focus of the current work is on estimating a single duration rather than a sequence of durations (see above), we will include a review on this topic as an outlook on this possible extension of the model.

      Moreover, some statements and terms like "internal clock" are used without a clear mechanistic definition within the model.

      We are thankful for this advice and will adapt the revision accordingly.

    2. Reviewer #1 (Public Review):

      Summary:

      This paper addresses the important question of the neural mechanisms underlying interval discrimination. The authors develop a detailed and biologically plausible model based on a previously proposed theory of timing. The model proposes that the interval between two stimuli can be encoded in the state of the neuronal and synaptic properties, specifically those with time constants on the order of hundreds of milliseconds, such as short-term synaptic plasticity and GABAb currents. Based on biological parameters in the PFC the authors show that the model can account for interval discrimination for up to 750 ms. Furthermore, the model accounts for three well-established psychophysical properties of interval timing: the linear relation between objective and neural time, the scalar property/Weber's law, and dopaminergic modulation of timing (although this property is less robust). Of particular novelty is the demonstration of Weber's law, and an explanation of how many complex and nonlinear neuronal properties produce a linear relationship between the standard deviation of interval estimates and their mean.

      This is an interesting paper that addresses a significant gap in the field. However, I have one major concern. As I understood the methods (and I may have misunderstood) it seems that the readout units are not operating in continuous time, and that interval discrimination relies in part on external information. Specifically, the readout units only look at the spike counts during the window delta_t_w. Thus, discrimination between 100 and 200 ms looks only at the spikes at 120-145 and 220-245, respectively, meaning that the experimenters are providing interval information for the readout of the intervals being discriminated. If this is indeed the case the model is fairly limited in biological plausibility and significantly dampens my enthusiasm for the paper.

      Stimulus onset occurs at 1500 ms in order to allow the network to stabilize. Ideally, this value should be randomized across trials to ensure performance generalizes across initial states.

      Why does StDev saturate? Is that because subjective time saturates as well?

      The model captures the effect of D2 receptors observed in some timing studies, specifically that D2 receptor activation increases "clock" speed. In the discussion, it would be nice to explain that dopaminergic modulation of subjective timing is not as universally observed as the linear psychophysical law or the scalar property, and I believe somewhat controversial (e.g., Ward, ..., Balsam, 2009).

      (NB: Regarding my potential concern that the decoding was performed in discontinuous time, the authors have clarified that decoding was done in continuous time--i.e., each output unit was trained to respond to a given time bin of the target interval but exposed to all time bins of all intervals during testing. This confirms the robustness of their decoding procedure and model.)

    3. eLife assessment

      This useful paper explores a mathematical model of subsecond time perception, testing potential neural mechanisms behind the linear psychophysical law, Weber's law, and dopaminergic modulation of subjective durations. The model employed readout units to decode an interval. Nevertheless, the work is incomplete and presented as data-driven, but there is no analysis of empirical data.

    4. Reviewer #2 (Public Review):

      Summary:

      The paper explores a mathematical model of subsecond time perception, engaging with established theories such as the linear psychophysical law, Weber's law, and dopaminergic modulation of subjective durations. While it ambitiously attempts to confirm specific mechanisms of time perception and presents a comprehensive description of these mechanisms, the work is presented as data-driven but its empirical backing and model generalization capabilities are questionable. The title's implication of a robust empirical foundation is misleading, as the main figures do not reflect empirical data directly but rather model outputs aligned with general trends in psychophysical studies. This disjunction raises concerns about the model's applicability and the strength of the claims made regarding time perception mechanisms.

      Strengths:

      (1) The paper describes specific mechanisms of time perception, providing a theoretical examination of linear psychophysical law, Weber's law, and dopaminergic modulation. This aspect is valuable for readers seeking a theoretical understanding of temporal perception.

      (2) The authors describe a range of psychophysical studies and theories, attempting to position their model within the broader scientific discourse on time perception.

      Weaknesses:

      (1) Lack of Empirical Data: Two things are conspicuously absent: 1) quantification of the error between model and empirical data, with an interpretation of what this degree of error means, and 2) clear comparisons between model and empirical data in all figures and tables to substantiate the model's predictions. The reliance on general trends rather than specific empirical studies undermines the strength and reliability of the model's claims. The paper would benefit from quantitative and qualitative simulations of results from specific, large-sample studies to anchor the model's predictions in concrete empirical evidence.

      (2) Methodological Ambiguities: The training and testing procedures lack robust checks for generalization, leading to potential overfitting issues. Clarifications are needed on whether and how the model reaches a steady state before stimulation and the implications of the chosen model time constants in the absence of stimulation. The overlap between training (50ms) and testing (25ms) steps and the implications for model generalization need validation with "traditional" parameter fitting protocols, such as formal model cross-validation across well-defined datasets and splits, as well as evaluations to understand and assess potential overfitting.

      (3) Inadequate Visualization of Empirical Data: References to empirical data are vague and not directly visualized alongside model outputs. Future iterations should include empirical data, not general trends from psychophysics, in figures for a clear comparison.

      (4) Limitations in Model Scope and Dynamics: The exploration of limitations is narrowly focused on interval length and noise. Expanding the model limitations to consider isochronous pulse processing and the emergence of limit-cycle behaviors after prolonged stimulation would provide a more comprehensive understanding of the model's capabilities and limitations. Additionally, the justification for using \(N_{Poisson}\) as a proxy for more connections is unclear and warrants a more direct approach. Adding more units to a truly data-driven model should be trivial.

      (5) Omissions and Redundancies: Certain omissions, such as the lack of a condition in Figure 7A or missing references to relevant models and reviews, detract from the paper's thoroughness. Moreover, some statements and terms like "internal clock" are used without a clear mechanistic definition within the model.

      Guidance for Readers

      Readers should approach this paper as a theoretical exploration into the mechanisms of subsecond-time perception. The model offers a detailed theoretical framework that engages with established laws and theories in time perception. However, it's crucial to note the model's reliance on general trends and its lack of direct empirical backing. The findings should be interpreted as a hypothesis-generating exercise rather than conclusive evidence.

    1. Reviewer #3 (Public Review):

      Summary:

      In this study the authors tested for alterations in selection intensity across ~13,000 protein coding genes along the gorilla lineage in order to test the hypothesis that the evolution of a polygynous social system resulted in relaxed selective constraint through a reduction in sperm competition. Of these genes, 578 exhibited signatures of relaxed purifying selection that were enriched for functions in male germ cells including meiosis and sperm biology. These genes were also more likely expressed in male germ cells and to contain deleterious mutations. Functional analysis of genes not previously implicated in male reproduction identified 41 new genes essential to male fertility in a Drosophila model. Moreover, genes under relaxed selective constraint in the gorilla lineage were more likely to contain loss of function variants in a cohort of infertile men. The authors conclude their results support the hypothesis that the emergence of a polygynous social system may have reduced the degree of selective pressures exerted through sperm competition.

      Strengths:

      (1) The identification of novel genes involved in spermatogenesis using signatures of relaxed selective constraint coupled to in vivo RNAi in Drosophila is very exciting and offers a proof of principle as to the power of evolutionarily-informed functional genomics that has been largely underutilized.

      Weaknesses:

      (1) The analysis is restricted to protein-coding regions of genes that have single, orthologous sequences spanning 261 mammalian species, and as such is a non-random set of 13,310 genes that have higher evolutionary conservation. While this approach is necessary for the analyses being performed, it excludes non-coding regions, recently duplicated genes/gene families, and rapidly evolving genes, which are all likely subject to stronger selection as compared to evolutionarily conserved genes (and gene regions). Thus, the conclusion that relaxed selective constraint is pervasive likely misses a large number of the most strongly selected genes, which have repeatedly been shown to include sex- and reproduction-related genes. Would the results be similar if the set of orthologous genes were restricted to the primate lineage, as it may include more rapidly evolving genes?

      (2) The identification of genes showing relaxed selection along the gorilla lineage, which are overrepresented in male reproduction, supports the hypothesis that the emergence of polygyny resulted in relaxed sperm competition and is the driving force behind their observations. However, there is no control group to support that polygyny is the driving force. To more fully test this hypothesis the authors should consider contrasting their findings to observations for other species whereby polygyny did not evolve (or a gradation between). Ideally this could be integrated into RELAX-Scan comparisons, but even a semi-qualitative observation could be made for lineages more often having shared signatures of relaxed constraint across the 576 genes identified in gorilla.

      (3) Comparing infertile human males to a large number of presumably healthy males from a separate cohort can reveal genetic differences that reflect population structure and/or differences in study recruitment rather than infertility, and care must be taken to avoid confounding in any association study before drawing conclusions. Population structure is likely to occur in human cohorts and is more likely to affect patterns of rare variation, even when controls are ascertained using similar enrollment criteria, geographic regions, racial/ethnic and national identities. In this study, the MERGE cohort upon a quick search appears to be largely recruited from Germany, whereas the control cohort, gnomAD, is a more cosmopolitan study including somewhat diverse ancestries. Thus, it is likely that the infertile and control cohorts have existing genetic differences unrelated to the phenotype.

    2. eLife assessment

      This important work reports that genome-wide patterns of relaxed purifying selection on genes involved in male fertility may represent a response to the reduced sperm competition in the gorillas' mating system. However, the evidence supporting the conclusion is incomplete and needs to be strengthened. This work will be of interest to researchers working on evolution and reproductive biology.

    3. Reviewer #1 (Public Review):

      This manuscript describes the pattern of relaxed selection observed at spermatogenesis genes in gorillas, presumably due to the low sperm competition associated with single-male polygyny. The analyses to detect patterns of selection are very thorough, as are the follow-up analyses to characterize the function of these genes. Furthermore, the authors take the extra steps of in vivo determination of function with a Drosophila model.

      This is an excellent paper. It addresses the interesting phenomenon of relaxation of selection as a genomic signal of reproductive strategies using multiple computational approaches and follow-up analyses by pulling in data from GO, mouse knockouts, human infertility database, and even Drosophila RNAi experiments. I really appreciate the comprehensive and creative approach to analyze and explore the data. As far as I can tell, the analyses were performed soundly and statistics are appropriate. The Introduction and Discussion sections are thoughtful and well-written. I have no major criticisms of the manuscript.

      The main area that I would suggest for improvement is in the "Caveats and Limitations" section of the Discussion. Currently, the first paragraph of this section states the obvious that genetic manipulation of gorillas is not feasible. Beyond a reminder to the reader that this was a rationale for the Drosophila work, it isn't really adding much insight. The second paragraph is a brief discussion of the directionality of change. I think it comes across as overly simplistic, with a sort of "well, we can never know" feel. Obviously, there are plenty of researchers who do model change to infer direction and causation, and there are plenty of published papers attempting to do so with respect to mating systems in primates.

      I do not think the authors need to remove these paragraphs, but I do encourage them to turn the "Caveats and Limitations" section into something more meaningful by addressing limitations of the work that was actually done rather than limitations of hypothetical things that were not done. A few areas come to mind. First, the authors should discuss the effect of gene-tree vs species-tree inconsistencies in the analyses, which could affect the identification of gorilla-specific amino acid changes and/or the dN/dS estimates. Incomplete lineage sorting is very common in primates including the gorilla-chimp-human splits (Rivas-González et al. 2023). It would be nice to hear the authors' thoughts on how that might affect their analyses. Second, the dN/dS-based analyses assume the neutrality of synonymous substitutions. Of course, that assumption is not completely true; it might be true enough, and the authors should at least note it as a caveat. Third, and potentially related, is the consideration that these protein-coding genes may be functioning in other ways such as via antisense transcription. The genes under relaxed selection may be on their way to becoming pseudogenes and evolving as such at the sequence level, but many pseudogenes continue to be transcribed, sense or antisense, for a regulatory purpose. I don't think there is a way to incorporate this into the authors' analyses but it would be nice to see it acknowledged as a caveat or limitation.

    4. Reviewer #2 (Public Review):

      Summary:

      Bowman and colleagues have compiled a large comparative genomic dataset to examine the molecular evolution of genes in mammals, with the primary goal of identifying how changes in the gorilla mating system have shaped the evolution of spermatogenesis. They report several patterns pointing to a signal of relaxed purifying selection on genes involved in male fertility, a pattern that they interpret as a response to changes in the mating system of gorillas. Many previous studies have used comparisons among species of primates and other mammals to understand how changes in mating systems have shaped the evolution of reproductive traits and genes. These collective works have provided some of the best evidence that changes in the form and intensity of sexual selection have had a strong effect on the evolution of male reproduction. The current study builds on this rich history by exploring molecular evolution of over 13,310 genes across 261 mammals. This very large phylogenetic dataset affords considerable power to characterize patterns of molecular evolution along the gorilla lineage. This allows for some added power relative to a previous study that interrogated the same lineage-specific patterns (Scally et al. 2012). They report a subset of genes showing evidence for either positive directional selection (less than 1% of genes) or relaxed purifying selection (4% of genes) in gorillas. Relaxed purifying selection is more common than positive selection, and genes showing signatures of relaxed constraint are enriched for spermatogenesis functions using various tests based on functional annotation or gene expression and infertility associations in humans and mice. The authors also report new functional data - the only original data in this study - using a high throughput genetic screen showing that some of these genes are also expressed in spermatogenesis in flies, and when perturbed they affect male fertility.

      These results are interpreted as strong evidence that changes in mating system, specifically the loss of sperm competition, have shaped the evolution of male reproduction in gorillas. The authors argue that these discoveries illustrate, for the first time, the genome-wide effect of striking changes in mating behavior in gorillas on the genetic underpinnings of male reproduction and provide new candidates relevant to male fertility in humans. Support for these central conclusions is eroded by a lack of appropriate comparative contrasts needed to clarify the uniqueness of these patterns to gorillas and, critically, establish a direct phylogenetic association with mating system or correlated reproductive traits.

      Strengths:

      The presentation is engaging, clear, and easy to follow throughout. I enjoyed reading the overall narrative and I think that the authors did a good job of presenting the details of male reproductive biology in an informative and accessible manner. Given the general interest in gorilla evolution, and the clear relevance to humans, studies of this scope on male reproductive biology are likely to be of broad interest to both evolutionary and reproductive biologists.

      The reported signatures of molecular evolution in gorillas appear robust, well-executed, and supported by several lines of evidence that establish some links with male reproduction. The authors have presented a series of molecular evolution analyses that demonstrate both rigor and attention to analytical details and quality control. Although all the primary sequence data has been previously published by others, the compilation of a high-quality curated comparative dataset of this scale is impressive and inspires confidence in the underlying molecular results. Likewise, the incorporation of diverse other data from mice and humans helps shape the overall narrative. To my knowledge, this represents the most focused and detailed analysis of protein-coding evolution specific to gorillas to date (although parallel results from the landmark gorilla genome study - Scally et al. 2012 - are downplayed somewhat).

      Likewise, the inclusion of new functional data from Drosophila establishes a subset of genes showing recent changes in molecular evolution in gorillas that appear to be both deeply conserved in animals and related to male fertility.

      Weaknesses:

      This study lacks the necessary comparative framework needed to ascribe any of the reported patterns to changes in the reproductive system of gorillas, or to really understand the uniqueness of these patterns relative to other species. Although wording is careful at times, the authors repeatedly ascribe the patterns they are finding directly to the specific changes in mating system biology that have occurred in gorillas. The general framing and significance rests on the central finding that "these data provide compelling evidence that reduced sperm competition in gorillas is associated with relaxed purifying selection on genes related to male reproductive function (Abstract)". No such association between molecular evolution and variation in mating system, or in any correlated reproductive trait, is ever directly tested, let alone established as a clear statistical correlation. The massive comparative dataset is used to localize patterns of molecular evolution to the gorilla lineage and then these patterns are interpreted in the context of changes in mating system, as an assumption of the study not a direct result. Although basic information of the reproductive system (or correlates thereof) likely exists for many of the 261 species included here, this information is never used to test for a relationship between changes in positive or purifying selection and reproduction.

      The lack of any such comparisons is especially curious given that there are many previous studies that have sought and established such connections for traits and/or genes in mammals (dozens now?), and especially great apes, before. This comparative approach is the gold standard for making claims linking mating system to molecular evolution and yet this is not pursued here. The authors are correct in that they provide a rigorous genome-wide analysis (but not at all for the first time, see Scally et al. 2012), but they skip this critical central step to rigorous inference in comparative genomics. This is essentially a broad comparative study, but the central conclusion (a direct link between mating system and molecular evolution) is speculative and not actually tested.

      Note that despite the framing here, there are of course several aspects of lineage-specific biology that undoubtedly shape molecular evolution of male reproduction and fertility but could be unrelated to sperm competition per se. For example, shifts in operational sex ratios can have profound effects on effective population sizes and the efficacy of selection, which of course would be expected to change the intensity and direction of molecular evolution. Likewise, shifts in population size, structure, and diet can all affect molecular evolution and reproduction.

      In the absence of a broad phylogenetically independent contrast (which would be really interesting here), the authors need to at least establish that there is indeed something noteworthy about the specific findings they report relative to other systems that have a different mating system. Such comparisons would be readily available within the great apes, especially compared to chimpanzees and bonobos (Pan). Most of the patterns are presented in such a way as to suggest a clear connection between the result and the unique features of gorilla reproduction, but are these clearly outliers? Relaxed purifying selection is much more common than positive selection. Is this result qualitatively or quantitatively unique to gorillas as implied (I would honestly be surprised if it was, as this is a common outcome of these dN/dS-based tests)? Similar questions and the need for more context apply to the various enrichment tests. That genes involved in male reproduction evolve rapidly and that this reflects both relaxed constraint and positive selection is an exceptionally well-established pattern, as is enrichment for reproductive functions/expression of such genes in unbiased genome-wide screens (as cited by the authors, including in gorillas by Scally et al. 2012, who performed a very similar analysis, albeit without some of the model advances used in the current study). Do chimpanzees or humans lack these specific signatures of relaxed constraint at reproductive genes, or is it a much stronger enrichment in gorillas? Establishing these baseline comparisons would help a lot with interpretation of the core findings. A little bit of this is explored with the human comparisons but not in a parallel genome-wide manner that places the signatures in gorillas in context.

      I had similar questions related to the high-throughput Drosophila screen. This is a creative and novel component of the study. However, I am unclear on how to interpret the results or the conclusions drawn from them. It is very interesting that a subset of genes showing relaxed constraint are conserved to Drosophila and that perturbation of some of these causes fertility issues. However, the conclusion that these genes reflect novel candidates not implicated in sperm biology is a bit overstated. Here "implicated" means genes with an annotated sterility phenotype in humans, mice, flies, or gorillas, and such annotations are pretty limited, at least in the mammalian systems. The entire design was conditioned on analyzing genes that were reliably expressed during Drosophila spermatogenesis, and then focusing on those. But the comparative set for the enrichment test was a random set of genes. Shouldn't the background be a random set of testis-expressed genes? I would say that genes that are reliably expressed during spermatogenesis in both mammals and flies are implicated in sperm biology, and genetic manipulation of such genes would be expected to produce fertility phenotypes at some appreciable rate. So the result here adds some interesting data but it does not seem unexpected or significant as framed.

    1. eLife assessment

      This valuable study explores the sequence characteristics and conservation of high-occupancy target loci, which are genomic regions bound by a multitude of transcription factors, at promoters and enhancers throughout the human genome. The computational analyses presented in this study are solid, although the evidence for some claims is inadequate. This study would be a helpful resource for researchers performing ChIP-seq based analyses of transcription factor binding.

    2. Reviewer #3 (Public Review):

      Summary:

      Hudaiberdiev and Ovcharenko investigate regions within the genome where a high abundance of DNA-associated proteins are located and identify DNA sequence features enriched in these regions, their conservation in evolution, and variation in disease. Using ChIP-seq binding profiles of over 1,000 proteins in three human cell lines (HepG2, K562, and H1) as a data source they're able to identify nearly 44,000 high-occupancy target loci (HOT) that form at promoter and enhancer regions, thus suggesting these HOT loci regulate housekeeping and cell identity genes. Their primary investigative tool is HepG2 cells, but they employ K562 and H1 cells as tools to validate these assertions in other human cell types. Their analyses use RNA pol II signal, super-enhancer, regular-enhancer, and epigenetic marks to support the identification of these regions. The work is notable, in that it identifies a set of proteins that are invariantly associated with high-occupancy enhancers and promoters and argues for the integration of these molecules at different genomic loci. These observations are leveraged by the authors to argue HOT loci as potential sites of transcriptional condensates, a claim that they are well poised to provide information in support of. This work would benefit from refinement and some additional work to support the claims.

      Comments:

      Condensates are thought to be scaffolded by one or more proteins or RNA molecules that associate to induce phase separation. The authors can readily provide from their analysis a check of whether HOT loci exist within different condensate compartments (or a marker for them). Generally, ChIP-seq signal from MED1 and Ronin (THAP11) would be anticipated to correspond with transcriptional condensates of different flavors; other coactivator proteins (e.g., BRD4) would be useful to include as well. Similarly, condensate scaffolding proteins of facultative and constitutive heterochromatin (HP1a and EZH2/1) would augment the authors' model by providing further evidence that HOT loci occur at transcriptional condensates and not heterochromatin condensates. Sites of splicing might be informative as well: splicing condensates (or nuclear speckles) are scaffolded by SRRM/SON, which is probably not in their data set, but members of the serine/arginine-rich splicing factor family of proteins can serve as a proxy (SRSF2 is the best studied of this set). This would provide a significant improvement to their proposed model and would be expected, since the authors note that these proteins occur at the enhancers and promoter regions of highly expressed genes.

      It is curious that MAX is found to be highly enriched without its binding partner Myc. Is Myc's signal simply lower in abundance, or is it absent from HOT loci? How could it be possible that only one of a pair of proteins that bind DNA as a heterodimer is found in HOT loci, without invoking a condensate model to interpret the results?

      Numerous studies have linked the physical properties of transcription factor proteins to their role in the genome. The authors here provide a limited analysis of the proteins found at different HOT loci by employing GO terms. Is there evidence for specific types of structural motifs, disordered motifs, or related properties of these proteins present in specific loci?

      Condensates themselves possess different emergent properties, but these are a product of the proteins and RNAs that concentrate in them and not a result of any one specific function (condensates can have multiple functions!).

      Transcriptional condensates serve as functional bodies. The notion the authors present in their discussion is not held by practitioners of condensate science, in that condensates exist to perform biochemical functions and are dissolved in response to satisfying that need, not that they serve simply as reservoirs of active molecules. For example, transcriptional condensates form at enhancers or promoters that concentrate factors involved in the activation and expression of that gene and are subsequently dissolved in response to a regulatory signal (in transcription this can be the nascently synthesized RNA itself or other factors). The association reactions driving the formation of active biochemical machinery within condensates are materially changed, as are the kinetics of assembly. It is unnecessary and inaccurate to qualify transcriptional condensates as depots for transcriptional machinery.

      This work has the potential to advance the field by providing a detailed perspective on which proteins are located in which regions of the genome. Publication of this information alongside the manuscript would advance the field materially.

    3. Reviewer #1 (Public Review):

      Summary:

      This study explores the sequence characteristics and features of high-occupancy target (HOT) loci across the human genome. The computational analyses presented in this paper provide insight into the correlation of TF binding and regulatory networks at HOT loci, which were previously regarded as lacking sequence specificity.

      By leveraging hundreds of ChIP-seq datasets from the ENCODE Project to delineate HOT loci in HepG2, K562, and H1-hESC cells, the investigators identified the regulatory significance and participation in 3D chromatin interactions of HOT loci. Subsequent exploration focused on the interaction of DNA-associated proteins (DAPs) with HOT loci using computational models. The models established that the potential formation of HOT loci is likely embedded in their DNA sequences and is significantly influenced by GC contents. Further inquiry exposed contrasting roles of HOT loci in housekeeping and tissue-specific functions spanning various cell types, with distinctions between embryonic and differentiated states, including instances of polymorphic variability. The authors conclude with a speculative model that HOT loci serve as anchors where phase-separated transcriptional condensates form. The findings presented here open avenues for future research, encouraging more exploration of the functional implications of HOT loci.

      Strengths:

      The concept of using computational models to define characteristics of HOT loci is refreshing and allows researchers to take a different approach to identifying potential targets. The major strength of the study lies in the very large number of datasets analyzed, with hundreds of ChIP-seq datasets for both HepG2 and K562 cells as part of the ENCODE project. Such quantitative power allowed the authors to delve deeply into HOT loci, which were previously thought to be artifacts.

      Weaknesses:

      While this study contributes to our knowledge of HOT loci, there are critical weaknesses that need to be addressed. There are questions about the validity of the assumptions made for certain analyses. The speculative nature of the proposed model involving transcriptional condensates needs either further validation or to be toned down. Furthermore, some apparent contradictions exist among the main conclusions, and these either need to be better explained or corrected. Lastly, several figure panels could be better explained or described in the figure legends.

    4. Reviewer #2 (Public Review):

      Summary:

      The paper 'Sequence characteristic and an accurate model of abundant hyperactive loci in human genome' by Hudaiberdiev and Ovcharenko offers comprehensive analyses and insights about the 'high-occupancy target' (HOT) loci in the human genome. These are considered genomic regions that overlap with transcription factor binding sites. The authors provided very comprehensive analyses of the TF composition characteristics of these HOT loci. They showed that these HOT loci tend to overlap with annotated promoters and enhancers, GC-rich regions, open chromatin signals, and highly conserved regions, and that these loci are also enriched with potentially causal variants for different traits.

      Strengths:

      Overall, the definition of HOT loci is clear, and the data on HOT regions across the genome can be a useful dataset for studies that use HepG2 or K562 as a model. I appreciate the authors' efforts in presenting many analyses and plots backing up each statement.

      Weaknesses:

      It is noteworthy that the HOT concept and the signature characteristics of these loci as highly functional regions of the genome are not presented for the first time here. Additionally, I find the main manuscript, though very comprehensive, long-winded; it could be put in a shorter, more digestible format without sacrificing scientific content.

      The introduction's mention of the blacklisted region can be rather misleading because when I read it, I was anticipating that we are uncovering new regulatory regions within the blacklisted region. However, the paper does not seem to address afterward the question of whether, and to what extent, the HOT regions overlap with the ENCODE blacklisted regions. This plays into the central assessment that this manuscript is long-winded.

      The introduction also mentioned that HOT regions correspond to 'genomic regions that seemingly get bound by a large number of TFs with no apparent DNA sequence specificity' (this point of 'no sequence specificity' is reiterated in the discussion lines 485-486). However, later on in the paper, the authors also showed that models such as convolutional neural networks that take in one-hot-encoded DNA sequence to predict HOT loci performed really well. This means that sequence contexts with potential motifs can still play a role in forming the HOT loci. At the same time, lines 59-60 also cited studies that "detected putative drive motifs at the core segments of the HOT loci". The authors should edit the manuscript to clarify (or eradicate) contradictory statements.

    1. Reviewer #1 (Public Review):

      Summary:

      Using biophysical chromosome stretching, the authors measured the stiffness of chromosomes of mouse oocytes in meiosis I (MI) and meiosis II (MII). This study is a follow-up to previous studies in spermatocytes (and oocytes) by the authors (Biggs et al. Commun. Biol. 2020; Hornick et al. J. Assist. Rep. and Genet. 2015). They showed that MI chromosomes are much stiffer (~10 fold) than mitotic chromosomes of mouse embryonic fibroblast (MEF) cells. MII chromosomes are also stiffer than the mitotic chromosomes. The authors also found that oocyte aging increases the stiffness of the chromosomes. Surprisingly, the stiffness of meiotic chromosomes is independent of the meiotic chromosome components Rec8, Stag3, and Rad21L.

      Strengths:

      This provides new insight into a biophysical property of meiotic chromosomes, namely chromosome stiffness. The stiffness of chromosomes in meiosis prophase I is ~10-fold higher than that of mitotic chromosomes, which is independent of meiotic cohesin. The increased stiffness during oocyte aging is a novel finding.

      Weaknesses:

      A major weakness of this paper is that it does not provide any molecular mechanism underlying the difference between MI and MII chromosomes (and/or prophase I and mitotic chromosomes).

    2. eLife assessment

      This valuable paper describes the stiffness of meiotic chromosomes in both oocytes and spermatocytes. The authors identify differences in stiffness between meiosis I and II chromosomes, as well as an age-dependent increase in stiffness in meiosis I (and meiosis II) chromosomes, results that are highly significant for the field of chromosome biology. The mechanisms underlying age-dependent changes in chromosome stiffness remain unclear, and the evidence to suggest that changes in stiffness are independent of cohesin, which is known to deteriorate with age, is incomplete.

    3. Reviewer #2 (Public Review):

      This paper reports investigations of chromosome stiffness in oocytes and spermatocytes. The paper shows that prophase I spermatocytes and MI/MII oocytes yield high Young's modulus values in the assay the authors applied. The authors claim that deficiency in each of three meiosis-specific cohesins did not affect this result, and that increased stiffness was seen in aged oocytes but not in oocytes treated with the DNA-damaging agent etoposide.

      The paper reports some interesting observations which are in line with a 2020 report by the same authors, where increased stiffness of spermatocyte chromosomes was already shown. In that sense, the current manuscript is an extension of that previous paper, and thus novelty is somewhat limited. The paper is also largely descriptive, as it neither proposes a mechanism nor reports factors that determine the chromosomal stiffness.

      There are several points that need to be considered.

      (1) Limitations of the study and the conclusions are not discussed in the "Discussion" section, and that is a significant gap. Even more so as the authors rely on just one experimental system for all their data - there is no independent verification - and that in vitro system may be prone to artefacts.

      (2) It is somewhat unfortunate that they jump between oocytes and spermatocytes to address the cohesin question. Prophase I (pachytene) spermatocyte chromosomes are not directly comparable to MI or MII oocyte chromosomes. In fact, the authors report Young's modulus values of 3700 for MI oocytes and only 2700 for spermatocyte prophase chromosomes, illustrating this difference. Why not use oocyte-specific cohesin deficiencies?

      (3) It remains unclear whether the treatment of oocytes with the detergent Triton X-100 affects the spindle and thus the chromosomes isolated directly from the Triton-lysed oocytes. In fact, it is rather likely that the detergent affects chromatin-associated proteins and thus structural features of the chromosomes.

      (4) Why did the authors use mouse strains of different genetic backgrounds, CD-1 and C57BL/6? That makes comparison difficult. Breeding of heterozygous cohesin mutants will yield the ideal controls, i.e. littermates.

      (5) How did the authors capture chromosome axes from STAG3-deficient spermatocytes, which feature very few if any axes? How representative are those chromosomes that could be captured?

    4. Reviewer #3 (Public Review):

      Summary:

      Understanding the mechanical properties of chromosomes remains an important issue in cell biology. Measuring chromosome stiffness can provide valuable insights into chromosome organization and function. Using a sophisticated micromanipulation system, Liu et al. analyzed chromosome stiffness in MI and MII oocytes. The authors found that chromosomes in MI oocytes were ten-fold stiffer than mitotic ones. The stiffness of chromosomes in MI mouse oocytes was significantly higher than that in MII oocytes. Furthermore, the knockout of the meiosis-specific cohesin components (Rec8, Stag3, Rad21l) did not affect meiotic chromosome stiffness. Interestingly, the authors showed that chromosomes from old MI oocytes had higher stiffness than those from young MI oocytes. The authors claimed this effect was not due to the accumulated DNA damage during the aging process because induced DNA damage reduced chromosome stiffness in oocytes.

      Strengths:

      The technique used (isolating the chromosomes in meiosis and measuring their stiffness) is the authors' specialty. The results are intriguing and informative to the chromatin/chromosome and other related fields.

      Weaknesses:

      (1) How intact the measured chromosomes were is unclear.

      (2) Some control data needs to be included.

      (3) The paper was not well-written, particularly the Introduction section.

      (4) How intact were the measured chromosomes? Although the structural preservation of the chromosomes is essential for this kind of measurement, the meiotic chromosomes were isolated in PBS with Triton X-100 and measured at room temperature. It is known that chromosomes are very sensitive to cation concentrations and macromolecular crowding in the environment (PMID: 29358072, 22540018, 37986866). It would be better to discuss this point.

    1. eLife assessment

      This important study addresses the challenge of antimicrobial resistance by targeting plasmid proteins that interfere with plasmid transfer as a novel strategy to limit the spread of antibiotic resistance genes. While the evidence presented is solid, the work would benefit from a clear integration of the approaches used and more thorough analyses to fully assess the effectiveness of this strategy. This study will interest those working on plasmid transfer and antimicrobial resistance.

    2. Reviewer #1 (Public Review):

      The study by Prieto et al. addresses the increasingly serious problem of bacterial resistance to antimicrobial agents. This work has an important element of novelty, proposing a new approach to control the spread of antibiotic resistance by plasmids. Instead of targeting the resistance determinant, plasmid-borne proteins are used as antigens to be bound by specific nanobodies (Nbs). Once these were bound, plasmid transfer was inhibited and Salmonella infection blocked. This in-depth study is quite detailed and complex, with many experiments (9 figures with multiple panels), rigorously carried out. Results fully support the authors' conclusions. Specifically, the authors investigated the role of two large molecular weight proteins (RSP and RSP2) encoded by the IncHI1-derivative plasmid R27 of Salmonella. These proteins have bacterial Ig-like (Big) domains and are expressed on the cell surface, creating the opportunity for them to serve as immunostimulatory antigens. Using a mouse infection model, the authors showed that RSP proteins can properly function as antigens in Salmonella strains harboring the IncHI1 plasmid. The authors clearly showed increased levels of specific IgG and IgA antibodies against these RSP proteins in different tissues of immunized animals. In addition, non-immunized mice exhibited Salmonella colonization in the spleen and much more severe disease than immunized ones.

      However, the strength of this work is the selection and production of nanobodies (Nbs) that specifically interact with the extracellular domain of RSP proteins. The procedure to obtain Nbs is lengthy and complicated and includes the immunization of dromedaries with purified RSP and the construction of a VHH (H-chain antibody variable region) library in E. coli. As RSP is expressed on the surface of E. coli, specific Nbs were able to agglutinate Salmonella strains harboring the R27 plasmid encoding the RSP proteins.

      The authors demonstrated that Nbs-RSP reduced the conjugation frequency of R27, thus limiting the diffusion of the amp resistance harbored by the plasmid. This represents an innovative and promising strategy to fight antibiotic resistance, as it is not blocked by the mechanism that determines, in this specific case, the amp resistance of R27; instead, it targets an antigen associated with IncHI-derivative plasmids. Thus, RSP vaccination could be effective not only against Salmonella but also against other enteric bacteria. A possible criticism could be that Nbs against RSP proteins reduce the severity of the disease but do not completely prevent the infection by Salmonella.

    3. Reviewer #2 (Public Review):

      Summary:

      This manuscript aims to tackle antimicrobial resistance through the development of vaccines. Specifically, the authors test the potential of the RSP protein as a vaccine candidate. The RSP protein contains bacterial Ig-like domains and is typically carried on IncHI1 plasmids like R27. The extracellular location of the RSP protein and its role in the conjugation process make it a good candidate for a vaccine. The authors then use Salmonella carrying an IncHI plasmid to test the efficacy of the RSP protein as a vaccine antigen in providing protection against infection by antibiotic-resistant bacteria carrying the IncHI plasmid. The authors found no differences in total IgG or IgA levels, nor in pro-inflammatory cytokines, between immunized and non-immunized mice. They did, however, find differences in specific IgG and IgA, attenuated disease symptoms, and restricted systemic infection.

      The manuscript also evaluates the potential use of nanobodies specifically targeting the RSP protein by expressing them in E. coli and evaluating their interference in the conjugation of IncHI plasmids. The authors found that E. coli strains expressing RSP-specific nanobodies bind to Salmonella cells carrying the R27 plasmid, thereby reducing the conjugation efficiency of Salmonella.

      Strengths:

      - The main strength of this manuscript is that it targets the mechanism of transmission of resistance genes carried by any bacterial species, thus making it broad.

      - The experimental setup is sound and with proper replication.

      Weaknesses:

      - The two main experiments, evaluating the potential of the RSP protein and the effects of nanobodies on conjugation, seem to be parts of two different and unrelated strategies.

      - The survival rates shown in Figure 1A and Figure 3A for Salmonella pHCM1 and non-immunized mice challenged with Salmonella, respectively, are substantially different. In the same figures, the challenge of immunized mice and Salmonella pHCM1 and mice challenged with Salmonella pHCM1 with and without ampicillin are virtually the same. While this is not the only measure of the effect of immunization, the inconsistencies in the resulting survival curves should be addressed by the authors more thoroughly as they can confound the effects found in other parameters, including total and specific IgG and IgA, and pro-inflammatory cytokines.

      - Overall the results are inconsistent and provide only partial evidence of the effectiveness of the RSP protein as a vaccine target.

      - The conjugation experiments use very long conjugation times, making it harder to assess whether the resulting transconjugants are the direct result of conjugation or just the growth of transconjugants obtained at earlier points in time. While this could be assessed from the obtained results, it is not a direct or precise measure.

      - While the potential outcomes of these experiments could be applied to any bacterial species carrying this type of plasmid, it is unclear why the authors use Salmonella strains to evaluate it. The introduction does a great job of explaining the importance of these plasmids but falls short in introducing their relevance in Salmonella.

    1. eLife assessment

      This valuable study reports on a series of artificial selection experiments for microbiomes that confer drought tolerance to rice plants. A major strength is the solid experimental design with multiple soils, which will likely guide others in designing their experiments, but the study also has shortcomings in that the rescuing effect is not benchmarked against healthy well-watered plants, the sterilized controls do not add much information, and the dispersal between inocula confounds the interpretation of the results. In addition, while the type of work presented here is a first step towards the eventual goal of plant microbiome engineering, that goal is still mainly an ambition. The abstract would benefit from this being made clear, and the presentation would overall benefit from more extensive consideration of recent developments in the field.

    2. Reviewer #1 (Public Review):

      Summary:

      The study claims to explore plant microbiome engineering using host-mediated selection as a strategy to enhance rice growth and drought tolerance.

      Strengths:

      The authors have derived and identified simplified microbiomes from wild microbial communities of rice fields, deserts, and serpentine seep soils by selecting microbiomes from plants with desired phenotypes across generations. Metagenome-assembled genomes revealed enriched functions, such as glycerol-3-phosphate and iron transport, known to mediate plant-microbe interactions during drought.

      Weaknesses:

      The findings demonstrate the efficacy of host-mediated microbiome selection, but the engineering part for enhancing rice performance under drought-stress conditions has not been provided. The proposed mechanisms rely on correlations rather than direct experimental proof.

    3. Reviewer #2 (Public Review):

      Summary:

      In this study, Styer et al. impose artificial selection on root-associated microbiomes to increase drought tolerance in rice plants using different soils as starting microbiomes. Using NDVI and biomass as a proxy for plant health, they find that iterative passaging of the microbiomes of the best-performing plants increased plant resilience to drought stress in a soil-dependent manner. The study makes use of numerous controls. The authors survey the microbiota of the plants across generations, using an array of interesting analyses to characterize their observations. Firstly, the authors find that the acquired microbiomes are divergent towards the beginning of the selection experiment, but nearly converge later, suggesting that the selected communities become more similar over time. One reason is that the diversity of the microbiomes severely decreases after only one or two generations of selection and that microbes from each inoculation source appear to easily disperse across the experiment, leading to microbiome homogeneity. The authors then present an analysis to correlate ASVs with the NDVI and biomass over the course of the experiment (using the rice soil selection lines) to develop hypotheses about which ASVs may impact plant traits.

      Strengths:

      The authors set out to refine the understanding of microbiome artificial selection, a topic of recent interest to the plant microbiome field. The authors use an established approach (Mueller et al), expanding upon it by including multiple starting soil inocula to ask whether the strength of selection varies by input microbiome. This is an important and novel question. Using drought resilience as measured by NDVI and plant biomass to select upon was a wise choice for this type of study, given their relative ease and quickness to assess. The inclusion of several types of controls, multiple selection lines, and several starting soil inocula showed a thoughtful experimental design. The analyses were diverse, non-standard, and attempted to address microbiome dynamics on multiple fronts. I am not necessarily convinced by some of the conclusions (see below), however, I think this study examines an important and exciting topic in the area of plant microbiomes. I predict the findings of the experiments will inform a wide audience of researchers attempting similar studies and be helpful in their designs.

      Weaknesses:

      Although the controls were well designed, the dispersal of the microbiomes erased the utility of the sterile inoculated (SI) controls, at least from my reading of the manuscript. Perhaps the original intent of the SI plants was to contrast the selected microbiomes vs axenic plants to show that plant resilience to drought increased generation after generation. If the controls had worked properly under my presumed scenario, this would allow the authors to account for batch variation across the generations (due to slight differences in MS media prep, water quality, etc.). Instead, the SI lines acquired microbes from the experiment and never appeared to significantly deviate from the SL plants. The dispersal of the microbes amongst soils and selection lines also minimizes any conclusions that can be made about the different starting inocula and how prone to selection they may be.

    4. Reviewer #3 (Public Review):

      Summary:

      In this work, Styer et al. explore host selection as a means for recruiting microbes that may aid their host under stressful conditions, in this case under drought stress, as an alternative to target-SynCom design. They do so by subjecting rice plants to several generations of soil transplantation, and by using the most successful rice plants as donors for the next generation. By using several NGS approaches and very thorough bioinformatics analysis, the authors identify potential microbial taxa and the associated functions enriched in the conditions of interest.

      Strengths:

      In general, I think this approach was very much needed in the field as an alternative to SynComs, which are still not readily usable in croplands. This work sets the grounds for future similar approaches, using different stresses and different host plants.

      In this work, the experimental setup is well thought-through and well-replicated. In addition, an exhaustive set of preliminary experiments was performed before deciding on the final panel of soils to use and scoring methodology. The figures are clear and well-explained.

      Weaknesses:

      One of the more unexpected results is that sterile/non-inoculated calcined clay also tends to enrich similar microbes, and the authors did extensive work exploring possible sources and microbial dispersal within the growth chamber. In a future experiment, the work would benefit from including a truly sterile control (same growth chamber but completely isolated from possible contamination). In this regard, the reader may wonder whether these efforts (selection experiments) are necessary at all, since plants seem to get from their environment what they need to survive. This is discussed across the paper but not directly addressed, and I think the manuscript would benefit from a clear argument for or against this idea.

    1. eLife assessment

      This article is a valuable addition to the growing literature on the developmental patterning of insect wings. Using CRISPR mutagenesis and localization of mRNA, the authors present solid evidence that the transcription factor Mirror is necessary for specifying the morphological identity of the most posterior regions of butterfly wings. The manuscript would benefit from more careful use of terminology and appropriate citation of related Drosophila literature, and there are also some concerns about whether the phenotype represents transformation or loss which might be clarified through a closer look at ultrastructure. With a clearer presentation of terminology, this paper would be of general interest to developmental and evolutionary biologists.

    2. Reviewer #1 (Public Review):

      Summary:

      This short report shows that the transcription factor gene mirror is specifically expressed in the posterior region of the butterfly wing imaginal disk, and uses CRISPR mosaic knock-outs to show it is necessary to specify the morphological features (scales, veins, and surface) of this area.

      Strengths:

      The data and figures support the conclusions. The article is swiftly written and makes an interesting evolutionary comparison to the function of this gene in Drosophila. Based on the data presented, it can now be established that mirror likely has a similar selector function for posterior-wing identity in a plethora of insects.

      Weaknesses:

      This first version has minor terminological issues regarding the use of the terms "domains" and "compartment".

    3. Reviewer #2 (Public Review):

      This is a short and unpretentious paper. It is an interesting area and therefore, although much of this area of research was pioneered in flies, extending basic findings to butterflies would be worthwhile. Indeed, there is an intriguing observation but it is technically flawed and these flaws are serious.

      The authors show that mirror is expressed at the back of the wing in butterflies (as in flies). They present some evidence that it is required for the proper development of the back of the wing in butterflies (a region dubbed the vannus by the ancient guru Snodgrass). But there are problems with that evidence. First, concerning the method, using CRISPR they treat embryos and the expectation is that the mirror gene will be damaged in groups of cell lineages, giving a mosaic animal in which some lines of cells are normal for mirror and others are not. We do not know where the clones or patches of cells that are defective for mirror are because they are not marked. Also, we do not know what part of the wing is wild type and what part is mutant for mirror. When the mirror mutant cells colonise the back of the wing and that butterfly survives (many butterflies fail to develop), the back of the wing is altered in some selected butterflies. This raises a second problem: we do not know whether the rear of the wing is missing or transformed. From the images, the appearance of the back of the wing is clearly different from the wild type, but is that due to transformation or not? And then I believe we need to know specifically what the difference is between the rear of the wing and the main part. What we see is a silvery look at the back that is not present in the main part; is it the structure of the scales? We are not told. There are other problems. Mirror is only part of a group of genes in flies and in flies both iroquois and mirror are needed to make the back of the wing, the alula (Kehl et al). What is known about iro expression in butterflies?

      In flies, mirror regulates a late and local expression of dpp that seems to be responsible for making the alula. What happens in butterflies? Would a study of the expression of Dpp in wildtype and mirror compromised wings be useful?

      Thus, I find the paper to be disappointing for a general journal as it does little more than claim what was discovered in Drosophila is at least partly true in butterflies. Also, it fails to explain what the authors mean by "wing domains" and "domain specification". They are not alone, butterfly workers, in general, appear vague about these concepts, their vagueness allowing too much loose thinking.

      Since these matters are at the heart of the purpose and meaning of the work reported here, we readers need a paper containing more critical thought and information. I would like to have a better and more logical introduction and discussion.

      The authors do define what they mean by the vannus of the wing. In flies the definition of compartments is clear and abundantly demonstrated, with gene expression and requirement being limited precisely to sets of cells that display lineage boundaries. It is true that domains of gene expression in flies, for example of the iroquois complex, which includes mirror, can only be related to patterns with difficulty. Some recap of what is known, plus the opinion of the authors on how they interpret papers on possible lineage domains in butterflies, might also be useful, as the reader is no wiser about what the authors might mean at the end of it!

      The references are sometimes inappropriate. The discovery of the AP compartments should not be attributed to Guillen et al 1995, but to Morata and Lawrence 1975. Proofreading is required.

    4. Reviewer #3 (Public Review):

      Summary:

      The manuscript by Chatterjee et al. examines the role of the mirror locus in patterning butterfly wings. The authors examine the pattern of mirror expression in the common buckeye butterfly, Junonia coenia, and then employ CRISPR mutagenesis to generate mosaic butterflies carrying clones of mirror mutant cells. They find that mirror is expressed in a well-defined posterior sector of final-instar wing discs from both hindwings and forewings and that CRISPR-injected larvae display a loss of adult wing structures presumably derived from the mirror expressing region of hindwing primordium (the case for forewings is a bit less clear since the mirror domain is narrower than in the hindwing, but there also do seem to be some anomalies in posterior regions of forewings in adults derived from CRISPR injected larvae). The authors conclude that the wings of these butterflies have at least three different fundamental wing compartments, the mirror domain, a posterior domain defined by engrailed expression, and an anterior domain expressing neither mirror nor engrailed. They speculate that this most posterior compartment has been reduced to a rudiment in Drosophila and thus has not been adequately recognized as such a primary regional specialization.

      Critique:

      This is a very straightforward study and the experimental results presented support the key claims that mirror is expressed in a restricted posterior section of the wing primordium and that mosaic wings from CRISPR-injected larvae display loss of adult wing structures presumably derived from cells expressing mirror (or at least nearby). The major issue I have with this paper is the strong interpretation of these findings that lead the authors to conclude that mirror is acting as a high-level gene akin to engrailed in defining a separate extreme posterior wing compartment. To place this claim in context, it is important in my view to consider what is known about engrailed, for which there is ample evidence to support the claim that this gene does play a very ancestral and conserved function in defining posterior compartments of all body segments (including the wing) across arthropods.

      (1) Engrailed is expressed in a broad posterior domain with a sharp anterior border in all segments of virtually all arthropods examined (broad use of a very good pan-species anti-En antibody makes this case very strong).

      (2) In Drosophila, marked clones of wing cells (generated during larval stages) strictly obey a straight anterior-posterior border indicating that cells in these two domains do not normally intermix, thus, supporting the claim that a clear A/P lineage compartment exists.

      In my opinion, mirror does not seem to be in the same category of regulator as engrailed for the following reasons:

      (1) There is no evidence that I am aware of, either from the current experiments, or others that the mirror expression domain corresponds to a clonal lineage compartment. It is also unclear from the data shown in this study whether engrailed is co-expressed with mirror in the posterior-most cells of J. coenia wing discs. If so, it does not seem justified to infer that mirror acts as an independent determinant of the region of the wing where it is expressed.

      (2) Mirror is not only expressed in a posterior region of the wing in flies but also in the ventral region of the eye. In Drosophila, mirror mutants not only lack the alula (derived approximately from cells where mirror is expressed), but also lack tissue derived from the ventral region of the eye disc (although this ventral tissue loss phenotype may extend beyond the cells expressing mirror).

      In summary, it seems most reasonable to me to think of mirror as a transcription factor that provides important developmental information for a diverse set of cells in which it can be expressed (posterior wing cells and ventral eye cells) but not that it acts as a high-level regulator like engrailed.

      Recommendation:

      While the data provided in this succinct study are solid and interesting, it is not clear to me that these findings support the major claim that mirror defines an extreme posterior compartment akin to that specified by engrailed. Minimally, the authors should address the points outlined above in their discussion section and greatly tone down their conclusion regarding mirror being a conserved selector-like gene dedicated to establishing posterior-most fates of the wing. They also should cite and discuss the original study in Drosophila describing the mirror expression pattern in the embryo and eye and the corresponding eye phenotype of mirror mutants: McNeill et al., Genes & Dev. 1997. 11: 1073-1082; doi:10.1101/gad.11.8.1073.

    1. eLife assessment

      This valuable study addresses the interpretation of patterns of synonymous and nonsynonymous diversity in microbial genomes. The authors present solid theoretical and computational evidence that adaptive mutations that revert the amino acids to an earlier state can significantly impact the observed ratios of synonymous and nonsynonymous mutations in human commensal bacteria. This paper will be of interest to microbiologists with a background in evolution.

    2. Reviewer #1 (Public Review):

      This study makes a substantial contribution to our understanding of the molecular evolutionary dynamics of microbial genomes by proposing a model that incorporates relatively frequent adaptive reversion mutations. In many ways, this makes sense from my own experience with evolutionary genomic data of microbes, where reversions are surprisingly familiar as evidence of the immense power of selection in large populations.

      One criticism is the reliance on one major data set of B. fragilis to test fits of these models, but this is relatively minor in my opinion and can be caveated by discussion of other relevant datasets for parallel investigation.

      Another point is that this problem isn't as new as the manuscript indicates, see for example https://journals.asm.org/doi/10.1128/aem.02002-20.

      Nonetheless, the paper succeeds by both developing theory and offering concrete parameters to illustrate the magnitudes of the problems that distinguish competing ideas, for example, the risk of mutational load posed in the absence of frequent back mutation.

    3. Reviewer #3 (Public Review):

      The diversity of bacterial species in the human gut microbiome is widely known, but the extensive diversity within each species is far less appreciated. Strains found in individuals on opposite sides of the globe can differ by as little as handfuls of mutations, while strains found in an individual's gut, or in the same household, might have a common ancestor tens of thousands of years ago. What are the evolutionary, ecological, and transmission dynamics that established and maintain this diversity?

      The time, T, since the common ancestor of two strains, can be directly inferred by comparing their core genomes and finding the fraction of synonymous (non-amino acid changing) sites at which they differ: dS. With the per-site per-generation mutation rate, μ, and the mean generation times roughly known, this directly yields T (albeit with substantial uncertainty in the generation time). A traditional way to probe the extent to which selection plays a role is to study pairs of strains and compare the fraction of non-synonymous (amino acid or stop-codon changing) sites, dN, at which the strains differ with their dS. Small dN/dS, as found between distantly related strains, is attributed to purifying selection against deleterious mutations dominating over mutations that have driven adaptive evolution. Large dN/dS, as found in laboratory evolution experiments, is caused by beneficial mutations that quickly arise in large bacterial populations and, with substantial selective advantages per generation, can rise to high abundance fast enough that very few synonymous mutations arise in the lineages that take over the population.
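
      As a rough illustration of the arithmetic described above, the sketch below assumes the standard pairwise relation dS ≈ 2μT (both lineages accumulate synonymous mutations since their common ancestor) and ignores multiple-hit corrections. The function names and the example numbers are hypothetical placeholders, not values taken from the manuscript.

```python
def divergence_time_years(dS, mu_per_site_per_gen, gens_per_year):
    """Years since the common ancestor, assuming dS = 2 * mu * T (T in generations)."""
    if mu_per_site_per_gen <= 0 or gens_per_year <= 0:
        raise ValueError("mutation rate and generations per year must be positive")
    generations = dS / (2.0 * mu_per_site_per_gen)
    return generations / gens_per_year


def dn_ds_ratio(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Uncorrected dN/dS from raw counts of differences and of available sites."""
    if syn_diffs == 0:
        raise ValueError("dS is zero; the ratio is undefined for this pair")
    dN = nonsyn_diffs / nonsyn_sites
    dS = syn_diffs / syn_sites
    return dN / dS
```

      For example, with dS = 1e-4, an assumed μ of 1e-9 per site per generation, and an assumed one generation per day, the first function gives 5e4 generations, or roughly 137 years.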

      A number of studies (including by Lieberman's group) have analyzed large numbers of strains of various dominant human gut species and studied how dN/dS varies. Although between closely related strains the variations are large -- often much larger than attributable to just statistical variations -- a systematic trend from dN/dS around unity or larger for close relatives to dN/dS ~ 0.1 for more distant relatives has been found in enough species that it is natural to conjecture a general explanation.

      The conventional explanation is that, for close relatives, selection over the time since they diverged has not yet purged weakly deleterious mutations that arose by chance -- roughly mutations with sT<1 -- while since the common ancestor of more distantly related strains, there is plenty of time for most of those that arose to have been purged.

      Torrillo and Lieberman have carried out an in-depth -- sophisticated and quantitative -- analysis of models of some of the evolutionary processes that shape the dependence of dN/dS on dS -- and hence on their divergence time, T. They first review the purifying selection model and show that -- even ignoring its inability to explain dN/dS > 1 for many closely related pairs -- the model has major problems explaining the crossover from dN/dS somewhat less than unity to much smaller values as dS goes through -- on a logarithmic scale -- the 10^-4 range. The first problem, already seen in the infinite-population-size deterministic model, is that a very large fraction of non-synonymous mutations would have to have deleterious s's in the 10^-5 per generation range to fit the data (and a small fraction effectively neutral). As the s's are naturally expected (at least in the absence of quantitative analysis to the contrary) to be spread out over a wide range on a logarithmic scale of s, this seems implausible. But the authors go further and analyze the effects of fluctuations that occur even in the very large populations: ~ >10^12 bacteria per species in one gut, and 10^10 human guts globally. They show that Muller's ratchet -- the gradual accumulation of weakly deleterious mutations that are not purged by selection - leads to a mutational meltdown with the parameters needed to fit the purifying selection model. In particular, with N_e the "effective population size" that roughly parametrizes the magnitude of stochastic birth-death and transition fluctuations, and U the total mutation rate to such deleterious mutations this occurs for U/s > log(sN_e) which they show would obtain with the fitted parameters.
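
      The meltdown criterion quoted above, U/s > log(sN_e), can be checked numerically with a few lines. This is only a sketch: it assumes the logarithm is natural, and the function name and the parameter values in the usage note are illustrative, not the fitted values from the manuscript.

```python
import math


def ratchet_criterion_met(U, s, N_e):
    """Return True when U/s > ln(s * N_e), the rough condition quoted in the review.

    U   : total mutation rate to weakly deleterious mutations (per genome per generation)
    s   : selective disadvantage of each such mutation (per generation)
    N_e : effective population size
    """
    if U < 0 or s <= 0 or N_e <= 0:
        raise ValueError("U must be non-negative; s and N_e must be positive")
    return U / s > math.log(s * N_e)
```

      With the illustrative values s = 1e-5 and N_e = 1e9, ln(sN_e) is about 9.2, so U = 1e-3 (U/s = 100) satisfies the criterion while U = 1e-5 (U/s = 1) does not.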

      Torrillo and Lieberman propose an alternate model: that there are a modest number of "loci" at which conditionally beneficial mutations can occur that are beneficial in some individual guts (or other environmental conditions) at some times, but deleterious in other (or the same) gut at other times. With the ancestors of a pair of strains having passed through one too many individuals and transmissions, it is possible for a beneficial mutation to occur and rise in the population, only later to be reverted by the beneficial inverse mutation. With tens of loci at which this can occur, they show that this process could explain the drop of dN/dS from short times -- in which very few such mutations have occurred -- to very long times by which most have flipped back and forth so that a random pair of strains will have the same nucleotide at such sites with 50% probability. Their qualitative analysis of a minimally simple model of this process shows that the bacterial populations are plenty big enough for such specific mutations to occur many times in each individual's gut, and with modest beneficials, to take over. With a few of these conditionally beneficial mutations or reversions occurring during an individual's lifetime, they get a reasonably quantitative agreement with the dN/dS vs dS data with very few parameters. A key assumption of their model is that genetically exact reversion mutations are far more likely to take over a gut population -- and spread -- than compensatory mutations which have a similar phenotypic-reversion effect: a mutation that is reverted does not show up in dN, while one that is compensated by another shows up as a two-mutation difference after the environment has changed twice.
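
      The saturation described above can be sketched with a deliberately simplified two-state picture. Assuming (as an illustration only, not necessarily the authors' exact model) that each locus flips between two alleles at a symmetric rate r per lineage per year, two strains whose common ancestor lived T years ago differ at a given locus with probability (1 - exp(-4rT))/2, which tends to 1/2 at long times. The function and parameter names are hypothetical.

```python
import math


def expected_differing_loci(n_loci, flip_rate, T):
    """Expected number of reversible loci at which two strains differ.

    Assumes a symmetric two-state process: each locus flips at `flip_rate`
    per lineage per unit time, and the two lineages have evolved
    independently for time T each (total path length 2T).
    """
    if n_loci < 0 or flip_rate < 0 or T < 0:
        raise ValueError("all arguments must be non-negative")
    p_differ = 0.5 * (1.0 - math.exp(-4.0 * flip_rate * T))
    return n_loci * p_differ
```

      At T = 0 the expectation is zero, and for rT much greater than 1 it approaches n_loci / 2, matching the 50% figure mentioned in the review.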

      Strengths:

      The quantitative arguments made against the conventional purifying selection model are highly compelling, especially the consideration of multiple aspects that are usually ignored, including -- crucially -- how Muller's ratchet arises and depends on the realistic and needed-to-fit parameters; the effects of bottlenecks in transmission and the possibility that purifying selection mainly occurs then; and complications of the model of a single deleterious s, to include a distribution of selective disadvantages. Generally, the authors' approach of focusing on the simplest models with as few as possible parameters (some roughly known), and then adding in various effects one-by-one, is outstanding and, in being used to analyze environmental microbial data, exceptional.

      The reversion model the authors propose and study is a simple general one and they again explore carefully various aspects of it -- including dynamics within and between hosts -- and the consequent qualitative and quantitative effects. Again, the quantitative analysis of almost all aspects is exemplary. Although it is hard to make a compelling guess of the number of loci that are subject to alternating selection on the needed time-scales (years to centuries), they make a reasonable argument for a lower bound in terms of the number of known invertible promoters (that can genetically switch gene expression on and off).

      Weaknesses:

      The primary weakness of this paper is one that the authors are completely open about: the assumption that, collectively, any of possibly-many compensatory mutations that could phenotypically revert an earlier mutation, are less likely to arise and take over local populations than the exact specific reversion mutation. While detailed analysis of this is, reasonably enough, beyond the scope of the present paper, more discussion of this issue would add substantially to this work. Quantitatively, the problem is that even a modest number of compensatory mutations occurring as the environmental pressures change could lead to enough accumulation of non-synonymous mutations that they could cause dN/dS to stay large -- easily >1 -- to much larger dS than is observed. If, say, the appropriate locus is a gene, the number of combinations of mutations that are better in each environment would play a role in how large dN would saturate to in the steady state (1/2 of n_loci in the authors' model). It is possible that clonal interference between compensatory and reversion mutations would result in the mutations with the largest s -- e.g., as mentioned, reversion of a stop codon -- being much more likely to take over, and this could limit the typical number of differences between quite well-diverged strains. However, the reversion and subsequent re-reversion would have to both beat out other possible compensatory mutations -- naively less likely. I recommend that a few sentences in the Discussion be added on this important issue along with comments on the more general puzzle -- at least to this reader! -- as to why there appear to be so few adaptive genetic changes in core genomes on time scales of human lifetimes and civilization.

      An important feature of gut bacterial evolution that is now being intensely studied is only mentioned in passing at the end of this paper: horizontal transfer and recombination of core genetic material. As this tends to bring in many more mutations overall than occur in regions of a pair of genomes with asexual ancestry, the effects cannot be neglected. To what extent can this give rise to a similar dependence of dN/dS on dS as seen in the data? Of course, such a picture begs the question as to what sets the low dN/dS of segments that are recombined --- often from genetic distances comparable to the diameter of the species.

    1. eLife assessment

      This important work identifies a P. aeruginosa strain and enzyme that can degrade 1-naphthylamine, a harmful industrial pollutant. Data resulting from in vivo and structural approaches are compelling, but additional mutagenesis would further test and establish the broad substrate specificity of NpaA1. With this additional data, this paper would be of high interest to biologists and enzymologists studying biodegradation of industrial pollutants.

    2. Reviewer #1 (Public Review):

      (1) Naphthylamine (1NA), an industrial reagent used in the manufacturing of dyes and pesticides, is harmful to humans and the environment. In the current manuscript, the authors report the successful isolation of a Pseudomonas strain from a former naphthylamine manufacturing site that is capable of degrading 1NA. Using genetic and enzymatic analysis, they identified the initial stages of 1NA degradation and the enzymes responsible for downstream processing of 1,2-dihydroxynaphthalene and salicylate. The authors determined the molecular structure of NpaA1, the first enzyme in the pathway, responsible for glutamylation of 1NA. NpaA1 has a broader substrate specificity compared to previously characterized enzymes involved in aromatic amine degradation. They carried out a structural comparison of NpaA1 with glutamine synthetase structures and AlphaFold models of similar enzymes, and put forth a hypothesis to explain the broad substrate specificity of NpaA1.

      The manuscript is well written and easy to understand. The authors carried out careful genetic analysis to identify the genes/enzymes responsible for degradation of 1NA to catechol. They characterized the first enzyme in the pathway, NpaA1, which is responsible for glutamylation of 1NA, and determined the molecular structures of apo-NpaA1, the NpaA1 - AMPPNP complex, and the NpaA1 - ADP - Met-Sox-P complex using X-ray crystallography.

      The proposed mechanism of broad substrate specificity of NpaA1, however, is based on comparison of the 1NA-docked NpaA1 structure with St-GS (glutamine synthetase) and an AlphaFold2-predicted model of AtdA1 from an aniline-degrading strain of Acinetobacter sp. The lack of a molecular structure or mutational studies to back the proposed mechanism makes it difficult to agree with it.

    3. Reviewer #2 (Public Review):

      Microbial degradation of synthetic organic compounds is the basis of bioremediation. Biodegradation of 1NA has not been previously reported. The report describes a complete study of 1NA biodegradation by a new isolate Pseudomonas sp. strain JS3066. The study includes the enrichment and isolation of the 1NA-degrading bacterium Pseudomonas sp. strain JS3066, the identification of the genes and enzymes involved in 1NA degradation, and the detailed characterization of γ-glutamylorganoamide synthetase by using biochemical and structural analysis. In the discussion, the potential evolution of 1NA degradation pathway, the similarity and difference between γ-glutamylorganoamide synthetase and glutamine synthetase, and the significance were explained. The conclusions were well supported by the results presented.

    1. eLife assessment

      This valuable study offers new insight into the role of the centrosome protein ninein in skeletal development through an analysis of the skeletal phenotype of ninein-deficient mice. While there is solid evidence supporting the conclusion that the absence of ninein leads to transient skeletal abnormalities and a lasting reduction in osteoclastogenesis, the evidence to substantiate the claim that enhanced ossification is attributed to reduced osteoclast formation/activity is insufficient. This work will be of interest to scientists in the bone biology and skeletal development fields.

    2. Reviewer #2 (Public Review):

      The paper by Gilbert et al. is well-written in a detailed format and the authors are candid in their data interpretation by acknowledging that the described ninein bone defects are mild, transient, and do not lead to major long-lasting defects in adulthood.

      The main strength of the study is presenting a novel link between a centrosomal protein and osteoclasts in the mouse. However, the majority of the work is dedicated to describing the premature ossification phenotype and less attention is paid to how a centrosomal protein affects osteoclast proliferation, survival, and/or differentiation into mature osteoclasts.

      Based on the decrease in the number of osteoclasts (Fig 5E, G, and also per coverslip after 2 days in culture), the authors suggest that the loss of ninein impacts osteoclast proliferation. First, proliferation can be directly quantified using Ki67 staining or EdU incorporation. Second, other interpretations are also plausible and can also be experimentally tested. These include less adhesion and attachment of the mutants to the coverslips, but perhaps more relevant in vivo is cell death of the ninein mutant osteoclasts. It has been established that the loss of centrosome function activates p53-dependent cell death and osteoclasts might be a vulnerable cell population. Quantifying p53 immunoreactivity and/or cell death in osteoclasts might help clarify the phenotype of osteoclast reduction.

    3. Reviewer #3 (Public Review):

      Ninein is a centrosome protein that has been implicated in microtubule anchorage and centrosome cohesion. Mutations in the human ninein gene have been linked to Seckel syndrome and a rare form of skeletal dysplasia. However, the role of ninein in skeletal development remains unknown. Here, we describe a ninein knockout mouse with advanced endochondral ossification during embryonic development. Although the long bones maintain a regular size, the absence of ninein delays the formation of the bone marrow cavity in the prenatal tibia. Likewise, intramembranous ossification in the skull is more developed, leading to a premature closure of the interfrontal suture. We demonstrate that ninein is strongly expressed in osteoclasts of control mice and that its absence reduces the fusion of precursor cells into syncytial osteoclasts. As a consequence, ninein-deficient osteoclasts have a reduced capacity to resorb bone. At the cellular level, the absence of ninein interferes with centrosomal microtubule organization, reduces centrosome cohesion, and provokes the loss of centrosome clustering in multinucleated mature osteoclasts. We propose that centrosomal ninein is important for osteoclast fusion, to enable a functional balance between bone-forming osteoblasts and bone-resorbing osteoclasts during skeletal development.

    1. Author response:

      The following is the authors’ response to the current reviews.

      eLife assessment

      This useful manuscript challenges the utility of current paradigms for estimating brain-age with magnetic resonance imaging measures, but presents inadequate evidence to support the suggestion that an alternative approach focused on predicting cognition is more useful. The paper would benefit from a clearer explication of the methods and a more critical evaluation of the conceptual basis of the different models. This work will be of interest to researchers working on brain-age and related models.

      Thank you so much for providing high-quality reviews on our manuscript. We revised the manuscript to address all of the reviewers’ comments and provided full responses to each of the comments below. Importantly, in this revision, we clarified that we did not intend to use Brain Cognition as an alternative approach as mentioned by the editor. This is because, by design, the variation in fluid cognition explained by Brain Cognition should be higher or equal to that explained by Brain Age. Here we made this point more explicit and further stated that the relationship between Brain Cognition and fluid cognition indicates the upper limit of Brain Age’s capability in capturing fluid cognition. By examining what was captured by Brain Cognition, over and above Brain Age and chronological age via the unique effects of Brain Cognition, we were able to quantify the amount of co-variation between brain MRI and fluid cognition that was missed by Brain Age. And such quantification is the third aim of this study.

      Reviewer #1 (Public Review):

      In this paper, the authors evaluate the utility of brain age derived metrics for predicting cognitive decline by performing a 'commonality' analysis in a downstream regression that enables the contributions of different predictors to be assessed. The main conclusion is that brain age derived metrics do not explain much additional variation in cognition over and above what is already explained by age. The authors propose to use a regression model trained to predict cognition ('brain cognition') as an alternative suited to applications of cognitive decline. While this is less accurate overall than brain age, it explains more unique variance in the downstream regression.

      Importantly, in this revision, we clarified that we did not intend to use Brain Cognition as an alternative approach. This is because, by design, the variation in fluid cognition explained by Brain Cognition should be higher or equal to that explained by Brain Age. Here we made this point more explicit and further stated that the relationship between Brain Cognition and fluid cognition indicates the upper limit of Brain Age’s capability in capturing fluid cognition. By examining what was captured by Brain Cognition, over and above Brain Age and chronological age via the unique effects of Brain Cognition, we were able to quantify the amount of co-variation between brain MRI and fluid cognition that was missed by Brain Age.

      REVISED VERSION: while the authors have partially addressed my concerns, I do not feel they have addressed them all. I do not feel they have addressed the weight instability and concerns about the stacked regression models satisfactorily.

      Please see our responses to #3 below.

      I also must say that I agree with Reviewer 3 about the limitations of the brain age and brain cognition methods conceptually. In particular that the regression model used to predict fluid cognition will by construction explain more variance in cognition than a brain age model that is trained to predict age. This suffers from the same problem the authors raise with brain age and would indeed disappear if the authors had a separate measure of cognition against which to validate and were then to regress this out as they do for age correction. I am aware that these conceptual problems are more widespread than this paper alone (in fact throughout the brain age literature), so I do not believe the authors should be penalised for that. However, I do think they can make these concerns more explicit and further tone down the comments they make about the utility of brain cognition. I have indicated the main considerations about these points in the recommendations section below.

      Thank you so much for raising this point. We now have the following statement in the introduction and discussion to address this concern (see below).

      Briefly, we made it explicit that, by design, the variation in fluid cognition explained by Brain Cognition should be higher or equal to that explained by Brain Age. That is, the relationship between Brain Cognition and fluid cognition indicates the upper limit of Brain Age’s capability in capturing fluid cognition. More importantly, by examining what was captured by Brain Cognition, over and above Brain Age and chronological age via the unique effects of Brain Cognition, we were able to quantify the amount of co-variation between brain MRI and fluid cognition that was missed by Brain Age. And this is the third goal of this present study.

      From Introduction:

      “Third and finally, certain variation in fluid cognition is related to brain MRI, but to what extent does Brain Age not capture this variation? To estimate the variation in fluid cognition that is related to the brain MRI, we could build prediction models that directly predict fluid cognition (i.e., as opposed to chronological age) from brain MRI data. Previous studies found reasonable predictive performances of these cognition-prediction models, built from certain MRI modalities (Dubois et al., 2018; Pat, Wang, Anney, et al., 2022; Rasero et al., 2021; Sripada et al., 2020; Tetereva et al., 2022; for review, see Vieira et al., 2022). Analogous to Brain Age, we called the predicted values from these cognition-prediction models, Brain Cognition. The strength of an out-of-sample relationship between Brain Cognition and fluid cognition reflects variation in fluid cognition that is related to the brain MRI and, therefore, indicates the upper limit of Brain Age’s capability in capturing fluid cognition. That is, by design, the variation in fluid cognition explained by Brain Cognition should be higher than or equal to that explained by Brain Age. Consequently, if we included Brain Cognition, Brain Age and chronological age in the same model to explain fluid cognition, we would be able to examine the unique effects of Brain Cognition that explain fluid cognition beyond Brain Age and chronological age. These unique effects of Brain Cognition, in turn, would indicate the amount of co-variation between brain MRI and fluid cognition that is missed by Brain Age.”

      From Discussion:

      “Third, by introducing Brain Cognition, we showed the extent to which Brain Age indices were not able to capture the variation in fluid cognition that is related to brain MRI. More specifically, using Brain Cognition allowed us to gauge the variation in fluid cognition that is related to the brain MRI, and thereby, to estimate the upper limit of what Brain Age can do. Moreover, by examining what was captured by Brain Cognition, over and above Brain Age and chronological age via the unique effects of Brain Cognition, we were able to quantify the amount of co-variation between brain MRI and fluid cognition that was missed by Brain Age.

      From our results, Brain Cognition, especially from certain cognition-prediction models such as the stacked models, has relatively good predictive performance, consistent with previous studies (Dubois et al., 2018; Pat, Wang, Anney, et al., 2022; Rasero et al., 2021; Sripada et al., 2020; Tetereva et al., 2022; for review, see Vieira et al., 2022). We then examined Brain Cognition using commonality analyses (Nimon et al., 2008) in multiple regression models having a Brain Age index, chronological age and Brain Cognition as regressors to explain fluid cognition. Similar to Brain Age indices, Brain Cognition exhibited large common effects with chronological age. But more importantly, unlike Brain Age indices, Brain Cognition showed large unique effects, up to around 11%. As explained above, the unique effects of Brain Cognition indicated the amount of co-variation between brain MRI and fluid cognition that was missed by a Brain Age index and chronological age. This missing amount was relatively high, considering that Brain Age and chronological age together explained around 32% of the total variation in fluid cognition. Accordingly, if a Brain Age index was used as a biomarker along with chronological age, we would have missed an opportunity to improve the performance of the model by around one-third of the variation explained.”
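
      The variance partitioning described in this passage can be illustrated with a minimal sketch of a three-regressor commonality analysis. It uses simulated data only; the variable names (`brain_age_index`, `brain_cognition`, and so on) and effect sizes are assumptions made for illustration and are not the authors' data or code. The unique and common effects follow the standard R² differences for three predictors, and by construction they sum to the full-model R².

```python
import numpy as np

def r_squared(predictors, y):
    """In-sample R^2 of an ordinary least-squares fit with an intercept."""
    design = np.column_stack([np.ones(len(y))] + list(predictors))
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    return 1.0 - (residuals @ residuals) / np.sum((y - y.mean()) ** 2)

def commonality_three(x1, x2, x3, y):
    """Unique and common effects of three regressors; they sum to the full R^2."""
    r1, r2, r3 = r_squared([x1], y), r_squared([x2], y), r_squared([x3], y)
    r12 = r_squared([x1, x2], y)
    r13 = r_squared([x1, x3], y)
    r23 = r_squared([x2, x3], y)
    r123 = r_squared([x1, x2, x3], y)
    return {
        "unique_1": r123 - r23,
        "unique_2": r123 - r13,
        "unique_3": r123 - r12,
        "common_12": r13 + r23 - r3 - r123,
        "common_13": r12 + r23 - r2 - r123,
        "common_23": r12 + r13 - r1 - r123,
        "common_123": r1 + r2 + r3 - r12 - r13 - r23 + r123,
        "total_r2": r123,
    }

# Simulated (hypothetical) data: cognition depends on age and on an
# age-independent brain signal that only Brain Cognition picks up.
rng = np.random.default_rng(0)
n = 500
age = rng.normal(size=n)
brain_signal = rng.normal(size=n)
cognition = -0.6 * age + 0.4 * brain_signal + rng.normal(scale=0.6, size=n)
brain_age_index = age + rng.normal(scale=0.5, size=n)
brain_cognition = -0.5 * age + 0.5 * brain_signal + rng.normal(scale=0.3, size=n)

# Regressor order: 1 = Brain Age index, 2 = chronological age, 3 = Brain Cognition.
effects = commonality_three(brain_age_index, age, brain_cognition, cognition)
components = sum(v for k, v in effects.items() if k != "total_r2")
```

      In this toy setting, `effects["unique_3"]` plays the role of the unique effect of Brain Cognition, i.e., the co-variation with the target that the other two regressors miss.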

      This is a reasonably good paper and the use of a commonality analysis is a nice contribution to understanding variance partitioning across different covariates. I have some comments that I believe the authors ought to address, which mostly relate to clarity and interpretation.

      Reviewer #1 Public Review #1

      First, from a conceptual point of view, the authors focus exclusively on cognition as a downstream outcome. I would suggest the authors nuance their discussion to provide broader considerations of the utility of their method and on the limits of interpretation of brain age models more generally.

      Thank you for your comments on this issue.

      We now discuss the broader considerations in detail:

      (1) the consistency between our findings on fluid cognition and other recent works on brain disorders,

      (2) the difference between studies investigating the utility of Brain Age in explaining cognitive functioning, including ours and others (e.g., Butler et al., 2021; Cole, 2020, 2020; Jirsaraie, Kaufmann, et al., 2023) and those explaining neurological/psychological disorders (e.g., Bashyam et al., 2020; Rokicki et al., 2021)

      and

      (3) suggested solutions we and others made to optimise the utility of Brain Age for both cognitive functioning and brain disorders.

      From Discussion:

      “This discrepancy between the predictive performance of age-prediction models and the utility of Brain Age indices as a biomarker is consistent with recent findings (for review, see Jirsaraie, Gorelik, et al., 2023), both in the context of cognitive functioning (Jirsaraie, Kaufmann, et al., 2023) and neurological/psychological disorders (Bashyam et al., 2020; Rokicki et al., 2021). For instance, combining different MRI modalities into the prediction models, similar to our stacked models, often leads to the highest performance of age-prediction models, but does not likely explain the highest variance across different phenotypes, including cognitive functioning and beyond (Jirsaraie, Gorelik, et al., 2023).”

      “There is a notable difference between studies investigating the utility of Brain Age in explaining cognitive functioning, including ours and others (e.g., Butler et al., 2021; Cole, 2020, 2020; Jirsaraie, Kaufmann, et al., 2023) and those explaining neurological/psychological disorders (e.g., Bashyam et al., 2020; Rokicki et al., 2021). We consider the former as a normative type of study and the latter as a case-control type of study (Insel et al., 2010; Marquand et al., 2016). Those case-control Brain Age studies focusing on neurological/psychological disorders often build age-prediction models from MRI data of largely healthy participants (e.g., controls in a case-control design or large samples in a population-based design), apply the built age-prediction models to participants without vs. with neurological/psychological disorders and compare Brain Age indices between the two groups. On the one hand, this means that case-control studies treat Brain Age as a method to detect anomalies in the neurological/psychological group (Hahn et al., 2021). On the other hand, this also means that case-control studies have to ignore under-fitted models when applying prediction models built from largely healthy participants to participants with neurological/psychological disorders (i.e., Brain Age may predict chronological age well for the controls, but not for those with a disorder). On the contrary, our study and other normative studies focusing on cognitive functioning often build age-prediction models from MRI data of largely healthy participants and apply the built age-prediction models to participants who are also largely healthy. Accordingly, the age-prediction models for explaining cognitive functioning in normative studies, while not allowing us to detect group-level anomalies, do not suffer from being under-fitted. This unfortunately might limit the generalisability of our study to just the normative type of study. Future work is still needed to test the utility of brain age in the case-control case.”

      “Next, researchers should not select age-prediction models based solely on age-prediction performance. Instead, researchers could select age-prediction models that explained phenotypes of interest the best. Here we selected age-prediction models based on a set of features (i.e., modalities) of brain MRI. This strategy was found effective not only for fluid cognition as we demonstrated here, but also for neurological and psychological disorders as shown elsewhere (Jirsaraie, Gorelik, et al., 2023; Rokicki et al., 2021). Rokicki and colleagues (2021), for instance, found that, while integrating across MRI modalities led to age-prediction models with the highest age-prediction performance, using only T1 structural MRI gave age-prediction models that were better at classifying Alzheimer’s disease. Similarly, using only cerebral blood flow gave age-prediction models that were better at classifying mild/subjective cognitive impairment, schizophrenia and bipolar disorder.

      As opposed to selecting age-prediction models based on a set of features, researchers could also select age-prediction models based on modelling methods. For instance, Jirsaraie and colleagues (2023) compared gradient tree boosting (GTB) and deep-learning brain network (DBN) algorithms in building age-prediction models. They found GTB to have higher age-prediction performance but DBN to have better utility in explaining cognitive functioning. In this case, an algorithm with better utility (e.g., DBN) should be used for explaining a phenotype of interest. Similarly, Bashyam and colleagues (2020) built different DBN-based age-prediction models, varying in age-prediction performance. The DBN models with a higher number of epochs corresponded to higher age-prediction performance. However, DBN-based age-prediction models with a moderate (as opposed to higher or lower) number of epochs were better at classifying Alzheimer’s disease, mild cognitive impairment and schizophrenia. In this case, a model from the same algorithm with better utility (e.g., those DBN with a moderate epoch number) should be used for explaining a phenotype of interest. Accordingly, this calls for a change in research practice, as recently pointed out by Jirsaraie and colleagues (2023, p7), “Despite mounting evidence, there is a persisting assumption across several studies that the most accurate brain age models will have the most potential for detecting differences in a given phenotype of interest”. Future neuroimaging research should aim to build age-prediction models that are not necessarily good at predicting age, but at capturing phenotypes of interest.”

      Reviewer #1 Public Review #2

      Second, from a methods perspective, there is not a sufficient explanation of the methodological procedures in the current manuscript to fully understand how the stacked regression models were constructed. I would request that the authors provide more information to enable the reader to better understand the stacked regression models used to ensure that these models are not overfit.

      Thank you for allowing us an opportunity to clarify our stacked model. We made additional clarification to make this clearer (see below). We wanted to confirm that we did not use test sets to build a stacked model in both lower and higher levels of the Elastic Net models. Test sets were there just for testing the performance of the models.

      From Methods: “We used nested cross-validation (CV) to build these prediction models (see Figure 7). We first split the data into five outer folds, leaving each outer fold with around 100 participants. This number of participants in each fold is to ensure the stability of the test performance across folds. In each outer-fold CV loop, one of the outer folds was treated as an outer-fold test set, and the rest was treated as an outer-fold training set. Ultimately, looping through the nested CV resulted in a) prediction models from each of the 18 sets of features as well as b) prediction models that drew information across different combinations of the 18 separate sets, known as “stacked models.” We specified eight stacked models: “All” (i.e., including all 18 sets of features), “All excluding Task FC”, “All excluding Task Contrast”, “Non-Task” (i.e., including only Rest FC and sMRI), “Resting and Task FC”, “Task Contrast and FC”, “Task Contrast” and “Task FC”. Accordingly, there were 26 prediction models in total for both Brain Age and Brain Cognition.

      To create these 26 prediction models, we applied three steps for each outer-fold loop. The first step aimed at tuning prediction models for each of 18 sets of features. This step only involved the outer-fold training set and did not involve the outer-fold test set. Here, we divided the outer-fold training set into five inner folds and applied inner-fold CV to tune hyperparameters with grid search. Specifically, in each inner-fold CV, one of the inner folds was treated as an inner-fold validation set, and the rest was treated as an inner-fold training set. Within each inner-fold CV loop, we used the inner-fold training set to estimate parameters of the prediction model with a particular set of hyperparameters and applied the estimated model to the inner-fold validation set. After looping through the inner-fold CV, we, then, chose the prediction models that led to the highest performance, reflected by coefficient of determination (R2), on average across the inner-fold validation sets. This led to 18 tuned models, one for each of the 18 sets of features, for each outer fold.

      The second step aimed at tuning stacked models. Same as the first step, the second step only involved the outer-fold training set and did not involve the outer-fold test set. Here, using the same outer-fold training set as the first step, we applied tuned models, created from the first step, one from each of the 18 sets of features, resulting in 18 predicted values for each participant. We then re-divided this outer-fold training set into five new inner folds. In each inner fold, we treated different combinations of the 18 predicted values from separate sets of features as features to predict the targets in separate “stacked” models. Same as the first step, in each inner-fold CV loop, we treated one out of five inner folds as an inner-fold validation set, and the rest as an inner-fold training set. Also as in the first step, we used the inner-fold training set to estimate parameters of the prediction model with a particular set of hyperparameters from our grid. We tuned the hyperparameters of stacked models using grid search by selecting the models with the highest R2 on average across the inner-fold validation sets. This led to eight tuned stacked models.

      The third step aimed at testing the predictive performance of the 18 tuned prediction models from each of the set of features, built from the first step, and eight tuned stacked models, built from the second step. Unlike the first two steps, here we applied the already tuned models to the outer-fold test set. We started by applying the 18 tuned prediction models from each of the sets of features to each observation in the outer-fold test set, resulting in 18 predicted values. We then applied the tuned stacked models to these predicted values from separate sets of features, resulting in eight predicted values.

      To demonstrate the predictive performance, we assessed the similarity between the observed values and the predicted values of each model across outer-fold test sets, using Pearson’s r, coefficient of determination (R2) and mean absolute error (MAE). Note that for R2, we used the sum of squares definition (i.e., R2 = 1 – (sum of squares residuals/total sum of squares)) per a previous recommendation (Poldrack et al., 2020). We considered the predicted values from the outer-fold test sets of models predicting age or fluid cognition, as Brain Age and Brain Cognition, respectively.”
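
      The three performance measures named above can be written out as a short sketch. The function names and the toy numbers are illustrative assumptions. The example is chosen to show why the sum-of-squares definition of R2 is not interchangeable with a squared Pearson's r: predictions that are perfectly correlated with the observations but systematically biased still receive a negative R2.

```python
import numpy as np

def sum_of_squares_r2(observed, predicted):
    """R2 = 1 - (sum of squares residuals / total sum of squares)."""
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def pearson_r(observed, predicted):
    """Pearson's correlation between observed and predicted values."""
    return np.corrcoef(observed, predicted)[0, 1]

def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    return np.mean(np.abs(observed - predicted))

# Toy values: predictions are exactly twice the observations.
observed = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
predicted = 2.0 * observed

r = pearson_r(observed, predicted)             # perfectly correlated
r2 = sum_of_squares_r2(observed, predicted)    # 1 - 55/10 = -4.5
mae = mean_absolute_error(observed, predicted) # (1+2+3+4+5)/5 = 3.0
```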

      Note some previous research, including ours (Tetereva et al., 2022), splits the observations in the outer-fold training set into layer 1 and layer 2 and applies the first and second steps to layers 1 and 2, respectively. Here we decided against this approach and used the same outer-fold training set for both first and second steps in order to avoid potential bias toward the stacked models. This is because, when the data are split into two layers, predictive models built for each separate set of features only use the data from layer 1, while the stacked models use the data from both layers 1 and 2. In practice with large enough data, these two approaches might not differ much, as we demonstrated previously (Tetereva et al., 2022).
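
      The three steps described above can be sketched as follows, assuming scikit-learn's `ElasticNet`, `GridSearchCV` and `KFold`. This is a simplified illustration rather than the authors' pipeline: it uses three simulated sets of features instead of 18, a single stacked model instead of eight, and a much smaller hyperparameter grid. The helper name `tuned_elastic_net` and all data are hypothetical.

```python
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import GridSearchCV, KFold

rng = np.random.default_rng(0)
n = 150
# Simulated stand-ins for separate sets of brain features.
feature_sets = {name: rng.normal(size=(n, 6)) for name in ("set_a", "set_b", "set_c")}
target = sum(X[:, 0] for X in feature_sets.values()) + rng.normal(scale=0.5, size=n)

# Reduced grid for speed (the text describes a 70 x 25 grid).
grid = {"alpha": np.logspace(-1, 2, 5), "l1_ratio": [0.1, 0.5, 1.0]}

def tuned_elastic_net(X, y, seed):
    """Tune Elastic Net with five inner folds; return the model refit on all of X."""
    inner_cv = KFold(n_splits=5, shuffle=True, random_state=seed)
    search = GridSearchCV(ElasticNet(max_iter=10000), grid, cv=inner_cv, scoring="r2")
    return search.fit(X, y).best_estimator_

stacked_pred = np.empty(n)
outer_cv = KFold(n_splits=5, shuffle=True, random_state=1)
for train_idx, test_idx in outer_cv.split(target):
    y_train = target[train_idx]

    # Step 1: tune one model per set of features (outer-fold training set only).
    base_models = {
        name: tuned_elastic_net(X[train_idx], y_train, seed=0)
        for name, X in feature_sets.items()
    }

    # Step 2: predictions for the same training participants become the
    # features of the stacked model, tuned on newly drawn inner folds.
    stack_train = np.column_stack(
        [base_models[name].predict(X[train_idx]) for name, X in feature_sets.items()]
    )
    stacked_model = tuned_elastic_net(stack_train, y_train, seed=2)

    # Step 3: apply the already tuned models to the untouched outer-fold test set.
    stack_test = np.column_stack(
        [base_models[name].predict(X[test_idx]) for name, X in feature_sets.items()]
    )
    stacked_pred[test_idx] = stacked_model.predict(stack_test)
```

      In this sketch the outer-fold test indices are used only in step 3, which mirrors the statement that test sets were not used to build either level of the stacked model.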

      Reviewer #1 Public Review #3

      Please also provide an indication of the different regression strengths that were estimated across the different models and cross-validation splits. Also, how stable were the weights across splits?

      The focus of this article is on the predictions. Still, it is informative for readers to understand how stable the feature importance (i.e., Elastic Net coefficients) is. To demonstrate the stability of feature importance, we now examined the rank stability of feature importance using Spearman’s ρ (see Figure 4). Specifically, we correlated the feature importance between two prediction models of the same features, used in two different outer-fold test sets. Given that there were five outer-fold test sets, we computed 10 Spearman’s ρ for each prediction model of the same features. We found Spearman’s ρ to vary dramatically in both age-prediction (range=.31-.94) and fluid cognition-prediction (range=.16-.84) models. This means that some prediction models were much more stable in their feature importance than others. This is probably due to various factors such as a) the collinearity of features in the model, b) the number of features (e.g., 71,631 features in functional connectivity, which were further reduced to 75 principal components, as compared to 19 features in subcortical volume based on the ASEG atlas), c) the penalisation of coefficients either with ‘Ridge’ or ‘Lasso’ methods, which resulted in reduction as a group of features or selection of a feature among correlated features, respectively, and d) the predictive performance of the models. Understanding the stability of feature importance is beyond the scope of the current article. As mentioned by Reviewer 1, “The predictions can be stable when the coefficients are not,” and we chose to focus on the prediction in the current article.
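
      The rank-stability check described above amounts to correlating coefficient vectors between every pair of outer folds. A minimal sketch, assuming SciPy's `spearmanr` and using simulated coefficients in place of the fitted Elastic Net weights (the names `fold_coefs` and `true_weights` are hypothetical):

```python
from itertools import combinations

import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n_features = 50
true_weights = rng.normal(size=n_features)

# Hypothetical coefficient vectors, one per outer fold (five folds).
fold_coefs = [true_weights + rng.normal(scale=0.5, size=n_features) for _ in range(5)]

# Five folds give C(5, 2) = 10 pairwise comparisons.
rhos = []
for coefs_a, coefs_b in combinations(fold_coefs, 2):
    rho, _ = spearmanr(coefs_a, coefs_b)
    rhos.append(rho)

rho_range = (min(rhos), max(rhos))
```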

      Reviewer #1 Public Review #4

      Please provide more details about the task designs, MRI processing procedures that were employed on this sample in addition to the regression methods and bias correction methods used. For example, there are several different parameterisations of the elastic net, please provide equations to describe the method used here so that readers can easily determine how the regularisation parameters should be interpreted.

      Thank you for the opportunity for us to provide more methodological details.

      First, for the task design, we included the following statements:

      From Methods:

      “HCP-A collected fMRI data from three tasks: Face Name (Sperling et al., 2001), Conditioned Approach Response Inhibition Task (CARIT) (Somerville et al., 2018) and VISual MOTOR (VISMOTOR) (Ances et al., 2009).

      First, the Face Name task (Sperling et al., 2001) taps into episodic memory. The task had three blocks. In the encoding block [Encoding], participants were asked to memorise the names of faces shown. These faces were then shown again in the recall block [Recall] when the participants were asked if they could remember the names of the previously shown faces. There was also the distractor block [Distractor] occurring between the encoding and recall blocks. Here participants were distracted by a Go/NoGo task. We computed six contrasts for this Face Name task: [Encode], [Recall], [Distractor], [Encode vs. Distractor], [Recall vs. Distractor] and [Encode vs. Recall].

      Second, the CARIT task (Somerville et al., 2018) was adapted from the classic Go/NoGo task and taps into inhibitory control. Participants were asked to press a button to all [Go] but not to two [NoGo] shapes. We computed three contrasts for the CARIT task: [NoGo], [Go] and [NoGo vs. Go].

      Third, the VISMOTOR task (Ances et al., 2009) was designed to test simple activation of the motor and visual cortices. Participants saw a checkerboard with a red square either on the left or right. They needed to press a corresponding key to indicate the location of the red square. We computed just one contrast for the VISMOTOR task: [Vismotor], which indicates the presence of the checkerboard vs. baseline.”

      Second, for MRI processing procedures, we included the following statements.

      From Methods: “HCP-A provides details of parameters for brain MRI elsewhere (Bookheimer et al., 2019; Harms et al., 2018). Here we used MRI data that were pre-processed by the HCP-A with recommended methods, including the MSMAll alignment (Glasser et al., 2016; Robinson et al., 2018) and ICA-FIX (Glasser et al., 2016) for functional MRI. We used multiple brain MRI modalities, covering task functional MRI (task fMRI), resting-state functional MRI (rsfMRI) and structural MRI (sMRI), and organised them into 18 sets of features.”

      “Sets of Features 1-10: Task fMRI contrast (Task Contrast)

      Task contrasts reflect fMRI activation relevant to events in each task. Bookheimer and colleagues (2019) provided detailed information about the fMRI in HCP-A. Here we focused on the pre-processed task fMRI Connectivity Informatics Technology Initiative (CIFTI) files with a suffix, “_PA_Atlas_MSMAll_hp0_clean.dtseries.nii.” These CIFTI files encompassed both the cortical mesh surface and subcortical volume (Glasser et al., 2013). Collected using the posterior-to-anterior (PA) phase, these files were aligned using MSMAll (Glasser et al., 2016; Robinson et al., 2018), linearly detrended (see https://groups.google.com/a/humanconnectome.org/g/hcp-users/c/ZLJc092h980/m/GiihzQAUAwAJ) and cleaned from potential artifacts using ICA-FIX (Glasser et al., 2016).

      To extract Task Contrasts, we regressed the fMRI time series on the convolved task events using a double-gamma canonical hemodynamic response function via FMRIB Software Library (FSL)’s FMRI Expert Analysis Tool (FEAT) (Woolrich et al., 2001). We kept FSL’s default high pass cutoff at 200s (i.e., .005 Hz). We then parcellated the contrast ‘cope’ files, using the Glasser atlas (Glasser et al., 2016) for cortical surface regions and FreeSurfer’s automatic segmentation (aseg) (Fischl et al., 2002) for subcortical regions. This resulted in 379 regions, whose number was, in turn, the number of features for each Task Contrast set of features.”

      “Sets of Features 11-13: Task fMRI functional connectivity (Task FC)

      Task FC reflects functional connectivity (FC) among the brain regions during each task, which is considered an important source of individual differences (Elliott et al., 2019; Fair et al., 2007; Gratton et al., 2018). We used the same CIFTI file “_PA_Atlas_MSMAll_hp0_clean.dtseries.nii.” as the task contrasts. Unlike Task Contrasts, here we treated the double-gamma, convolved task events as regressors of no interest and focused on the residuals of the regression from each task (Fair et al., 2007). We computed these regressors in FSL, and regressed them in nilearn (Abraham et al., 2014). Following previous work on task FC (Elliott et al., 2019), we applied a highpass at .008 Hz. For parcellation, we used the same atlases as Task Contrast (Fischl et al., 2002; Glasser et al., 2016). We computed Pearson’s correlations of each pair of 379 regions, resulting in a table of 71,631 non-overlapping FC indices for each task. We then applied r-to-z transformation and principal component analysis (PCA) of 75 components (Rasero et al., 2021; Sripada et al., 2019, 2020). Note that, to avoid data leakage, we conducted the PCA on each training set and applied its definition to the corresponding test set. Accordingly, there were three sets of 75 features for Task FC, one for each task.

      Set of Features 14: Resting-state functional MRI functional connectivity (Rest FC)

      Similar to Task FC, Rest FC reflects functional connectivity (FC) among the brain regions, except that Rest FC occurred during the resting (as opposed to task-performing) period. HCP-A collected Rest FC from four 6.42-min (488 frames) runs across two days, leading to 26-min long data (Harms et al., 2018). On each day, the study scanned two runs of Rest FC, starting with anterior-to-posterior (AP) and then with posterior-to-anterior (PA) phase encoding polarity. We used the “rfMRI_REST_Atlas_MSMAll_hp0_clean.dscalar.nii” file that was pre-processed and concatenated across the four runs. We applied the same computations (i.e., highpass filter, parcellation, Pearson’s correlations, r-to-z transformation and PCA) as for Task FC.

      Sets of Features 15-18: Structural MRI (sMRI)

      sMRI reflects individual differences in brain anatomy. The HCP-A used an established pre-processing pipeline for sMRI (Glasser et al., 2013). We focused on four sets of features: cortical thickness, cortical surface area, subcortical volume and total brain volume. For cortical thickness and cortical surface area, we used Destrieux’s atlas (Destrieux et al., 2010; Fischl, 2012) from FreeSurfer’s “aparc.stats” file, resulting in 148 regions for each set of features. For subcortical volume, we used the aseg atlas (Fischl et al., 2002) from FreeSurfer’s “aseg.stats” file, resulting in 19 regions. For total brain volume, we had five FreeSurfer-based features: “FS_IntraCranial_Vol” or estimated intra-cranial volume, “FS_TotCort_GM_Vol” or total cortical grey matter volume, “FS_Tot_WM_Vol” or total cortical white matter volume, “FS_SubCort_GM_Vol” or total subcortical grey matter volume and “FS_BrainSegVol_eTIV_Ratio” or ratio of brain segmentation volume to estimated total intracranial volume.”
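
      The construction of the Task FC and Rest FC features described above (pairwise Pearson's correlations, r-to-z transformation, then PCA fitted on training data only) can be sketched as follows. The data are simulated and deliberately small (20 regions, 10 components) so the example runs quickly; the function name `fc_features` and the train/test split are assumptions for illustration. With 379 regions the same upper-triangle extraction yields 379 × 378 / 2 = 71,631 indices.

```python
import numpy as np
from sklearn.decomposition import PCA

def fc_features(timeseries):
    """Map a (timepoints, regions) array to Fisher z-transformed FC indices."""
    corr = np.corrcoef(timeseries, rowvar=False)         # region-by-region Pearson's r
    rows, cols = np.triu_indices(corr.shape[0], k=1)     # non-overlapping pairs
    return np.arctanh(corr[rows, cols])                  # r-to-z transformation

# Number of region pairs for the 379-region parcellation in the text.
n_pairs_379 = 379 * 378 // 2

# Simulated participants with a toy parcellation.
rng = np.random.default_rng(0)
n_participants, n_timepoints, n_regions = 60, 50, 20
fc = np.array([
    fc_features(rng.normal(size=(n_timepoints, n_regions)))
    for _ in range(n_participants)
])

# Fit PCA on the training participants only, then apply it to the test
# participants, so no test information leaks into the components.
train, test = fc[:40], fc[40:]
pca = PCA(n_components=10).fit(train)
train_scores = pca.transform(train)
test_scores = pca.transform(test)
```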

      Third, for regression methods and bias correction methods used, we included the following statements:

      From Methods:

      “For the machine learning algorithm, we used Elastic Net (Zou & Hastie, 2005). Elastic Net is a general form of penalised regressions (including Lasso and Ridge regression), allowing us to simultaneously draw information across different brain indices to predict one target variable. Penalised regressions are commonly used for building age-prediction models (Jirsaraie, Gorelik, et al., 2023). Previously we showed that the performance of Elastic Net in predicting cognitive abilities is on par with, if not better than, many non-linear and more-complicated algorithms (Pat, Wang, Bartonicek, et al., 2022; Tetereva et al., 2022). Moreover, Elastic Net coefficients are readily explainable, allowing us to explain how our age-prediction and cognition-prediction models made the prediction from each brain feature (Molnar, 2019; Pat, Wang, Bartonicek, et al., 2022) (see below).

      Elastic Net simultaneously minimises the weighted sum of the features’ coefficients. The degree of penalty to the sum of the features’ coefficients is determined by a shrinkage hyperparameter ‘α’: the greater the α, the more the coefficients shrink, and the more regularised the model becomes. Elastic Net also includes another hyperparameter, ‘l1 ratio’, which determines the degree to which the sum of either the squared (known as ‘Ridge’; l1 ratio=0) or absolute (known as ‘Lasso’; l1 ratio=1) coefficients is penalised (Zou & Hastie, 2005). The objective function of Elastic Net as implemented by sklearn (Pedregosa et al., 2011) is defined as:

      \[ \underset{\beta}{\operatorname{argmin}} \left( \frac{1}{2 \, n_{\text{samples}}} \lVert y - X\beta \rVert_2^2 + \alpha \cdot l_1\text{ratio} \cdot \lVert \beta \rVert_1 + 0.5 \cdot \alpha \cdot (1 - l_1\text{ratio}) \cdot \lVert \beta \rVert_2^2 \right) \]

      where X is the features, y is the target, β is the coefficient and n_samples is the number of observations. In our grid search, we tuned two Elastic Net hyperparameters: α using 70 numbers in log space, ranging from .1 to 100, and l1 ratio using 25 numbers in linear space, ranging from 0 to 1.

      To understand how Elastic Net made a prediction based on different brain features, we examined the coefficients of the tuned model. Elastic Net coefficients can be considered as feature importance, such that more positive Elastic Net coefficients lead to more positive predicted values and, similarly, more negative Elastic Net coefficients lead to more negative predicted values (Molnar, 2019; Pat, Wang, Bartonicek, et al., 2022). While the magnitude of Elastic Net coefficients is regularised (thus making it difficult for us to interpret the magnitude itself directly), we could still indicate that a brain feature with a higher magnitude weighs relatively more strongly in making a prediction. Another benefit of Elastic Net as a penalised regression is that the coefficients are less susceptible to collinearity among features as they have already been regularised (Dormann et al., 2013; Pat, Wang, Bartonicek, et al., 2022).

      Given that we used five-fold nested cross validation, different outer folds may have different degrees of ‘α’ and ‘l1 ratio’, making the final coefficients differ across folds. For instance, for certain sets of features, penalisation may not play a big part (i.e., higher or lower ‘α’ leads to similar predictive performance), resulting in different ‘α’ for different folds. To remedy this in the visualisation of Elastic Net feature importance, we refitted the Elastic Net model to the full dataset without splitting them into five folds and visualised the coefficients on brain images using Brainspace (Vos De Wael et al., 2020) and Nilearn (Abraham et al., 2014) packages. Note, unlike other sets of features, Task FC and Rest FC were modelled after data reduction via PCA. Thus, for Task FC and Rest FC, we, first, multiplied the absolute PCA scores (extracted from the ‘components_’ attribute of ‘sklearn.decomposition.PCA’) with Elastic Net coefficients and, then, summed the multiplied values across the 75 components, leaving 71,631 ROI-pair indices.”
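
      The hyperparameter grid, the objective that scikit-learn's `ElasticNet` minimises, and the mapping of PCA-space coefficients back to ROI pairs can be sketched together. This is an illustration on simulated data with a toy dimensionality (190 FC indices, 10 components) rather than the 71,631 indices and 75 components in the text; `elastic_net_objective` is a hypothetical helper written for this example, and the single (α, l1 ratio) pair is arbitrary rather than tuned.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import ElasticNet

# Grid as described: 70 log-spaced alphas from .1 to 100 and
# 25 linearly spaced l1 ratios from 0 to 1.
alphas = np.logspace(-1, 2, 70)
l1_ratios = np.linspace(0, 1, 25)

def elastic_net_objective(X, y, beta, intercept, alpha, l1_ratio):
    """Value of the penalised least-squares objective for given coefficients."""
    residual = y - X @ beta - intercept
    return (
        residual @ residual / (2 * len(y))
        + alpha * l1_ratio * np.abs(beta).sum()
        + 0.5 * alpha * (1 - l1_ratio) * (beta @ beta)
    )

# Simulated FC indices and target.
rng = np.random.default_rng(0)
fc = rng.normal(size=(80, 190))
y = fc[:, :5].sum(axis=1) + rng.normal(scale=0.5, size=80)

# Model fitted on PCA scores, as for Task FC and Rest FC.
pca = PCA(n_components=10).fit(fc)
scores = pca.transform(fc)
alpha, l1_ratio = 0.1, 0.5
model = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, max_iter=10000).fit(scores, y)

# The fitted coefficients should not do worse than an intercept-only model.
fitted = elastic_net_objective(scores, y, model.coef_, model.intercept_, alpha, l1_ratio)
null = elastic_net_objective(scores, y, np.zeros(10), y.mean(), alpha, l1_ratio)

# Multiply absolute component loadings by the coefficients and sum over
# components, leaving one importance value per ROI pair.
roi_pair_importance = (np.abs(pca.components_) * model.coef_[:, None]).sum(axis=0)
```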

      References

      Abraham, A., Pedregosa, F., Eickenberg, M., Gervais, P., Mueller, A., Kossaifi, J., Gramfort, A., Thirion, B., & Varoquaux, G. (2014). Machine learning for neuroimaging with scikit-learn. Frontiers in Neuroinformatics, 8, 14. https://doi.org/10.3389/fninf.2014.00014

      Ances, B. M., Liang, C. L., Leontiev, O., Perthen, J. E., Fleisher, A. S., Lansing, A. E., & Buxton, R. B. (2009). Effects of aging on cerebral blood flow, oxygen metabolism, and blood oxygenation level dependent responses to visual stimulation. Human Brain Mapping, 30(4), 1120–1132. https://doi.org/10.1002/hbm.20574

      Bashyam, V. M., Erus, G., Doshi, J., Habes, M., Nasrallah, I. M., Truelove-Hill, M., Srinivasan, D., Mamourian, L., Pomponio, R., Fan, Y., Launer, L. J., Masters, C. L., Maruff, P., Zhuo, C., Völzke, H., Johnson, S. C., Fripp, J., Koutsouleris, N., Satterthwaite, T. D., … on behalf of the ISTAGING Consortium, the P. A. disease C., ADNI, and CARDIA studies. (2020). MRI signatures of brain age and disease over the lifespan based on a deep brain network and 14 468 individuals worldwide. Brain, 143(7), 2312–2324. https://doi.org/10.1093/brain/awaa160

      Bookheimer, S. Y., Salat, D. H., Terpstra, M., Ances, B. M., Barch, D. M., Buckner, R. L., Burgess, G. C., Curtiss, S. W., Diaz-Santos, M., Elam, J. S., Fischl, B., Greve, D. N., Hagy, H. A., Harms, M. P., Hatch, O. M., Hedden, T., Hodge, C., Japardi, K. C., Kuhn, T. P., … Yacoub, E. (2019). The Lifespan Human Connectome Project in Aging: An overview. NeuroImage, 185, 335–348. https://doi.org/10.1016/j.neuroimage.2018.10.009

      Butler, E. R., Chen, A., Ramadan, R., Le, T. T., Ruparel, K., Moore, T. M., Satterthwaite, T. D., Zhang, F., Shou, H., Gur, R. C., Nichols, T. E., & Shinohara, R. T. (2021). Pitfalls in brain age analyses. Human Brain Mapping, 42(13), 4092–4101. https://doi.org/10.1002/hbm.25533

      Cole, J. H. (2020). Multimodality neuroimaging brain-age in UK biobank: Relationship to biomedical, lifestyle, and cognitive factors. Neurobiology of Aging, 92, 34–42. https://doi.org/10.1016/j.neurobiolaging.2020.03.014

      Destrieux, C., Fischl, B., Dale, A., & Halgren, E. (2010). Automatic parcellation of human cortical gyri and sulci using standard anatomical nomenclature. NeuroImage, 53(1), 1–15. https://doi.org/10.1016/j.neuroimage.2010.06.010

      Dormann, C. F., Elith, J., Bacher, S., Buchmann, C., Carl, G., Carré, G., Marquéz, J. R. G., Gruber, B., Lafourcade, B., Leitão, P. J., Münkemüller, T., McClean, C., Osborne, P. E., Reineking, B., Schröder, B., Skidmore, A. K., Zurell, D., & Lautenbach, S. (2013). Collinearity: A review of methods to deal with it and a simulation study evaluating their performance. Ecography, 36(1), 27–46. https://doi.org/10.1111/j.1600-0587.2012.07348.x

      Dubois, J., Galdi, P., Paul, L. K., & Adolphs, R. (2018). A distributed brain network predicts general intelligence from resting-state human neuroimaging data. Philosophical Transactions of the Royal Society B: Biological Sciences, 373(1756), 20170284. https://doi.org/10.1098/rstb.2017.0284

      Elliott, M. L., Knodt, A. R., Cooke, M., Kim, M. J., Melzer, T. R., Keenan, R., Ireland, D., Ramrakha, S., Poulton, R., Caspi, A., Moffitt, T. E., & Hariri, A. R. (2019). General functional connectivity: Shared features of resting-state and task fMRI drive reliable and heritable individual differences in functional brain networks. NeuroImage, 189, 516–532. https://doi.org/10.1016/j.neuroimage.2019.01.068

      Fair, D. A., Schlaggar, B. L., Cohen, A. L., Miezin, F. M., Dosenbach, N. U. F., Wenger, K. K., Fox, M. D., Snyder, A. Z., Raichle, M. E., & Petersen, S. E. (2007). A method for using blocked and event-related fMRI data to study “resting state” functional connectivity. NeuroImage, 35(1), 396–405. https://doi.org/10.1016/j.neuroimage.2006.11.051

      Fischl, B. (2012). FreeSurfer. NeuroImage, 62(2), 774–781. https://doi.org/10.1016/j.neuroimage.2012.01.021

      Fischl, B., Salat, D. H., Busa, E., Albert, M., Dieterich, M., Haselgrove, C., van der Kouwe, A., Killiany, R., Kennedy, D., Klaveness, S., Montillo, A., Makris, N., Rosen, B., & Dale, A. M. (2002). Whole Brain Segmentation. Neuron, 33(3), 341–355. https://doi.org/10.1016/S0896-6273(02)00569-X

      Glasser, M. F., Smith, S. M., Marcus, D. S., Andersson, J. L. R., Auerbach, E. J., Behrens, T. E. J., Coalson, T. S., Harms, M. P., Jenkinson, M., Moeller, S., Robinson, E. C., Sotiropoulos, S. N., Xu, J., Yacoub, E., Ugurbil, K., & Van Essen, D. C. (2016). The Human Connectome Project’s neuroimaging approach. Nature Neuroscience, 19(9), 1175–1187. https://doi.org/10.1038/nn.4361

      Glasser, M. F., Sotiropoulos, S. N., Wilson, J. A., Coalson, T. S., Fischl, B., Andersson, J. L., Xu, J., Jbabdi, S., Webster, M., Polimeni, J. R., Van Essen, D. C., & Jenkinson, M. (2013). The minimal preprocessing pipelines for the Human Connectome Project. NeuroImage, 80, 105–124. https://doi.org/10.1016/j.neuroimage.2013.04.127

      Gordon, E. M., Laumann, T. O., Adeyemo, B., Huckins, J. F., Kelley, W. M., & Petersen, S. E. (2016). Generation and Evaluation of a Cortical Area Parcellation from Resting-State Correlations. Cerebral Cortex, 26(1), 288–303. https://doi.org/10.1093/cercor/bhu239

      Gratton, C., Laumann, T. O., Nielsen, A. N., Greene, D. J., Gordon, E. M., Gilmore, A. W., Nelson, S. M., Coalson, R. S., Snyder, A. Z., Schlaggar, B. L., Dosenbach, N. U. F., & Petersen, S. E. (2018). Functional Brain Networks Are Dominated by Stable Group and Individual Factors, Not Cognitive or Daily Variation. Neuron, 98(2), 439-452.e5. https://doi.org/10.1016/j.neuron.2018.03.035

      Hahn, T., Fisch, L., Ernsting, J., Winter, N. R., Leenings, R., Sarink, K., Emden, D., Kircher, T., Berger, K., & Dannlowski, U. (2021). From ‘loose fitting’ to high-performance, uncertainty-aware brain-age modelling. Brain, 144(3), e31–e31. https://doi.org/10.1093/brain/awaa454

      Harms, M. P., Somerville, L. H., Ances, B. M., Andersson, J., Barch, D. M., Bastiani, M., Bookheimer, S. Y., Brown, T. B., Buckner, R. L., Burgess, G. C., Coalson, T. S., Chappell, M. A., Dapretto, M., Douaud, G., Fischl, B., Glasser, M. F., Greve, D. N., Hodge, C., Jamison, K. W., … Yacoub, E. (2018). Extending the Human Connectome Project across ages: Imaging protocols for the Lifespan Development and Aging projects. NeuroImage, 183, 972–984. https://doi.org/10.1016/j.neuroimage.2018.09.060

      Insel, T., Cuthbert, B., Garvey, M., Heinssen, R., Pine, D. S., Quinn, K., Sanislow, C., & Wang, P. (2010). Research Domain Criteria (RDoC): Toward a New Classification Framework for Research on Mental Disorders. American Journal of Psychiatry, 167(7), 748–751. https://doi.org/10.1176/appi.ajp.2010.09091379

      Jirsaraie, R. J., Gorelik, A. J., Gatavins, M. M., Engemann, D. A., Bogdan, R., Barch, D. M., & Sotiras, A. (2023). A systematic review of multimodal brain age studies: Uncovering a divergence between model accuracy and utility. Patterns, 4(4), 100712. https://doi.org/10.1016/j.patter.2023.100712

      Jirsaraie, R. J., Kaufmann, T., Bashyam, V., Erus, G., Luby, J. L., Westlye, L. T., Davatzikos, C., Barch, D. M., & Sotiras, A. (2023). Benchmarking the generalizability of brain age models: Challenges posed by scanner variance and prediction bias. Human Brain Mapping, 44(3), 1118–1128. https://doi.org/10.1002/hbm.26144

      Marquand, A. F., Rezek, I., Buitelaar, J., & Beckmann, C. F. (2016). Understanding Heterogeneity in Clinical Cohorts Using Normative Models: Beyond Case-Control Studies. Biological Psychiatry, 80(7), 552–561. https://doi.org/10.1016/j.biopsych.2015.12.023

      Molnar, C. (2019). Interpretable Machine Learning. A Guide for Making Black Box Models Explainable. https://christophm.github.io/interpretable-ml-book/

      Nimon, K., Lewis, M., Kane, R., & Haynes, R. M. (2008). An R package to compute commonality coefficients in the multiple regression case: An introduction to the package and a practical example. Behavior Research Methods, 40(2), 457–466. https://doi.org/10.3758/BRM.40.2.457

      Pat, N., Wang, Y., Anney, R., Riglin, L., Thapar, A., & Stringaris, A. (2022). Longitudinally stable, brain‐based predictive models mediate the relationships between childhood cognition and socio‐demographic, psychological and genetic factors. Human Brain Mapping, hbm.26027. https://doi.org/10.1002/hbm.26027

      Pat, N., Wang, Y., Bartonicek, A., Candia, J., & Stringaris, A. (2022). Explainable machine learning approach to predict and explain the relationship between task-based fMRI and individual differences in cognition. Cerebral Cortex, bhac235. https://doi.org/10.1093/cercor/bhac235

      Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., & Duchesnay, É. (2011). Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research, 12(85), 2825–2830.

      Poldrack, R. A., Huckins, G., & Varoquaux, G. (2020). Establishment of Best Practices for Evidence for Prediction: A Review. JAMA Psychiatry, 77(5), 534–540. https://doi.org/10.1001/jamapsychiatry.2019.3671

      Rasero, J., Sentis, A. I., Yeh, F.-C., & Verstynen, T. (2021). Integrating across neuroimaging modalities boosts prediction accuracy of cognitive ability. PLOS Computational Biology, 17(3), e1008347. https://doi.org/10.1371/journal.pcbi.1008347

      Robinson, E. C., Garcia, K., Glasser, M. F., Chen, Z., Coalson, T. S., Makropoulos, A., Bozek, J., Wright, R., Schuh, A., Webster, M., Hutter, J., Price, A., Cordero Grande, L., Hughes, E., Tusor, N., Bayly, P. V., Van Essen, D. C., Smith, S. M., Edwards, A. D., … Rueckert, D. (2018). Multimodal surface matching with higher-order smoothness constraints. NeuroImage, 167, 453–465. https://doi.org/10.1016/j.neuroimage.2017.10.037

      Rokicki, J., Wolfers, T., Nordhøy, W., Tesli, N., Quintana, D. S., Alnæs, D., Richard, G., de Lange, A.-M. G., Lund, M. J., Norbom, L., Agartz, I., Melle, I., Nærland, T., Selbæk, G., Persson, K., Nordvik, J. E., Schwarz, E., Andreassen, O. A., Kaufmann, T., & Westlye, L. T. (2021). Multimodal imaging improves brain age prediction and reveals distinct abnormalities in patients with psychiatric and neurological disorders. Human Brain Mapping, 42(6), 1714–1726. https://doi.org/10.1002/hbm.25323

      Somerville, L. H., Bookheimer, S. Y., Buckner, R. L., Burgess, G. C., Curtiss, S. W., Dapretto, M., Elam, J. S., Gaffrey, M. S., Harms, M. P., Hodge, C., Kandala, S., Kastman, E. K., Nichols, T. E., Schlaggar, B. L., Smith, S. M., Thomas, K. M., Yacoub, E., Van Essen, D. C., & Barch, D. M. (2018). The Lifespan Human Connectome Project in Development: A large-scale study of brain connectivity development in 5–21 year olds. NeuroImage, 183, 456–468. https://doi.org/10.1016/j.neuroimage.2018.08.050

      Sperling, R. A., Bates, J. F., Cocchiarella, A. J., Schacter, D. L., Rosen, B. R., & Albert, M. S. (2001). Encoding novel face-name associations: A functional MRI study. Human Brain Mapping, 14(3), 129–139. https://doi.org/10.1002/hbm.1047

      Sripada, C., Angstadt, M., Rutherford, S., Kessler, D., Kim, Y., Yee, M., & Levina, E. (2019). Basic Units of Inter-Individual Variation in Resting State Connectomes. Scientific Reports, 9(1), Article 1. https://doi.org/10.1038/s41598-018-38406-5

      Sripada, C., Angstadt, M., Rutherford, S., Taxali, A., & Shedden, K. (2020). Toward a “treadmill test” for cognition: Improved prediction of general cognitive ability from the task activated brain. Human Brain Mapping, 41(12), 3186–3197. https://doi.org/10.1002/hbm.25007

      Tetereva, A., Li, J., Deng, J. D., Stringaris, A., & Pat, N. (2022). Capturing brain‐cognition relationship: Integrating task‐based fMRI across tasks markedly boosts prediction and test‐retest reliability. NeuroImage, 263, 119588. https://doi.org/10.1016/j.neuroimage.2022.119588

      Vieira, B. H., Pamplona, G. S. P., Fachinello, K., Silva, A. K., Foss, M. P., & Salmon, C. E. G. (2022). On the prediction of human intelligence from neuroimaging: A systematic review of methods and reporting. Intelligence, 93, 101654. https://doi.org/10.1016/j.intell.2022.101654

      Vos De Wael, R., Benkarim, O., Paquola, C., Lariviere, S., Royer, J., Tavakol, S., Xu, T., Hong, S.-J., Langs, G., Valk, S., Misic, B., Milham, M., Margulies, D., Smallwood, J., & Bernhardt, B. C. (2020). BrainSpace: A toolbox for the analysis of macroscale gradients in neuroimaging and connectomics datasets. Communications Biology, 3(1), 103. https://doi.org/10.1038/s42003-020-0794-7

      Woolrich, M. W., Ripley, B. D., Brady, M., & Smith, S. M. (2001). Temporal Autocorrelation in Univariate Linear Modeling of FMRI Data. NeuroImage, 14(6), 1370–1386. https://doi.org/10.1006/nimg.2001.0931

      Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67(2), 301–320. https://doi.org/10.1111/j.1467-9868.2005.00503.x


      The following is the authors’ response to the previous reviews.

      eLife assessment

      This useful manuscript challenges the utility of current paradigms for estimating brain-age with magnetic resonance imaging measures, but presents inadequate evidence to support the suggestion that an alternative approach focused on predicting cognition is more useful. The paper would benefit from a clearer explication of the methods and a more critical evaluation of the conceptual basis of the different models. This work will be of interest to researchers working on brain-age and related models.

      Thank you so much for providing high-quality reviews on our manuscript. We revised the manuscript to address all of the reviewers’ comments and provided full responses to each of the comments below. Importantly, in this revision, we clarified that we did not intend to use Brain Cognition as an alternative approach. This is because, by design, the variation in fluid cognition explained by Brain Cognition should be higher or equal to that explained by Brain Age. Here we made this point more explicit and further stated that the relationship between Brain Cognition and fluid cognition indicates the upper limit of Brain Age’s capability in capturing fluid cognition. By examining what was captured by Brain Cognition, over and above Brain Age and chronological age via the unique effects of Brain Cognition, we were able to quantify the amount of co-variation between brain MRI and fluid cognition that was missed by Brain Age. And such quantification is the third aim of this study.

      Public Reviews:

      Reviewer 1 (Public Review):

      In this paper, the authors evaluate the utility of brain-age-derived metrics for predicting cognitive decline by performing a 'commonality' analysis in a downstream regression that enables the different contribution of different predictors to be assessed. The main conclusion is that brain-age-derived metrics do not explain much additional variation in cognition over and above what is already explained by age. The authors propose to use a regression model trained to predict cognition ("brain-cognition") as an alternative suited to applications of cognitive decline. While this is less accurate overall than brain age, it explains more unique variance in the downstream regression.

      (1) I thank the authors for addressing many of my concerns with this revision. However, I do not feel they have addressed them all. In particular I think the authors could do more to address the concern I raised about the instability of the regression coefficients and about providing enough detail to determine that the stacked regression models do not overfit.

      Thank you Reviewer 1 for the comment. We addressed them in our response to Reviewer 1 Recommendations For The Authors #1 and #2 (see below).

      (2) In considering my responses to the authors' revision, I also must say that I agree with Reviewer 3 about the limitations of the brain age and brain cognition methods conceptually. In particular, the regression model used to predict fluid cognition will by construction explain more variance in cognition than a brain age model that is trained to predict age. To be fair, these conceptual problems are more widespread than this paper alone, so I do not believe the authors should be penalised for that. However, I would recommend making these concerns more explicit in the manuscript.

      Thank you Reviewer 1 for the comment. We addressed them in our response to Reviewer 1 Recommendations For The Authors #3 (see below).

      Reviewer 2 (Public Review):

      In this study, the authors aimed to evaluate the contribution of brain-age indices in capturing variance in cognitive decline and proposed an alternative index, brain-cognition, for consideration.

      The study employs suitable methods and data to address the research questions, and the methods and results sections are generally clear and easy to follow.

      I appreciate the authors' efforts in significantly improving the paper, including some considerable changes, from the original submission. While not all reviewer points were tackled, the majority of them were adequately addressed. These include additional analyses, more clarity in the methods and a much richer and nuanced discussion. While recognising the merits of the revised paper, I have a few additional comments.

      (1) Perhaps it would help the reader to note that it might be expected for brain-cognition to account for a significantly larger variance (11%) in fluid cognition, in contrast to brain-age. This stems from the fact that the authors specifically trained brain-cognition to predict fluid cognition, the very variable under consideration. In line with this, the authors later recommend that researchers considering the use of brain-age should evaluate its utility using a regression approach. The latter involves including a brain index (e.g. brain-cognition) previously trained to predict the regression's target variable (e.g. fluid cognition) alongside a brain-age index (e.g., corrected brain-age gap). If the target-trained brain index outperforms the brain-age metric, it suggests that relying solely on brain-age might not be the optimal choice. Although not necessarily the case, is it surprising for the target-trained brain index to demonstrate better performance than brain-age? This harks back to the broader point raised in the initial review: while brain-age may prove useful (though sometimes with modest effect sizes) across diverse outcomes as a generally applicable metric, a brain index tailored for predicting a specific outcome, such as brain-cognition in this case, might capture a considerably larger share of variance in that specific context but could lack broader applicability. The latter aspect needs to be empirically assessed.

      Thank you so much for raising this point. Reviewer 1 (Public Review #2/Recommendations For The Authors #3) and Reviewer 3 (Recommendations for the Authors #1) made a similar observation. We now made changes to the introduction and discussion to address this concern (please see our responses to Reviewer 1 Recommendations For The Authors #3 below).

      Briefly, as in our 2nd revision, we did not intend to compare Brain Age with Brain Cognition since, by design, the variation in fluid cognition explained by Brain Cognition should be higher or equal to that explained by Brain Age. Here we made this point more explicit and further stated that the relationship between Brain Cognition and fluid cognition indicates the upper limit of Brain Age’s capability in capturing fluid cognition. By examining what was captured by Brain Cognition, over and above Brain Age and chronological age via the unique effects of Brain Cognition, we were able to quantify the amount of co-variation between brain MRI and fluid cognition that was missed by Brain Age. And such quantification is the third aim of this study.

      (2) Furthermore, the discussion pertaining to training brain-age models on healthy populations for subsequent testing on individuals with neurological or psychological disorders seems somewhat one-sided within the broader debate. This one-sidedness might potentially confuse readers. It is worth noting that the choice to employ healthy participants in the training model is likely deliberate, serving as a norm against which atypical populations are compared. To provide a more comprehensive understanding, referencing Tim Hahn's counterargument to Bashyam's perspective could offer a more complete view (https://academic.oup.com/brain/article/144/3/e31/6214475?login=false).

      Thank you Reviewer 2 for bringing up this issue. We have now revised the paragraph in question and added nuances on the usage of Brain Age for normative vs. case-control studies. We also cited Tim Hahn’s article that explained the conceptual foundation of the use of Brain Age in case-control studies. Please see below. Additionally, we also made a statement about our study not being able to address issues about the case-control studies directly in the newly written conclusion (see Reviewer 3 Recommendations for the Authors #3).

      Discussion:

      “There is a notable difference between studies investigating the utility of Brain Age in explaining cognitive functioning, including ours and others (e.g., Butler et al., 2021; Cole, 2020; Jirsaraie et al., 2023) and those explaining neurological/psychological disorders (e.g., Bashyam et al., 2020; Rokicki et al., 2021). We consider the former as a normative type of study and the latter as a case-control type of study (Insel et al., 2010; Marquand et al., 2016). Those case-control Brain Age studies focusing on neurological/psychological disorders often build age-prediction models from MRI data of largely healthy participants (e.g., controls in a case-control design or large samples in a population-based design), apply the built age-prediction models to participants without vs. with neurological/psychological disorders and compare Brain Age indices between the two groups. On the one hand, this means that case-control studies treat Brain Age as a method to detect anomalies in the neurological/psychological group (Hahn et al., 2021). On the other hand, this also means that case-control studies have to ignore under-fitted models when applying prediction models built from largely healthy participants to participants with neurological/psychological disorders (i.e., Brain Age may predict chronological age well for the controls, but not for those with a disorder). In contrast, our study and other normative studies focusing on cognitive functioning often build age-prediction models from MRI data of largely healthy participants and apply the built age-prediction models to participants who are also largely healthy. Accordingly, the age-prediction models for explaining cognitive functioning in normative studies, while not allowing us to detect group-level anomalies, do not suffer from being under-fitted. This unfortunately might limit the generalisability of our study to just the normative type of study. Future work is still needed to test the utility of Brain Age in the case-control setting.”

      (3) Overall, this paper makes a significant contribution to the field of brain-age and related brain indices and their utility.

      Thank you for the encouragement.

      Reviewer 3 (Public Review):

      The main question of this article is as follows: "To what extent does having information on brain-age improve our ability to capture declines in fluid cognition beyond knowing a person's chronological age?" This question is worthwhile, considering that there is considerable confusion in the field about the nature of brain-age.

      (1) Thank you to the authors for addressing so many of my concerns with this revision. There are a few points that I feel still need addressing/clarifying related to 1) calculating brain cognition, 2) the inevitability of their results, and 3) their continued recommendation to use brain-age metrics.

      Thank you Reviewer 3 for the comment. We addressed them in our response to Reviewer 3 Recommendations For The Authors #1-3 (see below).

      Recommendations for the authors:

      Reviewer 1 (Recommendations For The Authors):

      (1) I do not feel the authors have fully addressed the concern I raised about the stacked regression models. Despite the new figure, it is still not entirely clear what the authors are using as the training set in the final step. To be clear, the problem occurs because of the parameters, not the hyperparameters (which the authors now state that they are optimising via nested grid search). In other words, given a regression model y = X*beta, if the X are taken to be predictions from a lower level regression model, then they contain information that is derived from both the training set and the test set for the model that this was trained on. If the split is the same (i.e. the predictions are derived on the same test set as is being used at the second level), then this can lead to overfitting. It is not clear to me whether the authors have done this or not. Please provide additional detail to clarify this point.

      Thank you for allowing us an opportunity to clarify our stacked model. We wanted to confirm that we did not use test sets to build a stacked model in both lower and higher levels of the Elastic Net models. Test sets were there just for testing the performance of the models. We made additional clarification to make this clearer (see below). Let us explain what we did and provide the rationales below.

      From Methods:

      “We used nested cross-validation (CV) to build these prediction models (see Figure 7). We first split the data into five outer folds, leaving each outer fold with around 100 participants. This number of participants in each fold is to ensure the stability of the test performance across folds. In each outer-fold CV loop, one of the outer folds was treated as an outer-fold test set, and the rest was treated as an outer-fold training set. Ultimately, looping through the nested CV resulted in a) prediction models from each of the 18 sets of features as well as b) prediction models that drew information across different combinations of the 18 separate sets, known as “stacked models.” We specified eight stacked models: “All” (i.e., including all 18 sets of features), “All excluding Task FC”, “All excluding Task Contrast”, “Non-Task” (i.e., including only Rest FC and sMRI), “Resting and Task FC”, “Task Contrast and FC”, “Task Contrast” and “Task FC”. Accordingly, there were 26 prediction models in total for both Brain Age and Brain Cognition.

      To create these 26 prediction models, we applied three steps for each outer-fold loop. The first step aimed at tuning prediction models for each of 18 sets of features. This step only involved the outer-fold training set and did not involve the outer-fold test set. Here, we divided the outer-fold training set into five inner folds and applied inner-fold CV to tune hyperparameters with grid search. Specifically, in each inner-fold CV, one of the inner folds was treated as an inner-fold validation set, and the rest was treated as an inner-fold training set. Within each inner-fold CV loop, we used the inner-fold training set to estimate parameters of the prediction model with a particular set of hyperparameters and applied the estimated model to the inner-fold validation set. After looping through the inner-fold CV, we, then, chose the prediction models that led to the highest performance, reflected by coefficient of determination (R2), on average across the inner-fold validation sets. This led to 18 tuned models, one for each of the 18 sets of features, for each outer fold.

      The second step aimed at tuning stacked models. Same as the first step, the second step only involved the outer-fold training set and did not involve the outer-fold test set. Here, using the same outer-fold training set as the first step, we applied the tuned models created in the first step, one from each of the 18 sets of features, resulting in 18 predicted values for each participant. We then re-divided this outer-fold training set into five new inner folds. In each inner fold, we treated different combinations of the 18 predicted values from separate sets of features as features to predict the targets in separate “stacked” models. Same as the first step, in each inner-fold CV loop, we treated one out of five inner folds as an inner-fold validation set, and the rest as an inner-fold training set. Also as in the first step, we used the inner-fold training set to estimate parameters of the prediction model with a particular set of hyperparameters from our grid. We tuned the hyperparameters of stacked models using grid search by selecting the models with the highest R2 on average across the inner-fold validation sets. This led to eight tuned stacked models.

      The third step aimed at testing the predictive performance of the 18 tuned prediction models, one from each of the sets of features, built in the first step, and the eight tuned stacked models, built in the second step. Unlike the first two steps, here we applied the already tuned models to the outer-fold test set. We started by applying the 18 tuned prediction models from each of the sets of features to each observation in the outer-fold test set, resulting in 18 predicted values. We then applied the tuned stacked models to these predicted values from separate sets of features, resulting in eight predicted values.

      To demonstrate the predictive performance, we assessed the similarity between the observed values and the predicted values of each model across outer-fold test sets, using Pearson’s r, coefficient of determination (R2) and mean absolute error (MAE). Note that for R2, we used the sum of squares definition (i.e., R2 = 1 – (sum of squares residuals/total sum of squares)) per a previous recommendation (Poldrack et al., 2020). We considered the predicted values from the outer-fold test sets of models predicting age or fluid cognition, as Brain Age and Brain Cognition, respectively.”

      Author response image 1.

      Diagram of the nested cross-validation used for creating predictions for models of each set of features as well as predictions for stacked models.
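The three steps and the performance metrics described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' pipeline: it uses two simulated feature sets instead of 18, a single stacked model instead of eight, and a small made-up hyperparameter grid (`GRID`); the helper names `tune` and `nested_cv_stack` are hypothetical.

```python
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import GridSearchCV, KFold

# Illustrative grid only; the actual grid is not given in the text above.
GRID = {"alpha": [0.01, 0.1, 1.0], "l1_ratio": [0.1, 0.5, 0.9]}


def tune(X, y, seed):
    """Inner five-fold grid search; returns the best model refitted on all of X."""
    inner = KFold(n_splits=5, shuffle=True, random_state=seed)
    search = GridSearchCV(ElasticNet(max_iter=10_000), GRID, cv=inner, scoring="r2")
    return search.fit(X, y).best_estimator_


def nested_cv_stack(feature_sets, y, seed=0):
    """Return stacked-model predictions for every participant from outer-fold test sets."""
    y_pred = np.empty(len(y), dtype=float)
    outer = KFold(n_splits=5, shuffle=True, random_state=seed)
    for train, test in outer.split(feature_sets[0]):
        # Step 1: tune one model per feature set, using the outer-fold training set only.
        base = [tune(X[train], y[train], seed) for X in feature_sets]
        # Step 2: predict the same training set with the tuned models, then tune the
        # stacked model on those predictions with a new inner split (different seed).
        z_train = np.column_stack([m.predict(X[train]) for m, X in zip(base, feature_sets)])
        stacked = tune(z_train, y[train], seed + 1)
        # Step 3: apply the tuned base and stacked models to the untouched test set.
        z_test = np.column_stack([m.predict(X[test]) for m, X in zip(base, feature_sets)])
        y_pred[test] = stacked.predict(z_test)
    return y_pred


# Simulated data (hypothetical): 100 participants, two feature sets of 10 features.
rng = np.random.default_rng(0)
n = 100
X1, X2 = rng.normal(size=(n, 10)), rng.normal(size=(n, 10))
y = X1[:, 0] + X2[:, 0] + rng.normal(scale=0.1, size=n)

y_pred = nested_cv_stack([X1, X2], y)
pearson_r = np.corrcoef(y, y_pred)[0, 1]
r2 = r2_score(y, y_pred)  # 1 - (residual sum of squares / total sum of squares)
mae = mean_absolute_error(y, y_pred)
```

In this sketch the outer-fold test indices never enter `tune`, so neither the parameters nor the hyperparameters of the base or stacked models are estimated from test data.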

      Note that some previous research, including ours (Tetereva et al., 2022), splits the observations in the outer-fold training set into layer 1 and layer 2 and applies the first and second steps to layers 1 and 2, respectively. Here we decided against this approach and used the same outer-fold training set for both the first and second steps in order to avoid potential bias toward the stacked models. This is because, when the data are split into two layers, predictive models built for each separate set of features only use the data from layer 1, while the stacked models use the data from both layers 1 and 2. In practice, with large enough data, these two approaches might not differ much, as we demonstrated previously (Tetereva et al., 2022).

      (2) I also do not feel the authors have fully addressed the concern I raised about stability of the regression coefficients over splits of the data. I wanted to see the regression coefficients, not the predictions. The predictions can be stable when the coefficients are not.

      The focus of this article is on the predictions. Still, as pointed out by Reviewer 1, it is informative for readers to understand how stable the feature importance (i.e., Elastic Net coefficients) is. To demonstrate the stability of feature importance, we now examined the rank stability of feature importance using Spearman’s ρ (see Figure 4). Specifically, we correlated the feature importance between two prediction models of the same features, used in two different outer-fold test sets. Given that there were five outer-fold test sets, we computed 10 Spearman’s ρ values for each prediction model of the same features. We found that Spearman’s ρ varied dramatically in both age-prediction (range=.31-.94) and fluid cognition-prediction (range=.16-.84) models. This means that some prediction models were much more stable in their feature importance than others. This is probably due to various factors such as a) the collinearity of features in the model, b) the number of features (e.g., 71,631 features in functional connectivity, which were further reduced to 75 principal components, as compared to 19 features in subcortical volume based on the ASEG atlas), c) the penalisation of coefficients either with ‘Ridge’ or ‘Lasso’ methods, which resulted in reduction as a group of features or selection of a feature among correlated features, respectively, and d) the predictive performance of the models. Understanding the stability of feature importance is beyond the scope of the current article. As mentioned by Reviewer 1, “The predictions can be stable when the coefficients are not,” and we chose to focus on the prediction in the current article.

      Author response image 2.

      Stability of feature importance (i.e., Elastic Net Coefficients) of prediction models. Each dot represents rank stability (reflected by Spearman’s ρ) in the feature importance between two prediction models of the same features, used in two different outer-fold test sets. Given that there were five outer-fold test sets, there were 10 Spearman’s ρs for each prediction model. The numbers to the right of the plots indicate the mean of Spearman’s ρ for each prediction model.
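
      As an illustration only (this is not the authors' analysis code), the rank-stability computation described above can be sketched in plain Python. The coefficient values in `toy_coefs` are invented for the example, and the sketch assumes that feature importance is the raw coefficient vector from each outer fold; whether signed or absolute coefficients were correlated is not stated in the response.

```python
from itertools import combinations


def ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


def rank_stability(coefs_by_fold):
    """Spearman's rho for every pair of outer folds."""
    return [spearman(a, b) for a, b in combinations(coefs_by_fold, 2)]


# Hypothetical coefficients: five outer folds, six features each.
toy_coefs = [
    [0.50, -0.20, 0.10, 0.00, 0.30, -0.40],
    [0.45, -0.25, 0.05, 0.02, 0.35, -0.30],
    [0.40, -0.10, 0.20, -0.05, 0.25, -0.45],
    [0.55, -0.30, 0.00, 0.10, 0.20, -0.35],
    [0.30, -0.15, 0.15, 0.05, 0.40, -0.50],
]

rhos = rank_stability(toy_coefs)  # five folds give 10 fold pairs
mean_rho = sum(rhos) / len(rhos)
print(len(rhos), round(min(rhos), 2), round(max(rhos), 2), round(mean_rho, 2))
```

      The range and mean of `rhos` correspond to the quantities reported per prediction model in the text and in the figure caption above.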

      (3) I also must say that I agree with Reviewer 3 about the limitations of the brain-age and brain-cognition methods conceptually. In particular that the regression model used to predict fluid cognition will by construction explain more variance in cognition than a brain-age model that is trained to predict age. This suffers from the same problem the authors raise with brain-age and I agree that this would probably disappear if the authors had a separate measure of cognition against which to validate and were then to regress this out as they do for age correction. I am aware that these conceptual problems are more widespread than this paper alone (in fact throughout the brain-age literature), so I do not believe the authors should be penalised for that. However, I do think they can make these concerns more explicit and further tone down the comments they make about the utility of brain-cognition.

      Thank you so much for raising this point. Reviewer 2 (Public Review #1) and Reviewer 3 (Recommendations for the Authors #1) made a similar observation. We have now made changes to the introduction and discussion to address this concern (see below).

      Briefly, we made it explicit that, by design, the variation in fluid cognition explained by Brain Cognition should be higher than or equal to that explained by Brain Age. That is, the relationship between Brain Cognition and fluid cognition indicates the upper limit of Brain Age’s capability in capturing fluid cognition. More importantly, by examining what was captured by Brain Cognition, over and above Brain Age and chronological age via the unique effects of Brain Cognition, we were able to quantify the amount of co-variation between brain MRI and fluid cognition that was missed by Brain Age. This is the third goal of the present study.

      From Introduction:

      “Third and finally, certain variation in fluid cognition is related to brain MRI, but to what extent does Brain Age not capture this variation? To estimate the variation in fluid cognition that is related to the brain MRI, we could build prediction models that directly predict fluid cognition (i.e., as opposed to chronological age) from brain MRI data. Previous studies found reasonable predictive performances of these cognition-prediction models, built from certain MRI modalities (Dubois et al., 2018; Pat et al., 2022; Rasero et al., 2021; Sripada et al., 2020; Tetereva et al., 2022; for review, see Vieira et al., 2022). Analogous to Brain Age, we called the predicted values from these cognition-prediction models, Brain Cognition. The strength of an out-of-sample relationship between Brain Cognition and fluid cognition reflects variation in fluid cognition that is related to the brain MRI and, therefore, indicates the upper limit of Brain Age’s capability in capturing fluid cognition. That is, by design, the variation in fluid cognition explained by Brain Cognition should be higher than or equal to that explained by Brain Age. Consequently, if we included Brain Cognition, Brain Age and chronological age in the same model to explain fluid cognition, we would be able to examine the unique effects of Brain Cognition that explain fluid cognition beyond Brain Age and chronological age. These unique effects of Brain Cognition, in turn, would indicate the amount of co-variation between brain MRI and fluid cognition that is missed by Brain Age.”

      From Discussion:

      “Third, by introducing Brain Cognition, we showed the extent to which Brain Age indices were not able to capture the variation in fluid cognition that is related to brain MRI. More specifically, using Brain Cognition allowed us to gauge the variation in fluid cognition that is related to the brain MRI, and thereby, to estimate the upper limit of what Brain Age can do. Moreover, by examining what was captured by Brain Cognition, over and above Brain Age and chronological age via the unique effects of Brain Cognition, we were able to quantify the amount of co-variation between brain MRI and fluid cognition that was missed by Brain Age.

      From our results, Brain Cognition, especially from certain cognition-prediction models such as the stacked models, has relatively good predictive performance, consistent with previous studies (Dubois et al., 2018; Pat et al., 2022; Rasero et al., 2021; Sripada et al., 2020; Tetereva et al., 2022; for review, see Vieira et al., 2022). We then examined Brain Cognition using commonality analyses (Nimon et al., 2008) in multiple regression models having a Brain Age index, chronological age and Brain Cognition as regressors to explain fluid cognition. Similar to Brain Age indices, Brain Cognition exhibited large common effects with chronological age. But more importantly, unlike Brain Age indices, Brain Cognition showed large unique effects, up to around 11%. As explained above, the unique effects of Brain Cognition indicated the amount of co-variation between brain MRI and fluid cognition that was missed by a Brain Age index and chronological age. This missing amount was relatively high, considering that Brain Age and chronological age together explained around 32% of the total variation in fluid cognition. Accordingly, if a Brain Age index was used as a biomarker along with chronological age, we would have missed an opportunity to improve the performance of the model by around one-third of the variation explained.”

      Reviewer #3 (Recommendations For The Authors):

      Thank you to the authors for addressing so many of my concerns with this revision. There are a few points that I feel still need addressing/clarifying related to: 1) calculating brain cognition, 2) the inevitability of their results, and 3) their continued recommendation to use brain age metrics.

      (1) I understand your point here. I think the distinction is that it is fine to build predictive models, but then there is no need to go through this intermediate step of "brain-cognition". Just say that brain features can predict cognition XX well, and brain-age (or some related metric) can predict cognition YY well. It creates a confusing framework for the reader that can lead them to believe that "brain-cognition" is not just a predicted value of fluid cognition from a model using brain features to predict cognition. While you clearly state that that is in fact what it is in the text, which is a huge improvement, I do not see what is added by going through brain-cognition instead of simply just obtaining a change in R² where the first model uses brain features alone to predict cognition, and the second adds on brain-age (or related metrics), or vice versa, depending on the question. Please do this analysis, and either compare and contrast it with going through "brain-cognition" in your paper, or switch to this analysis, as it more directly addresses the question of the incremental predictive utility of brain-age above and beyond brain features.

      Thank you so much for raising this point. Reviewer 1 (Public Review #2/Recommendations For The Authors #3) and Reviewer 2 (Public Review #1) made a similar observation. We have now made changes to the introduction and discussion to address this concern (see our responses to Reviewer 1 Recommendations For The Authors #3 above).

      Briefly, as in our 2nd revision, we made it explicitly clear that we did not intend to compare Brain Age with Brain Cognition since, by design, the variation in fluid cognition explained by Brain Cognition should be higher than or equal to that explained by Brain Age. Moreover, by examining what was captured by Brain Cognition, over and above Brain Age and chronological age via the unique effects of Brain Cognition, we were able to quantify the amount of co-variation between brain MRI and fluid cognition that was missed by Brain Age.

      We have thought about changing the name Brain Cognition into something along the lines of “predicted values of prediction models predicting fluid cognition based on brain MRI.” However, this made the manuscript hard to follow, especially with the commonality analyses. For instance, the sentence, “Here, we tested Brain Cognition’s unique effects in multiple regression models with a Brain Age index, chronological age and Brain Cognition as regressors to explain fluid cognition” would become “Here, we tested predicted values of prediction models predicting fluid cognition based on brain MRI unique effects in multiple regression models with a Brain Age index, chronological age and predicted values of prediction models predicting fluid cognition based on brain MRI as regressors to explain fluid cognition.” We believe, given our additional explanation (see our responses to Reviewer 1 Recommendations For The Authors #3 above), readers should understand what Brain Cognition is, and that we did not intend to compare Brain Age and Brain Cognition directly.

      As for the suggested analysis, “obtaining a change in R² where the first model uses brain features alone to predict cognition, and the second adds on brain-age (or related metrics), or vice versa,” we have already done this in the form of commonality analysis (Nimon et al., 2008) (see Figure 7 below). That is, to obtain unique and common effects of the regressors, we need to look at all of the possible changes in R² when all possible subsets of regressors are excluded or included (see equations 12 and 13 below).

      From Methods:

      “Similar to the above multiple regression model, we had chronological age, each Brain Age index and Brain Cognition as the regressors for fluid cognition:

      $$\text{Fluid Cognition}_i = \beta_0 + \beta_1\,\text{Chronological Age}_i + \beta_2\,\text{Brain Age Index}_{i,j} + \beta_3\,\text{Brain Cognition}_i + \varepsilon_i, \tag{12}$$

      Applying the commonality analysis here allowed us, first, to investigate the additive, unique effects of Brain Cognition, over and above chronological age and Brain Age indices. More importantly, the commonality analysis also enabled us to test the common, shared effects that Brain Cognition had with chronological age and Brain Age indices in explaining fluid cognition. We calculated the commonality analysis as follows (Nimon et al., 2017):

      $$\text{Unique Effect}_{\text{chronological age}} = \Delta R^2_{\text{chronological age}} = R^2_{\text{chronological age, Brain Age index, Brain Cognition}} - R^2_{\text{Brain Age index, Brain Cognition}}$$

      $$\text{Unique Effect}_{\text{Brain Age index}} = \Delta R^2_{\text{Brain Age index}} = R^2_{\text{chronological age, Brain Age index, Brain Cognition}} - R^2_{\text{chronological age, Brain Cognition}}$$

      $$\text{Unique Effect}_{\text{Brain Cognition}} = \Delta R^2_{\text{Brain Cognition}} = R^2_{\text{chronological age, Brain Age index, Brain Cognition}} - R^2_{\text{chronological age, Brain Age index}}$$

      $$\text{Common Effect}_{\text{chronological age, Brain Age index}} = R^2_{\text{chronological age, Brain Cognition}} + R^2_{\text{Brain Age index, Brain Cognition}} - R^2_{\text{Brain Cognition}} - R^2_{\text{chronological age, Brain Age index, Brain Cognition}}$$

      $$\text{Common Effect}_{\text{chronological age, Brain Cognition}} = R^2_{\text{chronological age, Brain Age index}} + R^2_{\text{Brain Age index, Brain Cognition}} - R^2_{\text{Brain Age index}} - R^2_{\text{chronological age, Brain Age index, Brain Cognition}}$$

      $$\text{Common Effect}_{\text{Brain Age index, Brain Cognition}} = R^2_{\text{chronological age, Brain Age index}} + R^2_{\text{chronological age, Brain Cognition}} - R^2_{\text{chronological age}} - R^2_{\text{chronological age, Brain Age index, Brain Cognition}}$$

      $$\text{Common Effect}_{\text{chronological age, Brain Age index, Brain Cognition}} = R^2_{\text{chronological age}} + R^2_{\text{Brain Age index}} + R^2_{\text{Brain Cognition}} - R^2_{\text{chronological age, Brain Age index}} - R^2_{\text{chronological age, Brain Cognition}} - R^2_{\text{Brain Age index, Brain Cognition}} + R^2_{\text{chronological age, Brain Age index, Brain Cognition}}, \tag{13}$$”
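
      As a minimal sketch (not the authors' code), a three-regressor commonality decomposition can be computed directly from the R² values of the seven possible regression models. The function name `commonality_three`, the regressor labels and all R² values in `toy_r2` are hypothetical and chosen only for illustration; they are not results from the study. The three-way common effect is computed so that the seven components add up to the R² of the full model, as in the standard commonality decomposition.

```python
def commonality_three(r2, a, b, c):
    """Unique and common effects for three regressors.

    `r2` maps a frozenset of regressor names to the R^2 of the regression
    model that contains exactly those regressors.
    """
    def r(*names):
        return r2[frozenset(names)]

    full = r(a, b, c)
    return {
        # Unique effect: drop in R^2 when the regressor is removed from the full model.
        f"unique {a}": full - r(b, c),
        f"unique {b}": full - r(a, c),
        f"unique {c}": full - r(a, b),
        # Two-way common effects: variance shared by a pair but not by the third regressor.
        f"common {a},{b}": r(a, c) + r(b, c) - r(c) - full,
        f"common {a},{c}": r(a, b) + r(b, c) - r(b) - full,
        f"common {b},{c}": r(a, b) + r(a, c) - r(a) - full,
        # Three-way common effect: variance shared by all three regressors.
        f"common {a},{b},{c}": (
            r(a) + r(b) + r(c) - r(a, b) - r(a, c) - r(b, c) + full
        ),
    }


# Hypothetical R^2 values for the seven models (illustration only).
toy_r2 = {
    frozenset(["age"]): 0.30,
    frozenset(["brain_age"]): 0.20,
    frozenset(["brain_cog"]): 0.35,
    frozenset(["age", "brain_age"]): 0.31,
    frozenset(["age", "brain_cog"]): 0.42,
    frozenset(["brain_age", "brain_cog"]): 0.40,
    frozenset(["age", "brain_age", "brain_cog"]): 0.43,
}

effects = commonality_three(toy_r2, "age", "brain_age", "brain_cog")
for name, value in effects.items():
    print(f"{name}: {value:.3f}")
```

      In this toy example, the unique effect of `brain_cog` is the change in R² obtained by adding it to a model that already contains `age` and `brain_age`, which is the kind of incremental comparison the reviewer asked for.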

      (2) I agree that the solution is not to exclude age as a covariate, and that there is a big difference between inevitable and obvious. I simply think a further discussion of the inevitability of the results would be clarifying for the readers. There is a big opportunity in the brain-age literature to be as direct as possible about why you are finding what you are finding. People need to know not only what you found, but why you found what you found.

      Thank you. We agree that we need to make this point more explicit and direct. In the revised manuscript, we included statements in both the Introduction and the Discussion (see below) about the tight relationship between Brain Age and chronological age by design, which makes the small unique effects of Brain Age inevitable.

      Introduction:

      “Accordingly, by design, Brain Age is tightly close to chronological age. Because chronological age usually has a strong relationship with fluid cognition, to begin with, it is unclear how much Brain Age adds to what is already captured by chronological age.”

      Discussion:

      “First, Brain Age itself did not add much more information to help us capture fluid cognition than what we had already known from a person’s chronological age. This can clearly be seen from the small unique effects of Brain Age indices in the multiple regression models having Brain Age and chronological age as the regressors. While the unique effects of some Brain Age indices from certain age-prediction models were statistically significant, they were all relatively small. Without Brain Age indices, chronological age by itself already explained around 32% of the variation in fluid cognition. Including Brain Age indices only added around 1.6% at best. We believe the small unique effects of Brain Age were inevitable because, by design, Brain Age is tightly close to chronological age. Therefore, chronological age and Brain Age captured mostly a similar variation in fluid cognition.

      Investigating the simple regression models and the commonality analysis between each Brain Age index and chronological age provided additional insights….”

      (3) I believe it is very important to critically examine the use of brain-age and related metrics. As part of this process, I think we should be asking ourselves the following questions (among others): Why go through age prediction? Wouldn't the predictions of cognition (or another variable) using the same set of brain features always be as good or better? You still have not justified the use of brain-age. As I said before, if you are going to continue to recommend the use of brain-age, you need a very strong argument for why you are recommending this. What does it truly add? Otherwise, temper your statements to indicate possible better paths forward.

      Thank you, Reviewer 3, for making an argument against the use of Brain Age. We largely agree with you. However, our work only focuses on one phenotype, fluid cognition, and on the normative situation (i.e., not having a case vs control group). As Reviewer 2 pointed out, Brain Age might still have utility in other cases not studied here. Still, future studies that focus on other phenotypes may consider using our approach as a template to test the utility of Brain Age in other situations. We added a concluding statement to reflect this.

      From Discussion:

      “Altogether, we examined the utility of Brain Age as a biomarker for fluid cognition. Here are the three conclusions. First, Brain Age failed to add substantially more information over and above chronological age. Second, a higher ability to predict chronological age did not correspond to a higher utility to capture fluid cognition. Third, Brain Age missed up to around one-third of the variation in fluid cognition that could have been explained by brain MRI. Yet, given our focus on fluid cognition, future empirical research is needed to test the utility of Brain Age on other phenotypes, especially when Brain Age is used for anomaly detection in case-control studies (e.g., Bashyam et al., 2020; Rokicki et al., 2021). We hope that future studies may consider applying our approach (i.e., using the commonality analysis that includes predicted values from a model that directly predicts the phenotype of interest) to test the utility of Brain Age as a biomarker for other phenotypes.”

      References

      Bashyam, V. M., Erus, G., Doshi, J., Habes, M., Nasrallah, I. M., Truelove-Hill, M., Srinivasan, D., Mamourian, L., Pomponio, R., Fan, Y., Launer, L. J., Masters, C. L., Maruff, P., Zhuo, C., Völzke, H., Johnson, S. C., Fripp, J., Koutsouleris, N., Satterthwaite, T. D., … on behalf of the ISTAGING Consortium, the P. A. disease C., ADNI, and CARDIA studies. (2020). MRI signatures of brain age and disease over the lifespan based on a deep brain network and 14 468 individuals worldwide. Brain, 143(7), 2312–2324. https://doi.org/10.1093/brain/awaa160

      Butler, E. R., Chen, A., Ramadan, R., Le, T. T., Ruparel, K., Moore, T. M., Satterthwaite, T. D., Zhang, F., Shou, H., Gur, R. C., Nichols, T. E., & Shinohara, R. T. (2021). Pitfalls in brain age analyses. Human Brain Mapping, 42(13), 4092–4101. https://doi.org/10.1002/hbm.25533

      Cole, J. H. (2020). Multimodality neuroimaging brain-age in UK biobank: Relationship to biomedical, lifestyle, and cognitive factors. Neurobiology of Aging, 92, 34–42. https://doi.org/10.1016/j.neurobiolaging.2020.03.014

      Dubois, J., Galdi, P., Paul, L. K., & Adolphs, R. (2018). A distributed brain network predicts general intelligence from resting-state human neuroimaging data. Philosophical Transactions of the Royal Society B: Biological Sciences, 373(1756), 20170284. https://doi.org/10.1098/rstb.2017.0284

      Hahn, T., Fisch, L., Ernsting, J., Winter, N. R., Leenings, R., Sarink, K., Emden, D., Kircher, T., Berger, K., & Dannlowski, U. (2021). From ‘loose fitting’ to high-performance, uncertainty-aware brain-age modelling. Brain, 144(3), e31–e31. https://doi.org/10.1093/brain/awaa454

      Insel, T., Cuthbert, B., Garvey, M., Heinssen, R., Pine, D. S., Quinn, K., Sanislow, C., & Wang, P. (2010). Research Domain Criteria (RDoC): Toward a New Classification Framework for Research on Mental Disorders. American Journal of Psychiatry, 167(7), 748–751. https://doi.org/10.1176/appi.ajp.2010.09091379

      Jirsaraie, R. J., Kaufmann, T., Bashyam, V., Erus, G., Luby, J. L., Westlye, L. T., Davatzikos, C., Barch, D. M., & Sotiras, A. (2023). Benchmarking the generalizability of brain age models: Challenges posed by scanner variance and prediction bias. Human Brain Mapping, 44(3), 1118–1128. https://doi.org/10.1002/hbm.26144

      Marquand, A. F., Rezek, I., Buitelaar, J., & Beckmann, C. F. (2016). Understanding Heterogeneity in Clinical Cohorts Using Normative Models: Beyond Case-Control Studies. Biological Psychiatry, 80(7), 552–561. https://doi.org/10.1016/j.biopsych.2015.12.023

      Nimon, K., Lewis, M., Kane, R., & Haynes, R. M. (2008). An R package to compute commonality coefficients in the multiple regression case: An introduction to the package and a practical example. Behavior Research Methods, 40(2), 457–466. https://doi.org/10.3758/BRM.40.2.457

      Pat, N., Wang, Y., Anney, R., Riglin, L., Thapar, A., & Stringaris, A. (2022). Longitudinally stable, brain‐based predictive models mediate the relationships between childhood cognition and socio‐demographic, psychological and genetic factors. Human Brain Mapping, hbm.26027. https://doi.org/10.1002/hbm.26027

      Poldrack, R. A., Huckins, G., & Varoquaux, G. (2020). Establishment of Best Practices for Evidence for Prediction: A Review. JAMA Psychiatry, 77(5), 534–540. https://doi.org/10.1001/jamapsychiatry.2019.3671

      Rasero, J., Sentis, A. I., Yeh, F.-C., & Verstynen, T. (2021). Integrating across neuroimaging modalities boosts prediction accuracy of cognitive ability. PLOS Computational Biology, 17(3), e1008347. https://doi.org/10.1371/journal.pcbi.1008347

      Rokicki, J., Wolfers, T., Nordhøy, W., Tesli, N., Quintana, D. S., Alnæs, D., Richard, G., de Lange, A.-M. G., Lund, M. J., Norbom, L., Agartz, I., Melle, I., Nærland, T., Selbæk, G., Persson, K., Nordvik, J. E., Schwarz, E., Andreassen, O. A., Kaufmann, T., & Westlye, L. T. (2021). Multimodal imaging improves brain age prediction and reveals distinct abnormalities in patients with psychiatric and neurological disorders. Human Brain Mapping, 42(6), 1714–1726. https://doi.org/10.1002/hbm.25323

      Sripada, C., Angstadt, M., Rutherford, S., Taxali, A., & Shedden, K. (2020). Toward a “treadmill test” for cognition: Improved prediction of general cognitive ability from the task activated brain. Human Brain Mapping, 41(12), 3186–3197. https://doi.org/10.1002/hbm.25007

      Tetereva, A., Li, J., Deng, J. D., Stringaris, A., & Pat, N. (2022). Capturing brain‐cognition relationship: Integrating task‐based fMRI across tasks markedly boosts prediction and test‐retest reliability. NeuroImage, 263, 119588. https://doi.org/10.1016/j.neuroimage.2022.119588

      Vieira, B. H., Pamplona, G. S. P., Fachinello, K., Silva, A. K., Foss, M. P., & Salmon, C. E. G. (2022). On the prediction of human intelligence from neuroimaging: A systematic review of methods and reporting. Intelligence, 93, 101654. https://doi.org/10.1016/j.intell.2022.101654

    1. eLife assessment

      Using anchored phylogenomic analyses, this valuable study sheds new light on the evolutionary history of the plant diet of Belidae weevil beetles and their geographic distribution. Using convincing methodological approaches, the authors suggest a continuous association of certain belid lineages with Araucaria hosts, since the Mesozoic era. While the biogeographical analysis has weaknesses due to uncertainties in vicariance explanations, the study overall offers contributions to understanding the evolutionary dynamics of Belidae and provides novel insights into ancient community ecology.

    2. Reviewer #1 (Public Review):

      This is a very nice study of Belidae weevils using anchored phylogenomics that presents a new backbone for the family and explores, despite a limited taxon sampling, several evolutionary aspects of the group. The phylogeny is useful to understand the relationships between major lineages in this group and preliminary estimation of ancestral traits reveals interesting patterns linked to host-plant diet and geographic range evolution. I find that the methodology is appropriate, and all analytical steps are well presented. The paper is well-written and presents interesting aspects of Belidae systematics and evolution. The major weakness of the study is the very limited taxon sampling which has deep implications for the discussion of ancestral estimations.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors used a combination of anchored hybrid enrichment and Sanger sequencing to construct a phylogenomic data set for the weevil family Belidae. Using evidence from fossils and previous studies they can estimate a phylogenetic tree with a range of dates for each node - a time tree. They use this to reconstruct the history of the belids' geographic distributions and associations with their host plants. They infer that the belids' association with conifers pre-dates the rise of the angiosperms. They offer an interpretation of belid history in terms of the breakup of Gondwanaland but acknowledge that they cannot rule out alternative interpretations that invoke dispersal.

      Strengths:

      The strength of any molecular-phylogenetic study hinges on four things: the extent of the sampling of taxa; the extent of the sampling of loci (DNA sequences) per genome; the quality of the analysis; and - most subjectively - the importance and interest of the evolutionary questions the study allows the authors to address. The first two of these, sampling of taxa and loci, impose a tradeoff: with finite resources, do you add more taxa or more loci? The authors follow a reasonable compromise here, obtaining a solid anchored-enrichment phylogenomic data set (423 genes, >97 kbp) for 33 taxa, but also doing additional analyses that included 13 additional taxa from which only Sanger sequencing data from 4 genes was available. The taxon sampling was pretty solid, including all 7 tribes and a majority of genera in the group. The analyses also seemed to be solid - exemplary, even, given the data available.

      This leaves the subjective question of how interesting the results are. The very scale of the task that faces systematists in general, and beetle systematists in particular, presents a daunting challenge to the reader's attention: there are so many taxa, and even a sophisticated reader may never have heard of any of them. Thus it's often the case that such studies are ignored by virtually everyone outside a tiny cadre of fellow specialists. The authors of the present study make an unusually strong case for the broader interest and importance of their investigation and its focal taxon, the belid weevils.

      The belids are of special interest because - in a world churning with change and upheaval, geologically and evolutionarily - relatively little seems to have been going on with them, at least with some of them, for the last hundred million years or so. The authors make a good case that the Araucaria-feeding belid lineages found in present-day Australasia and South America have been feeding on Araucaria continuously since the days when it was a dominant tree taxon nearly worldwide before it was largely replaced by angiosperms. Thus these lineages plausibly offer a modern glimpse of an ancient ecological community.

      Weaknesses:

      I didn't find the biogeographical analysis particularly compelling. The promise of vicariance biogeography for understanding Gondwanan taxa seems to have peaked about 3 or 4 decades ago, and since then almost every classic case has been falsified by improved phylogenetic and fossil evidence. I was hopeful, early in my reading of this article, that it would be a counterexample, showing that yes, vicariance really does explain the history of *something*. But the authors don't make a particularly strong claim for their preferred minimum-dispersal scenario; also they don't deal with the fact that the range of Araucaria was vastly greater in the past and included places like North America. Were there belids in what is now Arizona's petrified forest? It seems likely. Ignoring all of that is methodologically reasonable but doesn't yield anything particularly persuasive.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This paper provides useful information about how the ionome of Arabidopsis thaliana adapts to very high CO2 levels, backed up by solid evidence and carefully designed studies. However, the broader claims of the paper about climate change and food security - heavily emphasized in the abstract, introduction, and discussion - are inappropriate, as there is no direct link to the presented work.

      We sincerely thank you for the work you have done in reviewing our manuscript. We very much appreciate your overall positive assessment of the experimental work as a whole, its value and robustness.

      In this revised version, we took on board the majority of your suggestions and your comments. In particular, we understood your critical point about overstating our objectives, which might in turn seem uncorrelated with our results. We fully agree with the comments that have been made on this point. Consequently, we have made substantial modifications and corrections in order to clarify our objectives and their implications: exploring in depth the natural variation of the shoot ionome response to elevated CO2, and generating a valuable resource allowing a better understanding of the genetic and molecular mechanisms involved in the regulation of plant mineral nutrition by the elevation of atmospheric CO2.

      We also made modifications in response to the other suggestions, including a clarification of the functional experiments carried out around the function of TIP2;2 in response to elevated CO2. Figure 7 now comprises the comparison between both ambient and elevated CO2 conditions, which is much more informative than what appeared in the previous version.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The study's abstract, introduction, and conclusions are not supported by the methods and results conducted. In fact, the results presented suggest that Arabidopsis could easily adapt to an extremely high CO2 environment.

      We understand the reviewer’s comment. Although our work is considered useful, robust and well designed, we agree with the reviewer's point. We have certainly overemphasized the significance of our work to address the issue of food security in response to rising atmospheric CO2, at the expense of the factual description of the results of our fundamental study of the mechanisms at the interface between CO2 and mineral nutrition. We have clarified this focus by modifying the text of the introduction, objectives and discussion. We hope that these modifications will enable readers to better appreciate the core of this work.

      Regarding the last part of the comment, our results do suggest that genetic variation could allow adaptation to rising atmospheric CO2, and our study does indeed aim to identify the extent and basis of this genetic variation.

      This study offers good evidence pointing to a genetic basis for Arabidopsis thaliana's response to elevated CO2 (eCO2) levels and its subsequent impact on the leaf ionome. The natural variation analyses in the study support the hypothesis that genetic factors, rather than local adaptation, guide the influence of eCO2 on the ionome of rosette leaves in Arabidopsis. However, the manuscript's claim regarding its role in "the development of biofortified crops adapted to a high-CO2 world" (line 23) is overstated, especially given the absence of any analysis on the influence of eCO2 on the seed ionome and given that Arabidopsis is a poor model for harvest index for any crop. The manuscript, in its current form, necessitates massive revisions, particularly in clarifying its broader implications and in providing more substantial evidence for some of its assertions.

      We thank the reviewer for this comment and for the positive assessment of our identification of a genetic basis for Arabidopsis thaliana's response to elevated CO2 and its subsequent impact on the leaf ionome. Nevertheless, it is true that the study of the leaf ionome is far from being able to lead to the development of biofortified plants. Some papers have described the nutrient harvest index in Arabidopsis as a potential indicator of nutrient use efficiency (for instance, Masclaux-Daubresse and Chardon, Journal of Experimental Botany 2011 or Aranjuelo et al., Journal of Experimental Botany 2013). However, as we did not include any seed ionome data in the paper, we added clear mentions that our analyses were made on leaves (lines 56/57/250/319) and a comment in the discussion section to address this limitation (lines 325-328).

      Major Drawbacks and Questions:

      (1) Evidence for the Central Premise:

      The foundational premise of the study is the assertion that rising atmospheric CO2 levels result in a decline in plant mineral content. This phenomenon is primarily observed in C3 plants, with C4 plants seemingly less affected. The evidence provided on this topic is scant and, in some instances, contradicts the authors' own references. The potential reduction of certain minerals, especially in grains, can be debated. For instance, reduced nitrogen (N) and phosphorus (P) content in grains might not necessarily be detrimental for human and animal consumption. In fact, it could potentially mitigate issues like nitrogen emissions and phosphorus leaching. Labeling this as a "major threat to food security" (line 30) is exaggerated. While the case for microelements might be more compelling, the introduction fails to articulate this adequately. Furthermore, the introduction lacks any discussion on how eCO2 might influence nutrient allocation to grains, which would be crucial in substantiating the claim that eCO2 poses a threat to food security. A more comprehensive introduction that clearly delineates the adverse effects of eCO2 and its implications for food security would greatly enhance the manuscript.

      We partially agree with this comment. The decline in mineral status of C3 plants under conditions of elevated atmospheric CO2 has been widely described in the literature, and specifically documented for the cereal grains. While there are variations in this effect (depending on species, ecotype, cultivar), there is no debate about its acceptance. Here are just a few of the many works describing this effect, both on a global scale and at the level of the individual plant (Cotrufo MF (1998) Elevated CO2 reduces the nitrogen concentration of plant tissues. Global Change Biology 4: 43-54; Loladze I (2014) Hidden shift of the ionome of plants exposed to elevated CO(2) depletes minerals at the base of human nutrition. eLife 3: e02245; Myers SS (2014) Increasing CO2 threatens human nutrition. Nature 510: 139-142; Poorter H (1997) The effect of elevated CO2 on the chemical composition and construction costs of leaves of 27 C3 species. Plant, Cell & Environment 20: 472-482; Soares JC (2019) Preserving the nutritional quality of crop plants under a changing climate: importance and strategies. Plant and Soil 443: 1-26; Stitt M (1999) The interaction between elevated carbon dioxide and nitrogen nutrition: the physiological and molecular background. Plant, Cell & Environment 22: 583-621; Uddling J (2018) Crop quality under rising atmospheric CO2. Curr Opin Plant Biol 45: 262-267).

      In addition to this, the threat to food security posed by this alteration in plant mineral status has also been well described in the literature by several modeling approaches (Beach RH (2019) Combining the effects of increased atmospheric carbon dioxide on protein, iron, and zinc availability and projected climate change on global diets: a modelling study. Lancet Planet Health 3: e307-e317; Ebi KL (2019) Elevated atmospheric CO(2) concentrations and climate change will affect our food's quality and quantity. Lancet Planet Health 3: e283-e284; Medek DE (2017) Estimated Effects of Future Atmospheric CO2 Concentrations on Protein Intake and the Risk of Protein Deficiency by Country and Region. Environ Health Perspect 125: 087002; Smith MR (2018) Impact of anthropogenic CO2 emissions on global human nutrition. Nature Climate Change 8: 834-839; Weyant C (2018) Anticipated burden and mitigation of carbon-dioxide-induced nutritional deficiencies and related diseases: A simulation modeling study. PLoS Med 15: e1002586; Zhu C (2018) Carbon dioxide (CO2) levels this century will alter the protein, micronutrients, and vitamin content of rice grains with potential health consequences for the poorest rice-dependent countries. Sci Adv 4: eaaq1012). To reinforce this point, we have added a sentence and references (lines 30-33). Nevertheless, we understand the reviewer's comment on the nuance to be given to the intensity of this potential threat. We have therefore modified the text, replacing "major threat" by "significant threat" (lines 3 and 29).

      We would also like to answer the reviewer’s comment on the potential environmental benefit associated with reduced N and P content in grains (mitigation of N emissions and P leaching). If this reduced N and P content results from a lowered use efficiency of soil nutrients by plants, as suggested by several studies (Bloom 2010, Cassan 2023, Gojon 2023 and references therein), it may on the contrary favor N oxide emissions and P leaching from the soil.

      (2) Exaggerated Concerns:

      The paper begins with the concern that carbon fertilization will lead to carbon dilution in our foods. While we indeed face numerous genuine threats in the coming decades, this particular issue is manageable. The increase in CO2 alone offers many opportunities for boosting yield. However, the heightened heat and increased evapotranspiration will pose massive challenges in many environments.

      While there are indeed multiple threats that we are facing in the coming decades, we don't fully agree with this comment. At present, there's no evidence to say that the negative effect of CO2 on plant mineral content will be manageable. Furthermore, there is compelling evidence that altered mineral nutrition and mineral status of plants will be an important factor limiting the high CO2-induced increase in yield, as will be heat or increased evapotranspiration (see for instance Coskun et al (2016) Nutrient constraints on terrestrial carbon fixation: The role of Nitrogen. J. Plant Physiol. 203: 95-109; Jiang M (2020) Low phosphorus supply constrains plant responses to elevated CO2 : A meta-analysis. Glob Chang Biol 26: 5856-5873 ; Reich PB (2006) Nitrogen limitation constrains sustainability of ecosystem response to CO2. Nature 440: 922-925). Thus, although we do not negate the crucial importance of heat and water stress, we believe it is relevant to study the basic mechanisms responsible for the negative effect of CO2 on plant mineral composition.

      Figure 4 in fact suggests that 43% of the REGMAP panel (cluster 3) is already pre-adapted to very high CO2 levels. This suggests annual species could adapt very rapidly.

      We agree with the reviewer. However, this suggests that genetic variation exists in some ecotypes to support adaptation to elevated CO2. The purpose of this work is precisely to identify this genetic variation, in order to characterize the mechanisms behind it.

      (3) Assumptions on CO2 Levels:

      The assumption of 900ppm seems to be based on a very extreme climate change scenario. Most people believe we will overshoot the 1.5°C scenario, however, it seems plausible that 2.5 to 3°C scenarios are more likely. This would correspond to around 500ppm of CO2. https://www.nature.com/articles/s41597-022-01196-7/tables/4

      We agree with the reviewer that the CO2 concentration we used corresponds to a high value in the IPCC projections. That said, this value is currently considered very plausible: the following figure (from Smith and Myers (2018) Nature Climate Change) shows that current CO2 emissions align with the IPCC's most extreme model (RCP 8.5), which would result in a CO2 concentration of around 900 ppm in 2100. Furthermore, nothing in the 6th IPCC report allows us to exclude the 4°C scenario.

      Author response image 1.

      (4) Focus on Real Challenges:

      We have numerous real challenges, such as extreme heat and inconsistent rainfall, to address in the context of climate change. However, testing under extreme CO2 conditions and then asserting that carbon dilution will negatively impact nutrition is exaggerated.

      While we fully agree that several threats linked to climate change exist, and all deserve to be studied, we find it questionable to consider that the potential effect of high CO2 on the mineral nutrition of plants is not a real challenge. The mineral nutrition of plants is already a current major environmental challenge. This perspective seems to reflect the reviewer's personal opinion rather than an analysis of our work.

      In contrast, the FACE experiments are fundamental and are conducted at more realistic eCO2 levels. Understanding the interaction between a 20% increase in CO2 and new precipitation patterns is key for global carbon flux prediction.

      Again, we do not fully understand this comment, as the aim of our study was not to predict global carbon fluxes, but to unravel the genes and mechanisms underlying the negative effect of elevated CO2 on the nutrient content of Arabidopsis rosettes. However, we agree with the reviewer that FACE facilities are useful for exploring the CO2 response in more natural environments, and we highlight the fact that the decrease in mineral status of C3 plants has been widely documented in FACE studies. FACE experiments do not, however, allow fully controlled experiments (temperature, rainfall, wind and light intensities are not controllable in FACE) of the kind needed to disentangle the mechanisms by which elevated CO2 regulates the signaling pathways associated with plant mineral composition. In the longer term, studying the mechanisms we have identified in a more global context of climate change could be highly relevant.

      As I look at the literature on commercial greenhouse tomato production, 1000ppm of eCO2 is common, but it also looks like the breeders and growers have already solved for flavor and nutrition under these conditions.

      Indeed, tomato is often cultivated in CO2-enriched greenhouses at 1000 ppm. According to the literature, this results in a 20-25% reduction in vitamin C or lycopene, and requires a significantly higher nitrogen and water intake to reach expected sugar levels (Doddrell H (2023) Horticulture Research). In addition, the negative effect of elevated CO2 on tomato nutrient content seems to have significant repercussions on nutrition-health properties (Boufeldja (2023), Molecules).

      Conclusion:

      While the study provides valuable insights into the genetic underpinnings of Arabidopsis thaliana's response to elevated CO2 levels, it requires an entirely revised writeup, especially in its abstract, broader claims and implications. The manuscript would benefit from a more thorough introduction, a clearer definition of its scope, and a clear focus on the limits of this study.

      We thank the reviewer for the comments made on our manuscript. In addition to the responses that we provide to these comments, we have modified the main text of the introduction, objectives and discussion to take these comments into consideration. We believe that this will significantly improve the manuscript.

      Reviewer #2 (Public Review):

      Strengths:

      The authors have conducted a large, well-designed experiment to test the response to eCO2. Overall, the experimental design is sound and appropriate for the questions about how a change in CO2 affects the ionome of Arabidopsis. Most of the conclusions in this area are well supported by the data that the authors present.

      We thank the reviewer for this positive appreciation.

      Weakness:

      While the authors have done good experiments, it is a big stretch from Arabidopsis grown in an arbitrary concentration of CO2 to relevance to human and animal nutrition in future climates. Arabidopsis is a great model plant, but its leaves are not generally eaten by humans or animals.

      We agree with the reviewer’s comment. We recognize that implying a direct contribution of our work to human nutrition in future climates is an overstatement, as Reviewer 1 also pointed out. This was not an intentional overstatement, as we have always been convinced that our work contributes to the understanding of the basic mechanisms involved in the negative regulation of plant mineral nutrition by high CO2. We have significantly modified the text to correct any misunderstanding of our work’s implications.

      The authors don't justify their choice of a CO2 concentration. Given the importance of the parameter for the experiment, the rationale for selecting 900 ppm as elevated CO2 compared to any other concentration should be addressed. And CO2 is just one of the variables that plants will have to contend with in future climates, other variables will also affect elemental concentrations.

      We agree with this comment. We added a justification of the high CO2 concentration used in this work in the Material and Methods section (lines 343-344). This choice is also explained in our response to Reviewer 1’s point 3.

      Given these concerns, I think the emphasis on biofortification for future climates is unwarranted for this study.

      Again, we agree with this comment and we have significantly modified the text to correct any misunderstanding of our work’s implications.

      Additionally, I have trouble with these conclusions:

      -Abstract "Finally, we demonstrate that manipulating the function of one of these genes can mitigate the negative effect of elevated CO2 on the plant mineral composition."

      -Discussion "Consistent with these results, we show that manipulating TIP2;2 expressions with a knock-out mutant can modulate the Zn loss observed under high CO2."

      The authors have not included the data to support this conclusion as stated. They have shown that this mutant increases the Zn content of the leaves when compared to WT but have not demonstrated that this response is different than in ambient CO2. This is an important distinction: one way to ameliorate the reduction of nutrients due to eCO2 is to try to identify genes that are involved in the mechanism of eCO2-induced reduction. Another way is to increase the concentration of nutrients so that the eCO2-induced reduction is not as important (i.e. a 10% reduction in Zn due to eCO2 is not as important if you have increased the baseline Zn concentration by 20%). The authors identified tip2 as a target from the GWAS on difference, but their validation experiment only looks at eCO2.
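      The distinction drawn above can be made concrete with a small worked example. This is only an arithmetic sketch using the reviewer's illustrative figures (a 10% eCO2-induced loss and a 20% higher baseline); the Zn values, units, and function name are hypothetical and are not data from the study.

```python
# Hypothetical numbers only: arbitrary Zn units, not measurements from the paper.

def zn_under_eco2(baseline, relative_loss):
    """Zn content under elevated CO2, given an ambient baseline and a relative loss."""
    return baseline * (1.0 - relative_loss)

wt_ambient = 100.0                                 # wild type, ambient CO2
wt_elevated = zn_under_eco2(wt_ambient, 0.10)      # 10% loss under eCO2

line_ambient = wt_ambient * 1.20                   # line with a 20% higher baseline
line_elevated = zn_under_eco2(line_ambient, 0.10)  # same 10% loss under eCO2

# Relative response to eCO2 is identical in both lines ...
wt_response = (wt_elevated - wt_ambient) / wt_ambient
line_response = (line_elevated - line_ambient) / line_ambient

# ... yet the high-baseline line still ends up above the ambient wild type.
print(wt_elevated, line_elevated, wt_response, line_response)
```

      Comparing the two lines under eCO2 alone would show a difference in Zn content even though their response to eCO2 is the same, which is why measurements under both ambient and elevated CO2 are needed to separate the two scenarios.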

      We thank the reviewer for this comment, and we agree with it. It is much more interesting, especially in the context of this paper, to analyze the function of a candidate gene not only in elevated CO2, but in both ambient and elevated CO2. Therefore, we added to Figure 7 data for the expression of TIP2;2 in contrasting haplotypes under ambient CO2, alongside those already presented under elevated CO2 (now Fig. 7C and 7D). This showed that TIP2;2 expression is also lower in haplotype 0 under ambient CO2. We also added to Figure 7 (Fig. 7E) the Zn level in WT and the tip2;2-1 mutant under ambient CO2, alongside those already presented under elevated CO2. This showed that the tip2;2-1 mutant line did not present any decrease in Zn shoot content in response to elevated CO2, in contrast to what is observed for the WT.

      We have added comments on these new results in the Results section (lines 233-242) and in the Discussion section (lines 310-314).

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Reviewer Comments on the Article's Approach to Ionome Analysis

      (1) Omission of Phosphorus from the Ionome:

      It's surprising that phosphorus (P) was not measured in the ionome. After nitrogen (N), P is often the most limiting mineral for plant development and yield, making it a significant component of the ionome. Why did the authors omit this crucial element?

      We agree with the reviewer that P is an important mineral for plant growth. The absence of data related to P content is due to feasibility constraints rather than oversight. The MP-AES instrument we used to analyze the ionome (except N and C, which we obtained from an Elementar Analyzer) would have required an extra step and an extra analysis to obtain data for macronutrients such as P or K. In the context of this large-scale experiment, we had to compromise and proceed without these data.

      (2) Relationship Between Leaf Ionome and Seed:

      The manuscript lacks evidence demonstrating the relationship between the leaf ionome and the seed. This connection is vital to establish the study's aims as outlined in lines 20-24. If the central argument is that eCO2 threatens food security, it's essential for the authors to either:

      • Provide evidence that eCO2 induces changes in the ionome profiles of seeds.

      • Show that changes in the rosette leaf ionome lead to alterations in seed ionome profiles.

      We agree with the reviewer. Although we know that the seed ionome composition of Arabidopsis model accessions such as Columbia is indeed negatively affected by eCO2, we do not provide data that support some of the terms used in lines 20-24. The correspondence between leaf and seed ionome in natural populations under eCO2 is certainly a question that we will address next. Therefore, to align our stated objectives with our data, we have modified the sentence in lines 20-24. We also added a comment on this point in the discussion section (lines 324-328).

      (3) Analysis of Ionome in Rosette Leaves:

      Why did the authors choose to analyze the ionome specifically in rosette leaves? Is there a known correlation between the ionome profile in rosette leaves and seeds?

      See our answer to the above comment.

      (4) Experimental Design Comments:

      • The layout of the accession growouts, the methods of randomization, blocking, and controls/checks should be detailed.

      • Were BLUEs (Best Linear Unbiased Estimators) or BLUPs (Best Linear Unbiased Predictors) employed to account for experimental design conditions? If not, it's recommended that they be used.

      We thank the reviewer for this comment. A note on replicates has been added in the Method/Plant Material section. Concerning the BLUEs/BLUPs, although I am not familiar with their use, I do not think that these approaches are relevant in our experimental design. Indeed, we pooled 3 to 5 replicates for each accession to measure the ionome (as mentioned in the Method/Ionome analysis section – we realized this was perhaps not clear enough, and thus we reinforced this point in this section). Therefore, we do not have the variance data required to perform BLUEs/BLUPs.

      (5) Carbon Dilution Effect:

      The statement, "The first component of the PCA described a clear antagonistic trend between C content and the change of other mineral elements (Fig. 3B)..." suggests a well-understood carbon dilution effect. These results are anticipated and align with existing knowledge.

      We thank the reviewer for this comment. However, this sentence does not relate to the biomass dilution hypothesis referred to by the reviewer. Indeed, the composition of each mineral (C and others) is expressed as a percentage of biomass, not as an absolute value. Therefore, this reflects more a probable effect of the increase in carbon compounds (notably soluble sugars), which could influence mineral composition.

      (6) Heritability Estimates:

      The authors should report both the broad-sense heritability and an estimate of heritability based on a GRM or Kinship matrix.

      We thank the reviewer for this suggestion. We are skeptical of using a kinship matrix to estimate heritability in our study. Estimating narrow-sense heritability using a kinship matrix is conceptually based on the infinitesimal model of Fisher, thereby meaning that phenotypic variation is driven by hundreds to thousands of QTLs with small effects. If this is the case, GWAS conducted on several hundred (or even thousands) of genotypes will not be powerful enough to detect such QTLs. Accordingly, estimates of broad-sense heritability based on estimates of variance components can drastically differ from estimates of narrow-sense heritability based on the use of a kinship matrix, as illustrated in the study of Bergelson et al. (2019 Scientific Reports).

      (7) Application of the Breeder's Equation:

      It would be beneficial if the authors applied the breeder's equation to estimate the species' potential rate of response. Based on the allele frequency of the adapted cluster 3 (69 ecotypes or 43% frequency of Figure 3B), it seems plausible that the populations could adapt within 23 generations.

      We thank the reviewer for this suggestion. Indeed, it would be really interesting to test whether sub-populations could adapt in comparison with others, and over what period of time. It is nevertheless not possible to do so using the Breeder’s equation in our case, as this requires fitness data under conditions of ambient or elevated CO2 (i.e. production of seeds) to be applied, and we do not have these data at the level of the whole population.

      (8) Overall Quality:

      In general, the authors have executed a high-quality ionome mapping experiment. However, the abstract, introduction, and discussion should be entirely rewritten and reframed.

      We thank the reviewer for the positive evaluation of our experiment. As previously mentioned, we are for the most part in agreement with the comments made about the need to align our stated objectives with our experimental data and conclusions. To do so, we have rewritten part of the abstract, introduction and discussion. The details of these modifications are described in the responses made to each comment.

      Here's a line-by-line list of suggestions on writing:

      Line 30 would read better with a comma after thus (or by replacing thus with therefore and then a comma at the start of the sentence).

      Line 33 nevertheless would read better in between commas.

      Lines 45 - 48 sentence is too long, could probably divide it into two.

      Lines 90 - 94 are hard to interpret, recommend rephrasing for clarity.

      Line 130 - keep verbs in the past tense for consistency (ran instead of run).

      Line 194 - what do the authors mean by crossed? I'm inferring they looked at the intersection of DEGs with the list of genes identified by GWA mapping, probably should use a more concise word.

      There's a concurrent use of the adjective strong (Lines 80, 142, 144, 197, 245). I would advise using a more concise adjective or avoiding its use to let the reader form their own opinion on the data.

      Lines 174-176 the cited reference (No. 15) is incorrect. The study by Katz et al. (2022) does not provide information on the role of ZIF1 in zinc sequestration mechanisms under elevated CO2 conditions.

      We thank the reviewer for these detailed recommendations. We have corrected or rephrased the text according to these suggestions.

      Reviewer #2 (Recommendations For The Authors):

      Technical points:

      900 ppm as elevated CO2: Given the importance of the parameter for the experiment, the rationale for selection 900 ppm as elevated CO2 compared to any other concentration should be addressed.

      We acknowledge the reviewer's point and have previously addressed related aspects earlier in our response. In line with this, we have included a justification for this particular parameter in the Method section.

      The authors do not mention what genotype was used for their root/shoot RNAseq experiment.

      We thank the reviewer for this comment, and indeed, this information was not mentioned. This is now done, in the Method section.

      Line 125: Spelling error "REGMPA".

      This has been corrected.

      Line 338: Removal of outlier observations - "Prior to GWAS and multivariate analyses such as PCA or clustering, mineral composition measures were pre-processed to remove technical outliers". The authors should mention the exact number of outliers that were removed and what the explicit criteria were for removal.

      The number of outliers removed from each dataset is now indicated in Supplemental Table 7 (this is cited in the Method section). The explicit criterion used for this analysis is stated in the corresponding Method section: “the values positioned more than 5 median absolute deviations away from the median were removed from the dataset”.
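      The quoted criterion can be sketched as follows. This is a minimal illustration, not the authors' code: the function name and example values are hypothetical, and it assumes the raw (unscaled) median absolute deviation, since the response does not say whether a scaling constant was applied.

```python
from statistics import median

def remove_mad_outliers(values, n_mads=5.0):
    """Keep values lying within n_mads median absolute deviations of the median.

    Assumption: the MAD is used unscaled (no 1.4826 normal-consistency factor).
    """
    values = list(values)
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # Degenerate case (more than half the values identical): nothing is removed.
        return values
    return [v for v in values if abs(v - med) <= n_mads * mad]

# Hypothetical mineral measurements with one technical outlier.
measurements = [10, 11, 9, 10, 12, 50]
cleaned = remove_mad_outliers(measurements)
print(cleaned)
```

      Because both the centre and the spread are medians, a single extreme value barely shifts the threshold, which is the usual reason for preferring this rule over one based on the mean and standard deviation.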

      Line 379: "Lowly expressed genes with an average value across conditions under 25 reads were excluded from the analysis". Providing information about the number of the lowly expressed genes that were removed from the analysis can help with the interpretation of the likelihood of the candidates selected being correct.

      This is a standard procedure in RNAseq analysis. It avoids many false positives in the differential analysis of gene expression based on ratios (where a very small number in the denominator can lead to a very high variation in expression, of no real significance). For information, this step led to the removal of 11607 and 10121 genes for the shoot and root datasets.
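      The reasoning behind this filter can be illustrated with a short sketch. The gene names and read counts below are hypothetical, and the helper function is an assumption for illustration only; it is not the pipeline used in the study.

```python
def filter_low_expression(counts, min_mean_reads=25):
    """Drop genes whose average read count across conditions is under the threshold."""
    return {
        gene: reads
        for gene, reads in counts.items()
        if sum(reads) / len(reads) >= min_mean_reads
    }

# Hypothetical read counts per gene: (ambient CO2, elevated CO2).
counts = {
    "geneA": (1, 4),      # 3 extra reads give a 4-fold "change" that is mostly noise
    "geneB": (400, 520),  # 120 extra reads give a 1.3-fold change
}

fold_changes = {gene: elevated / ambient for gene, (ambient, elevated) in counts.items()}
kept = filter_low_expression(counts)
print(fold_changes, sorted(kept))
```

      The lowly expressed gene shows the largest ratio despite the smallest absolute difference, which is the kind of false positive the 25-read threshold is meant to remove.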

      Line 384: It's not clear how many biological replicates were used.

      This has been corrected.

      Additional comment: We have also become aware of a confusion concerning one of the candidate genes located close to GWA peaks: line 180 of the first version, we mentioned CAX1 (AT1G16380) for its role on nutrient deficiency response. There are actually two genes annotated as CAX1 in TAIR (both are cation exchangers), but the one involved in nutrient deficiency response is AT2G38170. We therefore removed the sentence mentioning AT1G16380/CAX1 as a potential candidate gene.

    2. eLife assessment

      This paper provides useful information about how the ionome of Arabidopsis thaliana adapts to very high CO2-levels, backed up by solid evidence and carefully designed studies. The work will be of interest to anyone studying natural genetic variation as well as the response of plants to altered CO2 levels in the atmosphere.

    3. Reviewer #1 (Public Review):

      This study offers good evidence pointing to a genetic basis for Arabidopsis thaliana's response to elevated CO2 (eCO2) levels and its subsequent impact on the leaf ionome. The natural variation analyses in the study support the hypothesis that genetic factors, rather than local adaptation, guide the influence of eCO2 on the ionome of rosette leaves in Arabidopsis.

      Comments on current version:

      I appreciate the revisions and the effort the authors have made.

      Most of the abstract now accurately reflects the results and methods. It would be nice to have a few more technical details in the abstract, such as:

      • What was the CO2 level?

      • Which gene was identified?

      I still have a problem with this sentence:

      "The elevation of atmospheric CO2 leads to a decline in plant mineral content, which might pose a significant threat to food security in the coming decades."

      The authors provide a wide range of published studies that support this statement. I fully agree that this is what the literature suggests. However, I think the literature has asked the wrong question.

      In general, these studies addressed the question: Given no time for adaptation, do plants grown under high CO2 have a different mineral composition? The answer is yes.

      But a more important question is: Can plants and food crops adapt in time? I believe the strength of this study is that it tests this, and it suggests that the answer is yes. I also think there is a lot of unpublished results and greenhouse breeding success that supports the contention that most plants can adapt to the CO2.

      "The artificial elevation of atmospheric CO2 leads to a physiological response and decline in plant mineral content, which might pose a significant threat to food security in the coming decades if plants cannot adapt."

      It needs to be made clear throughout the paper when high CO2 levels lead to low mineral composition. These are all artificial manipulations without allowing the plants to adapt to the new environment.

      "The elevation of atmospheric CO2 concentration leads to a decline in the mineral composition of C3 plants (Gojon et al., 2023)." - this is well supported in artificial environments.

      Do wild plants have fewer minerals in their leaves today compared to plants in 1950? This would be great evidence and framing for this experiment.

      Crop plants having lower nitrogen and different mineral compositions over time is substantially a product of breeders initially increasing inputs and then, over the last decade, selecting for higher input efficiency.

      At the end of the introduction or the beginning of the results, please define why the CO2 level was chosen and its context as being at the high end of current predictions.

      "According to the literature, this results in a 20-25% reduction in vitamin C or lycopene and requires a significantly higher nitrogen and water intake to reach expected sugar levels (Doddrell H (2023), Horticulture Research). In addition, the negative effect of elevated CO2 on tomato nutrient content seems to have significant repercussions on nutrition-health properties (Boufeldja (2023), Molecules)."

      Thank you for sharing these reviews. These suggest to me that breeders favored the 80% yield bump over other traits. Either there was no breeding, or the breeding focused on other traits. It is important to mention that breeders should include mineral nutrition in their selection index while they maximize yield. Simpler breeding strategies can sometimes heavily favor one trait over others, but cattle breeders today regularly use selection indices that incorporate weights for two dozen traits.

      This study provides nice evidence that an annual weed species is likely to be able to adapt easily to high eCO2. Whether perennial species will be able to adapt in time is clearly a topic that needs to be investigated.

    4. Reviewer #2 (Public Review):

      The research uses a large collection of Arabidopsis thaliana accessions from various geographic scales to investigate the natural genetic variation underlying the response of ionome (elemental) composition to elevated CO2 (eCO2), a concern for future food security. While most accessions show a decrease in elemental accumulation, the authors demonstrate a wide variety of responses to eCO2 across the diversity of Arabidopsis, including lines that increase elemental content in eCO2. The demonstration of genetic diversity in eCO2 response is a significant contribution to our understanding of this important phenomenon.

      Comments on revised version:

      The authors made significant improvements in the manuscript from the original preprint, and the conclusions are now well supported by the evidence presented.

    1. Author response:

      The following is the authors’ response to the original reviews.

      We thank the reviewers for their constructive comments and suggestions. We have prepared a revised manuscript with updated quantification of theta cycle skipping, new statistical comparisons of the difference between the two behavioral tasks, and general improvements to the text and figures.

      Reviewer #1 (Public Review):

      Summary

      The authors provide very compelling evidence that the lateral septum (LS) engages in theta cycle skipping.

      Strengths

      The data and analysis are highly compelling regarding the existence of cycle skipping.

      Weaknesses

      The manuscript falls short in describing the behavioral or physiological importance of the witnessed theta cycle skipping, and there is a lack of attention to detail with some of the findings and figures:

      More/any description is needed in the article text to explain the switching task and the behavioral paradigm generally. This should be moved from only being in methods as it is essential for understanding the study.

      Following this suggestion, we have expanded the description of the behavioral tasks in the Results section.

      An explanation is needed as to how a cell can be theta skipping if it is not theta rhythmic.

      A cell that is purely theta skipping (i.e., always fires on alternating theta cycles and never on adjacent theta cycles) will only have enhanced power at half theta frequency and not at theta frequency. Such a cell will therefore not be considered theta rhythmic in our analysis. Note, however, that there is a large overlap between theta rhythmic and theta skipping cell populations in our data (Figure 3 - figure supplement 2), indicating that most cells are not purely theta skipping.
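      The point can be illustrated with a toy example. This is a sketch under stated assumptions, not the analysis used in the manuscript: it assumes an 8 Hz theta rhythm (125 ms period) and exactly one spike per active cycle, uses hypothetical function names, and counts spike-pair lags directly instead of computing spectral power.

```python
from collections import Counter

THETA_PERIOD_MS = 125  # assumed 8 Hz theta rhythm

def spike_times(active_cycles):
    """One spike at the start of each active theta cycle (idealized)."""
    return [cycle * THETA_PERIOD_MS for cycle in active_cycles]

def lag_counts(times):
    """Count positive time lags between all spike pairs (a raw autocorrelogram)."""
    counts = Counter()
    for i, t1 in enumerate(times):
        for t2 in times[i + 1:]:
            counts[t2 - t1] += 1
    return counts

rhythmic = lag_counts(spike_times(range(0, 20)))     # fires on every cycle
skipping = lag_counts(spike_times(range(0, 20, 2)))  # fires on alternating cycles only

print(rhythmic[125], rhythmic[250])  # spike pairs one and two theta periods apart
print(skipping[125], skipping[250])  # no pairs at one period, only at two
```

      In this idealized case the purely skipping cell has no spike pairs separated by a single theta period, so its autocorrelogram repeats at the double period only. Real cells mix both patterns, consistent with the overlap between the two populations noted above.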

      The most interesting result, in my opinion, is the last paragraph of the entire results section, where there is more switching in the alternation task, but the reader is kind of left hanging as to how this relates to other findings. How does this relate to differences in decoding of relative arms (the correct or incorrect arm) during those theta cycles or to the animal's actual choice? Similarly, how does it relate to the animal's actual choice? Is this phenomenon actually behaviorally or physiologically meaningful at all? Does it contribute at all to any sort of planning or decision-making?

      We agree that the difference between the two behavioral tasks is very interesting. It may provide clues about the mechanisms that control the cycle-by-cycle expression of possible future paths and the potential impact of goal-directed planning and (recent) experience. In the revised manuscript, we have expanded the analysis of the differences in theta-cycle dynamics between the two behavioral tasks. First, we confirm the difference through a new quantification and statistical comparison. Second, we performed additional analyses to explore the idea that the alternation of non-local representations reflects the number of relevant paths available to the animal (Figure 11 – figure supplements 2 and 3), but this did not appear to be the case. However, these results provide a starting point for future studies to clarify the task dependence of the theta-cycle dynamics of spatial representations and to address the important question of behavioral/physiological relevance.

      The authors state that there is more cycle skipping in the alternation task than in the switching task, and that this switching occurs in the lead-up to the choice point. Then they say there is a higher peak at ~125 in the alternation task, which is consistent. However, in the final sentence, the authors note that "This result indicates that the representations of the goal arms alternate more strongly ahead of the choice point when animals performed a task in which either goal arm potentially leads to reward." Doesn't either arm potentially lead to a reward (but different amounts) in the switching task, not the alternation task? Yet switching is stronger in the alternation task, which is not consistent with and contradicts this last sentence.

      The reviewer is correct that both choices lead to (different amounts of) reward in the switching task. As written, the sentence that the reviewer refers to is indeed not accurate and we have rephrased it to: “This result indicates that the representations of the goal arms alternate more strongly ahead of the choice point when animals performed a task in which either goal arm potentially leads to a desirable high-value reward.”.

      Additionally, regarding the same sentence - "representations of the goal arms alternate more strongly ahead of the choice point when the animals performed a task in which either goal arm potentially leads to reward." - is this actually what is going on? Is there any reason at all to think this has anything to do with reward versus just a navigational choice?

      We appreciate the reviewer’s feedback and acknowledge that our statement needs clarification. At the choice point in the Y-maze there are two physical future paths available to the animal (disregarding the path that the animal took to reach the choice point) – we assume this is what the reviewer refers to as “a navigational choice”. One hypothesis could be that alternation of goal arm representations is present whenever there are multiple future paths available, irrespective of the animal’s (learned) preference to visit one or the other goal arm. However, the reduced alternation of goal arm representations that we report in the switching task suggests that the animal’s recent history of goal arm visits and reward expectations likely do influence the theta-cycle representations ahead of the choice point. We have expanded our analysis to test if theta cycle dynamics differ for trials before and after a switch in reward contingency in the switching task, but there was no statistical difference in our data. We have rewritten and expanded this part of the results to make our point more clearly.

      Similarly, the authors mention several times that the LS links the HPC to 'reward' regions in the brain, and it has been found that the LS represents rewarded locations comparatively more than the hippocampus. How does this relate to their finding?

      Indeed, Wirtshafter and Wilson (2020) reported that lateral septum cells are more likely to have a place field close to a reward site than elsewhere in their double-sided T-maze. It is possible that this indicates a shift towards reward or value representations in the lateral septum. In our study we did not look at reward-biased cells and whether they are more or less likely to engage in theta cycle skipping. This could be a topic for future analyses. It should be noted that the study by Wirtshafter and Wilson (2020) reports that a reward bias was predominantly present for place fields in the direction of travel away from the reward site. These reward-proximate LS cells may thus contribute to theta-cycle skipping in the inbound direction, but it is not clear if these cells would be active during theta sweeps when approaching the choice point in the outbound direction.

      Reviewer #2 (Public Review)

      Summary

      Recent evidence indicates that cells of the navigation system representing different directions and whole spatial routes fire in a rhythmic alternation during 5-10 Hz (theta) network oscillation (Brandon et al., 2013, Kay et al., 2020). This phenomenon of theta cycle skipping was also reported in broader circuitry connecting the navigation system with the cognitive control regions (Jankowski et al., 2014, Tang et al., 2021). Yet nothing was known about the translation of these temporally separate representations to midbrain regions involved in reward processing as well as the hypothalamic regions, which integrate metabolic, visceral, and sensory signals with the descending signals from the forebrain to ensure adaptive control of innate behaviors (Carus-Cadavieco et al., 2017). The present work aimed to investigate theta cycle skipping and alternating representations of trajectories in the lateral septum, neurons of which receive inputs from a large number of CA1 and nearly all CA3 pyramidal cells (Risold and Swanson, 1995). While spatial firing has been reported in the lateral septum before (Leutgeb and Mizumori, 2002, Wirtshafter and Wilson, 2019), its dynamic aspects have remained elusive. The present study replicates the previous findings of theta-rhythmic neuronal activity in the lateral septum and reports a temporal alternation of spatial representations in this region, thus filling an important knowledge gap and significantly extending the understanding of the processing of spatial information in the brain. The lateral septum thus propagates the representations of alternative spatial behaviors to its efferent regions. The results can instruct further research of neural mechanisms supporting learning during goal-oriented navigation and decision-making in the behaviourally crucial circuits entailing the lateral septum.

      Strengths

      To this end, cutting-edge approaches for high-density monitoring of neuronal activity in freely behaving rodents and neural decoding were applied. Strengths of this work include comparisons of different anatomically and probably functionally distinct compartments of the lateral septum, innervated by different hippocampal domains and projecting to different parts of the hypothalamus; large neuronal datasets including many sessions with simultaneously recorded neurons; consequently, the rhythmic aspects of the spatial code could be directly revealed from the analysis of multiple spike trains, which were also used for decoding of spatial trajectories; and comparisons of the spatial coding between the two differently reinforced tasks.

      Weaknesses

      Although possible in principle with the present data across sessions, a longitudinal analysis of spatial coding during task learning was not performed. Without perturbation techniques, the present approach could not identify which aspects of the spatial code actually influence the generation of behaviors by downstream regions.

      Reviewer #3 (Public Review)

      Summary

      Bzymek and Kloosterman carried out a complex experiment to determine the temporal spike dynamics of cells in the dorsal and intermediate lateral septum during the performance of a Y-maze spatial task. In this descriptive study, the authors aim to determine if inputting spatial and temporal dynamics of hippocampal cells carry over to the lateral septum, thereby presenting the possibility that this information could then be conveyed to other interconnected subcortical circuits. The authors are successful in these aims, demonstrating that the phenomenon of theta cycle skipping is present in cells of the lateral septum. This finding is a significant contribution to the field as it indicates the phenomenon is present in neocortex, hippocampus, and the subcortical hub of the lateral septal circuit. In effect, this discovery closes the circuit loop on theta cycle skipping between the interconnected regions of the entorhinal cortex, hippocampus, and lateral septum. Moreover, the authors make 2 additional findings: 1) There are differences in the degree of theta modulation and theta cycle skipping as a function of depth, between the dorsal and intermediate lateral septum; and 2) The significant proportion of lateral septum cells that exhibit theta cycle skipping, predominantly do so during 'non-local' spatial processing.

      Strengths

      The major strength of the study lies in its design, with 2 behavioral tasks within the Y-maze and a battery of established analyses drawn from prior studies that have established spatial and temporal firing patterns of entorhinal and hippocampal cells during these tasks. Primary among these analyses, is the ability to decode the animal's position relative to locations of increased spatial cognitive demand, such as the choice point before the goal arms. The presence of theta cycle skipping cells in the lateral septum is robust and has significant implications for the ability to dissect the generation and transfer of spatial routes to goals within and between the neocortex and subcortical neural circuits.

      Weaknesses

      There are no major discernable weaknesses in the study, yet the scope and mechanism of the theta cycle phenomenon remain to be placed in the context of other phenomena indicative of spatial processing independent of the animal's current position. An example of this would be the ensemble-level 'scan ahead' activity of hippocampal place cells (Gupta et al., 2012; Johnson & Redish, 2007). Given the extensive analytical demands of the study, it is understandable that the authors chose to limit the analyses to the spatial and burst firing dynamics of the septal cells rather than the phasic firing of septal action potentials relative to local theta oscillations or CA1 theta oscillations. Yet, one would ideally be able to link, rather than parse, the phenomena of temporal dynamics. For example, Tingley and Buzsaki recently showed that there was significant phase coding of action potentials in lateral septum cells relative to spatial location (Tingley & Buzsaki, 2018). This raises the question as to whether the non-uniform distribution of septal cell activity within the Y-maze may have a phasic firing component, as well as a theta cycle skipping component. If so, these phenomena could represent another means of information transfer within the spatial circuit during cognitive demands. Alternatively, these phenomena could be part of the same process, ultimately representing the coherent input of information from one region to another. Future experiments will therefore have to sort out whether theta cycle skipping is a feature of either rate or phase coding, or perhaps both, depending on circuit and cognitive demands.

      The authors have achieved their aims of describing the temporal dynamics of the lateral septum, at both the dorsal extreme and the intermediate region. All conclusions are warranted.

      Reviewer #1 (Recommendations For The Authors)

      The text states: "We found that 39.7% of cells in the LSD and 32.4% of cells in LSI had significantly higher CSI values than expected by chance on at least one of the trajectories." The text in the supplemental figure indicates a p-value of 0.05 was used to determine significance. However, four trajectory categories are being examined so a Bonferroni correction should be used (significance at p<0.0125).

      Indeed, a p-value correction for multiple tests should be performed when determining theta cycle skipping behavior for each of the four trajectories. We thank the reviewer for pointing out this oversight. We have implemented a Holm-Sidak p-value correction for the number of tested trajectories per cell (excluding trajectories with insufficient spikes). As a consequence, the number of cells with significant cycle-skipping activity decreased, but overall the results have not changed.
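
      For readers unfamiliar with the Holm-Sidak procedure, the following minimal sketch shows the standard step-down adjustment applied to the per-trajectory p-values of a single cell. It is not the authors' code. The function name and the example p-values are hypothetical, and trajectories with insufficient spikes are assumed to have been dropped before the call.

```python
def holm_sidak(p_values):
    """Return Holm-Sidak adjusted p-values in the original order.

    The i-th smallest p-value (i = 0, 1, ...) out of m tests is adjusted to
    1 - (1 - p) ** (m - i); a running maximum keeps the result monotone.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = 1.0 - (1.0 - p_values[idx]) ** (m - rank)
        running_max = max(running_max, candidate)
        adjusted[idx] = min(running_max, 1.0)
    return adjusted

# Hypothetical p-values for the four trajectory types of one cell.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = holm_sidak(raw)
significant = [p < 0.05 for p in adjusted]
print(adjusted)
print(significant)
```

      For these made-up values the step-down adjustment and the Bonferroni threshold suggested by the reviewer (p < 0.0125) flag the same two trajectories. In general Holm-Sidak is at least as powerful as Bonferroni while still controlling the family-wise error rate.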

      Figure 4 is very confusing, as raster plots are displayed for multiple animals but it is unclear which animal the LFP refers to. The bottom of the plot is also referenced twice in the figure caption.

      We apologize for the confusion. We have removed this figure in the revised manuscript, as it was not necessary to make the point about the spatial distribution of theta cycle skipping. Instead, we show examples of spatially-resolved cycle skipping in Figure 4 (formerly Figure 5 - supplementary figures 1 and 2) and we have added a plot with the spatially-resolved cycle skipping index for all analyzed cells in Figure 5A.

      Figure 6 has, I think, an incorrect caption or figure. Only A and B are marked in the figure but A-G are mentioned in the caption but do not appear to correspond to anything in the figure.

      Indeed, the caption was outdated. This has now been corrected.

      Figure 8 is also confusing for several reasons: how is the probability scale on the right related to multiple semi-separate (top and middle) figures? In the top and bottom figures, it is not clear what the right and left sides refer to. It is also unclear why a probability of 0.25 is used for position (seems potentially low). The caption also mentions Figure A but there are no lettered "sub" figures in Figure 8.

      The color bar on the right applies to both the top plot (directional decoding) and the middle plot (positional decoding). However, the maximum probability that is represented by black differs between the top and middle plots. We acknowledge that a shared color bar may lead to confusion and we have given each of the plots a separate color bar.

      As for the maximum probability of 0.25 for position: this was a typo in the legend. The correct maximum value is 0.5. In general, the posterior probability will be distributed over multiple (often neighboring) spatial bins, and the distribution of maximum probabilities will depend on the number of spatial bins, the level of spatial smoothing in the decoding algorithm, and the amount of decodable information in the data. It would be more appropriate to consider the integrated probability over a small section of the maze, rather than the peak probability that is assigned to a single 5 cm bin. Also, note that a posterior probability of 0.5 is many times higher than the probability associated with a uniform distribution, which is in our case.
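
      The point about peak versus integrated posterior probability can be made concrete with a toy calculation. The sketch below is only an illustration. The number of spatial bins (60), the shape of the posterior, and the function names are hypothetical and are not taken from the manuscript.

```python
def uniform_probability(n_bins):
    """Probability assigned to each bin by a uniform distribution."""
    return 1.0 / n_bins

def integrated_probability(posterior, half_width):
    """Sum the posterior over the peak bin and half_width bins on either side."""
    peak = max(range(len(posterior)), key=posterior.__getitem__)
    lo = max(0, peak - half_width)
    hi = min(len(posterior), peak + half_width + 1)
    return peak, sum(posterior[lo:hi])

# Hypothetical posterior over 60 bins: 0.5 in the peak bin, 0.2 in each
# neighbor, and the remaining 0.1 spread evenly over the other 57 bins.
n_bins = 60
posterior = [0.1 / 57] * n_bins
posterior[9], posterior[10], posterior[11] = 0.2, 0.5, 0.2

peak_bin, local_mass = integrated_probability(posterior, half_width=1)
ratio = posterior[peak_bin] / uniform_probability(n_bins)
print(f"peak bin {peak_bin}: peak is {ratio:.0f}x uniform, local mass {local_mass:.2f}")
```

      With these assumed numbers a peak of 0.5 is 30 times the uniform level, and most of the posterior mass lies within one bin of the peak, which is why an integrated measure is more informative than the single-bin maximum.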

      The left and right sides of the plots represent two different journeys that the animal ran. On the left an outbound journey is shown, and on the right an inbound journey. We have improved the figure and the description in the legend to make this clearer.

      The reviewer is correct that there are no panels in Figure 8 and we have corrected the legend.

      Some minor concerns

      The introduction states that "a few studies have reported place cell-like activity in the lateral septum (Tingley and Buzsaki, 2018; Wirtshafter and Wilson, 2020, 2019)." However, notably and controversially, the Tingley study is one of the few studies to find NO place cell activity in the lateral septum. This is sort of mentioned later but the citation in this location should be removed.

      The reviewer is correct, Tingley and Buzsaki reported a spatial phase code but no spatial rate code. We have removed the citation.

      Stronger position/direction coding in the dLS consistent with prior studies and they should be cited in text (not a novel finding).

      Thank you for pointing out this omission. Indeed, a stronger spatial coding in the dorsal lateral septum has been reported before, for example by Van der Veldt et al. (2021). We now cite this paper when discussing these findings.

      Why is the alternation task administered for 30m but the switching task for 45m?

      The reason is that rats received a larger reward in the switching task (in the high-reward goal arm) and took longer to complete trials on average. To obtain a more-or-less similar number of trials per session in both tasks, we extended the duration of switching task sessions to 45 minutes. We have added this explanation to the text.

      Regarding the percentage of spatially modulated cells in the discussion, it is also worth pointing out that bits/sec information is consistent with previous studies.

      Thank you for the suggestion. We now point out that the spatial information in our data is consistent with previous studies.

      Reviewer #2 (Recommendations For The Authors)

      While the results of the study are robust and timely, further details of behavioural training, additional quantitative comparisons, and improvements in the data presentation would make the study more comprehensible and complete.

      Major comments

      (1) I could not fully comprehend the behavioural protocols. They require a clearer explanation of both the specific rationale of the two tasks as well as a more detailed presentation of the protocols. Specifically:

      (1.1) In the alternation task, were the arms baited in a random succession? How many trials were applied per session? Fig 1D: how could animals reach high choice accuracy if the baiting was random?

      We used a continuous version of the alternation task, in which the animals were rewarded for left→home→right and right→home→left visit sequences. In addition, animals were always rewarded on inbound journeys. There was no random baiting of goal arms. Perhaps the confusion stems from our use of the word “trial” to refer to a completed lap (i.e., a pair of outbound/inbound journeys). On average, animals performed 54 such trials per 30-minute session in the alternation task. We have expanded the description of the behavioral tasks in the Results and further clarified these points in the Methods section.
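
      The reward rule for outbound journeys described above can be sketched as follows. This is an illustration of the rule as stated in the response, not the task-control code. The function name is hypothetical, the first visit of a session is left unscored because the response does not say how it was treated, and it is assumed that after an error the most recently visited arm becomes the reference.

```python
def score_alternation(outbound_visits):
    """Mark each outbound goal-arm visit as correct (True) or incorrect (False).

    A visit is correct when it differs from the previous goal arm, i.e. the
    animal completed a left->home->right or right->home->left sequence.
    The first visit has no predecessor and is returned as None.
    """
    results = []
    previous = None
    for arm in outbound_visits:
        results.append(None if previous is None else arm != previous)
        previous = arm  # assumption: the last visited arm is always the reference
    return results

print(score_alternation(["L", "R", "R", "L"]))
```

      Under this rule choice accuracy depends only on the animal's own visit history, which is why high accuracy is achievable without any random baiting.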

      (1.2) Were they rewarded for correct inbound trials? If there was no reward, why were they considered correct?

      Yes, rats received a reward at the home platform for correct inbound trials. We have now explicitly stated this in the text.

      (1.3) In the switch alternation protocol, for how many trials was one arm kept more rewarding than the other, and how many trials followed after the rewarding value switch?

      A switch was triggered when rats (of their own volition) visited the high-reward goal arm eight times in a row. Following a switch, the animals could complete as many trials as necessary until they visited the new high-reward goal arm in eight consecutive trials, which triggered another switch. As can be seen in Figure 1D, at the population level, animals needed ~13 trials to fully commit to the high-reward goal arm following a switch. We have further clarified the switching task protocol in the Results and Methods sections.
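
      The self-paced switch rule can be summarized in a few lines of code. The sketch below follows the description in the response (eight consecutive visits to the current high-reward arm trigger a switch). The function name, the arm labels, and the choice of which arm is high-reward at the start are hypothetical.

```python
def find_switches(visits, initial_high="L", streak_needed=8):
    """Return the 0-based trial indices at which a reward-value switch is triggered."""
    other = {"L": "R", "R": "L"}
    high = initial_high
    streak = 0
    switches = []
    for trial, arm in enumerate(visits):
        # Count consecutive visits to the current high-reward arm.
        streak = streak + 1 if arm == high else 0
        if streak == streak_needed:
            switches.append(trial)
            high = other[high]  # the other arm now carries the large reward
            streak = 0
    return switches

# Hypothetical session: 8 visits to L, some sampling, then 8 visits to R.
session = ["L"] * 8 + ["R"] * 3 + ["L"] + ["R"] * 8
print(find_switches(session))
```

      Because the trigger depends on the animal's own streak, the number of trials between switches varies from block to block and from session to session.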

      (1.4) What does the phrase "the opposite arm (as 8 consecutive visits)" exactly mean? Sounds like 8 consecutive visits signalled that the arm was rewarded (as if were not predefined in the protocol).

      The task is self-paced and the animals initially visit both goal arms, before developing a bias for the high-reward goal arm. A switch of reward size was triggered as soon as the animal visited the high-reward goal arm for eight consecutive trials. We have rewritten the description of the switching task protocol, including this sentence, which hopefully clarifies the procedure.

      (1.5) P. 15, 1st paragraph, Theta cycle skipping and alternation of spatial representations is more prominent in the alternation task. Why in the switching task, did rats visit the left and right arms approximately equally often if one was more rewarding than the other? How many switches were applied per recording session, and how many trials were there in total?

      Both the left and right goal arms were sampled more or less equally by the animals because both goal arms at various times were associated with a large reward following switches in reward values during sessions. The number of switches per session varied from 1 to 3. Sampling of both goal arms was also evident at the beginning of each session and following each reward value switch, before animals switched their behavior to the (new) highly rewarded goal arm. In Table 1, we have now listed the number of trials and the number of reward-value switches for all sessions.

      (1.6) Is the goal arm in figures the rewarded/highly rewarded arm only or are non-baited arms also considered here?

      Both left and right arms are considered goal arms and were included in the analyses, irrespective of the reward that was received (or not received).

      (2) The spatial navigation-centred behavioural study design and the interpretation of results highlight the importance of the dorsal hippocampal input to the LS. Yet, the recorded LSI cells are innervated by intermediate and ventral aspects of the hippocampus, and LS receives inputs from the amygdala and the prefrontal cortex, which together may bring about reward- and reward-prediction-related aspects in the firing of LS cells during spatial navigation, aspects that are crucial for the adaptive behaviours regulated by the LS. Does success or failure to acquire reward in a trial modify spatial coding and cycle skipping of LSD vs. LSI cells in ensuing inbound and outbound trials?

      This is an excellent question and given the length of the current manuscript, we think that exploration of this question is best left for a future extension of our study.

      A related question: in Figure 10, it is interesting that cycle skipping is prominent in the goal arm for outbound switching trials and inbound trials of both tasks. Could it be analytically explained by task contingencies and behaviour (e.g. correct/incorrect trial, learning dynamics, running speed, or acceleration)?

      Our observation of cycle skipping at the single-cell level in the goal arms is somewhat surprising and, we agree with the reviewer, potentially interesting. However, it was not accompanied by alternation of representations at the population level. Given the current focus and length of the manuscript, we think further investigation of cycle skipping in the goal arm is better left for future analyses.

      (3) Regarding possible cellular and circuit mechanisms of cycle skipping and their relation to the alternating representations in the LS. Recent history of spiking influences the discharge probability; e.g. complex spike bursts in the hippocampus are associated with a post-burst delay of spiking. In LS, cycle skipping was characteristic for LS cells with high firing rates and was not uniformly present in all trajectories and arms. The authors propose that cycle skipping can be more pronounced in epochs of reduced firing, yet the opposite seems also possible - this phenomenon can be due to an intermittently increased drive onto some LS cells. Was there a systematic relationship between cycle skipping in a given cell and the concurrent firing rate or a recent discharge with short interspike intervals?

      In our discussion, we tried to explain the presence of theta cycle skipping in the goal arms at the single-cell level without corresponding alternation dynamics at the population level. We mentioned the possibility of a decrease in excitatory drive. As the reviewer suggests, an increase in excitatory drive combined with post-burst suppression or delay of spiking is an alternative explanation. We analyzed the spatial tuning of cells with theta cycle skipping and found that, on average, these cells have a higher firing rate in the goal arm than the stem of the maze in both outbound and inbound run directions (Figure 5 – figure supplement 1). In contrast, cells that do not display theta cycle skipping do not show increased firing in the goal arm. These results are more consistent with the reviewer’s suggested mechanism and we have updated the discussion accordingly.

      (4) Were the differences between the theta modulation (cycle skipping) of local vs. non-local representations (P.14, line 10-12, "In contrast...", Figure 9A) and between alternation vs. switching tasks (Figure 10 C,D) significantly different?

      We have added quantification and statistical comparisons for the auto- and cross-correlations of the local/non-local representations. The results indeed show significantly stronger theta cycle skipping of the non-local representations as compared to the local representations (Figure 10 - figure supplement 1A), a stronger alternation of non-local representations in the outbound direction (Figure 10 - figure supplement 1B), and significant differences between the two tasks (Figure 11E,F).

      (5) Regarding the possibility of prospective coding in LS, is the accurate coding of run direction not consistent with prospective coding? Can the direction be decoded from the neural activity in the start arm? Are the cycling representations of the upcoming arms near the choice point equally likely or preferential for the then-selected arm?

      The coding of run direction (outbound or inbound) is distinct from the prospective/retrospective coding of the goal arm. As implemented, the directional decoding model does not differentiate between the two goal arms and accurate decoding of direction with this model cannot inform us whether or not there is prospective (or retrospective) coding. To address the reviewer’s comments, we performed two additional analyses. First, we analyzed the directional (outbound/inbound) decoding performance as a function of location in the maze (Figure 6 - figure supplement 3E). The results show that directional decoding performance is high in both stem and goal arms. Second, we analyzed how well we can predict the trajectory type (i.e., to/from the left or right goal arm) as a function of location in the maze, and separately for outbound and inbound trajectories (Figure 6 - figure supplement 3C,D). The results show that on outbound journeys, decoding the future goal arm is close to chance when the animals are running along the stem. The decoding performance goes up around the choice point and reaches the highest level when animals are in the goal arm.

      (6) Figure 10 seems to show the same or similar data as Figures 5 (A,B) and 9 (C,D).

      Figure 10 (Figure 11 in the revised manuscript) re-analyzes the same data as presented in Figures 5 and 9, but separates the experimental sessions according to the behavioral task. We now explicitly state this.

      Minor comments

      (1) If cycle skipping in the periodicity of non-local representations was more prominent in alternation than in the switching task, one might expect them to be also prominent in early trials of the switching task, when the preference of a more rewarding arm is not yet established. Was this the case?

      The reviewer makes an interesting suggestion. Indeed, if theta cycle skipping and the alternation of non-local representations reflect that there are multiple paths that the animal is considering, one may predict that the theta skipping dynamics are similar between the two tasks in early trials (as the reviewer suggests). Similarly, one may predict that in the switching task, the alternation of non-local representations is weaker immediately before a reward contingency switch (when the animal has developed a bias towards the goal arm with a large reward) as compared to after the switch.

      We have now quantified the theta cycle dynamics of spatial representations in the early trials in each session of both tasks (Figure 11 - figure supplement 2) and in the trials before and after each switch in the switching task (Figure 11 - figure supplement 3).

      The results of the early trial analysis indicate stronger alternation of non-local representations in the alternation task than in the switching task (consistent with the whole session analysis), which is contrary to the prediction.

      The pre-/post-switch analysis did not reveal a significant difference between the trials before and after a reward contingency switch. If anything, there was a trend towards stronger theta cycle skipping/alternation in the trials before a switch, which would be opposite to the prediction.

      These results do not appear to support the idea that the alternation of non-local representations reflects the number of relevant paths available to the animal. We have updated the text to incorporate these new data and discuss the implications.

      (2) Summary: sounds like the encoding of spatial information and its readout in the efferent regions are equally well established.

      Thank you for pointing this out.

      (3) Summary: "motivation and reward processing centers such as the ventral tegmental area." How about also mentioning here the hypothalamus, which is a more prominent output of the lateral septum than the VTA?

      We have now also mentioned the hypothalamus.

      (4) "lateral septum may contribute to the hippocampal theta" - readers not familiar with details of the medial vs. lateral septum research may misinterpret the modest role of LS in theta compared to MS.

      We have added “in addition to the strong theta drive originating from the medial septum” to make clear that the lateral septum has a modest role in hippocampal theta generation.

      (5) "(Tingley and Buzsáki, 2018) found a lack of spatial rate coding in the lateral septum and instead reported a place coding by specific phases of the hippocampal theta rhythm (Rizzi-Wise and Wang, 2021) " needs rephrasing.

      Thank you, we have rephrased the sentence.

      (6) Figure 4 is a bit hard to generalize. The authors may additionally consider a sorted raster presentation of the dataset in this main figure.

      We have removed this figure in the revised manuscript, as it was not necessary to make the point about the location of theta cycle skipping. Instead, we show examples of spatially-resolved cycle skipping in Figure 4 (formerly Figure 5 - supplementary figures 1 and 2), and, following the reviewer’s suggestion, we have added a plot with the spatially-resolved cycle skipping index for all analyzed cells (Figure 5A).

      (7) It would help if legends of Figure 5 (and related supplementary figures) state in which of the two tasks the data was acquired, as it is done for Figure 10.

      Thank you for the suggestion. The legends of Figure 4A,B (formerly Figure 5 – supplemental figures 1 and 2) and Figure 5 now include in which behavioral task the data was acquired.

      (8) Page 10, "Spatial coding...", 1st Citing the initial report by Leutgeb and Mizumori would be appropriate here too.

      The reviewer is correct. We have added the citation.

      (9) The legend in Figure 6 (panels A-G) does not match the figure (only panels A,B). What is shown in Fig. 6B, the legend does not seem to fully match.

      Indeed, the legend was outdated. This has now been corrected.

      (10) Figure 7 suppl., if extended to enable comparisons, could be a main figure. Presently, Figure 7C does not account for the confounding effect of population size and is therefore difficult to interpret without complex comparisons with the Supplementary Figure, which is revealing per se.

      We thank the reviewer for their suggestion. We have changed Figure 7 such that it only shows the analysis of decoding performed with all LSD and LSI cells. Figure 7 – supplemental figure 1 has been transformed into main Figure 8, with the addition of a panel to show a statistical comparison between decoding performance in LSD and LSI with a fixed number of cells.

      (11) 14, line 10 there is no Figure 8A

      This has been corrected.

      (12) 15 paragraph 1, is the model discussed here the one from Kay et al.?

      From Kay et al. (2020) and also Wang et al. (2020). We have added the citations.

      (13) Figure 5 - Figure Supplement 1 presents a nice analysis that, in my view, can merit a main figure. I could not find the description of the colour code in CSI panels, does grey/red refer to non/significant points?

      Indeed, grey/red refers to non-significant points and significant points respectively. We have clarified the color code in the figure legend. Following the reviewer’s suggestion, we have made Figure 5 Supplement 1 and 2 a main figure (Figure 4).

      (14) Figure 5 -Figure Supplement 2. Half of the cells (255 and 549) seem not to be representative of the typically high CSI in the goal arm in left and right inbound trials combined (Figure 5 A). Were the changes in CSI in the right and left inbound trials similar enough to be combined in Fig 5A? Otherwise, considering left and right inbound runs separately and trying to explain where the differences come from would seem to make sense.

      Figure 5 – figure supplement 2 is now part of the new main Figure 4. Originally, the examples were from a single session and the same cells as shown in the old Figure 4. However, since the old Figure 4 has been removed, we have selected examples from different sessions and both left/right trajectories that are more representative of the overall distribution. We have further added a plot with the spatially-resolved cycle skipping for all analyzed cells in Figure 5A.

      (15) In the second paragraph of the Discussion, dorso-ventral topography of hippocampal projections to the LS (Risold and Swanson, Science, 90s) could be more explicitly stated here.

      Thank you for the suggestion. We have now explicitly mentioned the dorsal-ventral topography of hippocampal-lateral septum projections and cite Risold & Swanson (1997).

      (16) Discussion point: why do the differences in spatial information of cells in the ventral/intermediate vs. dorsal hippocampus not translate into similarly prominent differences in LSI vs. LSD?

      In our data, we do observe clear differences in spatial coding between LSD and LSI. Specifically, cell activity in the LSD is more directional, has higher goal arm selectivity, and higher spatial information (we have now added statistical comparisons to Figure 6 – figure supplement 1). As a result, spatial decoding performance is much better for LSD cell populations than LSI cell populations (see updated Figure 8, with statistical comparison of decoding performance). Spatial coding in the LS is not as strong as in the hippocampus, likely because of the convergence of hippocampal inputs, which may give the impression of a less prominent difference between the two subregions.

      (17) Discussion, last paragraph: citation of the few original anatomical and neurophysiological studies would be fitting here, in addition to the recent review article.

      Thank you for the suggestion. We have added selected citations of the original literature.

      (18) Methods, what was the reference electrode?

      We used an external reference electrode that was soldered to a skull screw, which was positioned above the cerebellum. We have added this to the Methods section.

      (19) Methods, Theta cycle skipping: bandwidth = Gaussian kernel parameter?

      The bandwidth is indeed a parameter of the Gaussian smoothing kernel and is equal to the standard deviation.
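
      As an illustration of the statement above, the following is a minimal sketch of a Gaussian smoothing kernel whose standard deviation equals the bandwidth. The function names (`gaussian_kernel`, `smooth`), the truncation at four standard deviations, and the bin width are assumptions made for this example; they are not taken from the authors' analysis code.

```python
import numpy as np


def gaussian_kernel(bandwidth, dt, n_sigma=4.0):
    """Return (times, weights) of a discrete Gaussian kernel.

    `bandwidth` is the standard deviation of the Gaussian, in the same
    units as the bin width `dt`. Truncating the kernel at `n_sigma`
    standard deviations is an assumption made for this sketch.
    """
    half_width = int(np.ceil(n_sigma * bandwidth / dt))
    t = np.arange(-half_width, half_width + 1) * dt
    weights = np.exp(-0.5 * (t / bandwidth) ** 2)
    return t, weights / weights.sum()  # normalize so the weights sum to 1


def smooth(values, bandwidth, dt):
    """Smooth a binned signal (e.g. an autocorrelogram) with the kernel."""
    _, weights = gaussian_kernel(bandwidth, dt)
    return np.convolve(values, weights, mode="same")
```

      A call such as `smooth(counts, bandwidth=0.01, dt=0.001)` would smooth a signal binned at 1 ms with a 10 ms standard deviation; both numbers are placeholders, since the response does not state the values used.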

      Reviewer #3 (Recommendations For The Authors)

      Below I offer a short list of minor comments and suggestions that may benefit the manuscript.

      (A) I was not able to access the Open Science Framework Repository. Can this be rectified?

      Thank you for checking the OSF repository. The data and analysis code are now publicly available.

      (B) In the discussion the authors should attempt to flesh out whether they can place theta cycle skipping into context with left/right sweeps or scan ahead phenomena, as shown in the Redish lab.

      Thank you for the excellent suggestion. We have now added a discussion of the possible link between theta cycle skipping and the previously reported scan-ahead theta sweeps.

      (C) What is the mechanism of cycle skipping? This could be relevant to intrinsic vs network oscillator models. Reference should also be made to the Deshmukh model of interference between theta and delta (Deshmukh, Yoganarasimha, Voicu, & Knierim, 2010).

      We had discussed a potential mechanism in the discussion (2nd to last paragraph in the revised manuscript), which now includes a citation of a recent computational study (Chu et al., 2023). We have now also added a reference to the interference model in Deshmukh et al, 2010.

      (D) Little background was given for the motivation and expectation for potential differences between the comparison of the dorsal and intermediate lateral septum. I don't believe that this is the same as the dorsal/ventral axis of the hippocampus, but if there's a physiological justification, the authors need to make it.

      We have added a paragraph to the introduction to explain the anatomical and physiological differences across the lateral septum subregions that provide our rationale for comparing dorsal and intermediate lateral septum (we excluded the ventral lateral septum because the number of cells recorded in this region was too low).

      (E) It would help to label "outbound" and "inbound" on several of the figures. All axes need to be labeled, with appropriate units indicated.

      We have carefully checked the figures and added inbound/outbound labels and axes labels where appropriate.

      (F) In Figure 6, the legend doesn't match the figure.

      Indeed, the legend was outdated. This has now been corrected.

      (G) The firing rate was non-uniform across the Y-maze. Does this mean that the cells tended to fire more in specific positions of the maze? If so, how would this affect the result? Would increased theta cycle skipping at the choice point translate to a lower firing rate at the choice point? Perhaps less overdispersion of the firing rate (Fenton et al., 2010)?

      Individual cells indeed show a non-uniform firing rate across the maze. To address the reviewer’s comment and test if theta cycle skipping cells were active preferentially near the choice point or other locations, we computed the mean-corrected spatial tuning curves for cell-trajectory pairs with and without significant theta cycle skipping. This additional analysis indicates that, on average, the population of theta cycle skipping cells showed a higher firing rate in the goal arms than in the stem of the maze as compared to non-skipping cells for outbound and inbound directions (shown in Figure 5 - figure supplement 1).

      (H) As mentioned above, it could be helpful to look at phase preference. Was there an increased phase preference at the choice point? Would half-cycle firing correlate with an increased or decreased phase preference? Based on prior work, one would expect increased phase preference, at least in CA1, at the choice point (Schomburg et al., 2014). In contrast, other work might predict phasic preference according to spatial location (Tingley & Buzsaki, 2018). Including phase analyses is a suggestion, of course. The manuscript is already sufficiently novel and informative. Yet, the authors should state why phase was not analyzed and that these questions remain for follow-up analyses. If the authors did analyze this and found negative results, it should be included in this manuscript.

      We thank the reviewer for their suggestion. We have not yet analyzed the theta phase preference of lateral septum cells or other relations to the theta phase. We agree that this would be a valuable extension of our work, but prefer to leave it for future analyses.

      (I) One of the most important aspects of the manuscript, is that there is now evidence of theta cycle skipping in the circuit loop between the EC, CA1, and LS. This now creates a foundation for circuit-based studies that could dissect the origin of route planning. Perhaps the authors should state this? In the same line of thinking, how would one determine whether theta cycle skipping is necessary for route planning as opposed to a byproduct of route planning? While this question is extremely complex, other studies have shown that spatial navigation and memory are still possible during the optogenetic manipulation of septal oscillations (Mouchati, Kloc, Holmes, White, & Barry, 2020; Quirk et al., 2021). However, pharmacological perturbation or lesioning of septal activity can have a more profound effect on spatial navigation (Bolding, Ferbinteanu, Fox, & Muller, 2019; Winson, 1978). As a descriptive study, I think it would be helpful to remind the readers of these basic concepts.

      We thank the reviewer for their comment and for pointing out possible future directions for linking theta cycle skipping to route planning. Experimental manipulations to directly test this link would be very challenging, but worthwhile to pursue. We now mention how circuit-based studies may help to test if theta cycle skipping in the broader subcortical-cortical network is necessary for route planning. Given that the discussion is already quite long, we decided to omit a more detailed discussion of the possible role of the medial septum (which is the focus of the papers cited by the reviewer).

      Very minor points

      (A) In the introduction, "one study" begins the sentence but there is a second reference.

      Thank you, we have rephrased the sentence.

      (B) Also in the introduction, it could be helpful to have an operational definition of theta cycle skipping (i.e., 'enhanced rhythmicity at half theta frequency').

      We followed the reviewer’s suggestion.

      (C) The authors should be more explicit in the introduction about their main question. Theta cycle skipping exists in CA1; the authors could then import some of the explanations mentioned in the discussion into the introduction (i.e., attractor states of multiple routes). The main question is then whether this phenomenon, and others from CA1, translate to the output in LS.

      We have edited the introduction to more clearly state the main question of our study, following the suggestion from the reviewer.

      (D) There are a few instances of extra closing parentheses.

      We checked the text but did not find instances of erroneous extra closing parentheses. There are instances of nested parentheses, which may have given the impression that closing parentheses were duplicated.

      (E) The first paragraph of the Discussion lacks sufficient references.

      We have now added references to the first paragraph of the discussion.

      (F) At the end of the 2nd paragraph in the Discussion, the comparison is missing. More than what? It's not until the next reference that one can assume that the authors are referring to a dorsal/ventral axis. However, the physiological motivation for this comparison is lacking. Why would one expect a dorsal/intermediate continuum for theta modulation as there is along the dorsal/ventral axis of the hippocampus?

      Thank you for spotting this omission. We have rewritten the paragraph to more clearly make the parallel between dorsal-ventral gradients in the lateral septum and hippocampus and how this relates to the topographical connections between the two structures.

    2. eLife assessment

      In this study, the authors present convincing evidence to demonstrate theta cycle skipping by individual neurons of the lateral septum, which they then relate to population coding of future trajectories encapsulated by theta cycles. This valuable finding furthers our understanding of how the septum conveys navigational information downstream.

    3. Reviewer #1 (Public Review):

      Summary:

      The authors provide very compelling evidence that the lateral septum (LS) engages in theta cycle skipping.

      Strengths:

      The data and analysis are highly compelling regarding the existence of cycle skipping.

      Comments on the revised version:

      All previous recommendations were addressed in this revision.

    4. Reviewer #2 (Public Review):

      Summary

      Recent evidence indicates that cells of the navigation system representing different directions and whole spatial routes fire in a rhythmic alternation during 5-10 Hz (theta) network oscillation (Brandon et al., 2013, Kay et al., 2020). This phenomenon of theta cycle skipping was also reported in broader circuitry connecting the navigation system with the cognitive control regions (Jankowski et al., 2014, Tang et al., 2021). Yet nothing was known about the translation of these temporally separate representations to midbrain regions involved in reward processing as well as the hypothalamic regions, which integrate metabolic, visceral, and sensory signals with the descending signals from the forebrain to ensure adaptive control of innate behaviors (Carus-Cadavieco et al., 2017). The present work aimed to investigate theta cycle skipping and alternating representations of trajectories in the lateral septum, neurons of which receive inputs from a large number of CA1 and nearly all CA3 pyramidal cells (Risold and Swanson, 1995). While spatial firing has been reported in the lateral septum before (Leutgeb and Mizumori, 2002, Wirtshafter and Wilson, 2019), its dynamic aspects have remained elusive. The present study replicates the previous findings of theta-rhythmic neuronal activity in the lateral septum and reports a temporal alternation of spatial representations in this region, thus filling an important knowledge gap and significantly extending the understanding of the processing of spatial information in the brain. The lateral septum thus propagates the representations of alternative spatial behaviors to its efferent regions. The results can instruct further research of neural mechanisms supporting learning during goal-oriented navigation and decision-making in the behaviourally crucial circuits entailing the lateral septum.

      Strengths

      To this end, cutting-edge approaches for high-density monitoring of neuronal activity in freely behaving rodents and neural decoding were applied. Strengths of this work include comparisons of different anatomically and probably functionally distinct compartments of the lateral septum, innervated by different hippocampal domains and projecting to different parts of the hypothalamus; large neuronal datasets including many sessions with simultaneously recorded neurons; consequently, the rhythmic aspects of the spatial code could be directly revealed from the analysis of multiple spike trains, which were also used for decoding of spatial trajectories; and comparisons of the spatial coding between the two differently reinforced tasks.

      Weaknesses

      Without using perturbation techniques, the present approach could not identify the aspects of the spatial code actually influencing the generation of behaviors by downstream regions.

    1. eLife assessment

      This important work identifies a previously uncharacterized capacity for songbirds to recover vocal targets even without sensory experience. The evidence supporting this claim is convincing, with technically difficult and innovative experiments exploring goal-directed vocal plasticity in deafened birds. This work has broad relevance to the fields of vocal and motor learning.

    2. Reviewer #1 (Public Review):

      Summary:

      Zai et al test if songbirds can recover the capacity to sing auditory targets without singing experience or sensory feedback. Past work showed that after the pitch of targeted song syllables is driven outside of birds' preferred target range with external reinforcement, birds revert to baseline (i.e. restore their song to their target). Here the authors tested the extent to which this restoration occurs in muted or deafened birds. If these birds can restore, this would suggest an internal model that allows for sensory-to-motor mapping. If they cannot, this would suggest that learning relies entirely on feedback-dependent mechanisms, e.g. reinforcement learning (RL). The authors find that deafened birds exhibit moderate but significant restoration, consistent with the existence of a previously under-appreciated internal model in songbirds.

      Strengths:

      The experimental approach of studying vocal plasticity in deafened or muted birds is innovative, technically difficult and perfectly suited for the question of feedback-independent learning. The finding in Figure 4 that deafened birds exhibit subtle but significant plasticity toward restoration of their pre-deafening target is surprising and important for the songbird and vocal learning fields, in general.

      In this revision, the authors suitably addressed confusion about some statistical methods related to Fig. 4, where the main finding of vocal plasticity in deafened birds was presented.

      There remain minor issues in the presentation early in the results section and in Fig. 4 that should be straightforward to clarify in the revision.

    3. Reviewer #3 (Public Review):

      Summary:

      Zai et al. test whether birds can modify their vocal behavior in a manner consistent with planning. They point out that while some animals are known to be capable of volitional control of vocalizations, it has been unclear if animals are capable of planning vocalizations, that is, modifying vocalizations towards a desired target without the need to learn this modification by practising and comparing sensory feedback of practised behavior to the behavioral target. They study zebra finches that have been trained to shift the pitch of song syllables away from their baseline values. It is known that once this training ends, zebra finches have a drive to modify pitch so that it is restored back to its baseline value. They take advantage of this drive to ask whether birds can implement this targeted pitch modification in a manner that looks like planning, by comparing the time course and magnitude of pitch modification in separate groups of birds who have undergone different manipulations of sensory and motor capabilities. A key finding is that birds who are deafened immediately before the onset of this pitch restoration paradigm, but after they have been shifted away from baseline, are able to shift pitch partially back towards their baseline target. In other words, this targeted pitch shift occurs even when birds don't have access to auditory feedback, which argues that this shift is not due to reinforcement-learning-guided practice, but is instead planned based on the difference between an internal representation of the target (baseline pitch) and current behavior (pitch the bird was singing immediately before deafening).

      The authors present additional behavioral studies arguing that this pitch shift requires auditory experience of song in its state after it has been shifted away from baseline (birds deafened early on, before the initial pitch shift away from baseline, do not exhibit any shift back towards baseline), and that a full shift back to baseline requires auditory feedback. The authors synthesize these results to argue that different mechanisms operate for small shifts (planning, which does not need auditory feedback) and large shifts (through a mechanism that requires auditory feedback).

      The authors also make a distinction between two kinds of planning: covert (not requiring any motor practice) and overt (requiring motor practice but without access to auditory experience from which target mismatch could be computed). They argue that birds plan overtly, based on these deafening experiments as well as an analogous experiment involving temporary muting, which suggests that indeed motor practice is required for pitch shifts.

      Strengths:

      The primary finding (that partially restorative pitch shift occurs even after deafening) rests on strong behavioral evidence. It is less clear to what extent this shift requires practice, since their analysis of pitch after deafening takes the average over the first two hours of singing. If this shift is already evident in the first few renditions then this would be evidence for covert planning. Technical hurdles, such as limited sample sizes and unstable song after surgical deafening, make this difficult to test. (Similarly, the authors could test whether the first few renditions after recovery from muting already exhibit a shift back towards baseline.)

      This work will be a valuable addition to others studying birdsong learning and its neural mechanisms. It documents features of birdsong plasticity that are unexpected in standard models of birdsong learning based on reinforcement and are consistent with an additional, perhaps more cognitive, mechanism involving planning. As the authors point out, perhaps this framework offers a reinterpretation of the neural mechanisms underlying a prior finding of covert pitch learning in songbirds (Charlesworth et al., 2012).

      A strength of this work is the variety and detail in its behavioral studies, combined with sensory and motor manipulations, which on their own form a rich set of observations that are useful behavioral constraints on future studies.

      Weaknesses:

      The argument that pitch modification in deafened birds requires some experience hearing their song in its shifted state prior to deafening (Fig. 4) is solid but has an important caveat. Their argument rests on comparing two experimental conditions: one with and one without auditory experience of shifted pitch. However, these conditions also differ in the pitch training paradigm: the "with experience" condition was performed using white noise training, while the "without experience" condition used "lights off" training (Fig. 4A). It is possible that the differences in ability for these two groups to restore pitch to baseline reflect the training paradigm, not whether subjects had auditory experience of the pitch shift. Ideally, a control study would use one of the training paradigms for both conditions, which would be "lights off" or electrical stimulation (McGregor et al. 2022), since WN training cannot be performed in deafened birds. In the Discussion, in response to this point, the authors point out that birds are known to recover their pitch shift if those shifts are driven using electrical stimulation as reinforcement (McGregor et al. 2022); however, it is arguably still relevant to know whether a similar recovery occurs for the "lights off" paradigm used here.

    4. Author response:

      The following is the authors’ response to the previous reviews

      Reviewer #1 (Recommendations For The Authors):

      In this revision the authors address some of the key concerns, including clarification of the balanced nature of the RL driven pitch changes and conducting analyses to control for the possible effects of singing quantity on their results. The paper is much improved but still has some sources of confusion, especially around Fig. 4, that should be fixed. The authors also start the paper with a statistically underpowered minor claim that seems unnecessary in the context of the major finding. I recommend the authors may want to restructure their results section to focus on the major points backed by sufficient n and stats.

      Major issues.

      (1) The results section begins very weak - a negative result based on n=2 birds and then a technical mistake of tube clogging re-spun as an opportunity to peek at intermittent song in the otherwise muted birds. The logic may be sound but these issues detract from the main experiment, result, analysis, and interpretation. I recommend re-writing this section to home in on, from the outset, the well-powered results. How much is really gained from the n=2 birds that were muted before ANY experience? These negative results may not provide enough data to make a claim. Nor is this claim necessary to motivate what was done in the next 6 birds. I recommend dropping the claim?

      We thank the reviewer for the recommendation. We moved the information to the Methods.

      (2) Fig. 4 is very important yet remains very confusing, as detailed below.

      Fig. 4a. Can the authors clarify if the cohort of WNd birds that give rise to the positive result in Fig 4 ever experienced the mismatch in the absence of ongoing DAF reinforcement pre-deafening? Neither Fig. 4a nor the text clearly specifies this. This is important because we know that there are day timescale delays in LMAN-dependent bias away from DAF and consolidation into the HVC-RA pathway (Andalman and Fee, 2009). Thus, if birds experienced mismatch pre-deafening in the absence of DAF, then an early learning phase in Area X could be set in place. Then deafening occurs, but these weight changes in X could result in LMAN bias that expresses only days later, independent of auditory feedback. Such a process would not require an internal model as the authors are arguing for here. It would simply arise from delays in implementing reinforcement-driven feedback. If the birds in Fig 4 always had DAF on before deafening, then this is not an issue. But if the birds had hours of singing with DAF off before deafening, and therefore had the opportunity to associate DA error signals with the targeted time in the song (e.g. pauses on the far-from-target renditions; Duffy et al, 2022), then the return-to-baseline would be expected to be set in place independent of auditory feedback. Please clarify exactly if the pitch-contingent DAF was on or off in the WNd cohort in the hours before deafening. In Fig. 3b it looks like the answer is yes but I cannot find this clearly stated in the text.

      We did not provide DAF-free singing experience to the birds in Fig. 4 before deafening. Thus, according to the reviewer, the concern does not apply.

      Note that we disagree with the reviewer’s premise that there is ‘day timescale delay in LMAN-dependent bias away from DAF and consolidation into the HVC-RA pathway’. More recent data reveals immediate consolidation of the anterior forebrain bias without a night-time effect (Kollmorgen, Hahnloser, Mante 2020; Tachibana, Lee, Kai, Kojima 2022). Thus, the single bird in (Andalman and Fee 2009) seems to be somewhat of an outlier.

      Hearing birds can experience the mismatch regardless of whether they experience DAF-free singing (provided their song was sufficiently shifted): even the renditions followed by white noise can be assessed with regards to their pitch mismatch, so that DAF imposes no limitation on mismatch assessment.

      We disagree with their claim that no internal model would be needed in case consolidation was delayed in Area X. If indeed, Area X stores the needed change and it takes time to implement this change in LMAN, then we would interpret the change in Area X as the plan that birds would be able to implement without auditory feedback. Because pitch can either revert (after DAF stops) or shift further away (when DAF is still present), there is no rigid delay that is involved in recovering the target, but a flexible decision making of implementing the plan, which in our view amounts to using a model.

      Fig 4b. Early and Late colored dots in legend are both red; late should be yellow? Perhaps use colors that are more distinct - this may be an issue of my screen but the two colors are difficult to discern.

      We used colors yellow to red to distinguish different birds and not early and late. We modified the markers to improve visual clarity: Early is indicated with round markers and late with crosses.

      Fig 4b. R, E, and L phases are only plotted for 4c; not in 4b. But the figure legend says that R, E and L are on both panels.

      In Fig. 4b E and L are marked with markers because they are different for different birds. In Fig. 4c the phases are the same for all birds and thus we labeled them on top. We additionally marked R in Fig. 4b as in Fig. 4c.

      Fig 4e. Did the color code switch? In the rest of Fig 4, DLO is red and WND is blue. Then in 4e it swaps. Is this a typo in the caption? Or are the colors switched? Please fix this; it's very confusing.

      Thank you for pointing out the typo in the caption. We corrected it.

      The y axes in Fig 4d-e are both in std of pitch change - yet they have different ylim, which makes it visually difficult to compare by eye. Is there a reason for this? Can the authors make the ylim the same for Fig 4d-e?

      We added dashed lines to clarify the difference in ylim.

      Fig 4d-e is really the main positive finding of the paper. Can the authors show an example bird that showcases this positive result, plotted as in Fig 3b? This will help the audience clearly visualize the raw data that go into the d' analyses and get a more intuitive sense of the magnitude of the positive result.

      We added example birds to figure 4, one for WNd and one for dLO.

      Please define 'late' in Fig.4 legend.

      Done

      Minor

      Define NRP in the text with an example. Is an NRP of 100 where the bird was before the withdrawal of reinforcement?

      We added the sentence to the results:

      "We quantified recovery in terms of 𝑵𝑹𝑷 to discount for differences in the amount of initial pitch shift where 𝑵𝑹𝑷 = 𝟎% corresponds to complete recovery and 𝑵𝑹𝑷 = 𝟏𝟎𝟎% corresponds pitch values before withdrawal of reinforcement (R) and thus no recovery."

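      The two anchor points in the quoted sentence (NRP = 0% at complete recovery, NRP = 100% at the pitch sung just before withdrawal of reinforcement, R) are consistent with a linear normalization of the residual pitch shift. The sketch below assumes that linear form; the function and argument names are hypothetical, and the exact definition used in the manuscript may differ.

```python
def normalized_residual_pitch(pitch, baseline_pitch, pitch_at_r):
    """Residual pitch shift as a percentage of the initial shift (assumed linear).

    Returns 0.0 when `pitch` equals the baseline (complete recovery) and
    100.0 when `pitch` equals the value before withdrawal of reinforcement
    (no recovery). All three arguments must share the same units.
    """
    initial_shift = pitch_at_r - baseline_pitch
    if initial_shift == 0:
        raise ValueError("pitch_at_r equals baseline_pitch; NRP is undefined")
    return 100.0 * (pitch - baseline_pitch) / initial_shift
```

      Because the residual is divided by each bird's own initial shift, upward and downward shifts of different sizes map onto the same 0 to 100% scale, which is what allows recovery to be compared across birds.
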
      Reviewer #3 (Recommendations For The Authors):

      The use of "hierarchically lower" to refer to the flexible process is confusing to me, and possibly to many readers. Some people think of flexible, top-down processes as being _higher_ in a hierarchy. Regardless, it doesn't seem important, in this paper, to label the processes in a hierarchy, so perhaps avoid using that terminology.

      We reformulated the paragraph using ‘nested processes’ instead of hierarchical processes.

      In the statement "a seeming analogous task to re-pitching of zebra finch song, in humans, is to modify developmentally learned speech patterns", a few suggestions: it is not clear whether "re-pitching" refers to planning or feedback-dependent learning (I didn't see it introduced anywhere else). And if this means planning, then it is not clear why this would be analogous to "humans modifying developmentally learned speech patterns". As you mentioned, humans are more flexible at planning, so it seems re-pitching would _not_ be analogous (or is this referring to the less flexible modification of accents?).

      We changed the sentence to:

      "Thus, a seeming analogous task to feedback-dependent learning of zebra finch song, in humans, is to modify developmentally learned speech patterns."

    1. Author response:

      We would first like to thank the editor for considering our findings for publication in eLife. Furthermore, we thank the reviewers and editors for their encouraging reviews and for providing helpful and insightful comments.

      Reviewer #1 (Public Review):

      Summary:

      The pituitary gonadotropins, FSH and LH, are critical regulators of reproduction. In mammals, synthesis and secretion of FSH and LH by gonadotrope cells are controlled by the hypothalamic peptide, GnRH. As FSH and LH are made in the same cells in mammals, variation in the nature of GnRH secretion is thought to contribute to the differential regulation of the two hormones. In contrast, in fish, FSH and LH are produced in distinct gonadotrope populations and may be less (or differently) dependent on GnRH than in mammals. In the present manuscript, the authors endeavored to determine whether FSH may be independently controlled by a distinct peptide, cholecystokinin (CCK), in zebrafish.

      Strengths:

      The authors demonstrated that the CCK receptor is enriched in FSH-producing relative to LH-producing gonadotropes, and that genetic deletion of the receptor leads to dramatic decreases in gonadotropin production and gonadal development in zebrafish. Also, using innovative in vivo and ex vivo calcium imaging approaches, they show that LH- and FSH-producing gonadotropes preferentially respond to GnRH and CCK, respectively. Exogenous CCK also preferentially stimulated FSH secretion ex vivo and in vivo.

      Weaknesses:

      The concept that there may be a distinct FSH-releasing hormone (FSHRH) has been debated for decades. As the authors suggest that CCK is the long-sought FSHRH (at least in fish), they must provide data that convincingly leads to such a conclusion. In my estimation, they have not yet met this burden. In particular, they show that CCK is sufficient to activate FSH-producing cells, but have not yet demonstrated its necessity. Their one attempt to do so was using fish in which they inactivated the CCK receptor using CRISPR-Cas9. While this manipulation led to a reduction in FSH, LH was affected to a similar extent. As a result, they have not shown that CCK is a selective regulator of FSH.

      Our conclusion regarding the necessity of CCK signaling for FSH secretion is based on the following evidence:

      (1) CCK-like receptors are expressed in the pituitary gland predominantly on FSH cells.

      (2) Application of CCK to pituitaries elicits FSH cell activation and FSH release, and, to a lesser degree, activation of LH cells.

      (3) Mutating the CCK-like receptor causes a decrease in fsh and lh mRNA synthesis.

      (4) Mutating the CCK-like receptor gives rise to a phenotype which is identical to that caused by mutation of both lh and fsh genes in zebrafish.

      (5) Mutating the FSH-specific CCK receptor in a different species of fish (medaka) also causes a complete shutdown of FSH production and phenocopies a fsh-mutant phenotype (Uehara et al, BioRxiv, DOI: 10.1101/2023.05.26.542428).

      Taken together, we believe that these data strongly support the conclusion that CCK is necessary for FSH production and release from the fish pituitary. Admittedly, the overlapping effects of CCK on both FSH and LH cells in zebrafish (evident in both our calcium imaging experiments and the KO phenotype) complicate the interpretation of the phenotype. We speculate that the effect of CCK on LH cells in zebrafish can be caused either by paracrine signaling within the gland or by the effects of CCK on higher levels of the axis. In our revised manuscript we will make sure to highlight the overlapping effects of CCK on LH cells rather than portray it as a selective activator of FSH cells.

      Moreover, they do not yet demonstrate that the effects observed reflect the loss of the receptor's function in gonadotropes, as opposed to other cell types.

      Although there is evidence for the expression of the CCK receptor in other tissues, we do show a direct decrease of FSH and LH expression in the gonadotrophs of the pituitary of the mutant fish; taken together with the receptor's significant expression in FSH cells, loss of receptor function in gonadotrophs is the most reasonable and straightforward explanation for the mutant phenotype. Unfortunately, unlike in mice, technologies for conditional knockout of genes in specific cell types are not yet available for our model and cell types. However, in the revised manuscript we will add a supplementary figure describing the distribution of this receptor in other tissues.

      It also is not clear whether the phenotypes of the fish reflect perturbations in pituitary development vs. a loss of CCK receptor function in the pituitary later in life. Ideally, the authors would attempt to block CCK signaling in adult fish that develop normally. For example, if CCK receptor antagonists are available, they could be used to treat fish and see whether and how this affects FSH vs. LH secretion.

      While the observed gonadal phenotype of the KO (sex inversion) should have a developmental origin since it requires a long time to manifest, the effect of the KO on FSH and LH cells is probably more acute.

      In the Discussion, the authors suggest that CCK, as a satiety factor, may provide a link between metabolism and reproduction. This is an interesting idea, but it is not supported by the data presented. That is, none of the results shown link metabolic state to CCK regulation of FSH and fertility. Absent such data, the lengthy Discussion of the link is speculative and not fully merited.

      In the revised manuscript, we will address this comment by either providing data to link CCK with metabolic status or toning down the Discussion of this topic.

      Also in the Discussion, the authors argue that "CCK directly controls FSH cells by innervating the pituitary gland and binding to specific receptors that are particularly abundant in FSH gonadotrophs." However, their imaging does not demonstrate innervation of FSH cells by CCK terminals (e.g., at the EM level).

      Innervation of the fish pituitary does not imply a synaptic-like connection between axon terminals and endocrine cells. In fact, such connections are extremely rare, and their functionality is unclear. Instead, the mode of regulation between hypothalamic terminals and endocrine cells in the fish pituitary is more similar to "volume transmission" in the CNS, i.e. peptides are released into the tissue and carried to their endocrine cell targets by the circulation or via diffusion.

      Moreover, they have not demonstrated the binding of CCK to these cells. Indeed, no CCK receptor protein data are shown.

      Our revised manuscript will include detailed experiments showing the activation of the receptor by its ligand. Unfortunately, no antibody is available against this fish-specific receptor (one of the caveats of working with fish models); therefore, we cannot present receptor protein data.

      The calcium responses of FSH cells to exogenous CCK certainly suggest the presence of functional CCK receptors therein; but, the nature of the preparations (with all pituitary cell types present) does not demonstrate that CCK is acting directly in these cells.

      We agree with the reviewer that there are some disadvantages in choosing to work with a whole-tissue preparation. However, we believe that the advantages of working in a more physiological context far outweigh the drawbacks, as it reflects the natural dynamics more precisely. Since our transcriptome data, as well as our ISH staining, show that the CCK receptor is exclusively expressed on FSH cells, it is improbable that the observed calcium response is mediated via a different pituitary cell type.

      Indeed, the asynchrony in responses of individual FSH cells to CCK (Figure 4) suggests that not all cells may be activated in the same way. Contrast the response of LH cells to GnRH, where the onset of calcium signaling is similar across cells (Figure 3).

      The difference between the synchronization levels of LH and FSH cell activity stems from the gap-junction-mediated coupling between LH cells, which does not exist between FSH cells (Golan et al 2016, DOI: 10.1038/srep23777). Therefore, the onset of the calcium response in FSH cells depends on the irregular diffusion rate of the peptide within the preparation, whereas the tight homotypic coupling between LH cells generates a strong and synchronized calcium rise that propagates quickly throughout the entire population; we will make sure this is clear in the final revision.

      Finally, as the authors note in the Discussion, the data presented do not enable them to conclude that the endogenous CCK regulating FSH (assuming it does) is from the brain as opposed to other sources (e.g., the gut).

      We agree with the reviewer that, for now, we are unable to determine whether hypothalamic or peripheral CCK is the main driver of FSH cells. While the strong innervation of the gland by CCK-secreting hypothalamic neurons strengthens the notion of a hypothalamic-releasing hormone and also fits with the dogma of the neural control of the pituitary gland in fish (Ball, 1981; DOI: 10.1016/0016-6480(81)90243-4), more experiments are required to resolve this question.

      Reviewer #2 (Public Review):

      Summary:

      This manuscript builds on previous work suggesting that the CCK peptide is the releasing hormone for FSH in fishes, which is different than that observed in mammals where both LH and FSH release are under the control of GnRH. Based on data using calcium imaging as a readout for stimulation of the gonadotrophs, the researchers present data supporting the hypothesis that CCK stimulates FSH-containing cells in the pituitary. In contrast, LH-containing cells show a weak and variable response to CCK but are highly responsive to GnRH. Data are presented that support the role of CCK in the release of FSH. Researchers also state that functional overlap exists in the potency of GnRH to activate FSH cells, thus the two signalling pathways are not separate.

      The results are of interest to the field because for many years the assumption has been that fishes use the same signalling mechanism. These data present an intriguing variation where a hormone involved in satiation acts in the control of reproduction.

      Strengths:

      The strengths of the manuscript are that researchers have shed light on different pathways controlling reproduction in fishes.

      Weaknesses:

      Weaknesses are that it is not clear if multiple ligand/receptors are involved (more than one CCK and more than one receptor?). The imaging of the CCK terminals and CCK receptors needs to be reinforced.

      Reviewer consultation summary:

      • The data presented establish sufficiency, but not necessity of CCK in FSH regulation. The paper did not show that CCK endogenously regulates FSH in fish. This has not been established yet.

      This is a very important comment, also raised by reviewer 1. To avoid repetition, please see our detailed response to the comment above.

      • The paper presents the pharmacological effects of CCK on ex vivo preparations but does not establish the in vivo physiological function of the peptide. The current evidence for a novel physiological regulatory mechanism is incomplete and would require further physiological experiments. These could include the use of a CCK receptor antagonist in adult fish to see the effects on FSH and LH release, the generation of a CCK knockout, or cell-specific genetic manipulations.

      As detailed in the responses to the first reviewer, we cannot conduct conditional, cell-specific gene knockout in our model.

      • Zebrafish have two CCK ligands: ccka, cckb and also multiple receptors: cckar, cckbra and cckbrb. There is ambiguity about which CCK receptor and ligand are expressed and which gene was knocked out.

      In the revised manuscript, we will clarify which of the receptors are expressed and which receptor is targeted. We will also provide data showing the specificity of the receptors (both WT and mutant) to the ligands.

      • Blocking CCK action in fish (with receptor KO) affects FSH and LH. Therefore, the work did not demonstrate a selective role for CCK in FSH regulation in vivo and any claims to have discovered FSHRH need to be more conservative.

      We agree with the reviewer that the overlap in the effect of CCK measured in the calcium activation of cells and in the KO model does not allow us to conclude selectivity. In this context, it is crucial to highlight that CCK-R exhibits high expression on FSH cells but not on LH cells. Therefore, the effect of CCK on LH cells is likely paracrine rather than solely endocrine. We will tone down our claims of selectivity in the revised manuscript.

      • The labelling of the terminals with anti-CCK looks a lot like the background and the authors did not show a specificity control (e.g. anti-CCK antibody pre-absorbed with the peptide or anti-CCK in morphant/KO animals).

      We will update the colors of the image for better clarity. Also, the same antibody had been previously used to mark CCK-positive cells in the gut of the red drum fish (K.A. Webb, Jr. 2010; DOI: https://doi.org/10.1016/j.ygcen.2009.10.010), where a control experiment (antibody pre-absorbed with the peptide) had been conducted.

    2. eLife assessment

      This study presents valuable findings on the potential role of a peptide typically associated with feeding in the control of a pituitary hormone, FSH, which is a critical regulator of reproductive physiology. The evidence supporting the main claims of the authors is thought-provoking but incomplete. In particular, the authors demonstrate that the peptide is sufficient to regulate FSH, but they have not established its necessity. The work will be of interest to reproductive biologists, especially those with an interest in the endocrine control of fertility.

    3. Reviewer #1 (Public Review):

      Summary:

      The pituitary gonadotropins, FSH and LH, are critical regulators of reproduction. In mammals, synthesis and secretion of FSH and LH by gonadotrope cells are controlled by the hypothalamic peptide, GnRH. As FSH and LH are made in the same cells in mammals, variation in the nature of GnRH secretion is thought to contribute to the differential regulation of the two hormones. In contrast, in fish, FSH and LH are produced in distinct gonadotrope populations and may be less (or differently) dependent on GnRH than in mammals. In the present manuscript, the authors endeavored to determine whether FSH may be independently controlled by a distinct peptide, cholecystokinin (CCK), in zebrafish.

      Strengths:

      The authors demonstrated that the CCK receptor is enriched in FSH-producing relative to LH-producing gonadotropes, and that genetic deletion of the receptor leads to dramatic decreases in gonadotropin production and gonadal development in zebrafish. Also, using innovative in vivo and ex vivo calcium imaging approaches, they show that LH- and FSH-producing gonadotropes preferentially respond to GnRH and CCK, respectively. Exogenous CCK also preferentially stimulated FSH secretion ex vivo and in vivo.

      Weaknesses:

      The concept that there may be a distinct FSH-releasing hormone (FSHRH) has been debated for decades. As the authors suggest that CCK is the long-sought FSHRH (at least in fish), they must provide data that convincingly leads to such a conclusion. In my estimation, they have not yet met this burden. In particular, they show that CCK is sufficient to activate FSH-producing cells, but have not yet demonstrated its necessity. Their one attempt to do so was using fish in which they inactivated the CCK receptor using CRISPR-Cas9. While this manipulation led to a reduction in FSH, LH was affected to a similar extent. As a result, they have not shown that CCK is a selective regulator of FSH. Moreover, they do not yet demonstrate that the effects observed reflect the loss of the receptor's function in gonadotropes, as opposed to other cell types. It also is not clear whether the phenotypes of the fish reflect perturbations in pituitary development vs. a loss of CCK receptor function in the pituitary later in life. Ideally, the authors would attempt to block CCK signaling in adult fish that develop normally. For example, if CCK receptor antagonists are available, they could be used to treat fish and see whether and how this affects FSH vs. LH secretion.

      In the Discussion, the authors suggest that CCK, as a satiety factor, may provide a link between metabolism and reproduction. This is an interesting idea, but it is not supported by the data presented. That is, none of the results shown link metabolic state to CCK regulation of FSH and fertility. Absent such data, the lengthy discussion of the link is speculative and not fully merited.

      Also in the Discussion, the authors argue that "CCK directly controls FSH cells by innervating the pituitary gland and binding to specific receptors that are particularly abundant in FSH gonadotrophs." However, their imaging does not demonstrate innervation of FSH cells by CCK terminals (e.g., at the EM level). Moreover, they have not demonstrated the binding of CCK to these cells. Indeed, no CCK receptor protein data are shown. The calcium responses of FSH cells to exogenous CCK certainly suggest the presence of functional CCK receptors therein; but, the nature of the preparations (with all pituitary cell types present) does not demonstrate that CCK is acting directly in these cells. Indeed, the asynchrony in responses of individual FSH cells to CCK (Figure 4) suggests that not all cells may be activated in the same way. Contrast the response of LH cells to GnRH, where the onset of calcium signaling is similar across cells (Figure 3). Finally, as the authors note in the Discussion, the data presented do not enable them to conclude that the endogenous CCK regulating FSH (assuming it does) is from the brain as opposed to other sources (e.g., the gut).

    4. Reviewer #2 (Public Review):

      Summary:

      This manuscript builds on previous work suggesting that the CCK peptide is the releasing hormone for FSH in fishes, which is different than that observed in mammals where both LH and FSH release are under the control of GnRH. Based on data using calcium imaging as a readout for stimulation of the gonadotrophs, the researchers present data supporting the hypothesis that CCK stimulates FSH-containing cells in the pituitary. In contrast, LH-containing cells show a weak and variable response to CCK but are highly responsive to GnRH. Data are presented that support the role of CCK in the release of FSH. Researchers also state that functional overlap exists in the potency of GnRH to activate FSH cells, thus the two signalling pathways are not separate.

      The results are of interest to the field because for many years the assumption has been that fishes use the same signalling mechanism. These data present an intriguing variation where a hormone involved in satiation acts in the control of reproduction.

      Strengths:

      The strengths of the manuscript are that researchers have shed light on different pathways controlling reproduction in fishes.

      Weaknesses:

      Weaknesses are that it is not clear if multiple ligand/receptors are involved (more than one CCK and more than one receptor?). The imaging of the CCK terminals and CCK receptors needs to be reinforced.

      Reviewer consultation summary:

      - The data presented establish sufficiency, but not necessity of CCK in FSH regulation. The paper did not show that CCK endogenously regulates FSH in fish. This has not been established yet.

      - The paper presents the pharmacological effects of CCK on ex vivo preparations but does not establish the in vivo physiological function of the peptide. The current evidence for a novel physiological regulatory mechanism is incomplete and would require further physiological experiments. These could include the use of a CCK receptor antagonist in adult fish to see the effects on FSH and LH release, the generation of a CCK knockout, or cell-specific genetic manipulations.

      - Zebrafish have two CCK ligands: ccka, cckb and also multiple receptors: cckar, cckbra and cckbrb. There is ambiguity about which CCK receptor and ligand are expressed and which gene was knocked out.

      - Blocking CCK action in fish (with receptor KO) affects FSH and LH. Therefore, the work did not demonstrate a selective role for CCK in FSH regulation in vivo and any claims to have discovered FSHRH need to be more conservative.

      - The labelling of the terminals with anti-CCK looks a lot like the background and the authors did not show a specificity control (e.g. anti-CCK antibody pre-absorbed with the peptide or anti-CCK in morphant/KO animals).

    1. eLife assessment

      This study provides evidence that the quality of research in female-dominated fields of research is systematically undervalued by the research community. The authors' findings are based on analyses of data from a research assessment exercise in New Zealand and data on funding success rates in Australia, Canada, the European Union and the United Kingdom. This work is an important contribution to the discourse on gender biases in academia, underlining the pervasive influence of gender on whole fields of research, as well as on individual researchers. The evidence supporting the conclusions is solid, but the work would benefit from further explorations into the nuances of specific fields of research.

    2. Reviewer #2 (Public Review):

      Summary:

      The authors used four datasets spanning 30 countries to examine funding success and research quality score for various disciplines. They examined whether funding or research quality score were influenced by majority gender of the discipline and whether these affected men, women, or both within each discipline. They found that disciplines dominated by women have lower funding success and research quality score than disciplines dominated by men. These findings are surprising because even the men in women-dominated fields experienced lower funding success and research quality score.

      Strengths:

      - The authors utilized a comprehensive dataset covering 30 countries to explore the influence of the majority gender in academic disciplines on funding success and research quality scores.

      - Findings suggest a systemic issue where disciplines with a higher proportion of women have lower evaluations and funding success for all researchers, regardless of gender.

      - The manuscript is notable for its large sample size and the diverse international scope, enhancing the generalizability of the results.

      - The work accounts for various factors including age, number of research outputs, and bibliometric measures, strengthening the validity of the findings.

      - The manuscript raises important questions about unconscious bias in research evaluation and funding decisions, as evidenced by lower scores in women-dominated fields even for researchers that are men.

      - The study provides a nuanced view of gender bias, showing that it is not limited to individuals but extends to entire disciplines, impacting the perception, funding, and perceived quality or worth of research.

      - This work underscores the need to explore motivations behind gender distribution across fields, hinting at deep-rooted societal and institutional barriers.

      - The authors have opened a discussion on potential solutions to counter bias, like adjusting funding paylines or anonymizing applications, or other practical solutions.

      - While pointing out limitations such as the absence of data from major research-producing countries, the manuscript paves the way for future studies to examine whether its findings are universally applicable.

      Weaknesses:

      - The study does not provide data on the gender of grant reviewers or stakeholders, which could be critical for understanding potential unconscious bias in funding decisions. These data are likely not available; however, this could be discussed. Are grant reviewers in fields dominated by women more likely to be women?

      - There could be more exploration into whether the research quality score is influenced by inherent biases towards disciplines themselves, rather than only being gender bias.

      - The manuscript should discuss how non-binary gender identities were addressed in the research. There is an opportunity to understand the impact on this group.

      - A significant limitation is the absence of data from other major research-producing countries like China and the United States, raising questions about the generalizability of the findings. How comparable are the findings observed to these other countries?

      - The motivations and barriers that drive gender distribution in various fields could be expanded on. Are fields striving to reach gender parity through hiring or other mechanisms?

      - The authors could consider if the size of funding awards correlates with research scores, potentially overlooking a significant factor in the evaluation of research quality. Presumably there is less data on smaller 'pilot' funds and startup funds for disciplines where these are more common. Would funding success follow the same trend for these types of funds?

      - The language used in the manuscript at times may perpetuate bias, particularly when discussing "lower quality disciplines," which could influence the reader's perception of certain fields.

      - The manuscript does not clarify how many gender identities were represented in the datasets or how gender identity was determined, potentially conflating gender identity with biological sex.

    3. Reviewer #3 (Public Review):

      This study seeks to investigate one aspect of disparity in academia: how gender balance in a discipline is valued in terms of evaluated research quality score and funding success. This is important in understanding disparities within academia.

      This study uses publicly available data to investigate covariation between gender balance in an academic discipline and:

      i) Individual research quality scores of New Zealand academics as evaluated by one of 14 broader subject panels.

      ii) Funding success in Australia, Canada, Europe, UK.

      The study would benefit from further discussion of its limitations, and from the clarification of some technical points (as described in the recommendations for the authors).

    1. Reviewer #3 (Public Review):

      Summary:

      The goal of this paper is to characterize an anti-diuretic signaling system in insects using Drosophila melanogaster as a model. Specifically, the authors wished to characterize a role of ion transport peptide (ITP) and its isoforms in regulating diverse aspects of physiology and metabolism. The authors combined genetic and comparative genomic approaches with classical physiological techniques and biochemical assays to provide a comprehensive analysis of ITP and its role in regulating fluid balance and metabolic homeostasis in Drosophila. The authors further characterized a previously unrecognized role for Gyc76C as a receptor for ITPa, an amidated isoform of ITP, and in mediating the effects of ITPa on fluid balance and metabolism. The evidence presented in favor of this model is very strong as it combines multiple approaches and employs ideal controls. Taken together, these findings represent an important contribution to the field of insect neuropeptides and neurohormones and have strong relevance for other animals.

      Strengths:

      Many approaches are used to support their model. Experiments were well-controlled, used appropriate statistical analyses, and were interpreted properly and without exaggeration.

      Weaknesses:

      No major weaknesses were identified by this reviewer. More evidence to support their model would be gained by using a loss-of-function approach with ITPa, and by providing more direct evidence that Gyc76C is the receptor that mediates the effects of ITPa on fat metabolism. However, these weaknesses do not detract from the overall quality of the evidence presented in this manuscript, which is very strong.

    2. eLife assessment

      This important study provides a comprehensive analysis of ITP and its role as an anti-diuretic and metabolic hormone in Drosophila. The evidence supporting the conclusion is solid in general with combined genetic, comparative genomic approaches, classical physiological techniques, and biochemical assays. However, the evidence of direct binding between ITPa and Gyc76C and their physiological functions is incomplete. This work represents a contribution to the field of neuropeptides and neurohormones in insects and other animals.

    3. Reviewer #1 (Public Review):

      Summary:

      In Drosophila melanogaster, ITP has functions in feeding, drinking, metabolism, excretion, and circadian rhythm. In the current study, the authors characterized and compared the expression of all three ITP isoforms (ITPa and ITPL1&2) in the CNS and peripheral tissues of Drosophila. An important finding is that they functionally characterized and identified Gyc76C as an ITPa receptor in Drosophila using both in vitro and in vivo approaches. In vitro, the authors nicely confirmed that the inhibitory function of recombinant Drosophila ITPa on MT secretion is Gyc76C-dependent (knockdown of Gyc76C specifically in two types of cells abolished the anti-diuretic action of Drosophila ITPa on renal tubules). They also used a combination of multiple approaches to investigate the roles of ITPa and Gyc76C in modulating osmotic and metabolic homeostasis in vivo. They revealed that ITPa signaling to renal tubules and fat body modulates osmotic and metabolic homeostasis via Gyc76C.

      Furthermore, they tried to identify the upstream and downstream of ITP neurons in the nervous system by using connectomics and single-cell transcriptomic analysis. I found this interesting manuscript to be well-written and described. The findings in this study are valuable to help understand how ITP signals work on systemic homeostasis regulation. Both anatomical and single-cell transcriptome analysis here should be useful to many in the field.

      Strengths:

      - The question that this study tries to address (what are the receptors of ITPa in Drosophila) is important. The authors ruled out the Bombyx ITPa receptor orthologs as potential candidates. They identified a novel ITP receptor by using phylogenetic and anatomical analyses, and both in vitro and in vivo approaches.

      - The authors exhibited detailed anatomical data of both ITP isoforms and Gyc76C (in the main and supplementary figures), which helped audiences understand the expression of the neurons studied in the manuscript.

      - They also performed connectomic and single-cell transcriptomic analyses to study the synaptic and peptidergic connectivity of ITP-expressing neurons. This provided more information for better understanding and further study of systemic homeostasis modulation.

      Weaknesses:

      In the discussion section, the authors raised the limitations of the current study, which I mostly agree with, such as the lack of verification of direct binding between ITPa and Gyc76C, even though they provided different data to support that ITPa-Gyc76C signaling pathway regulates systemic homeostasis in adult flies.

    4. Reviewer #2 (Public Review):

      Summary:

      The physiology and behaviour of animals are regulated by a huge variety of neuropeptide signalling systems. In this paper, the authors focus on the neuropeptide ion transport peptide (ITP), which was first identified and named on account of its effects on the locust hindgut (Audsley et al. 1992). Using Drosophila as an experimental model, the authors have mapped the expression of three different isoforms of ITP (Figures 1, S1, and S2), all of which are encoded by the same gene.

      The authors then investigated candidate receptors for isoforms of ITP. Firstly, Drosophila orthologs of G-protein coupled receptors (GPCRs) that have been reported to act as receptors for ITPa or ITPL in the insect Bombyx mori were investigated. Importantly, the authors report that ITPa does not act as a ligand for the GPCRs TkR99D and PK2-R1 (Figure S3). Therefore, the authors investigated other putative receptors for ITPs. Informed by a previously reported finding that ITP-type peptides cause an increase in cGMP levels in cells/tissues (Dircksen, 2009, Nagai et al., 2014), the authors investigated guanylyl cyclases as candidate receptors for ITPs. In particular, the authors suggest that Gyc76C may act as an ITP receptor in Drosophila.

      Evidence that Gyc76C may be involved in mediating effects of ITP in Bombyx was first reported by Nagai et al. (2014) and here the authors present further evidence, based on a proposed concordance in the phylogenetic distribution of ITP-type neuropeptides and Gyc76C (Figure 2). Having performed detailed mapping of the expression of Gyc76C in Drosophila (Figures 3, S4, S5, S6), the authors then investigated if Gyc76C knockdown affects the bioactivity of ITPa in Drosophila. The inhibitory effect of ITPa on leucokinin- and diuretic hormone-31-stimulated fluid secretion from Malpighian tubules was found to be abolished when expression of Gyc76C was knocked down in stellate cells and principal cells, respectively (Figure 4). However, as discussed below, this does not provide proof that Gyc76C directly mediates the effect of ITPa by acting as its receptor. The effect of Gyc76C knockdown on the action of ITPa could be an indirect consequence of an alteration in cGMP signalling.

      Having investigated the proposed mechanism of ITPa in Drosophila, the authors then investigated its physiological roles at a systemic level. In Figure 5 the authors present evidence that ITPa is released during desiccation and, accordingly, overexpression of ITPa increases survival when animals are subjected to desiccation. Furthermore, knockdown of Gyc76C in stellate or principal cells of Malpighian tubules decreases survival when animals are subjected to desiccation. However, whilst this is correlative, it does not prove that Gyc76C mediates the effects of ITPa. The authors investigated the effects of knockdown of Gyc76C in stellate or principal cells of Malpighian tubules on (i) survival when animals are subjected to salt stress and (ii) time taken to recover from chill coma. It is not clear, however, why animals over-expressing ITPa were not also tested for (i) survival under salt stress and (ii) time taken to recover from chill coma. In Figures 6 and S8, the authors show the effects of Gyc76C knockdown in the female fat body on metabolism, feeding-associated behaviours and locomotor activity, which are interesting. Furthermore, the relevance of the phenotypes observed to potential in vivo actions of ITPa is explored in Figure 7. The authors conclude that "increased ITPa signaling results in phenotypes that largely mirror those seen following Gyc76C knockdown in the fat body, providing further support that ITPa mediates its effects via Gyc76C." Use of the term "largely mirror" seems inappropriate here because there are opposing effects, e.g. decreased starvation resistance in Figure 6A versus increased starvation resistance in Figure 7A. Furthermore, as discussed above, the results of these experiments do not prove that the effects of ITPa are mediated by Gyc76C because the effects reported here could be correlative, rather than causative.

      Lastly, in Figures 8, S9, and S10 the authors analyse publicly available connectomic data and single-cell transcriptomic data to identify putative inputs and outputs of ITPa-expressing neurons. These data are a valuable addition to our knowledge of ITPa-expressing neurons, but they do not address the core hypothesis of this paper - namely that Gyc76C acts as an ITPa receptor.

      Strengths:

      (1) The main strengths of this paper are (i) the detailed analysis of the expression and actions of ITP and the phenotypic consequences of over-expression of ITPa in Drosophila, and (ii) the detailed analysis of the expression of Gyc76C and the phenotypic consequences of knockdown of Gyc76C expression in Drosophila.

      (2) Furthermore, the paper is generally well-written and the figures are of good quality.

      Weaknesses:

      (1) The main weakness of this paper is that the data obtained do not prove that Gyc76C acts as a receptor for ITPa. Therefore, the following statement in the abstract is premature: "Using a phylogenetic-driven approach and the ex vivo secretion assay, we identified and functionally characterized Gyc76C, a membrane guanylate cyclase, as an elusive Drosophila ITPa receptor." Further experimental studies are needed to determine if Gyc76C acts as a receptor for ITPa. In the section of the paper headed "Limitations of the study", the authors recognise this weakness. They state "While our phylogenetic analysis, anatomical mapping, and ex vivo and in vivo functional studies all indicate that Gyc76C functions as an ITPa receptor in Drosophila, we were unable to verify that ITPa directly binds to Gyc76C. This was largely due to the lack of a robust and sensitive reporter system to monitor mGC activation." It is not clear what the authors mean by "the lack of a robust and sensitive reporter system to monitor mGC activation". The discovery of mGCs as receptors for ANP in mammals was dependent on the use of assays that measure GC activity in cells (e.g. by measuring cGMP levels in cells). Furthermore, more recently cGMP reporters have been developed. The use of such assays is needed here to investigate directly whether Gyc76C acts as a receptor for ITPa. In summary, insufficient evidence has been obtained to conclude that Gyc76C acts as a receptor for ITPa. Therefore, I think there are two ways forward, either:

      (a) The authors obtain additional biochemical evidence that ITPa is a ligand for Gyc76C.

      or

      (b) The authors substantially revise the conclusions of the paper (in the title, abstract, and throughout the paper) to state that Gyc76C MAY act as a receptor for ITPa, but that additional experiments are needed to prove this.

      (2) The authors state in the abstract that a phylogenetic-driven approach led to their identification of Gyc76C as a candidate receptor for ITPa. However, there are weaknesses in this claim. Firstly, the hypothesis that Gyc76C may be involved in mediating effects of ITPa was first proposed ten years ago by Nagai et al. (2014), so this surely was the primary basis for investigating this protein. Nevertheless, investigating if there is correspondence in the phylogenetic distribution of ITP-type and Gyc76C-type genes/proteins is a valuable approach to addressing this issue. Unfortunately, the evidence presented is rather limited in scope. Essentially, the authors report that they only found ITP-type and Gyc76C-type genes/proteins in protostomes, but not in deuterostomes. What is needed is a more fine-grained analysis at the species level within the protostomes. Thus, are there protostome species in which both ITP-type and Gyc76C-type genes/proteins have been lost? Furthermore, are there any protostome species in which an ITP-type gene is present but a Gyc76C-type gene is absent, or vice versa? If there are such species, this would argue against Gyc76C being a receptor for ITPa. In this regard, it is noteworthy that in Figure 2A there are two ITP-type precursors in C. elegans, but there are no Gyc76C-type proteins shown in the tree in Figure 2B. Thus, what is needed is a more detailed analysis of protostomes to investigate if there really is correspondence in the phylogenetic distribution of Gyc76C-type and ITP-type genes at the species level.

      (3) The manuscript would benefit from a more comprehensive overview and discussion of published literature on Gyc76C in Drosophila, both as a basis for this study and for interpretation of the findings of this study.

    1. eLife assessment

      In this study, the authors developed a cell-based screening assay for the identification of small molecule inhibitors of nonsense-mediated decay (NMD). They used it to validate a novel small molecule SMG1 kinase inhibitor that inhibits NMD in cultured cells leading to the expression of neoantigens from NMD-targeted genes, and in vivo slows tumor growth of cells with a significant number of out-of-frame indel mutations. The conclusions are supported by convincing evidence, and the significance of this work consists in the development of a novel and very promising NMD inhibitor drug that acts as an inhibitor of the SMG1 NMD kinase and is suitable for use in animals. This is an important advance for the field, as previous NMD inhibitors were not specific, lacked efficacy, or were very toxic and hence not suitable for animal application.

    2. Reviewer #1 (Public Review):

      Summary:

      This work identified new NMD inhibitors and tested them for cancer treatment, based on the hypothesis that inhibiting NMD could lead to the production of cancer neoantigens from the stabilized mutant mRNAs, thereby enhancing the immune system's ability to recognize and kill cancer cells. Key points of the study include:

      • Development of an RNA-seq based method for NMD analysis using mixed isogenic cells that express WT or mutant transcripts of STAG2 and TP53 with engineered truncation mutations.

      • Application of this method in a drug screen, which identified several potential NMD inhibitors.

      • Demonstration that one of the identified compounds, LY3023414, inhibits NMD by targeting the SMG1 protein kinase in the NMD pathway in cultured cells and mouse xenografts.

      • Due to the in vivo toxicity observed for LY3023414, the authors developed 11 new SMG1 inhibitors (KVS0001-KVS0011) based on the structures of the known SMG1 inhibitor SMG1i-11 and the SMG1 protein itself.

      • Among these, KVS0001 stood out for its high potency, excellent bioavailability, and low toxicity in mice. Treatment with KVS0001 caused NMD inhibition and increased presentation of neoantigens on MHC-I molecules, resulting in the clearance of cancer cells in vitro by co-cultured T cells and cancer xenografts in mice by the immune system.

      These findings support the strategy of targeting the NMD pathway for cancer treatment and provide new research tools and potential lead compounds for further exploration.

      Strengths:

      The RNA-seq-based NMD analysis, using isogenic cell lines with specific NMD-inducing mutations, represents a novel approach for the high-throughput identification of potential NMD modulators or genetic regulators. The effectiveness of this method is exemplified by the identification of a new activity of AKT1/mTOR inhibitor LY3023414 in inhibiting NMD.

      The properties of KVS0001 described in the manuscript as a novel SMG1 inhibitor suggest its potential as a lead compound for further testing the NMD-targeting strategies in cancer treatment. Additionally, this compound may serve as a useful research tool.

      The results of the in vitro cell killing assay and in vivo xenograft experiments in both immunocompetent and immunodeficient mice indicate that inhibiting NMD could be a viable therapeutic strategy for certain cancers.

      Weaknesses:

      The authors did not address the potential effects of NMD/SMG1 inhibitors on RNA splicing. Given that the transcripts of many RNA-binding proteins are natural targets of NMD, inhibiting NMD could significantly alter splicing patterns. This, in turn, might influence the outcomes of the RNA-seq-based method for NMD analysis and result interpretation.

      While the RNA-seq-based approach offers several advantages for analyzing NMD, the effects of NMD/SMG1 inhibitors observed through this method should be confirmed using established NMD reporters. This step is crucial to rule out the possibility that mutations in STAG2 or TP53 affect NMD in cells, as well as to address potential clonal variations between different engineered cell lines.

      The results from the SMG1/UPF1 knockdown and SMG1i-11 experiments presented in Figure 3 correlate with the effects seen for LY3023414, but they do not conclusively establish SMG1 as the direct target of LY3023414 in NMD inhibition. An epistatic analysis with LY3023414 and SMG1-knockdown is needed.

    3. Reviewer #2 (Public Review):

      Summary:

      Several publications during the past years provided evidence that NMD protects tumor cells from being recognized by the immune system by suppressing the display of neoantigens, and hence NMD inhibition is emerging as a promising anti-cancer approach. However, the lack of an efficacious and specific small-molecule NMD inhibitor with suitable pharmacological properties is currently a major bottleneck in the development of therapies that rely on NMD inhibition. In this manuscript, the authors describe their screen for identifying NMD inhibitors, which is based on isogenic cell lines that either express wild-type or NMD-sensitive transcript isoforms of p53 and STAG2. Using this setup, they screened a library of 2658 FDA-approved or late-phase clinical trial drugs and obtained 8 hits. Among them they further characterized LY3023414, showing that it inhibits NMD in cultured cells and in a mouse xenograft model, where it, however, was very toxic. Because LY3023414 was originally developed as a PI3K inhibitor, the authors claim that it inhibits NMD by inhibiting SMG1. While this is most likely true, the authors do not provide experimental evidence for this claim. Instead, they use this statement to switch their attention to another previously developed SMG1 inhibitor (SMG1i-11), of which they design and test several derivatives. Of these derivatives, KVS0001 showed the best pharmacological behavior. It upregulated NMD-sensitive transcripts in cultured cells and the xenograft mouse model and two predicted neoantigens could indeed be detected by mass spectrometry when the respective cells were treated with KVS0001. A bispecific antibody targeting T cells to a specific antigen-HLA complex led to increased IFN-gamma release and killing of cancer cells expressing this antigen-HLA complex when they were treated with KVS0001. Finally, the authors show that renal (RENCA) or lung cancer cells (LLC) were significantly inhibited in tumor growth in immunocompetent mice treated with KVS0001. Overall, this establishes KVS0001 as a novel and promising anti-cancer drug that by inhibiting SMG1 (and therewith NMD) increases the neoantigen production in the cancer cells and reveals them to the body's immune system as "foreign".

      Strengths:

      The novelty and significance of this work consists in the development of a novel and - judging from the presented data - very promising NMD inhibiting drug that is suitable for applications in animals. This is an important advance for the field, as previous NMD inhibitors were not specific, lacked efficacy, or were very toxic and hence not suitable for animal application. It will still be a long way with many challenges ahead towards an efficacious NMD inhibitor that is safe for use in humans, but KVS0001 appears to be a molecule that bears promise for follow-up studies. In addition, while the idea of inhibiting NMD to trigger neoantigen production in cancer cells and so reveal them to the immune system has been around for quite some time, this work provides ample and compelling support for the feasibility of this approach, at least for tumors with a high mutational burden.

      Main weaknesses:

      There is a disconnect between the screen and the KVS0001 compound that the authors describe and test in the second part of the manuscript, since KVS0001 is a derivative of the SMG1 inhibitors developed by Gopalsamy et al. in 2012 and not of the lead compound identified in the screen (LY3023414). Because of high toxicity in the mouse xenograft experiments, the authors did not follow up LY3023414 but instead switched to the published SMG1i-11 drug of Gopalsamy and colleagues, a molecule that is widely used among NMD researchers for NMD inhibition in cultured cells. Therefore, in my view, the description of the screen is obsolete, and the paper could just start with the optimization of the pharmacological properties of SMG1i-11 and the characterization of KVS0001. Even though the screen is based on an elegant setup and was executed successfully, it was ultimately a failure as it didn't reveal a useful lead compound that could be further optimized.

      Additional points:

      - Compared to SMG1i-11, KVS0001 seems less potent in inhibiting SMG1 (higher IC50). It would therefore be important to also compare the specificity of both drugs for SMG1 over other kinases at the applied concentrations (1 uM for SMG1i-11, 5 uM for KVS0001). The Kinativ Assay (Fig. S13) was performed with 100 nM KVS0001, which is 50-fold less than the concentration used for functional assays and hence not really meaningful. In addition, more information on the pharmacokinetic properties and toxicology of KVS0001 would allow a better judgment of the potential of this molecule as a future therapeutic agent.

      - On many figures, the concentrations of the used drugs are missing. Please ensure that for every experiment that includes drugs, the drug concentration is indicated.

      - Do the authors have an explanation for why LY3023414 has a much stronger effect on the p53 than on the STAG2 nonsense allele (Figure 1B, S8), whereas emetine upregulates the STAG2 nonsense alleles more than the p53 nonsense allele (Figure S5)? I find this curious, but the authors do not comment on it.

      - While it is a strength of the study that the NMD inhibitors were validated on many different truncation mutations in different cell lines, it would help readers if a table or graphic illustration was included that gives an overview of all mutant alleles tested in this study (which gene, type of mutation, in which cell type). In the current version, this information is scattered throughout the manuscript.

      - Lines 194 and 302: That SMG1i-11 was highly insoluble in the hands of the authors is surprising. It is unclear why they used variant 11j, since variant 11e of this inhibitor is widely used among NMD researchers and readily dissolves in DMSO.

      - Line 296: The authors claim that they were able to show that LY3023414 inhibited the SMG1 kinase, which is not true. To show this, they would have to show, for example, that LY3023414 prevents SMG1-mediated UPF1 phosphorylation, as they did for KVS0001 and SMG1i-11 in Fig. 3F. Unless the authors provide this data, the statement should be deleted or modified.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      The authors have addressed my comments. As a final minor point, regarding comment 2, these condensates are likely viscoelastic rather than purely viscous. It is prudent to indicate that the data may refer to an apparent viscosity.

      We added the following text to the manuscript to highlight the viscoelastic nature of ELP condensates, and the relationship of reported values with the steady state viscosity. “It is worth noting that the reported values, although related, may not quantitatively represent the steady-state viscosity. This discrepancy arises from the slow relaxation timescale inherent in ELP condensates with viscoelastic properties.”

    2. eLife assessment

      This important study investigates the structural organization of a series of diblock elastin-like polypeptide condensates. The methodology is highly compelling, as it combines multiscale simulations and fluorescence lifetime imaging microscopy experiments. The results increase our understanding of model biomolecular condensates.

    3. Reviewer #1 (Public Review):

      This is an interesting, informative, and well-designed study that combines theoretical and experimental methodologies to tackle the phenomenon of higher-resolution structures/substructures in model biomolecular condensates.

      The authors have adequately addressed my previous concerns.

    4. Reviewer #2 (Public Review):

      Summary:

      Latham A.P. et al. apply simulations and FLIM to analyse several di-block elastin-like polypeptides and connect their sequence to the micro-structure of coacervates resulting from their phase-separation.

      Strengths:

      Understanding the molecular grammar of phase separating proteins and the connection with mesoscale properties of the coacervates is highly relevant. This work provides insights into micro-structures of coacervates resulting from di-block polypeptides.

      Weaknesses:

      The results apply to a very specific architecture (di-block polypeptides) with specific sequences.

    1. eLife assessment

      This work describes an easily implemented method for measuring solid food intake in Drosophila, which is necessary for studying the consumption of experimentally challenging diets, such as high-fat foods, as well as their nutritional impacts on the organism. It is a valuable technical contribution with solid evidence supporting the conclusions, filling a significant gap in the field.

    2. Reviewer #1 (Public Review):

      Summary:

      Thakare et al propose a gravimetric method to evaluate feeding from solid food in Drosophila adults that can be used to evaluate the nutritional impact of high-fat food.

      Strengths:

      This method is new and fills a gap in the methods used in Drosophila research.

      Weaknesses:

      The data presented address a number of questions that are mainly interesting for people needing to reproduce such experiments. The work could be improved by being presented within a broader scope.

    3. Reviewer #2 (Public Review):

      Summary:

      Thakare et al. present the DIETS assay for quantifying food consumption in adult Drosophila. DIETS measures food intake by weighing fly food before and after feeding. Technically, this is a well-designed, executed, and analyzed study. The interpretations are generally conservative and justified by the results. Although the results aren't always consistent with other published studies, which might reflect some of the unique conditions of the DIETS assay, the technique can clearly distinguish between some expected differences in food intake. Although lifespan is shortened in the DIETS chamber, the method seems robust for various time scales up to a week. DIETS adds another useful and versatile tool for fly researchers interested in studying feeding behavior.

      Strengths:

      The authors test various conditions, including food presentation, surface area, and humidity (by changing the food cup distance to an agar base) to demonstrate an optimized technique for quantifying consumption. Under these conditions, evaporation is generally limited to <10%.

      The authors use DIETS to validate diverse feeding paradigms, including the published effects of temperature, food dilution, and intermittent fasting on food intake.

      Weaknesses:

      The studies to optimize and test the DIETS assay are technically rigorous and well-designed. However, the results reveal some weaknesses or potential caveats of the assay. As highlighted below, how much nutrition flies are actually obtaining may be misestimated due to vapor diffusion and crowding/competition for food. This appears largely acceptable, though, since the 'group' measurement can still clearly distinguish between expected feeding differences under different conditions, but it likely reduces accuracy, which may be important in some studies, and probably nullifies the effectiveness of using DIETS to restrict caloric intake.

      It is my understanding that flies suck out nutrients from the medium, leaving behind the agar/cornmeal matrix. This seems consistent with the images in Figure S2B, where the spheroidal medium in the food cup maintains its shape as it shrinks, but there don't seem to be any pits or holes from fly consumption. Given that flies in DIETS consume a significant portion of the available food, it seems that the food concentration at the medium surface may be changing throughout the experiment. This may also make it challenging to use other common fly food ingredients, such as cornmeal, much of which is indigestible.

      Similarly, vapor diffusion is expected between the agar bed and food cup (which the authors observed; in line 385), which may further affect assay accuracy, especially in comparisons between foods with different osmolarity.

      In DIETS, increased feeding is observed with increased flies per chamber, but this is not observed in other techniques, such as EX-Q (Wu et al. 2020). It is unclear whether sensitivity to adult density is a DIETS-specific feature, or if adult density instead directly affects food intake estimates using DIETS (e.g., by affecting chamber humidity).

      In another example, there is a ~300% difference in absolute feeding when the DIETS food cup is presented in different formats (Figure 3C). Again, it is unclear whether food presentation has an inherently greater effect in DIETS, or if the measurements themselves are highly sensitive to the environment.

      Although the control of total food mass given to the animals is a novel feature of the assay, the likely differences in nutrient intake between individuals (and shortened lifespan) in a DIETS chamber make this a challenging method to use to study caloric restriction. The shortened lifespan likely stems from the high adult density per vial, which has been explored in previous publications (e.g., Pearl in the 1920s; Mueller in the 1990s).

    1. eLife assessment

      This study reports some useful information on the mechanisms by which a high-fat diet induces arrhythmias in the model organism Drosophila. Specifically, the authors propose that adipokinetic hormone (Akh) secretion is increased with this diet, and through binding of Akh to its receptor on cardiac neurons, arrhythmia is induced. The presented data, however, incompletely support the conclusions, with a number of concerns identified, such as the need for editorial clarifications, issues with experimental design (including additional control experiments), and over or misinterpretation of some of the experimental data. Nonetheless, some of the data will be helpful to those who wish to extend the research to a more complex model system, such as the mouse.

    2. Reviewer #1 (Public Review):

      Summary:

      In the manuscript submission by Zhao et al. entitled, "Cardiac neurons expressing a glucagon-like receptor mediate cardiac arrhythmia induced by high-fat diet in Drosophila" the authors assert that cardiac arrhythmias in Drosophila on a high-fat diet are due in part to adipokinetic hormone (Akh) signaling activation. High-fat diet induces Akh secretion from activated endocrine neurons, which activate AkhR in posterior cardiac neurons. Silencing or deletion of Akh or AkhR blocks arrhythmia in Drosophila on a high-fat diet. Elimination of one of two AkhR-expressing cardiac neurons results in arrhythmia similar to a high-fat diet.

      Strengths:

      The authors propose a novel mechanism for high-fat diet-induced arrhythmia utilizing the Akh signaling pathway that signals to cardiac neurons.

      Weaknesses:

      Major comments:

      (1) The authors state, "Arrhythmic pathology is rooted in the cardiac conduction system." This assertion is incorrect as a blanket statement on arrhythmias. Certain arrhythmias are attributable to the conduction system, such as bradycardic rhythms, heart block, sinus node reentry, inappropriate sinus tachycardia, AV nodal reentrant tachycardia, bundle branch reentry, fascicular ventricular tachycardia, or idiopathic ventricular fibrillation, to name a few. However, the etiological mechanisms of many atrial and ventricular arrhythmias, such as atrial fibrillation or substrate-based ventricular tachycardia, are not rooted in the conduction system. The introduction should be revised to reflect a clear focus on atrial fibrillation (AF). In addition, AF susceptibility is known to be modulated by autonomic tone, which is topically relevant to this manuscript.

      (2) The authors state that "HFD led to increased heartbeat and an irregular rhythm." In representative examples shown, HFD resulted in pauses, slower heart rate, and increased irregularity in rhythm but not consistently increased heart rate (Figures 1B, 3A, and 4C). Based on the cited work by Ocorr et al (https://doi.org/10.1073/pnas.0609278104), Drosophila heart rate is highly variable with periods of fast and slow rates, which the authors attributed to neuronal and hormonal inputs. Ocorr et al then describe the use of "semi-intact" flies to remove autonomic input to normalize heart rate. Were semi-intact flies used? If not, how was heart rate variability controlled? And how was heart rate "increase" quantified in high-fat diet compared to normal-fat diet? Lastly, how does one measure "arrhythmia" when there is so much heart rate variability in normal intact flies?

      (3) The authors state, "to test whether the HFD-induced increase in Akh in the APC affects APC neuron activity, we used CaLexA (https://doi.org/10.3109/01677063.2011.642910)." According to the reference, CaLexA is a tool to map active neurons and would not indicate, as the authors state, whether Akh affects APC neuron activity specifically. It is equally possible that APC neurons may be activated by HFD and produce more Akh. Please clarify this language.

      (4) Are the AkhR+ neurons parasympathetic or sympathetic? Please provide additional experimentation that characterizes these neurons. The AkhR+ neurons appear to be anti-arrhythmic. Please expand the discussion to include a working hypothesis of the overall findings on Akh, AkhR, and AkhR+ neurons.

      (5) The authors state, "Heart function is dependent on glucose as an energy source." However, the heart's main energy source is fatty acids with minimal use of glucose (doi: 10.1016/j.cbpa.2006.09.014). Glucose becomes more utilized by cardiomyocytes under heart failure conditions. Please amend/revise this statement.

    3. Reviewer #2 (Public Review):

      This manuscript explores mechanisms underlying heart contractility problems in metabolic disease using Drosophila as a model. They confirm, as others have demonstrated, that a high-fat diet (HFD) induces cardiac problems in flies. They showed that a high-fat diet increased Akh mRNA levels and calcium levels in the Akh-producing cells (APC), suggesting there is increased production and release of this hormone in a HFD context. When they knock down Akh production in the APCs using RNAi they see that cardiac contractility problems are abolished. They similarly show that levels of the Akh receptor (Akhr) are increased on a HFD and that loss of Akhr also rescues contractility problems on a HFD.

      One highlight of the paper was the identification of a pair of neurons that express a receptor for the metabolic hormone Akh, and showing initial data that these neurons innervate the cardiac muscle. They then overexpress cell death gene reaper (rpr) in all Akhr-positive cells with Akhr-GAL4 and see that cardiac contractility becomes abnormal.

      However, this paper contains several findings that have been reported elsewhere and it contains key flaws in both experimental design and data interpretation. There is some rationale for doing the experiments, and the data and images are of good quality. However, others have shown that HFD induces cardiac contractility problems (Birse 2010), that Akh mRNA levels are changed with HFD (Liao 2021), and that Akh modulates cardiac rhythms (Noyes 1995), so Figures 1-4 are largely a confirmation of what is already known. This limits the overall magnitude of the advances presented in these figures. Overall, the stated concerns limit the impact of the manuscript in advancing our understanding of heart contractility.

    4. Reviewer #3 (Public Review):

      Zhao et al. provide new insights into the mechanism by which a high-fat diet (HFD) induces cardiac arrhythmia employing Drosophila as a model. HFD induces cardiac arrhythmia in both mammals and Drosophila. Both glucagon and its functional equivalent in Drosophila Akh are known to induce arrhythmia. The study demonstrates that Akh mRNA levels are increased by HFD and both Akh and its receptor are necessary for high-fat diet-induced cardiac arrhythmia, elucidating a novel link. Notably, Zhao et al. identify a pair of AKH receptor-expressing neurons located at the posterior of the heart tube. Interestingly, these neurons innervate the heart muscle and form synaptic connections, implying their roles in controlling the heart muscle. The study presented by Zhao et al. is intriguing, and the rigorous characterization of the AKH receptor-expressing neurons would significantly enhance our understanding of the molecular mechanism underlying HFD-induced cardiac arrhythmia.

      Many experiments presented in the manuscript are appropriate for supporting the conclusions, while additional controls and precise quantifications should help strengthen the authors' arguments. The key results obtained by loss of Akh (or AkhR) and genetic elimination of the identified AkhR-expressing cardiac neurons do not reconcile, complicating the overall interpretation.

      It is intriguing to see an increase in Akh mRNA levels in HFD-fed animals. This is a key result for linking HFD-induced arrhythmia to Akh. Thus, demonstrating that HFD also increases the Akh protein levels and Akh is secreted more should significantly strengthen the manuscript.

      The experiments employing an AkhR null allele nicely demonstrate its requirement for HFD-induced cardiac arrhythmia. Depletion of Akh in Akh-expressing cells recapitulates the consequence of AkhR knockout, supporting that both Akh and its receptor are required for HFD-induced cardiac arrhythmia. Given that RNAi is associated with off-target effects and some RNAi reagents do not work, testing multiple independent RNAi lines is the standard procedure. It is also important to show the on-target effect of the RNAi reagents used in the study.

      The most exciting result is the identification of AkhR-expressing neurons located at the posterior part of the heart tube (ACNs). The authors attempted to determine the function of ACNs by expressing rpr with AkhR-GAL4, which would induce cell death in all AkhR-expressing cells, including ACNs. The experiments presented in Figure 6 are not straightforward to interpret. Moreover, the conclusion contradicts the main hypothesis that elevated Akh is the basis of HFD-induced arrhythmia. The results suggest the importance of AkhR-expressing cells for normal heartbeat. However, elimination of Akh or AkhR restores normal rhythm in HFD-fed animals, suggesting that Akh and AkhR are not important for maintaining normal rhythms. If Akh signaling in ACNs is key for HFD-induced arrhythmia, genetic elimination of ACNs should leave normal rhythm unaltered and rescue the HFD-induced arrhythmia. An important caveat is that the experiments do not test the specific role of ACNs. ACNs should be just a small part of the cells expressing AkhR. The experiments presented in Figure 6 cannot justify the authors' conclusion. Specific manipulation of ACNs will significantly improve the study. Moreover, the main hypothesis suggests that HFD may alter the activity of ACNs in a manner dependent on Akh and AkhR. Testing how HFD changes calcium, possibly by CaLexA (Figure 2) and/or GCaMP, in wild-type and AkhR mutants could be a way to connect ACNs to HFD-induced arrhythmia. Moreover, optogenetic manipulation of ACNs will allow for specific manipulation of ACNs, which is crucial for studying the specific role of ACNs in controlling cardiac rhythms.

      Interestingly, expressing rpr with AkhR-GAL4 was insufficient to eliminate both ACNs. It is not clear why it didn't eliminate both ACNs. Given the incomplete penetrance, appropriate quantifications should be helpful. Additionally, the impact on other AkhR-expressing cells should be assessed. Adding more copies of UAS-rpr, AkhR-GAL4, or both may eliminate all ACNs and other AkhR-expressing cells. The authors could also try UAS-hid instead of UAS-rpr.

    1. Author response:

      We thank eLife and the reviewers for the thoughtful summary and valuable review of our manuscript. We largely agree with the summary and review and have provided our responses to the comments below. We believe BADGERS is a significant new tool for identifying associated risk factors for complex diseases, and the associations we observed in the analysis provide insights into the genetic basis of Alzheimer's disease.

      Reviewer #1 (Public Review):

      The major aim of the paper was a method for determining genetic associations between two traits using common variants tested in genome-wide association studies. The work includes a software implementation and application of their approach. The results of the application of their method generally agree with what others have seen using similar AD and UKB data.

      The paper has several distinct portions. The first is a method for testing genetic associations between two or more traits using genome-wide association test statistics. The second is a Python implementation of the method. The last portion is the results of their method using GWAS from AD and UK Biobank.

      We thank the reviewer for the conclusion and positive comments.

      Regarding the method, it seems like it has similarities to LDSC, and it is not clear how it differs from LDSC or other similar methods. The implementation of the method used Python 2.7 (or at least was reportedly tested using that version), which was retired in 2020. The implementation was committed between Wed Oct 3 15:21:49 2018 and Mon Jan 28 09:18:09 2019 using data that existed at the time, so it was a bit surprising that it used Python 2.7, since that version was initially going to be set for end-of-life in 2015. Anyway, trying to run the package resulted in unmet dependency errors, which I think are related to an internal package not getting installed. I would expect that published software could be installed using standard tooling for the language, and, ideally, software should have automated testing of key portions.

      We thank the reviewer for their comments. To clarify, the primary difference between our proposed method, BADGERS, and LDSC lies in their respective objectives and applications. LDSC is designed to estimate heritability and genetic correlations between traits by utilizing GWAS summary statistics, thereby aiding in the elucidation of the genetic architecture of complex traits and diseases. Conversely, BADGERS is specifically developed to explore causal relationships between risk factors, such as biomarkers, and diseases of interest. It employs genetic variants as variables to deduce causality, thereby addressing the challenges of confounding and reverse causation that are common in observational studies. Although BADGERS utilizes the LD reference panel derived from LDSC, the LD reference panel is used to obtain the predicted trait expression. The ultimate goal is to focus on linking biobank traits with Alzheimer’s disease and building causal relationships instead of identifying genetic architecture.

      Regarding the technical aspects mentioned, we acknowledge the concerns about the use of Python 2.7 and the issues encountered during the package installation. We are in the process of updating the software to ensure compatibility with current versions of Python and to enhance the installation process with standard tooling and automated testing for a more user-friendly experience. We have provided tests for each portion of the software so the user can test if the software is working properly.

      Regarding the main results, they find what has largely been shown by others using the same data or similar data, which adds prima facie validity to the work, in particular the portions dealing with AD subgroups, pathology, biomarkers, and cognitive traits of interest. I was puzzled why the authors suggested surprise regarding parental history and high cholesterol not being associated with MCI or cognitive composite scores, since this would seem like the likely fallout of selection of the WRAP cohort. The discussion paragraph that started "What's more, environmental factors may play a big role in the identified associations." confused me. I think what the authors are referring to is how selection, especially in a biobank dataset, can induce correlations, which is not what I think of as an environmental effect.

      We thank the reviewer very much for their comment. We're glad that our findings align with existing research using similar data, increasing the validity of our work and the proposed BADGERS algorithm. Your point about the lack of association between parental history, high cholesterol, and mild cognitive impairment (MCI) or cognitive composite scores in the WRAP cohort is well-taken. We agree that the selection criteria of the WRAP cohort may influence these findings, as it consists of individuals with a specific risk profile for Alzheimer's disease. This selection could indeed mitigate the observed association between these factors and cognitive outcomes, which we initially found surprising.

      Regarding the environmental factors, we appreciate your clarification and understand the confusion. Our intention was to discuss the potential for selection bias and confounding factors in biobank datasets for the identified associations, which might not necessarily be direct environmental effects.

      Overall, the work has merit, but I am left without a clear impression of the improvement in the approach over similar methods. Likewise, the results are interesting, but similar findings are described with the data that was used in the study, which are over 5 years old at the time of this review.

      We thank the reviewer for their endorsement of the BADGERS framework. We believe that our method, BADGERS, improves on existing approaches by effectively linking genetic data with the detailed phenotypic information in biobanks and large disease GWAS. This enhances our ability to detect associations without needing individual-level data, offering clearer insights while reducing issues like reverse causality and confounding factors.

      Even though the IGAP dataset is over five years old, it remains one of the largest publicly available datasets for Alzheimer’s Disease. Likewise, the UK Biobank is one of the largest publicly available human traits datasets, which researchers continue to use. These datasets' continued utility demonstrates their value in the research community. Additionally, the versatility of the BADGERS framework makes it suitable for future research investigating the relationship between human traits and various diseases using different datasets.

      Reviewer #2 (Public Review):

      Summary:

      Yan, Hu, and colleagues introduce BADGERS, a new method for biobank-wide scanning to find associations between a phenotype of interest, and the genetic component of a battery of candidate phenotypes. Briefly, BADGERS capitalizes on publicly available weights of genetic variants for a myriad of traits to estimate polygenic risk scores for each trait, and then identify associations with the trait of interest. Of note, the method works using summary statistics for the trait of interest, which is especially beneficial for running in population-based cohorts that are not enriched for any particular phenotype (ie. with few actual cases of the phenotype of interest).

      Here, they apply BADGERS on Alzheimer's disease (AD) as the trait of interest, and a battery of circa 2,000 phenotypes with publicly available precalculated genome-wide summary statistics from the UK Biobank. They run it on two AD cohorts, to discover at least 14 significant associations between AD and traits. These include expected associations with dementia, cognition (educational attainment), and socioeconomic status-related phenotypes. Through multivariate modelling, they distinguish between (1) clearly independent components associated with AD, from (2) by-product associations that are inflated in the original bivariate analysis. Analyses stratified according to APOE inclusion show that this region does not seem to play a role in the association of some of the identified phenotypes. Of note, they observe overlap but significant differences in the associations identified with BADGERS and other Mendelian randomization (MR), hinting at BADGERS being more powerful than classical top variant-based MR approaches. They then extend BADGERS to other AD-related phenotypes, which serves to refine the hypotheses about the underlying mechanisms accounting for the genetic correlation patterns originally identified for AD. Finally, they run BADGERS on a pre-clinical cohort with mild cognitive impairment. They observe important differences in the association patterns, suggesting that this preclinical phenotype (at least in this cohort) has a different genetic architecture than general AD.

      We thank the reviewer a lot for the conclusion and positive comments.

      Strengths:

      BADGERS is an interesting new addition to a stream of attempts to "squeeze" biobank data beyond pure association studies for diagnosis. Increasingly available biobank cohorts do not usually focus on specific diseases. However, they tend to be data-rich, opening for deep explorations that can be useful to refine our knowledge of the latent factors that lead to diagnosis. Indeed, the possibility of running genetic correlation studies in specific sub-settings of interest (e.g. preclinical cohorts) is arguably the most interesting aspect of BADGERS. Classical methods like LDSC or two-sample MR capitalize on publicly available summary statistics from large cohorts, or having access to individual genotype data of large cohorts to ensure statistical power. Seemingly, BADGERS provides a balanced opportunity to dissect the correlation between traits of interest in settings with small sample size in which other methods do not work well.

      We thank the reviewer a lot for the conclusion and positive comments.

      Weaknesses:

      However, the increased statistical power is only hinted at, and for instance, they do not explore whether LDSC would have identified these associations. Although I suspect that is the case, this evidence is important to ensure that the abovementioned balance is right. Finally, as discussed by the authors, the reliance on polygenic risk scoring necessarily undermines the causality evidence gained through BADGERS. In this sense, BADGERS provides an alternative to strict instrumental-variable based analysis, which can be particularly useful to generate new mechanistic hypotheses.

      We thank the reviewer for the comments. We understand the importance of comparing BADGERS to other methods. The comparison with LDSC, while not directly relevant to the causal inference aims of BADGERS, is indeed an interesting aspect to consider for future studies. In this paper, we focused on comparing BADGERS with Mendelian Randomization (MR), which shares its causal inference objective.

      In this comparison, BADGERS identified a total of 48 traits that reached Bonferroni-corrected statistical significance. In contrast, MR-IVW only identified nine traits with Bonferroni-corrected statistical significance. Among these nine traits, seven were also identified by BADGERS. This suggests that BADGERS has higher power in detecting causal relationships.
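
      To make the significance criterion above concrete, the following is a minimal sketch of a Bonferroni cutoff applied to a set of per-trait p-values. The function names and the example p-values are hypothetical and are not taken from the BADGERS software or the manuscript; only the correction itself (dividing the family-wise alpha by the number of tests) is standard.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value cutoff that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def significant_traits(p_values: dict, alpha: float = 0.05) -> list:
    """Return the names of traits whose p-value falls below the Bonferroni cutoff."""
    cutoff = bonferroni_threshold(alpha, len(p_values))
    return sorted(trait for trait, p in p_values.items() if p < cutoff)


# Hypothetical p-values for four traits (illustration only, not real results).
example_p_values = {"trait_a": 1e-6, "trait_b": 0.02, "trait_c": 2e-5, "trait_d": 0.4}
hits = significant_traits(example_p_values)
```

      With roughly 2,000 candidate traits, as in the scan described by Reviewer #2, the same function gives a per-trait cutoff of 0.05 / 2000 = 2.5e-5.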

      Regarding the use of polygenic risk scoring, we agree that it holds challenges in directly inferring causality. While BADGERS offers an innovative way to explore genetic correlations and can help generate new hypotheses about disease mechanisms, it does not replace the causal inferences that can be drawn from instrumental-variable-based analyses. Instead, it should be viewed as a complementary tool that can illuminate potential genetic relationships and guide further causal investigations.
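
      For readers unfamiliar with the instrumental-variable baseline, the sketch below shows the standard fixed-effect inverse-variance-weighted (IVW) estimator used in two-sample MR. It is a generic illustration, not the authors' analysis code; the function name and the per-variant effect sizes are hypothetical.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW causal estimate from per-variant summary statistics.

    beta_exposure: variant effects on the exposure (risk factor)
    beta_outcome:  variant effects on the outcome (disease)
    se_outcome:    standard errors of the outcome effects
    Returns (estimate, standard_error, z_score).
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have the same length")
    if not beta_exposure:
        raise ValueError("at least one variant is required")
    # Numerator and denominator of the weighted regression through the origin.
    numerator = sum(bx * by / se**2
                    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    denominator = sum(bx**2 / se**2
                      for bx, se in zip(beta_exposure, se_outcome))
    estimate = numerator / denominator
    standard_error = math.sqrt(1.0 / denominator)
    return estimate, standard_error, estimate / standard_error


# Hypothetical summary statistics for three variants (illustration only).
est, se, z = ivw_estimate([0.10, 0.20, 0.15], [0.05, 0.10, 0.075], [0.01, 0.02, 0.015])
```

      Because the estimator uses only a handful of strongly associated variants as instruments, it trades power for a cleaner causal interpretation, which is the contrast with polygenic-score-based scans drawn in the response above.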

      In summary, after 15 years of focus on diagnosis that would require having individual access to large patient cohorts, BADGERS can become an excellent tool to dig into trait heterogeneity, especially if it turns out to be more powerful than other available methodologies.

      We thank the reviewer a lot for the conclusion and positive comments.

    1. Author response:

      We thank the reviewers and editors for their time and effort reviewing and improving this manuscript. We also thank them for their support.

      Following the guidelines received by eLife we submit here the preliminary author’s response to the Public review with our planned changes to the manuscript.

      Reviewer 1.

      Comment 1. Issue on cross-reactivities of MafB antibodies.

      We are confident that our description of MafB V1 interneurons is correct despite some cross-reactivity with one of the antibodies used. We test all antibodies we use, and unfortunately, we found an inverse relationship between sensitivity and specificity with the two MafB antibodies used in this study. We chose for quantification the one with highest sensitivity, despite the presence of some cross-reactivity in interneurons other than the dorsal and ventral (Renshaw) V1 populations we focus on. The dorsal and ventral (Renshaw) V1 populations we describe here are also reactive with the more specific antibody (although with lower sensitivity) and both are neatly labeled in a MafB-GFP reporter mouse as described in Figure 3. We will add an image to the supplement with MafB-GFP V1 Interneurons at P5 showing the immunoreactivity of both MafB antibodies as suggested by the reviewer. We agree with the reviewer that this will give further support to the characterization of these populations by either immunocytochemical or genetic means at P5.

      Unfortunately, we cannot show lack of immunoreactivity for MafB antibodies in MafB GFP/GFP knockout mice at P5 because MafB global KOs die at birth as a result of respiratory failure. This is due to removal of inhibitory interneurons in brainstem centers critical for respiration (Blanchi et al. 2003 MafB deficiency causes defective respiratory rhythmogenesis and fatal central apnea at birth. Nat Neurosci. 6(10):1091-100. doi: 10.1038/nn1129. PMID: 14513037). This is why we used tissues from late embryos for testing antibody specificity in KO spinal cords. We will make this clearer in the text.

      Comment 2. Overlap of V1 clades with lineage labeled Foxp2-V1s at P5.

      We collected the data requested by the reviewer for P5 Foxp2-V1 interneurons and this will be added to an updated version of this figure. In comparison to the results with the OTP mouse, we only found marginal overlap at P5 with Renshaw cells, Pou6f2, and Sp8 V1s in our genetic intersection to label Foxp2-V1s. We apologize for not showing the data. We will make this clearer.

      Reviewer 2.

      Comment 1. Paper VERY hard to read.

      We will make every effort to make the paper more readable by moving methodological discussions to supplementary materials. We strive to keep our methods as rigorous, clean, and replicable as possible, and that sometimes requires lengthy explanations of the details and reasoning behind our approaches. We will make sure this does not distract from the principal scientific messages we want to convey. We agree with the reviewer that these should be emphasized over methodological detail, and we will correct any mistakes in the text that lead to confusion. Thank you for pointing out this problem that we hope to correct in a new version. Why focus on Foxp2 V1s? We focus on the Foxp2 population for several reasons: 1) This is the largest population of V1s, and it is the one with a close spatial association to motoneurons, in particular limb motoneurons; 2) Given previous results (Benito-Gonzalez and Alvarez, 2012, cited in bibliography) it likely includes many reciprocal inhibitory interneurons; 3) We do not have the mice for studying the Pou6f2 (or Sp8) population, but similar studies are now being carried out in the Bikoff lab.

      Comment 2. Lack of functional studies.

      Functional studies are currently being carried out, both during development of limb function in postnatal mice as well as in adult animals. These studies required the creation of several new animal models and reagents. As with the present manuscript, we thoroughly characterize all animals and methods. This takes time and space. These studies are beyond the goals and length of the current manuscript, but we agree with the reviewer that these are the critical next experiments that need to be performed. We are now finalizing studies on the role of Foxp2-V1 interneurons in the postnatal development of limb coordination and validating approaches for silencing them in the adult while also optimizing behavioral assays and recordings. The data presented here on Foxp2-V1 interneuron heterogeneity and relations with limb motoneurons gives the necessary context for raising stronger hypotheses and aiding in the interpretation of future results in functional studies.

      Synapse counts.

      We respectfully disagree with the reviewer’s comments on our synapse density estimates. To fully explain the reasons and prevent any ambiguity, we need to focus on detailed methodological aspects. We apologize for the lengthy response. Two major issues were raised:

      (1) Focus on the cell body.

      The issue pointed by the reviewer of potential synapses in distal dendrites from V1 subgroups not projecting proximally was already discussed in the text. The reason we focus on the cell body is because 1) it is not feasible to study the full dendritic arbor of so many different types of motoneurons and 2) it allows us to identify V1 subpopulations that likely exert stronger modulation of motoneuron firing by targeting the proximal somatodendritic membrane. The fact that synaptic organization on motoneurons is similar on cell bodies and proximal dendrites (first 100 µm) suggests that inputs from V1 clades other than Renshaw cells are likely further away, and therefore there is limited benefit to include analyses of proximal dendrites in these data. Additionally, dendrites would be difficult to consistently follow in Chat immunostained tissue. We are currently using novel viral approaches to obtain labeling of single motoneurons and their full dendritic trees for more in depth dendritic analyses in the mouse. The classical method based on single cell in vivo intracellular labeling using micropipettes is presently very low yield in the adult mouse. We are experienced with detailed single motoneuron dendritic arbor analyses in cat and rat motoneurons (Alvarez et al. 1997 Cell-type specific organization of glycine receptor clusters in the mammalian spinal cord. J Comp Neurol. 379(1):150-70; Alvarez et al., 1998 Distribution of 5-hydroxytryptamine-immunoreactive boutons on alpha-motoneurons in the lumbar spinal cord of adult cats. J Comp Neurol. 393(1):69-83; Rotterman et al., 2014. Normal distribution of VGLUT1 synapses on spinal motoneuron dendrites and their reorganization after nerve injury. J Neurosci. 34(10):3475-92. doi: 10.1523/JNEUROSCI.4768-13.2014). Based on this experience, we do not believe it is feasible to include similar analyses to compare all motor columns throughout 6 segments of the spinal cord in this study. 
We agree with the reviewer that these are important data sets that need to be collected and they are planned for future experiments. These analyses will address different questions than the ones posed and answered in our current manuscript.

      (2) Number of motoneurons analyzed.

      We disagree with the reviewer's assessment that our conclusions might be biased because of the numbers of motoneurons analyzed. We sampled a total of 295 motoneurons in 5 different mice (117 LMC/HMC, 99 MMC, and 79 PGC motoneurons), and we used stringent methods for synapse detection. Due to a technical error, Mouse 3 lacked data in upper lumbar and Th13, but all other mice included data in almost all motor columns and segments. We disagree with the characterization that these are small samples. For full transparency, all motoneurons analyzed were identified in Figure 6D. Each of the nearly 300 motoneuron cell bodies was carefully reconstructed through several optical planes to obtain an accurate estimate of synapse density. More automatic methods in current use in the literature sometimes analyze larger samples, but our methods are designed to avoid methodological biases inherent to these automatic methods. We do not use image thresholding to extract synaptic contacts because it lacks accuracy in identifying single synapses. Thus, estimates using this technique frequently refer to coverage, not synapse density. In addition, it is hard to keep threshold criteria consistent across multiple optical planes to analyze enough section thickness to estimate a motoneuron surface. This is because tissue light diffraction alters thresholding levels continuously across optical planes. Thus, many authors present data as linear densities across a perimeter (in a single plane) measuring many cells in one field in one plane. We avoid cell body linear densities (or coverage) because they bias counts towards larger synapses that have a higher probability of being present at any single confocal plane. Moreover, estimates along a surface reduce synapse sampling variability and better estimate synaptic coverage compared to estimates derived from analyzing single cross-sections. We also confirm each genetically labeled varicosity as a likely synapse by accumulation of VGAT. 
In this manner we restrict our counts to synaptic boutons and not axons or intervaricose regions. Previously, we used bassoon to show the accuracy of our methods (Wootz et al. 2013 Alterations in the motor neuron-Renshaw cell circuit in the Sod1(G93A) mouse model. J Comp Neurol. 521(7):1449-69. doi: 10.1002/cne.23266). That means that our densities are true synaptic densities, which are difficult to extract from automatic methods that estimate fluorescence coverage over larger samples of somatic profiles but fail to individualize synapses and frequently bias results. These bulk methods introduce significant confounds in data interpretation: Is higher coverage due to bigger synapses or more synapses? Do threshold structures represent true synapses or also include axons? To what extent does sub- or over-thresholding in different planes affect identification of structures in contact with the motoneuron surface? We avoid all these problems. Not surprisingly, a nested ANOVA demonstrated consistent significant differences among motor columns and segments.
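
      The size bias of single-plane sampling described above can be illustrated with a short worked example. This is a sketch under simplifying assumptions that are ours and not stated in the manuscript: boutons are treated as spheres, the optical plane falls at a random depth, and the chance that the plane cuts a given bouton is therefore proportional to its diameter. The bouton diameters are hypothetical.

```python
def true_mean_diameter(diameters):
    """Mean diameter when every bouton on the surface is counted once."""
    return sum(diameters) / len(diameters)


def plane_sampled_mean_diameter(diameters):
    """Expected mean diameter of boutons seen in one randomly placed plane.

    Assumes the probability of a bouton being cut by the plane is
    proportional to its diameter (length-biased sampling), so each
    bouton is weighted by its own diameter.
    """
    return sum(d * d for d in diameters) / sum(diameters)


# Hypothetical population: three small boutons (1 um) and one large bouton (3 um).
boutons_um = [1.0, 1.0, 1.0, 3.0]
surface_mean = true_mean_diameter(boutons_um)
single_plane_mean = plane_sampled_mean_diameter(boutons_um)
```

      In this toy population the large bouton makes up one quarter of the synapses but is expected to account for half of the profiles seen in a single plane, which is why counts along a reconstructed surface and linear densities in one plane can disagree.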

      In summary, while more automatic methods allow larger samples, they disregard true synaptic densities and are based on thresholding methods with high variability across motoneurons, optical planes, and histological sections, and therefore require much larger numbers of motoneurons to overcome their many biases and sources of error. This is not our case. Our sample size is large enough considering the accuracy of our methods and data quality. This is demonstrated by consistency in statistical results across motor columns in different segments and mice.

      Comment 3. Possibility of anterograde transsynaptic labeling from primary afferents infected with rabies virus.

      This is a fair question that we did not clearly explain. The reviewer compares our results with those of Pimpinella et al., 2022. The methods used are different. To obtain anterograde tracing, these authors used Cre lines to achieve high levels of expression of TVA and RV glycoprotein in specific subtypes of sensory neurons including proprioceptors. Then EnvA-coated Rabies virus was injected directly inside the spinal cord for cell-type specificity. This method transsynaptically labeled, in the anterograde direction, interneurons receiving inputs from specific types of sensory afferents, but the method does not have the muscle specificity required in our analyses. In our case, we used intramuscular injections at P5 of AAV1-G for transcomplementation with Rabies virus delta G injected in the same muscles later, at P15. In previous studies in which we used the RV-delta G virus without AAV1-G, we analyzed motoneuron and primary afferent infection rates and found both to be considerably reduced with injection age. In our hands, there is almost no RV infection of primary afferents when Rabies virus is injected i.m. at P15, but there is some limited motoneuron infection remaining (that we used to our advantage in this paper to avoid primary afferent and developmental confounds).

      Unfortunately, these methodological studies are presently communicated only in abstract form (GomezPerez et al., 2015 and 2016; Program Nos. 242.08 and 366.06). Therefore, we will add to the supplementary information some images from sections serial to those illustrated in the paper, showing a few “start” LG motoneurons that remained labeled at this survival time point and the lack of any dorsal horn primary afferent labeling. This is consistent with our as yet unpublished data, which are based on a larger number of animals and more extensive time courses.

      Comment 4. Temporal resolution of birth-dating.

      We agree with the reviewer, and that is the reason we explicitly discuss that temporal resolution is not perfect (we also add a few more caveats that affect temporal resolution beyond the reviewers’ comments). However, the method is good enough to differentiate temporal sequences of neurogenesis with close to 12-hour resolution, once enough animals are analyzed to compensate for methodological temporal overlaps. That is the reason for our Figure 1D.

      Reviewer 3

      Comment 1. Text is too long and main message buried in technical details.

      We agree and, as in our response to the first comment of Reviewer 2, we will revise the writing to make it more straightforward while moving some of the information on methods and technical discussion to supplementary materials. As demonstrated by Reviewer 2's comments, methodological discussions are still important to best interpret the data presented in this paper.

    1. eLife assessment

      This fundamental study for the first time defines genetically the role of the Clock gene in basal metazoa, using the cnidarian Nematostella vectensis. With convincing evidence, the study provides insight into the early evolution of circadian clocks. Clock in this species is necessary for daily rhythms under constant conditions, but not under a rhythmic light/dark cycle, suggesting that the major role of the circadian oscillator in this species could be a stabilizing function under non-rhythmic environmental conditions.

    2. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #2 (Public Review):

      In this revised manuscript, Aguillon and collaborators convincingly demonstrate that CLK is required for free-running behavioral rhythms under constant conditions in the Cnidarian Nematostella. The results also convincingly show that CLK impacts rhythmic gene expression in this organism. This original work thus demonstrates that CLK was recruited very early during animal evolution in the circadian clock mechanism to optimize behavior and gene expression with the time-of-day. The manuscript could still benefit from some improvements so that it is more accessible for a wide readership.

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors):

      Aguillon and collaborators have deeply revised, and in the process significantly improved, the presentation of their interesting results with the first Cnidarian circadian gene mutant. Results now very convincingly demonstrate that CLK is required for free-running behavioral rhythms under constant conditions. The results also now more convincingly show that CLK impacts rhythmic gene expression, although interpretation of the transcriptomics data is not straightforward. I think there are still improvements needed to make the manuscript more accessible. The authors need to keep in mind that a broad audience will read their report, not just chronobiologists. I have listed below several issues that I think should be addressed, and some editing suggestions.

      General comment to Editor and Reviewers:

      We are genuinely grateful to both reviewers and the editors for all the feedback, which helped us make the best of our data and question our analysis to the point that we redefined our approach and ended up with an article we are proud of. Only the names of the authors are visible on the article, and considering how much the reviewing system helps to improve the research, this seems almost unfair. As such, we thank all of you and really appreciate the new eLife system. Bravo all.

      Abstract:

      (1) Line 40: It should read "transcript levels" instead of "transcription". There is no measurement of transcription rates in this manuscript, only mRNA levels.

      Modified accordingly.

      (2) Line 41: the authors mention "constant light". Does this refer to previous work? Their data in Figure 4 were in constant darkness, not in LL.

      Modified accordingly.

      (3) Line 46 and throughout the manuscript, the allelic nomenclature is not standard. 1-/- seems to indicate there are two different alleles. Since the allele might not be a null, I would suggest simply using 1/1, or perhaps delta/delta since the mutation results in a truncated CLK.

      NvClk1-/- became NvClkΔ/Δ, except in the .xls supplementary table, where the mutant kept the NvClk-/- nomenclature. There it is not possible to replace only part of a word with a different font, so generating the delta sign would require editing the entries one by one.

      (4) The last sentence of the abstract needs to be rephrased, as it suggests that CLK evolved to maintain circadian rhythms under constant conditions. Constant conditions very rarely exist on Earth, and thus cannot be an evolutionary driving force. Different explanations have been proposed on why a self-sustained clock is the evolutionary solution to timekeeping, but the purpose of the clock and of clock genes is not to maintain oscillations in constant conditions. Actually, this sentence conflicts with the title.

      Modified to: the Clock gene has evolved in cnidarians to sustain 24-hour rhythmic physiology and behavior in absence of diel environmental conditions. From my current understanding, you are right: the purpose of clock genes is not to maintain oscillations in constant conditions (this is simply the result of the experiment), but to synchronize physiology to the day/night rhythm, and surely to sustain 24 h oscillations in case the environment challenges the perception of the diel cues. DD or LL is just an artificial experimental design to reveal the endogenous time-keeping pacemaker.

      Results:

      (1) Line 148 and elsewhere in the MS: I would not use the word "lower" or "higher" to qualify acrophases. I would suggest advanced/delayed or earlier/later.

      Modified accordingly.

      (2) Line 157-9: The introductory sentence does not clearly present the rationale for the 6/6 experiments.

      We modified the paragraph accordingly: The presence of a 24-hour rhythm of NvClkΔ/Δ polyps under LD conditions could be attributed to either a direct light-response or the partial functioning of the circadian clock due to the nature of the mutation….

      (3) At the end of the behavior section, or perhaps at the end of each paragraph in this section, it would be helpful to have a summary of the results and a clearer explanation of their interpretation. The authors need to guide the readers, particularly non-chronobiologists, so that they can understand what the really neat data that were obtained mean. For example, what does it mean that the acrophase is different between mutant and wild-type, why are Clk mutants rhythmic under LD12/12 or 6/6, etc.

      We added a concluding sentence to help non-specialists understand each result.

      (4) Line 172 and elsewhere: "true rhythmic genes" sounds odd to me. Either they are, or they are not rhythmic.

      Modified to “rhythmic genes.”

      (5) Paragraph starting with line 184: I do not follow what is important about the number of genes per time cluster. What does it tell us, beyond the simple fact that fewer genes are rhythmic in the Clk mutants?

      We rewrote the result paragraph to make it clearer why we performed this clustering analysis. This clustering analysis became Extended Data Fig.2 with modification of the figures (see my comments in your review about Figure 3).

      (6) Line 197: The authors need to explain what they saw with circadian clock genes and their expression in Clk mutants. In some cases, amplitude increased in LD. This surprising observation deserves some explanation. "Complex regulatory effect" is too vague.

      We replaced the vague “complex regulatory effect” with a more thorough description of Figure 3a.

      (7) Line 198-203: Again, help the reader understand the significance of these observations.

      We rewrote the paragraph to help the reader to better understand the significance of these observations.

      Discussion:

      (1) Line 236-40. Careful with the use of -/-, which implies that an allele is a null. The first Clk mutants in mammals and flies, which the authors refer to, were actually dominant negatives.

      I went over the citations we used for this paragraph, and the first mutation in fly, dClkar, is a null, not a dominant negative. Flies are still rhythmic in the dark. Unless there is an older mutation? However, you are right that the first mutation identified in mouse was a dominant negative with loss of rhythmicity, while the gene deletion did not show any effect on behavior, suggesting compensation by a paralog. I removed two references which were not relevant to the discussion.

      (2) Line 265-268 are not very clear. Do the authors mean that the lack of overlap for non-circadian pacemaker genes is because of different experimental conditions? What would be those differences? It is reassuring that the Leach/Reitzel study and the present one share pacemaker genes as rhythmic, but it is also surprising that there is almost no overlap beyond these genes. How robust are those other rhythms compared to circadian clock genes?

      We revised this paragraph and raised major points regarding the rearing conditions of the polyps in each lab and their potential genetic differences, which could explain these differences.

      (3) Line 270. I am not sure "compensation" is the right word, since there is no overlap between the rhythmic genes in mutants under LD and wild-type under either LD or DD. Also, saying on line 273 that the transcriptional pattern is not fully reproduced is a rather striking understatement, given the absence of rhythmic gene overlap.

      We rewrote the paragraph accordingly. We replaced "compensation" with “alternative way to drive rhythmicity under LD condition”.

      (4) Line 279. The authors mention the possibility of false positives. Based on the FDR, are there more rhythmic genes than expected by chance?

      The possibility of false positives is a risk to consider when multiple-testing correction is not performed. We added to the results paragraph the number of rhythmic genes identified with BH.Q or p.adj., which are the multiple-testing-corrected values reported by the algorithms we used (RAIN and JTK).
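
For readers unfamiliar with multiple-testing correction, the Benjamini-Hochberg (BH) adjustment mentioned above can be sketched as follows. This is a minimal illustration only, not the RAIN or JTK implementation; the function name `benjamini_hochberg` and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values (q-values) in the original input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values stay monotone: q_(k) = min over j >= k of p_(j) * m / j.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four genes.
q_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
significant = [q < 0.05 for q in q_values]
```

With these made-up inputs, only the first gene stays below a 0.05 threshold after adjustment, although three raw p-values were below 0.05. This is the kind of difference the reviewer's question about false positives is getting at.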

      (5) Line 279-82. The reference to the Ray study is rather obscure. What is the point the authors are trying to make here?

      Eventually, we removed the reference to this article and modified the paragraph of the discussion. Indeed, the discussion around the Ray study did not give an interesting direction in which to discuss our results and analysis approach.

      (6) Line 284: define BHQ and p.adj

      Defined and referenced.

      (7) The way Lines 283-288 are worded does not provide a good rationale for how transcriptional rhythms were analyzed. The idea to combine two different approaches (JTK and RAIN) to be selective with rhythmicity was great. The authors need, however, to justify these choices in a more convincing manner. The goal is to detect rhythmic genes in a reliable manner, irrespective of the number of rhythmic genes observed. Also, explaining the choice of methodology belongs in the result section.

      We explained our choice of methodology and moved it to the result section as suggested.

      (8) Line 292-3. There are known mechanisms that explain how transcriptional time clusters are generated. In particular, the use of interlocked feedback loop with antiphase peaks of transcriptions is well documented. Actually, it seems to me the clustering shown in Fig 4 might hint at such a mechanism.

      Indeed, you are right: the clustering shown in Fig 3 (former Fig 4) revealed such a mechanism.

      Figures:

      Figure 2: Define relative amplitude

      We added the definition of the relative amplitude within the results. If this is what you asked for?

      Figure 3: Some of the cycles look odd (first row of graphs in panel C). Why would the first and last data point be so different in three of these graphs?

      We decided to modify this figure, as we realized it was not informative and not objective enough: we had selected a few “representatives” among multiple patterns. In the new figure we combined the cluster analysis with the behavior. Thus, readers can now pick a cluster according to a specific behavioral activity level (or ZT/CT) and find the “genes of potential interest” in supp. Table 4. However, generally speaking, this figure does not further explain the consequences of the mutation, so we moved it into Extended Data Fig. 2.

      Figure 4: define the color coding in the correlation panels (blue to red)

      These values from -1 to 1 are the Pearson correlation values. Now indicated on the figure with the color coding legend.
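
As a reminder of what the blue-to-red scale encodes, a Pearson correlation coefficient can be computed as in the sketch below. The function name `pearson_r` and the example profiles are hypothetical and are not taken from the authors' analysis code.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Root sum of squared deviations for each variable.
    sx = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sy = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)


# Hypothetical expression profiles over three time points.
in_phase = pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])    # close to +1
anti_phase = pearson_r([1.0, 2.0, 3.0], [6.0, 4.0, 2.0])  # close to -1
```

Values near +1 correspond to profiles that rise and fall together, and values near -1 to profiles in antiphase.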

    3. Reviewer #2 (Public Review):

      In this revised manuscript, Aguillon and collaborators convincingly demonstrate that CLK is required for free-running behavioral rhythms under constant conditions in the Cnidarian Nematostella. The results also convincingly show that CLK impacts rhythmic gene expression in this organism. This original work thus demonstrates that CLK was recruited very early during animal evolution in the circadian clock mechanism to optimize behavior and gene expression with the time-of-day.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This study presents a valuable contribution to cardiac arrhythmia research by demonstrating long noncoding RNA Dachshund homolog 1 (lncDACH1) tunes sodium channel functional expression and affects cardiac action potential conduction and rhythms. Whereas the evidence for functional impact of lncDACH1 expression on cardiac sodium currents and rhythms is convincing, biochemical experiments addressing the mechanism of changes in sodium channel expression and subcellular localization are incomplete.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this study, the authors show that a long non-coding RNA, lncDACH1, inhibits sodium currents in cardiomyocytes by binding to and altering the localization of dystrophin. The authors use a number of methodologies to demonstrate that lncDACH1 binds to dystrophin and disrupts its localization to the membrane, which in turn downregulates NaV1.5 currents. Knockdown of lncDACH1 upregulates NaV1.5 currents. Furthermore, in heart failure, lncDACH1 is shown to be upregulated, which suggests that this mechanism may have pathophysiological relevance.

      Strengths:

      (1) This study presents a novel mechanism of Na channel regulation which may be pathophysiologically important.

      (2) The experiments are comprehensive and systematically evaluate the physiological importance of lncDACH1.

      Weaknesses:

      (1) What is indicated by the cytoplasmic level of NaV1.5, a transmembrane protein? The methods do not provide details regarding how this was determined. Do the authors mean NaV1.5 retained in various intracellular organelles?

      Thank you for the good suggestion. Our study showed that Nav1.5 was transferred to the cell membrane by the scaffold protein dystrophin in response to regulation by LncDACH1, but not all Nav1.5 in the cytoplasm was transferred to the cell membrane. Therefore, the cytoplasmic level of Nav1.5 represents the Nav1.5 protein that is not transferred to the cell membrane but stays in the cytoplasm and in various organelles within the cytoplasm when Nav1.5 is regulated by LncDACH1.

      (2) What is the negative control in Fig. 2b, Fig. 4b, Fig. 6e, Fig. 7c? The maximum current amplitude in these seem quite different. -40 pA/pF in some, -30 pA/pF in others and this value seems to be different than in CMs from WT mice (<-20 pA/pF). Is there an explanation for what causes this variability between experiments and/or increase with transfection of the negative control? This is important since the effect of lncDACH1 is less than 50% reduction and these could fall in the range depending on the amplitude of the negative control.

      Thank you for the insightful comment. The negative controls in Fig. 2b, Fig. 4b, and Fig. 6e are primary cardiomyocytes transfected with empty plasmids. The negative control in Fig. 7c is cardiomyocytes of wild-type mice injected with control virus. When we prepare cells before the patch-clamp experiments, the transfection efficiency of the transfection reagent used in different batches of cells, as well as the different cell sizes, ultimately lead to differences between CMs.

      (3) NaV1.5 staining in Fig. 1E is difficult to visualize and to separate from lncDACH1. Is it possible to pseudocolor differently so that all three channels can be visualized/distinguished more robustly?

      Thank you for the good suggestion. We have re-added color to the original image to distinguish between the three channels.

      Author response image 1.

      (4) The authors use shRNA to knockdown lncDACH1 levels. It would be helpful to have a scrambled ShRNA control.

      Thank you for the insightful comment. The control group we used was actually the scrambled shRNA, but we labeled the control group as NC in the article, which may have caused the misunderstanding.

      (5) Is there any measurement of the baseline levels of LncDACH1 in wild-type mice? It seems quite low, and yet there is a substantial increase in NaV1.5 currents upon knocking down LncDACH1. By comparison, the level of LncDACH1 seems to be massively upregulated in TAC models. Have the authors measured NaV1.5 currents in these cells? Furthermore, does LncDACH1 knockdown evoke a larger increase in NaV1.5 currents?

      Thank you for the insightful comment.

      (1) The baseline protein levels of LncDACH1 in wild-type mice and LncDACH1-CKO mice have been verified in a previously published article (Figure 3) (Hypertension. 2019;74:00-00. DOI: 10.1161/HYPERTENSIONAHA.119.12998).

      Author response image 2.

      (2) We did not measure the Nav1.5 currents in cardiomyocytes of the TAC model mice in this article, but in another published paper, we found that the Nav1.5 current in the TAC model mice was markedly lower than that in wild-type mice (Figure 4) (Gene Ther. 2023 Feb;30(1-2):142-149. DOI: 10.1038/s41434-022-00348-z).

      Author response image 3.

      This is consistent with our results in this article: LncDACH1 levels are significantly upregulated in the TAC model, and in the LncDACH1-TG group the Nav1.5 current is significantly reduced after LncDACH1 upregulation (Figure 3).

      Author response image 4.

      (6) What do error bars denote in all bar graphs, and also in the current voltage relationships?

      Thank you for the good comment. All the error bars represent the mean ± SEM. They represent the fluctuation of all individuals of a set of data based on the average value of this set of data, that is, the dispersion of a set of data.
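
For reference, the "mean ± SEM" summary described above can be computed as in the sketch below. The function name `mean_sem` and the example values are hypothetical. Note that the SEM is the sample standard deviation divided by the square root of the sample size, so it describes the uncertainty of the mean and shrinks as n grows.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, standard error of the mean) for a list of measurements."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two measurements are needed")
    mean = statistics.fmean(values)
    # Sample standard deviation (n - 1 in the denominator).
    sd = statistics.stdev(values)
    return mean, sd / math.sqrt(n)


# Hypothetical peak current densities (pA/pF) from five cells.
mean, sem = mean_sem([-38.0, -41.0, -40.0, -39.0, -42.0])
```

Reporting n alongside the error bars, as in the figure legends of the response, lets readers convert between SEM and standard deviation.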

      Reviewer #2 (Public Review):

      This manuscript by Xue et al. describes the effects of a long noncoding RNA, lncDACH1, on the localization of Nav channel expression, the magnitude of INa, and arrhythmia susceptibility in the mouse heart. Because lncDACH1 was previously reported to bind and disrupt membrane expression of dystrophin, which in turn is required for proper Nav1.5 localization, much of the findings are inferred through the lens of dystrophin alterations.

      The results report that cardiomyocyte-specific transgenic overexpression of lncDACH1 reduces INa in isolated cardiomyocytes; measurements in whole heart show a corresponding reduction in conduction velocity and enhanced susceptibility to arrhythmia. The effect on INa was confirmed in isolated WT mouse cardiomyocytes infected with a lncDACH1 adenoviral construct. Importantly, reducing lncDACH1 expression via either a cardiomyocyte-specific knockout or using shRNA had the opposite effect: INa was increased in isolated cells, as was conduction velocity in heart. Experiments were also conducted with a fragment of lncDACH1 identified by its conservation with other mammalian species. Overexpression of this fragment resulted in reduced INa and greater proarrhythmic behavior. Alteration of expression was confirmed by qPCR.

      The mechanism by which lncDACH1 exerts its effects on INa was explored by measuring protein levels from cell fractions and immunofluorescence localization in cells. In general, overexpression was reported to reduce Nav1.5 and dystrophin levels and knockout or knockdown increased them.

      Thank you for summarizing our work and thank you very much for your appreciation of our work.

      Reviewer #3 (Public Review):

      Summary:

      In this manuscript, the authors report the first evidence of Nav1.5 regulation by a long noncoding RNA, LncRNA-DACH1, and suggest its implication in the reduction in sodium current observed in heart failure. Since no direct interaction is observed between Nav1.5 and the LncRNA, they propose that the regulation is via dystrophin and targeting of Nav1.5 to the plasma membrane.

      Strengths:

      (1) First evidence of Nav1.5 regulation by a long noncoding RNA.

      (2) Implication of LncRNA-DACH1 in heart failure and mechanisms of arrhythmias.

      (3) Demonstration of LncRNA-DACH1 binding to dystrophin.

      (4) Potential rescuing of dystrophin and Nav1.5 strategy.

      Thank you very much for your appreciation of our work.

      Weaknesses:

      (1) Main concern is that the authors do not provide evidence of how LncRNA-DACH1 regulates Nav1.5 protein level. The decrease in total Nav1.5 protein by about 50% seems to be the main consequence of the LncRNA on Nav1.5, but no mechanistic information is provided as to how this occurs.

      Thank you for the insightful comment.

      (1) The mechanism of the whole article is as mentioned in the discussion at the end of the article: LncDACH1 binds to dystrophin and thus inhibits membrane trafficking of Nav1.5. Dystrophin is a well-characterized Nav1.5 partner protein. It indirectly interacts with Nav1.5 via syntrophin, which binds with the C-terminus of dystrophin and with the SIV motif on the C-terminus of Nav1.5 (Circ Res. 2006;99:407-414. doi: 10.1161/01.RES.0000237466.13252.5e) (Circulation. 2014;130:147-160. doi: 10.1161/CIRCULATIONAHA.113.007852).

      And we performed pulldown and RNA immunoprecipitation experiments to verify it (Figure 1).

      Author response image 5.

      (2) We then found that overexpression of lncDACH1 increased the ubiquitination of Nav1.5, which explains the downregulation of total Nav1.5 protein (Online Supplementary Figure 12).

      Author response image 6.

      (3) Lastly, we found that lncDACH1 failed to pull down Nav1.5 and anti-Nav1.5 did not precipitate lncDACH1 (Supplementary Fig. 1).

      Author response image 7.

      These data indicated that lncDACH1 does not interact with Nav1.5 directly. It participates in the regulation of Nav1.5 by binding to dystrophin. Cytoplasmic Nav1.5 that fails to reach the plasma membrane may be quickly recognized and then degraded through ubiquitination.

      (2) The fact that the total Nav1.5 protein is reduced by 50% which is similar to the reduction in the membrane reduction questions the main conclusion of the authors implicating dystrophin in the reduced Nav1.5 targeting. The reduction in membrane Nav1.5 could simply be due to the reduction in total protein.

      Thank you for the insightful comment. We do not rule out the possibility that the reduction in membrane Nav1.5 may be due to the reduction in total protein, but we do not think this is the main mechanism. Our data indicate that the membrane and total protein levels of Nav1.5 were reduced by 50%. However, cytoplasmic Nav1.5 was increased in the hearts of lncDACH1-TG mice compared with WT controls, rather than reduced like the membrane and total protein (Figure 1).

      Author response image 8.

      Therefore, we think the main mechanism of the whole article is as mentioned in the discussion at the end of the article: LncDACH1 binds to dystrophin and thus inhibits membrane trafficking of Nav1.5.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) In Fig. 6E the error bars are only in one direction for cF-lncDACH1. It seems that this error overlaps for NC and cF-lncDACH1 at several voltages, yet it is marked as statistically significant. Also in Fig. 7C, what statistical test was used? Do the authors account for multiple comparisons?

      Thank you for the insightful comment.

      (1) We have recalculated the two sets of data and confirmed that the difference between NC and cF-lncDACH1 in Fig. 6E is indeed statistically significant. The overlap in the picture may be only apparent visually.

      (2) The data in Fig. 7C are expressed as mean ± SEM. Statistical analysis was performed using unpaired Student’s t test or One-Way Analysis of Variance (ANOVA) followed by Tukey’s post-hoc analysis.

      (2) line 57, "The Western blot" remove "The"

      Sorry for the mistake. We have corrected it.

      (3) line 61, "The opposite data were collected" It is unclear what is meant by opposite.

      Sorry for the mistake. We have corrected it.

      (4) Lines 137-140. This sentence is complex, I would simplify as two sentences.

      Sorry for the mistake. We have corrected it.

      (5) Line 150, "We firstly validated" should be "we first validated"

      Sorry for the mistake. We have corrected it.

      (6) Line 181, "Consistently, the membrane" Is this statement meant to indicate that the experiments yielded a consistent results or that this statement is consistent with the previous one? In either case, this sentence should be reworded for clarification.

      Sorry for the mistake. We have corrected it.

      (7) Line 223, "In consistent, the ex vivo" I am not sure what In consistent means here.

      Thank you for the good suggestion. We mean that the ex vivo results are consistent with the in vivo results. We have corrected it to make it clearer.

      (8) Line 285. "a bunch of studies" could be rephrased as "multiple studies"

      Sorry for the mistake. We have corrected it.

      (9) Line 299 "produced no influence" Do you mean produced no change?

      Thank you for the good suggestion. As you put it, we mean it produced no change.

      (10) Line 325 "is to interact with the molecules" no need for "the molecules"

      Sorry for the mistake. We have corrected it.

      (11) lines 332-335. This sentence is very confusing.

      Thank you for the insightful comment. We have corrected it.

      (12) Lines 341-342. It is unnecessary to claim primacy here.

      Thank you for the good suggestion. We have removed this sentence.

      (13) Line 373. "Sodium channel remodeling is commonly occured in" perhaps rephrase as occurs commonly

      Thank you for the insightful comment. We have corrected it.

      Reviewer #2 (Recommendations For The Authors):

      Critique

      (1) Aside from some issues with presentation noted below, these data provide convincing evidence of a link between lncDACH1 and Na channel function. The identification of a lncDACH1 segment conserved among mammalian species is compelling. The observation that lncDACH1 is increased in a heart failure model provides a plausible hypothesis for disease mechanism.

      Thank you very much for your appreciation of our work.

      (2) Has a causal link between dystrophin and Na channel surface expression been made, or is it an argument based on correlation? Is it possible to rule out a direct effect of lncDACH1 on Na channel expression? A bit more discussion of the limitations of the study would help here.

      Thank you for the insightful comment.

      (1) Dystrophin is a well-characterized Nav1.5 partner protein. It indirectly interacts with Nav1.5 via syntrophin, which binds with the C-terminus of dystrophin and with the SIV motif on the C-terminus of Nav1.5 (Circ Res. 2006;99:407-414. doi: 10.1161/01.RES.0000237466.13252.5e) (Circulation. 2014;130:147-160. doi: 10.1161/CIRCULATIONAHA.113.007852).

      Author response image 9.

      (2) We performed pulldown and RNA immunoprecipitation experiments. The data showed that lncDACH1 failed to pull down Nav1.5 and anti-Nav1.5 did not precipitate lncDACH1 (Online Supplementary Figure 11). These data indicated that lncDACH1 does not interact with Nav1.5 directly (Supplementary Fig. 1).

      Author response image 10.

      (3) What normalization procedures were used for qPCR quantification? I could not find these.

      Thank you for the good suggestion. The expression levels of mRNA were calculated using the comparative cycle threshold (Ct) method (2^−ΔΔCt). Each data point was normalized to ACTIN as an internal control in each sample. The final results are expressed as fold changes by normalizing the data to the values from control subjects. We have added the normalization procedures to the methods section of the article.
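
The comparative Ct calculation described above can be sketched as follows. This is an illustrative example only: the function name `fold_change_ddct` and the Ct values are hypothetical, and the formula assumes roughly 100% amplification efficiency for both the target and the reference gene (here ACTIN).

```python
def fold_change_ddct(ct_target_sample, ct_ref_sample,
                     ct_target_control, ct_ref_control):
    """Relative expression by the comparative Ct (2^-ddCt) method."""
    # dCt: target Ct normalized to the reference gene within each condition.
    d_ct_sample = ct_target_sample - ct_ref_sample
    d_ct_control = ct_target_control - ct_ref_control
    # ddCt: treated/experimental sample relative to the control sample.
    dd_ct = d_ct_sample - d_ct_control
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values: the target amplifies two cycles earlier in the
# sample than in the control, with identical reference-gene Ct values.
fold = fold_change_ddct(ct_target_sample=22.0, ct_ref_sample=18.0,
                        ct_target_control=24.0, ct_ref_control=18.0)
```

In this made-up case the two-cycle difference corresponds to a four-fold higher expression in the sample than in the control.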

      (4) In general, I found the IF to be unconvincing - first, because the reported effects were not very apparent to me, but more importantly, because only exemplars were shown without quantification of a larger sample size.

      Thank you for the good suggestion. Accordingly, we quantified the immunostaining data. The data have been included in Supplementary Figures 2-16. The sample size is given in each caption.

      Author response image 11.

      Fluorescence intensity of lncDACH1, dystrophin and Nav1.5 in isolated cardiomyocytes of lncDACH1-TG mice. a,b, Membrane levels of dystrophin (dys) and Nav1.5. N=9 for dys. N=8 for Nav1.5. P<0.05 versus WT group. c,d, Cytoplasm levels of dystrophin and Nav1.5. N=9. P<0.05 versus WT group. e, Fluorescence in situ hybridization (FISH) images of LncDACH1. N=10. *P<0.05 versus WT group. P-values were determined by unpaired t test.

      Author response image 12.

      Fluorescence intensity of dystrophin and Nav1.5 in cultured neonatal cardiomyocyte overexpressing lncDACH1. a,b, Membrane levels of dystrophin and Nav1.5. N=9. P<0.05 versus NC group. c,d, Cytoplasm levels of dystrophin and Nav1.5. N=9 for dys. N=12 for Nav1.5. P<0.05 versus NC group. P-values were determined by unpaired t test.

      Author response image 13.

      Fluorescence intensity of lncDACH1, dystrophin and Nav1.5 in isolated cardiomyocytes of lncDACH1-cKO mice. a,b, Membrane levels of dystrophin (dys) and Nav1.5. N=12 for dys. N=8 for Nav1.5. P<0.05 versus WT group. c,d, Distribution of cytoplasm levels of dystrophin and Nav1.5. N=12. P<0.05 versus WT group. e, Fluorescence in situ hybridization (FISH) images of LncDACH1 expression. N=8. *P<0.05 versus WT group. P-values were determined by unpaired t test.

      Author response image 14.

      Fluorescence intensity of dystrophin and Nav1.5 in cultured neonatal cardiomyocytes after knocking down of lncDACH1. a,b, Distribution of membrane levels of dystrophin and Nav1.5. N=11 for dys. N=8 for Nav1.5. P<0.05 versus NC group. c,d, Distribution of cytoplasm levels of dystrophin and Nav1.5. N=12 for dys. N=9 for Nav1.5. P<0.05 versus NC group. P-values were determined by unpaired t test.

      Author response image 15.

      Fluorescence intensity of dystrophin and Nav1.5 in isolated cardiomyocytes overexpressing cF-lncDACH1. a,b, Membrane levels of dystrophin (dys) and Nav1.5. N=9 for dys. N=7 for Nav1.5. P<0.05 versus NC group. c,d, Cytoplasm levels of dystrophin and Nav1.5. N=6 for dys. N=7 for Nav1.5. P<0.05 versus NC group. P-values were determined by unpaired t test.

      Author response image 16.

      Fluorescence intensity of dystrophin and Nav1.5 in cultured neonatal cardiomyocytes overexpressing cF-lncDACH1. a,b, Membrane levels of dystrophin and Nav1.5. N=10 for dys. N=11 for Nav1.5. P<0.05 versus NC group. c,d, Cytoplasm levels of dystrophin and Nav1.5. N=7 for dys. N=6 for Nav1.5. P<0.05 versus NC group. P-values were determined by unpaired t test.

      Author response image 17.

      Fluorescence intensity of Nav1.5 in human iPS differentiated cardiomyocytes overexpressing cF-lncDACH1. a, Membrane levels of Nav1.5. N=8 for Nav1.5. P<0.05 versus NC group. b, Cytoplasm levels of Nav1.5. N=10 for Nav1.5. P<0.05 versus NC group. P-values were determined by unpaired t test.

      (5) More information on how the fractionation kit works would be helpful. How are membrane v. cytoplasm fractions identified?

      a. I presume the ER is part of the membrane fraction? When Nav1.5 is found in the cytoplasmic fraction, what subcompartment is it in - the proteasome?

      b. In the middle panel of A - is the dystrophin signal visible on the WB for WT? I assume the selected exemplar is the best of the blots and so this raises concerns. Much is riding on the confidence with which the fractions report "membrane" v "cytoplasm."

      Thank you for the insightful comment.

      (1). How the fractionation kit works:

      The kit uses centrifuge column technology to obtain plasma membrane structures with native activity and minimal cross-contamination with organelles, without the need for an ultracentrifuge, and the fractions can be used for a variety of downstream assays. Separation principle: cells/tissues are sensitized by Buffer A; the cells then pass through the centrifuge column under centrifugation at 16,000 × g, which cuts the cell membrane and ruptures the cells; the four components (nucleus, cytoplasm, organelles, and plasma membrane) are then obtained sequentially through differential centrifugation and density centrifugation and can be used for downstream detection.

      Author response image 18.

      (2). How are membrane v. cytoplasm fractions identified:

      The membrane proteins and cytosolic proteins were isolated with the kit. The internal controls we chose for the western blot experiments were N-cadherin for membrane protein and β-Actin for cytosolic protein.

      Most importantly, when we incubated either the N-cadherin primary antibody with the PVDF membrane carrying the cytosolic protein, or the primary antibody of the cytosolic control β-Actin with the PVDF membrane carrying the membrane protein, no protein bands were obtained in the scan results.

      Author response image 19.

      (6) More detail in Results, figures, and figure legends will assist the reader.

      a. In Fig. 5, it would be helpful to label sinus rhythm vs. arrhythmia segments.

      Thank you for the good suggestion. We've marked the sinus rhythm and arrhythmia segments with arrows.

      Author response image 20.

      b. Please explain in the figure legend what the red bars in 5A are

      Thank you for the insightful comment. We've added the explanation to the figure legend. The red lines in the ECG traces indicate VT duration.

      c. In 5C, what the durations pertain to.

      Thank you for the good suggestion. 720 ms-760 ms refers to the duration of one action potential, with 720 ms being the peak of one action potential and 760 ms being the peak of another action potential. The interval duration is not fixed; in this article, we use 10 ms as the interval to count the phase singularities from the consecutive phase maps, because the shorter the interval duration, the larger the sample size and the more convincing the data.

      d. In the text, please define "breaking points" and explain what the physiological underpinning is. Define "phase singularity."

      Thank you for the insightful comment. Cardiac excitation can be viewed as an electrical wave, with a wavefront corresponding to the action potential upstroke (phase 0) and a waveback corresponding to rapid repolarization (phase 3). Under normal circumstances, cardiac conduction consists of a sequence of well-ordered action potentials; in the optical mapping results, different colors represent different phases. When a wave propagates through normal cardiac tissue, the wavefront and waveback never touch. When arrhythmias occur, owing to factors such as reentry, the activation contour meets the refractory contour and the wave breaks up, initiating a new spiral reentry. In the optical mapping phase maps, the colors representing the different phases (including depolarization and repolarization) converge to form a vortex, and the center of the vortex is defined as the phase singularity.
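
      As an illustration of the definition above, the following is a minimal sketch of how phase singularities can be located in a single phase map by computing the topological charge around each 2 × 2 pixel loop. It is not the analysis code used in this study; the function names `wrap_phase` and `find_phase_singularities` and the synthetic spiral in the usage note are assumptions made for illustration only.

```python
import numpy as np


def wrap_phase(delta):
    """Wrap phase differences (radians) into the interval [-pi, pi)."""
    return np.mod(delta + np.pi, 2 * np.pi) - np.pi


def find_phase_singularities(phase):
    """Topological charge of every 2 x 2 pixel loop in a 2D phase map.

    `phase` is a 2D array of phase values in radians. The result has shape
    (rows - 1, cols - 1); an entry of +1 or -1 marks a loop around which the
    phase winds by a full cycle (a phase singularity), and 0 marks no winding.
    """
    p00 = phase[:-1, :-1]  # corner (i, j)
    p01 = phase[:-1, 1:]   # corner (i, j + 1)
    p11 = phase[1:, 1:]    # corner (i + 1, j + 1)
    p10 = phase[1:, :-1]   # corner (i + 1, j)
    # Sum the wrapped phase steps along the closed path around each loop.
    total = (
        wrap_phase(p01 - p00)
        + wrap_phase(p11 - p01)
        + wrap_phase(p10 - p11)
        + wrap_phase(p00 - p10)
    )
    return np.rint(total / (2 * np.pi)).astype(int)
```

      For example, a synthetic spiral built with `np.arctan2(y - 4.5, x - 4.5)` on a 10 × 10 grid contains a single loop with nonzero charge, at the vortex center, whereas a planar wave contains none. Counting the nonzero entries in each of a series of consecutive phase maps would give a per-frame singularity count.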

      (7) In reflecting on why enhanced INa is not proarrhythmic, it is noted that the kinetics are not altered. I agree that is key, but perhaps the consequence could be better articulated. Because lncDACH1 does not alter Nav1.5 gating, the late Na current may not be enhanced to the same effect as observed with LQT gain-of-function Nav1.5 mutations, in which APD prolongation is attributed to gating defects that increase late Na current.

      Thank you for the good suggestion. Your explanation is insightful and important for this article. We have revised the Discussion section and added this explanation to it.

      Reviewer #3 (Recommendations For The Authors):

      (1) Experiments to specifically address the reduction in total Nav1.5 protein should be included.

      Thank you for the insightful comment. We examined the ubiquitination of Nav1.5. We found that overexpression of lncDACH1 increased the ubiquitination of Nav1.5, which explains the downregulation of total Nav1.5 protein (Online Supplementary Figure 12).

      Author response image 21.

      (2) Experiments to convincingly demonstrate that LncRNA-DACH1 regulates Nav1.5 targeting via dystrophin are missing. As it is, total reduction in Nav1.5 seems to be the explanation as to why there is a decrease in membrane Nav1.5.

      Thank you for the insightful comment. We performed pulldown and RNA immunoprecipitation experiments. The data showed that lncDACH1 can pull down dystrophin (Figure 1) but failed to pull down Nav1.5, and anti-Nav1.5 did not precipitate lncDACH1 (Supplementary Fig. 1). These data indicate that lncDACH1 does not interact with Nav1.5 directly. Instead, it participates in the regulation of Nav1.5 by binding to dystrophin.

      Author response image 22.

    2. Reviewer #1 (Public Review):

      Summary:

      In this study, the authors show that a long noncoding RNA, lncDACH1, inhibits sodium currents in cardiomyocytes by binding to and altering the localization of dystrophin. The authors use a number of methodologies to demonstrate that lncDACH1 binds to dystrophin and disrupts its localization to the membrane, which in turn downregulates NaV1.5 currents. Knockdown of lncDACH1 upregulates NaV1.5 currents. Furthermore, in heart failure, lncDACH1 is shown to be upregulated, which suggests that this mechanism may have pathophysiological relevance.

      Strengths:

      (1) This study presents a novel mechanism of Na channel regulation which may be pathophysiologically important.

      (2) The experiments are comprehensive and systematically evaluate the physiological importance of lncDACH1.

    3. Reviewer #3 (Public Review):

      Summary:

      In this manuscript, the authors report the first evidence of Nav1.5 regulation by a long noncoding RNA, LncRNA-DACH1, and suggest its implication in the reduction in sodium current observed in heart failure. Since no direct interaction is observed between Nav1.5 and the LncRNA, they propose that the regulation is via dystrophin and targeting of Nav1.5 to the plasma membrane.

      Strengths:

      (1) First evidence of Nav1.5 regulation by a long noncoding RNA.

      (2) Implication of LncRNA-DACH1 in heart failure and mechanisms of arrhythmias.

      (3) Demonstration of LncRNA-DACH1 binding to dystrophin.

      (4) Potential rescuing of dystrophin and Nav1.5 strategy.

      Weaknesses:

      (1) The fact that total Nav1.5 protein is reduced by 50%, which is similar to the reduction in membrane Nav1.5, calls into question the authors' main conclusion implicating dystrophin in the reduced Nav1.5 targeting. The reduction in membrane Nav1.5 could simply be due to the reduction in total protein.

    4. eLife assessment

      This study presents an important contribution to cardiac arrhythmia research by demonstrating that long noncoding RNA Dachshund homolog 1 (lncDACH1) tunes sodium channel functional expression and affects cardiac action potential conduction and rhythms. The evidence supporting the major claims is solid. The work will be of broad interest to cell biologists and cardiac electrophysiologists.

    5. Reviewer #2 (Public Review):

      This manuscript by Xue et al. describes the effects of a long noncoding RNA, lncDACH1, on the localization of Nav channel expression, the magnitude of INa, and arrhythmia susceptibility in the mouse heart. Because lncDACH1 was previously reported to bind and disrupt membrane expression of dystrophin, which in turn is required for proper Nav1.5 localization, much of the findings are inferred through the lens of dystrophin alterations.

      The results report that cardiomyocyte-specific transgenic overexpression of lncDACH1 reduces INa in isolated cardiomyocytes; measurements in whole heart show a corresponding reduction in conduction velocity and enhanced susceptibility to arrhythmia. The effect on INa was confirmed in isolated WT mouse cardiomyocytes infected with a lncDACH1 adenoviral construct. Importantly, reducing lncDACH1 expression via either a cardiomyocyte-specific knockout or using shRNA had the opposite effect: INa was increased in isolated cells, as was conduction velocity in heart. Experiments were also conducted with a fragment of lncDACH1 identified by its conservation with other mammalian species. Overexpression of this fragment resulted in reduced INa and greater proarrhythmic behavior. Alteration of expression was confirmed by qPCR.

      The mechanism by which lncDACH1 exerts its effects on INa was explored by measuring protein levels from cell fractions and immunofluorescence localization in cells. In general, overexpression was reported to reduce Nav1.5 and dystrophin levels and knockout or knockdown increased them.

      The strengths of this manuscript include convincing evidence of a link between lncDACH1 and Na channel function. The identification of a lncDACH1 segment conserved among mammalian species is compelling. The observation that lncDACH1 is increased in a heart failure model provides a plausible hypothesis for disease mechanism.

      One limitation of the fractionation approach is the uncertain disposition of Na channel protein deemed "cytoplasmic." It seems likely that the membrane fraction includes ER membrane. The signal may reasonably be attributed to Na channel protein in stalled transport vesicles, or alternatively in stress granules, but this was not directly addressed.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This study provides an important cell atlas of the gill of the mussel Gigantidas platifrons using a single nucleus RNA-seq dataset, a resource for the community of scientists studying deep sea physiology and metabolism and intracellular host-symbiont relationships. The work, which offers solid insights into cellular responses to starvation stress and molecular mechanisms behind deep-sea chemosymbiosis, is of relevance to scientists interested in host-symbiont relationships across ecosystems.

      Public Reviews:

      Reviewer #1 (Public Review):

      Wang et al have constructed a comprehensive single nucleus atlas for the gills of the deep sea Bathymodioline mussels, which possess intracellular symbionts that provide a key source of carbon and allow them to live in these extreme environments. They provide annotations of the different cell states within the gills, shedding light on how multiple cell types cooperate to give rise to the emergent functions of the composite tissues and the gills as a whole. They pay special attention to characterizing the bacteriocyte cell populations and identifying sets of genes that may play a role in their interaction with the symbiotes.

      Wang et al sample mussels from 3 different environments: animals from their native methane-rich environment, animals transplanted to a methane-poor environment to induce starvation, and animals that have been starved in the methane-poor environment and then moved back to the methane-rich environment. They demonstrated that starvation had the biggest impact on bacteriocyte transcriptomes. They hypothesize that the upregulation of genes associated with lysosomal digestion leads to the digestion of the intracellular symbiont during starvation, while the non-starved and reacclimated groups more readily harvest the nutrients from symbiotes without destroying them.

      Strengths:

      This paper makes available a high-quality dataset that is of interest to many disciplines of biology. The unique qualities of this non-model organism and the collection of conditions sampled make it of special interest to those studying deep sea adaptation, the impact of environmental perturbation on Bathymodioline mussels populations, and intracellular symbiotes. The authors do an excellent job of making all their data and analysis available, making this not only an important dataset but a readily accessible and understandable one.

      The authors also use a diverse array of tools to explore their data. For example, the quality of the data is augmented by the use of in situ hybridizations to validate cluster identity and KEGG analysis provides key insights into how the transcriptomes of bacteriocytes change.

      The authors also do a great job of providing diagrams and schematics to help orient non-mussel experts, thereby widening the audience of the paper.

      We thank the reviewer for the valuable feedback on our study. We are grateful that the reviewers found our work to be interesting and we appreciate their thorough evaluation of our research. Their constructive comments will be considered as we continue to develop and improve our study.

      Weaknesses:

      One of the main weaknesses of this paper is the lack of coherence between the images and the text, with some parts of the figures never being referenced in the body of the text. This makes it difficult for the reader to interpret how they fit in with the author's discussion and assess confidence in their analysis and interpretation of data. This is especially apparent in the cluster annotation section of the paper.

      We appreciate the feedback and suggestions provided by the reviewer, and we have revised our manuscript to make it more accessible to general audiences.

      Another concern is the linking of the transcriptomic shifts associated with starvation with changes in interactions with the symbiotes. Without examining and comparing the symbiote population between the different samples, it cannot be concluded that the transcriptomic shifts correlate with a shift to the 'milking' pathway and not other environmental factors. Without comparing the symbiote abundance between samples, it is difficult to disentangle changes in cell state that are due to their changing interactions with the symbiotes from other environmental factors.

      We are grateful for the valuable feedback and suggestions provided by the reviewer. Our keen interest lies in understanding symbiont responses, particularly at the single-cell level. However, it's worth noting that existing commercial single-cell RNA-seq technologies rely on oligo dT priming for reverse transcription and barcoding, thus omitting bacterial gene expression information from our dataset. We hope that advancements in technology will soon enable us to perform an integrated analysis encompassing both host and symbiont gene expression.

      Additionally, conclusions in this area are further complicated by using only snRNA-seq to study intracellular processes. This is limiting since cytoplasmic mRNA is excluded and only nuclear reads are sequenced after the organisms have had several days to acclimate to their environment and major transcriptomic shifts have occurred.

      We appreciate the comments shared by the reviewer and agree that scRNA-seq provides more comprehensive transcriptional information by targeting the entire mRNA of the cell. However, we would like to highlight that snRNA-seq has some unique advantages over scRNA-seq. Notably, snRNA-seq allows for simple snap-freezing of collected samples, facilitating easier storage, particularly for samples obtained during field trips involving deep-sea animals and other ecologically significant non-model animal samples. Additionally, unlike scRNA-seq, snRNA-seq eliminates the need for tissue dissociation, which often involves prolonged enzymatic treatment of deep-sea animal tissue/cells under atmospheric pressure. This process can potentially lead to the loss of sensitive cells or alterations in gene expression. Moreover, snRNA-seq procedures disregard the size and shape of animal cells, rendering it a superior technology for constructing the cell atlas of animal tissues. Consequently, we assert that snRNA-seq offers flexibility and represents a suitable choice for the research objects of our current research.

      Reviewer #2 (Public Review):

      Wang, He et al. shed insight into the molecular mechanisms of deep-sea chemosymbiosis at the single-cell level. They do so by producing a comprehensive cell atlas of the gill of Gigantidas platifrons, a chemosymbiotic mussel that dominates the deep-sea ecosystem. They uncover novel cell types and find that the gene expression of bacteriocytes, the symbiont-hosting cells, supports two hypotheses of host-symbiont interactions: the "farming" pathway, where symbionts are directly digested, and the "milking" pathway, where nutrients released by the symbionts are used by the host. They perform an in situ transplantation experiment in the deep sea and reveal transitional changes in gene expression that support a model where starvation stress induces bacteriocytes to "farm" their symbionts, while recovery leads to the restoration of the "farming" and "milking" pathways.

      A major strength of this study includes the successful application of advanced single-nucleus techniques to a non-model, deep-sea organism that remains challenging to sample. I also applaud the authors for performing an in situ transplantation experiment in a deep-sea environment. From gene expression profiles, the authors deftly provide a rich functional description of G. platifrons cell types that is well-contextualized within the unique biology of chemosymbiosis. These findings offer significant insight into the molecular mechanisms of deep-sea host-symbiont ecology, and will serve as a valuable resource for future studies into the striking biology of G. platifrons.

      The authors' conclusions are generally well-supported by their results. However, I recognize that the difficulty of obtaining deep-sea specimens may have impacted experimental design. In this area, I would appreciate more in-depth discussion of these impacts when interpreting the data.

      We thank the reviewer for their valuable feedback on our study. We're grateful that the reviewers found our work interesting, and we appreciate their thorough evaluation of our research. We'll consider their constructive comments as we continue to develop and improve our study.

      Because cells from multiple individuals were combined before sequencing, the in situ transplantation experiment lacks clear biological replicates. This may potentially result in technical variation (ie. batch effects) confounding biological variation, directly impacting the interpretation of observed changes between the Fanmao, Reconstitution, and Starvation conditions. It is notable that Fanmao cells were much more sparsely sampled. It appears that fewer cells were sequenced, resulting in the Starvation and Reconstitution conditions having 2-3x more cells after doublet filtering. It is not clear whether this is due to a technical factor impacting sequencing or whether these numbers are the result of the unique biology of Fanmao cells. Furthermore, from Table S19 it appears that while 98% of Fanmao cells survived doublet filtering, only ~40% and ~70% survived for the Starvation and Reconstitution conditions respectively, suggesting some kind of distinction in quality or approach.

      There is a pronounced divergence in the relative proportions of cells per cell type cluster in Fanmao compared to Reconstitution and Starvation (Fig. S11). This is potentially a very interesting finding, but it is difficult to know if these differences are the expected biological outcome of the experiment or the fact that Fanmao cells are much more sparsely sampled. The study also finds notable differences in gene expression between Fanmao and the other two conditions- a key finding is that bacteriocytes had the largest Fanmao-vs-starvation distance (Fig. 6B). But it is also notable that for every cell type, one or both comparisons against Fanmao produced greater distances than comparisons between Starvation and Reconstitution (Fig. 6B). Again, it is difficult to interpret whether Fanmao's distinctiveness from the other two conditions is underlain by fascinating biology or technical batch effects. Without biological replicates, it remains challenging to disentangle the two.

      As highlighted by the reviewer, our experimental design involves pooling multiple biological samples within a single treatment state before sequencing. We acknowledge the concern regarding the absence of distinct biological replicates and the potential impact of batch effects on result interpretation. While we recognize the merit of conducting multiple sequencing runs for a single treatment to provide genuine biological replicates, we contend that batch effects may not exert a strong influence on the observed patterns.

      In addition, we applied a bootstrap sampling algorithm to assess whether the gene expression patterns within a cluster are more similar than those between clusters. This algorithm involves selecting a portion of cells per cluster and examining whether this subset remains distinguishable from other clusters. Our assumption was that if different samples exhibited distinct expression patterns due to batch effect, the co-assignment probabilities of a cluster would be very low. This expectation was not met in our data, as illustrated in Fig. S2. The lack of significantly low co-assignment probabilities within clusters suggests that batch effects may not exert a strong influence on our results.

      Indeed, we acknowledge a noticeable shift in the expression patterns of certain cell types, such as the bacteriocyte. However, this is not universally applicable across all cell types. For instance, the UMAP figure in Fig. 6A illustrates a substantial overlap among basal membrane cell 2 from Fanmao, Starvation, and Reconstitution treatments, and the centroid distances between the three treatments are subtle, as depicted in Fig. 6B. This consistent pattern is also observed in DEPC, smooth muscle cells, and the food groove ciliary cells.
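
      To make the notion of a centroid distance concrete, the sketch below computes pairwise Euclidean distances between per-treatment centroids in a low-dimensional embedding. It is an illustration only: whether the distances in Fig. 6B were computed in UMAP space or another space is not stated here, and the function name `centroid_distances` and its inputs are assumptions rather than the authors' code.

```python
from itertools import combinations

import numpy as np


def centroid_distances(embedding, labels):
    """Pairwise Euclidean distances between group centroids.

    `embedding` is an (n_cells, n_dims) array of coordinates for the cells of
    one cell type, and `labels` gives the treatment of each cell. Returns a
    dict keyed by alphabetically ordered (treatment_a, treatment_b) pairs.
    """
    embedding = np.asarray(embedding, dtype=float)
    labels = np.asarray(labels)
    # Mean coordinate of the cells belonging to each treatment.
    centroids = {
        group: embedding[labels == group].mean(axis=0)
        for group in sorted(set(labels.tolist()))
    }
    return {
        (a, b): float(np.linalg.norm(centroids[a] - centroids[b]))
        for a, b in combinations(centroids, 2)
    }
```

      Applied per cell type, a small distance for every treatment pair would correspond to the overlapping pattern described above for basal membrane cell 2, while a large Fanmao-versus-Starvation distance would correspond to the shift described for bacteriocytes.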

      The reviewer also noted variations in the number of cells per treatment. Specifically, Fanmao sequencing yielded fewer than 10 thousand cells, whereas the other two treatments produced 2-3 times more cells after quality control (QC). It is highly probable that the technician loaded different quantities of cells into the machine for single-nucleus sequencing—a not uncommon occurrence in this methodology. While loading more cells may increase the likelihood of doublets, it is crucial to emphasize that this should not significantly impact the expression patterns post-QC. It's worth noting that overloading samples has been employed as a strategic approach to capture rare cell types, as discussed in a previous study (reference: 10.1126/science.aay0267).

      The reviewer highlighted the discrepancy in cell survival rates during the 'doublet filtering' process, with 98% of Fanmao cells surviving compared to approximately 40% and 70% for the Starvation and Reconstitution conditions, respectively. It's important to clarify that the reported percentages reflect the survival of cells through a multi-step QC process employing various filtering strategies.

      Post-doublet removal, we filtered out cells with <100 or >2500 genes and <100 or >6000 unique molecular identifiers (UMIs). Additionally, genes with <10 UMIs in each data matrix were excluded. The observed differences in survival rates for Starvation and Reconstitution cells can be attributed to the total volume of data generated in Illumina sequencing. Specifically, we sequenced approximately 91 GB of data for Fanmao, ~196 GB for Starvation, and ~249 GB for Reconstitution. As a result, the qualified data obtained for Starvation and Reconstitution conditions was only about twice that of Fanmao due to the limited data volume.
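
      The thresholds above can be expressed as a short filter on a cells-by-genes count matrix. This is a minimal sketch under stated assumptions, not the pipeline used in the study: the function name `qc_filter` is hypothetical, the input is assumed to be a dense NumPy array of UMI counts after doublet removal, and the per-gene UMI total is assumed to be computed over the retained cells.

```python
import numpy as np


def qc_filter(counts, min_genes=100, max_genes=2500,
              min_umis=100, max_umis=6000, min_gene_umis=10):
    """Apply cell-level and gene-level QC thresholds to a count matrix.

    `counts` is a (n_cells, n_genes) array of UMI counts. Returns the
    filtered matrix together with the boolean cell and gene masks.
    """
    counts = np.asarray(counts)
    genes_per_cell = (counts > 0).sum(axis=1)  # number of detected genes
    umis_per_cell = counts.sum(axis=1)         # total UMIs per cell
    cell_mask = (
        (genes_per_cell >= min_genes) & (genes_per_cell <= max_genes)
        & (umis_per_cell >= min_umis) & (umis_per_cell <= max_umis)
    )
    # Assumption: gene totals are taken over the cells that passed QC.
    gene_mask = counts[cell_mask].sum(axis=0) >= min_gene_umis
    return counts[cell_mask][:, gene_mask], cell_mask, gene_mask
```

      The default arguments mirror the thresholds quoted above; the fraction of cells retained per treatment can then be read from `cell_mask.mean()`.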

      The reviewer also observed a divergence in the relative proportions of cells per cell type cluster in Fanmao compared to Reconstitution and Starvation, as depicted in Fig. S1. This discrepancy may hold true biological significance, presenting a potentially intriguing finding. However, our discussion on this pattern was rather brief, as we acknowledge that the observed differences could be influenced by the sample preparation process for dissection and digestion. It is crucial to consider that cutting a slightly different area during dissection may result in variations in the proportion of cells obtained. While we recognize the potential impact of this factor, we do not think that the sparsity of sampling alone could significantly affect the relative proportions of cells per cell type.

      In conclusion, we acknowledge the reviewer's suggestion that sequencing multiple individual samples per treatment condition would have been ideal, rather than pooling them together. However, the homogeneous distribution observed in UMAP and the consistent results obtained from bootstrap sampling suggest that the impact of batch effects on our analyses is likely not substantial. Additionally, based on our understanding, the smaller number of cells in the Fanmao sample should not significantly affect the proportions of cells or the expression patterns of each cluster.

      Reviewer #3 (Public Review):

      Wang et al. explored the unique biology of the deep-sea mussel Gigantidas platifrons to understand the fundamental principles of animal-symbiont relationships. They used single-nucleus RNA sequencing, together with validation and visualization of many of the important cellular and molecular players that allow these organisms to survive in the deep sea. They demonstrate a diversity of cell types that support the structure and function of the gill, including bacteriocytes, specialized epithelial cells that host sulfur-oxidizing or methane-oxidizing symbionts, as well as a suite of other cell types including supportive, ciliary, and smooth muscle cells. By transplanting mussels from a methane-rich habitat to methane-limited environments, the authors showed that starved mussels may consume endosymbionts, whereas mussels in methane-rich environments upregulated genes involved in glutamate synthesis. These data add to the growing body of literature showing that organisms control their endosymbionts in response to environmental change.

      The conclusions of the data are well supported. The authors adapted a technique that would have been technically impossible in their field environment by preserving the tissue and then performing nuclear isolation after the fact. The use of single-nucleus sequencing opens the possibility of new cellular and molecular biology that is not possible to study in the field. Additionally, the in-situ data (both WISH and FISH) are high-quality and easy to interpret. The use of cell-type-specific markers along with a symbiont-specific probe was effective. Finally, the SEM and TEM were used convincingly for specific purposes in the case of showing the cilia that may support water movement.

      We appreciate the valuable feedback provided by the reviewer on our study. It is encouraging to know that our work was found to be interesting and that they conducted a thorough evaluation of our research. We will take their constructive comments into account as we strive to develop and enhance our study. We thank the reviewer for all the input.

      The one particular area for clarification and improvement surrounds the concept of a proliferative progenitor population within the gill. The authors imply that three types of proliferative cells within gills have long been known, but their study may be the first to recover molecular markers for these putative populations. The markers the authors present for gill posterior end budding zone cells (PEBZCs) and dorsal end proliferation cells (DEPCs) are not intuitively associated with cell proliferation and some additional exploration of the data could be performed to strengthen the argument that these are indeed proliferative cells. The authors do utilize a trajectory analysis tool called Slingshot which they claim may suggest that PEBZCs could be the origin of all gill epithelial cells, however, one of the assumptions of this analysis is that differentiated cells are developed from the same precursor PEBZC population.

      However, these conclusions do not detract from the overall significance of the work of identifying the relationship between symbionts and bacteriocytes and how these host bacteriocytes modulate their gene expression in response to environmental change. It will be interesting to see how similar or different these data are across animal phyla. For instance, the work of symbiosis in cnidarians may converge on similar principles or there may be independent ways in which organisms have been able to solve these problems.

      We are grateful for the valuable comments and suggestions provided by the reviewer. All suggestions have been carefully considered, and the manuscript has been revised accordingly. We particularly value the reviewer's insights regarding the characterization of the G. platifrons gill proliferative cell populations. In a separate research endeavor, we have conducted experiments utilizing both cell division and cell proliferation markers on these proliferative cell populations. While these results are not incorporated into the current manuscript, we would be delighted to share our preliminary findings with the reviewer. Our preliminary results indicate that the proliferative cell populations exhibit positivity for cell proliferation markers and contain a significant number of mitotic cells.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Further experiments are needed to link the changes in transcriptomes of Bathymodioline mussels in the different environmental conditions to changes in their interactions with symbiotes. For example, quantifying the abundance and comparing the morphology of symbiotes between the environmental conditions would lend much support for shifting between milking and farming strategies. Without analyzing the symbiotes and comparing them across populations, it is difficult to comment on the mechanisms of interactions between symbiotes and the hosts. Without this analysis, this data is better suited towards comments about the general effect of environmental perturbation and stress on gene expression in these mussels.

      We appreciate the reviewer’s comments. We are also very curious about the symbiont responses, especially at the single-cell level. However, all the current commercial single-cell RNA-seq technologies are based on oligo dT priming for reverse transcription and barcoding. Therefore, the bacterial gene expression information is omitted from our dataset. Hopefully, with the development of technology, we could conduct an integrated analysis of both host and symbiont gene expression soon.

      Additionally, clarification is needed on which types of symbiotes are being looked at. Are they MOX or SOX populations? Are they homogenous? What are the concentrations of sulfur at the sampled sites?

      We thank you for your valuable comments and suggestions. Gigantidas platifrons harbors a MOX endosymbiont population characterized by a single 16S rRNA phylotype. We apologize for any confusion resulting from our previous wording. To clarify, we have revised lines 57-59 of our introduction.

      In the text and images, consider using standardized gene names and leaving out the genome coordinates. This would greatly help with readability. Also, be careful to properly follow gene naming and formatting conventions (ie italicizing gene names and symbols).

      We appreciate the reviewer’s insightful comments. In model animals, gene nomenclature often stems from forward genetic approaches, such as the identification of loss-of-function mutants. These gene names, along with their protein products, typically correspond to unique genome coordinates. Conversely, in non-model invertebrates (e.g., Gigantidas platifrons in the present study), gene prediction relies on a combination of bioinformatics methods, including de novo prediction, homolog-based prediction, and transcriptomics mapping. Subsequently, the genes are annotated by identifying their best homologs in well-characterized databases. Given that different genes may encode proteins with similar annotated functions, we chose to include both the gene ID (genome coordinates) and the gene name in our manuscript. This dual labeling approach ensures that our audience receives accurate and comprehensive information regarding gene identification and annotation.

      Additionally, extending KEGG analysis to the atlas annotation section could help strengthen the confidence of annotations. For example, when identifying bacteriocyte populations, the functional categories of individual marker genes (lysosomal proteases, lysosomal traffic regulators, etc) are used to justify the annotation. Presenting KEGG support that these functional categories are upregulated in this population relative to others would help further support how you characterize this cluster by showing it's not just a few specific genes that are enriched in this cell group, but rather an overall functionality.

      We appreciate the valuable suggestion provided by the reviewer. Indeed, incorporating KEGG analysis into the atlas annotation section could further enhance the confidence in our annotations. However, in our study, we encountered some limitations that impeded us from conducting a comprehensive KEGG enrichment analysis.

      Firstly, the number of differentially expressed genes (DEGs) that we identified for certain cell populations was relatively small, making it challenging to meet the threshold required for meaningful KEGG enrichment analysis. For instance, among the 97 marker genes identified for the Bacteriocyte cluster, only two genes, Bpl_scaf_59648-4.5 (lysosomal alpha-glucosidase-like) and Bpl_scaf_52809-1.6 (lysosomal-trafficking regulator-like isoform X1), were identified as lysosomal genes. To generate reliable KEGG enrichments, a larger number of genes is typically required.

      Secondly, single-nucleus sequencing, as employed in our study, tends to yield a relatively smaller number of genes per cell compared to bulk RNA sequencing. This limited gene yield can make it challenging to achieve sufficient gene representation for rigorous KEGG enrichment analysis.

      Furthermore, many genes in the genome still lack comprehensive annotation, both in terms of KEGG and GO annotations. In our dataset, out of the 33,584 genes obtained through single-nuclei sequencing, 26,514 genes have NO KEGG annotation, and 25,087 genes have NO GO annotation. This lack of annotations further restricts the comprehensive application of KEGG analysis in our study.
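
      For illustration only, the arithmetic behind these figures can be checked with a short Python sketch. It is not part of the study's pipeline. The gene counts come from the response above; the pathway size of 120 genes and the assumption that all 97 marker genes fall within the KEGG-annotated background are hypothetical placeholders, chosen only to show how weak an over-representation signal two hits would give.

```python
from math import comb

# Counts quoted in the response above.
TOTAL_GENES = 33_584
NO_KEGG = 26_514
NO_GO = 25_087

kegg_annotated = TOTAL_GENES - NO_KEGG
go_annotated = TOTAL_GENES - NO_GO
print(f"KEGG-annotated: {kegg_annotated} ({kegg_annotated / TOTAL_GENES:.1%})")  # 7070 (21.1%)
print(f"GO-annotated:   {go_annotated} ({go_annotated / TOTAL_GENES:.1%})")      # 8497 (25.3%)


def hypergeom_upper_tail(k, background, pathway, drawn):
    """P(X >= k) for a hypergeometric draw (over-representation test).

    background: annotated genes in total
    pathway:    annotated genes belonging to the pathway of interest
    drawn:      marker genes tested
    """
    upper = min(pathway, drawn)
    total = comb(background, drawn)
    hits = sum(
        comb(pathway, i) * comb(background - pathway, drawn - i)
        for i in range(k, upper + 1)
    )
    return hits / total


# HYPOTHETICAL: a lysosome-like pathway with 120 annotated genes, and all
# 97 bacteriocyte markers assumed to lie in the KEGG-annotated background.
HYPOTHETICAL_PATHWAY_SIZE = 120
p_value = hypergeom_upper_tail(
    k=2, background=kegg_annotated, pathway=HYPOTHETICAL_PATHWAY_SIZE, drawn=97
)
print(f"P(at least 2 pathway genes among 97 markers) = {p_value:.2f}")
```

      Under these placeholder numbers, two pathway genes among 97 markers is roughly what chance alone would produce, which is consistent with the point that more annotated genes are needed for a meaningful enrichment test.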

      The claim that VEPCs are symbiote free is not demonstrated. Additional double in situs are needed to show that markers of this cell type localize in regions free of symbiotes.

      We appreciate your comments and suggestions. In Figure 5B, our results demonstrate that the bacteriocytes (green fluorescent signal) are distant from the VEPCs, which are located around the tip of the gill filaments (close to the food groove). We have revised our Figure 5B to make it clear.

      Additionally, it does not seem like trajectory analysis is appropriate for these sampling conditions. Generally, to create trajectories confidently, more closely sampled time points are needed to sufficiently parse out the changes in expression. More justification is needed for the use of this type of analysis here and a discussion of the limitations should be mentioned, especially when discussing the hypotheses relating to PEBZCs, VEPCs, and DEPCs.

      We greatly appreciate your thoughtful commentary. We acknowledge that, in a developmental study, more closely spaced time points are indeed valuable. In our ongoing project investigating mouse development, for instance, we sample at 24-hour intervals. However, for deep-sea adult animals, we hypothesized that transcriptional shifts would be slower in such an extreme environment, which led us to opt for a time interval of 3-7 days. Examining the differential expression profiles among the three treatments, we observed that most cell types exhibited minimal changes in their expression profiles. For the cell types strongly impacted by in situ transplantation, the expression profiles of each cell type still overlapped extensively in the UMAP analysis (Figure 6a), thus enabling meaningful comparisons. Nevertheless, we recognize that our sampling strategy may not be flawless. Additionally, the challenging nature of conducting in situ transplantation at a depth of 1000 meters limited the number of sampling occasions available to us. We sincerely appreciate your input and understanding.

      Finally, more detail should be added on the computational methods used in this paper. For example, the single-cell genomics analysis protocol should be expanded on so that readers unfamiliar with BD single-cell genomics handbooks could replicate the analysis. More detail is also needed on what criteria and cutoffs were used to calculate marker genes. Also, please be careful to cite the algorithms and software packages mentioned in the text.

      Acknowledged, thank you for highlighting this. In essence, the workflow closely resembles the 10x Genomics workflow (although 10x Genomics uses different software, i.e., Cell Ranger). We explain the workflow in more detail below, while noting that this information may no longer be relevant for newer BD users or for those unfamiliar with BD, given that the workflow underwent a complete overhaul in the summer of 2023.

      References to lines

      Line 32: typo "..uncovered unknown tissue heterogeny" should read "uncovering" or "and uncovered"

      Overall abstract could include more detail of findings (ex: what are the "shifts in cell state" in line 36 that were observed)

      We apologize for the mistakes, and have revised the manuscript accordingly.

      Line 60: missing comma "...gill filament structure, but also"

      We apologize for the mistakes, and have revised the manuscript accordingly.

      Line 62-63: further discussion here, or in the relevant sections of the specific genes identified in the referenced bulk RNA-seq project could help strengthen confidence in annotation

      We appreciate the comment, and have revised the manuscript accordingly.

      Line 112: what bootstrapping strategy? Applied to what?

      This is a bootstrap sampling algorithm, developed in a recent bioRxiv preprint, that assesses the robustness of each cell cluster. (Singh, P. & Zhai, Y. Deciphering Hematopoiesis at single cell level through the lens of reduced dimensions. bioRxiv, 2022.2006.2007.495099 (2022). https://doi.org:10.1101/2022.06.07.495099)
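
      The general idea of bootstrap-based cluster robustness can be sketched as follows. This is a generic toy illustration, not the algorithm of the cited preprint: the "clustering" is a deliberately trivial split at the mean of one hypothetical expression value per cell, and robustness is scored as the mean best-match Jaccard index between each reference cluster and the clusters recovered from resampled cells.

```python
import random


def split_at_mean(cell_ids, values):
    """Toy clustering (hypothetical stand-in): split cells at the mean value."""
    mean = sum(values[c] for c in cell_ids) / len(cell_ids)
    low = {c for c in cell_ids if values[c] < mean}
    high = {c for c in cell_ids if values[c] >= mean}
    return [low, high]


def jaccard(a, b):
    """Jaccard index of two sets of cell IDs."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def bootstrap_stability(values, n_boot=100, seed=0):
    """Mean best-match Jaccard score per reference cluster over bootstrap resamples."""
    rng = random.Random(seed)
    cell_ids = list(values)
    reference = split_at_mean(cell_ids, values)
    scores = [[] for _ in reference]
    for _ in range(n_boot):
        # Resample cells with replacement, then re-run the clustering.
        resampled = rng.choices(cell_ids, k=len(cell_ids))
        present = set(resampled)
        boot_clusters = split_at_mean(resampled, values)
        for i, ref in enumerate(reference):
            kept = ref & present  # reference members that were drawn
            scores[i].append(max(jaccard(kept, b) for b in boot_clusters))
    return [sum(s) / len(s) for s in scores]


# Two well-separated toy groups of 50 cells each (hypothetical data).
values = {f"a{i}": 0.01 * i for i in range(50)}
values.update({f"b{i}": 10 + 0.01 * i for i in range(50)})
stability = bootstrap_stability(values)
print(stability)  # well-separated groups should score close to 1
```

      Clusters that score near 1 are recovered almost unchanged in every resample; clusters whose scores drop well below 1 depend on a few cells and deserve closer inspection.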

      Lines 127-129: What figures demonstrate the location of the inter lamina cells? Are there in situs that show this?

      We apologize for any errors; the referencing of figures in the manuscript has been revised for clarity.

      Lines 185-190: does literature support these as markers of SMCs? Are they known smooth muscle markers in other systems?

      We characterized the SMCs by the expression of LDL-associated protein, angiotensin-converting enzyme-like protein, and the "molecular spring" titin-like protein, all of which are commonly found in human vascular smooth muscle cells. Based on this analysis, we hypothesize that these cells belong to the smooth muscle cell category.

      Line 201: What is meant by "regulatory roles"?

      In this context, we are discussing the expression of genes encoding regulatory proteins, such as SOX transcription factors and secreted-frizzled proteins.

      Line 211: which markers disappeared? What in situs show this?

      We apologize for the mistakes, and have revised the manuscript accordingly.

      Line 211: typo, "role" → "roll"

      We apologize for the mistakes, and have revised the manuscript accordingly.

      Line 214: what are these "hallmark genes"

      We apologize for the mistake. Here, we are referring to the genes listed in Figure 4B. We have revised the manuscript accordingly.

      Line 220: are there meristem-like cells in metazoans? If so, this would be preferable to a comparison with plants.

      In this context, we are discussing the morphological characteristics of gill proliferative cell populations found in filibranch bivalves. These populations, namely PEPC, VEPC, and DEPC, consist of cells exhibiting morphological traits akin to those of plant cambial-zone meristem cells. These cells typically display small, round shapes with a high nucleus-to-cytoplasm ratio. We acknowledge that while these terms are used in bivalve studies (citations below), they lack the robust support seen in model systems backed by molecular biology evidence. The present snRNA-seq data, however, may offer valuable cell markers for future comprehensive investigations.

      Leibson, N. L. & Movchan, O. T. Cambial zones in gills of Bivalvia. Mar. Biol. 31, 175-180 (1975). https://doi.org:10.1007/BF00391629

      Wentrup, C., Wendeberg, A., Schimak, M., Borowski, C. & Dubilier, N. Forever competent: deep-sea bivalves are colonized by their chemosynthetic symbionts throughout their lifetime. Environ. Microbiol. 16, 3699-3713 (2014). https://doi.org:10.1111/1462-2920.12597

      Cannuel, R., Beninger, P. G., McCombie, H. & Boudry, P. Gill Development and its functional and evolutionary implications in the blue mussel Mytilus edulis (Bivalvia: Mytilidae). Biol. Bull. 217, 173-188 (2009). https://doi.org:10.1086/BBLv217n2p173

      Line 335: what is slingshot trajectory analysis? Does this differ from the pseudotime analysis?

      Slingshot is an algorithm that uses the principal graph of the cells to infer trajectories. It models trajectories as curves on the principal graph, capturing the progression and transitions between different cellular states.

      Both Slingshot and pseudotime analysis aim to infer cellular trajectories. Slingshot focuses on capturing branching patterns and is fully compatible with the embeddings generated by dimensionality reduction methods such as UMAP and PHATE, whereas pseudotime analysis aims to order cells along a continuous trajectory and does not rely on such embeddings. We used both in the manuscript for different purposes.

      Line 241: introduce FISH methodology earlier in the paper, when in situ images are first referenced

      We appreciate the comment, and have revised the manuscript accordingly.

      Line 246-249: can you quantify the decrease in signal or calculate the concentration of symbiotes in the cells? Was 5C imaged whole? This can impact the fluorescent intensity in tissues of different thicknesses.

      We appreciate your comment. In Figure 5C, most of the typical gill filament region is visible (the ventral tip and the mid part of the gill filament), except for the dorsal end. The gill filament of bathymodioline mussels has a simple structure: a single layer of bacteriocytes grows on the basal membrane. Consequently, the gill slices have a fairly uniform thickness (two layers of bacteriocytes with one layer of interlamina cells in between), minimizing any potential impact on fluorescent intensity. At present, detailed quantification of intracellular symbionts would require continuous TEM or ultra-resolution confocal sections to reconstruct the bacteriocytes in 3D, which exceeds the scope of the current study. Therefore, fluorescent intensity remains the only method available to us for estimating bacterial density/distribution across the gill filament.

      Line 249: What is meant by 'environmental gradient?'

      Here, we are referring to the gases needed for the symbionts’ chemosynthesis. We have revised the manuscript to make this clear.

      Lines 255-256: Were the results shown in the TEM images previously known? Not clear what novel information is conveyed in images Fig 5 C and D

      In Fig. 5C and D, we provide high-quality SEM and TEM images of a typical bacteriocyte, clearly showing its morphology and subcellular machinery. These electron microscopy images offer the audience a comprehensive introduction to the cellular function of bacteriocytes. Additionally, they serve as supporting evidence for the bacteriocytes' snRNA-seq data.

      Line 295-296: Can you elaborate on what types of solute carrier genes have been shown to be involved with symbioses?

      We appreciate the comment, and have revised the manuscript accordingly. The putative functions of the solute carriers can be found in Figure 5I.

      Line 297-301: Which genes from the bulk RNA-seq study? Adding more detail and references in cluster annotation would help readers better understand the justifications.

      We appreciate the comment, and have revised the manuscript accordingly.

      Line 316 -322: Can you provide the values of the distances?

      In addition to Fig. 6b, we now provide the values in the main text and in a supplementary table (Supplementary Table S19).

      Line 328: What are the gene expression patterns?

      We observed genes that are up- and down-regulated during starvation and reconstitution.

      Line 334-337: A visualization of the different expression levels of the specific genes in clusters between sites might be helpful to demonstrate the degree of difference between sites.

      We have prepared a new supplementary file showing the different expression levels.

      Line 337: Citation needed

      We appreciate the comment. Here, we hypothesize the cellular responses based on the genes’ functions and their expression patterns.

      Line 402-403: Cannot determine lineages from data presented. Need lineage tracing over time to determine this

      We acknowledge the necessity of conducting lineage tracing over time to validate this hypothesis. In practice, however, it is very difficult to obtain samples for such a test. It might be easier to test this hypothesis in their shallow-sea relatives, although this would also be very difficult in practice.

      413-414: What are the "cell-type specific responses to environmental change"? It could be interesting to present these results in the "results and discussion" section

      These results are shown in Supplementary Figure S8.

      Line 419-424: Sampling details might go better earlier on in the paper, when the sampling scheme is introduced.

      We appreciate the comments. Here, we are discussing the limitations of our current study, not sampling details.

      Line 552: What type of sequencing? Paired end? How long?

      We conducted 150bp paired-end sequencing.

      556-563: More detail here would be useful to readers not familiar with the BD guide. Also be careful to cite the software used in analysis!

      The provided guide and handbook elucidate the intricacies of gene name preparation, data alignment to the genome, and the generation of an expression matrix. It is worth mentioning that we relied upon outdated versions of the aforementioned resources during our data analysis phase, as they were the only ones accessible to us at the time. However, we have since become aware of a newer pipeline available this year, rendering the information presented here of limited significance to other researchers utilizing BD.

      Many thanks for the kind reminder. We have now included a reference for STAR. All other software was cited accordingly. There are no scholarly papers or publications on the BD pipeline that we can cite.

      Line 577-578: How was the number of clusters determined? What is meant by "manually combine the clusters?" If cells were clustered by hand, more detail on the method is needed, as well as direct discussion and justification in the body of the paper.

      It would be more appropriate to emphasize the determination of cell types rather than clusters. The clusters were identified using a clustering function, as mentioned in the manuscript. It's important to note that the clustering function (in our case, the FindClusters function of Seurat) provides a general overview based on diffuse gene expression. Technically speaking, there is no guarantee that one cluster corresponds to a single cell type. Therefore, it is crucial to manually inspect the clustering results to assign clusters to the appropriate cell types. In some cases, multiple clusters may be assigned to the same cell type, while in other cases, a single cluster may need to be further subdivided into two or more cell types or sub-cell types, depending on the specific circumstances.

      For studies conducted on model species such as humans or mice, highly and specifically expressed genes within each cluster can be compared to known marker genes of cell types mentioned in previous publications, which generally suffices for annotation purposes. However, in the case of non-model species like Bathymodioline mussels, there is often limited information available about marker genes, making it challenging to confidently assign clusters to specific cell types. In such situations, in situ hybridisation proves to be incredibly valuable. In our study, WISH was employed to visualise the expression and morphology of marker genes within clusters. When WISH revealed the expression of marker genes from a cluster in a specific type of cell, we classified that cluster as a genuine cell type. Moreover, if WISH demonstrated uniform expression of marker genes from different clusters in the same cell, we assigned both clusters to the same cell type.

      We expanded the description of the strategy in the Method section.

      Line 690-692: When slices were used, what part of the gill were they taken from?

      We sectioned the gill around the mid part which could represent the mature bacteriocytes.

      References to figures:

      General

      Please split the fluorescent images into different channels with an additional composite. It is difficult to see some of the expression patterns. It would also make it accessible to colorblind readers.

      We appreciate the comments and suggestions from the reviewer. We have converted our figures to CMYK colour, which should help colorblind readers.

      Please provide the number of replicates for each in situ and what proportion of those displayed the presented pattern.

      We appreciate the reviewer’s comments. We have explained this in the Materials and Methods section of the manuscript.

      Figure 2.C' is a fantastic summary and really helps the non-mussel audience understand the results. Adding schematics like this to Figures 3-5 would be helpful as well.

      We value the reviewer's comments. We propose that Figures 3K, 4C, and 5A-D could offer similar schematic explanations to assist the audience.

      Figure 2:

      Figures 2.C-F, 2.C', 2.H-J are not referenced in the text. Adding in discussions of them would help strengthen your discussions on the cluster annotation

      We appreciate the reviewer's comments. We have revised the manuscript accordingly.

      In 2.B. 6 genes are highlighted in red and said to be shown in in situs, but only 5 are shown.

      We apologize for the mistake. We did not include the WISH result for 20639-0.0 in the present study. We have changed the label to black.

      Figure 3:

      Fig 2C-E not mentioned.

      We appreciate the reviewer's comments. We have revised the manuscript accordingly.

      In 3.B 8 genes are highlighted in red and said to be shown in in situs. Only 6 are.

      The result of the WISH were provided in Supplementary Figures S4 and S5.

      Figure 3.K is not referenced in the legend.

      We appreciate the comment, and have revised the manuscript accordingly.

      Figure 4:

      In Figure D, it might be helpful to indicate the growth direction.

      We appreciate the comment, and have revised the manuscript accordingly by adding an arrow in panel D to indicate growth direction.

      4F: A double in situ with the symbiote marker is needed to demonstrate the nucleolin-like positive cells are symbiote free.

      We appreciate the comment. The symbiont free region could be found in Figure 5A.

      Figure 5:

      In 5.A, quantification of symbiote concentration would help support your conclusion that they are denser around the edges.

      We appreciate the comment. As mentioned above, detailed quantification of intracellular symbionts would require continuous TEM or ultra-resolution confocal sections to reconstruct the bacteriocytes in 3D, which exceeds the scope of the current study. Therefore, fluorescent intensity remains the only method available to us for estimating bacterial density/distribution across the gill filament.

      In 5.D, the annotation is not clear. Adding arrows like in 5.C would be helpful.

      We appreciate the comment, and have revised the manuscript accordingly.

      A few genes in 5.F are not mentioned in the paper body when listing other genes. Mentioning them would help provide more support for your clustering.

      We appreciate the comment, and have revised the manuscript accordingly.

      Is 5.I meant to be color coded with the gene groups from 5.F? Color Coding the gene names, rather than organelles or cellular structures might portray this better and help visually strengthen the link between the diagram and your dot plot.

      We appreciate the suggestions. We've experimented with color-coding the gene names, but some colors are less discernible against a white background.

      Figure 6:

      6.B Is there a better way to visualize this data? The color coding is confusing given the pairwise distances. Maybe heatmaps?

      We attempted a heatmap, as shown in the figure below. However, all co-authors agree that a bar plot provides clearer visualization than the heatmap. We agree that the color scheme may be confusing because the bars used the same colors as the individual treatments, so we have changed the colors.

      Author response image 1.

      Figure 6.D: Why is the fanmao sample divided in the middle?

      Fig. 6C shows that the single-cell trajectories include branches. The branches occur because cells execute alternative gene expression programs. Thus, in Fig. 6D, we show changes for genes that are significantly branch-dependent in both lineages at the same time. Specifically, in cluster 2, the genes are upregulated during starvation but downregulated during reconstitution. Conversely, genes in cluster 1 are downregulated during starvation but upregulated during reconstitution. Note that Fig. 6D displays only a small subset of the significantly branch-dependent genes.

      Figure 6.D: Can you visualize the expression in the same format as in figures 2-5?

      We appreciate the comments from the reviewer. As far as we know, this heatmap is the best format to demonstrate this type of gene expression profile.

      Supplementary Figure S2:

      Please provide a key for the cell type abbreviations

      We appreciate the comment, and have added the abbreviations of cell types accordingly.

      Supplementary Figures S4 and S5:

      What part of the larger images are the subsetted image taken from?

      We appreciate the comment. These images were taken from the ventral tip and the middle of the gill slices, respectively. We have revised the figure legends to make this clear.

      Supplemental Figure S7:

      If clusters 1 and 2 show genes up and downregulated during starvation, what do clusters 4 and 3 represent?

      Cluster 1: genes that are clearly upregulated during starvation and downregulated during reconstitution. Cluster 4: genes that are downregulated during reconstitution but not clearly upregulated during starvation.

      Cluster 2: genes that are upregulated during reconstitution. Cluster 3: genes that are clearly downregulated during starvation.

      Author response table 1.

      Supplemental Figure S8:

      This is a really interesting figure that I think shows some of the results really well! Maybe consider moving it to the main figures of the paper?

      We appreciate the comments and suggestions. We concur with the reviewer on the significance of the results presented. However, considering the length of this manuscript, we have prioritized the inclusion of the most pertinent information in the main figures. Supplementary materials containing additional figures and details on the genes involved in these pathways are provided for interested readers.

      Supplemental Figure S11:

      Switching the axes might make this image easier for the reader to interpret. Additionally, calculating the normalized contribution of each sample to each cluster could help quantify the extent to which bacteriocytes are reduced when starving.

      Thank you for the insightful suggestion, which we have implemented as detailed below. We acknowledge the importance of understanding the changes in bacteriocyte proportions across different treatments. However, it's crucial to note that the percentage of cells per treatment is highly influenced by factors such as the location of digestion and sequencing, as previously mentioned.

      Author response image 2.
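
      The normalization suggested by the reviewer can be sketched as follows. This is an illustrative calculation only: the cell counts are hypothetical (not taken from the dataset), and the treatment names follow those used in the responses above. Each cluster count is first divided by the total number of cells in its sample, and the resulting fractions are then rescaled so that the contributions of all samples to a given cluster sum to 1.

```python
def normalized_contribution(counts):
    """counts[sample][cluster] = number of cells.

    Returns result[cluster][sample]: the sample's share of the cluster after
    correcting for differences in total cell number between samples.
    """
    # Step 1: fraction of each sample's cells that fall in each cluster.
    fractions = {}
    for sample, per_cluster in counts.items():
        total = sum(per_cluster.values())
        fractions[sample] = {c: n / total for c, n in per_cluster.items()}

    # Step 2: rescale per cluster so the sample contributions sum to 1.
    clusters = sorted({c for per_cluster in counts.values() for c in per_cluster})
    result = {}
    for cluster in clusters:
        column_total = sum(fractions[s].get(cluster, 0.0) for s in counts)
        result[cluster] = {
            s: (fractions[s].get(cluster, 0.0) / column_total if column_total else 0.0)
            for s in counts
        }
    return result


# HYPOTHETICAL cell counts per treatment and cluster.
counts = {
    "Fanmao": {"bacteriocyte": 300, "other": 700},
    "Starvation": {"bacteriocyte": 100, "other": 900},
    "Reconstitution": {"bacteriocyte": 200, "other": 800},
}
contribution = normalized_contribution(counts)
for sample, share in contribution["bacteriocyte"].items():
    print(f"{sample}: {share:.3f}")
```

      As noted in the response above, such proportions remain sensitive to technical factors like the location of digestion and sequencing, so they describe the data rather than prove a biological change.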

      Reviewer #2 (Recommendations For The Authors):

      The following are minor recommendations for the text and figures that may help with clarity:

      Fig. 3K: This figure describes water flow induced by different ciliary cells. It is not clear what the color of the arrows corresponds to, as they do not match the UMAP (i.e. the red arrow) and this is not indicated in the legend. Are these colours meant to indicate the different ciliary cell types? If so it would be helpful to include this in the legend.

      We appreciate the reviewer's comments and suggestions. The arrows indicate the water flow that might be generated by certain types of cilia. We have revised our figure and figure legends to make this clear.

      Line 369: The incorrect gene identifier is given for the mitochondrial trifunctional enzyme. This gene identifier is identical to the one given in line 366, which describes long-chain-fatty-acid-ligase ACSBG2-like (Bpl_scaf_28862-1.5).

      We appreciate the reviewer's comments and suggestions. We have revised our manuscript accordingly.

      Line 554: The Bioproject accession number (PRJNA779258) does not appear to lead to an existing page in any database.

      We appreciate the reviewer's comments and suggestions. We have released this Bioproject to the public.

      Line 597-598: it would be helpful to know the specific number of cells that the three sample types were downsampled to, and the number of cells remaining in each cluster, as this can affect the statistical interpretation of differential expression analyses.

      The number of cells per cluster in our analysis ranged from 766 to 14633. To mitigate potential bias introduced by varying cell numbers, we implemented downsampling, restricting the number of cells per cluster to no more than 3500. This ensured that cluster sizes differed by less than fivefold. We experimented with several downsampling strategies, exploring cell limits of 4500 and 2500, and consistently observed similar patterns across these variations.
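
      The effect of this cap can be illustrated with a short sketch. The cluster sizes (766 and 14633) and the cap of 3500 come from the response above; the cluster names, cell identifiers, function names, and random seed are hypothetical, and the code is not the one used in the study.

```python
import random


def cap_cluster_sizes(cells_by_cluster, max_cells, seed=0):
    """Randomly downsample every cluster that exceeds max_cells."""
    rng = random.Random(seed)
    capped = {}
    for cluster, cells in cells_by_cluster.items():
        if len(cells) > max_cells:
            capped[cluster] = rng.sample(cells, max_cells)  # without replacement
        else:
            capped[cluster] = list(cells)
    return capped


def size_ratio(cells_by_cluster):
    """Largest cluster size divided by smallest cluster size."""
    sizes = [len(cells) for cells in cells_by_cluster.values()]
    return max(sizes) / min(sizes)


# Placeholder cell IDs for the smallest and largest clusters reported above.
clusters = {
    "smallest": [f"s{i}" for i in range(766)],
    "largest": [f"l{i}" for i in range(14633)],
}
capped = cap_cluster_sizes(clusters, max_cells=3500)

print(round(size_ratio(clusters), 1))  # 19.1
print(round(size_ratio(capped), 2))    # 4.57
```

      With a cap of 3500, the largest-to-smallest ratio falls from about 19 to about 4.6, which matches the stated aim of keeping cluster sizes within a fivefold range.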

      Data and code availability:

      The supplementary tables and supplementary data S1 appear to be the final output of the differential expression analyses. Including the raw data (e.g. reads) and/or intermediate data objects (e.g. count matrices, R objects), in addition to the code used to perform the analyses, may be very helpful for replication and downstream use of this dataset. As mentioned above, the Bioproject accession number appears to be incorrect.

      We appreciate the reviewer's comments and suggestions. Regarding our sequencing data, we have deposited all relevant information with the National Center for Biotechnology Information (NCBI) under Bioproject PRJNA779258. Additionally, we have requested the release of the Bioproject. Furthermore, as part of this round of revision, we have included the count matrices for reference.

      Reviewer #3 (Recommendations For The Authors):

      As noted in the public review, my only major concerns are around the treatment of progenitor cell populations. I am sympathetic to the challenges of these experiments but suggest a few possible avenues to the authors.

      First, there could be some demonstration that these cells in G. platifrons are indeed proliferative, using EdU incorporation labeling or a conserved epitope such as the phosphorylation of serine 10 in histone 3. It appears in Mytilus galloprovincialis that proliferating cell nuclear antigen (PCNA) and phospho-histone H3 have previously been used as good markers for proliferative cells (Maiorova and Odintsova 2016). The use of any of these markers along with the cell type markers the authors recover for PEBZCs for example would greatly strengthen the argument that these are proliferative cells.

      If performing these experiments would not be currently possible, the authors could use some computation approaches to strengthen their arguments. Based on conserved cell cycle markers and the use of Cell-Cycle feature analysis in Seurat could the authors provide evidence that these progenitors occupy the G2/M phase at a greater percentage than other cells? Other than the physical position of the cells is there much that suggests that these are proliferative? While I am more convinced by markers in VEPCs the markers for PEBZCs and DEPCs are not particularly compelling.

      While I do not think the major findings of the paper hinge on this, comments such as "the PBEZCs gave rise to new bacteriocytes that allowed symbiont colonization" should be taken with care. It is not clear that the PBEZCs are proliferative and there does not seem to be any direct evidence that PBEZCs (or DEPCs or VEPCS for that manner) are the progenitor cells through any sort of labeling or co-expression studies.

      We appreciate the comments and suggestions from the reviewer. We have considered all the suggestions and have revised the manuscript accordingly. We especially appreciate the reviewer’s suggestions about the characterisation of the G. platifrons gill proliferative cell populations. In a separate research project, we have tested both cell division and cell proliferation markers on the proliferative cell populations. Although we are not able to include these results in the current manuscript, we are happy to share our preliminary results with the reviewer. Our results demonstrate that the proliferative cell populations, particularly the VEPCs, are positive for cell proliferation markers and contain a high number of mitotic cells.

      Author response image 3.

      Finally, there is a body of literature that has examined cell proliferation and zones of proliferation in mussels (such as Piquet, B., Lallier, F.H., André, C. et al. Regionalized cell proliferation in the symbiont-bearing gill of the hydrothermal vent mussel Bathymodiolus azoricus. Symbiosis 2020) or other organisms (such as Bird, A. M., von Dassow, G., & Maslakova, S. A. How the pilidium larva grows. EvoDevo. 2014) that could be discussed.

      We appreciate the comments and suggestions from the reviewer. We have considered all the suggestions and have revised the manuscript accordingly (line 226-229).

      Minor comments also include:

      Consider changing the orientation of diagrams in Figure 2C' in relationship to Figure 2C and 2D-K.

      We appreciate the comments and suggestions from the reviewer. The Figure 2 has been reorganized.

      For the diagram in Figure 3K, please clarify if the arrows drawn for the direction of inter lamina water flow is based on gene expression, SEM, or some previous study.

      We are grateful for the reviewer's valuable feedback and suggestions. The arrows in the figure indicate the direction of water flow that could be affected by specific types of cilia. Our prediction is based on both gene expression and SEM results. To further clarify this point, we have revised the figure legend of Fig. 3.

      Please include a label for the clusters in Figure 5E for consistency.

      We have revised our Figure 5E to keep our figures consistent.

      Please include a note in the Materials and Methods for Monocle analysis in Figure 6.

      We conducted Monocle analyses using Monocle 2 and Monocle 3 in the R environment. We have revised our Materials and Methods to include further information on Figure 6.

      In Supplement 2, the first column is labeled PEBC while the first row is labeled PEBZ versus all other rows and columns have corresponding names. I am guessing this is a typo and not different clusters?

      We appreciate the great effort of the reviewer in reviewing our manuscript. We have corrected the typo in the revised version.

    2. Reviewer #1 (Public Review):

      Wang, He et al have constructed a comprehensive single-nucleus atlas for the gills of deep-sea Bathymodioline mussels, which possess intracellular symbionts that provide a key source of carbon and allow them to live in these extreme environments. They provide annotations of the different cell states within the gills, shedding light on how multiple cell types cooperate to give rise to the emergent functions of the composite tissues and the gills as a whole. They pay special attention to characterizing the bacteriocyte cell populations and identifying sets of genes that may play a role in their interaction with the symbiotes.

      Wang, He et al sample mussels from 3 different environments: animals from their native methane rich environment, animals transplanted to a methane-poor environment to induce starvation and animals that have been starved in the methane-poor environment and then moved back to the methane-rich environment. They demonstrated that starvation had the biggest impact on bacteriocyte transcriptomes. They hypothesize that the up-regulation of genes associated with lysosomal digestion leads to the digestion of the intracellular symbiont during starvation, while the non-starved and reacclimated groups more readily harvest the nutrients from symbiotes without destroying them. Further work exploring the differences in symbiote populations between ecological conditions will further elucidate the dynamic relationship between host and symbiote. This will help disentangle specific changes in transcriptomic state that are due to their changing interactions with the symbiotes from changes associated with other environmental factors.

      This paper makes available a high-quality dataset that is of interest to many disciplines of biology. The unique qualities of this non-model organism and collection of conditions sampled make it of special interest to those studying deep-sea adaptation, the impact of environmental perturbation on Bathymodioline mussel populations, and intracellular symbiotes. The authors also use a diverse array of tools to explore and validate their data.

    3. Reviewer #2 (Public Review):

      Wang, He et al. shed insight into the molecular mechanisms of deep-sea chemosymbiosis at the single-cell level. They do so by producing a comprehensive cell atlas of the gill of Gigantidas platifrons, a chemosymbiotic mussel that dominates the deep-sea ecosystem. They uncover novel cell types and find that the gene expression of bacteriocytes, the symbiont-hosting cells, supports two hypotheses of host-symbiont interactions: the "farming" pathway, where symbionts are directly digested, and the "milking" pathway, where nutrients released by the symbionts are used by the host. They perform an in situ transplantation experiment in the deep sea and reveal transitional changes in gene expression that support a model where starvation stress induces bacteriocytes to "farm" their symbionts, while recovery leads to the restoration of the "farming" and "milking" pathways.

      A major strength of this study includes the successful application of advanced single nucleus techniques to a non-model, deep sea organism that remains challenging to sample. I also applaud the authors for performing an in situ transplantation experiment in a deep sea environment. From gene expression profiles, the authors deftly provide a rich functional description of G. platifrons cell types that is well-contextualized within the unique biology of chemosymbiosis. These findings offer significant insight into the molecular mechanisms of deep-sea host-symbiont ecology, and will serve as a valuable resource for future studies into the striking biology of G. platifrons.

      The authors' conclusions are generally well-supported by their results. However, I recognize that the difficulty of obtaining deep-sea specimens may have impacted experimental design and no replicates were sampled.

      It is notable that the Fanmao cells were much more sparsely sampled. It appears that fewer cells were sequenced, resulting in the Starvation and Reconstitution conditions having 2-3x more cells after doublet filtering. These discrepancies also are reflected in the proportion of cells that survived QC, suggesting a distinction in quality or approach. However, the authors provide clear and sufficient evidence via bootstrapping that batch effects between the three samples are negligible. While batch effect does not appear to have affected gene expression profiles, the proportion of cell types may remain sensitive to sampling techniques, and thus interpretation of Fig. S12 must be approached with caution.

    4. eLife assessment

      This study provides an important cell atlas of the gill of the mussel Gigantidas platifrons using a single nucleus RNA-seq dataset, a resource for the community of scientists studying deep sea physiology and metabolism and intracellular host-symbiont relationships. The evidence supporting the conclusions is convincing with high-quality single-nucleus RNA-sequencing and transplant experiments. This work will be of broad relevance for scientists interested in host-symbiont relationships across ecosystems.

    5. Reviewer #3 (Public Review):

      Wang et al. explored the unique biology of the deep-sea mussel Gigantidas platifrons to understand fundamental principles of animal-symbiont relationships. They used single-nucleus RNA sequencing, together with validation and visualization, to characterize many of the important cellular and molecular players that allow these organisms to survive in the deep sea. They demonstrate that a diversity of cell types supports the structure and function of the gill, including bacteriocytes (specialized epithelial cells that host sulfur-oxidizing or methane-oxidizing symbionts) as well as a suite of other cell types such as supportive cells, ciliary cells, and smooth muscle cells. By transplanting mussels from a methane-rich habitat to methane-limited environments, the authors showed that starved mussels may consume endosymbionts, whereas mussels in methane-rich environments upregulated genes involved in glutamate synthesis. These data add to the growing body of literature showing that organisms control their endosymbionts in response to environmental change.

      The conclusions of the data are well supported. The authors adapted a technique that would have been technically impossible in their field environment by preserving the tissue and then performing nuclear isolation after the fact. The use of single-nucleus sequencing opens the possibility of new cellular and molecular biology that is not possible to study in the field. Additionally, the in-situ data (both WISH and FISH) are high-quality and easy to interpret. The use of cell-type-specific markers along with a symbiont-specific probe was effective. Finally, the SEM and TEM were used convincingly for specific purposes in the case of showing the cilia that may support water movement.

      One particular area for future exploration is the concept of a proliferative progenitor population within the gills. The authors recover molecular markers for these putative populations, and future work will uncover whether these are indeed proliferative cells that contribute to symbiont colonization.

      Overall, the significance of this work lies in identifying the relationship between symbionts and bacteriocytes and how these host bacteriocytes modulate their gene expression in response to environmental change. It will be interesting to see how similar or different these data are across animal phyla. For instance, work on symbiosis in cnidarians may converge on similar principles, or there may be independent ways in which organisms have been able to solve these problems.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      (1) The authors' findings are primarily rooted in a series of well-conducted in vitro experiments using two CML cell lines, K562 and MEG-01. While the findings are interesting and novel, further work to corroborate these findings in primary CML samples would have greatly strengthened the potential real-world relevance of these discoveries. The authors appear to have some PBMCs from primary CML patients and a BM sample from a Ph+ ALL in which they performed western blot analyses (Fig 1). Couldn't these samples have been used to at least confirm some of the key discoveries? For example, the neddylation of BCR-ABL, or; sensitivity of primary leukemic cells to RAPSYN knockdown, and/or; phosphorylation of RAPSYN by SRC?

      We agree with your points and really appreciate your comments. To demonstrate the clinical relevance, we have conducted a series of experiments to address your concerns.

      (1) After thorough optimization of the transduction process, we have managed to show that shRNA-mediated gene silencing of RAPSYN impaired the growth of primary CML samples. These additional data are presented as Figure 1D in the revised manuscript with its corresponding figure legend and description, lines 136-141.

      (2) We have invested tremendous time and effort in addressing the “key discoveries”, despite the great technical difficulty of the task. With 5 mL of PBMCs on hand (the volume permitted by ethical approval), we have finally managed to confirm BCR-ABL neddylation by IP in samples from two newly acquired CML patients. The results are presented in Figure 2F in the revised manuscript with its corresponding figure legend and description, lines 186-187.

      (2) The authors initially interrogated a fairly dated (circa 2009) microarray-based primary dataset to show that the increase in RAPSYN is primarily a post-transcriptional event, as mRNA levels are not different between healthy and CML samples. It would be interesting to see whether differences might be more readily seen in more recent RNA-seq datasets from CML patients, given the well-known differences in sensitivity between the two platforms. Additionally, I wonder if there would be transcriptional signatures of increased NEDDylation (or RAPSYN-induced NEDDylation) that could be interrogated in primary samples? Furthermore, there are proteomics datasets of CML cells made resistant to TKIs (through in vitro selection experiments) that could be interrogated for independent validation of the authors' discoveries (for example, from K562 cells: PMID: 30730747 or PMID: 34922009).

      Thank you very much for your constructive comments. Based on your suggestion, we have 1) analyzed the mRNA level of RAPSYN in RNA-seq datasets GSE13159 (2009), GSE138883 (2020) and GSE140385 (2020), which indicated no difference between CML patients and healthy donors. We have included the results in Figure 1-figure supplementary 1A and in the revised manuscript (lines 123-127); 2) examined the RNA levels of RAPSYN-related neddylation enzymes, including E1 (NAE1), E2 (UBE2M), NEDD8 and NEDP1, in these databases, and again found no significant differences in these neddylation-related genes between CML patients and healthy donors (Supplementary Figure 2C, lines 168-172).

      We have also analyzed the proteomics datasets from PMID: 30730747 and PMID: 34922009 according to your suggestion. Unfortunately, no information on RAPSYN expression is available in these datasets. To avoid any oversight, we have examined all CML-related proteomics datasets from 2002 to 2024 and still found no information about the protein expression of RAPSYN. Consequently, our finding of higher RAPSYN expression in the PBMCs of Ph+ patients appears to be a first-time observation. We believe that our results should be more clinically relevant than those, if any, from cells obtained by in vitro selection.

      Reviewer #2 (Public Review):

      Most of the conclusions drawn in this paper are well supported by data, but some aspects of the data need to be clarified and extended:

      (1) The authors propose that targeting RAPSYN in Ph+ leukemia could have a high therapeutic index, suggesting that inhibition of RAPSYN may lead to cytotoxicity in Ph+ leukemia with high specificity and minimal side effects. To substantiate this assertion, the authors should investigate the impact on cell viability upon RAPSYN knockdown in non-Ph leukemic cell lines or HS-5 cells (similar to Figure 1C), despite their lower RAPSYN protein levels.

      We appreciate your valuable comments. When we used shRNA to knock down the expression of RAPSYN in HS-5 cells, it did not affect their growth. We have included the data in Figure 1C, modified its figure legend, and added a corresponding description, lines 136-141.

      (2) The authors intriguingly show that the protein levels of RAPSYN are significantly enriched in Ph+ patient samples and cell lines (Figure 1A, B), even though the mRNA levels remain unchanged (Supplementary Figure 1 A-C). This observation merits a clear explanation in the context of the presented results. The data in the manuscript does imply a feedforward loop mechanism (Figure 7), where BCR-ABL activates SRC, which subsequently stabilizes RAPSYN, which in turn helps protect BCR-ABL from c-CBL-mediated degradation. If this is the working hypothesis, it would be beneficial for the reader to see supporting evidence.

      Thank you very much for pointing out the issue. We have realized that Figure 7, which was originally placed as a summarizing figure, was inappropriate. To avoid potential confusion, this figure has been deleted, which does not affect the results and conclusions of this study. In addition, the differences between mRNA levels and protein expression have been addressed in our response to Reviewer #1.

      (3) The authors present compelling evidence to suggest that RAPSYN may possess direct NEDD8-ligase activity on BCR-ABL. To strengthen this claim, it may be valuable to conduct further assays involving a ligase-deficient mutant, such as C366A, beyond its use in Figure 2J. Incorporating this mutant into the in vitro assay illustrated in Figure 2K, for instance, could offer substantial validation for the claim. In addition, showing whether the ligase-deficient mutant is capable of phenocopying the phosphorylation-mutant Y336F, as showcased in Figures 5E, F, and 6D, F, would be beneficial.

      We are grateful for your comments. In the manuscript, we have provided sufficient data to support the direct neddylation of BCR-ABL by RAPSYN, as you commented: “The authors present compelling evidence to suggest that RAPSYN may possess direct NEDD8-ligase activity on BCR-ABL.” Cys366 was previously demonstrated to be the catalytic residue essential for the E3 activity of RAPSYN (Li et al. 2016, PMID: 27839998), and the phosphorylation at Tyr336 was thoroughly verified by site-directed mutagenesis and by treatment with the SRC-specific inhibitor saracatinib in the present cellular experiments. Therefore, while we fully respect your opinions, we do not think it necessary to perform tedious in vitro reactions for expected negative results, which was the reason we did not conduct enzymatic reactions with known inactive mutants, such as C366A and Y336F, in the first place.

      (4) The observations presented in Figures 6 C-G require additional clarification. Notably, there are discrepancies in relative cell viability effects in K562 cells, and to some extent in MEG-01 cells, under conditions that are indicated as being either identical or highly similar. For instance, this inconsistency is observable when comparing the left panels of Figure 6C and 6D in the case of NC overexpression + shSRC#2, and the left panels of Figure 6E and 6G with NC overexpression or shNC, respectively. Listing potential causes of these discrepancies would strengthen the overall validity of the findings and their subsequent interpretation.

      Thank you for your comments, and we apologize for the confusion. To make a meaningful comparison, we have revised the method part “Preparation of stable RAPSYNWT, RAPSYNY336F or SRC expression cell lines” (lines 625-627) and reorganized Figure 6 to reflect the differences in the negative controls. In fact, we first used the LV6 (EF-1a/Puro; OE-NC1) vector for the overexpression of RAPSYNWT and SRC. Because of the low expression level with LV6 and the long time required for subsequent selection, we switched to LV18 (CMV/Puro; OE-NC2) for the overexpression of RAPSYNY336F. Since the sensitivities of K562/MEG01-OE-NC cells to shSRC transduction in Figure 6C (now revised to K562/MEG01-OE-NC1) and 6D (now revised to K562/MEG01-OE-NC2) were noticeably different, we have separated RAPSYNWT and RAPSYNY336F cells as 6C and 6D, each with its own corresponding empty vector as negative control, instead of merging the results into a single figure with one negative control of OE-NC. In addition, given that K562/MEG01 cells reacted differently to saracatinib treatment after transduction with the empty vector, we have also distinguished the negative controls as OE-NC1 in Figure 6E, OE-NC2 in Figure 6F and shNC in Figure 6G. After all, the transduction of K562/MEG01 cells with different expression vectors and viral particles caused the discrepancies in the cell viability experiments, which have been clarified by reorganizing Figure 6 in the revision.

      (5) Throughout the manuscript, immunoblots which showcase immunoprecipitations of BCR-ABL or His-BCR-ABL depict poly-neddylation (e.g. Figures 2E-M, 3D-G, and 5A-E) and poly-ubiquitination (e.g. Figures 3D-G) patterns/smears where these patterns seem to extend below the molecular weight of BCR-ABL. To enhance clarity, it would be valuable for the authors to provide an explanation in the text or the figure legend for this observation. Is it reflective of potential degradation of BCR-ABL or is there another explanation behind it?

      Thank you for your valuable comments. After carefully checking the original immunoblots, we have ascertained that the protein band of BCR-ABL was at 250 kDa and that the smeared bands appearing above 250 kDa were likely caused by the conjugation of NEDD8 (neddylation) or ubiquitin (ubiquitination) onto BCR-ABL. As for the modified BCR-ABL signal appearing at a lower molecular weight than expected, whether this is a common feature, as previously reported (Mao, J., et al, 2010, PMID: 21118980), or reflects possible degradation during the modification process or sample preparation requires further investigation. We have corrected the labeling of figures in the revised manuscript.

      Reviewer #1 (Recommendations For The Authors):

      (1) It would really nail the real-world relevance of these nice findings if the authors are able to confirm some aspects of their cell line-based discoveries in publicly available 'omics datasets generated from primary CML samples. I have suggested some of these in the public review as well.

      Alternatively, if they are able to investigate samples from murine CML models (eg. BALB/c CML models), it would represent a step towards real-world relevance.

      Thank you very much for your constructive comments. According to your suggestion, we have examined and analyzed RAPSYN mRNA and protein in updated and publicly available datasets as replied in the public response.

      (2) The Discussion repeats some of the information already presented in the Introduction (for example, lines 311-327 of the merged document, or lines 349-358). I would urge the authors to instead expand more about how RAPSYN might be upregulated at the post-transcriptional level, or its potential post-translational regulation by SRC-mediated phosphorylation.

      Thanks for your constructive suggestion. We have re-written this part according to your suggestion and marked in red color in the revised manuscript, lines 319-325 and lines 351-378.

      (3) There are instances of clunky phrases/grammatical mistakes in the manuscript which detract from its readability (eg: lines 142-143: "...empty body transduced shRAPSN#3 or K562 cells into...."; lines 163-164: "Despite AChR subunits α7, M2, M3, and M4 were expressed in all tested cells, no change..."; line 178: "Preeminent BCR-ABL neddylation was detected in..."). A closer proof-reading of the final manuscript is advisable.

      We appreciate the valuable comments. We have made changes for improvement, which are marked in red in the revised manuscript, lines 145-147, lines 166-168 and line 185.

      (4) The western blot in Fig 5C (particularly the control "OE-NC" of K562) looks drastically different from the corresponding control lanes in Figs 5A and 5B. Similarly, the cell viability curves presented in Fig 6D and 6F (for both K562 and MEG-01, control conditions) look very different from the corresponding curves in Figs 6A and 6B.

      We appreciate your valuable comments. Because we accidentally used images with different exposure times, the western blots in Fig 5C (particularly the control "OE-NC" of K562) look very different from the corresponding control lanes in Figs 5A and 5B. We have replaced them with images of the same exposure time in the revised manuscript.

      For readers to clearly understand, we have revised the method part “Preparation of stable RAPSYNWT, RAPSYNY336F or SRC expression cell lines” (lines 625-627) and related figure legends to reflect the differences.

      We have addressed the discrepancy in cell viability in our public response.

      Reviewer #2 (Recommendations For The Authors):

      In reviewing your study, I must insist that the completeness and robustness of your work would significantly benefit from a more exhaustive listing of the antibodies used for immunoblotting and immunoprecipitation within the Materials and Methods section. A number of antibodies have been accounted for, however, crucial ones targeting BCR-ABL, c-CBL, Ubiquitin, NEDD8, HA, Myc, and others appear to be omitted. To maintain rigorous scientific standards, I strongly encourage you to include these.

      We appreciate your comments. We have carefully checked the section of Methods and added detailed information of antibodies for Immunoblotting and Immunoprecipitation in the revised manuscript, lines 502-516.

    2. eLife assessment

      In this important study, the authors describe a novel function for RAPSYN in BCR-ABL fusion-associated leukemia, presenting convincing evidence that RAPSYN stabilizes the oncogenic BCR-ABL fusion protein. Compared to an earlier version of the manuscript, the authors have added data using primary samples that strengthen the conclusions.

    3. Reviewer #1 (Public Review):

      The manuscript by Zhao et al describes the identification of RAPSYN, a NEDD8 E3 ligase previously studied for its role in acetylcholine receptor clustering and neuromuscular junction formation, as a factor promoting the stabilisation of the BCR-ABL oncogene in Chronic Myeloid Leukemia (CML) cells. The authors have identified that NEDDylation of BCR-ABL by RAPSYN antagonises its poly-ubiquitination and subsequent proteasome-based degradation. Knocking down RAPSYN with shRNA led to increased poly-ubiquitination and faster turnover of BCR-ABL. Furthermore, they describe that SRC-dependent phosphorylation of RAPSYN facilitates its NEDD8-ligase activity.

      The authors' findings are primarily rooted in a series of well-conducted in vitro experiments using two CML cell lines, K562 and MEG-01. They have performed some further validations using primary CML samples, which have strengthened their claims.

      The authors' initial discoveries have come from interrogating a number of publicly available gene expression datasets, both microarray-based and RNA-seq, which revealed that RAPSYN is increased at the protein level but that RNA levels are not different between healthy and CML samples. This is a very interesting observation which warrants further future investigation.

      The conclusions of this revised manuscript are broadly supported by the data and the analyses. It also describes novel findings that can spur future studies, both into the basic cellular biology of CML as well as into potential new therapeutic strategies.

      Comments on revised version:

      I thank the authors for addressing my concerns in the initial review. The revised manuscript with additional data is much stronger.

    4. Reviewer #2 (Public Review):

      In this study the authors aim to elucidate the role of RAPSYN in BCR-ABL-mediated leukemogenesis. RAPSYN is mainly known as a scaffolding protein for anchoring acetylcholine receptors (AChRs) to the cytoskeleton in muscle cells, facilitating AChR clustering through neddylation (Li et al., 2016). The authors demonstrate, through a broad and rigorous array of biochemical assays, that RAPSYN also plays a crucial role in the neddylation of BCR-ABL in leukemia cells. Their results indicate that this process shields BCR-ABL from ubiquitination and subsequent degradation, likely through a mechanism involving competition for binding with the BCR-ABL ubiquitin ligase c-CBL. In addition, the authors delve into the regulatory mechanisms underlying RAPSYN stability, demonstrating that it is enhanced through phosphorylation by SRC. This discovery further deepens our understanding of the complex dynamics of the molecular interactions that regulate BCR-ABL stability in leukemia.

      To confirm the physiological significance of their findings, the authors effectively utilize cell viability assays and in vivo models. The integration of these approaches lends strength and validity to their conclusions.

      The implications of the findings presented in this study are important, particularly in relation to our understanding of the pathogenesis and potential therapeutic strategies for Philadelphia chromosome-positive leukemias. By illuminating the role of RAPSYN in the regulation of BCR-ABL stability, this research potentially uncovers avenues for the development of targeted therapies, making a significant contribution to the field.

      Two areas of the study could benefit from additional validation and exploration:

      (1) The authors propose that targeting RAPSYN in Ph+ leukemia could have a high therapeutic index, suggesting that inhibition of RAPSYN may lead to cytotoxicity in Ph+ leukemia with high specificity and minimal side effects. The authors now include data showing RAPSYN knockdown in HS-5 cells does not affect cell growth (Figure 1C), supporting this assertion. This observation presents a contrast to DepMap data (https://depmap.org/), where RNAi and CRISPR-mediated RAPSYN depletion across hundreds of cell lines does not exhibit obvious differential effects on cell viability compared to Ph+ leukemia cell lines. Therefore, while the current results are promising, they call for additional validation by future studies to confirm RAPSYN as a viable therapeutic target in this context.

      (2) A particularly notable yet underexplored aspect of this study is the observed disparity between RAPSYN protein and mRNA levels in Ph+ patient samples and cell lines. There is a marked enrichment of RAPSYN protein levels (Figure 1A, B) despite seemingly unchanged mRNA levels (Supplementary Figure 1 A-C). The authors convincingly demonstrate that RAPSYN stabilizes BCR-ABL, while SRC-mediated phosphorylation in turn stabilizes RAPSYN. This points to a specific, SRC-driven stabilization mechanism of RAPSYN in the Ph+ leukemia context. Consequently, the question arises whether BCR-ABL (through activation of SRC) reciprocally stabilizes RAPSYN. Exploring the effects of BCR-ABL depletion on RAPSYN levels could shed light on this potential two-way stabilization mechanism, offering deeper insight into the complex molecular dynamics of RAPSYN and BCR-ABL in Ph+ leukemias.

      In conclusion, this study represents a pivotal advancement in our understanding of Philadelphia chromosome-positive leukemias. It uniquely positions RAPSYN, a protein previously not associated with leukemogenesis, as a key regulator of BCR-ABL stability. Future research is essential to establish RAPSYN's potential as a therapeutic target and to more comprehensively understand its role in this context.

      Comments on revised version:

      I acknowledge and appreciate the author responses. Below are my comments on each reply:

      Reply 1: Your response and the inclusion of data regarding RAPSYN knockdown in HS-5 cells adequately address the concerns.

      Reply 2: The issue of the disparity between RAPSYN protein and mRNA levels in Ph+ leukemias has not sufficiently been resolved. Refer to point 2 in the revised review for more details. If conducting the proposed experiment is not feasible, I recommend a more thorough discussion in the manuscript to address and hypothesize about the causes of this discrepancy between protein and mRNA levels.

      Reply 3: Your rationale for not performing additional assays with inactive mutants is satisfactory.

      Reply 4: The clarification provided in your revision of the method section and the reorganization of Figure 6 successfully resolve the previously noted discrepancies. However, to ensure consistency and clarity across the paper, I recommend that you also specify the batches of constructs/viruses used in other relevant figures, such as Figure 1E.

      Reply 5: The clarification provided on the immunoblots sufficiently addresses the concern raised.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment

      The authors have made important contributions to our understanding of the pathogenesis of erectile dysfunction (ED) in diabetic patients. They have identified the gene Lbh, expressed in pericytes of the penis and decreased in diabetic animals. Overexpression of Lbh appears to counteract ED in these animals. The authors also confirm Lbh as a potential marker in cavernous tissues in both humans and mice. While solid evidence supports Lbh's functional role as a marker gene, further research is needed to elucidate the specific mechanisms by which it exerts its effects. This work is of interest to those working in the fields of ED and angiogenesis.

      Public Reviews:

      Reviewer #1 (Public Review):

      In this study, the researchers aimed to investigate the cellular landscape and cell-cell interactions in cavernous tissues under diabetic conditions, specifically focusing on erectile dysfunction (ED). They employed single-cell RNA sequencing to analyze gene expression patterns in various cell types within the cavernous tissues of diabetic individuals. The researchers identified decreased expression of genes associated with collagen or extracellular matrix organization and angiogenesis in several cell types, including fibroblasts, chondrocytes, myofibroblasts, valve-related lymphatic endothelial cells, and pericytes. They also discovered a newly identified marker, LBH, that distinguishes pericytes from smooth muscle cells in mouse and human cavernous tissues. Furthermore, the study revealed that pericytes play a role in angiogenesis, adhesion, and migration by communicating with other cell types within the corpus cavernosum. However, these interactions were found to be significantly reduced under diabetic conditions. The study also investigated the role of LBH and its interactions with other proteins (CRYAB and VIM) in maintaining pericyte function and highlighted their potential involvement in regulating neurovascular regeneration. Overall, the manuscript is well-written and the study provides novel insights into the pathogenesis of ED in patients with diabetes and identifies potential therapeutic targets for further investigation.

      Comments on revised version:

      For Figure 4, immunofluorescent staining of LBH following intracavernous injections with lentiviruses is required to justify overexpression and tissue specificity.

      We agree with this point. Therefore, we have performed immunofluorescent staining of LBH in cavernous tissues after infection with LBH O/E lentiviruses. We found that LBH expression is significantly decreased in the DM and DM+NC groups; however, after infection with LBH O/E lentiviruses, LBH expression is significantly increased, as shown in Supplementary Fig. 10. (Please see revised ‘Result’ and ‘Supplementary Fig. 10’)

      Reviewer #3 (Public Review):

      Bae et al. described the key roles of pericytes in cavernous tissues in diabetic erectile dysfunction using both mouse and human single-cell transcriptomic analysis. Erectile dysfunction (ED) is caused by dysfunction of the cavernous tissue and affects a significant proportion of men aged 40-70. The most common treatment for ED is phosphodiesterase 5 inhibitors; however, these are less effective in patients with diabetic ED. Therefore, there is an unmet need for a better understanding of the cavernous microenvironment, cell-cell communications in patients with diabetic ED, and the development of new therapeutic treatments to improve the quality of life.

      Pericytes are mesenchymal-derived mural cells that directly interact with capillary endothelial cells (ECs). They play a vital role in the pathogenesis of erectile function as their interactions with ECs are essential for penile erection. Loss of pericytes has been associated with diabetic retinopathy, cancer, and Alzheimer's disease and has been investigated in relation to the permeability of cavernous blood vessels and neurovascular regeneration in the authors' previous studies. This manuscript explores the mechanisms underlying the effect of diabetes on pericyte dysfunction in ED. Additionally, the cellular landscape of cavernous tissues and cell type-specific transcriptional changes were carefully examined using both mouse and human single-cell RNA sequencing in diabetic ED. The novelty of this work lies in the identification of a newly identified pericyte (PC)-specific marker, LBH, in mouse and human cavernous tissues, which distinguishes pericytes from smooth muscle cells. LBH not only serves as a cavernous pericyte marker, but its expression level is also reduced in diabetic conditions. The LBH-interacting proteins (Cryab and Vim) were further identified in mouse cavernous pericytes, indicating that these signaling interactions are critical for maintaining normal pericyte function. Overall, this study demonstrates the novel marker of pericytes and highlights the critical role of pericytes in diabetic ED.

      Comments on revised version:

      Bae and colleagues substantially improved the data quality and revised their manuscript "Single cell transcriptome analysis of cavernous tissues reveals the key roles of pericytes in diabetic erectile dysfunction". While these revisions clarify some of the concerns raised, others remain. In my view, the following question must be addressed.

      In my prior question on #3, I completely disagree with the statement that "identified cells with pericyte-like characteristics in the walls of large blood vessels". The LBH staining that the authors provided clearly labeled SMCs, not pericytes. Per Fig 2E, the authors are correct that LBH is colocalized with SMA+ cells (SMCs). However, the red signal from LBH clearly stains endothelial cells. In the rest of 2E and 2D, LBH is CD31- and their location suggests LBH stained for SMCs in the Aorta, Kidney vasculature, Dorsal vein, and Dorsal Artery.

      We respect the reviewer's comments and provide further justification in response to the reviewer's concerns. We first performed double staining of LBH and CD31 on dorsal artery and dorsal vein tissues. We found that LBH-expressing cells are completely different from CD31-expressing cells (Figure 2D, indicated by arrows, and Supplementary Fig. 10A) and that expression is higher in veins than in arteries. This is consistent with previous understanding. In addition, in the double staining of LBH and α-SMA, we also found that there was no overlap between LBH-expressing cells and α-SMA-expressing smooth muscle cells in the cavernosum tissues, but there was some overlap in the dorsal artery and dorsal vein (Figure 2E, indicated by arrows). This may indicate that LBH expression differs slightly between types of blood vessels; further experiments will be needed to confirm this. In addition, to avoid confusing other readers, we have modified our previous discussion regarding the identification of cells with pericyte-like characteristics in the walls of large blood vessels. We removed the associated immunofluorescence staining in the aorta and kidneys and replaced it with staining of the dorsal artery and dorsal vein (please see the revised ‘Result’, ‘Figure 2’, and ‘Supplementary Fig. 10A’).

    2. eLife assessment

      The authors provide important insights into the pathogenesis of erectile dysfunction (ED) in patients with diabetes. The authors present compelling evidence, using single-cell transcriptomic analysis in both mouse and human cavernous tissues, to support their claims regarding the key roles of pericytes in diabetic ED. The identification of LBH as a potential pericyte-specific marker in both mouse and human tissues further strengthens their findings. This well-written manuscript offers novel and significant contributions to the field, identifying potential therapeutic targets for further investigation.

    3. Reviewer #1 (Public Review):

      In this study, the researchers aimed to investigate the cellular landscape and cell-cell interactions in cavernous tissues under diabetic conditions, specifically focusing on erectile dysfunction (ED). They employed single-cell RNA sequencing to analyze gene expression patterns in various cell types within the cavernous tissues of diabetic individuals. The researchers identified decreased expression of genes associated with collagen or extracellular matrix organization and angiogenesis in several cell types, including fibroblasts, chondrocytes, myofibroblasts, valve-related lymphatic endothelial cells, and pericytes. They also discovered a newly identified marker, LBH, that distinguishes pericytes from smooth muscle cells in mouse and human cavernous tissues. Furthermore, the study revealed that pericytes play a role in angiogenesis, adhesion, and migration by communicating with other cell types within the corpus cavernosum. However, these interactions were found to be significantly reduced under diabetic conditions. The study also investigated the role of LBH and its interactions with other proteins (CRYAB and VIM) in maintaining pericyte function and highlighted their potential involvement in regulating neurovascular regeneration. Overall, the manuscript is well-written and the study provides novel insights into the pathogenesis of ED in patients with diabetes and identifies potential therapeutic targets for further investigation.

      Comments on revised version:

      All my concerns have been properly addressed.

    4. Reviewer #3 (Public Review):

      Bae and colleagues substantially improved the data quality and revised their manuscript "Single cell transcriptome analysis of cavernous tissues reveals the key roles of pericytes in diabetic erectile dysfunction". While these revisions clarify some of the concerns raised, others remain. In my view, the following question must be addressed:

      In my prior question on #3, I completely disagree with the statement that "identified cells with pericyte-like characteristics in the walls of large blood vessels". The LBH staining that the authors provided clearly labeled SMCs, not pericytes. Per Fig 2E, the authors are correct that LBH is colocalized with SMA+ cells (SMCs). However, the red signal from LBH clearly stains endothelial cells. In the rest of 2E and 2D, LBH is CD31- and their location suggests LBH stained for SMCs in the Aorta, Kidney vasculature, Dorsal vein, and Dorsal Artery.

    1. Reviewer #1 (Public Review):

      Summary:

      The manuscript by Jang et al. describes the application of new methods to measure the localization of GTP-binding signaling proteins (G proteins) on different membrane structures in a model mammalian cell line (HEK293). G proteins mediate signaling by receptors found at the cell surface (GPCRs), with evidence from the last 15 years suggesting that GPCRs can induce G-protein mediated signaling from different membrane structures within the cell, with variation in signal localization leading to different cellular outcomes. While it has been clearly shown that different GPCRs efficiently traffic to various intracellular compartments, it is less clear whether G proteins traffic in the same manner, and whether GPCR trafficking facilitates "passenger" G protein trafficking. This question was a blind spot in the burgeoning field of GPCR localized signaling in need of careful study, and the results obtained will serve as an important guidepost for further work in this field. The extent to which G proteins localize to different membranes within the cell is the main experimental question tested in this manuscript. This question is pursued through two distinct methods, both relying on genetic modification of the G-beta subunit with a tag. In one method, G-beta is modified with a small fragment of the fluorescent protein mNG, which combines with the larger mNG fragment to form a fully functional fluorescent protein to facilitate protein trafficking by fluorescent microscopy. This approach was combined with the expression of fluorescent proteins directed to various intracellular compartments (different types of endosomes, lysosome, endoplasmic reticulum, Golgi, mitochondria) to look for colocalization of G-beta with these markers. These experiments showed compelling evidence that G-beta co-localizes with markers at the plasma membrane and the lysosome, with weak or absent co-localization for other markers. 
      A second method for measuring localization relied on fusing G-beta with a small fragment from a miniature luciferase (HiBit) that combines with a larger luciferase fragment (LgBit) to form an active luciferase enzyme. Localization of G-beta (and luciferase signal) was measured using a method known as bystander BRET, which relies on the expression of a fluorescent protein BRET acceptor in different cellular compartments. Results using bystander BRET supported findings from fluorescence microscopy experiments. These methods for tracking G protein localization were also used to probe other questions. The activation of GPCRs from different classes had virtually no impact on the localization of G-beta, suggesting that GPCR activation does not result in the shuttling of G proteins through the endosomal pathway with activated receptors.

      Strengths:

      The question probed in this study is quite important and, in my opinion, understudied by the pharmacology community. The results presented here are an important call to be cognizant of the localization of GPCR coupling partners in different cellular compartments. Abundant reports of endosomal GPCR signaling need to consider how the impact of lower G protein abundance on endosomal membranes will affect the signaling responses under study.

      The work presented is carefully executed, with seemingly high levels of technical rigor. These studies benefit from probing the experimental questions at hand using two different methods of measurement (fluorescent microscopy and bystander BRET). The observation that both methods arrive at the same (or a very similar) answer inspires confidence about the validity of these findings.

      Weaknesses:

      The rationale for fusing G-beta with either mNG2(11) or HiBit could benefit from some expansion. I understand the speculation that using the smallest tag possible may have the smallest impact on protein performance and localization, but plenty of researchers have fused proteins with whole fluorescent proteins to provide conclusions that have been confirmed by other methods. Many studies even use G proteins fused with fluorescent proteins or luciferases. Is there an important advantage to tagging G-beta with small tags? Is there evidence that G proteins with full-size protein tags behave aberrantly? If the studies presented here would not have been possible without these CRISPR-based tagging approaches, it would be helpful to provide more context to make this clearer. Perhaps one factor would be interference from newly synthesized G protein-fluorescent protein fusions en route to the plasma membrane (in the ER and Golgi).

      As noted by the authors, they do not demonstrate that the tagged G-beta is predominantly found within heterotrimeric G protein complexes. If there is substantial free G-beta, then many of the conclusions need to be reconsidered. Perhaps a comparison of immunoprecipitated tagged G beta vs immunoprecipitated supernatant, with blotting for other G protein subunits would be informative.

      Additional context and questions:

      (1) There exists some evidence that certain GPCRs can form enduring complexes with G-beta-gamma (Pubmed: 23297229, 27499021). That would seem to offer a mechanism that would enable receptor-mediated transport of G protein subunits. It would be helpful for the authors to place the findings of this manuscript in the context of these previous findings since they seem somewhat contradictory.

      (2) There is some evidence that GaS undergoes measurable dissociation from the plasma membrane upon activation (see the mechanism of the assay in Pubmed: 35302493). It seems possible that G-alpha (and in particular GaS) might behave differently than the G-beta subunit studied here. This is not entirely clear from the discussion as it now stands.

      (3) The authors say "The presence of mNG-b1 on late endosomes suggested that some G proteins may be degraded by lysosomes". The mechanism of lysosomal degradation by proteins on the outside of the lysosome is not clear. It would be helpful for the authors to clarify.

      (4) Although the authors do a good job of assessing G protein dilution in endosomal membranes, it is unclear how this behavior compares to the measurement of other lipid-anchored proteins using the same approach. Is the dilution of G proteins what we would expect for any lipid-anchored protein at the inner leaflet of the plasma membrane?

    2. Reviewer #2 (Public Review):

      This is an interesting method that addresses the important problem of assessing G protein localization at endogenous levels. The data are generally convincing.

      Specific comments

      Methods:

      The description of the gene editing method is unclear. There are two different CRISPR cell lines made in two different cell backgrounds. The methods should clearly state which CRISPR guides were used on which cell line. It is also not clear why HiBit is included in the mNG-β1 construct. Presumably, this is not critical but it would be helpful to explicitly note. In general, the Methods could be more complete.

      Results:

      The explanation of validation experiments in Figures 1 C and D is incomplete and difficult to follow. The rationale and explanation of the experiments could be expanded. In addition, because this is an interesting method, it would be helpful to know if the endogenous editing affects normal GPCR signaling. For example, the authors could include data showing an Iso-induced cAMP response. This is not critical to the present interpretation but is relevant as a general point regarding the method. Also, it may be relevant to the interpretation of receptor effects on G protein localization.

      Discussion:

      The conclusion that beta-gamma subunits do not redistribute after GPCR activation seems new and different from previous reports. Is this correct? Can the authors elaborate on how the results compare to previous literature?

      Can the authors note that OpenCell has endogenously tagged Gβ1 and reports more obvious internal localization? Can the authors comment on this point?

      Is this the first use of CRISPR / HiBit for BRET assay? It would be helpful to know this or cite previous work if not. Also, as this is submitted as a tools piece, the authors might say a little more about the potential application to other questions.

    3. Reviewer #3 (Public Review):

      Summary:

      This article addresses an important and interesting question concerning intracellular localization and dynamics of endogenous G proteins. The fate and trafficking of G protein-coupled receptors (GPCRs) have been extensively studied but so far little is known about the trafficking routes of their partner G proteins that are known to dissociate from their respective receptors upon activation of the signaling pathway. The authors utilize modern cell biology tools including genome editing and bystander bioluminescence resonance energy transfer (BRET) to probe intracellular localization of G proteins in various membrane compartments in steady state and also upon receptor activation. Data presented in this manuscript shows that while G proteins are mostly present on the plasma membrane, they can be also detected in endosomal compartments, especially in late endosomes and lysosomes. This distribution, according to data presented in this study, seems not to be affected by receptor activation. These findings will have implications in further studies addressing GPCR signaling mechanisms from intracellular compartments.

      Strengths:

      The methods used in this study are adequate for the question asked. In particular, the use of genome-edited cells (for the addition of the tag on one of the G proteins) is a great choice to prevent the effects of overexpression. Moreover, the use of bystander BRET allowed the authors to probe the intracellular localization of G proteins in a very high-throughput fashion. By combining imaging and BRET, the authors convincingly show that G proteins are present at very low abundance on early endosomes (and also on ER, mitochondria, and medial Golgi) but seem to accumulate on membranes of late endosomal compartments.

      Weaknesses:

      While the authors provide a novel dataset, many questions regarding G protein trafficking remain open. For example, it is not entirely clear which pathway is utilized to traffic G proteins from the plasma membrane to intracellular compartments. Additionally, future studies should also address the dynamics of G protein trafficking, for example by tracking them over multiple time points.

    4. eLife assessment

      This important study investigates the intracellular localization patterns of G proteins involved in GPCR signaling, presenting convincing evidence for their preference for plasma and lysosomal membranes over endosomal, endoplasmic reticulum, and Golgi membranes. This discovery has significant implications for understanding GPCR action and signaling from intracellular locations. This research will interest cell biologists studying protein trafficking and pharmacologists exploring localized signaling phenomena.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this work, Qiu and colleagues examined the effects of preovulatory (i.e., proestrous or late follicular phase) levels of circulating estradiol on multiple calcium and potassium channel conductances in arcuate nucleus kisspeptin neurons. Although these cells are strongly linked to a role as the "GnRH pulse generator," the goal here was to examine the physiological properties of these cells in a hormonal milieu mimicking late proestrus, the time of the preovulatory GnRH-LH surge. Computational modeling is used to manipulate multiple conductances simultaneously and support a role for certain calcium channels in facilitating a switch in firing mode from tonic to bursting. CRISPR knockdown of the TRPC5 channel reduced overall excitability, but this was only examined in cells from ovariectomized mice without estradiol treatment. The patch clamp experiments are comprehensive and overall solid but a direct demonstration of the role of these conductances in being necessary for surge generation (or at least having a direct physiological consequence on surge properties) is lacking, substantially reducing the impact of the findings.

      Strengths:

      (1) Examination of multiple types of calcium and potassium currents, both through electrophysiology and molecular biology.

      (2) Focus on arcuate kisspeptin neurons during the surge is relatively conceptually novel as the anteroventral periventricular nucleus (AVPV) kisspeptin neurons have received much more attention as the "surge generator" population.

      (3) The modeling studies allow for direct examination of manipulation of single and multiple conductances, whereas the electrophysiology studies necessarily require examination of each current in isolation. The construction of an arcuate kisspeptin neuron model promises to be of value to the reproductive neuroendocrinology field.

      We thank the reviewer for recognizing our comprehensive examination of Kiss1-ARH neurons through electrophysiological, molecular and computational modeling of their activity during the preovulatory surge, which as the reviewer pointed out is “conceptually novel.” With the addition of new experiments, we will bolster our argument that Kiss1-ARH neurons transition from synchronized firing to burst firing through the E2-mediated regulation of channel expression. We will address the weaknesses as follows:

      Weaknesses:

      (1) The novelty of some of the experiments needs to be clarified. This reviewer's understanding is that prior experiments largely used a different OVX+E2 treatment paradigm mimicking periods of low estradiol levels, whereas the present work used a "high E2" treatment model. However, Figures 10C and D are repeated from a previous publication by the same group, according to the figure legend. Findings from "high" vs. "low" E2 treatment regimens should be labeled and clearly separated in the text. It would also help to have direct comparisons between results from low E2 and high E2 treatment conditions.

      We will revise Figures 10C and 10D to include new findings on Tac2 and Vglut2 expression in OVX and E2-treated Kiss1ARH neurons. We did show the previously published data (Qiu, eLife 2018) to contrast with Figures 10E, F showing the downregulation of TRPC5 and GIRK2 channels following E2 treatment. Most importantly, our E2 treatment regime is clearly stated in the Methods and is exactly the same as that used previously (Qiu, eLife 2016 and Qiu, eLife 2018) for the induction of the LH surge in OVX mice (Bosch, Molecular and Cellular Endocrinology 2013).

      (2) In multiple places, links are made between the changes in conductances and the transition from peptidergic to glutamatergic neurotransmission. However, this relationship is never directly assessed. The data that come closest are the qPCR results showing reduced Tac2 and increased Vglut2 mRNA, but in the figure legend, it appears that these results are from a prior publication using a different E2 treatment regimen.

      In the revised Figure 1, we will now include a clear depiction of the transition from synchronized firing driven by NKB signaling in OVX females to burst firing driven by glutamate in E2-treated females. We have used the same E2 treatment paradigm as previously published (Qiu, eLife 2018).

      (3) Similarly, no recordings of arcuate-AVPV glutamatergic transmission are made so the statements that Kiss1ARH neurons facilitate the GnRH surge via this connection are still only conjecture and not supported by the present experiments.

      Using a horizontal hypothalamic slice preparation, we have shown that Kiss1-ARH neurons excite GnRH neurons via Kiss1ARH glutamatergic input to Kiss1AVPV neurons (summarized in Fig. 12, Qiu, eLife 2016). We do not think that it is necessary to repeat these experiments in the current manuscript.

      (4) Figure 1 is not described in the Results section and is only tenuously connected to the statement in the introduction in which it is cited. The relevance of panels C and D is not clear. In this regard, much is made of the burst firing pattern that arises after E2 treatment in the model, but this burst firing pattern is not demonstrated directly in the slice electrophysiology examples.

      We will revise Figure 1 to include new whole-cell, current clamp recordings documenting the burst firing in response to glutamate in E2-treated, OVX females.

      (5) In Figure 3, it would be preferable to see the raw values for R1 and R2 in each cell, to confirm that all cells were starting from a similar baseline. In addition, it is unclear why the data for TTA-P2 is not shown, or how many cells were recorded to provide this finding.

      Before initiating photo-stimulation for each Kiss1-ARH neuron, we adjust the resting membrane potential to -70 mV, as noted in each panel in Figure 3, through current injections. We will include new findings on the effects of the T-channel blocker TTA-P2 on slow EPSP in the revised Figure 3. The number of cells tested with each calcium channel blocker is depicted in each of the bar graphs summarizing the effects of the blockers.

      (6) In Figure 5, panel C lists 11 cells in the E2 condition but panel E lists data from 37 cells. The reason for this discrepancy is not clear.

      In Figure 5E, we measured the L-, N-, P/Q and R channel currents after pretreatment with TTA-P2 to block the T-type current, whereas in Figure 5C, we measured the current without TTA-P2.

      (7) In all histogram figures, it would be preferable to have the data for individual cells superimposed on the mean and SEM.

      In all revised Figures we will include the individual data points for the individual neurons.

      (8) The CRISPR experiments were only performed in OVX mice, substantially limiting interpretation with respect to potential roles for TRPC5 in shaping arcuate kisspeptin neuron function during the preovulatory surge.

      The TRPC5 channels are most important for generating slow EPSPs when expression of NKB is high in the OVX state. Conversely, the glutamatergic response becomes more significant when the expression of NKB and TRPC5 channels is muted. Therefore, the CRISPR experiments were specifically conducted in OVX mice to maximize the effects.

      (9) Furthermore, there are no demonstrations that the CRISPR manipulations impair or alter the LH surge.

      In this manuscript, our focus is on the cellular electrophysiological activity of the Kiss1ARH neurons in OVX and E2-treated females. Exploration of CRISPR manipulations related to the LH surge is certainly slated for future experiments, but these in vivo experiments are beyond the scope of these comprehensive cellular electrophysiological and molecular studies.

      (10) The time of day of slice preparation and recording needs to be specified in the Methods.

      We will provide the times of slice preparation and recordings in the revised Methods and Materials.

      Reviewer #2 (Public Review):

      Summary:

      Kisspeptin neurons of the arcuate nucleus (ARC) are thought to be responsible for the pulsatile GnRH secretory pattern and to mediate feedback regulation of GnRH secretion by estradiol (E2). Evidence in the literature, including the work of the authors, indicates that ARC kisspeptin neurons coordinate their activity through reciprocal synaptic interactions and the release of glutamate and of the neuropeptide neurokinin B (NKB), which they co-express. The authors show here that E2 regulates the expression of genes encoding different voltage-dependent calcium channels, calcium-dependent potassium channels, and canonical transient receptor potential (TRPC5) channels and of the corresponding ionic currents in ARC kisspeptin neurons. Using computer simulations of the electrical activity of ARC kisspeptin neurons, the authors also provide evidence of what these changes translate into in terms of these cells' firing patterns. The experiments reveal that E2 upregulates various voltage-gated calcium currents as well as 2 subtypes of calcium-dependent potassium currents while decreasing TRPC5 expression (an ion channel downstream of NKB receptor activation), the slow excitatory synaptic potentials (slow EPSP) elicited in ARC kisspeptin neurons by NKB release and expression of the G protein-associated inward-rectifying potassium channel (GIRK). Based on these results, and on those of computer simulations, the authors propose that E2 promotes a functional transition of ARC kisspeptin neurons from neuropeptide-mediated sustained firing that supports coordinated activity for pulsatile GnRH secretion to a less intense, glutamatergic burst-like firing pattern that could favor glutamate release from ARC kisspeptin neurons. The authors suggest that the latter might be important for the generation of the preovulatory surge in females.

      Strengths:

      The authors combined multiple approaches in vitro and in silico to gain insights into the impact of E2 on the electrical activity of ARC kisspeptin neurons. These include patch-clamp electrophysiology combined with selective optogenetic stimulation of ARC kisspeptin neurons, reverse transcriptase quantitative PCR, pharmacology, and CRISPR-Cas9-mediated knockdown of the Trpc5 gene. The addition of computer simulations for understanding the impact of E2 on the electrical activity of ARC kisspeptin cells is also a strength.

      The authors add interesting information on the complement of ionic currents in ARC kisspeptin neurons and on their regulation by E2 to what was already known in the literature. Pharmacological and electrophysiological experiments appear of the highest standards. Robust statistical analyses are provided throughout, although some experiments (illustrated in Figures 7 and 8) do have rather low sample numbers.

      The impact of E2 on calcium and potassium currents is compelling. Likewise, the results of Trpc5 gene knockdown do provide good evidence that the TRPC5 channel plays a key role in mediating the NKB-mediated slow EPSP. Surprisingly, this also revealed an unsuspected role for this channel in regulating the membrane potential and excitability of ARC kisspeptin neurons.

      We thank the reviewer for recognizing that the “pharmacological and electrophysiological experiments appear of the highest standards” and “the addition of the computer modeling for understanding the impact of E2 on the electrical activity of ARC kisspeptin cells is also a strength.” However, we agree with the reviewer that we need to provide a direct demonstration of “burst-like” firing of Kiss1-ARH neurons. We will address the weaknesses as follows:

      Weaknesses:

      The manuscript also has weaknesses that obscure some of the conclusions drawn by the authors.

      One has to do with the fact that "burst-like" firing that the authors postulate ARC kisspeptin neurons transition to after E2 replacement is only seen in computer simulations, and not in slice patch-clamp recordings. A more direct demonstration of the existence of this firing pattern, and of its prominence over neuropeptide-dependent sustained firing under conditions of high E2 would make a more convincing case for the authors' hypothesis.

      We will provide a more direct demonstration of the existence of this firing pattern in the whole-cell current clamp experiments in the revised Figure 1.

      In addition, and quite importantly, the authors compare here two conditions, OVX versus OVX replaced with high E2, that may not reflect the physiological conditions (the diestrous [low E2] and proestrous [high E2] stages of the estrous cycle) under which the proposed transition between neuropeptide-dependent sustained firing and less intense burst firing might take place. This is an important caveat to keep in mind when interpreting the authors' findings. Indeed, that E2 alters certain ionic currents when added back to OVX females, does not mean that the magnitude of these ionic currents will vary during the estrous cycle.

      We have published that the magnitude of the slow EPSP, which is TRPC5 channel mediated, varies throughout the estrous cycle, in a manner similar to the difference found between OVX and E2-treated, OVX females (Figure 2, Qiu, eLife 2016). Moreover, TRPC5 channel mRNA expression, similar to the peptides, is downregulated by an E2 treatment (Figure 10 this manuscript) that mimics proestrus levels of the steroid (Bosch, Mol Cell Endocrinology 2013). Furthermore, the magnitude of ionic currents is directly proportional to the number of ion channels expressed in the plasma membrane, which we have found correlates with mRNA expression. Therefore, it is likely that the magnitude of these ionic currents will vary during the estrous cycle.

      Lastly, the results of some of the pharmacological and genetic experiments may be difficult to interpret as presented. For example, in Figure 3, although it is possible that blockade of individual calcium channel subtypes suppresses the slow EPSP through decreased calcium entry at the somato-dendritic compartment to sustain TRPC5 activation and the slow depolarization (as the authors imply), a reasonable alternative interpretation would be that at least some of the effects on the amplitude of the slow EPSP result from suppression of presynaptic calcium influx and, thus, decreased neurotransmitter and neuropeptide secretion. Along the same lines, in Figure 12, one possible interpretation of the observed smaller slow EPSPs seen in mice with mutant TRPC5 could be that at least some of the effect is due to decreased neurotransmitter and neuropeptide release due to the decreased excitability associated with TRPC5 knockdown.

      The reviewer raises a good point, but our previous findings clearly demonstrate that chelating intracellular calcium with BAPTA in whole-cell current clamp recordings abolishes the slow EPSP and persistent firing (Qiu, J. Neurosci 2021), which we have noted is the rationale for dissecting out the contribution of T, R, N, L and P/Q calcium channels to the slow EPSP in our current studies (revised Figure 3 will include the effects of T-channel blocker).

      However, to further bolster the argument for the postsynaptic contribution of the calcium channels to the slow EPSP and eliminate the potential presynaptic effects of calcium channel blockers on the postsynaptic slow EPSP amplitude, which may result from reduced presynaptic calcium influx and subsequently decreased neurotransmitter release, we will utilize an additional strategy. Specifically, we will measure the response to the externally administered TACR3 agonist senktide under conditions in which the extracellular calcium influx, as well as neurotransmitter and neuropeptide release, are blocked (new Figure 3).

    2. eLife assessment

      This study addresses the effects of estrogen on the kisspeptin1 subset of neurons in the arcuate nucleus of the hypothalamus of female mice after ovaries were surgically removed. The authors repeat some of their prior work and provide new and interesting findings about the effects of estrogen on currents mediated by calcium and potassium channels, suggest a neurotransmitter "switch", and suggest Trpc5 regulates Kisspeptin 1 neuron excitability. While useful in its significance, there are concerns that the evidence for some conclusions is incomplete. This study will be of interest to endocrinologists and reproductive biologists.

    3. Reviewer #2 (Public Review):

      Summary:

      Kisspeptin neurons of the arcuate nucleus (ARC) are thought to be responsible for the pulsatile GnRH secretory pattern and to mediate feedback regulation of GnRH secretion by estradiol (E2). Evidence in the literature, including the work of the authors, indicates that ARC kisspeptin neurons coordinate their activity through reciprocal synaptic interactions and the release of glutamate and of the neuropeptide neurokinin B (NKB), which they co-express. The authors show here that E2 regulates the expression of genes encoding different voltage-dependent calcium channels, calcium-dependent potassium channels, and canonical transient receptor potential (TRPC5) channels and of the corresponding ionic currents in ARC kisspeptin neurons. Using computer simulations of the electrical activity of ARC kisspeptin neurons, the authors also provide evidence of what these changes translate into in terms of these cells' firing patterns. The experiments reveal that E2 upregulates various voltage-gated calcium currents as well as 2 subtypes of calcium-dependent potassium currents while decreasing TRPC5 expression (an ion channel downstream of NKB receptor activation), the slow excitatory synaptic potentials (slow EPSP) elicited in ARC kisspeptin neurons by NKB release and expression of the G protein-associated inward-rectifying potassium channel (GIRK). Based on these results, and on those of computer simulations, the authors propose that E2 promotes a functional transition of ARC kisspeptin neurons from neuropeptide-mediated sustained firing that supports coordinated activity for pulsatile GnRH secretion to a less intense, glutamatergic burst-like firing pattern that could favor glutamate release from ARC kisspeptin neurons. The authors suggest that the latter might be important for the generation of the preovulatory surge in females.

      Strengths:

      The authors combined multiple approaches in vitro and in silico to gain insights into the impact of E2 on the electrical activity of ARC kisspeptin neurons. These include patch-clamp electrophysiology combined with selective optogenetic stimulation of ARC kisspeptin neurons, reverse transcriptase quantitative PCR, pharmacology, and CRISPR-Cas9-mediated knockdown of the Trpc5 gene. The addition of computer simulations for understanding the impact of E2 on the electrical activity of ARC kisspeptin cells is also a strength.

      The authors add interesting information on the complement of ionic currents in ARC kisspeptin neurons and on their regulation by E2 to what was already known in the literature. Pharmacological and electrophysiological experiments appear of the highest standards. Robust statistical analyses are provided throughout, although some experiments (illustrated in Figures 7 and 8) do have rather low sample numbers.

      The impact of E2 on calcium and potassium currents is compelling. Likewise, the results of Trpc5 gene knockdown do provide good evidence that the TRPC5 channel plays a key role in mediating the NKB-mediated slow EPSP. Surprisingly, this also revealed an unsuspected role for this channel in regulating the membrane potential and excitability of ARC kisspeptin neurons.

      Weaknesses:

      The manuscript also has weaknesses that obscure some of the conclusions drawn by the authors.

      One has to do with the fact that "burst-like" firing that the authors postulate ARC kisspeptin neurons transition to after E2 replacement is only seen in computer simulations, and not in slice patch-clamp recordings. A more direct demonstration of the existence of this firing pattern, and of its prominence over neuropeptide-dependent sustained firing under conditions of high E2 would make a more convincing case for the authors' hypothesis.

      In addition, and quite importantly, the authors compare here two conditions, OVX versus OVX replaced with high E2, that may not reflect the physiological conditions (the diestrous [low E2] and proestrous [high E2] stages of the estrous cycle) under which the proposed transition between neuropeptide-dependent sustained firing and less intense burst firing might take place. This is an important caveat to keep in mind when interpreting the authors' findings. Indeed, that E2 alters certain ionic currents when added back to OVX females, does not mean that the magnitude of these ionic currents will vary during the estrous cycle.

      Lastly, the results of some of the pharmacological and genetic experiments may be difficult to interpret as presented. For example, in Figure 3, although it is possible that blockade of individual calcium channel subtypes suppresses the slow EPSP through decreased calcium entry at the somato-dendritic compartment to sustain TRPC5 activation and the slow depolarization (as the authors imply), a reasonable alternative interpretation would be that at least some of the effects on the amplitude of the slow EPSP result from suppression of presynaptic calcium influx and, thus, decreased neurotransmitter and neuropeptide secretion. Along the same lines, in Figure 12, one possible interpretation of the observed smaller slow EPSPs seen in mice with mutant TRPC5 could be that at least some of the effect is due to decreased neurotransmitter and neuropeptide release due to the decreased excitability associated with TRPC5 knockdown.

    4. Reviewer #1 (Public Review):

      Summary:

      In this work, Qiu and colleagues examined the effects of preovulatory (i.e., proestrous or late follicular phase) levels of circulating estradiol on multiple calcium and potassium channel conductances in arcuate nucleus kisspeptin neurons. Although these cells are strongly linked to a role as the "GnRH pulse generator," the goal here was to examine the physiological properties of these cells in a hormonal milieu mimicking late proestrus, the time of the preovulatory GnRH-LH surge. Computational modeling is used to manipulate multiple conductances simultaneously and support a role for certain calcium channels in facilitating a switch in firing mode from tonic to bursting. CRISPR knockdown of the TRPC5 channel reduced overall excitability, but this was only examined in cells from ovariectomized mice without estradiol treatment. The patch clamp experiments are comprehensive and overall solid but a direct demonstration of the role of these conductances in being necessary for surge generation (or at least having a direct physiological consequence on surge properties) is lacking, substantially reducing the impact of the findings.

      Strengths:

      (1) Examination of multiple types of calcium and potassium currents, both through electrophysiology and molecular biology.

      (2) Focus on arcuate kisspeptin neurons during the surge is relatively conceptually novel as the anteroventral periventricular nucleus (AVPV) kisspeptin neurons have received much more attention as the "surge generator" population.

      (3) The modeling studies allow for direct examination of manipulation of single and multiple conductances, whereas the electrophysiology studies necessarily require examination of each current in isolation. The construction of an arcuate kisspeptin neuron model promises to be of value to the reproductive neuroendocrinology field.

      Weaknesses:

      (1) The novelty of some of the experiments needs to be clarified. This reviewer's understanding is that prior experiments largely used a different OVX+E2 treatment paradigm mimicking periods of low estradiol levels, whereas the present work used a "high E2" treatment model. However, Figures 10C and D are repeated from a previous publication by the same group, according to the figure legend. Findings from "high" vs. "low" E2 treatment regimens should be labeled and clearly separated in the text. It would also help to have direct comparisons between results from low E2 and high E2 treatment conditions.

      (2) In multiple places, links are made between the changes in conductances and the transition from peptidergic to glutamatergic neurotransmission. However, this relationship is never directly assessed. The data that come closest are the qPCR results showing reduced Tac2 and increased Vglut2 mRNA, but in the figure legend, it appears that these results are from a prior publication using a different E2 treatment regimen.

      (3) Similarly, no recordings of arcuate-AVPV glutamatergic transmission are made so the statements that Kiss1ARH neurons facilitate the GnRH surge via this connection are still only conjecture and not supported by the present experiments.

      (4) Figure 1 is not described in the Results section, and is only tenuously connected to the statement in the introduction in which it is cited. The relevance of panels C and D is not clear. In this regard, much is made of the burst firing pattern that arises after E2 treatment in the model, but this burst firing pattern is not demonstrated directly in the slice electrophysiology examples.

      (5) In Figure 3, it would be preferable to see the raw values for R1 and R2 in each cell, to confirm that all cells were starting from a similar baseline. In addition, it is unclear why the data for TTA-P2 is not shown, or how many cells were recorded to provide this finding.

      (6) In Figure 5, panel C lists 11 cells in the E2 condition but panel E lists data from 37 cells. The reason for this discrepancy is not clear.

      (7) In all histogram figures, it would be preferable to have the data for individual cells superimposed on the mean and SEM.

      (8) The CRISPR experiments were only performed in OVX mice, substantially limiting interpretation with respect to potential roles for TRPC5 in shaping arcuate kisspeptin neuron function during the preovulatory surge.

      (9) Furthermore, there are no demonstrations that the CRISPR manipulations impair or alter the LH surge.

      (10) The time of day of slice preparation and recording needs to be specified in the Methods.

    1. Author response:

      eLife assessment

      Unlocking the potential of molecular genetic tools (optogenetics, chemogenetics, sensors, etc.) for the study of systems neuroscience in nonhuman primates requires the development of effective regulatory elements for cell-type specific expression to facilitate circuit dissection. This study provides a valuable building block, by carefully characterizing the laminar expression profile of two viral vectors, one designed for general GABAergic neurons and the second for parvalbumin+ cell-type selective expression in the marmoset primary visual cortex. The authors provide solid evidence for the first enhancer S5E2 and incomplete evidence for the second one, h56D. This study contributes to our understanding of these tools but is limited by the understandably small number of animals used.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Federer et al. tested AAVs designed to target GABAergic cells and parvalbumin-expressing cells in marmoset V1. Several new results were obtained. First, AAV-h56D targeted GABAergic cells with >90% specificity, and this varied with serotype and layer. Second, AAV-PHP.eB.S5E2 targeted parvalbumin-expressing neurons with up to 98% specificity. Third, the immunohistochemical detection of GABA and PV was attenuated near viral injection sites.

      Strengths:

      Vormstein-Schneider et al. (2020) tested their AAV-S5E2 vector in marmosets by intravenous injection. The data presented in this manuscript are valuable in part because they show the transduction pattern produced by intraparenchymal injections, which are more conventional and efficient.

      Our manuscript additionally provides detailed information on the laminar specificity and coverage of these viral vectors, which was not investigated in the original studies.

      Weaknesses:

      The conclusions regarding the effects of serotype are based on data from single injection tracks in a single animal. I understand that ethical and financial constraints preclude high throughput testing, but these limitations do not change what can be inferred from the measurements. The text asserts that "...serotype 9 is a better choice when high specificity and coverage across all layers are required". The data presented are consistent with this idea but do not make a strong case for it.

      We are aware of the limitations of our results on the AAV-h56D. We agree with the Reviewer that a single injection per serotype does not allow us to make strong statements about differences between the 3 serotypes. Therefore, in the revised version of the manuscript we will temper our claims about such differences and use more caution in the interpretation of these data. Despite this weakness, we feel that these data still demonstrate high efficiency and specificity across cortical layers of transgene expression in GABA cells using the h56D promoter, at least with two of the 3 AAV serotypes we tested. We feel that in itself this is sufficiently useful information for the primate community, worthy of being reported. Due to cost, time and ethical considerations related to the use of primates, we chose not to perform additional experiments to determine precise differences among serotypes. Thus, for example, while it is possible that had we replicated these experiments, serotype 7 would have proven equally efficient and specific as the other two serotypes, we felt answering this question did not warrant additional experiments in this precious species.

      A related criticism extends to the analysis of injection volume on viral specificity. Some replication was performed here, but reliability across injections was not reported. My understanding is that individual ROIs were treated as independent observations. These are not biological replicates (arguably, neither are multiple injection tracks in a single animal, but they are certainly closer). Idiosyncrasies between animals or injections (e.g. if one injection happened to hit one layer more than another) could have substantial impacts on the measurements. It remains unclear which results regarding injection volume or serotype would hold up had a large number of injections been made into a large number of marmosets.

      For the AAV-S5E2, we made a total of 7 injections (at least 2 at the same volume), all of which, irrespective of volume, resulted in high specificity and efficiency for PV interneurons. Our conclusion is that larger volumes are slightly less specific, but the differences are minimal and do not warrant additional injections. Additionally, all of our injections involved all cortical layers, and the ROIs we selected for counts encompassed reporter protein expression across all layers. To provide a better sense of the reliability of the results across injections, in the revised version of the manuscript we will provide a supplementary table with results for each injection case separately.

      Reviewer #2 (Public Review):

      This is a straightforward manuscript assessing the specificity and efficiency of transgene expression in marmoset primary visual cortex (V1), for 4 different AAV vectors known to target transgene expression to either inhibitory cortical neurons (3 serotypes of AAV-h56D-tdTomato) or parvalbumin (PV)+ inhibitory cortical neurons in mice. Vectors are injected into the marmoset cortex and then postmortem tissue is analyzed following antibody labeling against GABA and PV. It is reported that: "in marmoset V1 AAV-h56D induces transgene expression in GABAergic cells with up to 91-94% specificity and 80% efficiency, depending on viral serotype and cortical layer. AAV-PHP.eB-S5E2 induces transgene expression in PV cells across all cortical layers with up to 98% specificity and 86-90% efficiency."

      These claims are largely supported but slightly exaggerated relative to the actual values in the results presented. In particular, the overall efficiency for the best h56D vectors described in the results is: "Overall, across all layers, AAV9 and AAV1 showed significantly higher coverage (66.1±3.9 and 64.9%±3.7)". The highest coverage observed is just in middle layers and is also less than 80%: "(AAV9: 78.5%±9.1; AAV1: 76.9%±7.4)".

      In the abstract, we indeed summarize the overall data and round up the decimals, and state that these percentages are upper bounds and that they vary by serotype and layer, while in the Results we report the detailed counts with decimals. To clarify this, in the revised version of the Abstract we will change 80% to 79% and emphasize even more clearly the dependence on serotype and layer. We will amend this sentence of the Abstract as follows: “We show that in marmoset V1 AAV-h56D induces transgene expression in GABAergic cells with up to 91-94% specificity and 79% efficiency, but this depends on viral serotype and cortical layer.”

      For the AAV-PHP.eB-S5E2 the efficiency reported in the abstract ("86-90%") is also slightly exaggerated relative to the results: "Overall, across all layers coverage ranged from 78%±1.9 for injection volumes >300nl to 81.6%±1.8 for injection volumes of 100nl."

      Indeed, the numbers in the Abstract are upper bounds; for example, efficiency in L4A/B with S5E2 reaches 90%. To further clarify this important point, in the revised abstract we will state “AAV-PHP.eB-S5E2 induces transgene expression in PV cells across all cortical layers with up to 98% specificity and 86-90% efficiency, depending on layer”.

      These data will be useful to others who might be interested in targeting transgene expression in these cell types in monkeys. Suggestions for improvement are to include more details about the vectors injected and to delete some comments about results that are not documented based on vectors that are not described (see below).

      Major comments:

      Details provided about the AAV vectors used with the h56D enhancer are not sufficient to allow assessment of their potential utility relative to the results presented. All that is provided is: "The fourth animal received 3 injections, each of a different AAV serotype (1, 7, and 9) of the AAV-h56D-tdTomato (Mehta et al., 2019), obtained from the Zemelman laboratory (UT Austin)." At a minimum, it is necessary to provide the titers of each of the vectors. It would also be helpful to provide more information about viral preparation for both these vectors and the AAVPHP.eB-S5E2.tdTomato. Notably, what purification methods were used, and what specific methods were used to measure the titers?

      We thank the Reviewer for this comment. In the revised version of the manuscript, we will provide a Table with titers of each viral vector injected as well as more information regarding viral preparation methods. In fact, the methods for viral preparation and purification are detailed in the original publications so we feel it may be sufficient to cite the original papers?

      The first paragraph of the results includes brief anecdotal claims without any data to support them and without any details about the relevant vectors that would allow any data that might have been collected to be critically assessed. These statements should be deleted. Specifically, delete: "as well as 3 different kinds of PV-specific AAVs, specifically a mixture of AAV1-PaqR4-Flp and AAV1-h56D-mCherry-FRT (Mehta et al., 2019), an AAV1-PV1-ChR2-eYFP (donated by G. Horwitz, University of Washington)," and delete "Here we report results only from those vectors that were deemed to be most promising for use in primate cortex, based on infectivity and specificity. These were the 3 serotypes of the GABA-specific pAAV-h56D-tdTomato, and the PV-specific AAVPHP.eB-S5E2.tdTomato." These tools might in fact be just as useful or even better than what is actually tested and reported here, but maybe the viral titer was too low to expect any expression.

      This data is indeed anecdotal, and while we could delete it from the manuscript, as suggested by the Reviewer, we feel it could be useful information for the scientific community. It could prevent other labs from wasting resources, animals and time, particularly, as some of these vectors have been reported to be selective and efficient in the primate cortex, which we have not been able to confirm. We made several injections in several animals of those vectors that failed either to infect a sufficient number of cells or turned out to be poorly specific. Therefore, the negative results have been consistent. But we agree with the Reviewer that our negative results could have depended on factors such as titer. In the revised version of the manuscript, we will provide a supplementary Methods section in which we will report the specifics of the vectors that failed in our hands (i.e. number of injections made in how many animals, volumes, survival time, and titers).

      Based on the description in the Methods it seems that no antibody labeling against TdTomato was used to amplify the detection of the transgenes expressed from the AAV vectors. It should be verified that this is the case - a statement could be added to the Methods.

      That is indeed the case. We used no immunohistochemistry to enhance the reporter proteins as this was unnecessary. The native / non-amplified tdT signal was strong.

      Reviewer #3 (Public Review):

      Summary:

      Federer et al. describe the laminar profiles of GABA+ and of PV+ neurons in marmoset V1. They also report on the selectivity and efficiency of expression of a PV-selective enhancer (S5E2). Three further viruses were tested, with a view to characterizing the expression profiles of a GABA-selective enhancer (h56D), but these results are preliminary.

      Strengths:

      The derivation of cell-type specific enhancers is key for translating the types of circuit analyses that can be performed in mice - which rely on germline modifications for access to cell-type specific manipulation - in higher-order mammals. Federer et al. further validate the utility of S5E2 as a PV-selective enhancer in NHPs.

      Additionally, the authors characterize the laminar distribution pattern of GABA+ and PV+ cells in V1. This survey may prove valuable to researchers seeking to understand and manipulate the microcircuitry mediating the excitation-inhibition balance in this region of the marmoset brain.

      Weaknesses:

      Enhancer/promoter specificity and efficiency cannot be directly compared, because they were packaged in different serotypes of AAV.

      The three different serotypes of AAV expressing reporter under the h56D promoter were only tested once each, and all in the same animal. There are many variables that can contribute to the success (or failure) of a viral injection, so observations with an n=1 cannot be considered reliable.

      This is an important point that was also brought up by Reviewer 1, which we thoroughly addressed in our comments. For clarity and convenience, we copied our response to Reviewer 1 below:

      We are aware of the limitations of our results on the AAV-h56D. We agree with the Reviewer that a single injection per serotype does not allow us to make strong statements about differences between the 3 serotypes. Therefore, in the revised version of the manuscript we will temper our claims about such differences and use more caution in the interpretation of these data. Despite this weakness, we feel that these data still demonstrate high efficiency and specificity across cortical layers of transgene expression in GABA cells using the h56D promoter, at least with two of the 3 AAV serotypes we tested. We feel that in itself this is sufficiently useful information for the primate community, worthy of being reported. Due to cost, time and ethical considerations related to the use of primates, we chose not to perform additional experiments to determine precise differences among serotypes. Thus, for example, while it is possible that had we replicated these experiments, serotype 7 would have proven equally efficient and specific as the other two serotypes, we felt answering this question did not warrant additional experiments in this precious species.

      The language used throughout conflates the cell-type specificity conferred by the regulatory elements with that conferred by the serotype of the virus.

      In the revised version of the manuscript we will correct ambiguous language.

    2. eLife assessment

      Unlocking the potential of molecular genetic tools (optogenetics, chemogenetics, sensors, etc.) for the study of systems neuroscience in nonhuman primates requires the development of effective regulatory elements for cell-type specific expression to facilitate circuit dissection. This study provides a valuable building block, by carefully characterizing the laminar expression profile of two viral vectors, one designed for general GABAergic neurons and the second for parvalbumin+ cell-type selective expression in the marmoset primary visual cortex. The authors provide solid evidence for the first enhancer S5E2 and incomplete evidence for the second one, h56D. This study contributes to our understanding of these tools but is limited by the understandably small number of animals used.

    3. Reviewer #1 (Public Review):

      Summary:

      Federer et al. tested AAVs designed to target GABAergic cells and parvalbumin-expressing cells in marmoset V1. Several new results were obtained. First, AAV-h56D targeted GABAergic cells with >90% specificity, and this varied with serotype and layer. Second, AAV-PHP.eB.S5E2 targeted parvalbumin-expressing neurons with up to 98% specificity. Third, the immunohistochemical detection of GABA and PV was attenuated near viral injection sites.

      Strengths:

      Vormstein-Schneider et al. (2020) tested their AAV-S5E2 vector in marmosets by intravenous injection. The data presented in this manuscript are valuable in part because they show the transduction pattern produced by intraparenchymal injections, which are more conventional and efficient.

      Weaknesses:

      The conclusions regarding the effects of serotype are based on data from single injection tracks in a single animal. I understand that ethical and financial constraints preclude high throughput testing, but these limitations do not change what can be inferred from the measurements. The text asserts that "...serotype 9 is a better choice when high specificity and coverage across all layers are required". The data presented are consistent with this idea but do not make a strong case for it.

      A related criticism extends to the analysis of injection volume on viral specificity. Some replication was performed here, but reliability across injections was not reported. My understanding is that individual ROIs were treated as independent observations. These are not biological replicates (arguably, neither are multiple injection tracks in a single animal, but they are certainly closer). Idiosyncrasies between animals or injections (e.g. if one injection happened to hit one layer more than another) could have substantial impacts on the measurements. It remains unclear which results regarding injection volume or serotype would hold up had a large number of injections been made into a large number of marmosets.

    4. Reviewer #2 (Public Review):

      This is a straightforward manuscript assessing the specificity and efficiency of transgene expression in marmoset primary visual cortex (V1), for 4 different AAV vectors known to target transgene expression to either inhibitory cortical neurons (3 serotypes of AAV-h56D-tdTomato) or parvalbumin (PV)+ inhibitory cortical neurons in mice. Vectors are injected into the marmoset cortex and then postmortem tissue is analyzed following antibody labeling against GABA and PV. It is reported that: "in marmoset V1 AAV-h56D induces transgene expression in GABAergic cells with up to 91-94% specificity and 80% efficiency, depending on viral serotype and cortical layer. AAV-PHP.eB-S5E2 induces transgene expression in PV cells across all cortical layers with up to 98% specificity and 86-90% efficiency."

      These claims are largely supported but slightly exaggerated relative to the actual values in the results presented. In particular, the overall efficiency for the best h56D vectors described in the results is: "Overall, across all layers, AAV9 and AAV1 showed significantly higher coverage (66.1±3.9 and 64.9%±3.7)". The highest coverage observed is just in middle layers and is also less than 80%: "(AAV9: 78.5%±9.1; AAV1: 76.9%±7.4)". For the AAV-PHP.eB-S5E2 the efficiency reported in the abstract ("86-90%") is also slightly exaggerated relative to the results: "Overall, across all layers coverage ranged from 78%±1.9 for injection volumes >300nl to 81.6%±1.8 for injection volumes of 100nl."

      These data will be useful to others who might be interested in targeting transgene expression in these cell types in monkeys. Suggestions for improvement are to include more details about the vectors injected and to delete some comments about results that are not documented based on vectors that are not described (see below).

      Major comments:

      Details provided about the AAV vectors used with the h56D enhancer are not sufficient to allow assessment of their potential utility relative to the results presented. All that is provided is: "The fourth animal received 3 injections, each of a different AAV serotype (1, 7, and 9) of the AAV-h56D-tdTomato (Mehta et al., 2019), obtained from the Zemelman laboratory (UT Austin)." At a minimum, it is necessary to provide the titers of each of the vectors. It would also be helpful to provide more information about viral preparation for both these vectors and the AAVPHP.eB-S5E2.tdTomato. Notably, what purification methods were used, and what specific methods were used to measure the titers?

      The first paragraph of the results includes brief anecdotal claims without any data to support them and without any details about the relevant vectors that would allow any data that might have been collected to be critically assessed. These statements should be deleted. Specifically, delete: "as well as 3 different kinds of PV-specific AAVs, specifically a mixture of AAV1-PaqR4-Flp and AAV1-h56D-mCherry-FRT (Mehta et al., 2019), an AAV1-PV1-ChR2-eYFP (donated by G. Horwitz, University of Washington)," and delete "Here we report results only from those vectors that were deemed to be most promising for use in primate cortex, based on infectivity and specificity. These were the 3 serotypes of the GABA-specific pAAV-h56D-tdTomato, and the PV-specific AAVPHP.eB-S5E2.tdTomato." These tools might in fact be just as useful or even better than what is actually tested and reported here, but maybe the viral titer was too low to expect any expression.

      Based on the description in the Methods it seems that no antibody labeling against TdTomato was used to amplify the detection of the transgenes expressed from the AAV vectors. It should be verified that this is the case - a statement could be added to the Methods.

    5. Reviewer #3 (Public Review):

      Summary:

      Federer et al. describe the laminar profiles of GABA+ and of PV+ neurons in marmoset V1. They also report on the selectivity and efficiency of expression of a PV-selective enhancer (S5E2). Three further viruses were tested, with a view to characterizing the expression profiles of a GABA-selective enhancer (h56D), but these results are preliminary.

      Strengths:

      The derivation of cell-type specific enhancers is key for translating the types of circuit analyses that can be performed in mice - which rely on germline modifications for access to cell-type specific manipulation - in higher-order mammals. Federer et al. further validate the utility of S5E2 as a PV-selective enhancer in NHPs.

      Additionally, the authors characterize the laminar distribution pattern of GABA+ and PV+ cells in V1. This survey may prove valuable to researchers seeking to understand and manipulate the microcircuitry mediating the excitation-inhibition balance in this region of the marmoset brain.

      Weaknesses:

      Enhancer/promoter specificity and efficiency cannot be directly compared, because they were packaged in different serotypes of AAV.

      The three different serotypes of AAV expressing reporter under the h56D promoter were only tested once each, and all in the same animal. There are many variables that can contribute to the success (or failure) of a viral injection, so observations with an n=1 cannot be considered reliable.

      The language used throughout conflates the cell-type specificity conferred by the regulatory elements with that conferred by the serotype of the virus.

    1. Author response:

      The following is the authors’ response to the original reviews.

      General responses to the weaknesses of this work:

      The two reviewers mentioned two major weaknesses of this work:

      (1) The one unexplained step in this intricately described mechanism is how HSCB functions to promote TACC3 degradation. It appears that the proteasome is involved since MG-132 reverses the effect of HSCB deficiency, but no other details are provided. Does HSCB target TACC3 for ubiquitination somehow? Future studies will be required to understand this portion of the mechanism.

      We totally agree that the detailed mechanisms through which HSCB promotes TACC3 degradation should be clarified. We tried to find the ubiquitin ligases involved in this regulatory process but could not identify such a key protein so far. We also investigated whether HSCB itself is a ubiquitin ligase but found that the protein does not possess this activity. We therefore consider this weakness another limitation of this research and have added one sentence to the penultimate paragraph of the Discussion section to address this issue.

      (2) This study only uses cell models. The significance of this work may be broadened by further studies using animal models.

      We totally agree that in vivo models should be adopted to validate the major findings of this study. As we stated in the penultimate paragraph of the Discussion section, we did not have access to biological samples from the patient harboring the HSCB mutation. Additionally, HSCB constitutive knockout mice died during the embryonic stage, while conditional knockout did not cause embryonic death but resulted in almost no erythroid cells in the bone marrow. Therefore, we were not able to further validate our findings in in vivo models.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      • Figure 3A - Should include FOG1 on the total cell lysate blots to show if total FOG1 is changing or only the cytoplasmic/nuclear ratio. This is shown later but would be good to include here.

      We would like to thank the reviewer for the nice suggestion. We have added the blots for total FOG1 to updated Figure 3A as requested.

      • Figures 3C and 4F - Should include the qPCR results from control cultures on the graphs (EPO + CRISPR NC and shNC, respectively).

      We would like to thank the reviewer for the good suggestion. We have added the control groups for all qPCR assays to the updated figures throughout the study.

      • Figure 4 - The addition of genetic manipulation of TACC3 to confirm its role in the cytoplasmic retention of FOG1 and failed erythroid differentiation in HSCB-deficient cells would strengthen the conclusions of this figure.

      We would like to thank the reviewer for the good suggestion. We initially tried to knock down TACC3 expression through siRNAs to confirm its role in the cytoplasmic retention of FOG1. However, we found that siRNAs that worked well in untreated K562 and erythroid progenitor cells as well as several other cell lines had poor efficiency of knocking down gene expression upon HSCB deficiency. This happened not only to siRNAs targeting TACC3, but also to those targeting several other genes. Interestingly, gene overexpression plasmids worked especially well in HSCB-deficient cells. We were not able to explain these phenomena and chose to use an inhibitor of TACC3 to study its functional implications in this research.

      • Text should be added to discuss the implications of this work for the lineage-specifying function of GATA-1. There are papers by John Crispino and Alan Cantor/Stu Orkin using the FOG-binding mutant of GATA-1 that implicate FOG1-dependent GATA-1 activity as Meg/Ery specifying, whereas FOG1-independent GATA-1 activity promotes mast cell or eosinophil fate. This work suggests that GATA1-expressing myeloid progenitors where FOG1 is kept cytoplasmic (no EPO signaling) would be driven towards the mast cell fate.

      We would like to thank the reviewer for the valuable suggestion. We have added a new paragraph in the Discussion section of the updated manuscript to discuss the implication of this work for the lineage-specifying function of GATA-1.

      Reviewer #2 (Recommendations For The Authors):

      Minor comments:

      (1) In the model provided in Figure 7H, HSCB and FOG1 bind TACC3 simultaneously. However, based on the data provided in Figure 6B and other figures, it seems that their interactions are more likely to be mutually exclusive. Is there a possibility that, besides inducing the degradation of TACC3, the binding of HSCB can inhibit the interaction between TACC3 and FOG1?

      We would like to thank the reviewer for the insightful comment. According to the data presented in the updated Figure 5F, TACC3 can simultaneously bind with HSCB and FOG1 in E 2-day HSCs. That is why we depict the simultaneous binding pattern in the model provided in Figure 7H. However, we agree that there is a possibility that the binding of HSCB can inhibit the interaction between TACC3 and FOG1 and have mentioned this possibility in the “Phosphorylation of HSCB by PI3K was necessary for its functionalization during human erythropoiesis” subsection of the “Results” section in the updated manuscript.

      (2) Is the decreased TACC3 protein abundance (Figure 5D) during erythroblast differentiation mainly due to the effect of HSCB? Can silencing of HSCB block this reduction?

      We would like to thank the reviewer for the great question. We have analyzed the protein abundance of TACC3 in HSCB-deficient hematopoietic stem cells induced for erythropoiesis for 0, 2 and 4 days and summarized the results as a new Figure 5E. According to the results, TACC3 protein abundance in HSCB-deficient hematopoietic stem cells exhibited no obvious change when the cells were induced for erythropoiesis for 0, 2 and 4 days. These results suggest that the decreased TACC3 protein abundance during early erythroblast differentiation was indeed due to the effect of HSCB. We only investigated the effect of HSCB on TACC3 abundance in early erythroid progenitors because, as shown in Figure 1, HSCB-deficient hematopoietic stem cells stopped differentiation at an early phase of their erythropoiesis. We have also mentioned these data in the “HSCB facilitated FOG1 nuclear translocation by binding with and mediating the proteasomal degradation of TACC3 upon activation of the EPO/EPOR signaling” subsection of the “Results” section in the updated manuscript.

      (3) This study shows that HSCB can be phosphorylated by PI3K, and this modification is important for its role in regulating FOG1 distribution. Does the phosphorylation of HSCB also affect its function in ISC biogenesis?

      We would like to thank the reviewer for the instructive question. We have analyzed the mitochondrial and cytosolic aconitase activities in wortmannin-treated K562 and E 2-day HSCs and their respective controls. The results have been summarized as a new Figure S5. According to the results, wortmannin treatment did not significantly affect mitochondrial and cytosolic aconitase activities. Therefore, it seems that HSCB phosphorylation does not affect its function in ISC biogenesis. We have also mentioned these data in the “Phosphorylation of HSCB by PI3K was necessary for its functionalization during human erythropoiesis” subsection of the “Results” section in the updated manuscript.

      (4) The method of isolation of nuclear fraction needs to be provided in the "Materials and Methods" section.

      We would like to thank the reviewer for the thoughtful suggestion. We have added the required information to the “Nuclear proteomics analysis” subsection of the "Materials and Methods" section in the updated manuscript.

    2. eLife assessment

      This fundamental work significantly advances our understanding of how FOG1 nuclear localization is regulated during erythropoiesis and megakaryopoiesis, including the role of EPO and MPL/TPO signaling in this process. The authors provide compelling evidence using both K562 and CD34+ cells that heat shock cognate B (HSCB) can promote the proteasomal degradation of TACC3 to regulate the nuclear localization of FOG1, and that this function is independent of its role in iron-sulfur cluster (ISC) biogenesis. Together these data will be of interest to the fields of hematopoiesis and cell biology.

    3. Reviewer #1 (Public Review):

      Summary:

      In the paper entitled "PI3K/HSCB axis facilitates FOG1 nuclear translocation to promote erythropoiesis and megakaryopoiesis", the authors sought to determine the role of HSCB, a known regulator of iron-sulfur cluster transfer, in the generation of erythrocytes and megakaryocytes. They utilized a human primary cell model of hematopoietic differentiation to identify a novel mechanism whereby HSCB is necessary for activation of erythroid and megakaryocytic gene expression through regulation of the nuclear localization of FOG-1, an essential transcription co-regulator of the GATA transcription factors. Their work establishes this novel regulatory axis as a mechanism by which cytokine signaling through EPO-R and MPL drives the lineage-specification of hematopoietic progenitors to erythrocytes and megakaryocytes, respectively.

      Impact:

      The major impact of this work is in a greater understanding of how cytokine signaling through EPO/TPO functions to promote lineage specification of hematopoietic stem/progenitor cells. While the major kinase cascades downstream of the EPO/TPO receptors have been elucidated, how those cascades affect gene expression to promote a specific differentiation program is poorly understood. From this work, we now understand that nuclear localization of FOG is a critical regulatory node by which EPO/TPO signaling is required to launch FOG-dependent gene expression. However, these cytokine receptors have many overlapping and redundant targets, so it still remains to be elucidated how signaling through the different receptors promotes divergent gene expression programs. Perhaps similar regulatory mechanisms exist for other lineage-specifying transcription factors.

      Strengths:

      The authors use two different cellular models of erythroid differentiation (K562 and human primary CD34+ cells) to elucidate the multi-factorial mechanism controlling FOG-1 nuclear localization. The studies are well-controlled and rigorously establish their mechanism through complementary approaches. The differentiation effects are established through cell surface marker expression, protein expression, and gene expression analyses. Novel protein interactions discovered by proteomics analyses were validated through bi-directional co-IP experiments in multiple experimental systems. Protein cellular localization findings are supported by both immunofluorescence and cell fractionation immunoblot analyses. The robustness of their experimental findings gives great confidence for the likelihood that the methods and findings can be reproduced in future work based on their conclusions.

      Weaknesses:

      The one unexplained step in this intricately described mechanism is how HSCB functions to promote TACC3 degradation. It appears that the proteasome is involved since MG-132 reverses the effect of HSCB deficiency, but no other details are provided. Does HSCB target TACC3 for ubiquitination somehow? Future studies will be required to understand this portion of the mechanism.

      One weakness of the study design is that no in vivo experiments are conducted. The authors comment that the HSCB mouse phenotype is too dramatic to permit studies of erythropoiesis in vivo; however, a conditional approach could have been pursued.

      It should also be noted that a previous study had already shown that TACC3 regulates the nuclear localization of FOG-1, so this portion of the mechanism is not entirely novel. However, the role of HSCB and the proteasomal degradation of TACC3 is entirely novel to my knowledge.

    4. Reviewer #2 (Public Review):

      Summary:

      In this manuscript, Liu et al. identified an important pathway regulating the nuclear translocation of the key transcriptional factor FOG1 during human hematopoiesis. The authors show that heat shock cognate B (HSCB) can interact with and promote the proteasomal degradation of TACC3, and this function is independent of its role in iron-sulfur cluster biogenesis. TACC3 represses the activity of FOG1 by sequestering it in the cytoplasm. Therefore, HSCB can promote the nuclear translocation of FOG1 through down-regulating TACC3. The authors further show that the phosphorylation of HSCB by PI3K downstream of the EPO signaling pathway is important for its role in regulating the nuclear translocation of FOG1. The data are solid and the manuscript is overall well written. The findings of this manuscript provide important new knowledge to the fields of hematopoiesis and cell biology.

      Strengths:

      (1) This study uses a multi-pronged approach that combines techniques from a number of fields to convincingly demonstrate the pathway regulating the nuclear translocation of FOG1 during hematopoiesis. The proposed role of each component in the pathway is well supported by solid data.

      (2) This work provides important new insights into the function of HSCB, which was known to be an iron-sulfur cluster assembly protein. This study identifies a new role of HSCB and shows that HSCB can regulate the stability of the TACC3 protein, and this cytoplasmic function of HSCB is regulated by protein phosphorylation by PI3K.

      (3) The findings of this work open up new directions for research in hematopoiesis and related fields. For example, are there any other TACC3-binding proteins whose subcellular localization is regulated by the presence or absence of TACC3? What is the E3 ligase responsible for the degradation of TACC3? Does this identified mechanism contribute to the sideroblastic anemias observed in HSCB human patients and animal models?

    1. eLife assessment

      The studies described here are useful; they are broadly applicable to all antibody discovery subfields but do not add significant improvement to techniques already published. The findings are incomplete with respect to the methodology since details that are crucial in order to repeat the experiment are lacking (such as a timestamp) and they do not take into account multiple recent papers that have tested similar strategies. These studies will be of interest to a specialized audience working on making antibodies to infectious agents.

    2. Reviewer #1 (Public Review):

      Summary:

      This paper by Watanabe et al. describes an expression system that can express the paired heavy and light chains of IgG antibodies from single B cells. In addition, they used FACS sorting for specific antigens to screen/select the specific populations for more targeted cloning of mAb genes. By staining with multiple antigens, they were able to zoom in on cross-reactive antibodies.

      Strengths:

      A highly efficient process that combines selection/screening with dual expression of both antibody chains. It is particularly suitable for the isolation of cross-reactive antibodies against conserved epitopes of different antigens, such as surface proteins of related viruses.

      Weaknesses:

      (1) The overall writing is very difficult to follow and the authors need to work on significant re-writing.

      (2) The paper in its current form really lacks detail and it is NOT possible for readers to repeat or follow their methods. For example: a) It is not clear whether the authors checked the serum to see if the mice were producing antibodies before they sacrificed them to harvest spleen/blood i.e. using ELISA? b) How long after administration of the second dose were the mice sacrificed? c) What cell types are taken for single B cell sorting? Splenocytes or PBMC? These are just some of the questions which need to be addressed.

      (3) According to the authors, 77 clones were sorted from the PR8+ and H2+ double positive quadrant. It is surprising that after transfection and re-analysing of bulk antibody presenting EXPI cells on FACS, only 13 clones (or 8 clones? - unclear) seemed to be truly cross-reactive. If that is the case, the approach is not as efficient as the authors claimed.

    3. Reviewer #2 (Public Review):

      Summary:

      Watanabe, Takashi, et al. investigated the use of the Golden Gate dual-expression vector system to enhance the modern standard for rapid screening of recombinant monoclonal antibodies. The presented data builds upon modern techniques that currently use multiple expression vectors to express heavy and light chain pairs. In a single vector, they express the linked heavy and light chain variable genes with a membrane-bound Ig which allows for rapid and more affordable cell-based screening. The final validation of H1 and H2 strain influenza screening resulted in 81 "H1+", 48 "H2+", and 9 "cross" reactive clones. The kinetics of some of the soluble antibodies were tested via SPR and validated with a competitive inhibition with classical well-characterized neutralizing clones.

      Strengths:

      In this study, Watanabe, Takashi, et al. further develop and refine the methodologies for the discovery of monoclonal antibodies. They elegantly merge newer technologies to speed up turnaround time and reduce the cost of antibody discovery. Their data supports the feasibility of their technique.

      This study will have an impact on pandemic preparedness and antibody-based therapies.

      Weaknesses:

      A His-tagged antigen was used for immunization and H1-his was used in all assays. Either the removal of His-specific clones needs to be done before selection, or a different tag needs to be used in the subsequent assays.

      This assay doesn't directly test the neutralization of influenza but rather equates viral clearance to competitive inhibition. The results would be strengthened with the demonstration of a functional antibody in vivo with viral clearance.

      Limitations of this new technique are as follows: there is a significant loss of cells during FACS, transfection and cloning efficiency are critical to success, and well-based systems limit the number of possible clones (as the author discussed in the conclusions). Early enrichment of the B cells could improve efficiency, such as selection for memory B cells.

    1. eLife assessment

      The authors present valuable findings on trends in hind limb morphology through the evolution of titanosaurian sauropod dinosaurs, the land animals that reached the most remarkable gigantic sizes. The solid results include the use of 3D geometric morphometrics to examine the femur, tibia, and fibula to provide new information on the evolution of this clade and on evolutionary trends between morphology and allometry. Further justification of the ontogenetic stages of the sampled individuals would help strengthen the manuscript's conclusions, and the inclusion of additional large-body mass taxa could provide expanded insights into the proposed trends.

    2. Reviewer #1 (Public Review):

      Summary:

      Páramo et al. used 3D geometric morphometric analyses of the articulated femur, tibia, and fibula of 17 macronarian taxa (known to preserve these three skeletal elements) to investigate morphological changes that occurred in the hind limb through the evolutionary history of this sauropod clade. A principal components analysis was completed to understand the distribution of the morphological variation. A supertree was constructed to place evolutionary trends in morphological variation into phylogenetic context, and hind limb centroid size was used to investigate potential relationships between skeletal anatomy and gigantism. The majority of the results did not yield statistically significant differences, but they did identify interesting shape-change trends, especially within subclades of Titanosauria. Many previous studies have attempted to elucidate a link between wide-gauge posture and gigantism, which in this study Páramo et al. investigate among several titanosaurian subclades. They propose that morphologies associated with wide-gauge posture arose in parallel with increasing body size among basal members of Macronaria and that this connection became less significant once wide-gauge posture was acquired within Titanosauria. The authors also suggest that other biomechanical factors influenced the independent evolution of subclades within Titanosauria and that these influences resulted in instances of convergent evolution. Therefore, they infer that, overall, wide-gauge posture was not significantly correlated with gigantism, though some morphological aspects of hind limb skeletal anatomy appear to have been associated with gigantism. Their work also supports previous findings of a decrease in body size within Titanosauriformes (which they found to be not significant with shape variables but significant with Pagel's lambda). Collectively, their results support and build on previous work to elucidate more specifics on the evolution of this enigmatic clade. 
Further study will show if their hypotheses stand or if the inclusion of additional specimens and taxa yields alternative results.

      Strengths:

      Páramo et al. were diligent in their efforts to digitize and prepare specimens for this study while also minimizing user bias. Their previous work provided a strong platform for this study, specifically for their robust methodology. Between their supplemental files (which include details about specimen digitization and preparation) and the main body of the manuscript, the authors fully provide their results in detailed tables and figures. Their conclusions on evolutionary trends within Titanosauria are reasonably well supported (see weaknesses below) and they provide important details that enhance our understanding of the evolution of this clade and complement previous findings. Their discussion of links between morphology and various biomechanical adaptations is important, and future studies can use these results to investigate such biomechanical adaptations. The trends they identify within the subclades of Titanosauria are very interesting and highlight the diversity of this clade. It is possible that additional investigations of the evolution of these subclades could unite findings in sauropod myology and biomechanics, each of which has been suggested to vary among titanosaurian taxa without a clear phylogenetic or evolutionary distribution. The authors suggest that certain common morphologies arose via convergent evolution among titanosaurian subclades, such as members of Colossosauria exhibiting morphologies more similar to basal titanosaurians than derived saltasaurines. While this conclusion about convergent evolution is not well explained, only future testing will determine if this hypothesis remains supported. Additionally, the authors discuss the influence of uncertainty on the phylogenetic position of some taxa, and this reminds readers to view their findings as tentative trends that may be illuminated through further quantitative analyses. 
If one accepts the use of hind limb centroid size as a reliable approximation of body size (see concerns in Weaknesses below) then their data also support a hypothesis of decreasing body size through titanosaurian evolution (with PC 2 further differentiating small titanosaurian taxa from one another), providing an opportunity for future analyses to further investigate these interesting trends.

      Weaknesses:

      Several sentences throughout the manuscript could benefit from citations. For example, the discussion of using hind limb centroid size as a proxy for body mass has no citations attributed. This should be cited or described as a new method for estimating body mass with data from extant taxa presented in support of this relationship. This particular instance is a very important point to include supporting documentation because the authors' conclusions about evolutionary trends in body size are predicated on this relationship.

      An additional area of concern is the lack of any discussion of taphonomic deformation in Section 3.3 Caveats of This Study, the results, or the methods. The authors provide a long and detailed discussion of taphonomic loss and how this study does a good job of addressing it; however, taphonomic deformation to specimens and its potential effects on the ensuing results were not addressed at all. Hedrick and Dodson (2013) highlight that, with fossils, a PCA typically includes the effects of taphonomic deformation in addition to differences in morphology, which results in morphometric graphs representing taphomorphospaces. For example, in this study, the extreme negative positioning of Dreadnoughtus on PC 2 (which the authors highlight as "remarkable") is almost certainly the result of taphonomic deformation to the distal end of the holotype femur, as noted by Ullmann and Lacovara (2016).

      The authors investigated 17 taxa and divided them into 9 clades, with only Titanosauria and Lithostrotia including more than two taxa (and four clades are only represented by one taxon). While some of these clades represent the average of multiple individuals, the small number of plotted taxa can only weakly support trends within Titanosauria. If similar general trends could be found when the taxa are parsed into fewer, more inclusive clades, it would support and strengthen their claims. Of course, the authors can only study what is preserved in the fossil record, and titanosaurian remains are often highly fragmentary; these deficiencies should therefore not be held against the authors. They clearly put effort and thought into their choices of taxa to include in this study, but there are limitations arising from this low sample size that inherently limit the confidence that can be placed on their conclusions, and this caveat should be more clearly discussed. Specifically, the authors note that their dataset contains many lithostrotians, but they do not discuss unevenness in body size sampling. As neither their size-category boundaries nor the taxa which fall into each of them are clearly stated, the reader must parse the discussion to glean which taxa are in each size category. It should be noted that the authors include both Jainosaurus and Dreadnoughtus as 'large' taxa even though the latter is estimated to have been roughly five times the body mass of the former, making Dreadnoughtus the only taxon included in this extreme size category. The effects that this may have on body size trends are not discussed. Additionally, few taxa between the body masses of Jainosaurus and Dreadnoughtus have been included even though the hind limbs of several such macronarians have been digitized in prior studies (such as Diamantinasaurus and Giraffatitan; Klinkhamer et al. 2018). 
Also, several members of Colossosauria are more similar in general body size to Dreadnoughtus than Jainosaurus, but unfortunately, they do not preserve a known femur, tibia, and fibula, so the authors could not include them in this study. Exclusion of these taxa may bias inferences about body size evolution, and this is a sampling caveat that could have been discussed more clearly. Future studies including these and other taxa will be important for further evaluating the hypotheses about macronarian evolution advanced by Páramo et al. in this study.

    3. Reviewer #2 (Public Review):

      The authors report a quantitative comparative study regarding hind limb evolution among titanosaurs. I find the conclusions and findings of the manuscript interesting and relevant. The strength of the paper would be increased if the authors were to improve their reporting of taxon sampling and their discussion of age estimation and the potential implications that uncertainty in these estimates would have for their conclusions regarding gigantism (vs. ontogenetic patterns).

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      Following small molecule screens, this study provides convincing evidence that 7,8 dihydroxyflavone (DHF) is a competitive inhibitor of pyridoxal phosphatase. These results are important since they offer an alternative mechanism for the effects of 7,8 dihydroxyflavone in cognitive improvement in several mouse models. This paper is also significant due to the interest in the protein phosphatases and neurodegeneration fields.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Zink et al. set out to identify selective inhibitors of the pyridoxal phosphatase (PDXP). Previous studies had demonstrated improvements in cognition upon removal of PDXP, and here the authors reveal that this correlates with an increase in pyridoxal phosphate (PLP; PDXP substrate and an active coenzyme form of vitamin B6) with age. Since several pathologies are associated with decreased vitamin B6, the authors propose that PDXP is an attractive therapeutic target in the prevention/treatment of cognitive decline. Following high throughput and secondary small molecule screens, they identify two selective inhibitors. They follow up on 7,8 dihydroxyflavone (DHF). Following structure-activity relationship and selectivity studies, the authors then solve a co-crystal structure of 7,8 DHF bound to the active site of PDXP, supporting a competitive mode of PDXP inhibition. Finally, they find that treating hippocampal neurons with 7,8 DHF increases PLP levels in a WT but not PDXP KO context. The authors note that 7,8 DHF has been used in numerous rodent neuropathology models to improve outcomes. 7,8 DHF activity was previously attributed to activation of the receptor tyrosine kinase TrkB, although this appears to be controversial. The present study raises the possibility that it instead/also acts through modulation of PLP levels via PDXP, and is an important area for future work.

      Strengths:

      The strengths of the work are in the comprehensive, thorough, and unbiased nature of the analyses revealing the potential for therapeutic intervention in a number of pathologies.

      Weaknesses:

      Potential weaknesses include the poor solubility of 7,8 DHF that might limit its bioavailability given its relatively low potency (IC50 = 0.8 µM), which was not improved by SAR. However, the compound has an extended residence time and the co-crystal structure could aid the design of more potent molecules and would be of interest to those in the pharmaceutical industry. The images related to crystal structure could be improved.

      Reviewer #2 (Public Review):

      Summary:

      In this study, the authors performed a screening for PDXP inhibitors to identify compounds that could increase levels of pyridoxal 5'-phosphate (PLP), the co-enzymatically active form of vitamin B6. For the screening of inhibitors, they first evaluated a library of about 42,000 compounds for activators and inhibitors of PDXP and secondly, they validated the inhibitor compounds with a counter-screening against PGP, a close PDXP relative. The final narrowing down to 7,8-DHF was done using PLP as a substrate and confirmed the efficacy of this flavonoid as an inhibitor of PDXP function. Physiologically, the authors show that, by acutely treating isolated wild-type hippocampal neurons with 7,8-DHF they could detect an increase in the ratio of PLP/PL compared to control cultures. This effect was not seen in PDXP KO neurons.

      Strengths:

      The screening and validation of the PDXP inhibitors have been done very well because the authors have performed crystallographic analysis, a counter screening, and mutation analysis. This is very important because such rigor has not been applied to the original report of 7,8 DHF as an agonist for TrkB, which is why there is so much controversy about this finding.

      Weaknesses:

      As mentioned in the summary report the study may benefit from some in vivo analysis of PLP levels following 7,8-DHF treatment, although I acknowledge that it may be challenging because of the working out of the dosage and timing of the procedure.

      Reviewer #3 (Public Review):

      This is interesting biology. Vitamin B6 deficiency has been linked to cognitive impairment. It is not clear whether supplements are effective in restoring functional B6 levels. Vitamin B6 is composed of pyridoxal compounds and their phosphorylated forms, with pyridoxal 5-phosphate (PLP) being of particular importance. The levels of PLP are determined by the balance between pyridoxal kinase and phosphatase activities. The authors are testing the hypothesis that inhibition of pyridoxal phosphatase (PDXP) would arrest the age-dependent decline in PLP, offering an alternative therapeutic strategy to supplements. Published data illustrating that ablation of the Pdxp gene in mice led to increases in PLP levels and improvement in learning and memory trials are consistent with this hypothesis.

      In this report, the authors conduct a screen of a library of ~40k small molecules and identify 7,8-dihydroxyflavone (DHF) as a candidate PDXP inhibitor. They present an initial characterization of this micromolar inhibitor, including a co-crystal structure of PDXP and 7,8-DHF. In addition, they demonstrate that treatment of cells with 7,8-DHF increases PLP levels. Overall, this study provides further validation of PDXP as a therapeutic target for the treatment of disorders associated with vitamin B6 deficiency and provides proof-of-concept for inhibition of the target with small-molecule drug candidates.

      Strengths include the biological context, the focus on an interesting and under-studied class of protein phosphatases that includes several potential therapeutic targets, and the identification of a small molecule inhibitor that provides proof-of-concept for a new therapeutic strategy. Overall, the study has the potential to be an important development for the phosphatase field in general.

      Weaknesses include the fact that the compound is very much an early-stage screening hit. It is an inhibitor with micromolar potency for which mechanisms of action other than inhibition of PDXP have been reported. Extensive further development will be required to demonstrate convincingly the extent to which its effects in cells are due to on-target inhibition of PDXP.

      Recommendations for the authors:

      There is general agreement that the study represents an advance regarding the mechanisms of pyridoxal phosphatase and 7,8 DHF. From the reviewers' comments, several major questions and considerations are raised, followed by their detailed remarks:

      (1) More analysis of the solubility and dose of 7,8 DHF with regard to the 50% inhibition and the salt bridge of the B protomer, as raised by the reviewers.

      (2) Is there a possible involvement of another phosphatase?

      (3) Does 7,8 DHF cause an effect upon TrkB tyrosine phosphorylation?

      We thank the Reviewers and Editors for their fair and constructive comments and suggestions. We have performed additional experiments to address these questions and considerations. In addition, we have generated two new high-resolution (1.5 Å) crystal structures of human PDXP in complex with 7,8-DHF that substantially expand our understanding of 7,8-DHF-mediated PDXP inhibition. The scientist who performed this work for the revision of our manuscript has been added as an author (shared first authorship).

      We believe that the insights gained from these new data have further strengthened and improved the quality of our manuscript. Together, our data provide compelling evidence that 7,8-dihydroxyflavone is a direct and competitive inhibitor of pyridoxal phosphatase.

      Please find our point-by-point responses to the Public Reviews that are not addressed in the Recommendations for the Authors, and the Recommendations for the Authors below.

      Reviewer #2:

      As mentioned in the summary report the study may benefit from some in vivo analysis of PLP levels following 7,8-DHF treatment, although I acknowledge that it may be challenging because of the working out of the dosage and timing of the procedure.

      We agree that an in vivo analysis of PLP levels following 7,8-DHF treatment could be informative for the further evaluation of a possible mechanistic link between the reported effects of this compound and PDXP/vitamin B6. However, we currently do not have a corresponding animal experimentation permission in place and are unlikely to obtain such a permit within a reasonable time frame for this revision.

      Recommendations For The Authors:

      Reviewer #1:

      The work is already well-written, comprehensive, and convincing.

      Suggestions that could improve the manuscript.

      (1) Include a protein tyrosine phosphatase (PTP) in the selectivity analysis. One possibility is that 7,8 DHF acts on a PTP (such as PTP1B), leading to TrkB activation by preventing dephosphorylation. I note that a previous study has looked at SAR for flavones with PTP1B (PMID: 29175190), which is worth discussion.

      We thank the reviewer for bringing this interesting possibility to our attention. We were not aware of the SAR study for flavonoids with PTP1B by Proenca et al. but have now tested the effect of 7,8-DHF on PTP1B, referring to this paper. As shown in Figure 2d, PTP1B was not inhibited by 7,8-DHF at a concentration of 5 or 10 µM. At the highest tested concentration of 40 µM, 7,8-DHF inhibited PTP1B merely by ~20%. For comparison, compound C13 (3-hydroxy-7,8-dihydroxybenzylflavone-3’,4’dihydroxymethyl-phenyl), which emerged as the most active flavonoid in the SAR study by Proenca et al. inhibited PTP1B with an IC50 of 10 µM. Consistent with the results of these authors, our finding confirms that less polar substituents, such as O-benzyl groups at positions 7 and 8, and O-methyl groups at positions 3’ and 4’ of the flavone scaffold, are important for the ability of flavonoids to effectively inhibit PTP1B. We conclude that PTP1B inhibition by 7,8-DHF is unlikely to be a primary contributor to the reported cellular and in vivo effects of this flavone.

      In addition to PTP1B, we have now additionally tested the effect of 7,8-DHF on the serine/threonine protein phosphatase calcineurin/PP2B, the DNA/RNA-directed alkaline phosphatase CIP, and three other metabolite-directed HAD phosphatases, namely NANP, NT5C1A and PNKP. PP2B, CIP and NANP were not inhibited by 7,8-DHF. Similar to PTP1B, PNKP activity was attenuated (~30%) only at 40 µM 7,8-DHF. In contrast, 7,8-DHF effectively inhibited NT5C1A (IC50 ~10 µM). NT5C1A is an AMP hydrolase expressed in skeletal muscle and heart. To our knowledge, a role of NT5C1A in the brain has not been reported. Based on currently available information, the inhibition of NT5C1A therefore appears unlikely to contribute to 7,8-DHF effects in the brain.

      The results of these experiments are shown in the revised Figure 2d. Taken together, the extended selectivity analysis of 7,8-DHF on a total of 12 structurally and functionally diverse protein- and nonprotein-directed phosphatases supports our initial conclusion that 7,8-DHF preferentially inhibits PDXP.

      (2) Line 144: It is unclear how fig 2c supports the statement here. Remove call out for clarity.

      Our intention was to highlight the fact that 7,8-DHF concentrations >12.5 µM could not be tested in the BLI assay (shown in Figure 2c) due to 7,8-DHF solubility issues under these experimental conditions. However, since this is discussed in the text, but not directly visible in Figure 2c, we agree with the Reviewer and have removed this call out.

      (3) Figure 3a. It is difficult to see the pink 7,8 DHF on top of the pink ribbon backbone. A better combination of colours could be used. Likewise in Figure 3b it is pink on pink again.

      We have improved the combination of colors to enhance the visibility of 7,8-DHF and have consistently color-coded murine and the new human PDXP structures throughout the manuscript.

      (4) Figure 3c and d. These are the two protomers I believe, but the colour coding is not present in 3c where the ribbon is now gray. Please choose colours that can be used to encode protomers throughout the figure.

      Please see response to point 3 above.

      (5) Figure 3f. I think this is the same protomer as 3c but a 180-degree rotation. Could this be indicated, or somehow lined up between the two figures for clarity? It would also be useful to have 3e in the same orientation as 3f, to better visualise the overlap with PLP binding. PLP and 7,8 DHF could be labelled similarly to the amino acids in 3f (the colour coding here is helpful).

      Please see response to point 3 above. We have substantially revised the structural figures and have used consistent color coding and the same perspective of 7,8-DHF in the PDXP active sites.

      (6) Figure 3g. The colours of the bars relating to specific mutations do not quite match the colours in Figure 3f, which I think was the aim and is very helpful.

      We have adapted the colours of the residues in Figure 3f (now Fig. 3b and additionally Fig. 3 – figure supplement 1e) so that they exactly match the colours of the bars in Figure 3g (now Fig. 3d).

      Reviewer #2:

      No further comments.

      Reviewer #3:

      Page 4: The authors describe 7,8DHF as a "selective" inhibitor of PDXP - in my opinion, they do not have sufficient data to support such a strong assertion. Reports that 7,8DHF may act as a TRK-B-agonist already highlight a potential problem of off-target effects. Does 7,8DHF promote tyrosine phosphorylation of TRK-B in their hands? The selectivity panel presented in Figure 2, focusing on 5 other HAD phosphatases, is much too limited to support assertions of selectivity.

      We agree with the Reviewer that our previous selectivity analysis with six HAD phosphatases was limited. To further explore the phosphatase target spectrum of 7,8-DHF, we have now analyzed six other enzymes: three other non-HAD phosphatases (the tyrosine phosphatase PTP1B, the serine/threonine protein phosphatase PP2B/calcineurin, and the DNA/RNA-directed alkaline phosphatase/CIP) and three other non-protein-directed C1/C0-type HAD phosphatases (NT5C1A, NANP, and PNKP). The C1-capped enzymes NT5C1A and NANP were chosen because we previously found them to be sensitive to small molecule inhibitors of the PDXP-related phosphoglycolate phosphatase PGP (PMID: 36369173). PNKP was chosen to increase the coverage of C0-capped HAD phosphatases (previously, only the C0-capped MDP1 was tested).

      We found that calcineurin, CIP and NANP were not inhibited by up to 40 µM 7,8-DHF. The activities of PTP1B and PNKP were attenuated (by ~20% and ~30%, respectively) only at 40 µM 7,8-DHF. In contrast, 7,8-DHF effectively inhibited NT5C1A (IC50 ~10 µM). We have previously found that NT5C1A was sensitive to small-molecule inhibitors of the PDXP paralog PGP, although these molecules are structurally unrelated to 7,8-DHF (PMID: 36369173). NT5C1A is an AMP hydrolase expressed in skeletal muscle and heart (PMID: 12947102). To our knowledge, a role of NT5C1A in the brain has not been reported. Based on currently available information, the inhibition of NT5C1A therefore appears unlikely to contribute to 7,8-DHF effects in the brain. The results of these experiments are shown in the revised Figure 2d. Taken together, the extended selectivity analysis of 7,8-DHF on a total of 12 structurally and functionally diverse protein- and non-protein-directed phosphatases supports our initial conclusion that 7,8-DHF preferentially inhibits PDXP. To nevertheless avoid any overstatement, we have now also replaced “selective” by “preferential” in this context throughout the manuscript.

      We have not tested if 7,8-DHF promotes tyrosine phosphorylation of TRK-B. Being able to detect 7,8-DHF-induced TRK-B phosphorylation in our hands would not exclude an additional role for PDXP/vitamin B6-dependent processes. Not being able to detect TRK-B phosphorylation may indicate absence of evidence or evidence of absence. This would neither conclusively rule out a biological role for 7,8-DHF-induced TRK-B phosphorylation in vivo, nor contribute further insights into a possible involvement of vitamin B6-dependent processes in 7,8-DHF-induced effects.

      Page 6: The authors report that they obtained only two PDXP-selective inhibitor hits from their screen; 7,8DHF and something they describe as FMP-1. For the latter, they state that it "was obtained from an academic donor, and its structure is undisclosed for intellectual property reasons". In my opinion, this is totally unacceptable. This is an academic research publication. If the authors wish to present data, they must do so in a manner that allows a reader to assess their significance; in the case of work with small molecules that includes the chemical structure. In my opinion, the authors should either describe the compound fully or remove mention of it altogether.

      We are unable to describe “FMP-1” because its identity has not been disclosed to us. The academic donor of this molecule informed us that they were not able to permit release of any details of its structure or general structural class due to an emerging commercial interest.

      We mentioned FMP-1 simply to highlight the fact that the screening campaign yielded more than one inhibitor. FMP-1 was also of interest due to its complete inhibition of PDXP phosphatase activity.

      Because the structure of this molecule is unknown to us, we have now removed any mention of this compound in the manuscript. For the same reason, we have removed the mention of the inhibitor hits “FMP-2” and “FMP-3” in Figure 2 – figure supplement 1 and Figure 2 – figure supplement 2. The number of PDXP inhibitor hits in the manuscript has been adapted accordingly.

      Page 7: The observed plateau at 50% inhibition requires further explanation. It is not clear how poor solubility of the compound explains this observation. For example, the authors state that "due to the aforementioned poor solubility of 7,8DHF, concentrations higher than 12.5µM could not be evaluated". Yet on page 8, they describe assays against the specificity panel at concentrations of compound up to 40µM. Do the analogues of 7,8DHF (Fig 2b) result in >50% inhibition at higher concentrations? Further explanation and data on the solubility of the compounds would be of benefit.

      We currently do not have a satisfactory explanation for the apparent plateau of ~50% PDXP inhibition by 7,8-DHF. Resolving this question will likely require other approaches, including computational chemistry such as molecular dynamics simulations, and we feel that this is beyond the scope of the present manuscript.

      We previously speculated that the limited solubility of 7,8-DHF may counteract a complete enzyme inhibition if higher concentrations of this molecule are required. Specifically, we referred to Todd et al. who have performed HPLC-UV-based solubility assays of 7,8-DHF (ref. 35). These authors found that immediately after 7,8-DHF solubilization, nominal 7,8-DHF concentrations of 5, 20 or 50 µM resulted in 0.5, 3.0 or 13 µM of 7,8-DHF in solution (i.e., 10, 15 or 26% of the respective nominal concentration). Seven hours later, 46, 26 or 26% of the respective nominal 7,8-DHF concentrations were found in solution. Hence, above a nominal concentration of 5 µM, 7,8-DHF solubility does not increase linearly with the input concentration, but plateaus at ~20% of the nominal concentration. This phenomenon could potentially contribute to the apparent plateau of human or murine PDXP inhibition by 7,8-DHF in vitro.
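
      As an illustrative check of the percentages quoted above, the short sketch below recomputes the fraction of nominal 7,8-DHF found in solution immediately after solubilization. It uses only the nominal and measured concentrations cited from ref. 35; the variable and function names are hypothetical and chosen purely for illustration.

```python
# Nominal and measured 7,8-DHF concentrations (µM) immediately after
# solubilization, as quoted in the response above (values from ref. 35).
# The names below are illustrative assumptions, not taken from the manuscript.
nominal_um = [5.0, 20.0, 50.0]
measured_um = [0.5, 3.0, 13.0]


def percent_in_solution(nominal, measured):
    """Return the measured concentration as a rounded percentage of nominal."""
    return [round(100.0 * m / n) for n, m in zip(nominal, measured)]


# Percentage of each nominal concentration that was recovered in solution.
print(percent_in_solution(nominal_um, measured_um))
```

      The recovered fraction stays well below the nominal input at every concentration, which is the basis for the caution about effective inhibitor concentrations in this paragraph.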

      However, experiments performed during the revision of our manuscript show that the HAD phosphatase NT5C1A can be effectively inhibited by 7,8-DHF with an IC50-value of 10 µM (see revised Fig. 2). Together with the fact that the activity of the PDXP-Asn61Ser variant can be completely inhibited by 7,8-DHF (see Fig. 3d), we conclude that the reason for the observed plateau of PDXP inhibition is likely to be primarily structural, with Asn61 impeding 7,8-DHF binding. We have therefore removed the mention of the limited solubility of 7,8-DHF here. On p.14, we now say: “These data also suggest that Asn61 contributes to the limited efficacy of 7,8-mediated PDXP inhibition in vitro.”

      The solubility of 7,8-DHF is dependent on the specific assay and buffer conditions. In BLI experiments, interference patterns caused by binding of 7,8-DHF in solution to biotinylated PDXP immobilized on the biosensor surface are measured. In phosphatase selectivity assays, phosphatases are in solution, and the effect of 7,8-DHF on the phosphatase activity is measured via the quantification of free inorganic phosphate.

      In BLI experiments, we observed that the sensorgrams obtained with the highest tested 7,8-DHF concentration (25 µM) showed the same curve shapes as the sensorgrams obtained with 12.5 µM 7,8-DHF. This contrasts with the expected steeper slope of the curves at 25 µM vs. 12.5 µM 7,8-DHF. The same behavior was observed for the reference sensors (i.e., the SSA sensors that were not loaded with PDXP, but incubated with 7,8-DHF at all employed concentrations for referencing against nonspecific binding of 7,8-DHF to the sensors). The sensorgrams at 25 µM 7,8-DHF were therefore not included in the analysis (this is now specified in the Materials and Methods BLI section on p.27). To clarify this point, we now state that “As a result of the poor solubility of the molecule, a saturation of the binding site was not experimentally accessible” (p.7).

      In contrast, the phosphatase selectivity assays described on p.8 could be performed with nominal 7,8-DHF concentrations of up to 40 µM. Although the effective 7,8-DHF concentration in solution is expected to be lower (see ref. 35 and discussed above), the limited solubility of 7,8-DHF in phosphatase assays does not prevent the quantification of free inorganic phosphate. Nevertheless, we cannot exclude some interference with this absorbance-based assay (e.g., due to turbidity caused by insoluble compound). Indeed, 5,6-dihydroxyflavone and 5,6,7-trihydroxyflavone caused an apparent increase in PDXP activity at concentrations above 10 µM (see Figure 2b), which may be related to compound solubility issues. Alternatively, these flavones may activate PDXP at higher concentrations.

      We have tested the 7,8-DHF analogue 3,7,8,4’-tetrahydroxyflavone at concentrations of 70 and 100 µM. At concentrations >100 µM, the DMSO concentration required for solubilizing the flavone interferes with PDXP activity. PDXP inhibition by 3,7,8,4’-tetrahydroxyflavone was slightly increased at 70 µM compared to 40 µM (by ~18%) but plateaued between 70 and 100 µM. These results are now mentioned in the text (p.7): “The efficacy of PDXP inhibition by 3,7,8,4’-tetrahydroxyflavone was not substantially increased at concentrations >40 µM (relative PDXP activity at 40 µM: 0.46 ± 0.05; at 70 µM: 0.38 ± 0.15; at 100 µM: 0.37 ± 0.09; data are mean values ± S.D. of n=6 experiments).”

      Page 9: The authors report that PDXP crystallizes as a homodimer in which 7,8DHF is bound only to one protomer. Is the second protomer active? Does that contribute to the 50% inhibition plateau? If Arg62 is mutated to break the salt bridge, does inhibition go beyond 50%?

      We have no way to measure the activity of the second, inhibitor-free protomer in murine PDXP. We know that PDXP functions as a constitutive homodimer, and based on our current understanding, both protomers are active. We have previously shown that the experimental monomerization of PDXP (upon introduction of two-point mutants in the dimerization interface) strongly reduces its phosphatase activity. Specifically, PDXP homodimerization is required for an inter-protomer interaction that mediates the proper positioning of the substrate specificity loop. Thus, homodimerization is necessary for effective substrate coordination and -dephosphorylation (PMID: 24338687).

      In the murine structure, we observed that 7,8-DHF binding to the second subunit (the B-protomer) is prevented by a salt bridge between Arg62 and Asp14 of a symmetry-related A-protomer in the crystal lattice (i.e., this is not a salt bridge between Arg62 in the B-protomer and Asp14 in the A-protomer of a PDXP homodimer). As suggested, we have nevertheless tested the potential role of this salt bridge for the sensitivity of the PDXP homodimer to 7,8-DHF.

      The mutation of Arg62 is not suitable to answer this question, because this residue is involved in the coordination of 7,8-DHF (see Figure 3b), and the PDXP-Arg62Ala mutant is inhibitor resistant (see Figure 3d). We have therefore mutated Asp14, which is not involved in 7,8-DHF coordination. As shown in the new Figure 3 – figure supplement 1d, the 7,8-DHF-mediated inhibition of PDXP-Asp14Ala again reached a plateau at ~50%. This result suggests that while an Arg62-Asp14 salt bridge is stabilized in the murine crystal, it is not a determinant of the active site accessibility of protomer B in solution.

      To address this important question further, we have now also generated co-crystals of human PDXP bound to 7,8-DHF, and refined two structures to 1.5 Å. We found that in human PDXP, both protomers bind 7,8-DHF. These new, higher resolution data are now shown in the revised Figure 3 and its figure supplements, and we have moved the panels referring to the previously reported murine PDXP structure to Figure 3 – figure supplement 1. Thus, both protomers of human PDXP, but only one protomer of murine PDXP bind 7,8-DHF in the crystal structure, yet the 7,8-DHF-mediated inhibition of human and murine PDXP plateaus at ~50% under the phosphatase assay conditions (see Figure 2a). We conclude that 7,8-DHF binding efficiency in the PDXP crystal does not necessarily reflect its inhibitory efficiency in solution.

      Taken together, these data indicate that the apparent partial inhibition of murine and human PDXP phosphatase activity by 7,8-DHF in our in vitro assays is not explained by an exclusive binding of 7,8-DHF to just one protomer of the homodimer.

      Page 10-12; Is it possible to generate a mutant form of PDXP in which activity is maintained but inhibition is attenuated - an inhibitor-resistant mutant form of PDXP? Can such a mutant be used to assess on-target vs off-target effects of 7,8DHF in cells?

      This is an excellent point, and we agree with the Reviewer that such an approach would provide further evidence for cellular on-target activity of 7,8-DHF. Indeed, the verification of the PDXP-7,8-DHF interaction sites has led to the generation of catalytically active, inhibitor-resistant PDXP mutants, such as Tyr146Ala and Glu148Ala (Fig. 3d). However, the biochemical analysis of such mutants in primary hippocampal neurons is a very difficult task.

      Primary hippocampal neurons are derived from pooled, isolated hippocampi of mouse embryos and are subsequently differentiated for 21 days in vitro. The resulting cellular yield is typically low and variable, and the viability (and contamination of the respective cultures with e.g. glial cells) varies from batch to batch. Although such cell preparations are suitable for electrophysiological or immunocytochemical experiments, they are far from ideal for biochemical studies. A meaningful experiment would require the efficient expression of a catalytically active, but inhibitor-resistant PDXP-mutant in PDXP-KO neurons. In parallel, PDXP-KO cells reconstituted with PDXP-WT (at phosphatase activity levels comparable with the PDXP mutant cells) would be needed for comparison. Unfortunately, the generation of (a) sufficient numbers of (b) viable cells that (c) efficiently express (d) functionally comparable levels of PDXP-WT or -mutant for downstream analysis (PLP/PL-levels upon inhibitor treatment) is currently not possible for us.

      Human iPSC-derived (hippocampal) spheroids are at present no alternative, due to the necessity of generating PDXP-KO lines first, and the difficulties with transfecting/transducing them. Such a system would require extensive validation. We have attempted to use SH-SY5Y cells (a metastatic neuroblastoma cell line), but PDXK expression in these cells is modest and they produce too little PLP. We therefore feel that this question is beyond the scope of our current study.

    2. eLife assessment

      Following small molecule screens, this study provides convincing evidence that 7,8 dihydroxyflavone (DHF) is a competitive inhibitor of pyridoxal phosphatase. These results are important since they offer an alternative mechanism for the effects of 7,8 dihydroxyflavone in cognitive improvement in several mouse models. This paper is also significant due to the interest in the phosphatases and neurodegeneration fields.

    3. Reviewer #1 (Public Review):

      Summary:

      This manuscript set out to identify selective inhibitors of the pyridoxal phosphatase (PDXP). Previous studies had demonstrated improvements in cognition upon removal of PDXP, and here the authors reveal that this correlates with an increase in pyridoxal phosphate (PLP; PDXP substrate and an active coenzyme form of vitamin B6) with age. Since several pathologies are associated with decreased vitamin B6, the authors propose that PDXP is an attractive therapeutic target in the prevention/treatment of cognitive decline. Following high throughput and secondary small molecule screens, they identify two selective inhibitors. They follow up on 7, 8 dihydroxyflavone (DHF). Following structure-activity relationship and selectivity studies, the authors then solve a co-crystal structure of 7,8 DHF bound to the active site of PDXP, supporting a competitive mode of PDXP inhibition. Finally, they find that treating hippocampal neurons with 7,8 DHF increases PLP levels in a WT but not PDXP KO context. The authors note that 7,8 DHF has been used in numerous rodent neuropathology models to improve outcomes. 7, 8 DHF activity was previously attributed to activation of the receptor tyrosine kinase TrkB, although this appears to be controversial. The present study raises the possibility that it instead/also acts through modulation of PLP levels via PDXP, and is an important area for future work.

      Strengths:

      The strengths of the work are in the comprehensive, thorough, and unbiased nature of the analyses revealing the potential for therapeutic intervention in a number of pathologies.

      Weaknesses:

      Potential weaknesses include the poor solubility of 7,8 DHF that might limit its bioavailability given its relatively low potency (IC50 = 0.8 µM), which was not improved by SAR. The solubility issues of 7,8 DHF have been discussed at length in the authors' response to Reviewer #3. In particular, the solubility of 7,8 DHF has been found to vary with the concentration and buffer conditions. The 7,8 DHF compound has an extended residence time and the co-crystal structure could aid the design of more potent molecules and would be of interest to those in the pharmaceutical industry. The images related to crystal structure have been improved with additional structural analysis of PDXP in complex with 7,8-DHF (see revised Figure 3).

    1. eLife assessment

      This useful work shows that the experimental application of serotonin to locust antennal lobes induces an increased feeding-related response to some odorants (even in food-satiated animals). To explain how the odorant-specific effects are seen despite similar consequences of 5-HT modulation on all projection neuronal types, the authors propose a simple quantitative model built around projection neurons with different downstream connections. While the results are consistent with the authors' conclusions, the current panel of experiments is incomplete and additional future work will be required to fully support the conclusions the authors currently draw from their observations.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This is an interesting study that performs scRNA-Seq on infected and uninfected wounds. The authors sought to understand how infection with E. faecalis influences the transcriptional profile of healing wounds. The analysis demonstrated that there is a unique transcriptional profile in infected wounds with specific changes in macrophages, keratinocytes, and fibroblasts. They also speculated on potential crosstalk between macrophages and neutrophils and macrophages and endothelial cells using NicheNet analysis and CellChat. Overall the data suggest that infection causes keratinocytes to not fully transition which may impede their function in wound healing and that the infection greatly influenced the transcriptional profile of macrophages and how they interact with other cells.

      Strengths:

      It is a useful dataset to help understand the impact of wound infection on the transcription of specific cell types. The analysis is very thorough in terms of transcriptional analysis and uses a variety of techniques and metrics.

      Weaknesses:

      Some drawbacks of the study are the following. First, the fact that it only has two mice per group, and only looks at one time point after wounding decreases the impact of the study. Wound healing is a dynamic and variable process so understanding the full course of the wound healing response would be very important to understand the impact of infection on the healing wound. Including unwounded skin in the scRNA-Seq would also lend a lot more significance to this study. Another drawback of the study is that mouse punch biopsies are very different than human wounds as they heal primarily by contraction instead of reepithelialization like human wounds. So while the conclusions are generally supported the scope of the work is limited.

      Thank you for your thoughtful review and acknowledgment of the thoroughness of our analysis.

      First, the fact that it only has two mice per group, and only looks at one time point after wounding decreases the impact of the study.

      We acknowledge your concerns regarding the limitations of our study, particularly regarding the small number of mice per group and the examination of only one time point post-wounding. We agree that a more comprehensive analysis across multiple time points would provide a deeper understanding of the temporal changes induced by infection. While our primary focus in this study was to elucidate the foundational responses to bacteria-infected wounds, we attempted to augment our analysis by incorporating publicly available datasets of similar nature. However, these datasets lacked power in terms of cell number and populations. Nonetheless, we have bolstered our analysis by applying a cross-entropy test on the integrated dataset and reporting its significance (Figure S1F), ensuring the robustness of our single-cell RNA sequencing datasets.

      Including unwounded skin in the scRNA-Seq would also lend a lot more significance to this study.

      We also recognize the significance of comparing infected wounds to unwounded skin to establish a baseline for transcriptional changes. While we attempted to incorporate publicly available unwounded skin samples into our analysis, we encountered limitations in the number of cells, particularly within the immune population. This constraint is addressed in the Limitations section of the manuscript.

      Another drawback of the study is that mouse punch biopsies are very different than human wounds as they heal primarily by contraction instead of re-epithelialization like human wounds.

      Regarding the concern about differences between murine and human wound healing mechanisms, we took measures during tissue isolation to mitigate this issue, extracting incisions of the wounds rather than contracted tissues. Despite the primary mode of wound closure in mice being contraction, we believe our analysis still offers valuable insights into cellular responses to infection relevant to human wound healing.

      We appreciate your constructive criticism of our study. Despite these constraints, we believe our work provides valuable insights into the transcriptional changes induced by infection in healing wounds.

      Reviewer #2 (Public Review):

      Summary:

      The authors have performed a detailed analysis of the complex transcriptional status of numerous cell types present in wounded tissue, including keratinocytes, fibroblasts, macrophages, neutrophils, and endothelial cells. The comparison between infected and uninfected wounds is interesting and the analysis suggests possible explanations for why infected wounds are delayed in their healing response.

      Strengths:

      The paper presents a thorough and detailed analysis of the scRNAseq data. The paper is clearly written and the conclusions drawn from the analysis are appropriately cautious. The results provide an important foundation for future work on the healing of infected and uninfected wounds.

      Weaknesses:

      The analysis is purely descriptive and no attempt is made to validate whether any of the factors identified are playing functional roles in wound healing. The experimental setup is analyzing a single time point and does not include a comparison to unwounded skin.

      We are thankful for your acknowledgment of the thoroughness of our analysis and the cautious nature of our conclusions.

      The analysis is purely descriptive, and no attempt is made to validate whether any of the factors identified are playing functional roles in wound healing.

      Regarding your concern about the purely descriptive nature of our analysis and the lack of functional validation of identified factors, we agree on the importance of understanding the functional roles of transcriptional changes in wound healing. To address this limitation, we plan to conduct functional experiments, such as perturbation assays or in vivo validation studies, to validate the roles of specific factors identified in our analysis.

      The experimental setup is analyzing a single time point and does not include a comparison to unwounded skin.

      We acknowledge the importance of comparing wounded tissue to unwounded skin to establish a baseline for understanding transcriptional changes. This point is noted and acknowledged in the limitations section of our manuscript.

      We appreciate your feedback and assure you that we will consider your suggestions in future iterations of our research.

      Recommendations For The Authors:

      We are grateful for the positive overall assessment of our revised work by the reviewers. Critical comments on specific aspects of our work are listed verbatim below followed by our responses.

      Reviewer 1 (Recommendations for the Authors):

      (1) The figures are a bit cluttered and hard to parse out. The different parts of the figure seem to be scattered all over the place with no consistent order.

      Thank you for your feedback regarding the figures in our manuscript. We acknowledge your concern that some panels may appear cluttered and challenging to navigate. In response, we made concerted efforts to declutter certain panels, taking into account page size constraints and ensuring a minimum font size for readability.

      (2) I didn't really understand what the last sentence on page 6 meant. Is this meant to say that these could be biomarkers of infection?

      We thank the reviewer for noting this lack of clarity. We revised the statement.

      Updated manuscript (lines 111-113)

      “Overall, the persistent E. faecalis infection contributed to higher Tgfb1 expression, whilst Pdgfa levels remained low, correlating with delayed wound healing.”

      (3) A reference on page 19 didn't format correctly.

      We thank the reviewer for catching the typos. We corrected the reference formatting.

      Updated manuscript (lines 503-505)

      “We confirm the immune-suppressive role of E. faecalis in wound healing, consistent with previous findings in different experimental settings (Chong et al., 2017; Kao et al., 2023; Tien et al., 2017).”

      (4) The title doesn't really address the scope of the finding which goes beyond immunomodulatory.

      The reviewer is correct! We therefore revised the title to cover all aspects of the study as:

      “Decoding the complexity of delayed wound healing following Enterococcus faecalis infection”

      Reviewer 2 (Recommendations for the Authors):

      (1) On page 6, the expression of Tgfb1 is described as "aggravated" by wounding alone. I am not sure whether this means Tgfb1 levels are increased or decreased. It appears from the data that it is increased, which was confusing to me since I interpreted "aggravated" as meaning decreased. So perhaps a different more straightforward word could be used to describe the data.

      We modified this ambiguous statement to:

      Updated manuscript (lines 105-106)

      “By contrast, wounding alone resulted in higher transforming growth factor beta 1 (Tgfb1) expression.”

      (2) On page 7, the authors state that "cells from infected wounds...demonstrated distinct clustering patterns compared to cells from uninfected wounds (Figure S1F)" but when I look at the data in this figure, I cannot really see a difference. Perhaps the differences could be more clearly highlighted?

      Thank you for pointing out this issue. We appreciate the reviewer's comment. We utilized the cross-entropy test for statistical comparison, employing UMAP embedding space data. While the data underwent batch correction based on infection status, the UMAP plots for each condition may appear visually similar. However, it's important to note that the number of cells per cluster varies significantly between the infected and uninfected conditions. This aspect influences the selection of points (cells) and their nearest neighbours for statistical testing within each cluster in the embedding space. To address this concern, we have included a table indicating the number of cells per cell type alongside the plot (Figure S1F), providing additional context for the interpretation of our results.

      Author response table 1.

      Author response image 1.

      (3) On page 8, Zeb2hi cells are described as "immunosuppressive" and yet the genes highlighted as being expressed include Cxcl2 and IL1b, which I would classify as inflammatory, not immunosuppressive. Can the authors be a bit more clear on why they describe the phenotype of these cells as "immunosuppressive"?

      We agree with the reviewer that this is a bit counterintuitive. Conventionally, CXCL2 is thought to be chemoattractant for neutrophil recruitment. However, the infection-specific keratinocyte cluster expressing Cxcl2, Il1b, Wfdc17 along with Zeb2 and Thbs1 indicate their myeloid-derived suppressor cell-like features, which play immunosuppressive roles during infection and in cancer (Alshetaiwi et al., 2020; Siriwach et al., 2022; Veglia et al., 2021).

      Updated manuscript (lines 159-163)

      “As the barrier to pathogens, keratinocytes secrete a broad range of cytokines that can induce inflammatory responses (Alshetaiwi et al., 2020; Siriwach et al., 2022; Veglia et al., 2021). However, Zeb2hi keratinocytes co-expressing Cxcl2, Il1b, and Wfdc17, indicate myeloid-derived suppressor cell-like phenotype which implies an immunosuppressive environment (Hofer et al., 2021; Veglia et al., 2021).”

      (4) On pages 8-9, Keratinocytes are described to express MHC class II. I find this quite unexpected since class II is usually thought to be expressed primarily by APCs such as DCs and B cells. Is there a precedent for keratinocytes to express class II? The authors should acknowledge that this is unexpected and in need of further validation, or support the claim with references in which class II expression has been previously observed on keratinocytes (and is thus not unexpected)

      Although MHC class II expression is predominantly on immune cells, an antigen-presenting role for keratinocytes has been reported in many studies (Banerjee et al., 2004; Black et al., 2007; Carr et al., 1986; Gawkrodger et al., 1987; Jiang et al., 2020; Li et al., 2022; Oh et al., 2019; Tamoutounour et al., 2019). Therefore, an antigen-presenting role for keratinocytes is known and expected, and we think that this should be further investigated in the context of wound infection.

      Updated manuscript (lines 177-179)

      “These genes are associated with the major histocompatibility complex (MHC) class II, suggesting a self-antigen presenting keratinocyte population, which have a role in costimulation of T cell responses (Meister et al., 2015; Tamoutounour et al., 2019).”

      REFERENCES

      Alshetaiwi, H., Pervolarakis, N., McIntyre, L. L., Ma, D., Nguyen, Q., Rath, J. A., Nee, K., Hernandez, G., Evans, K., Torosian, L., Silva, A., Walsh, C., & Kessenbrock, K. (2020). Defining the emergence of myeloid-derived suppressor cells in breast cancer using single-cell transcriptomics. Science Immunology, 5(44), eaay6017. https://doi.org/10.1126/sciimmunol.aay6017

      Banerjee, G., Damodaran, A., Devi, N., Dharmalingam, K., & Raman, G. (2004). Role of keratinocytes in antigen presentation and polarization of human T lymphocytes. Scandinavian Journal of Immunology, 59(4), 385–394. https://doi.org/10.1111/j.0300-9475.2004.01394.x

      Black, A. P. B., Ardern-Jones, M. R., Kasprowicz, V., Bowness, P., Jones, L., Bailey, A. S., & Ogg, G. S. (2007). Human keratinocyte induction of rapid effector function in antigen-specific memory CD4+ and CD8+ T cells. European Journal of Immunology, 37(6), 1485–1493. https://doi.org/10.1002/eji.200636915

      Carr, M. M., McVittie, E., Guy, K., Gawkrodger, D. J., & Hunter, J. A. (1986). MHC class II antigen expression in normal human epidermis. Immunology, 59(2), 223–227.

      Gawkrodger, D. J., Carr, M. M., McVittie, E., Guy, K., & Hunter, J. A. (1987). Keratinocyte expression of MHC class II antigens in allergic sensitization and challenge reactions and in irritant contact dermatitis. The Journal of Investigative Dermatology, 88(1), 11–16. https://doi.org/10.1111/1523-1747.ep12464641

      Jiang, Y., Tsoi, L. C., Billi, A. C., Ward, N. L., Harms, P. W., Zeng, C., Maverakis, E., Kahlenberg, J. M., & Gudjonsson, J. E. (2020). Cytokinocytes: The diverse contribution of keratinocytes to immune responses in skin. JCI Insight, 5(20), e142067. https://doi.org/10.1172/jci.insight.142067

      Li, D., Cheng, S., Pei, Y., Sommar, P., Kärner, J., Herter, E. K., Toma, M. A., Zhang, L., Pham, K., Cheung, Y. T., Liu, Z., Chen, X., Eidsmo, L., Deng, Q., & Xu Landén, N. (2022). Single-Cell Analysis Reveals Major Histocompatibility Complex II‒Expressing Keratinocytes in Pressure Ulcers with Worse Healing Outcomes. The Journal of Investigative Dermatology, 142(3 Pt A), 705–716. https://doi.org/10.1016/j.jid.2021.07.176

      Oh, S., Chung, H., Chang, S., Lee, S.-H., Seok, S. H., & Lee, H. (2019). Effect of Mechanical Stretch on the DNCB-induced Proinflammatory Cytokine Secretion in Human Keratinocytes. Scientific Reports, 9(1), 5156. https://doi.org/10.1038/s41598-019-41480-y

      Siriwach, R., Ngo, A. Q., Higuchi, M., Arima, K., Sakamoto, S., Watanabe, A., Narumiya, S., & Thumkeo, D. (2022). Single-cell RNA sequencing identifies a migratory keratinocyte subpopulation expressing THBS1 in epidermal wound healing. iScience, 25(4), 104130. https://doi.org/10.1016/j.isci.2022.104130

      Tamoutounour, S., Han, S.-J., Deckers, J., Constantinides, M. G., Hurabielle, C., Harrison, O. J., Bouladoux, N., Linehan, J. L., Link, V. M., Vujkovic-Cvijin, I., Perez-Chaparro, P. J., Rosshart, S. P., Rehermann, B., Lazarevic, V., & Belkaid, Y. (2019). Keratinocyte-intrinsic MHCII expression controls microbiota-induced Th1 cell responses. Proceedings of the National Academy of Sciences of the United States of America, 116(47), 23643–23652. https://doi.org/10.1073/pnas.1912432116

      Veglia, F., Sanseviero, E., & Gabrilovich, D. I. (2021). Myeloid-derived suppressor cells in the era of increasing myeloid cell diversity. Nature Reviews. Immunology, 21(8), 485–498. https://doi.org/10.1038/s41577-020-00490-y

    2. Reviewer #1 (Public Review):

      Summary:

      This is an interesting study that performs scRNA-Seq on infected and uninfected wounds. The authors sought to understand how infection with E. faecalis influences the transcriptional profile of healing wounds. The analysis demonstrated that there is a unique transcriptional profile in infected wounds with specific changes in macrophages, keratinocytes, and fibroblasts. They also speculated on potential crosstalk between macrophages and neutrophils and macrophages and endothelial cells using NicheNet analysis and CellChat. Overall the data suggest that infection causes keratinocytes to not fully transition which may impede their function in wound healing and that the infection greatly influenced the transcriptional profile of macrophages and how they interact with other cells.

      Strengths:

      It is a useful dataset to help to understand the impact of wound infection on transcription of specific cell types. The analysis is very thorough in terms of transcriptional analysis and uses a variety of techniques and metrics.

      Weaknesses:

      Some drawbacks of the study are the following. First, the fact that it only has two mice per group, and only looks at one time point after wounding decreases the impact of the study. Wound healing is a dynamic and variable process so understanding the full course of the wound healing response would be very important to understand the impact of infection on the healing wound. The analysis has been bolstered by applying a cross-entropy test on the integrated dataset to ensure robustness of the datasets (Fig S1F). Including unwounded skin in the scRNA-Seq would also lend a lot more significance to this study. However, this was technically challenging due to constraints with the number of immune cells in unwounded skin as described in the limitations section. Another drawback of the study is that mouse punch biopsies are very different than human wounds as they heal primarily by contraction instead of re-epithelialization like human wounds. The authors mitigated this somewhat by extracting the incisional parts of the wound. So while the conclusions are generally supported, the scope of the work is somewhat limited.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #3 (Public Review):

      Summary:

      It has been proposed in the literature that the ATP release channel Panx1 can be activated in various ways, including by tyrosine phosphorylation of the Panx1 protein. The present study reexamines the commercial antibodies used previously in support of the phosphorylation hypothesis, and the presented data indicate that the antibodies may recognize proteins unrelated to Panx1. Consequently, the authors caution about the use and interpretation of results obtained with these antibodies.

      Strengths:

      The manuscript by Ruan et al. addresses an important issue in Panx1 research, i.e. the activation of the channel formed by Panx1 via protein phosphorylation. If the authors' conclusions are correct, the previous claims for Panx1 phosphorylation on the basis of the commercial anti-phospho-Panx1 antibodies would be in question.

      This is a very detailed and comprehensive analysis making use of state-of-the-art techniques, including mass spectrometry and phos-tag gel electrophoresis.

      In general, the study is well-controlled as relating to negative controls.

      The value of this manuscript is, that it could spawn new, more function-oriented studies on the activation of Panx1 channels.

      Weaknesses:

      Although the manuscript addresses an important issue, the activation of the ATP-release channel Panx1 by protein phosphorylation, the data provided do not support the firm conclusion that such activation does not exist. The failure to reproduce published data obtained with commercial anti-phospho Panx1 antibodies can only be of limited interest for a subfield.

      (1) The title claiming that "Panx1 is NOT phosphorylated..." is not justified by the failure to reproduce previously published data obtained with these antibodies. If, as claimed, the antibodies do not recognize Panx1, their failure cannot be used to exclude tyrosine phosphorylation of the Panx1 protein. There is no positive control for the antibodies.

      The full title of our manuscript is “Human Pannexin 1 Channel is NOT Phosphorylated by Src Tyrosine Kinase at Tyr199 and Tyr309”. The major conclusion of our manuscript should not be extended to the claim that “Panx1 is NOT phosphorylated”. This is by no means our conclusion. In fact, LC-MS/MS data from both our group and others have shown that PANX1 is phosphorylated at both serine and tyrosine sites (1). However, we provided solid evidence that Tyr199 and Tyr309 of human PANX1 are not effective substrates of the Src kinase.

      We did provide several positive controls for the antibodies in our study. We showed that the anti-PANX1 and anti-Src antibodies unambiguously recognized PANX1 and Src, respectively (Figure 3A), and that a pan-specific phosphotyrosine antibody (P-Tyr-100) unambiguously recognized phosphorylated Src (Figure 3A)—as expected—but did not recognize PANX1. In addition, we demonstrated that the two antibodies in question (anti-PANX1-pY198 and anti-PANX1-pY308) did produce signals in our western blot analysis, but we provided compelling evidence that the bands produced by these antibodies do not correspond to PANX1 (Figure 2B).

      (2) The authors claim that exogenous SRC expression does not phosphorylate Y198. DeLalio et al. 2019 show that Panx1 is constitutively phosphorylated at Y198, so an effect of exogenous SRC expression is not necessarily expected.

      We have unambiguously identified peptide fragments containing non-phosphorylated Y198 in our LC-MS/MS experiment; none corresponds to a phosphorylated Y198. Therefore, our LC-MS/MS data do not support the notion that Panx1 is constitutively phosphorylated at Y198.

      (3) The authors argue that the GFP tag of Panx1 at the COOH terminus does not interfere with folding since the COOH modified (thrombin cleavage site) Panx1 folds properly, forming an amorphous glob in the cryo-EM structure. However, they do not show that the COOH-modified Panx1 folds properly. It may not, because functional data strongly suggest that the terminal cysteine dives deep into the pore. For example, the terminal cysteine, C426, can form a disulfide bond with an engineered cysteine at position F54 (Sandilos et al. 2012).

      Our manuscript included results of using a non-GFP tagged PANX1 construct (Figure 2-figure supplement 1). We didn’t notice any difference for PANX1 phosphorylation between GFP-tagged and non-GFP-tagged PANX1. Therefore, the folding of the C-terminal tail of PANX1 doesn’t affect the conclusion of our study.

      (4) The authors dismiss the additional arguments for tyrosine phosphorylation of Panx1 given by the various previous studies on Panx1 phosphorylation. These studies did not, as implied, solely rely on the commercial anti-phospho-Panx1 antibodies, but also presented a wealth of independent supporting data. Contrary to the authors' assertion, in the previous papers the pY198 and pY308 antibodies recognized two protein bands in the size range of glycosylated and partially glycosylated Panx1.

      We didn’t dismiss additional arguments for the Src-dependent PANX1 regulation. In fact, in the discussion of our manuscript, we acknowledged the fact that Src may still be involved in PANX1 regulation, but probably through indirect mechanisms. In the two previous studies (2, 3), it’s unclear if the multimeric bands detected by pY198/pY308 antibodies correspond to glycosylated PANX1 or not, as the authors did not overlay the protein markers with their blots. In particular, the migration pattern of PANX1 changes across different western blot images from DeLalio et al. (2). It’s also worth noting that none of these “independent supporting data” in the two previous studies provided direct evidence that Src can phosphorylate Y198/Y308.

      (5) A phosphorylation step triggering channel activity of Panx1 would be expected to occur exclusively on proteins embedded in the plasma membrane. The membrane-bound fraction is small in relation to the total protein, which is particularly true for exogenously expressed proteins. Thus, any phosphorylated protein may escape detection when total protein is analyzed. Furthermore, to be of functional consequence, only a small fraction of the channels present in the plasma membrane need to be in the open state. Consequently, only a fraction of the Panx1 protein in the plasma membrane may need to be phosphorylated. Even the high resolution of mass spectroscopy may not be sufficient to detect phosphorylated Panx1 in the absence of enrichment processes.

      We agree with the reviewer that only plasma membrane-residing Panx1 phosphorylation is functionally relevant. Interestingly, however, previous studies actually analyzed total protein from cell lysate and concluded that PANX1 is phosphorylated at Y198 and Y308 (2, 3). This has motivated our analysis, in which we found that the phosphorylation events cannot be detected when using whole cell lysate. Therefore, we have also conducted an electrophysiology experiment by comparing conditions with/without active Src kinase (Figure 7). Our result indicates that PANX1 current is not affected by the presence of Src. This result suggests that even if there is minor Src kinase phosphorylation beyond the detection limit of western blot or mass spectrometry, it may not be functionally significant.

      (6) In the electrophysiology experiments described in Figure 7, there is no evidence that the GFP-tagged Panx1 is in the plasma membrane. Instead, the image in Figure 7a shows prominent fluorescence in the cytoplasm. In addition, there is no evidence that the CBX-sensitive currents in 7b are mediated by Panx1-GFP and are not endogenous Panx1. Previous literature suggests that the hPanx1 protein needs to be cleaved (Chiu et al. 2014) or mutated at the amino terminus (Michalski et al 2018) to see voltage-activated currents, so it is not clear that the currents represent hPANX1 voltage-activated currents.

      Our previous analysis has already shown that the endogenous current of non-transfected cells is not sensitive to CBX (4). Therefore, the CBX-sensitive current in cells overexpressing PANX1 is from PANX1-GFP. It should be noted that when a protein is overexpressed, it tends to accumulate at different intracellular membranes during protein synthesis/maturation. However, this doesn’t prevent a portion of the protein from being trafficked to the plasma membrane. In the paper from Michalski et al. 2018, it was shown that WT human/mouse PANX1 displayed voltage-dependent activation (5). Although the current is relatively small, it is clearly distinguishable from non-transfected HEK and CHO cells. This voltage-dependent activation is also sensitive to CBX, consistent with our measurement (Figure 7) (4). When GS is introduced at the N-terminus, the voltage-dependent activation of human/mouse PANX1 is significantly boosted, likely due to the altered NTH conformation resulting from the N-terminal extension.

      Recommendations for the authors:

      Reviewer #3 (Recommendations For The Authors):

      Literature quotes are still problematic. Why are secondary papers quoted instead of the original work? At least quote reviews by authors who published the original findings.

      We appreciate the reviewer pointing this out. We have carefully checked our references and made sure that the original literature is cited.

      Why does wtPanx1 run close to the 37 kD marker (Figure 2 supplement 1) instead of close to 50 kD as shown in the previous papers using the pY198 and pY308 antibodies?

      It is a common observation that the migration of membrane proteins in SDS-PAGE gels doesn’t correlate with their formula molecular weight, a phenomenon also known as “gel shifting” (6–8). The molecular mechanism of this phenomenon remains complex. Therefore, simply relying on protein molecular weight standards could not unambiguously identify the PANX1 protein band. This is an issue for identifying the PANX1 band, especially in light of the fact that some antibodies may not be very specific (see Figure 6B). In our experiment, we have correlated the in-gel fluorescence and western blot signal, which allowed us to determine the protein band corresponding to PANX1. It is worth noting that in Figure S3 of DeLalio 2019, PANX1 is detected at 37 kDa (2). However, in many other panels of the paper, PANX1 is detected at close to 50 kDa (for example, Figure S2B).

      Figure 6, supplement 1: why are there oligomers observed in the absence of crosslinking? Why is there no shift in the size of the "oligomers" in response to glycosidase F?

      It is common to observe multimeric membrane proteins, including PANX1, forming oligomeric bands in SDS-PAGE gels, likely because they are not fully denatured or disassembled. PANX1 also contains several free cysteines, which may non-specifically crosslink subunits. There is actually a small shift for the 75 kDa band (dimer) in Figure 6, supplement 1. For higher molecular weight bands, this small shift may not be apparent due to the limited resolution of the gel.

      A positive control for the antibodies used is missing. The authors argue that such controls are not available, since these commercial antibodies are "proprietary".

      We did provide several positive controls for the antibodies in our study. We showed that the anti-PANX1 and anti-Src antibodies unambiguously recognized PANX1 and Src, respectively (Figure 3A), and that a pan-specific phosphotyrosine antibody (P-Tyr-100) unambiguously recognized phosphorylated Src (Figure 3A)—as expected—but did not recognize PANX1. In addition, we demonstrated that the two antibodies in question (anti-PANX1-pY198 and anti-PANX1-pY308) did produce signals in our western blot analysis, but we provided compelling evidence that the bands produced by these antibodies do not correspond to PANX1 (Figure 2B).

      Unfortunately, the epitopes that Millipore Sigma used to generate anti-PANX1-pY198 and anti-PANX1-pY308 are not available. The description of the immunogen from Millipore Sigma website states that “A linear peptide corresponding to 12 amino acids surrounding phospho-Tyr198 of murine Pannexin-1” and “A linear peptide corresponding to 13 amino acids surrounding phosphotyrosine 308 of rat pannexin-1”. However, these immunogen peptides are not available for us to purchase.

      References

      (1) Nouri-Nejad, D. et al. Pannexin 1 mutation found in melanoma tumor reduces phosphorylation, glycosylation, and trafficking of the channel-forming protein. Mol Biol Cell 32, (2021).

      (2) DeLalio, L. J. et al. Constitutive SRC-mediated phosphorylation of pannexin 1 at tyrosine 198 occurs at the plasma membrane. Journal of Biological Chemistry 294, (2019).

      (3) Weilinger, N. L. et al. Metabotropic NMDA receptor signaling couples Src family kinases to pannexin-1 during excitotoxicity. Nat Neurosci 19, (2016).

      (4) Ruan, Z., Orozco, I. J., Du, J. & Lü, W. Structures of human pannexin 1 reveal ion pathways and mechanism of gating. Nature 584, (2020).

      (5) Michalski, K., Henze, E., Nguyen, P., Lynch, P. & Kawate, T. The weak voltage dependence of pannexin 1 channels can be tuned by N-terminal modifications. Journal of General Physiology 150, (2018).

      (6) Rath, A., Cunningham, F. & Deber, C. M. Acrylamide concentration determines the direction and magnitude of helical membrane protein gel shifts. Proc Natl Acad Sci U S A 110, (2013).

      (7) Rath, A. & Deber, C. M. Correction factors for membrane protein molecular weight readouts on sodium dodecyl sulfate-polyacrylamide gel electrophoresis. Anal Biochem 434, (2013).

      (8) Rath, A., Glibowicka, M., Nadeau, V. G., Chen, G. & Deber, C. M. Detergent binding explains anomalous SDS-PAGE migration of membrane proteins. Proc Natl Acad Sci U S A 106, (2009).

    2. Reviewer #1 (Public Review):

      The current manuscript revisits previous reports in the literature that the human Pannexin 1 channel is regulated by phosphorylation at two residues by Src kinase. From this series of experiments, the authors conclude that PANX-1 is not phosphorylated at these residues.

      The biggest strength of the manuscript is the comprehensiveness of the approach. The authors recapitulate prior experiments in the literature and also add a series of new, orthogonal experiments that all examine the claim of PANX-1 phosphorylation. The breadth of the reported experiments extends over multiple cell lines and protein constructs, in vitro purified proteins, mass spec, different phosphorylation detection reagents and antibodies, and functional electrophysiology assays that show that the addition of Src does not impact gating. The combined weight of all these data strongly suggests that the field should re-examine the claim that PANX-1 is regulated by phosphorylation at Y199 and Y309.

      Another strength is that the authors go beyond simply showing that the antibodies do not recognize phosphorylated PANX-1. They also provide potential mechanisms for how the antibodies may be misleading. Both antibodies recognize phosphorylated Src-1. In the case of anti-PANX1-pY308, the authors provide solid mutagenesis evidence that the antibody also weakly recognizes a non-phosphorylated epitope of PANX1 in the same region as the tyrosine. This helps make a convincing case.

      Such experiments, while not glamorous, have great practical importance for developing an accurate understanding of how Pannexin channels are regulated.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1:

      Summary:

      The evolution of non-shivering thermogenesis is of fundamental importance to understand. Here, in small mammals, the contractile apparatus of the muscle is shown to increase energy expenditure upon a drop in ambient temperature. Additionally, in the state of torpor, small hibernators did not show an increase in energy expenditure under the same challenge.

      Strengths:

      The authors have conducted a very well-planned study that has sampled the muscles of large and small hibernators from two continents. Multiple approaches were then used to identify the state of the contractile apparatus, and its energy expenditure under torpor or otherwise.

      Weaknesses:

      There was only one site of biopsy from the animals used (leg). It would be interesting to know if non-shivering thermogenesis is something that is regionally different in the animal, given the core body and distal limbs have different temperatures.

      We thank the reviewer for their time and effort in reviewing our manuscript. Furthermore, we agree that it would be of interest to perform similar experiments on different muscle sites in these animals. This is of particular interest as in some mammals, such as mice, distal limbs do not shiver and therefore non-shivering thermogenesis may play a more prominent role in heat regulation. A paper from Aydin et al. demonstrated that when shivering muscles (soleus) were prevented from undergoing non-shivering thermogenesis via knock-out of UCP1 and were then exposed to cold temperatures, the force production of these muscles was significantly reduced due to prolonged shivering [1]. These results do suggest that even in shivering muscle, non-shivering thermogenesis plays a key role in the generation of heat for survival and for the maintenance of muscle performance. Furthermore, there is evidence from garden dormice that muscle temperature during torpor is slightly warmer than abdominal temperature and slightly cooler than heart temperature, which is 7-8°C warmer than abdominal temperature, suggesting the existence of non-shivering thermogenesis in skeletal and cardiac muscles (Giroud et al. in prep) [2]. We have added this information and reference into our discussion to reflect this important point (Discussion, paragraph 6, “As the biopsies which were used…”).

      Reviewer #2:

      Summary:

      The authors utilized (permeabilized) fibers from muscle samples obtained from brown and black bears, squirrels, and Garden dormice, to provide interesting and valuable data regarding changes in myosin conformational states and energetics during hibernation and different types of activity in summer and winter. Assuming that myosin structure is similar between species then its role as a regulator of metabolism would be similar and not different, yet the data reveal some interesting and perplexing differences between the selected hibernating species.

      Strengths:

      The experiments on the permeabilized fibers are complementary, sophisticated, and well-performed, providing new information regarding the characteristics of skeletal muscle fibers between selected hibernating mammalian species under different conditions (summer, interarousal, and winter).

      The studies involve complementary assessments of muscle fiber biochemistry, sarcomeric structure using X-ray diffraction, and proteomic analyses of posttranslational modifications.

      Weaknesses:

      It would be helpful to put these findings on permeabilized fibers into context with the other anatomical/metabolic differences between the species to determine the relative contribution of myosin energetics (with these other contributors) to overall metabolism in these different species, including factors such as fat volume/distribution.

      We thank the reviewer for the time and effort they have put into reviewing our paper and are grateful for the helpful suggestions, which we believe enhance our work (please see below for detailed answers to the critiques).

      Reviewer #3:

      Summary and strengths:

      The manuscript, "Remodelling of skeletal muscle myosin metabolic states in hibernating mammals", by Lewis et al, investigates whether myosin ATP activity may differ between states of hibernation and activity in both large and small mammals. The study interrogates (primarily) permeabilized muscle strips or myofibrils using several state-of-the-art assays, including the mant-ATP assay to investigate ATP utilization of myosin, X-ray diffraction of muscles, proteomics studies, metabolic tests, and computational simulations. The overall data suggests that ATP utilization of myosin during hibernation is different than in active conditions.

      A clear strength of this study is the use of multiple animals that utilize two different states of hibernation or torpor. Two large animal hibernators (Eurasian Brown Bear, American Black Bear) represent large animal hibernators that typically undergo prolonged hibernation. Two small animal hibernators (Garden Dormouse, 13 Lined Ground Squirrel) undergo torpor with more substantial reductions in heart rate and body temperature, but whose torpor bouts are interrupted by short arousals that bring the animals back to near-summer-like metabolic conditions.

      Especially interesting, the investigators analyze the impact that body temperature may have on myosin ATP utilization by performing assays at two different temperatures (8 and 20 degrees C, in 13 Lined Ground Squirrels).

      The multiple assays utilized provide a more comprehensive set of methods with which to test their hypothesis that muscle myosins change their metabolic efficiency during hibernation.

      We thank this reviewer for the effort and time they have put into carefully reviewing our manuscript and have taken on board their valuable suggestions to improve our manuscript (please see below for detailed answers to the critiques).

      Suggestions and potential weaknesses:

      While the samples and assays provide a robust and comprehensive coverage of metabolic needs and testing, the data is less categorical. Some of these may be dependent on sample size or statistical analysis while others may be dependent on interpretation.

      (1) Statistical Analysis

      (1a) The results of this study often cannot be assessed properly due to a lack of clarity in the statistical tests.

      For example, the results related to the large animal hibernators (Figure 1) do not describe the statistical test (in the text of the results, methods, or figure legends). (Similarly for figure 6 and Supplemental Figure 1). Further, it is not clear whether or when the analysis was performed with paired samples. As the methods described, it appears that the Eurasian Brown Bear data should be paired per animal.

      We thank the reviewer for these important points and have added information on the statistical tests used, where previously missing, in each figure legend. Details on the statistical testing used for figure 6 are listed in the methods section, paragraph 18, “All statistical analysis of TMT derived protein expression data…”

      (1b) The statistical methods state that non-parametric testing was utilized "where data was unevenly distributed". Please clarify when this was used.

      We have now clarified all statistical tests used in the figure legends.

      (1c) While there are two different myosin isoforms, the isoform may be considered a factor. It is unclear why a one-way ANOVA is generally used for most of the mant-ATP chase data.

      The reviewer is right, in our analysis, we haven’t considered ‘myosin isoforms’ as a factor. One of the main reasons for that is because we have decided to treat fibres expressing different myosin heavy chain isoforms as totally separated entities (not interconnected).

      (1d) While the technical replicates on studies such as the mant-ATP chase assay are well done, the total biological replicates are small. A consideration of the sample power should be included.

      Unfortunately, obtaining additional biological samples from these unique species is challenging. Hence, we have added a statement in the Discussion section. This statement focuses on the potential benefits of increasing sample size to increase statistical power (Discussion, paragraph 2, “In contrast to our study hypothesis…”).

      (1e) An analysis of the biological vs statistical significance should be considered, especially for the mant-ATP chase data from the American Black Bear, where there appear to be shifts between the summer and winter data.

      We agree that it is important to be careful when drawing conclusions from data based only on p-values. We agree that the modest differences observed in these data on American Black bear, whilst not significant, are worth noting and we have added these considerations into the manuscript (Discussion, paragraph 2, “In contrast to our study hypothesis…”).

      (2) Consistency of DRX/SRX data.

      (2a) The investigators performed both mant-ATP chase and x-ray diffraction studies to investigate whether myosin heads are in an "on" or "off" state. The results of these two studies do not appear to be fully consistent with each other, which should not be a surprise. The recent work of Mohran et al (PMID 38103642) suggests that the mant-ATP-predicted SRX:DRX proportions are inconsistent with the position of the myosin heads. The discussion appears to lack a detailed assessment of this prior work and lack a substantive assessment contrasting the differing results of the two assays in the current study. i.e. why the current study's mant-ATP chase and x-ray diffraction results differ.

      Prior works on skeletal muscle (observing discrepancies between Mant-ATP chase assay and X-ray diffraction) are rather scarce. Adding a comprehensive discussion about this may be beyond the scope of the current study and would distract the reader from the main topic. For this reason, we have not added any section. Note that we have other manuscripts in preparation that are specifically dedicated to the discrepancy.

      (2b) The discussion of the current study's x-ray diffraction data relating to the I_1,1/I_1,0 ratio and how substantially different this is to the M6 results merits discussion. i.e. how can myosin both be more primed to contract during IBA versus torpor (according to intensity ratio), but also have less mass near the thick filament (M6).

      The I1,1/I1,0 ratio indicates a subtle mass shift towards the myosin thick filament whilst the M6 spacing shows a more compliant thick filament. These results are not incompatible and rely on interpretation of the X-ray diffraction patterns. To avoid any confusion and avoid distracting the reader from the main topic, we have decided not to speculate there.

      (3) Possible interactions with Heat Shock Proteins

      Heat Shock Proteins (HSPs), such as HSP70, have been shown to be differential during torpor vs active states. A brief search of HSP and myosin reveals HSPs related to thick filament assembly and Heat Shock Cognate 70 interacting with myosin binding protein C. Especially given the authors' discussion of protein stability and the potential interaction with myosin binding protein C and the SRX state, the limitation of not assessing HSPs should be discussed. (While HSP's relation to thick filament assembly might conceivably modify the interpretation of the M3 x-ray diffraction results, this reviewer acknowledges that possibility as a leap.)

      The reviewer raises an interesting and potentially important point about the potential impact of HSPs and their interaction with the thick filament during hibernation. We have added a section into the discussion of this manuscript regarding this, with particular focus on HSP70 acting as a chaperone for myosin binding protein. However, we feel that it is important to point out that HSPs have only been shown to interact with MYBPC3, a cardiac isoform of this protein which is not present in skeletal muscle [3] (Discussion, paragraph 5, “Of potential further interest to the regulation of myosin…”).

      Despite the above substantial concerns/weaknesses, this reviewer believes that this manuscript represents a valuable data set.

      Other comments related to interpretation:

      (4) The authors briefly mention the study by Toepfer et al [Ref 25] and that it utilizes cardiac muscles. The manuscript would benefit from increased discussion regarding the possible differences in energetics between cardiac and skeletal muscle in these states.

      As this manuscript focuses solely on skeletal muscle, we believe that introducing comparisons between cardiac and skeletal muscles would confuse the reader. For example, these two muscle types regulate the SRX/DRX states very differently. Note that we are preparing a manuscript focusing on cardiac muscle and hibernation.

      (5) The author's analysis of temperature is somewhat limited.

      (5a) First, the authors use 20 degrees C (room temperature), not 37 degrees C, a more physiologic body temperature for large mammals. While it is true that limbs are likely at a lower temperature, 20 degrees C seems substantially outside of a normal range. Thus, temperature differences may have been minimized by the author's protocol.

      The authors agree that performing these single fiber studies at slightly higher temperatures may have better replicated the physiological conditions of these hind leg muscles in the analyzed animals. However, previous work has shown that the resting myosin dynamics are in fact stable at temperatures between 20-30 degrees Celsius in type I, type II and cardiac mammalian muscle fibers [4].

      (5b) Second, the authors discuss the possibility of myosin contributing to non-shivering thermogenesis. The magnitude of this impact should be discussed. The suggestion of myosin ATP utilization also implies that there is some basal muscle tone (contraction), as the myosin ATPase utilizes ATP to release from actin, before binding and hydrolyzing again. Evidence of this tone should be discussed.

      The reviewer is raising an interesting point and it would indeed be interesting to assess the magnitude of the impact and whether a basal muscle tone exists. Assessing the magnitude of the impact is not an easy task and would require very advanced simulations, in which we are unfortunately not experts. As for basal muscle tone, this is difficult to say as myosin is not actually binding to actin but hydrolyzing ATP at a faster pace during hibernation. We therefore think that the relation between our data and basal muscle tone is unclear. Hence, we have decided not to discuss these points in the manuscript.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      This is a very interesting paper. I have some minor suggestions to help improve it.

      Is there any way to estimate the contribution of contractile apparatus to energy expenditure in reference to what is being generated at SERCA in the resting muscle under the various states examined?

      This is an interesting idea; however, as far as we know, this would be challenging experimentally (in the hibernating mammals) and difficult to achieve in a reliable manner.

      It is important to emphasize that while BAT has been traditionally seen to be the site of NST, the skeletal muscle is very important, especially in large mammals, where BAT is going to be a very small % of the body and unlikely to be able to adequately provide heat. The addition of the contractile apparatus to SERCA as a heat generator at rest is very important -- also, the activation of ryanodine receptor Ca2+ to increase the local [Ca2+] at SERCA to generate heat has also recently been shown and should be mentioned (Meizoso-Huesca et al 2022, PNAS; Singh et al 2023, PNAS) alongside the work of Bal et al 2012 etc...

      We have included these mechanisms and references into the manuscript discussion [5, 6]. Discussion, paragraph 4, “A critical difference between the large hibernators…”

      Are you able to report the likely proportion of type II fibers in the muscles you have sampled?

      The fiber type breakdown for all animals used in this study is reported in supplementary table 1.

      The sampling of muscle from the legs of live animals is sensible and convenient. Is it possible different muscles in the body have different levels of NST, changes in energy expenditure in torpor, and other states?

      As discussed in the public review we have added to the discussion of this manuscript to reflect upon this important point of potentially different results from different muscle sites in these animals.

      Reviewer #2 (Recommendations For The Authors):

      Is it likely that the proportion of fast and slow myosin-heavy chains within the selected sample of myofibers from the different mammals contributes to the overall differences in the energetics of different conformational states? In living animals, how does the relative contribution of the energetics from different muscle fiber types compare with the contribution from other organs to the overall regulation of metabolism during activities in summer, winter, or periods of intermittent arousal?

      Fiber types in mammals can be vastly different between species as well as having a considerable amount of plasticity to change within each species upon specific stimuli. Furthermore, some mammals also have specific myosin heavy chain isoforms which have considerable expression, for example, myosin heavy chain 2B which is expressed in rodents such as mice but not larger mammals such as humans.

      In the manuscript, we demonstrate that there is no significant change in the ATP usage by myosin in resting muscle in any of the species which we examined (Fig 1 F, L; Fig 2 E, J). The relatively high mitochondrial density of type I fibers when compared to type II fibers may contribute to a higher overall requirement of energy storage primarily via lipid oxidation. However, mitochondrial respiration is heavily suppressed during hibernation, so questions remain over the overall energy demand in hibernating muscle beyond myosin [7]. The fact that myosin ATP demand is relatively preserved in hibernating muscle suggests that skeletal muscle may be a relatively energy-demanding organ even during hibernation. We speculate in the manuscript that this may be due to the requirement of maintaining muscular tone and function during this period of prolonged immobilization. This may be of relevance when one considers the almost complete shutdown of organs involved with food intake and breakdown, such as the stomach and liver, during hibernation. Furthermore, heart rate and breathing rates are vastly suppressed. Altogether, whilst it is difficult at this point to make an accurate estimate of energy demands between the different organs of hibernators, our data point to skeletal muscle being a relatively high energy demand organ during these periods. When considering the difference between fiber types, again our data suggest that both type I and type II fibers have relatively similar energy demands during hibernation.

      The supplementary data are quite revealing as to how the myosin isoform composition is stable in some species but highly plastic in others in response to the same environmental/metabolic challenges. Why is the myosin heavy chain isoform (I and II) composition stable for brown bears but not for black bears between summer and winter? This is very interesting. For the Ground squirrel, there is remarkable plasticity between myosin heavy chain isoforms (I and II) between summer, interbout arousal, and torpor. Yet in the Garden Dormouse, the myosin heavy chain isoform (I and II) composition is stable between these three activity states. The inconsistencies between and within species are perplexing and worthy of closer interrogation.

      The measurements and role of myosin energetics in different conformational states are interesting but need to be explained in context with other metabolic regulators for these hibernating mammals, especially because some species show remarkable plasticity whereas others show remarkable stability. For example, compare brown and black bears, which show differences in the response of myosin composition to activity, interbout arousal, and torpor. Ground squirrels show remarkable plasticity in myosin isoform composition between activity states (and likely metabolic differences), but the Garden Dormouse has a remarkably stable myosin isoform composition during the three metabolic/environmental challenges. What mechanisms facilitate these modifications in some but not other mammals, even those of similar size? The differences are very interesting, worthy of follow-up, and may well contribute to further understanding the significance of the energetics of different myosin conformational states.

      We agree that the changes seen between these species are very interesting and worthy of further investigation. What would be of further interest would be to look at methods which would allow for even deeper phenotyping, such as single fiber proteomics, to allow for the assessment of the percentage of hybrid fibers and fibers undergoing any fiber type switch during hibernating periods. Our results do show a modest, albeit not significant, increase in the number of type I muscle fibers in 13-lined ground squirrels and Garden dormice during torpor, which is consistent with previous studies [8]. Previous studies have demonstrated that lower temperatures may promote a shift towards more oxidative type I muscle fibers in mammals [9]. This could be an explanation for why we see this specifically in the smaller hibernators; however, as we demonstrate and discuss, these lower temperatures are vital for the survival of these smaller mammals during hibernation, so it would be inconsistent to hypothesize that these shifts are for heat-production purposes. Further studies are warranted to understand the relevance of these shifts further, particularly those with a higher sample size. It would also be of interest to examine fiber type percentages during the progression of these long hibernating periods to observe if these changes are progressive.

      As for the triggers and mechanisms which facilitate these changes to myosin dynamics, these are under current investigation in the field. One which may be of particular relevance to the changes seen during hibernation is that of steroid hormones. Previous research has demonstrated that steroid hormone levels in male and female bears change differentially [10]. This may be of relevance as the steroid hormone estradiol has been shown to slow the resting myosin ATP turnover via the binding of myosin RLC [11]. Considering these studies, future work which looks at hibernating animals of each sex as different groups may be fruitful.

      Reviewer #3 (Recommendations For The Authors):

      i. PDF Pg 8- Results- 'Myosin temperature sensitivity is lost in relaxed skeletal muscles fibers of hibernating Ictidomys tridecemlineatus.': An extra comma appears to be placed between "temperature, decrease".

      ii. PDF Pg 9- Results- 'Hyper-phosphorylation of Myh2 predictably stabilizes myosin backbone in hibernating Ictidomys tridecemlineatus.' (last paragraph): A parenthesis needs to be closed upon the first reference to "supplemental figures 2 and 3".

      iii. PDF Pg 15- Methods- 'Samples collection and cryo-preservation'- The authors use the term "individuals" in the 2nd line. Consider using "subjects".

      iv. PDF Pg 15- Methods- 'Samples collection and cryo-preservation' (2nd paragraph)- define "subadult" in approximate months or years.

      v. PDF Pg 15- Methods- 'Samples collection and cryo-preservation' (2nd paragraph)- The authors state that brown bears were located in "February and again ... in late June". Was this order of operations always held? If so, a comment about how the potential ageing from the hibernation (especially if sub-adult transitions to adulthood in this period) should be included.

      All samples were collected during the subadult period of the lifespan of each bear and therefore we do not think that there would be a potential aging effect observed, considering the lifespan of this species to be 20-30 years.

      vi. PDF Pg 15- Methods- 'Samples collection and cryo-preservation' (3rd paragraph)- The justification for deprivation of feeding of black bears 24 hours prior to euthanasia should be included. A comment on how this might impact post-translational modifications or gene expression should be included.

      Animals are starved beforehand to prevent aspiration during euthanasia. Considering these samples are to be compared to animals which have not consumed food or water for five months, the relative impact on PTMs and gene expression would be considered negligible.

      vii. PDF Pg 17- Methods- 'Mant-ATP chase experiments' (just after normalized fluorescence equation): The "Where" may be lowercase.

      viii. PDF Pg 17- Methods- 'Mant-ATP chase experiments' (last paragraph): The protocol for myosin staining, along with the antibody identification (source, catalog number) should be included.

      ix. PDF Pg 18- Methods- 'Post-translational Modification Peptide mapping': Define the makeup of the acrylamide gel and/or the source and catalog number.

      x. PDF Pg 18- Methods- 'Post-translational Modification Peptide mapping': The authors state that "Gel bands were washed..." Please specify which protein bands and if multiple bands (i.e. multiple isoforms) were isolated.

      We thank this reviewer for their careful reading of our manuscript; we have made the changes above where relevant.

      Reference list

      (1) Aydin, J., et al., Nonshivering thermogenesis protects against defective calcium handling in muscle. FASEB J, 2008. 22(11): p. 3919-24.

      (2) Stickler, S., Regional body temperatures and fatty acid compositions in hibernating garden dormice: a focus on cardiac adaptions. 2022, Vienna: Vienna. p. v, 49 pages, illustrations.

      (3) Glazier, A.A., et al., HSC70 is a chaperone for wild-type and mutant cardiac myosin binding protein C. JCI Insight, 2018. 3(11).

      (4) Walklate, J., et al., Exploring the super-relaxed state of myosin in myofibrils from fast-twitch, slow-twitch, and cardiac muscle. Journal of Biological Chemistry, 2022. 298(3).

      (5) Meizoso-Huesca, A., et al., Ca2+ leak through ryanodine receptor 1 regulates thermogenesis in resting skeletal muscle. Proceedings of the National Academy of Sciences, 2022. 119(4): p. e2119203119.

      (6) Singh, D.P., et al., Evolutionary isolation of ryanodine receptor isoform 1 for muscle-based thermogenesis in mammals. Proceedings of the National Academy of Sciences, 2023. 120(4): p. e2117503120.

      (7) Staples, J.F., K.E. Mathers, and B.M. Duffy, Mitochondrial Metabolism in Hibernation: Regulation and Implications. Physiology, 2022. 37(5): p. 260-271.

      (8) Xu, R., et al., Hibernating squirrel muscle activates the endurance exercise pathway despite prolonged immobilization. Exp Neurol, 2013. 247: p. 392-401.

      (9) Yu, J., et al., Effects of Cold Exposure on Performance and Skeletal Muscle Fiber in Weaned Piglets. Animals (Basel), 2021. 11(7).

      (10) Frøbert, A.M., et al., Differential Changes in Circulating Steroid Hormones in Hibernating Brown Bears: Preliminary Conclusions and Caveats. Physiol Biochem Zool, 2022. 95(5): p. 365-378.

      (11) Colson, B.A., et al., The myosin super-relaxed state is disrupted by estradiol deficiency. Biochemical and biophysical research communications, 2015. 456(1): p. 151-155.

    1. eLife assessment

      This important study builds on a previous publication (with partially overlapping authors), demonstrating that T. brucei has a continuous endomembrane system, which probably facilitates high rates of endocytosis. Using a range of cutting-edge approaches, the authors present compelling evidence that an actomyosin system, with the myosin TbMyo1 as the molecular motor, is localized close to the endosomal system in the bloodstream form (BSF) of Trypanosoma brucei. It shows convincingly that actin plays a role in the organization and integrity of the endosomal system, and that the trypanosome Myo1 is an active motor that interacts with actin and transiently associates with endosomes, but a role of Myo1 in endomembrane function in vivo was not directly demonstrated. This work should be of interest to cell biologists and microbiologists working on the cytoskeleton, and unicellular eukaryotes.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1:

      Comments on revised version:

      The authors have satisfactorily addressed my concerns.

      I suggest some minor edits, however. Line 747 does not mention MARK3 and neither does the figure 8 legend (just MARK2). It would be helpful if the authors could include references to the papers reporting the shown structures in the Figure 8 legend.

      We have added MARK3 and related references in the revised Figure 8 legend.

      Reviewer #2:

      I would recommend that the catalog numbers of the different antibodies used in the study, mainly from CST and Invitrogen, be listed in the Materials and Methods (see Methods/Recombinant proteins and general reagents).

      Thank you for the comment. We have now added the antibody catalog numbers in the revised methods section.

      I have one remark related to question number 5 (my question was not clear enough). I meant to ask whether the authors looked at the functional relevance of the residues implicated in the identified salt-bridge network/tethers (represented in Fig. 8). What happens to the proteins functionally when you mutate those residues?

      Otherwise, the authors have satisfactorily addressed my concerns.

      Yes, we have analyzed the stability of the salt bridge interaction in the context of cysteine mutations, and our findings are described in the results section titled “Cysteine mutations alter critical structural interactions required for kinase allosteric regulation” (Figure 6). However, we have not performed mutational analysis of the salt bridge residues as we feel this would be beyond the scope of the current study.

    2. Reviewer #1 (Public Review):

      Summary:

      Bendzunas, Byrne et al. explore two highly topical areas of protein kinase regulation in this manuscript. Firstly, the idea that Cys modification could regulate kinase activity. The senior authors have published some standout papers exploring this idea of late, and the current work adds to the picture of how active site Cys might have been favoured in evolution to serve critical regulatory functions. Second, BRSK1/2 are understudied kinases listed as part of the "dark kinome" so any knowledge of their underlying regulation is of critical importance to advancing the field.

      Strengths:

      In this study, the authors pinpoint highly-conserved, but BRSK-specific, Cys residues as key players in kinase regulation. There is a delicate balance between equating what happens in vitro with recombinant proteins relative to what the functional consequence of Cys mutation might be in cells or organisms, but the authors are very clear with the caveats relating to these connections in their descriptions and discussion. Accordingly, by extension, they present a very sound biochemical case for how Cys modification might influence kinase activity in cellular environs.

      Comments on revised version:

      The authors have satisfactorily addressed my concerns.

    3. Reviewer #2 (Public Review):

      Summary:

      In this study by Bendzunas et al, the authors show that the formation of intra-molecular disulfide bonds involving a pair of Cys residues near the catalytic HRD motif and a highly conserved T-Loop Cys with a BRSK-specific Cys at an unusual CPE motif at the end of the activation segment function as repressive regulatory mechanisms in BRSK1 and 2. They observed that mutation of the CPE-Cys only, contrary to the double mutation of the pair, increases catalytic activity in vitro and drives phosphorylation of the BRSK substrate Tau in cells. Molecular modeling and molecular dynamics simulations indicate that oxidation of the CPE-Cys destabilizes a conserved salt bridge network critical for allosteric activation. The occurrence of spatially proximal Cys amino acids in diverse Ser/Thr protein kinase families suggests that disulfide-mediated control of catalytic activity may be a prevalent mechanism for regulation within the broader AMPK family. Understanding the molecular mechanisms underlying kinase regulation by redox-active Cys residues is fundamental as it appears to be widespread in signaling proteins and provides new opportunities to develop specific covalent compounds for the targeted modulation of protein kinases.

      The authors demonstrate that intramolecular cysteine disulfide bonding between conserved cysteines can function as a repressing mechanism as indicated by the effect of DTT and the consequent increase in activity by BRSK1 and 2 (WT). The cause-effect relationship of why mutation of the CPE-Cys only increases catalytic activity in vitro and drives phosphorylation of the BRSK substrate Tau in cells is not clear to me. The explanation given by the authors based on molecular modeling and molecular dynamics simulations is that oxidation of the CPE-Cys (that will favor disulfide bonding) destabilizes a conserved salt bridge network critical for allosteric activation. However, no functional evidence of the impact of the salt-bridge network is provided. If you mutated the two main Cys-pairs (aE-CHRD and A-loop T+2-CPE) you lose the effect of DTT, as the disulfide pairs cannot be formed, hence no repression mechanisms take place; however, when looking at individual residues I do not understand why mutating the CPE only results in the opposite effect unless it is independent of its connection with the T+2 residue on the A-loop.

      Strengths:

      This is an important and interesting study providing new knowledge in the protein kinase field with important therapeutic implications for the rational design and development of next-generation inhibitors.

      Comments on revised version:

      The authors have satisfactorily addressed my concerns.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review): Weaknesses:

      However, the molecular mechanisms leading to NPC dysfunction and the cellular consequences of resulting compartmentalization defects are not as thoroughly explored. Results from complementary key experiments using western blot analysis are less impressive than microscopy data and do not show the same level of reduction. The antibodies recognizing multiple nucleoporins (RL1 and Mab414) could have been used to identify specific nucleoporins that are most affected, while the selection of Nup98 and Nup107 is not well explained.

      The results for the Western blots are less impressive than single nuclei imaging analysis because the protocol for isolating brain nuclei is heterogeneous and includes non-neuronal cells. For this reason, we selected specific nucleoporins for Western blot studies to complement the nonspecificity of pan-NPC antibodies, for which detection is based on the glycosylated moieties. We reasoned that a combination of pan-NPC and select NUPs will give the strongest complementary validation for the mutant phenotype. We have discussed the rationale for NUP selection in the discussion. In brief, we selected NUP107 as it is a major component of the Y-scaffold complex and is a long-lived subunit of the NPCs (Boehmer et al., 2003; D'Angelo et al., 2009). NUP98 is a mobile nucleoporin and is associated with the central pore, nuclear basket and cytoplasmic filaments. Both NUPs have been implicated in degenerative disorders (Eftekharzadeh et al., 2018; Wu et al., 2001).

      There is also no clear hypothesis on how Aβ pathology may affect nucleoporin levels and NPC function. All functional NCT experiments are based on reporters or dyes, although one would expect widespread mislocalization of endogenous proteins, likely affecting many cellular pathways.

      We agree that the interaction between Aβ pathology and the NPC remains a work in progress. We decided to rigorously characterize Aβ-mediated deficits in App KI neurons, using different approaches and in more than one animal model, before moving on to explore mechanisms in subsequent studies, which we think deserve more extensive experiments. We seek your understanding and have included in the discussion possible mechanisms for direct and indirect Aβ-mediated disruption of NPCs. We have also included an additional study to show the disruption in the localization of an endogenous nucleocytoplasmic protein, CRTC1 (cAMP Regulated Transcriptional Coactivator), which is a CREB coactivator responsive to neural activity. We observed that, under both basal and tetrodotoxin-silenced conditions, there is much higher CRTC1 in the nucleus in App KI neurons relative to WT (Supplementary Figure S15). This reflects the compromised permeability barrier that we observed via FRAP studies.

      The second part of this manuscript reports that in App KI neurons, disruption in the permeability barrier and nucleocytoplasmic transport may enhance activation of key components of the necrosome complex that include receptor-interacting kinase 3 (RIPK3) and mixed lineage kinase domain-like (MLKL) protein, resulting in an increase in TNFα-induced necroptosis. While this is of potential interest, it is not well integrated in the study. This potential disease pathway is not shown in the very simple schematic (Fig. 8) and is barely mentioned in the Discussion section, although it would deserve a more thorough examination.

      The study of necroptosis is meant to showcase a single cellular pathway that requires nucleocytoplasmic transport for activation, is compromised in App KI neurons, and is relevant for AD. We agree there is much more to explore in this pathway but feel it is outside the scope of this study. We have included a new illustration that models how damage to NPCs and the permeability barrier results in enhanced vulnerability of App KI neurons to necroptosis (Supplemental figure S12).

      Reviewer #2 (Public Review):

      (1) Adding statistics and comparisons between wild-type changes at different times/ages to determine if the nuclear pore changes with time in wild-type neurons. The images show differences in the Nuclear pore in neurons from the wild-type mice, with time in culture and age. However, a rigorous statistical analysis is lacking to address the impact of age/development on NUP function. Although the authors state that nuclear pore transport is reported to be altered in normal brain aging, the authors either did not design their experiments to account for the normal aging mechanisms or overlooked the analysis of their data in this light.

      All our quantifications and statistical comparisons in neuron cocultures are time-matched between WT and App KI neurons, and thus independent of age and maturity of the neurons in culture. The accelerated loss of NUP expression is evident across all time groups. However, we cannot compare across age groups in cultured neurons as the time-matched WT and App KI samples for each time point were processed and imaged separately as neurons matured over time (Fig. 1B-C). An experiment must be done simultaneously across all age groups to compare age-related effects for WT and App KI neurons in order to account for time-dependent changes. Given the unique challenges of studying “aging” in culture systems, we opted to be more conservative in our interpretation of the results and as such, we were careful to describe the accelerated nuclear pore deficits in App KI neurons relative to time-matched WT expression and to speculate on their relationship to normal brain aging only in the discussion section. We seek your understanding in this matter. That said, we are able to capture the decline of the NPC in histology of brain sections and observed a statistically significant drop in WT NUP levels in animal sections across age groups where we quantified and compared the raw nuclear intensities from brain sections that were processed and imaged simultaneously across independent experiments (Fig. 1D-E). We have included a statement in the results section to highlight that point.

      (2) Add experiments to assess the contribution of wild-type beta-amyloid accumulation with aging. It was described in 2012 (Guix FX, Wahle T, Vennekens K, Snellinx A, Chávez-Gutiérrez L, Ill-Raga G, Ramos-Fernandez E, Guardia-Laguarta C, Lleó A, Arimon M, Berezovska O, Muñoz FJ, Dotti CG, De Strooper B. 2012. Modification of γ-secretase by nitrosative stress links neuronal ageing to sporadic Alzheimer's disease. EMBO Mol Med 4:660-673. doi:10.1002/emmm.201200243) and 2021 (Burrinha T, Martinsson I, Gomes R, Terrasso AP, Gouras GK, Almeida CG. 2021. Upregulation of APP endocytosis by neuronal aging drives amyloid-dependent synapse loss. J Cell Sci 134. doi:10.1242/jcs.255752) that 28 DIV neurons are senescent and accumulate beta-amyloid 42. In addition, beta-amyloid 42 accumulates normally in the human brain (Baker-Nigh A, Vahedi S, Davis EG, Weintraub S, Bigio EH, Klein WL, Geula C. 2015. Neuronal amyloid-β accumulation within cholinergic basal forebrain in ageing and Alzheimer's disease. Brain 138:1722-1737. doi:10.1093/brain/awv024); thus, it would be important to determine if it contributes to NUP dysfunction. Unfortunately, the authors tested the Abeta contribution at DIV14, when wild-type Abeta accumulation was undetected. It would enrich the paper and allow the authors to draw conclusions about normal aging if additional experiments were performed, namely, treating 28 DIV neurons with DAPT and assessing if NUP is restored.

      Your point is well-noted. We are intrigued at the potential contribution of WT Aβ to the decline in NUPs and NPC but decided to focus on mutant Aβ for this manuscript. We have observed negligible MOAB2-positive Aβ signals in WT neurons across all age groups (data not shown) but acknowledge the potential contributions of aging toward a reduction in NPC function. Instead, we have included a section in the discussion to highlight the aging-related expression of Aβ in WT neurons and a subset of the citations above to indicate a possible link with normal decay of NPCs.

      Reviewer #3 (Public Review):

      Weaknesses:

      (1) It does not consider the relationship of the findings here to other published work on the intraneuronal perinuclear and nuclear accumulation of amyloid in other transgenic mouse models and in humans.

      We have updated the discussion to further elaborate on intraneuronal and perinuclear accumulation of amyloid and how that relates to our NPC phenotype.

      (2) It appears to presume that soluble, secreted Abeta is responsible for the effect rather than the insoluble amyloid fibrils.

      At present, our data cannot fully discount the role of fibrils or other forms of Aβ causing the NPC deficits, but our studies do show that external presence of Aβ (e.g. addition of synthetic oligomeric Aβ or App KI conditioned media) leads to intracellular accumulation and NPC dysfunction. We are aware that endogenous formation of fibrils could also contribute to the NPC dysfunction but refrained from drawing any conclusions without further studies. We have stated this in the discussion.

      (5) It is not clear when the alteration in NUP expression begins in the KI mice as there is no time at which there is no difference between NUP expression in KI and Wt and the earliest time shown is 2 months. If NUP expression is decreased from the earliest times at birth, then this makes the significance of the observation of the association with amyloid pathology less clear.

      The phenotype we observed early in neuronal cultures and in very young animals is subtle and in all our studies, the severity of the NUP phenotypes consistently correlates with elevated intracellular Aβ. We expect that by looking at earlier/younger neurons, the deficits will not be present. However, neurons before DIV7 are immature, and hence we chose not to include those in our observations. In animals, we observed Aβ expression in neuronal soma in young mice (2 mo.), but it is not clear when the deficits manifest and how early to look. While the NUP expression is reduced at an early stage, we speculate in the discussion that cellular homeostatic mechanisms can compensate for any compromised nuclear functions and maintain viability up to the point where age-dependent degradation of cellular mechanisms eventually leads to progression of AD.

      Reviewer #1 (Recommendations For The Authors):

      While the App KI model is suitable for modeling one key aspect of human AD, the use of the term "AD neurons" throughout the manuscript is misleading and should be avoided when describing experiments with "App KI neurons".

      Noted and corrected.

      The claim that Aβ pathology causes NPC dysfunction via reduced nucleoporin protein expression would be stronger if it was better supported by biochemical evidence based on western blots (WBs) to complement the strong microscopy data. The results shown in Figure 2H show a very weak effect compared to microscopy data that does not appear to match the quantification (e.g. Lamin-B1 staining appears reduced after 2 months in WB but not the graph). It is also not clear why nuclear fractionation is required. WB analyses with RL1 and MAB414 (that recognizes multiple FG-Nups in ICCs and WBs) would help identify Nups that are most affected by Aβ pathology.

      The weaker Western blot results are due to the heterogeneity of the nuclei we isolated from the whole brain, which include non-neuronal cells. We reasoned that isolating the nuclear fraction would give us a cleaner Western blot with fewer background bands as the input lysate is more specific. We also decided to use antibodies against specific NUPs as a way to complement the pan-NPC antibodies that detect glycosylation-enriched epitopes in the nucleus. We reasoned that Western blot identification of individual subunits should provide complementary and stronger evidence for the reduction of NUPs at the peptide level. Overall, we used four different nuclear pore antibodies (RL1, Mab414, NUP98, NUP107) to demonstrate the same mutant phenotype in App KI neurons.

      While the observed NCT defects are discussed in detail, the authors do not present any potential mechanisms to be tested, how intracellular Aβ may impact NPCs. Does Aβ pathology affect nucleoporin expression or stability?

      We have observed the presence of Aβ adjacent to the nuclear membrane and also in the cytosol via high resolution confocal microscopy (Supplementary Figure S14). Our primary goal in this paper is to provide convincing evidence, using different assays and in more than one mouse model, for the reduction of NUPs and lower NPC counts. We feel the mechanistic details of Aβ-driven NPC disruption require more extensive experimentation that is more suitable for subsequent publications.

      The very simple schematic just represents the loss of compartmentalization, without illustrating more complex concepts. It would also be improved by representing the outer and inner nuclear membrane fusing around the NPCs with a much wider perinuclear space between the membranes. As shown now, the nuclear envelope almost looks like a single membrane, while >60kDa proteins are shown at a similar size as the 125MDa NPC.

      We have updated the illustration along with a new schematic for necroptosis (Supplementary Figure S12). We have refrained from giving specific details of the damage to the nuclear pore complex because it is not yet clear the nature of these deficits.

      Misspelling of "Hoechst" as "Hochest" in several figures (Fig. 1, 2, S5, S7).

      Noted and corrected.

      Reviewer #2 (Recommendations For The Authors):

      (1) Additional data analysis is required concerning the wild-type controls. The figures show clear differences in the wild-type neurons with time in culture (referring to figures 1A, 1B, 1C; 2A, 2B, 2C, 2D, 6E, 6F, 6G, S4) and in different ages (2E, 2F, 2G, 5B, 5C, 5D). The data analysis is shown for knockin vs the time-matched wild-type condition. The effect of time in wild-type neurons/mice should also be analyzed. All the data is suggested to be normalized to 7 DIV/2-month wild-type neurons/mice. Were these experiments done with different time points of the same culture? This would be the best to conclude on the effect of time.

      We have noted a decline of NUPs in WT neurons over time in primary cultures and in animal sections. This is not surprising since the NPC and nuclear signaling pathways deteriorate with age (Liu and Hetzer, 2022; Mertens et al., 2015). However, we are unable to do a direct comparison across age groups in cultured neurons as the time-matched WT and App KI neuronal samples for each time point were processed and imaged separately as neurons matured over time (Fig. 1B-C). Hence, we performed statistical analysis on each time-matched pair of WT and App KI neurons. To be clear, multiple independent experiments across different cultures were performed at each time point. Given the inherent challenges of studying aging in culture systems, we opted to be more conservative in our interpretation of the results and as such, we were careful to describe the accelerated nuclear pore deficits in App KI neurons relative to WT levels without inferring the effect of time, and to speculate on their relationship to normal brain aging only in the discussion section. That said, we are able to capture the decline of the nuclear pore complex across different age groups in histology of brain sections where we observed a drop in WT NUP levels in animal sections when we quantified and compared the raw nuclear intensities from brain sections that were processed and imaged simultaneously across independent experiments (Fig. 1D-E).

      Similarly, in Figure 2H, why aren't 2 months compared with 14 months? Why were these ages chosen? 2 months is a young adult, and 14 months is a middle-aged adult. To conclude, aging should have included an age between 18 and 24 months old.

      As with cultures, we isolated age-matched WT and App KI animals separately. We chose 2 to 14 months as they represent young and middle-aged adults, as we wanted to showcase the nuclear pore deficits induced by the presence of Aβ without drawing a conclusion on the effects of age or time. That said, we do show histology of brain sections at 18 months of age with individual NUPs. We agree that the temporal aspects of NPC loss in WT neurons are interesting; however, given our experimental parameters, we cannot draw conclusions across different age groups at the moment.

      In Figure 3, statistics between wild type should have been included.

      Similar to the above comment, samples were processed and imaged independently across different groups, hence we cannot compare the datapoints across time.

      (4) Additional quantification: The intensity of MOAB2 at 2 and 13 months should be measured as in Figure 3C.

      Intracellular Aβ signal in 2-mo. old App KI mice is diffuse throughout the soma, but in older animals it is punctate. This observation was similarly described by Lord et al. for tgAPPArcSwe mice (Lord et al., 2006). We have included a confocal micrograph of MOAB-2 immunocytochemistry of a 13-mo. App KI brain section in supplemental figures (Supplementary Figure S13). We found it challenging to differentiate whether the signal is localized intracellularly or as an extracellular aggregate. Regardless, the differences in the quality and uneven distribution of Aβ signal make any direct comparison of soma intensity across the different age groups harder to interpret in the context of the mutant phenotype.

      (5) Additional experiments: Because primary neurons differentiate, mature, and age with time in culture, it is necessary to control for the developmental stage of your cultures, for example by analyzing neuronal markers such as doublecortin for neuronal precursors, MAP2 (or Tau) for dendritic/axonal maturation, synapsin for synaptic maturation, and accumulation of senescence-associated beta-galactosidase (SA-Beta-Gal) as an aging marker.

      As part of the maintenance of cultures, we stain cultures for axodendritic markers (e.g. MAP2), glial cell distribution (e.g. GFAP), excitatory vs. inhibitory neuronal subpopulations (e.g. Gad65) and synaptic markers (e.g. PSD95) to ensure that growth, survival and viability of neurons are not compromised (data not shown). These markers for maturity are routinely tracked to ensure proper development. We also test the health of the cultures (e.g. apoptosis, necrosis) and look for cytoskeletal disruption or fragmentation of neuronal processes.

      (6) Additional methods: The quantification of Abeta intensity in Figure 3 is not clearly explained in the methods. Was the intensity measured per field, per cell body?

      The quantifications for Aβ are done for each MAP2-positive cell body, and we have included that statement in the methods.

      (7) Missing in discussion integration and references to these papers:

      a. Mertens J, Paquola ACM, Ku M, Hatch E, Böhnke L, Ladjevardi S, McGrath S, Campbell B, Lee H, Herdy JR, Gonçalves JT, Toda T, Kim Y, Winkler J, Yao J, Hetzer MW, Gage FH. 2015. Directly Reprogrammed Human Neurons Retain Aging-Associated Transcriptomic Signatures and Reveal Age-Related Nucleocytoplasmic Defects. Cell Stem Cell 17:705-718. doi:10.1016/j.stem.2015.09.001

      b. Guix FX, Wahle T, Vennekens K, Snellinx A, Chávez-Gutiérrez L, Ill-Raga G, Ramos-Fernandez E, Guardia-Laguarta C, Lleó A, Arimon M, Berezovska O, Muñoz FJ, Dotti CG, De Strooper B. 2012. Modification of γ-secretase by nitrosative stress links neuronal ageing to sporadic Alzheimer's disease. EMBO Mol Med 4:660-673. doi:10.1002/emmm.201200243

      c. Burrinha T, Martinsson I, Gomes R, Terrasso AP, Gouras GK, Almeida CG. 2021. Upregulation of APP endocytosis by neuronal aging drives amyloid-dependent synapse loss. J Cell Sci 134. doi:10.1242/jcs.255752

      d. Baker-Nigh A, Vahedi S, Davis EG, Weintraub S, Bigio EH, Klein WL, Geula C. 2015. Neuronal amyloid-β accumulation within cholinergic basal forebrain in ageing and Alzheimer's disease. Brain 138:1722-1737. doi:10.1093/brain/awv024

      We have cited a subset of the papers in the discussion section and also expanded the discussion to include the possibility of time-dependent changes for Aβ expression in WT neurons.

      Reviewer #3 (Recommendations For The Authors):

      Specific comments:

      (1) Fig. 1D,E. Fig. 2E, F. This shows the change in NUP IR with time for the APP-KI, but there is also a difference between Wt and KI from the earliest time shown. How early is this difference apparent? From birth? The study should go back to the earliest time possible as the timing of the staining for NUP is important to correlate this with other events of intraneuronal Abeta and amyloid IR. Is the difference between 4 and 7-month ko mice in Figures 2G and 2F statistically significant? If not, perhaps we need a larger N to determine the timing accurately.

      The point is well taken. We have not examined the WT and App KI brains before 2-mo. of age. At this early time point, the extracellular amyloid deposits are very low but intracellular Aβ can be readily detected in neuronal soma. We expect that as the animal ages, the Aβ inside cells will directly impact the NPC mutant phenotype, but it is unclear how early this phenotype manifests in animals and when we should look. To be clear, in less mature neurons (DIV7), the phenotype is very subtle and can only be observed via high resolution microscopy. The differences between 4- and 7-mo. old animals (Fig. 2F and G) in terms of severity of the reduction cannot be assessed as the age-matched animals for each time point were processed separately, but at each time point, we observed a significant reduction of NPC relative to WT. Nevertheless, in Figure 1E, we performed immunohistochemistry experiments with pan-NPC antibodies and quantified raw intensities to show a difference between 4/7-mo. and 13-mo. old animals.

      (2) Similarly, the increase in Abeta IR is only shown for cultured neurons and only a single time point of 2 months is shown for CA1 in KI brain. Since a major point is that the decrease in NUP IR is correlated with an increase in Abeta IR, a more convincing approach would be to stain for both simultaneously in KI brain, especially since Abeta IR is quite sensitive to conformational variation between APP, Abeta, and aggregated forms and whether they are treated with denaturants for "antigen retrieval". The entire brain hemisphere should be shown as the pathology is not limited to CA1. There are many different Abeta antibodies that are specific to the amyloid state so it should be possible to come up with a set of antibodies and conditions that work for both Abeta and NUP staining.

      The intracellular Aβ signal in 2-mo. old App KI mice is diffuse throughout the soma, but in older animals it is punctate. We have included a confocal micrograph of MOAB-2 immunocytochemistry of a 13-mo. App KI brain section (Supplementary Figure S13). We did not quantify Aβ as it was challenging to differentiate whether the signal is intracellular Aβ or amyloid β plaques. Regardless, the differences in the quality and uneven distribution of Aβ signal make any direct comparison of soma intensity across the different age groups much harder to interpret.

      (3) Figure 3A. The staining with MOAB 2 and 82E1 appears qualitatively different with 82E1 exhibiting larger perinuclear puncta. Both antibodies appear to stain puncta inside the nucleus consistent with previously published reports of intranuclear amyloid IR. If these are flattened images, then 3D Z stacks should be shown to clarify this. Figure 3H shows what appears to be Abeta immunofluorescence quantitation in DAPT-treated cells, but the actual images are apparently not shown. The details of this experiment aren't clear or what antibody is used, but this may not be Abeta as many APP fragments that are not Abeta also react with antibodies like MOAB2.

      Since 82E1 detects a larger epitope (aa1-16 as compared to 1-4 in MOAB-2), it is possible some forms of Aβ are differentially detected inside the cell. MOAB-2 is shown to detect the different forms of Aβ40 and 42, with a stronger selectivity for the latter. However, it is not known to react with APP or APP/CTFs (Youmans et al., 2012). DAPT-treated cells were processed and imaged as with other experiments in figure 3 using MOAB-2 antibodies to detect Aβ. We have included that information in the figure legends.

      The way we image the cell is to collect LSM800 confocal stacks and use IMARIS software to render the nucleus as a 3D object prior to quantifying the intensity or coverage. In this way, we are capturing and quantifying the entire volume of the nucleus and not just a single plane. The majority of the signal for MOAB-2 positive Aβ is punctate in the cytosol, with a subset adjacent to the nucleus (Supplementary Figure S14; Airyscan; single plane). We also detected MOAB-2 signals coming from within the nucleus. The nature of this interaction between Aβ and the nuclear membrane/perinuclear space/nucleoplasm remains unclear.

      (4) P20 L12. "We demonstrate an Aβ-driven loss of NUP expression in hippocampal neurons both in primary cocultures and in AD mouse models" It isn't clear that exogenous or extracellular Abeta drives this in living animals. All the data that demonstrate this is derived from cell culture and things may be very different (eg. Soluble Abeta concentration) in vivo. It is OK to speculate that the same thing happens in vivo, but to say it has been demonstrated in vivo is not correct.

      We have rewritten the opening statement in the paragraph to narrowly define our observations in the context of App KI. We understand the caveats of our studies in primary cultures, but we have done our due diligence to study the phenomenon in different assays, using at least four different nuclear pore antibodies, and in more than one mouse model to show the deficits. We mentioned Aβ-driven loss but did not conclude which Aβ peptide (e.g. 40 vs. 42) or form (e.g. fibrillar) that drives the deficits. However, we have shown some data that oligomers and not monomers as well as extracellular Aβ can accumulate in the soma and trigger NPC deficits. We also state in the discussion that other possible mechanisms of action, mainly via indirect interactions of Aβ with the cell, could result in the deficits.

      (5) P21, L21 "Inhibition of γ-secretase activity prevented cleavage of mutant APP and generation of Aβ, which led to the partial restoration of NUP levels". What the data actually shows is that treatment of the cells with DAPT led to partial restoration of NUP levels. Other studies have shown that DAPT is a gamma secretase inhibitor, so it is reasonable to suspect that the effect is due to gamma secretase activity, but the substrates and products are assumed rather than measured, so a little caution is a good idea here. For example, CTF alpha is also a substrate, producing P3, which is not considered Abeta. The products Abeta and P3 also typically are secreted, where they can be further degraded. Abeta and P3 can also aggregate into amyloid, so whether the effect is really due to Abeta per se as a monomer or Abeta-containing aggregates isn't clear.

      The point is noted. DAPT inhibition of γ-secretase can impact more than one substrate, as the complex can cleave multiple substrates. However, we have measured Aβ intensity which increases with DAPT, and while a singular experiment is insufficient to show direct Aβ involvement, we have performed other experiments that show a correlation between Aβ levels inside the soma and the degree of NPC reduction. This includes the direct application of synthetic Aβ42 oligomers. We agree the data cannot fully exclude the involvement of other γ-secretase cleavage products, but we feel there is strong enough evidence that Aβ, in whatever form, is at least partially, if not mainly, the driver that promotes these deficits.

      (6) Discussion. The authors point to "intracellular Abeta" as a potential causative agent for decreased NUP expression and function and cite a number of papers reporting intracellular Abeta. (D'Andrea et al., 2001; Iulita et al., 2014; Kimura et al., 2003; LaFerla et al., 1997; Oddo et al., 2003b; Takahashi et al., 2004; Wirths et al., 2001). Most of these papers report immunoreactivity with Abeta antibodies and argue about whether this is really Abeta40 or 42 and not APP or APP-CTF immunoreactivity. What is missing from these papers and the discussion in this manuscript is that this is not just soluble Abeta, but Abeta amyloid of the same type that ends up in plaques because it has the same immunoreactivity with Abeta amyloid fibril-specific antibodies and even the classical anti-Abeta antibodies 6E10 and 4G8 after antigen retrieval as shown in papers by Pensalfini, et al., 2014 and Lee, et al., 2022 (1,2) who describe the evolution of neuritic plaques and their amyloid core beginning inside neurons. The term "dystrophic neurite" is a misnomer because the structures that resemble "neurites" morphologically are actually autophagic vesicles packed with Abeta and APP immunoreactive material which has the detergent insolubility properties of amyloid plaques. See (1,2). The apparent intranuclear IR of MOAB2 and 82E1 mentioned in comment 3 is relevant here. In Lee et al., the 3D serial section EM reconstruction of one of these neurons with perinuclear and nuclear amyloid shows abundant amyloid fibrils in the remnant of the nucleus. The nuclear envelope appears to break down as evidenced by the redistribution of NeuN immunoreactivity (Pensalfini et al.,) and other nuclear markers and the EM evidence (Lee et al.,). These papers are also improperly cited as evidence for a hypothetical intracellular source for soluble Abeta.

      We have devoted a section of the discussion to highlight some of these findings in the context of Pensalfini et al. 2014 and Lee et al. 2022. Lee et al. tested multiple animal strains to observe the Panthos structures but did not use the App KI mouse model. Since none of our experiments directly tested their observations (e.g. perinuclear fibrils or acidity of autophagic vesicles) in App KI, we decided to take a more conservative approach in our interpretations by framing the NPC deficits without specifying the nature of the intracellular Aβ. We note in discussion that it is entirely possible that App KI animals also show the same Panthos phenotypes and the perinuclear accumulation of Aβ which results in damaged NUPs. To do that, the Panthos phenotype must first be established in App KI mice.

      (7) The authors also cite the work of Ditaranto et al., 2001 and Ji et al., 2002 for Aβ-induced lysosomal leakage from these vesicular structures but overlook the original publications on Abeta-induced lysosomal leakage by Yang et al., (3) who further show that this is correlated with aggregation of Abeta42 upon internalization which also leads to the co-aggregation of APP and APP-CTFs in a detergent-insoluble form (4) and pulse-chase studies demonstrate that metabolically-labeled APP ultimately ends up as insoluble Abeta that have "ragged" N-termini (5). This work seems relevant to the results reported here as the perinuclear amyloid that the authors report here is likely to be the same insoluble, aggregated APP and APP-CTF-containing amyloid as that reported in references 1 and 2.

      We have included the literature references in the discussion, highlighting the possibility of lysosomal leakage contributing to the NPC damage.

      Minor points.

      (1) P2, L28 "permeability barrier facilities passive" should be 'facilitates'.

      (2) P7, L24 "homogenate and grounded for 5 additional strokes" One of the peculiarities of English is that the past tense of grind is ground. Grounded means something else.

      (3) P8, L9 "For synthetic Aβ experiments," Abeta what? 42? 40? It makes a difference and if it is Abeta42, you should be specific in the rest of the text where it is used.

      (4) P11, L14. "To determine if Aβ can trigger changes in nuclear structure and function" It seems a little early to start by presupposing that it is Abeta that triggers changes in nuclear structure and function. It sounds like you are starting out with a bias.

      (5) P11, L16,17 "While Aβ pathology is robustly detected in App KIs" At some point in the manuscript, either here or in the introduction, it would be useful to include a couple of sentences about what the pathology is in these mice along with the timing of the development of the pathology to compare with the results presented here. There are several types of amyloid deposits, "neuritic" plaques, diffuse plaques, and cerebrovascular amyloid. This is important because the early "neuritic" plaques are intraneuronal at least early on before the neuron dies. See (1,2).

      (6) P19, L10. "LMB is an inhibitor or CRM-1 mediated" should be of

      All minor points have been addressed in the manuscript and figures.

      References

      (1) Pensalfini, A., Albay, R., 3rd, Rasool, S., Wu, J. W., Hatami, A., Arai, H., Margol, L., Milton, S., Poon, W. W., Corrada, M. M., Kawas, C. H., and Glabe, C. G. (2014) Intracellular amyloid and the neuronal origin of Alzheimer neuritic plaques. Neurobiol Dis 71C, 53-61

      (2) Lee, J. H., Yang, D. S., Goulbourne, C. N., Im, E., Stavrides, P., Pensalfini, A., Chan, H., Bouchet-Marquis, C., Bleiwas, C., Berg, M. J., Huo, C., Peddy, J., Pawlik, M., Levy, E., Rao, M., Staufenbiel, M., and Nixon, R. A. (2022) Faulty autolysosome acidification in Alzheimer’s disease mouse models induces autophagic build-up of Abeta in neurons, yielding senile plaques. Nat Neurosci 25, 688-701

      (3) Yang, A. J., Chandswangbhuvana, D., Margol, L., and Glabe, C. G. (1998) Loss of endosomal/lysosomal membrane impermeability is an early event in amyloid Aβ1-42 pathogenesis. J. Neurosci. Res. 52, 691-698

      (4) Yang, A. J., Knauer, M., Burdick, D. A., and Glabe, C. (1995) Intracellular A beta 1-42 aggregates stimulate the accumulation of stable, insoluble amyloidogenic fragments of the amyloid precursor protein in transfected cells. J Biol Chem 270, 14786-14792

      (5) Yang, A., Chandswangbhuvana, D., Shu, T., Henschen, A., and Glabe, C. G. (1999) Intracellular accumulation of insoluble, newly synthesized Aβn-42 in APP transfected cells that have been treated with Aβ1-42. J. Biol. Chem. 274, 20650-20656

      References

      Boehmer, T., Enninga, J., Dales, S., Blobel, G., and Zhong, H. (2003). Depletion of a single nucleoporin, Nup107, prevents the assembly of a subset of nucleoporins into the nuclear pore complex. Proc Natl Acad Sci U S A 100, 981-985.

      D'Angelo, M.A., Raices, M., Panowski, S.H., and Hetzer, M.W. (2009). Age-dependent deterioration of nuclear pore complexes causes a loss of nuclear integrity in postmitotic cells. Cell 136, 284-295.

      Eftekharzadeh, B., Daigle, J.G., Kapinos, L.E., Coyne, A., Schiantarelli, J., Carlomagno, Y., Cook, C., Miller, S.J., Dujardin, S., Amaral, A.S., et al. (2018). Tau Protein Disrupts Nucleocytoplasmic Transport in Alzheimer's Disease. Neuron 99, 925-940 e927.

      Liu, J., and Hetzer, M.W. (2022). Nuclear pore complex maintenance and implications for age-related diseases. Trends Cell Biol 32, 216-227.

      Lord, A., Kalimo, H., Eckman, C., Zhang, X.Q., Lannfelt, L., and Nilsson, L.N. (2006). The Arctic Alzheimer mutation facilitates early intraneuronal Abeta aggregation and senile plaque formation in transgenic mice. Neurobiol Aging 27, 67-77.

      Mertens, J., Paquola, A.C., Ku, M., Hatch, E., Bohnke, L., Ladjevardi, S., McGrath, S., Campbell, B., Lee, H., Herdy, J.R., et al. (2015). Directly Reprogrammed Human Neurons Retain Aging-Associated Transcriptomic Signatures and Reveal Age-Related Nucleocytoplasmic Defects. Cell stem cell 17, 705-718.

      Wu, X., Kasper, L.H., Mantcheva, R.T., Mantchev, G.T., Springett, M.J., and van Deursen, J.M. (2001). Disruption of the FG nucleoporin NUP98 causes selective changes in nuclear pore complex stoichiometry and function. Proc Natl Acad Sci U S A 98, 3191-3196.

      Youmans, K.L., Tai, L.M., Kanekiyo, T., Stine, W.B., Jr., Michon, S.C., Nwabuisi-Heath, E., Manelli, A.M., Fu, Y., Riordan, S., Eimer, W.A., et al. (2012). Intraneuronal Abeta detection in 5xFAD mice by a new Abeta-specific antibody. Molecular neurodegeneration 7, 8.

    2. eLife assessment

      This study focuses on nuclear pore complex dysfunction in a mouse model of Alzheimer's disease related Aβ pathology. If future revisions can adequately respond to the reviewer comments, the findings may eventually be useful in supporting the idea that nuclear cytoplasmic transport defects occur prior to plaque deposition in this disease model and may be caused by Alzheimer's disease pathology. However, even after revision, the work suffers from overinterpretation of some of the data and remains incomplete in several respects.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors try to establish that there is an Abeta-dependent loss of nuclear pores early in Alzheimer's disease. To do so the authors compared different NUP proteins and assessed their function by analyzing nuclear leakage and resistance to induction of nuclear damage and the associated necroptosis. The authors use a mouse knockin for hAPP with familial Alzheimer's mutations to model amyloidosis related to Alzheimer's disease. Treatment with an inhibitor of beta-amyloid production partially rescued the loss of nuclear pore proteins in young KI neurons, implicating beta-amyloid in Nuclear Pore dysfunction, a mechanism already described in other neurodegenerative diseases but not in Alzheimer's disease.

      Comments on revised version:

      Upon careful review, some of the critical concerns raised have yet to be fully addressed (the authors did not adequately address the two points of my public review or 5 of my 7 recommendation points), particularly regarding the effects of maturation stage or age. This has negatively impacted my initial enthusiasm for the paper, as the current approach does not fully capture the role of nuclear pore dysfunction in Alzheimer's disease, which is intimately dependent on aging. Here are specific recommendations for further revision:

      (1) The manuscript would benefit from a clearer acknowledgement of the limitations concerning the effects of maturation or age. I recommend removing mentions of the effect of time, for example:

      (i) Line 1 "4: "By using brain tissues and primary neurons cultured from App KI and wildtype (WT) mice, we observed a loss of NPCs in neuronal nuclei over time. "

      (ii) Line 20 "13: "Similarly, in neuron cocultures, there was an increase in intracellular Aβ levels over WT neurons that parallels the reduction of NUPs as neurons mature from DIV "-28. "

      (2) The subheading in the Discussion section, "Age-dependent decline in nuclear function during normal aging and in AD," could be more accurately retitled "Nuclear function decline in AD" to avoid suggesting age dependence without the requisite data.

      (3) Because primary neurons differentiate, mature, and age with time in culture, it is necessary to control for the developmental stage of your cultures. Please include control data that support the cultures' maturation stage, such as staining for axodendritic markers (e.g., MAP2), glial cell distribution (e.g., GFAP), and the balance of excitatory vs. inhibitory neuronal subpopulations (e.g., Gad65). These data are crucial for substantiating the culture conditions and the resulting interpretations.

    1. Author response:

      The following is the authors’ response to the original reviews.

      We thank both reviewers for their supportive comments. Reviewer 1 has suggested a different data processing strategy to better resolve subunits at the CALHM4/CALHM2 interface:

      I recommend an alternative data processing strategy. First, refine particles with 2-4 CALHM4 subunits with symmetry imposed. This is followed by symmetry expansion, signal subtraction of two adjacent subunits, and subsequent classification and refinement of the subtracted particles. This approach, while not guaranteed, can potentially provide a clearer definition of CALHM2 and CALHM4 interfaces and show whether CALHM2 subunits adopt different conformations based on their proximity to CALHM4 subunits.

      We have followed the recommended strategy in an attempt to improve the resolution and better resolve the structural heterogeneity in CALHM2/4 channels. To this end, we have combined symmetry expansion and partial signal subtraction, as suggested by the reviewer. Initially, a symmetrized (C11) 3.4 Å consensus map of undecameric CALHM2/4 channels bound to sybodies SbC2 and SbC4 was used. The particles of this reconstruction were subjected to symmetry expansion (C11) followed by signal subtraction of nine adjacent subunits. Next, we performed focused, alignment-free 3D classification of the remaining two subunits followed by refinement of these classes, leading to the classification of CALHM subunit pairs. The majority of the classes feature well-resolved CALHM2 pairs, consistent with the original approach (Author response image 1A). A minority of the classes contain CALHM4 subunits, revealing heterogeneity similar to regions of CALHM4 subunits observed in the non-symmetrized channel reconstruction (Author response image 1B). Unfortunately, this approach thus did not improve resolution or facilitate a more accurate subunit assignment. Consequently, we decided not to include these attempts in our manuscript. The resubmitted version thus contains only small corrections compared to the previous version.

      Author response image 1.

      Classification of subunit pairs of undecameric CALHM2/4 channels bound to sybodies SbC2 and SbC4 after the processing combining symmetry expansion and partial signal subtraction. (A) Classes showing CALHM2 subunit pairs. (B) Classes showing subunits at interfaces to CALHM4.

    1. Reviewer #1 (Public Review):

      Summary:

      The authors analyzed the bacterial colonization of human sperm using 16S rRNA profiling. Patterns of microbiota colonization were subsequently correlated with clinical data, such as spermiogram analysis, the presence of reactive oxygen species (ROS), and DNA fragmentation. The authors identified three main clusters dominated by Streptococcus, Prevotella, and Lactobacillus & Gardnerella, respectively, which aligns with previous observations. Specific associations were observed for certain bacterial genera, such as Flavobacterium and semen quality. Overall, it is a well-conducted study that further supports the importance of the seminal microbiota.

      Strengths:

      - The authors performed the analysis on 223 samples, which is the largest dataset in semen microbiota analysis so far.
      - Inclusion of negative controls to control contaminations.
      - Inclusion of a positive control group consisting of men with proven fertility.

      Weaknesses:

      - The manuscript needs comprehensive proofreading for language and formatting. In many instances, spaces are missing or not required.
      - Could the authors explore correlation network analyses to get additional insights into the structure of different clusters?
      - The GitHub link is not correct.
      - It is not possible to access the dataset on ENA.
      - Add the graphs obtained with decontam analysis as a supplementary figure.
      - There is nothing about the RPL group in the results section, while the authors discuss this issue in the introduction. What about the controls with proven fertility?
      - While correctly stated in the title, the term microbiota should be used throughout the manuscript instead of "microbiome".

    1. eLife assessment

      This valuable study reports a chemogenetic screen for resistance and sensitivity towards three compounds that inhibit cell cycle progression: camptothecin, colchicine, and palbociclib. Following up on the palbociclib results, the authors provide solid evidence that knockdown of the PRC2.1 complex, likely through increasing D-type cyclin expression, confers resistance to palbociclib. The generality of the results would be improved by demonstrating the effect of PRC2.1 on cyclin expression and cell cycle progression in more than one cell line.

    2. Reviewer #1 (Public Review):

      The study by Longhurst et al. investigates the mechanisms of chemoresistance and chemosensitivity towards three compounds that inhibit cell cycle progression: camptothecin, colchicine, and palbociclib. Genome-wide genetic screens were conducted using the HAP1 Cas9 cell line, revealing compound-specific and shared pathways of resistance and sensitivity. The researchers then focused on novel mechanisms that confer resistance to palbociclib, identifying PRC2.1. Genetic and pharmacological disruption of PRC2.1 function, but not related PRC2.2, leads to resistance to palbociclib. The researchers then show that disruption of PRC2.1 function (for example, by MTF2 deletion), results in locus-specific changes in H3K27 methylation and increases in D-type cyclin expression. It is suggested that increased expression of D-type cyclins results in palbociclib resistance.

      Strengths:

      The results of this study are interesting and contribute insights into the molecular mechanisms of CDK4/6 inhibitors. Importantly, while CDK4/6 inhibitors are effective in the clinic, tumour recurrence is very high due to acquired resistance.

      Weaknesses:

      A key resistance mechanism is Rb loss, so it is important to understand if resistance conferred by PRC2.1 loss is mediated by Rb, and whether restoration of PRC2.1 function in Rb-deplete cells results in renewed palbociclib sensitivity. It is also important to understand the clinical implications of the results presented. The inclusion of these data would significantly improve the paper. However, besides some presentation issues and typos as described below, it is my opinion that the results are robust and of broad interest.

      Major questions:

      (1) Is the resistance to CDK4/6 inhibition conferred by mutation of MTF2 mediated by Rb?

      (2) Are mutations in PRC2.1 found in genetic analyses of tumour samples in patients with acquired resistance?

    3. Reviewer #2 (Public Review):

      Summary:

      Longhurst et al. assessed cell cycle regulators using a chemogenetic CRISPR-Cas9 screen in haploid human cell line HAP1. Besides known cell cycle regulators they identified the PRC2.1 subcomplex to be specifically involved in G1 progression, given that the absence of members of the complex makes the cells resistant to Palbociclib. They further showed that in HAP1 cells the PRC2.1, but not the PRC2.2 complex is important to repress the cyclins CCND1 and CCND2. This can explain the enhanced resistance to Palbociclib, a CDK4/6-Inhibitor, after PRC2.1 deletion.

      Strengths:

      The initial CRISPR screen is very interesting because it uses three distinct chemicals that disturb the cell cycle at various stages. This screen mostly identified known cell cycle regulators, which demonstrates the validity of the approach. The results can be used as a resource for future research.

      The most interesting outcome of the experiment is the finding that knockouts of the PRC2.1 complex make the cell resistant to palbociclib. In a further experiment, the authors focused on MTF2 and JARID2 as the main components of PRC2.1 and PRC2.2, respectively. Via extensive analyses, including genome-wide experiments, they confirmed that MTF2 is particularly important to repress the cyclins CCND1 and CCND2. The absence of MTF2 therefore leads to increased expression of these genes, sufficient to make the cell resistant to palbociclib. This result will likely be of wide interest to the community.

      Weaknesses:

      The main weakness of the manuscript is that the experiments were performed in only one cell line. To draw more general conclusions, it would be essential to confirm some of the results in other cell lines.

      In addition, some of the findings, such as the results from the CRISPR screen as well as the stronger impact of the MTF2 KO on H3K27me3 and gene expression (compared to JARID2 KO), are not unexpected, given that similar results were already obtained before by other labs.

    4. Reviewer #3 (Public Review):

      This study begins with a chemogenetic screen to discover previously unrecognized regulators of the cell cycle. Using a CRISPR-Cas9 library in HAP1 cells and an assay that scores cell fitness, the authors identify genes that sensitize or desensitize cells to the presence of palbociclib, colchicine, and camptothecin. These three drugs inhibit proliferation through different mechanisms, and with each treatment, expected and unexpected pathways were found to affect drug sensitivity. The authors focus the rest of the experiments and analysis on the polycomb complex PRC2, as the deletion of several of its subunits in the screen conferred palbociclib resistance. The authors find that PRC2, specifically a complex dependent on the MTF2 subunit, methylates histone 3 lysine 27 (H3K27) in promoters of genes associated with various processes including cell-cycle control. Further experiments demonstrate that Cyclin D expression increases upon loss of PRC2 subunits, providing a potential mechanism for palbociclib resistance.

      The strengths of the paper are the design and execution of the chemogenetic screen, which provides a wealth of potentially useful information. The data convincingly demonstrate in the HAP1 cell line that the MTF2-PRC2 complex sustains the effects of palbociclib (Figure 4), methylates H3K27 in CpG-rich promoters (Figure 5), and represses Cyclin D expression (Figure 6). These results could be of great interest to those studying cell-cycle control, resistance mechanisms to therapeutic cell-cycle inhibitors, and chromatin regulation and gene expression.

      There are several weaknesses that limit the overall quality and potential impact of the study. First, none of the results from the colchicine and camptothecin screens (Figures 1 and 2) are experimentally validated, which lessens the rigor of those data and conclusions. Second, all experiments validating and further exploring results from the palbociclib screen are restricted to the HAP1 cell line, so the reproducibility and generality of the results are not established. While it is reasonable to perform the initial screen to generate hypotheses in the HAP1 line, other cancer and non-transformed lines should be used to test further the validity of conclusions from data in Figures 4-6. Third, conclusions drawn from data in Figures 3D and 4D are not fully supported by the experimental design or results. Finally, there have been other similar chemogenetic screens performed with palbociclib, most notably the study described by Chaikovsky et al. (PMID: 33854239). Results here should be compared and contrasted to other similar studies.

    1. eLife assessment

      The authors explore ER stress signalling mediated by ATF6 using a genome-wide gene depletion screen. They find that the ER chaperone Calreticulin binds and directly represses ATF6; this proposed function for Calreticulin is intriguing and constitutes an important finding. The evidence presented is based on CHO genetic evidence and biochemical results and is convincing.

    2. Reviewer #1 (Public Review):

      Summary:

      In this manuscript, Tung and colleagues identify Calreticulin as a repressor of ATF6 signaling using a CRISPR screen and characterize the functional interaction between ATF6 and CALR.

      Strengths:

      The manuscript is well written and interesting with an innovative experimental design that provides some new mechanistic insight into ATF6 regulation as well as crosstalk with the IRE1 pathway. The methods used were fit for purpose and reasonable conclusions were drawn from the data presented. Findings are novel and bring together glycoprotein quality control and activation of one sensor of the UPR. This is a novel perspective on how the integration of ER homeostasis signals could be sensed in the ER.

      Weaknesses:

      Several points remain to be documented to support the authors' model.

    3. Reviewer #2 (Public Review):

      Summary:

      In this study, the authors set out to use an unbiased CRISPR/Cas9 screen in CHO cells to identify genes encoding proteins that either increase or repress ATF6 signaling.

      Strengths:

      The strengths of the paper include the thoroughness of the screens, the use of a novel, double ATF6/IRE1 UPR reporter cell line, and follow-up detailed experiments on two of the findings in the screens, i.e. FURIN and CRT, to test the validity of involvement of each as direct regulators of ATF6 signaling. Additional strengths are the control experiments that validate the ATF6 specificity of the screens, as well as, for CRT, the finding of focus, determining roles for the glycosylation and cysteines in ATF6 as mechanistically involved in how CRT represses ATF6, at least in CHO cells.

      Weaknesses:

      The weaknesses of the paper are that the authors did not describe why they focused only on the top 100 proteins in each list of ATF6 activators and repressors. Additionally, there were a few methodology items missing, such as the location of the insertion site of the XBP1::mCherry reporter in the CHO cell genome. Since the authors go to great lengths to insert the other reporter for ATF6 activation in a "safe harbor" location, it leads to questions about whether the XBP1::mCherry reporter insertion is truly innocuous. An additional weakness is that the evidence for the physical interaction between ATF6LD and CRT is not strong, being dependent mainly on a single IP/IB experiment in Figure 4C that comprises only 1 lane on the gel for each of the test cases. Moreover, while that figure suggests that the interaction between CRT and ATF6 is decreased by mutating out the glycosylation sites in the ATF6LD, the BLI experiment in the same figure, 4B, suggests that there are no differences in the affinities of CRT for ATF6LD WT, deltaGly and deltaCys. An additional detail is that I found Figure 6A to be difficult to interpret, and that 6B was required in order for me to best evaluate the points being made by the authors in this figure.

      Overall, I believe that this work will positively impact the field as it provides a list of potential regulators of ATF6 activation and repression that others will be able to use as a launch point for discovering such interactions in cells and tissues of interest beyond CHO cells. However, I agree with the authors that these findings were in CHO cell lines and that it is possible, if not likely, that some of the interactions they found will be cell type/line specific.

    1. Reviewer #1 (Public Review):

      Summary:

      The manuscript "Engineering of PAClight1P78A: A High-Performance Class-B1 GPCR-Based Sensor for PACAP1-38" by Cola et al. presents the development of a novel genetically encoded sensor, PAClight1P78A, based on the human PAC1 receptor. The authors provide a thorough in vitro and in vivo characterization of this sensor, demonstrating its potential utility across various applications in life sciences, including drug development and basic research.

      The diverse methods to validate PAClight1P78A demonstrate a comprehensive approach to sensor engineering by combining biochemical characterization with in vivo studies in rodent brains and zebrafish. This establishes the sensor's biophysical properties (e.g., sensitivity, specificity, kinetics, and spectral properties) and demonstrates its functionality in physiologically relevant settings. Importantly, the inclusion of control sensors and the testing of potential intracellular downstream effects such as G-protein activation underscore a careful consideration of specificity and biological impact.

      Strengths:

      The fundamental development of PAClight1P78A addresses a significant gap in sensors for Class-B1 GPCRs. The iterative design process, starting from PAClight0.1 to the final PAClight1P78A variant, demonstrates compelling optimization. The innovative engineering results in a sensor with a high apparent dynamic range and excellent ligand selectivity, representing a significant advancement in the field. The rigorous in vitro characterization, including dynamic range, ligand specificity, and activation kinetics, provides a critical understanding of the sensor's utility. Including in vivo experiments in mice and zebrafish larvae demonstrates the sensor's applicability in complex biological systems.

      Weaknesses:

      The manuscript shows that the sensor fundamentally works in vivo, albeit in a limited capacity. The titration curves show sensitivity in the nmol range at which endogenous detection might be possible. However, perhaps the sensor is not sensitive enough or there are not any known robust paradigms for PACAP release. A more detailed discussion of the sensor's limitations, particularly regarding in vivo applications and the potential for detecting endogenous PACAP release, would be helpful.

      There are several experiments with an n=1 and other low single-digit numbers. I assume that refers to biological replicates such as mice or culture wells, but it is not well defined. n=1 in experimental contexts, particularly in Figure 1, raises significant concerns about the exact dynamic range of the sensor, data reproducibility, and the robustness of conclusions drawn from these experiments. Also, ROI for cell cultures, like in Figure 1, is not well defined. The Methods mention that ROIs were manually selected, which appears very selective, and the values in Figure 1c become unnecessarily questionable. The lack of definition for "ROI" is confusing. Do ROIs refer to cells, specific locations on the cell membrane, or groups of cells? It would be best if the authors could use unbiased methods for image analysis that include the majority of responsive areas or provide an explanation of why certain ROIs are included or excluded.

    2. Reviewer #2 (Public Review):

      Summary:

      The PAClight1 sensor was developed using an approach successful for the development of other fluorescence-based GPCR sensors, which is the complete replacement of the third intracellular loop of the receptor with a circularly-permuted green fluorescent protein. When expressed in HEK cells, this sensor showed good expression and a weak but measurable response to the extracellular presence of PACAP1-38 (a ΔF/Fo of 43%). Additional mutation near the site of insertion of the linearized GFP, at the C-terminus of the receptor, and within the second intracellular loop produced a final optimized sensor with ΔF/Fo of >1000%. Finally, screening of mutational libraries that also included alterations in the extracellular ligand-binding domain of the receptor yielded a molecule, PAClight1P78A, that exhibited a high ligand-dependent fluorescence response combined with a high differential sensitivity to PACAP (EC50 30 nM based on cytometric sorting of stably transfected HEK293 cells) compared to its congener VIP, (with which PACAP shares two highly related receptors, VPAC1 and VPAC2) as well as several unrelated neuropeptides, and significantly slowed activation kinetics by PACAP in the presence of a 10-fold molar excess of the PAC1 antagonist PACAP6-38. A structurally highly similar control construct, PAClight1P78Actl, showed correspondingly similar basal expression in HEK293 cells, but no PACAP-dependent enhancement in fluorescent properties.

      PAClight1P78A was expressed in neurons of the mouse cortex via AAV9.hSyn-mediated gene transduction. Slices taken from PAClight1P78A-transfected cortex, but not slices taken from PAClight1P78Actl-transfected cortex exhibited prompt and persistent elevation of ΔF/Fo after 2 minutes of perfusion with PACAP1-38 which persisted for up to 14 minutes and was statistically significant after perfusion with 3000, but not 300 or 30 nM, of peptide. Likewise, microinfusion of 200 nL of 300 uM PACAP1-38 into the cortex of optical fiber-implanted freely moving mice elicited a ΔF/Fo (%) of greater than 15, and significantly higher than that elicited by application of similar concentrations of VIP, CRF, or enkephalin, or vehicle alone. In vivo experiments were carried out in zebrafish larvae by the introduction of PAClight1P78A into single-cell stage Danio rerio embryos using a Tol2 transposase-based plasmid with a UAS promoter via injection (of plasmid and transposase mRNA), and sorting of post-fertilization embryos using a marker for transgenesis carried in the UAS : PAClight1P78A construct. Expression of PAClight1P78A was directed to cells in the olfactory bulb which express the fish paralog of the human PAC1 receptor by using the Tg(GnRH3:gal4ff) line, and fluorescent signals were elicited by intracerebroventricular administration of PACAP1-38 at a single concentration (1 mM), which were specific to PACAP and to the presence of PAClight1P78A per se, as controlled by parallel experiments in which PAClight1P78Actl instead of PAClight1P78A was contained in the transgenic plasmid.

      Major strengths and weaknesses of the methods and results:

      The report represents a rigorous demonstration of the elicitation of fluorescent signals upon pharmacological exposure to PACAP in nervous system tissue expressing PAClight1P78A in both mammals (mice) and fish (zebrafish larvae). Figure 4d shows a change in GFP fluorescence activation by PACAP occurring several seconds after the cessation of PACAP perfusion over a two-minute period, and its persistence for several minutes following. One wonders if one is apprehending the graphical presentation of the data incorrectly, or if the activation of fluorescence efficiency by ligand presentation is irreversible in this context, in which case the utility of the probe as a real-time indicator, in vivo, of released peptide might be diminished.

      Appraisal of achievement of aims, and data support of conclusions:

      Small cavils with controls are omitted for clarity; the larger issue of appraisal of results based on the scope of the designed experiments is discussed in the section below. An interesting question related to the time dependence of the PACAP-elicited activation of PAClight1P78A is its onset and reversibility, and additional data related to this would be welcome.

      Discussion of the impact of the work, and utility of the methods and data:

      Increasingly, neurotransmitter function may be observed in vivo, rather than by inferring in vivo function from in vitro, in cellulo, or ex vivo experimentation. This very valuable report discloses the invention of a genetically encoded sensor for the class B1 GPCR PAC1. PAC1 is the major receptor for the neuropeptide PACAP, which in turn is a major neurotransmitter involved in brain response to psychogenic stress, or threat, in vertebrates as diverse as mammals and fishes. If this sensor possesses the sensitivity to detect endogenously released PACAP in vivo, it will indeed be an impactful tool for understanding PACAP neurotransmission (and indeed PACAP action in general, in immune and endocrine compartments as well) in future experiments.

      However, the sensor has not yet been used to detect endogenously released PACAP. Until this has been done, one cannot answer the question as to whether the levels of exogenously perfused/administered PACAP used here merely to calibrate the sensor's sensitivity are indeed unphysiologically high. If endogenous PACAP levels don't get that high, then the sensor will not be useful for its intended purpose. The authors should address this issue and allude to what kind of experiments would need to be done in order to detect endogenous PACAP release in living tissue in intact animals. The authors could comment upon the success of other GPCR sensors that have been used to observe endogenous ligand release, and where along the pathway to becoming a truly useful reagent this particular sensor is.

    3. Reviewer #3 (Public Review):

      Summary:

      The manuscript introduces PAClight1P78A, a novel genetically encoded sensor designed to facilitate the study of class-B1 G protein-coupled receptors (GPCRs), focusing on the human PAC1 receptor. Addressing the significant challenge of investigating these clinically relevant drug targets, the sensor demonstrates a high dynamic range, excellent ligand selectivity, and rapid activation kinetics. It is validated across a variety of experimental contexts including in vitro, ex vivo, and in vivo models in mice and zebrafish, showcasing its utility for high-throughput screening, basic research, and drug development efforts related to GPCR dynamics and pharmacology.

      Strengths:

      The innovative design of PAClight1P78A successfully bridges a crucial gap in GPCR research by enabling real-time monitoring of receptor activation with high specificity and sensitivity. The extensive validation across multiple models emphasizes the sensor's reliability and versatility, promising significant contributions to both the scientific understanding of GPCR mechanisms and the development of novel therapeutics. Furthermore, by providing the research community with detailed methodologies and access to the necessary viral vectors and plasmids, the authors ensure the sensor's broad applicability and ease of adoption for a wide range of studies focused on GPCR biology and drug targeting.

      Weaknesses:

      To further strengthen the manuscript and validate the efficacy of PAClight1P78A as a selective PACAP sensor, it is crucial to demonstrate the sensor's ability to detect endogenous PACAP release in vivo under physiological conditions. While the current data from artificial PACAP application in mouse brain slices and microinfusion in behaving mice provide foundational insights into the sensor's functionality, these approaches predominantly simulate conditions with potentially higher concentrations of PACAP than naturally occurring levels.

      Although the sensor's specificity for the PAC1 receptor and its primary ligand is a pivotal achievement, exploring its potential application to other GPCRs within the class-B1 family or broader categories could enhance the manuscript's impact, suggesting ways to adapt this technology for a wider array of receptor studies. Additionally, while the sensor's performance is convincingly demonstrated in short-term experiments, insights into its long-term stability and reusability in more prolonged or repeated measures scenarios would be valuable for researchers interested in chronic studies or longitudinal behavioral analyses. Addressing these aspects could broaden the understanding of the sensor's practical utility over extended research timelines.

      Furthermore, the current in vivo experiments involving microinfusion of PACAP near sensor-expressing areas in behaving mice are based on a relatively small sample size (n=2), which might limit the generalizability of the findings. Increasing the number of subjects in these experimental groups would enhance the statistical power of the results and provide a more robust assessment of the sensor's in vivo functionality. Expanding the sample size will not only validate the findings but also address potential variability within the population, thereby reinforcing the conclusions drawn from these crucial experiments.