    1. Reviewer #2 (Public Review):

      Summary:

      In this article, the authors study the function of TEDC1 and TEDC2, two proteins previously reported to interact with TUBD1 and TUBE1. Previous work by the same group had shown that TUBD1 and TUBE1 are required for centriole assembly and that human cells lacking these proteins form abnormal centrioles that have only singlet microtubules and that disintegrate in mitosis. In this new work, the authors demonstrate that TEDC1 and TEDC2 depletion results in the same phenotype, with abnormal centrioles that also disintegrate in mitosis. In addition, they were able to localize these proteins to the proximal end of the centriole, a result not previously achieved with TUBD1 and TUBE1, providing a better understanding of where and when the complex is involved in centriole growth.

      Strengths:

      The results are very convincing, particularly the phenotype, which is the same as previously observed for TUBD1 and TUBE1. The U-ExM localization is also convincing: despite a signal that's not very homogeneous, it's clear that the complex is in the proximal region of the centriole and procentriole. The phenotype observed in U-ExM on the elongation of the cartwheel is also spectacular and opens the question of the regulation of the size of this structure. The authors also report convincing results on direct interactions between TUBD1, TUBE1, TEDC1, and TEDC2, and an intriguing structural prediction suggesting that TEDC1 and TEDC2 form a heterodimer that interacts with the TUBD1-TUBE1 heterodimer.

      Weaknesses:

      The phenotypes observed in U-ExM on cartwheel elongation merit further quantification, enabling the field to appreciate better what is happening at the level of this structure.

    2. Reviewer #3 (Public Review):

      Summary:

      Human cells deficient in delta-tubulin or epsilon-tubulin form unstable centrioles, which lack triplet microtubules and undergo a futile formation and disintegration cycle. In this study, the authors show that human cells lacking the associated proteins TEDC1 or TEDC2 have these identical phenotypes. They use genetics to knock out TEDC1 or TEDC2 in p53-negative RPE-1 cells and expansion microscopy to structurally characterize mutant centrioles. Biochemical methods and AlphaFold-multimer prediction software are used to investigate interactions between tubulins and TEDC1 and TEDC2.

      The study shows that mutant centrioles are built only of A tubules, which elongate and extend their proximal region, fail to incorporate structural components, and finally disintegrate in mitosis. In addition, they demonstrate that delta-tubulin or epsilon-tubulin and TEDC1 and TEDC2 form one complex and that TEDC1 and TEDC2 can interact independently of tubulins. Finally, they show that the localization of the four proteins is mutually dependent.

      Strengths:

      The results presented here are mostly convincing, the study is exciting and important, and the manuscript is well-written. The study shows that delta-tubulin, epsilon-tubulin, TEDC1, and TEDC2 function together to build a stable and functional centriole, significantly contributing to the field and our understanding of the centriole assembly process.

      Weaknesses:

      The ultrastructural characterization of TEDC1 and TEDC2 obtained by U-ExM is inconclusive. Improving the quality of the signals is paramount for this manuscript.

    1. eLife assessment

      This useful study explores the relationship between the sequence of prokaryotic promoter elements and their activity using mutagenesis to generate thousands of mutant sequences. The evidence supporting these findings is incomplete, and would benefit from additional experiments, clarification of methods, and a more detailed discussion of related literature. This work will appeal to those interested in bacterial genetics, genome evolution, and gene regulation.

    2. Reviewer #1 (Public Review):

      Summary:

      This study by Fuqua et al. studies the emergence of sigma70 promoters in bacterial genomes. While there have been several studies to explore how mutations lead to promoter activity, this is the first to explore this phenomenon in a wide variety of backgrounds, which notably contain a diverse assortment of local sigma70 motifs in variable configurations. By exploring how mutations affect promoter activity in such diverse backgrounds, they are able to identify a variety of anecdotal examples of gain/loss of promoter activity and propose several mechanisms for how these mutations interact within the local motif landscape. Ultimately, they show how different sequences have different probabilities of gaining/losing promoter activity and may do so through a variety of mechanisms.

      Major strengths and weaknesses of the methods and results:

      This study uses Sort-Seq to characterize promoter activity, which has been adopted by multiple groups and shown to be robust. Furthermore, they use a slightly altered protocol that allows measurements of bi-directional promoter activity. This, combined with their pooling strategy, allows them to characterize expression from many different backgrounds in both directions at extremely high throughput, which is impressive! A second key approach this study relies on is the identification of promoter motifs using position weight matrices (PWMs). While these methods are prone to false positives, the authors implement a systematic approach which is standard in the field. However, drawing these types of binary definitions (is this a motif? yes/no) should always come with the caveat that gene expression is a quantitative trait that we oversimplify when drawing boundaries.

      Their approach to randomly mutagenizing promoters allowed them to find many anecdotal examples of different types of evolution that may occur to increase or decrease promoter activity. However, the lack of validation of these phenomena in more controlled backgrounds may require us to further scrutinize their results. That is, their explanations for why certain mutations lead to or obviate promoter activity may be due to interactions with other elements in the 'messy' backgrounds, rather than what is proposed.
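
      To make the PWM caveat concrete, here is a minimal sketch of log-odds PWM scanning with a hard score cutoff. It is not the authors' pipeline: the aligned sites, the pseudocount, the uniform background, and the threshold are all assumptions invented for illustration.

```python
import math

BASES = "ACGT"

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds position weight matrix from aligned, equal-length sites."""
    length = len(sites[0])
    pwm = []
    for i in range(length):
        column = [site[i] for site in sites]
        total = len(column) + 4 * pseudocount
        pwm.append({
            b: math.log2(((column.count(b) + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm

def score(pwm, window):
    """Sum the per-position log-odds scores for one window."""
    return sum(col[base] for col, base in zip(pwm, window))

def scan(pwm, sequence, threshold):
    """Return (position, score) for every window scoring at or above threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        s = score(pwm, sequence[start:start + width])
        if s >= threshold:
            hits.append((start, s))
    return hits

# Hypothetical aligned -10-like hexamers, invented for illustration only.
toy_sites = ["TATAAT", "TATGAT", "TACAAT", "TATACT", "GATAAT"]
pwm = build_pwm(toy_sites)
sequence = "GGCGTATAATGCC"
hits = scan(pwm, sequence, threshold=4.0)  # threshold is an arbitrary cutoff
```

      The cutoff turns a continuous score into the yes/no call discussed above: windows scoring just under the threshold are discarded even though they may still contribute to expression.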

      An appraisal of whether the authors achieved their aims, and whether the results support their conclusions:

      The authors express a key finding that the specific landscape of promoter motifs in a sequence affects the likelihood that local mutations create or destroy regulatory elements. The authors have described many examples, including several that are non-obvious, and show convincingly that different sequence backgrounds have different probabilities for gaining or losing promoter activity. While this overarching conclusion is supported by the manuscript, the proposed mechanisms for explaining changes in promoter activity are not sufficiently validated to be taken for absolute truth. There is not sufficient description of the strength of emergent promoter motifs or their specific spacings from existing motifs within the sequence. Furthermore, they do not define a systematic process by which mutations are assigned to different categories (e.g. box shifting, tandem motifs, etc.) which may imply that the specific examples are assigned based on which is most convenient for the narrative.

      Impact of the work on the field, and the utility of the methods and data to the community:

      From this study, we are more aware of different types of ways promoters can evolve and devolve, but do not have a better ability to predict when mutations will lead to these effects. Recent work in the field of bacterial gene regulation has raised interest in bidirectional promoter regions. While the authors do not discuss how mutations that raise expression in one direction may affect another, they have created an expansive dataset that may enable other groups to study this interesting phenomenon. Also, their variation of the Sort-Seq protocol will be a valuable example for other groups who may be interested in studying bidirectional expression. Lastly, this study may be of interest to groups studying eukaryotic regulation as it can inform how the evolution of transcription factor binding sites influences short-range interactions with local regulatory elements.

      Any additional context to understand the significance of the work:

      The task of computationally predicting whether a sequence drives promoter activity is difficult. By learning what types of mutations create or destroy promoters from this study, we are better equipped for this task.

    3. Reviewer #2 (Public Review):

      Summary:

      Fuqua et al investigated the relationship between prokaryotic box motifs and the activation of promoter activity using a mutagenesis sequencing approach. From generating thousands of mutant daughter sequences from both active and non-active promoter sequences they were able to produce a fantastic dataset to investigate potential mechanisms for promoter activation. From these large numbers of mutated sequences, they were able to generate mutual information with gene expression to identify key mutations relating to the activation of promoter island sequences.

      Strengths:

      The data generated from this paper is an important resource to address this question of promoter activation. Being able to link the activation of gene expression to mutational changes in previously nonactive promoter regions is exciting and allows the potential to investigate evolutionary processes relating to gene regulation in a statistically robust manner. Alongside this, the method of identifying key mutations using mutual information in this paper is well done and should be standard in future studies for identifying regions of interest.
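
      As an illustration of the mutual-information idea, the following sketch computes, for each sequence position, the mutual information between the base at that position and a discrete expression bin. The daughter sequences and bins are toy values, and the plug-in estimator shown here is an assumption; the manuscript may bin expression and correct for sampling bias differently.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Plug-in mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * math.log2(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi

def positional_mi(sequences, expression_bins):
    """Mutual information between each position's base and the expression bin."""
    length = len(sequences[0])
    return [
        mutual_information([seq[i] for seq in sequences], expression_bins)
        for i in range(length)
    ]

# Toy daughter sequences and expression bins, invented for illustration only.
daughters = ["ACGT", "ACGA", "TCGT", "TCGA", "ACGT", "TCGA"]
bins = ["low", "low", "high", "high", "low", "high"]
profile = positional_mi(daughters, bins)
```

      In this toy set the first position separates the two bins perfectly, the invariant positions carry no information, and the last position carries only a little. Peaks in such a profile are what would be read as candidate hotspots.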

      Weaknesses:

      While the generation of the data is superb, the focus only on these mutational hotspots removes a lot of the information available to the authors to generate robust conclusions. For instance:

      (1) The linear regression in S5 used to demonstrate that the number of mutational hotspots correlates with the likelihood of a mutation causing promoter activation is driven by three extreme points.

      (2) Many of the arguments also rely on the number of mutational hotspots being located near box motifs. The context-dependent likelihood of this occurring is not taken into account, given that these sequences are inherently box-motif rich. Something like an enrichment test would help to identify how likely these hotspots are to form in or next to motifs.

      (3) The link between changes in expression and mutations in surrounding motifs is assessed with two-sided Mann-Whitney U tests. This method assumes that the sequence motifs are independent of one another, but the hotspots of interest occur in groups of 0, 3, 4, or 5 per sequence. There is therefore no sequence in which these hotspots can be independent, and the correlation-causation argument linking motif changes to expression is weakened.

      (4) The distance between -10 and -35 was mentioned briefly but not taken into account in the analysis.
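
      The enrichment test suggested in point (2) could take several forms. One simple option, sketched here under stated assumptions, is a one-sided hypergeometric test asking how often at least the observed number of hotspots would land in motif-covered positions if hotspots were placed at random. The function name, the parameters, and the counts are hypothetical, and the test treats hotspot positions as independent draws, which may not hold if hotspots cluster.

```python
from math import comb

def hypergeom_enrichment_p(total_positions, motif_positions,
                           hotspots, hotspots_in_motifs):
    """One-sided P(X >= hotspots_in_motifs) under random hotspot placement.

    total_positions: number of positions in the sequence
    motif_positions: positions lying in or next to a predicted motif
    hotspots: number of mutational hotspots observed
    hotspots_in_motifs: hotspots that fall on motif positions
    """
    denom = comb(total_positions, hotspots)
    upper = min(hotspots, motif_positions)
    tail = sum(
        comb(motif_positions, k)
        * comb(total_positions - motif_positions, hotspots - k)
        for k in range(hotspots_in_motifs, upper + 1)
    )
    return tail / denom

# Hypothetical counts: 150 bp sequence, 5 hotspots, 4 of them in motifs.
p_motif_rich = hypergeom_enrichment_p(150, 90, 5, 4)    # 60% of positions in motifs
p_motif_sparse = hypergeom_enrichment_p(150, 30, 5, 4)  # 20% of positions in motifs
```

      The same observation (4 of 5 hotspots in motifs) is far less surprising in the motif-rich background than in the sparse one, which is the context dependence raised in the comment.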

      The authors propose mechanisms of promoter activation based on a few observations that are treated independently but occur concurrently. Addressing this with complementary approaches would be more robust: for example, an analysis focused on identifying important motifs, using something like a GLM lasso regression to identify significant motifs, combined with the mutational hotspot information. Other elements known to be involved in promoter activation, including TGn or UP elements, were not investigated or discussed.

    4. Reviewer #3 (Public Review):

      Summary:

      Like many papers in the last 5-10 years, this work brings a computational approach to the study of promoters and transcription, but unfortunately disregards or misrepresents much of the existing literature and makes unwarranted claims of novelty. My main concerns with the current paper are outlined below although the problems are deeply embedded.

      Strengths:

      The data could be useful if interpreted properly, taking into account i) the role of translation ii) other promoter elements, and iii) the relevant literature.

      Weaknesses:

      (1) Incorrect assumptions and oversimplification of promoters.

      - There is a critical error on line 68 and Figure 1A. It is well established that the -35 element consensus is TTGACA but the authors state TTGAAA, which is also the sequence represented by the sequence logo shown and so presumably the PWM used. It is essential that the authors use the correct -35 motif/PWM/consensus.

      - Likely, the authors have made this mistake because they have looked at DNA sequence logos generated from promoter alignments anchored by either the position of the -10 element or the transcription start site (TSS), most likely the latter. The distance between the TSS and -10 varies. Fewer than half of E. coli promoters have the optimal 7 bp separation, with distances of 8, 6, and 5 bp not being uncommon (PMID: 35241653). Furthermore, the distance between the -10 and -35 elements is also variable (16, 17, and 18 bp spacings are all frequently found, PMID: 6310517). This means that alignments used to generate sequence logos have misaligned -35 hexamers. Consequently, the true consensus is not represented. If the alignment discrepancies are corrected, the true consensus emerges. This problem seems to permeate the whole study since this obviously incorrect consensus/motif has been used throughout to identify sequences that resemble -35 hexamers.

      - An uninformed person reading this paper would be led to believe that prokaryotic promoters have only two sequence elements: the -10 and -35 hexamers. This is because the authors completely ignore the role of the TG motif, UP element, and spacer region sequence. All of these can compensate for the lack of a strong -35 hexamer and it's known that appending such elements to a lone -10 sequence can create an active promoter (e.g. PMIDs 15118087, 21398630, 12907708, 16626282, 32297955). Very likely, some of the mutations, classified as not corresponding to a -10 or -35 element in Figure 2, target some of these other promoter motifs.

      - The model in Figure 4C is highly unlikely. There is no evidence in the literature that RNAP can hang on with one "arm" in this way. In particular, structural work has shown that sequence-specific interactions with the -10 element can only occur after the DNA has been unwound (PMID: 22136875). Further, -10 elements alone, even if a perfect match to the consensus, are non-functional for transcription. This is because RNAP needs to be directed to the -10 by other promoter elements, or transcription factors. Only once correctly positioned, can RNAP stabilise DNA opening and make sequence-specific contacts with the -10 hexamer. This makes the notion that RNAP may interact with the -10 alone, using only domain 2 of sigma, extremely unlikely.
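
      The consensus and spacing points raised under (1) can be illustrated with a small sketch that searches for a -35-like hexamer paired with a -10-like hexamer at one of several allowed spacings. The -35 consensus TTGACA and the 16 to 18 bp spacings are taken from the comments above, and TATAAT is the standard -10 consensus; the mismatch tolerance, the function names, and the test sequence are assumptions made for illustration. The sketch ignores the TG motif, UP element, and spacer sequence.

```python
MINUS35 = "TTGACA"  # -35 consensus as stated in the review
MINUS10 = "TATAAT"  # standard -10 consensus

def mismatches(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def find_promoter_like(sequence, spacers=(16, 17, 18), max_mismatch=2):
    """Return (-35 start, spacer, -35 mismatches, -10 mismatches) candidates."""
    hits = []
    for start in range(len(sequence) - 6 + 1):
        m35 = mismatches(sequence[start:start + 6], MINUS35)
        if m35 > max_mismatch:
            continue
        for spacer in spacers:
            s10 = start + 6 + spacer
            box10 = sequence[s10:s10 + 6]
            if len(box10) < 6:
                continue  # -10 window would run off the end of the sequence
            m10 = mismatches(box10, MINUS10)
            if m10 <= max_mismatch:
                hits.append((start, spacer, m35, m10))
    return hits

# Hypothetical test sequence: consensus boxes separated by a 17 bp GC spacer.
toy_spacer = "GC" * 8 + "G"
toy_sequence = "CC" + MINUS35 + toy_spacer + MINUS10 + "CC"
toy_hits = find_promoter_like(toy_sequence)

# The TTGAAA hexamer discussed above differs from TTGACA at one position.
ttgaaa_mismatches = mismatches("TTGAAA", MINUS35)
```

      Allowing the spacer to vary is the key design choice: scanning with a single fixed offset would miss pairs at the other common spacings, which is the same misalignment problem described for TSS-anchored sequence logos.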

      (2) Reinventing the language used to describe promoters and binding sites for regulators.

      - The authors needlessly complicate the narrative by using non-standard language. For example, on page 1 they define a motif as "a DNA sequence computationally predicted to be compatible with TF binding". They distinguish this from a binding site "because binding sites refer to a location where a TF binds the genome, rather than a DNA sequence". First, these definitions are needlessly complicated; why not just say "putative binding sites" and "known binding sites" respectively? Second, there is an obvious problem with the definitions: many "motifs" will also be "binding sites". In fact, by the time the authors state their definitions, they have already fallen foul of this conflation; in the prior paragraph they stated: "controlled by DNA sequences that encode motifs for TFs to bind". The same issue reappears throughout the paper.

      - The authors also use the terms "regulatory" and "non-regulatory" DNA. These terms are not defined by the authors and make little sense. For instance, I assume the authors would describe promoter islands lacking transcriptional activity (itself an incorrect assumption, see below) as non-regulatory. However, as horizontally acquired sections of AT-rich DNA these will all be bound by H-NS and subject to gene silencing, both promoters for mRNA synthesis and spurious promoters inside genes that create untranslated RNAs. Hence, regulation is occurring.

      - Line 63: "In prokaryotes, the primary regulatory sequences are called promoters". Promoters are not generally considered regulatory. Rather, it is adjacent or overlapping sites for TFs that are regulatory. There is a good discussion of the topic here (PMID: 32665585).

      (3) The authors ignore the role of translation.

      - The authors' assay does not measure promoter activity alone; this can only be tested by measuring the amount of RNA produced. Rather, the assay used measures the combined outputs of transcription and translation. If the DNA fragments they have cloned contain promoters with no appropriately positioned Shine-Dalgarno sequence then the authors will not detect GFP or RFP production, even though the promoter could be making an RNA (likely to be prematurely terminated by Rho, due to a lack of translation). This is known for promoters in promoter islands (e.g. Figure 1 in PMID: 33958766).

      - In Figure S6 it appears that there is a strong bias for mutations resulting in RFP expression to be close to the 3' end of the fragment. Very likely, this occurs because this places the promoter closer to RFP and there are fewer opportunities for premature termination by Rho.

      (4) Ignoring or misrepresenting the literature.

      - As alluded to above, promoter islands are large sections of horizontally acquired, high AT-content, DNA. It is well known that such sequences are i) packed with promoters driving the expression of RNAs that aren't translated ii) silenced, albeit incompletely, by H-NS and iii) targeted by Rho which terminates untranslated RNA synthesis (PMIDs: 24449106, 28067866, 18487194). None of this is taken into account anywhere in the paper and it is highly likely that most, if not all, of the DNA sequences the authors have used contain promoters generating untranslated RNAs.

      - The authors state that GC content does not correlate with the emergence of new promoters. It is known that GC content does correlate with the emergence of new promoters because promoters are themselves AT-rich DNA sequences (e.g. see Figure 1 of PMID: 32297955). There are two reasons the authors see no correlation in this work. First, the DNA sequences they have used are already very AT-rich (between 65 % and 78 % AT-content). Second, they have only examined a small range of different AT-content DNA (i.e. between 65 % and 78 %). The effect of AT-content on promoter emergence is most clearly seen at AT-content between around 40 % and 60 %. Above that level, the strong positive correlation plateaus.

      - Once these authors better include and connect their results to the previous literature, they can also add some discussion of how previous papers in recent years may have also missed some of this important context.

      (5) Lack of information about sequences used and mutations.

      - To properly assess the work, any reader will need access to the sequences cloned at the start of the work and to the positions of known TSSs within these sequences (ideally +/- H-NS, which will silence transcription in the chromosomal context but may not when the sequences are removed from their natural context and placed in a plasmid). Without this information, it is impossible to assess the validity of the authors' work.

      - The authors do not account for the possibility that DNA sequences in the plasmid, on either side of the cloned DNA fragment, could resemble promoter elements. If this is the case, then mutations in the cloned DNA will create promoters by "pairing up" with the plasmid sequences. There is insufficient information about the DNA sequences cloned, the mutations identified, or the plasmid, to determine if this is the case. It is possible that this also accounts for mutational hotspots described in the paper.

      (6) Overselling the conclusions.

      Line 420: The paper claims to have generated important new insights into promoters. At the same time, the main conclusion is that "Our study demonstrates that mutations to -10 and -35 boxes motifs are the primary paths to create new promoters and to modulate the activity of existing promoters". This isn't new or unexpected. People have been doing experiments showing this for decades. Of course, mutations that make or destroy promoter elements create and destroy promoters. How could it be any other way?

    1. eLife assessment

      This useful work provides a risk-prediction tool, in the form of a nomogram, for practitioners and elderly patients with non-metastatic colon cancer using data from the SEER registry. The unique contribution of this work is the focus on conditional survival. However, the underlying statistical approach is suboptimal and therefore incomplete, which substantially lessens the potential impact of this work. The analysis could use a more rigorous consideration of competing risks.

    2. Reviewer #1 (Public Review):

      Summary:

      This study assessed conditional survival in elderly patients with non-metastatic colon cancer who underwent colectomy. The study found that 5-year conditional overall survival rates exhibited a slight increase initially, followed by a decrease over time. In contrast, 5-year conditional colon-specific survival rates consistently improved over the same period. Nomograms were developed to predict survival probabilities at baseline and for patients surviving 1, 3, and 5 years post-diagnosis, with good predictive performance. The study concludes that conditional survival offers valuable insights into medium- and long-term survival probabilities for these patients.
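
      For readers unfamiliar with the quantity, conditional survival is the probability of surviving a further y years given survival to x years, CS(y | x) = S(x + y) / S(x). The sketch below applies this definition to a table of survival probabilities; the numbers are hypothetical and are not taken from the manuscript or from SEER.

```python
def conditional_survival(survival, survived_years, additional_years):
    """Return CS(y | x) = S(x + y) / S(x) from a table of survival probabilities.

    survival: dict mapping years since diagnosis to the survival probability S(t)
    """
    s_now = survival[survived_years]
    s_later = survival[survived_years + additional_years]
    if s_now <= 0:
        raise ValueError("S(x) must be positive to condition on survival to x")
    return s_later / s_now

# Hypothetical overall survival probabilities, invented for illustration only.
toy_survival = {0: 1.00, 1: 0.90, 3: 0.75, 5: 0.65, 6: 0.60, 8: 0.52, 10: 0.45}

# 5-year conditional survival at baseline and after surviving 1, 3, and 5 years.
cs5 = {x: conditional_survival(toy_survival, x, 5) for x in (0, 1, 3, 5)}
```

      Because the estimate is a ratio of two survival probabilities, any bias in the underlying survival curve carries through to the conditional estimate.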

      Strengths:

      The strengths of this study include robust study design, methodology, statistical analysis, and interpretation of the findings. Utilizing a well-known database for the analysis is another strength. Differentiating overall survival and colon-specific survival rates could be another one. Focusing on elderly patients with this condition is another major point. Providing nomograms for easier application of the findings in real-world clinical practice is a major strength of the study.

      Weaknesses:

      Relying on only one database of patients and narrowing down the population to only elderly patients who underwent colectomy could be mentioned as a weakness. Less generalizability of the findings for other populations and not including more diverse databases is a major weakness of this study. The predictive capabilities of the developed tools are good but fall short of excellent, which is another aspect that could be improved.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors assessed the conditional survival of elderly patients with non-metastatic colon cancer who had survived a certain length of time after colectomy. They used data from the Surveillance, Epidemiology, and End Results (SEER) registry to conduct a conditional survival analysis providing estimates of conditional survival rates as well as an analysis of which variables were most important for survival at baseline, one year, three years, and five years.

      Strengths:

      - The authors used SEER data, providing them with long-term follow-up, and thoroughly considered a wide range of variables related to cancer mortality.

      - The authors did a thorough job of assessing the predictive ability of their models.

      - The authors used conditional survival, providing estimates of survival that are meaningful for patients/physicians, making them useful for clinical practice.

      Weaknesses:

      - The paper would have benefited from a more thorough explanation of why the methods were improvements on existing approaches.

      - This study was primarily interested in cancer mortality, and compared it to the secondary outcome of death from any cause. The study would have benefited from modeling death from non-cancer causes (the competing risk) in addition to death from colon cancer, rather than comparing only to the composite endpoint of death from any cause.

      - When considering a cause-specific hazard, as done with cancer survival in this paper, it would be better to consider the cumulative incidence function rather than Kaplan-Meier, since it does not assume the independence of the events like Kaplan-Meier does. For this reason, the paper would benefit from focusing on the results of the adjusted cause-specific hazard models (rather than the unadjusted conditional survival estimates done using Kaplan-Meier estimates shown in Figure 1) and conducting a parallel analysis for death from other causes.

      - The authors mention that they consider disparities using a log-rank test. For the same reason as above, this is not the best approach when dealing with competing risks, as it depends on Kaplan-Meier curves. The log-rank test may be fine if there is no strong dependence between the two causes of death, but the paper would benefit from some discussion of that choice, or sensitivity analysis by comparison to other approaches.

      - The variables for the adjusted models were chosen with univariate Cox regression analysis, with any variables having a p-value less than 0.05 being included in the adjusted models. Another approach, which may have made the models more easily comparable, would be to choose the variables that are relevant based on prior literature and include them in the multivariate model regardless of significance. The paper would benefit from a discussion of what is gained by excluding some variables from some models.

    4. Reviewer #3 (Public Review):

      Summary:

      This article uses a subset of data from the SEER cancer registry to develop nomograms, a patient-facing risk prediction tool, for predicting overall and cancer-specific survival in elderly patients who underwent colectomy for the treatment of non-metastatic colon cancer. A unique contribution is the intent to provide conditional predictions, i.e. given that you have survived for x years from your diagnosis, what is your probability of survival for an additional y years? Although the goal is a useful one, the approach is unfortunately hampered by some important weaknesses.

      Strengths:

      Predicting conditional overall survival is a useful, patient-oriented goal.

      The data source is the high-quality SEER cancer registry.

      Weaknesses:

      Using Kaplan-Meier methodology to estimate the survival distribution for a time-to-event in the presence of another competing time-to-event (in this case: estimating colon-specific survival in the presence of death from other causes) will generally over-estimate the event rate. The reported colon-specific survival probabilities are probably biased downwards from their true values. See https://pubmed.ncbi.nlm.nih.gov/10204198/

      A similar concern would apply to the use of the cause-specific Cox model, and thus also the nomogram, to predict absolute (conditional) survival.
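
      The competing-risks concern can be seen in a small numerical sketch. It compares one minus the Kaplan-Meier estimate, computed with deaths from other causes treated as censoring, against the nonparametric cumulative incidence function (the Aalen-Johansen estimator) for the cause of interest. The records are toy data invented for illustration, and this is not a reanalysis of the manuscript's data.

```python
def km_and_cif(records, cause=1):
    """Compare 1 - KM with the cumulative incidence for one cause.

    records: list of (time, status); status 0 = censored, any other
             integer = death from that cause.
    Returns a list of (time, one_minus_km, cif) at each distinct event time.
    """
    event_times = sorted({t for t, s in records if s != 0})
    km_cause = 1.0  # KM for the cause of interest, competing deaths censored
    overall = 1.0   # all-cause KM just before the current time, S(t-)
    cif = 0.0       # cumulative incidence for the cause of interest
    out = []
    for t in event_times:
        at_risk = sum(1 for u, _ in records if u >= t)
        d_cause = sum(1 for u, s in records if u == t and s == cause)
        d_any = sum(1 for u, s in records if u == t and s != 0)
        cif += overall * d_cause / at_risk
        km_cause *= 1.0 - d_cause / at_risk
        overall *= 1.0 - d_any / at_risk
        out.append((t, 1.0 - km_cause, cif))
    return out

# Toy records: status 1 = cancer death, 2 = other death, 0 = censored.
toy_records = [(1, 2), (2, 1), (3, 2), (4, 1), (5, 2),
               (6, 1), (7, 0), (8, 1), (9, 2), (10, 1)]
curve = km_and_cif(toy_records)
final_time, final_one_minus_km, final_cif = curve[-1]
```

      In this toy example the Kaplan-Meier approach ends with an estimated cancer death probability of 1, whereas the cumulative incidence stays well below 1 because several patients died of other causes first. One minus the Kaplan-Meier estimate never falls below the cumulative incidence, which is the direction of bias described above.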

      The p-value-based methodology for determining which predictors should be included in the nomogram is rudimentary. More modern variable selection methods, e.g. the Lasso, would have been preferred.

      Related to the above comment, some predictors are present for the conditional survival nomogram for time t, then absent for time t+1, then present again for time t+2. Cancer site is an example of such a predictor. From a face validity perspective, this doesn't really make sense. Ideally, predictors would not enter, then leave, and then re-enter a model.

      Many observations were excluded due to missingness in predictors, e.g. >10000 were excluded due to unknown CEA (Supplementary Figure 1). Given the number of observations dropped due to missingness in the predictors, ideally an attempt would have been made to incorporate the partial information available in these data.

      Details are lacking on how the AUCs and Brier scores were calculated in the presence of censoring / competing events, which limits the reader's understanding of the results.

      It is not clear why a nomogram would be preferred to an online risk prediction calculator.

    1. eLife assessment

      The work is important and of potential value to areas other than the bone field because it supports a role and mechanism for beta-catenin that is novel and unusual. The findings are significant in that they support the presence of another anabolic pathway in bone that can be productively targeted for therapeutic goals. The data for the most part are convincing. The work could be strengthened by better characterizing the osteoclast KO of Malat1 related to the Lys cre model and by including biochemical markers of bone turnover from the mice.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors were trying to discover a novel bone remodeling network system. They found that the lncRNA Malat1 plays a central role in the remodeling by binding to β-catenin and functioning through the β-catenin-OPG/Jagged1 pathway in osteoblasts and chondrocytes. In addition, Malat1 significantly promotes bone regeneration in fracture healing in vivo. Their findings suggest a new concept of Malat1 function in the skeletal system. One significantly different finding between this manuscript and the competing paper pertains to the role of Malat1 in the osteoclast lineage, specifically, whether Malat1 functions intrinsically in the osteoclast lineage or not.

      Strengths:

      This study provides strong genetic evidence demonstrating that Malat1 acts intrinsically in osteoblasts while suppressing osteoclastogenesis in a non-autonomous manner, whereas the other group did not utilize relevant conditional knockout mice. As shown in the results, Malat1 knockout mice exhibited abnormal bone remodeling and turnover. Furthermore, they elucidated the molecular function of Malat1, which is sufficient to understand the phenotype in vivo.

      Weaknesses:

      Discussing the differences between the previous paper and their own findings would be highly informative and beneficial for the field, as it would help to clarify the underlying mechanisms.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors investigated the roles in bone homeostasis of the lncRNA Malat1, which was initially believed to be non-functional for physiology. They found that both Malat1 KO and conditional KO in the osteoblast lineage exhibit significant osteoporosis due to decreased osteoblast bone formation and increased osteoclast resorption. More interestingly, they found that deletion of Malat1 in osteoclast lineage cells does not affect osteoclast differentiation and function. Mechanistically, they found that Malat1 acts as a co-activator of β-catenin, directly regulating osteoblast activity and indirectly regulating osteoclast activity via mediating OPG, but not RANKL, expression in osteoblasts and chondrocytes. Their discoveries establish a previously unrecognized paradigm model of Malat1 function in the skeletal system, providing novel mechanistic insights into how a lncRNA integrates cellular crosstalk and molecular networks to fine-tune tissue homeostasis and remodeling.

      Strengths:

      The authors generated global and conditional KO mice in osteoblast and osteoclast lineage cells and carefully analyzed the role of Malat1 with both in vivo and in vitro systems. The conclusion of this paper is mostly well supported by data.

      Weaknesses:

      More objective biological and biochemical analyses are required.

    4. Reviewer #3 (Public Review):

      Summary:

      In this manuscript, Qin and colleagues study the role of Malat1 in bone biology. This topic is interesting given the role of lncRNAs in multiple physiologic processes. A previous study (PMID 38493144) suggested a role for Malat1 in osteoclast maturation. However, the role of this lncRNA in osteoblast biology was previously not explored. Here, the authors note osteopenia with increased bone resorption in mice lacking Malat1 globally and in osteoblast lineage cells. At the mechanistic level, the authors suggest that Malat1 controls beta-catenin activity. These results advance the field regarding the role of this lncRNA in bone biology.

      Strengths:

      The manuscript is well-written and data are presented in a clear and easily understandable manner. The bone phenotype of osteoblast-specific Malat1 knockout mice is of high interest. The role of Malat1 in controlling beta-catenin activity and OPG expression is interesting and novel.

      Weaknesses:

      The lack of a bone phenotype when Malat1 is deleted with LysM-Cre is of interest given the previous report suggesting a role for this lncRNA in osteoclasts. However, to interpret the findings here, the authors should investigate the deletion efficiency of Malat1 in osteoclast lineage cells in their model. The data in the fracture model in Figure 8 seems incomplete in the absence of a more complete characterization of callus histology and a thorough time course. The role of Malat1 and OPG in chondrocytes is unclear since the osteocalcin-Cre mice (which should retain normal Malat1 levels in chondrocytes) have similar bone loss as the global mutants.

    1. eLife assessment

      In this valuable study, Guo, Hue et al. describe how two poorly understood rhabdomyosarcoma fusion-oncogenes, VGLL2::NCOA2 and TEAD1::NCOA2, function at the genomic, transcriptional, and proteomic levels in multiple systems. They generated solid data supporting that these fusion-oncogenes leverage TEAD transcriptional signatures, in a mechanism that is independent of YAP/TAZ, and that this activity potentially contributes to tumorigenesis. This work offers new mechanistic insights into oncogenic gene fusion events identified in cancer patients and reveals potential therapeutic strategies for the treatment of rhabdomyosarcomas.

    2. Reviewer #1 (Public Review):

      Summary:

      Guo, Hue et al. focused on understanding the epigenetic activity and functional dependencies for two different fusions found in infantile rhabdomyosarcoma, VGLL2::NCOA2 and TEAD1::NCOA2. They use a variety of models and methods; specifically, ectopic expression of the fusions in human 293T cells to perform RNAseq (both fusions), CUT&RUN (VGLL2::NCOA2), and BioID mass spec (both fusions). These data identify that the VGLL2::NCOA2 fusion has peaks that are enriched for TEAD motifs. Further, CBP/p300 CUT&RUN supports an enrichment of binding sites and three TEAD targets in VGLL2::NCOA2 and TEAD1::NCOA2 expressing cells. They also functionally evaluated genetic and chemical dependencies (TEAD inhibition), and found this was only effective for the VGLL2::NCOA2 fusion, and not for TEAD1::NCOA2. Using complementary biochemical approaches they suggest (with other supporting data) that the fusions regulate TEAD transcriptional outputs via a YAP/TAZ independent mechanism. Further, they expand into a C2C12 myoblast model and show that TEAD1::NCOA2 is transforming in colony formation assays and in mouse allografts. This is consistent with previously published strategies using VGLL2::NCOA2. Importantly, they show that a small molecule inhibitor of CBP/p300 (a binding partner found in their BioID mass spec) suppresses tumor formation in this mouse allograft model, and that the tumors are less proliferative and have reduced transcription of three TEAD target genes. Generally, the data is interesting and suggests new biology for these fusion-oncogenes. However, the choice of 293T for the majority of the transcriptional, epigenetic, and proteomic studies makes the findings difficult to interpret in the context of the human disease, and the rationale for the choice of an epithelial-like kidney cell line is not discussed. Further, details are missing from the figures, figure legends, and methods that make the data difficult to interpret, and should be added to improve the reader's understanding. Overall, the breadth of methods used in this study, and the comparison of the two fusion-oncogenes' biology, is of interest to the fusion-oncogene, pediatric sarcoma, and epigenetic therapeutic targeting fields.

      Strengths:

      (1) Multiple experimental approaches were used to understand the biology of the fusion-oncogenes, including genomic, proteomic, chemical, and genetic inhibition. These approaches identify potential new mechanisms of convergent fusion-oncogene activity, around TEAD transcriptional targeting (that is YAP/TAZ independent) and reveal CBP/p300 as a functional dependency.

      (2) Complementary models were used, including cell-based assays and mouse allograft models, to show the dependency on CBP/p300.

      (3) Co-IPs were clear and convincing and showed direct interaction of the fusion-oncogene with ectopic and endogenous TEAD1/pan-TEAD, but not YAP/TAZ.

      (4) Potential to follow up on additional targets/mechanisms of tumorigenesis. For example, in the BioID proteomics screen, a unique VGLL2::NCOA2 and TEAD1::NCOA2 interactor is P53, which is also an enriched pathway in Figure 4C in the p300 CUT&RUN peaks in the VGLL2::NCOA2 and TEAD1::NCOA2 expressing cells. Is this indicative of the toxicity of the fusion-oncogenes, or do you think this informs potential mechanisms for transformation?

      Weaknesses:

      (1) The rationale for performing genomics, transcriptional, and proteomics work in 293T cells is not discussed. Further, there are no functional readouts mentioned in the 293T cells with expression of the fusion-oncogenes. Did these cells have any phenotypes associated with fusion-oncogene expression (proliferation differences, morphological changes, colony formation capacity)? Further, how similar are the gene expression signatures from RNA-seq to rhabdomyosarcoma? This would help the reader interpret how similar these cell models are to human disease.

      (2) TEAD1::NCOA2 fusion-oncogene model was not credentialed past H&E, and expression of Desmin. Is the transcriptional signature in C2C12 or 293T similar to a rhabdomyosarcoma gene signature?

      (3) For the fusion-oncogenes, did the HA, FLAG, or V5 tag impact fusion-oncogene activity? Was the tag on the 3' or 5' of the fusion? This was not discussed in the methods.

      (4) Generally, the lack of details in the figures, figure legends, and methods make the data difficult to interpret. A few examples are below:

      a. Individual data points are not shown for figure bar plots (how many technical or biological replicates are present and how many times was the experiment repeated?).
      b. What exons were included in the fusion-oncogenes from VGLL2 and NCOA2 or TEAD1 and NCOA2?
      c. For how long were the colony formation experiments performed? Two weeks?
      d. In Figure 2D, what concentration of CP1 was used and for how long?
      e. How was A485 resuspended for cell culture and mouse experiments, what is the percentage of DMSO?
      f. How many replicates were done for RNA-seq, CUT&RUN, and ATACseq experiments?

    3. Reviewer #2 (Public Review):

      In the manuscript entitled "VGLL2 and TEAD1 fusion proteins drive YAP/TAZ-independent transcription and tumorigenesis by engaging p300", Guo et al. studied two Hippo pathway-related gene fusion events (i.e., VGLL2-NCOA2, TEAD1-NCOA2) in spindle cell rhabdomyosarcoma (scRMS) and showed that their fusion proteins can activate Hippo downstream gene transcription independent of YAP/TAZ. Using BioID-based mass spectrometry analysis, the authors revealed the histone acetyltransferases CBP/p300 as specific binding proteins for the VGLL2-NCOA2 and TEAD1-NCOA2 fusion proteins. Pharmacologically targeting p300 inhibited the fusion protein-induced Hippo downstream gene transcription and tumorigenic events.

      Overall, this study provides mechanistic insights into the scRMS-associated gene fusions in tumorigenesis and reveals potential therapeutic targets for cancer treatment. The manuscript is well-written and easy to follow.

      Here, several suggestions are made for the authors to improve their study.

      Main points

      (1) The authors focused mainly on Hippo downstream gene transcription in this study, while a significant portion of genes regulated by the VGLL2-NCOA2 and TEAD1-NCOA2 fusion proteins are non-Hippo downstream genes (Figure 3). The authors should investigate whether the altered Hippo pathway transcription is essential for VGLL2-NCOA2 and TEAD1-NCOA2-induced cell transformation and tumorigenesis. Specifically, they should test if treatment with the TEAD inhibitor can reverse the cell transformation and tumorigenesis caused by VGLL2-NCOA2 but not TEAD1-NCOA2. In addition, it is important to examine whether YAP-5SA expression can rescue the inhibitory effects of A485 on VGLL2-NCOA2 and TEAD1-NCOA2-induced colony formation and tumor growth. This will help clarify whether Hippo downstream gene transcription is important for the oncogenic activities of these two fusion proteins.

      (2) The rationale for selecting CBP/p300 for functional studies needs to be provided. The BioID-MS experiment identified many interacting proteins for the VGLL2-NCOA2 and TEAD1-NCOA2 fusion proteins (Table S4). The authors should explain the scoring system used to identify the high-confidence interacting proteins for the VGLL2-NCOA2 and TEAD1-NCOA2 fusion proteins. Were CBP/p300 the top candidates on the list? Providing this information will help justify the focus on CBP/p300 and validate their importance in this study.

      (3) p300 was revealed as a key driver of the VGLL2-NCOA2 and TEAD1-NCOA2 fusion protein-induced transcriptome alteration and tumorigenesis. To strengthen the point, the authors should identify the p300 binding region on the VGLL2-NCOA2 and TEAD1-NCOA2 fusion proteins. Mutants with defects in p300 binding/recruitment should be generated and included as a control in the related q-PCR and tumorigenic studies. This work will help confirm the crucial role of p300 in mediating the oncogenic effects of these two fusion proteins.

      (4) Another major issue is the overexpression system extensively used in this study. It is important to determine whether the VGLL2-NCOA2 and TEAD1-NCOA2 fusion genes are also amplified in cancer. If not, the expression levels of the VGLL2-NCOA2 and TEAD1-NCOA2 fusion proteins should be adjusted to endogenous levels to assess their oncogenic effects on gene transcription and tumorigenesis. This approach would make the study more relevant to the pathological conditions observed in scRMS cancer patients.

    1. eLife assessment

      Approaches for quantifying synaptic activity events are currently limited, and recent advances in AI and deep learning provide an opportunity to develop powerful new ways to automate this process. In this study, the authors have generated a valuable tool, miniML, that uses open-source software that convincingly enables rapid, automated, and accurate quantification of synaptic events from a variety of systems and approaches. This software will be of significant utility to a variety of neuroscience researchers.

    2. Reviewer #1 (Public Review):

      O'Neill et al. have developed a software analysis application, miniML, that enables the quantification of electrophysiological events. They utilize a supervised deep learning-based method to optimize the software. miniML is able to quantify and standardize the analyses of miniature events, using both voltage and current clamp electrophysiology, as well as optically driven events using iGluSnFR3, in a variety of preparations, including the cerebellum, calyx of Held, Golgi cells, human iPSC cultures, zebrafish, and Drosophila. The software appears to be flexible, in that users are able to hone and adapt the software to new preparations and events. Importantly, miniML is open-source software, free for researchers to use, and enables users to add new features using Python.

      Overall this new software has the potential to become widely used in the field and an asset to researchers. However, the authors fail to discuss or even cite a similar analysis tool recently developed (SimplyFire), and determine how miniML performs relative to this platform. There are a handful of additional suggestions to make miniML more user-friendly, and of broad utility to a variety of researchers, as well as some suggestions to further validate and strengthen areas of the manuscript:

      (1) miniML relative to existing analysis methods: There is a major omission in this study, in that a similar open source, Python-based software package for event detection of synaptic events appears to be completely ignored. Earlier this year, another group published SimplyFire in eNeuro (Mori et al., 2024; doi: 10.1523/eneuro.0326-23.2023). Obviously, this previous study needs to be discussed and ideally compared to miniML to determine if SimplyFire is superior or similar in utility, and to underscore differences in approach and accuracy.

      (2) The manuscript should comment on whether miniML works equally well to quantify current clamp events (voltage; e.g. EPSP/mEPSPs) compared to voltage clamp (currents, EPSC/mEPSCs), which the manuscript highlights. Are rise and decay time constants calculated for each event similarly?

      (3) The interface and capabilities of miniML appear quite similar to Mini Analysis, the free software that many in the field currently use. While the ability and flexibility for users to adapt and adjust miniML for their own uses/needs using Python programming is a clear potential advantage, can the authors comment, or better yet, demonstrate, whether there is any advantage for researchers to use miniML over Mini Analysis or SimplyFire if they just need the standard analyses?

      (4) Additional utilities for miniML: The authors show miniML can quantify miniature electrophysiological events in both current and voltage clamp, as well as optical glutamate transients using iGluSnFR. As the authors mention in the discussion, the same approach could, in principle, be used to quantify evoked (EPSC/EPSP) events using electrophysiology, Ca2+ events (using GCaMP), and AP waveforms using voltage indicators like ASAP4. While I don't think it is reasonable to ask the authors to generate any new experimental data, it would be great to see how miniML performs when analysing data from these approaches, particularly to quantify evoked synaptic events and/or Ca2+ (ideally postsynaptic Ca2+ signals from miniature events, for which nice approaches have been developed at the Drosophila NMJ).

    3. Reviewer #2 (Public Review):

      Summary:

      This paper presents miniML as a supervised method for the detection of spontaneous synaptic events. Recordings of such events are typically of low SNR, where state-of-the-art methods are prone to high false positive rates. Unlike current methods, training miniML requires neither prior knowledge of the kinetics of events nor the tuning of parameters/thresholds.

      The proposed method comprises four convolutional networks, followed by a bi-directional LSTM and a final fully connected layer which outputs a decision event/no event per time window. A sliding window is used when applying miniML to a temporal signal, followed by an additional estimation of events' time stamps. miniML outperforms current methods for simulated events superimposed on real data (with no events) and presents compelling results for real data across experimental paradigms and species.
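      As a rough illustration of the inference step described above, the sketch below slides a window over a trace, thresholds the per-window score, and reduces each run of positive windows to a single time stamp. It is not the miniML implementation: `toy_classifier` is a hypothetical stand-in for the trained network, and placing the time stamp at the most negative sample in the run is an assumption made only for this example.

```python
from typing import Callable, List, Sequence


def toy_classifier(segment: Sequence[float]) -> float:
    """Hypothetical stand-in for a trained model: flags a large negative deflection."""
    return 1.0 if min(segment) < -5.0 else 0.0


def detect_events(
    trace: List[float],
    classify: Callable[[Sequence[float]], float],
    window: int,
    stride: int = 1,
    threshold: float = 0.5,
) -> List[int]:
    """Return one sample index per contiguous run of windows scored above threshold."""
    # One score per window start position.
    scores = [
        classify(trace[start:start + window])
        for start in range(0, len(trace) - window + 1, stride)
    ]
    events: List[int] = []
    run_start = None
    # A trailing -inf sentinel closes a run that reaches the end of the trace.
    for i, score in enumerate(scores + [float("-inf")]):
        if score > threshold and run_start is None:
            run_start = i
        elif score <= threshold and run_start is not None:
            lo = run_start * stride
            hi = (i - 1) * stride + window  # exclusive end of the last window in the run
            segment = trace[lo:hi]
            # Assumed time-stamp rule: most negative sample within the run.
            events.append(lo + min(range(len(segment)), key=segment.__getitem__))
            run_start = None
    return events


trace = [0.0] * 100
trace[40] = -10.0
trace[80] = -7.0
print(detect_events(trace, toy_classifier, window=10))
```

      In this toy setting the per-window score is binary, so the result does not depend on where the threshold is placed between 0 and 1; with a real model the score is graded and the threshold still has to be chosen.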

      Strengths:

      The authors present a pipeline for benchmarking based on simulated events superimposed on real data (with no events). Compared to five other state-of-the-art methods, miniML leads to the highest detection rates and is most robust to specific choices of threshold values for fast or slow kinetics. A major strength of miniML is the ability to use it for different datasets. For this purpose, the CNN part of the model is held fixed and the subsequent networks are trained to adapt to the new data. This Transfer Learning (TL) strategy reduces computation time significantly and more importantly, it allows for using a substantially smaller data set (compared to training a full model) which is crucial as training is supervised (i.e. uses labeled examples).

      Weaknesses:

      The authors do not indicate how the specific configuration of miniML was set, i.e. number of CNNs, units, LSTM, etc. Please provide further information regarding these design choices, whether they were based on similar models or if chosen based on performance.

      The data for the benchmark system was augmented with equal amounts of segments with/without events. Data augmentation was undoubtedly crucial for successful training.

      (1) Does a balanced dataset reflect the natural occurrence of events in real data? Could the authors provide more information regarding this matter?

      (2) Please provide a more detailed description of this process as it would serve users aiming to use this method for other sub-fields.

      The benchmarking pipeline is indeed valuable and the results are compelling. However, the authors do not provide comparative results for miniML on real data (Figures 4-8). TL does not apply to the other methods. In my opinion, presenting the performance of other methods trained using the smaller dataset would provide convincing evidence of the modularity and applicability of the proposed approach.

      Impact:

      Accurate detection of synaptic events is crucial for the study of neural function. miniML has a great potential to become a valuable tool for this purpose as it yields highly accurate detection rates, it is robust, and is relatively easily adaptable to different experimental setups.

      Additional comments:

      Line 73: the authors describe miniML as "parameter-free". Indeed, miniML does not require the selection of pulse shape, rise/fall time, or tuning of a threshold value. Still, I would not call it "parameter-free" as there are many parameters to tune, starting with the number of CNNs, and number of units through the parameters of the NNs. A more accurate description would be that as an AI-based method, the parameters of miniML are learned via training rather than tuned by the user.

      Line 302: the authors describe miniML as "threshold-independent". The output trace of the model has an extremely high SNR so a threshold of 0.5 typically works. Since a threshold is needed to determine the time stamps of events, I think a better description would be "robust to threshold choice".

    4. Reviewer #3 (Public Review):

      This manuscript presents miniML as a novel supervised deep learning-based method for detecting and analyzing spontaneous synaptic events. The authors demonstrate the advantages of using their method in comparison with previous approaches. The possibility to train the architecture on different tasks using transfer learning approaches is also an added value of the work. There are some technical aspects that would be worth clarifying in the manuscript:

      (1) LSTM Layer Justification: Please provide a detailed explanation for the inclusion of the LSTM layer in the miniML architecture. What specific benefits does the LSTM layer offer in the context of synaptic event detection?

      (2) Temporal Resolution: Can you elaborate on the reasons behind the lower temporal resolution of the output? Understanding whether this is due to specific design choices in the model, data preprocessing, or post-processing will clarify the nature of this limitation and its impact on the analysis.

      (3) Architecture optimization: How was the CNN+LSTM architecture optimized in terms of the number and size of CNN layers?

    1. eLife assessment

      This study provides a novel and promising NPRL2 gene therapy for enhanced immunotherapy response in a KRAS/STK11 mutant anti-PD1 resistant metastatic NSCLC humanized mouse model. Overall, the authors presented a large amount of convincing in vivo data to demonstrate that NPRL2 gene therapy induces antitumor activity through DC-mediated antigen presentation and cytotoxic immune cell activation. This work will be of interest and useful to medical biologists and oncologists in the research field of KRAS-mutant NSCLC.

    2. Reviewer #1 (Public Review):

      This study excellently complements the previous one by unveiling the properties of NPRL2 in augmenting the effect of immune checkpoint inhibitors such as pembrolizumab in KRAS mutant lung cancer models.

      The following points should be clarified:

      (1) In KRAS mutant cell lines with LKB1 co-mutations or deletions, such as A549 cells, does treatment with NPRL2 not increase the efficacy of immunotherapy? Is this correct? Similarly, does the delivery of NPRL2 only potentiate the effect of immunotherapy in KRAS mutant cell lines without associated LKB1 mutations?

      (2) Do the authors analyze by western blot if NPRL2 influences or restores STING and LKB1 in the A549 cell line that lacks LKB1 and STING?

      (3) Mechanistically, is there any explanation as to why NPRL2 delivery increases the efficacy of immunotherapy? Is there any effect on FUS or MYC?

      (4) Is there any way to carry out a clinical study of systematically delivering NPRL2 in KRAS lung cancer patients?

    3. Reviewer #2 (Public Review):

      Summary:

      In "NPRL2 gene therapy induces effective antitumor immunity in KRAS/STK11 mutant anti-PD1 resistant metastatic non-small cell lung cancer (NSCLC) in a humanized mouse model", Meraz et al. investigated the antitumor immune responses to NPRL2 gene therapy in aPD1R / KRAS/STK11mt NSCLC in a humanized mouse model, and found that NPRL2 gene therapy induces antitumor activity on KRAS/STK11mt/aPD1R tumors through DC-mediated antigen presentation and cytotoxic immune cell activation.

      Strengths:

      The novelty of the study.

      Weaknesses:

      (1) The effect of NPRL2 combined with pembrolizumab is inconsistent. Figure 2I-K showed a similar tumor intensity in the NPRL2 group and the combination group. However, NPRL2 combined with pembrolizumab was synergistic in the KRASwt/aPD1S H1299 tumors in Figure 4.

      (2) The authors stated that NPRL2 combined with pembrolizumab was not synergistic in the KRAS/STK11mt/aPD1R tumors but was synergistic in the KRASwt/aPD1S H1299 tumors. How was the synergistic effect defined in the study? More details need to be provided here.

      (3) Nearly all of the work was performed pre-clinically. Validation in the clinical setting would provide stronger evidence for the conclusion.

      (4) Figure 5 and Figure 6 have the same legend. These 2 figures could be merged as a new one.

      (5) Figure 5B & C: n=9 in Figure 5B. However, the number of data points shown in Figure 5C is less than 9.

    4. Reviewer #3 (Public Review):

      Summary:

      NPRL2/TUSC4 is a tumor suppressor gene whose expression is reduced in many cancers including NSCLC. This study presents a novel finding on NPRL2 gene therapy, which induces antitumor activity on aPD1-resistant tumors. Since KRAS/STK11 mutant tumors were reported to benefit less from ICIs, this study has potential clinical application value.

      Strengths:

      This work uncovers the advantage of NPRL2 gene therapy by using humanized models and multiple cell lines. Moreover, via immune cell depletion studies, the mechanism of NPRL2 gene therapy has been narrowed to dendritic cells and CD8+ T cells.

      Weaknesses:

      A major concern is the lack of systematic and logical rigor. This work did not present a link between apoptosis and the antigen presentation induced by NPRL2 restoration. There is no evidence proving that the PI3K/AKT/mTOR signaling pathway is related to antigen presentation, which is the major reason for the NPRL2-induced antitumor response. Therefore, the two parts may not support each other logically.

    1. eLife assessment

      This work proposes that positive biodiversity-ecosystem functioning relationships found in experiments have been exaggerated because commonly used statistical analyses are flawed. As an alternative, the authors suggest a new analysis based on species competitive responses. Unfortunately, the presented methods are not reproducibly described, not yet complete, and inadequate for hypothesis testing. The reviewers agreed that the authors have either misinterpreted or chosen not to take into account much of the current research literature in the field of plant competition and biodiversity research.

    2. Reviewer #1 (Public Review):

      [Editors' note: this is an overall synthesis from the Reviewing Editor in consultation with the reviewers.]

      The three reviews expand our critique of this manuscript in some depth and complementary directions. These can be synthesized in the following main points (we point out that there is quite a bit more that could be written about the flaws with this study; however, time constraints prevented us from further elaborating on the issues we see):

      (1) It is unclear what the authors want to do. It seems their main point is that the large BEF literature and especially biodiversity experiments overstate the occurrence of positive biodiversity effects because some of these can result from competition. Because reduced interspecific relative to intraspecific competition in mixture is sufficient to produce positive effects in mixtures (if interspecific competition = 0 then RYT = S, where S is species richness in mixture -- this according to the reciprocal yield law = law of constant final yield), they have a problem accepting NE > 0 as a true biodiversity effect (see additive partitioning method of Loreau & Hector 2001 cited in manuscript).

      (2) The authors next claim, without justification, that additive partitioning of NE is flawed and theoretically and biologically meaningless. They misinterpret the CE component as biological niche partitioning and the SE component as biological dominance. They do not seem to accept that the additive partitioning is a logically and mathematically sound derivation from basic principles that cannot be contested.

      (3) The authors go on to introduce a method to calculate species-level overyielding (RY > 1/S in replacement series experiments) as a competitive growth response and multiply this with the species monoculture biomass relative to the maximum to obtain competitive expectation. This method is based on resource competition and the idea that resource uptake is fully converted into biomass (instead of e.g. investing it in allelopathic chemical production).

      (4) It is unclear which experiments should be done, i.e. are partial-density monocultures planted or simply calculated from full-density monocultures? At what time are monocultures evaluated? The framework suggests that monocultures must have the full potential to develop, but in experiments, they are often performing very poorly, at least after some time. I assume in such cases the monocultures could not be used.

      (5) There are many reasons why the ideal case of only resource competition playing a role is unrealistic. This excludes enemies but also differential conversion factors of resources into biomass and antagonistic or facilitative effects. Because there are so many potential reasons for deviations from the null model of only resource competition, a deviation from the null model does not allow conclusions about underlying mechanisms.

      Furthermore, this is not a systematically developed partitioning, but some rather empirical ad hoc formulation of a first term that is thought to approximate competitive effects as understood by the authors (but again, there already are problems here). The second residual term is not investigated. For a proper partitioning approach, one would have to decompose overyielding into two (or more) terms and demonstrate (algebraically) that under some reasonable definitions of competitive and non-competitive interactions, these end up driving the respective terms.

      (6) Using a simplistic simulation to test the method is insufficient. For example, I do not see how the simulation includes a mechanism that could create CE in additive partitioning if all species would have the same monoculture yield. Similarly, they do not include mechanisms of enemies or antagonistic interactions (e.g. allelopathy).

      (7) The authors do not cite relevant literature regarding density x biodiversity experiments, competition experiments, replacement-series experiments, density-yield experiments, additive partitioning, facilitation, and so on.

      Overall, this manuscript does not lead further from what we have already elaborated in the broad field of BEF and competition studies and rather blurs our understanding of the topic.

    3. Reviewer #2 (Public Review):

      This manuscript is motivated by the question of what mechanisms cause overyielding in mixed-species communities relative to the corresponding monocultures. This is an important and timely question, given that the ultimate biological reasons for such biodiversity effects are not fully understood.

      As a starting point, the authors discuss the so-called "additive partitioning" (AP) method proposed by Loreau & Hector in 2001. The AP is the result of a mathematical rearrangement of the definition of overyielding, written in terms of relative yields (RY) of species in mixtures relative to monocultures. One term, the so-called complementarity effect (CE), is proportional to the average RY deviations from the null expectations that plants of both species "do the same" in monocultures and mixtures. The other term, the selection effect (SE), captures how these RY deviations are related to monoculture productivity. Overall, CE measures whether relative biomass gains differ from zero when averaged across all community members, and SE measures whether the "relative advantage" species have in the mixture is related to their productivity. In extreme cases, when all species benefit, CE becomes positive. When large species have large relative productivity increases, SE becomes positive. This is intuitively compatible with the idea that niche complementarity mitigates competition (CE>0), or that competitively superior species dominate mixtures and thereby drive overyielding (SE>0).
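      To make the algebra concrete, here is a minimal sketch of the partition for a single mixture. It assumes the standard Loreau & Hector form, NE = N * mean(dRY) * mean(M) + N * cov(dRY, M), with a population covariance and expected relative yields of 1/N unless given; the function and variable names are illustrative and are not taken from the manuscript under review.

```python
from statistics import fmean
from typing import List, Optional, Tuple


def additive_partition(
    mono: List[float],
    mix: List[float],
    expected_ry: Optional[List[float]] = None,
) -> Tuple[float, float, float]:
    """Return (NE, CE, SE) for one mixture.

    mono: monoculture yield M_i of each species
    mix: observed yield of each species in the mixture
    expected_ry: expected relative yields (assumed 1/N each if omitted)
    """
    n = len(mono)
    if expected_ry is None:
        expected_ry = [1.0 / n] * n
    # Deviation of observed from expected relative yield, per species.
    d_ry = [y / m - e for y, m, e in zip(mix, mono, expected_ry)]
    mean_dry = fmean(d_ry)
    mean_m = fmean(mono)
    # Population covariance between dRY and monoculture yield.
    cov = fmean([(d - mean_dry) * (m - mean_m) for d, m in zip(d_ry, mono)])
    ce = n * mean_dry * mean_m
    se = n * cov
    # Net effect: observed mixture yield minus expected mixture yield.
    ne = sum(mix) - sum(e * m for e, m in zip(expected_ry, mono))
    return ne, ce, se


ne, ce, se = additive_partition(mono=[10.0, 20.0], mix=[6.0, 14.0])
print(ne, ce, se)
```

      With equal monoculture yields the covariance term vanishes, so SE is zero by construction and any net effect appears as CE, whichever biological mechanism produced it.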

      However, it is very important to understand that CE and SE capture the "statistical structure" of RY that underlies overyielding. Specifically, CE and SE are not the ultimate biological mechanisms that drive overyielding, and never were meant to be. CE also does not describe niche complementarity. Interpreting CE and SE as directly quantifying niche complementarity or resource competition, is simply wrong, although it sometimes is done. The criticism of the AP method thus in large part seems unwarranted. The alternative methods the authors discuss (lines 108-123) are based on very similar principles.

      The authors now set out to develop a method that aims at linking response patterns to "more true" biological mechanisms.

      Assuming that "competitive dominance" is key to understanding mixture productivity, because "competitive interactions are the predominant type of interspecific relationships in plants", the authors introduce "partial density" monocultures, i.e. monocultures that have the same planting density for a species as in a mixture. The idea is that using these partial density monocultures as a reference would allow for isolating the effect of competition by the surrounding "species matrix".

      The authors argue that "To separate effects of competitive interactions from those of other species interactions, we would need the hypothesis that constituent species share an identical niche but differ in growth and competitive ability (i.e., absence of positive/negative interactions)." - I think the term interaction is not correctly used here, because clearly competition is an interaction, but the point made here is that this would be a zero-sum game.

      The authors use the ratio of productivity of partial density and full-density monocultures, divided by planting density, as a measure of "competitive growth response" (abbreviated as MG). This is the extra growth a plant individual produces when intraspecific competition is reduced.
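
      To make the quantity concrete, here is a minimal sketch of one possible reading of this description. The function name `competitive_growth_response`, the argument names, and the numbers are hypothetical, and the exact definition in the manuscript may differ from this reading.

```python
def competitive_growth_response(partial_yield, full_yield, density_fraction):
    """One reading of MG: (partial-density yield / full-density yield),
    divided by the fraction of full planting density used in the
    partial-density monoculture. All names are illustrative assumptions.
    """
    if full_yield <= 0:
        raise ValueError("full_yield must be positive")
    if not 0 < density_fraction <= 1:
        raise ValueError("density_fraction must lie in (0, 1]")
    return (partial_yield / full_yield) / density_fraction


# No release from intraspecific competition: half the plants, half the yield.
print(competitive_growth_response(5.0, 10.0, 0.5))   # per-plant growth unchanged

# Full compensation: half the plants produce the same stand yield.
print(competitive_growth_response(10.0, 10.0, 0.5))  # per-plant growth doubled
```

      Under this reading, a value of 1 means individuals grow no better at reduced density, and values above 1 measure the extra per-plant growth when intraspecific competition is relaxed.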

      Here, I see two issues: first, this rests on the assumption that there is only "one mode" of competition if two species use the same resources, which may not be true, because intraspecific and interspecific competition may differ. Of course, one can argue that then somehow "niches" are different, but such a niche definition would be very broad and go beyond the "resource set" perspective the authors adopt. Second, this value will heavily depend on timing and the relationship between maximum initial growth rates and competitive abilities at high stand densities.

      The authors then progress to define relative competitive ability (RC), and this time simply use monoculture biomass as a measure of competitive ability. To express this biomass in a standardized way, they express it as the difference from the mean of the other species and then divide by the maximum monoculture biomass of all species.
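
      The standardization described here can be sketched as follows. This is an illustration based only on the verbal description above, not the manuscript's code; `relative_competitive_ability` and the biomass values are hypothetical.

```python
import numpy as np

def relative_competitive_ability(mono):
    """RC_i = (M_i - mean of the other species' M) / max(M),
    following the verbal description; names are illustrative assumptions.
    """
    mono = np.asarray(mono, dtype=float)
    if mono.size < 2:
        raise ValueError("need at least two species")
    # Mean monoculture biomass of all species except species i.
    others_mean = (mono.sum() - mono) / (mono.size - 1)
    return (mono - others_mean) / mono.max()


# Hypothetical monoculture biomasses for three species.
print(relative_competitive_ability([10.0, 4.0, 2.0]))

# Doubling only the most productive species changes the scaling for all of them.
print(relative_competitive_ability([20.0, 4.0, 2.0]))
```

      Comparing the two calls shows how every species' RC shifts when only the maximum changes, because the maximum enters both the denominator and the other species' reference means.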

      I have two concerns here: first, if competitive ability is the capability of a species to preempt resources from a pool also accessed by another species, as the authors argued before, then this seems wrong because one would expect that a species can simply be more productive because it has a broader niche space that it exploits. This contradicts the very narrow perspective on competitive ability the authors have adopted. This also is difficult to reconcile with the idea that specialist species with a narrow niche would outcompete generalist species with a broad niche. Second, I am concerned by the mathematical form. Standardizing by the maximum makes the scaling dependent on a single value.

      As a final step, the authors calculate a "competitive expectation" for a species' biomass in the mixture, by scaling deviations from the expected yield by the product MG ⨯ RC. This would mean a species does better in a mixture when (1) it benefits most from a conspecific density reduction, and (2) has a relatively high biomass.

      Put simply, the assumption would be that if a species is productive in monoculture (high RC), it effectively does not "see" the competitors and then grows like it would be the sole species in the community, i.e. like in the partial density monoculture.

      Overall, I am not very convinced by the proposed method.

      (1) The proposed method seems not very systematic but rather "ad hoc". It also is much less a partitioning method than the AP method because the other term is simply the difference. It would be good if the authors investigated the mathematical form of this remainder and explored its properties. When does complementarity occur? Would it capture complementarity and facilitation?

      (2) The justification for the calculation of MG and RC does not seem to follow the very strict assumptions of what competition (in the absence of complementarity) is. See my specific comments above.

      (3) Overall, the manuscript is hard to read. This is in part a problem of terminology and presentation, and it would be good to use more systematic terms for "response patterns" and "biological mechanisms".

      Examples:

      - on line 30, the authors write that CE is used to measure "positive" interactions and SE to measure "competitive interactions", and later name "positive" and "negative" interactions "mechanisms of species interactions". Here the authors first use "positive interaction" as any type of effect that results in a community-level biomass gain, but then they use "interaction" with reference to specific biological mechanisms (e.g. one species might attract a parasite that infests another species, which in turn may cause further changes that modify the growth of the first and other species).

      - on line 70, the authors state that "positive interaction" increases productivity relative to the null expectation, but it is clear that an interaction can have "negative" consequences for one interaction partner and "positive" ones for the other. Therefore, "positive" and "negative" interactions, when defined in this way, cannot be directly linked to "resource partitioning" and "facilitation", and "species interference" as the authors do. Also, these categories of mechanisms are still simple. For example, how do biotic interactions with enemies classify, see above?

      - line 145: "Under the null hypothesis, species in the mixture are assumed to be competitively equivalent (i.e., absence of interspecific interactions)". This is wrong. The assumption is that there are interspecific interactions, but that these are the same as the intraspecific ones. Weirdly, what follows is a description of the AP method, which does not belong here. This paragraph would better be moved to the introduction where the AP method is mentioned. Or omitted, since it is basically a repetition of the original Loreau & Hector paper.

      Other points:

      - line 66: community productivity, not ecosystem productivity.
      - line 68: community average responses are with respect to relative yields - this is important!
      - line 64: what are "species effects of species interactions"?
      - line 90: here "competitive" and "productive" are mixed up, and it is important to state that "suffers more" refers to relative changes, not yield changes.
      - line 92: "positive effect of competitive dominance": I don't understand what is meant here.

    4. Reviewer #3 (Public Review):

      Summary:

      This manuscript by Tao et al. reports on an effort to better specify the underlying interactions driving the effects of biodiversity on productivity in biodiversity experiments. The authors are especially concerned with the potential for competitive interactions to drive positive biodiversity-ecosystem functioning relationships by driving down the biomass of subdominant species. The authors suggest a new partitioning schema that utilizes a suite of partial density treatments to capture so-called competitive ability. While I agree with the authors that understanding the underlying drivers of biodiversity-ecosystem functioning relationships is valuable - I am unsure of the added value of this specific approach for several reasons.

      Strengths:

      I can find a lot of value in endeavouring to improve our understanding of how biodiversity-ecosystem functioning relationships arise. I agree with the authors that competition is not well integrated into the complementarity and selection effect and interrogating this is important.

      Weaknesses:

      (1) The authors start the introduction very narrowly and do not make clear why it is so important to understand the underlying mechanisms driving biodiversity-ecosystem functioning relationships until the end of the discussion.

      (2) The authors criticize the existing framework for only incorporating positive interactions but this is an oversimplification of the existing framework in several ways:
      a. The existing partitioning scheme incorporates resource partitioning which is an effect of competition.
      b. The authors neglect the potential that negative feedback from species-specific pests and pathogens can also drive positive BEF and complementarity effects but is not a positive interaction, necessarily. This is discussed in Schnitzer et al. 2011, Maron et al. 2011, Hendriks et al. 2013, Barry et al. 2019, etc.
      c. Hector and Loreau (and many of the other citations listed) do not limit competition to SE because resource partitioning is a byproduct of competition.

      (3) It is unclear how this new measure relates to the selection effect, in particular. I would suggest that the authors add a conceptual figure that shows some scenarios in which this metric would give a different answer than the traditional additive partition. The example that the authors use, where a dominant species increases in biomass and the amount that it increases in biomass is greater than the amount of loss from it outcompeting a subdominant species, is a general example often used for a selection effect. When exactly would you see a difference between the two?
      a. Just a note - I do think you should see a difference between the two if the species suffers from strong intraspecific competition and therefore has low monoculture biomass, but this would tend to also be a very low-density monoculture in practice. There would then potentially be little difference between a low-density and a high-density monoculture, because the individuals in a high-density monoculture would die anyway. So I am not sure that in practice you would really see this difference even if partial density plots were incorporated.

      (4) One of the tricky things about these endeavors is that they often pull on theory from two different subfields and use similar terminology to refer to different things. For example - in competition theory, facilitation often refers to a positive relative interaction index (this seems to be how the authors are interpreting this) while in the BEF world facilitation often refers to a set of concrete physical mechanisms like microclimate amelioration. The truth is that both of these subfields use net effects. The relative interaction index is also a net outcome as is the complementarity effect even if it is only a piece of the net biodiversity effect. Trying to combine these two subfields to come up with a new partitioning mechanism requires interrogating the underlying assumptions of both subfields which I do not see in this paper.

      (5) The partial density treatment does not isolate competition in the way that the authors indicate. All of the interactions that the authors discuss are density-dependent, including the mechanism that is not discussed (negative feedback from species-specific pests and pathogens). These partial density treatment effects therefore cannot simply be equated to competition as the authors indicate.
      a. Additionally - the authors use mixture biomass as a stand-in for competitive ability in some cases but mixture biomass could also be determined by the degree to which a plant is facilitated in the mixture (for example).

      (6) I found the literature citation to be a bit loose. For example, the authors state that the additive partition is used to separate positive interactions from competition (lines 70-76) and cite many papers but several of these (e.g. Barry et al. 2019) explicitly do not say this.

      (7) The natural take-home message from this study is that it would be valuable for biodiversity experiments to include partial density treatments but I have a hard time seeing this as a valuable addition to the field for two reasons:
      a. In practice - adding in partial density treatments would not be feasible for the vast majority of experiments which are already often unfeasibly large to maintain.
      b. The density effect would likely only be valuable during the establishment phase of the experiment because species that are strongly limited by intraspecific competition will die in the full-density plots resulting in low-density monocultures. You can see this in many biodiversity experiments after the first years. Even though they are seeded (or rarely planted) at a certain density, the density after several years in many monocultures is quite low.

    5. Reviewer #4 (Public Review):

      Summary:

      This manuscript claims to provide a new null hypothesis for testing the effects of biodiversity on ecosystem functioning. It reports that the strength of biodiversity effects changes when this different null hypothesis is used. This main result is rather inevitable. That is, one expects a different answer when using a different approach. The question then becomes whether the manuscript's null hypothesis is both new and an improvement on the null hypothesis that has been in use in recent decades.

      Strengths:

      In general, I appreciate studies like this that question whether we have been doing it all wrong and I encourage consideration of new approaches.

      Weaknesses:

      Despite many sweeping critiques of previous studies and bold claims of novelty made throughout the manuscript, I was unable to find new insights. The manuscript fails to place the study in the context of the long history of literature on competition and biodiversity and ecosystem functioning. The Introduction claims the new approach will address deficiencies of previous approaches, but after reading further I see no evidence that it addresses the limitations of previous approaches noted in the Introduction. Furthermore, the manuscript does not reproducibly describe the methods used to produce the results (e.g., in Table 1) and relies on simulations, claiming experimental data are not available when many experiments have already tested these ideas and not found support for them. Finally, it is unclear to me whether rejecting the 'new' null hypothesis presented in the manuscript would be of interest to ecologists, agronomists, conservationists, or others. I will elaborate on each of these points below.

      The critiques of biodiversity experiments and existing additive partitioning methods are overstated, as is the extent to which this new approach addresses its limitations. For example, the critique that current biodiversity experiments cannot reveal the effects of species interactions (e.g., lines 37-39) isn't generally true, but it could be true if stated more specifically. That is, this statement is incorrect as written because comparisons of mixtures, where there are interspecific and intraspecific interactions, with monocultures, where there are only intraspecific interactions, certainly provide information about the effects of species interactions (interspecific interactions). These biodiversity experiments and existing additive partitioning approaches have limits, of course, for identifying the specific types of interactions (e.g., whether mediated by exploitative resource competition, apparent competition, or other types of interactions). However, the approach proposed in this manuscript gets no closer to identifying these specific mechanisms of species interactions. It has no ability to distinguish between resource and apparent competition, for example. Thus, the motivation and framing of the manuscript do not match what it provides. I believe the entire Introduction would need to be rewritten to clarify what gap in knowledge this proposed approach is addressing and what would be gained by filling this knowledge gap.

      I recommend that the Introduction instead clarify how this study builds on and goes beyond many decades of literature considering how competition and biodiversity effects depend on density. This large literature is insufficiently addressed in this manuscript. This fails to give credit to previous studies considering these ideas and makes it unclear how this manuscript goes beyond the many previous related studies. For example, see papers and books written by de Wit, Harper, Vandermeer, Connolly, Schmid, and many others. Also, note that many biodiversity experiments have crossed diversity treatments with a density treatment and found no significant effects of density or interactions between density and diversity (e.g., Finn et al. 2013 Journal of Applied Ecology). Thus, claiming that these considerations of density are novel, without giving credit to the enormous number of previous studies considering this, is insufficient.

      Replacement series designs emerged as a consensus for biodiversity experiments because they directly test a relevant null hypothesis. This is not to say that there are no other interesting null hypotheses or study designs, but one must acknowledge that many designs and analyses of biodiversity experiments have already been considered. For example, Schmid et al. reviewed these designs and analyses two decades ago (2002, chapter 6 in Loreau et al. 2002 OUP book) and the overwhelming consensus in recent decades has been to use a replacement series and test the corresponding null hypothesis.

      It is unclear to me whether rejecting the 'new' null hypothesis presented in the manuscript would be of interest to ecologists, agronomists, conservationists, or others. Most biodiversity experiments and additive partitions have tested and quantified diversity effects against the null hypothesis that there is no difference between intraspecific and interspecific interactions. If there was no less competition and no more facilitation in mixtures than in monocultures, then there would be no positive diversity effects. Rejecting this null hypothesis is relevant when considering coexistence in ecology, overyielding in agronomy, and the consequences of biodiversity loss in conservation (e.g., Vandermeer 1981 Bioscience, Loreau 2010 Princeton Monograph). This manuscript proposes a different null hypothesis and it is not yet clear to me how it would be relevant to any of these ongoing discussions of changes in biodiversity.

      The claim that all previous methods 'are not capable of quantifying changes in ecosystem productivity by species interactions and species or community level' is incorrect. As noted above, all approaches that compare mixtures, where there are interspecific interactions, to monocultures, where there are no interspecific interactions, do this to some extent. By overstating the limitations of previous approaches, the manuscript fails to clearly identify what unique contribution it is offering, and how this builds on and goes beyond previous work.

      The manuscript relies on simulations because it claims that current experiments are unable to test this, given that they have replacement series designs (lines 128-131). There are, however, dozens of experiments where the replacement series was repeated at multiple densities, which would allow a direct test of these ideas. In fact, these ideas have already been tested in these experiments and density effects were found to be nonsignificant (e.g., Finn et al. 2013).

      It seems that the authors are primarily interested in trees planted at a fixed density, with no opportunity for changes in density, and thus only changes in the size of individuals (e.g., Fig. 1). In natural and experimental systems, realized density differs from the initial planted density, and survivorship of seedlings can depend on both intraspecific and interspecific interactions. Thus, the constrained conditions under which these ideas are explored in this manuscript seem narrow and far from the more complex reality where density is not fixed.

      Additional detailed comments:

      It is unclear to me which 'effects' are referred to on line 36. For example, are these diversity effects or just effects of competition? What is the response variable?

      The usefulness of the approach is overstated on line 52. All partitioning approaches, including the new one proposed here, give the net result of many types of species interactions and thus cannot 'disentangle underlying mechanisms of species interactions.'

      The weaknesses of previous approaches are overstated throughout the manuscript, including in lines 60-61. All approaches provide some, but not all, insights. Sweeping statements that previous approaches are not effective, without clarifying what they can and can't do, are unhelpful and incorrect. Also, these statements imply that the approach proposed here addresses the limitations of these previous approaches. I don't yet see how it does so.

      The definitions given for the CE and SE on line 71 are incorrect. Competition affects both terms and CE can be negative or have nothing to do with positive interactions, as noted in many of the papers cited.

      The proposed approach does not address the limitations noted on lines 73 and 74.

      The definition of positive interactions in lines 77 and 78 seems inconsistent with much of the literature, which instead focuses on facilitation or mutualism, rather than competition when describing positive interactions.

      Throughout the manuscript, competition is often used interchangeably with resource competition (e.g., line 82) and complementarity is often attributed to resource partitioning (e.g., line 77). This ignores apparent competition and partitioning enemy-free niche space, which has been found to contribute to biodiversity effects in many studies.

      In what sense are competitive interactions positive for competitive species (lines 82-83)? By definition, competition is an interaction that has a negative effect. Do you mean that interspecific competition is less than intraspecific competition? I am having a very difficult time following the logic.

      Results are asserted on lines 93-95, but I cannot find the methods that produced these results. I am unable to evaluate the work without a repeatable description of the methods.

      The description of the null hypothesis in the common additive partitioning approach on lines 145-146 is incorrect. In the null case, it does not assume that there are no interspecific interactions, but rather that interspecific and intraspecific interactions are equivalent.

    1. eLife assessment

      This valuable manuscript describes evidence of sex differences in specific corticostriatal projections during alcohol consumption, and this is noteworthy given the increasing rates/levels of drinking in females and the liability for alcohol use disorder. The authors provide solid evidence of the lateralisation of the activity of the circuit, but other evidence is incomplete, particularly with regard to the description of the drinking measure and how this relates to intoxication. The analyses of the histology data are not complete, and there are further inconsistencies that make it difficult to reconcile the photometry and behavioral data. The findings will be of partial interest to researchers investigating functional circuitry underlying alcohol-driven behaviors.

    2. Reviewer #1 (Public Review):

      Summary:

      This paper uses a model of binge alcohol consumption in mice to examine how the behaviour and its control by a pathway from the anterior insular cortex (AIC) to the dorsolateral striatum (DLS) may differ between males and females. Photometry is used to measure the activity of AIC terminals in the DLS when animals are drinking and this activity seems to correspond to drink bouts in males but not females. The effects appear to be lateralized with inputs to the left DLS being of particular interest.

      Strengths:

      Increasing alcohol intake in females is of concern and the consequences for substance use disorder and brain health are not fully understood, so this is an area that needs further study. The attempt to link fine-grained drinking behaviour with neural activity has the potential to enrich our understanding of the neural basis of behaviour, beyond what can be gleaned from coarser measures of volumes consumed etc.

      Weaknesses:

      The introduction to the drinking in the dark (DID) paradigm is rather narrow in scope (starting line 47). This would be improved if the authors framed this in the context of other common intermittent access paradigms and gave due credit to important studies and authors that were responsible for the innovation in this area (particularly the work of Wise, 1973, which was returned to popular use by Simms et al., 2010, and related papers; e.g., Wise RA (1973). Voluntary ethanol intake in rats following exposure to ethanol on various schedules. Psychopharmacologia 29: 203-210; Simms, J., Bito-Onon, J., Chatterjee, S. et al. Long-Evans Rats Acquire Operant Self-Administration of 20% Ethanol Without Sucrose Fading. Neuropsychopharmacol 35, 1453-1463 (2010)). The original drinking in the dark demonstrations should also be referenced (Rhodes et al., 2005). On line 154, Theile & Navarro 2014 is a review and not the original demonstration.

      When sex differences in alcohol intake are described, more care should be taken to be clear about whether this is in terms of volume (e.g. ml) or blood alcohol levels (BAC, or at least g/kg as a proxy measure). This distinction was often lost when lick responses were being considered. If licking is similar (assuming a single lick from a male and female brings in a similar volume?), this might mean males and females consume similar volumes, but females due to their smaller size would become more intoxicated so the implications of these details need far closer consideration. What is described as identical in one measure, is not in another.

      No conclusions regarding the photometry results can be drawn based on the histology provided. Localization and quantification of viral expression are required at a minimum to verify the efficacy of the dual virus approach (the panel in Supplementary Figure 1 is very small and doesn't allow terminals to be seen, and there is no quantification). Whether these might differ by sex is also necessary before we can be confident about any sex differences in neural activity.

      While the authors have some previous data on the AIC to DLS pathway, there are many brain regions and pathways impacted by alcohol and so the focus on this one in particular was not strongly justified. Since photometry is really an observational method, it's important to note that no causal link between activity in the pathway and drinking has been established here.

      It would be helpful if the authors could further explain whether their modified lickometers actually measure individual licks. While in some systems contact with the tongue closes a circuit which is recorded, the interruption of a photobeam was used here. It's not clear to me whether the nose close to the spout would be sufficient to interrupt that beam, or whether a tongue protrusion is required. This detail is important for understanding how the photometry data is linked to behaviour. The temporal resolution of the GCaMP signal is likely not good enough to capture individual licks, but I think more caution or detail in the discussion of the correspondence of these events is required.

      Even if the pattern of drinking differs between males and females, the use of the word "strategy" implies a cognitive process that was never described or measured.

    3. Reviewer #3 (Public Review):

      Summary:

      In this manuscript by Haggerty and Atwood, the authors use a repeated binge drinking paradigm to assess how water and ethanol intake changes in male and female mice, and measure changes in anterior insular cortex to dorsolateral striatum terminal activity using fiber photometry. They find that males and females have similar overall water and ethanol intake, but females appear to be more efficient alcohol drinkers. Using fiber photometry, they show that the anterior insular cortex (AIC) to dorsolateral striatum (DLS) projections have sex, fluid, and lateralization differences. The male left circuit was most robust when aligned to ethanol drinking, and water was somewhat less robust. The male right circuit, and the female left and right circuits, showed essentially no change in photometry activity. To some degree, the changes in terminal activity appear to be related to fluid exposure over time, as well as within-session differences in trial-by-trial intake. Overall, the authors provide an exhaustive analysis of the behavioral and photometric data, thus providing the scientific community with a rich information set to continue to study this interesting circuit. However, although the analysis is impressive, there are a few inconsistencies regarding specific measures (e.g., AUC, duration of licking) that do not quite fit together across analytic domains. This does not reduce the rigor of the work, but it does somewhat limit the interpretability of the data, at least within the scope of this single manuscript.

      Strengths:

      - The authors use high-resolution licking data to characterize ingestive behaviors.
      - The authors account for a variety of important variables, such as fluid type, brain lateralization, and sex.
      - The authors provide a nice discussion on how this data fits with other data, both from their laboratory and others'.
      - The lateralization discovery is particularly novel.

      Weaknesses:

      - The volume of data and number of variables provided make it difficult to find a cohesive link between data sets. This limits interpretability.
      - The authors describe a clear sex difference in the photometry circuit activity. However, I am curious about whether female mice that drink more similarly to males (e.g., less efficiently?) also show increased activity in the left circuit, similar to males. Conversely, do very efficient males show weaker calcium activity in the circuit? Ultimately, I am curious about how the circuit activity maps to the behaviors described in Figures 1 and 2.
      - What does the change in water-drinking calcium imaging across time in males mean? Especially considering that alcohol-related signals do not seem to change much over time, I am not sure what it means to have water drinking change.

    1. eLife assessment

      Here the authors present a useful extension of their previous method to cluster neuronal activity into cell assemblies (groups of neurons with correlated activity). The authors provide solid evidence that this method can identify temporal dynamics of neuronal clusters in sample simulated data, and they show how this method can be applied to whole-brain zebrafish data.

    2. Reviewer #1 (Public Review):

      Summary:

      Understanding large-scale neural activity remains a formidable challenge in neuroscience. While several methods have been proposed to discover the assemblies from such large-scale recordings, most previous studies do not explicitly model the temporal dynamics. This study is an attempt to uncover the temporal dynamics of assemblies using a tool that has been established in other domains.

      The authors previously introduced the compositional Restricted Boltzmann Machine (cRBM) to identify neuron assemblies in zebrafish brain activity. Building upon this, they now employ the Recurrent Temporal Restricted Boltzmann Machine (RTRBM) to elucidate the temporal dynamics within these assemblies. By introducing recurrent connections between hidden units, RTRBM could retrieve neural assemblies and their temporal dynamics from simulated and zebrafish brain data.
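
      For readers less familiar with the model class, the role of the recurrent hidden-to-hidden connections can be sketched as below. This follows the commonly cited RTRBM formulation, in which the expected hidden state at each time step depends on the current visible state and on the previous expected hidden state. It is an illustrative assumption about the general model family, not the authors' implementation, and all names (`rtrbm_expected_hidden`, `W`, `U`, `b_h`, `b_init`) are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def rtrbm_expected_hidden(v, W, U, b_h, b_init):
    """Expected hidden ("assembly") activity of an RTRBM over time.

    v      -- (T, n_visible) array of neuron states per time step
    W      -- (n_hidden, n_visible) visible-to-hidden weights
    U      -- (n_hidden, n_hidden) recurrent hidden-to-hidden weights
    b_h    -- (n_hidden,) hidden bias for t >= 1
    b_init -- (n_hidden,) hidden bias for the first time step
    """
    T = v.shape[0]
    r = np.empty((T, W.shape[0]))
    # First step: no history, so only the visible input drives the hiddens.
    r[0] = sigmoid(W @ v[0] + b_init)
    # Later steps: previous expected hidden state feeds in through U.
    for t in range(1, T):
        r[t] = sigmoid(W @ v[t] + b_h + U @ r[t - 1])
    return r


# Tiny random example; sizes and values are arbitrary.
rng = np.random.default_rng(0)
v = rng.integers(0, 2, size=(5, 4)).astype(float)
W = rng.normal(size=(3, 4))
U = rng.normal(scale=0.5, size=(3, 3))
b = np.zeros(3)
print(rtrbm_expected_hidden(v, W, U, b, b).shape)
```

      Setting `U` to zero (with equal biases) removes the temporal coupling, so each time step is treated independently, as in a static RBM. This is the sense in which the recurrent matrix carries the assembly dynamics.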

      Strengths:

      The RTRBM has been previously used in other domains, and training of the model is already established. This study is an application of such a model to neuroscience. Overall, the paper is well-structured, the methodology is robust, and the analysis is solid enough to support the authors' claim.

      Weaknesses:

      The overall degree of advance is very limited. The performance improvement by RTRBM compared to their cRBM is marginal, and insights into assembly dynamics are limited.

      (1) The biological insights from this method are constrained. Though the aim is to unravel neural ensemble dynamics, the paper lacks in-depth discussion on how this method enhances our understanding of zebrafish neural dynamics. For example, the dynamics of assemblies can be analyzed using various tools such as dimensionality reduction methods once we have identified them using cRBM. What information can we gain by knowing the effective recurrent connection between them? It would be more convincing to show this in real data.

      (2) Despite the increased complexity of RTRBM over cRBM, performance improvement is minimal. Accuracy enhancements, less than 1% in synthetic and zebrafish data, are underwhelming (Figure 2G and Figure 4B). Predictive performance evaluation on real neural activity would enhance model assessment. Including predicted and measured neural activity traces could aid readers in evaluating model efficacy.

    3. Reviewer #2 (Public Review):

      Summary:

      In this work, the authors propose an extension to some of the last author's previous work, where a compositional restricted Boltzmann machine was considered as a generative model of neuron-assembly interaction. They augment this model with recurrent connections between the Boltzmann machine's hidden units, which allow them to explicitly account for temporal dynamics of the assembly activity. Since their model formulation does not allow the training towards a compositional phase (as in the previous model), they employ a transfer learning approach according to which they initialise their model with a weight matrix that was pre-trained using the earlier model so as to essentially start the actual training in a compositional phase. Finally, they test this model on synthetic and actual data of whole-brain light-sheet-microscopy recordings of spontaneous activity from the brain of larval zebrafish.

      Strengths:

      This work introduces a new model for neural assembly activity. Importantly, being able to capture temporal assembly dynamics is an interesting feature that goes beyond many existing models. While this work clearly focuses on the method (or the model) itself, it opens up an avenue for experimental research where it will be interesting to see if one can obtain any biologically meaningful insights considering these temporal dynamics when one is able to, for instance, relate them to development or behaviour.

      Weaknesses:

      For most of the work, the authors present their RTRBM model as an improvement over the earlier cRBM model. Yet, when considering synthetic data, they actually seem to compare with a "standard" RBM model. This seems odd considering the overall narrative, and it is not clear why they chose to do that. Also, in that case, was the RTRBM model initialised with the cRBM weight matrix?

      A few claims made throughout the work are slightly too enthusiastic and not really supported by the data shown. For instance, when the authors refer to the clusters shown in Figure 3D as "spatially localized", this seems like a stretch, specifically in view of clusters 1, 3, and 4. Moreover, when they describe the predictive performance of their model as "close to optimal" when the down-sampling factor coincided with the interaction time scale, it seems a bit exaggerated given that it was more or less as close to the upper bound as it was to the lower bound.

      When discussing the data statistics, the authors quote correlation values in the main text. However, these do not match the correlation values in the figure to which they seem to belong. Now, it seems that in the main text, they consider the Pearson correlation, whereas in the corresponding figure, it is the Spearman correlation. This is very confusing, and it is not really clear as to why the authors chose to do so.

      Finally, when discussing the fact that the RTRBM model outperforms the cRBM model, the authors state it does so for different moments and in different numbers of cases (fish). It would be very interesting to know whether these are the same fish or always different fish.

    4. Reviewer #3 (Public Review):

      With ever-growing datasets, it becomes more challenging to extract useful information from such a large amount of data. For that, developing better dimensionality reduction/clustering methods can be very important to make sense of analyzed data. This is especially true for neuroscience where new experimental advances allow the recording of an unprecedented number of neurons. Here the authors make a step to help with neuronal analyses by proposing a new method to identify groups of neurons with similar activity dynamics. I did not notice any obvious problems with the data analyses here; however, the presented manuscript has a few weaknesses:

      (1) Because this manuscript is written as an extension of previous work by the same authors (van der Plas et al., eLife, 2023), fully understanding it requires first reading the previous paper, as the authors often refer to that work for details. Similarly, to understand the functional significance of the neuronal assemblies identified here, one needs to consult the previous paper.

      (2) The problem of discovering clusters in data with temporal dynamics is not unique to neuroscience. Therefore, the authors should also discuss other previously proposed methods and how they compare to the RTRBM method presented here. Similarly, there are other methods using neural networks for discovering clusters (assemblies) (e.g. t-SNE: van der Maaten & Hinton 2008, Hippocluster: Chalmers et al. 2023, etc.), which should be discussed to give readers better background information.

      (3) The above point about better describing other methods is especially important because the performance of the method presented here is not much better than that of previous work. For example, RTRBM outperforms the cRBM only on ~4 out of 8 fish datasets. Moreover, as the authors nicely described in the Limitations section, this method currently can only work on a single time scale and clusters have to be estimated first with the previous cRBM method. Thus, having an overview of other methods which could be used for similar analyses would be helpful.

    1. Reviewer #1 (Public Review):

      Summary

      A novel statistical model of neural population activity called the Random Projection model has been recently proposed. Not only is this model accurate, efficient, and scalable, but also is naturally implemented as a shallow neural network. This work proposes a new class of RP model called the reshaped RP model. Inheriting the virtue of the original RP model, the proposed model is more accurate and efficient than the original, as well as compatible with various biological constraints. In particular, the authors have demonstrated that normalizing the total synaptic input in the reshaped model has a homeostatic effect on the firing rates of the neurons, resulting in even more efficient representations with equivalent computational accuracy. These results suggest that synaptic normalization contributes to synaptic homeostasis as well as efficiency in neural encoding.

      Strengths

      This paper demonstrates that the accuracy and efficiency of the random projection models can be improved by extending the model with reshaped projections. Furthermore, it broadens the applicability of the model under biological constraints of synaptic regularization. It also suggests the advantage of the sparse connectivity structure over the fully connected model for modeling spiking statistics. In summary, this work successfully integrates two different elements, statistical modeling of the spikes and synaptic homeostasis in a single biologically plausible neural network model. The authors logically demonstrate their arguments with clear visual presentations and well-structured text, facilitating an unambiguous understanding for readers.

      Weaknesses

      It would be helpful if the following issues about the major claims of the manuscript could be expanded and/or clarified:

      (1) We find it interesting that the reshaped model showed decreased firing rates of the projection neurons. We note that maximizing the entropy <-ln p(x)> with a regularizing term -\lambda <\sum _i f(x_i)>, which reflects the mean firing rate, results in \lambda _i = \lambda for all i in the Boltzmann distribution. In other words, in addition to the homeostatic effect of synaptic normalization which is shown in Figures 3B-D, setting all \lambda_i = 1 itself might have a homeostatic effect on the firing rates. It would be better if the contributions of these two homeostatic effects were separated. One suggestion is to verify the homeostatic effect of synaptic normalization by changing the value of \lambda.

      (2) As far as we understand, \theta_i (thresholds of the neurons) are fixed to 1 in the article. Optimizing the neural thresholds as well as the synaptic weights is a natural procedure (both biologically and from an engineering standpoint), and the update can easily be computed by an expression similar to that of a_ij (equation 3). Do the results still hold when changing \theta _i is allowed as well? For example,

      a. If \theta _i becomes larger, the mean firing rates will decrease. Does the backprop model still have higher firing rates than the reshaped model when \theta _i are also optimized?

      b. Changing \theta _i affects the dynamic range of the projection neurons, thus could modify the effect of synaptic constraints. In particular, does it affect the performance of the bounded model (relative to the homeostatic input models)?

      (3) In Figure 1, the authors claim that the reshaped RP model outperforms the RP model. This improved performance might be partly because the reshaped RP model has more parameters to be optimized than the RP model. Indeed, if N is the number of projections and K is the in-degree of the projections, then the RP model and the reshaped RP model have N and KN parameters, respectively. Does the reshaped model still outperform the original one when only (randomly chosen) N weights (out of a_ij) are allowed to be optimized and the rest are fixed? (Or, does it still outperform the original model with the same number of optimized parameters, i.e. N/K neurons?)

      (4) In Figure 2, the authors have demonstrated that the homeostatic synaptic normalization outperforms the bounded model when the allowed synaptic cost is small. One possible hypothesis for explaining this fact is that the optimal solution lies in the region where only a small number of |a_ij| is large and the rest is near 0. If it is possible to verify this idea by, for example, exhibiting the distribution of a_ij after optimization, it would help the readers to better understand the mechanism behind the superiority of the homeostatic input model.

      (5) In Figures 5D and 5E, the authors present how different reshaping constraints result in different learning processes ("rotation"). We find these results quite intriguing, but it would help the readers understand them if there is more explanation or interpretation. For example,

      a. In the "Reshape - Hom. circuit 4.0" plot (Fig 5D, upper-left), the rotation angle between the two models is almost always the same. This is reasonable since the Homeostatic Circuit model is the least constrained model and could be almost irrelevant to the optimization process. Is there any similar interpretation to the other 3 plots of Figure 5D?

      b. In Figure 5E, is there any intuitive explanation for why the three models take minimum rotation angle at similar global synaptic cost (~0.3)?
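
      A short worked derivation may make the argument in point (1) above concrete. It is only a sketch under assumptions that the review does not state: the functional F, the normalization multiplier \mu, and the reference form of the Boltzmann distribution with per-projection coefficients \lambda_i are introduced here for illustration, and the sign convention may differ from the manuscript's.

```latex
% Maximize entropy with a firing-rate penalty and a normalization constraint.
\begin{align}
F[p] &= -\sum_x p(x)\ln p(x) \;-\; \lambda \sum_x p(x) \sum_i f(x_i)
        \;+\; \mu\Big(\sum_x p(x) - 1\Big), \\
\frac{\partial F}{\partial p(x)} &= -\ln p(x) - 1 - \lambda \sum_i f(x_i) + \mu = 0, \\
p(x) &= \frac{1}{Z}\exp\Big(-\lambda \sum_i f(x_i)\Big),
        \qquad Z = \sum_x \exp\Big(-\lambda \sum_i f(x_i)\Big).
\end{align}
% Comparing with p(x) \propto \exp(-\sum_i \lambda_i f(x_i)) gives \lambda_i = \lambda for all i.
```

      Under these assumptions a single penalty strength \lambda is shared by every projection, which is why varying \lambda could help separate its effect from that of synaptic normalization.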

    1. eLife assessment

      The paper characterized a specific defect in the spatial working memory of mice with a deficit in a protein called Rac1. Rac1 inhibition was limited to the presynaptic compartment of neurons, which is significant because past work has inhibited both pre- and postsynaptic compartments. The study also identified potential effectors of Rac1. The work is important for these reasons, and the strength of the evidence is exceptional.

    2. Reviewer #1 (Public Review):

      - A summary of what the authors were trying to achieve:

      The authors focused on Rac1, one of the most extensively studied members of the Ras superfamily of small GTPases, an intracellular signal transducer that remodels actin and phosphorylation signaling networks. They performed an extensive series of behavioral tests and found a striking result of selectively inhibiting presynaptic Rac1. Previous studies have made the claim that Rac1-mediated signaling is associated with hippocampal-dependent working memory and longer-term forms of learning and memory. Rac1 was known to modulate both pre- and postsynaptic plasticity. What was missing was selective manipulation of Rac1 function at either pre- or postsynaptic loci. Kim, Soderling, and colleagues showed that following the expression of a genetically encoded Rac1-inhibitor at presynaptic terminals, spatial working memory is selectively impaired. In contrast, Rac1 inhibition at postsynaptic sites spared the spatial working memory but affected longer-term cognitive processes.

      - An account of the major strengths and weaknesses of the methods and results:

      This paper is part of an ambitious research trajectory, presented in multiple rigorous studies, that combines hypothesis-free fishing for candidate signal transduction elements with precise testing of physiological and behavioral outcomes. Each of these arenas has challenges and pitfalls. This paper contains punchlines in both behavioral and cell biological areas. The effect of presynaptic Rac1 inhibition on short-term behavioral memory was convincingly demonstrated with three different behavioral tests, including a quite striking result on delayed non-matching to place task. I found the claim of a specific effect on working memory more convincing here than in previous work. On the other hand, the authors sought to clarify the presynaptic regulatory mechanisms, leveraging new advances in mass spectrometry to identify the proteomic and post-translational landscape of presynaptic Rac1 signaling. They identified particular serine/threonine kinases and phosphorylated cytoskeletal signaling and synaptic vesicle proteins that became enriched with active Rac1. They argued that phosphorylated sites in these proteins are at positions likely to have regulatory effects on synaptic vesicles. They found changes in the distribution and morphology of synaptic vesicles following presynaptic Rac1 inhibition. They also report a postsynaptic consequence, a slightly increased spine cross-sectional area.

      - An appraisal of whether the authors achieved their aims, and whether the results support their conclusions:

      The selective agent is the Rac1-inhibiting polypeptide W56; W56 is fused to a protein with specific subcellular localizations in neurons. Hedrick, Yasuda, et al., 2016 showed that this kind of strategy enabled a spatially targeted inhibitory effect. Collaborating with Yasuda, O'Neil in Soderling's group previously reported that Rac1 negatively regulates synaptic vesicle replenishment at both excitatory and inhibitory synapses.

      In the current study by Kim et al., the goal is to interfere with Rac1 function in vivo. Once again, as in O'Neil, the functional intervention was to virally express a W56 peptide, fused to synapsin, a protein with specific subcellular localization, in this case presynaptic. The key control was to compare the effect of W56 with a scrambled sequence (Scr) in the negative control group. As verification of presynaptic efficacy, Kim found that W56-pre makes vesicles larger and further from the active zone without changing overall bouton morphology. Fresh fishing with MassSpec suggests that presynaptic vesicle proteins are affected.

      I am convinced that the presynaptic Rac1 function was successfully tweaked and that this had an effect on working memory tested with 5 s intertrial intervals, in a time range where the field is hard-pressed to find robust cell biological mechanisms for memory storage. (Ion channel dynamics are an alternative, but the focus here was on cytoskeletal, not plasma membrane proteins). What was missing was a direct index of vesicle dynamics or an explanation of why a hypothetical alteration in vesicle dynamics shows up as a change in vesicle size or location. The summarizing scheme is necessarily vague; it lacks specific details about how the effect on working memory occurs, or whether it involves excitatory as opposed to inhibitory nerve terminals.

      - A discussion of the likely impact of the work on the field, and the utility of the methods and data to the community:

      This study reveals a previously unrecognized presynaptic role of Rac1 signaling in cognitive processes and provides insights into its potential regulatory mechanisms.

      An outside observer might appreciate evidence that clearly shows that pivotal cytoskeletal cell biology is not the exclusive monopoly of either side of the synaptic cleft.

      - Any additional context you think would help readers interpret or understand the significance of the work:

      --Overall, it shows off the art of combining fishing with causal experiments, parallel to Steve Marx's work on L-type calcium channel modulation (Nature).

      --Multiple mutations associated with human neurodevelopmental and psychiatric disorders involve genes that encode regulators of the synaptic cytoskeleton. A major, unresolved question is how the disruption of specific actin filament structures leads to the onset and progression of complex synaptic and behavioral phenotypes.

      --The formation of long actin filaments along the axon's longitudinal axis is relevant to the sharing of synaptic vesicles amongst multiple boutons in so-called vesicle superpools (Chenouard & Tsien, NatComm).

    3. Reviewer #2 (Public Review):

      Summary:

      The paper described a behavioural characterisation of mice with presynaptically-inhibited Rac1 in the hippocampus. This is followed by a BioID and phosphoproteomic analysis of Rac1, highlighting potential downstream effectors of active or non-active Rac1 and potential downstream phosphorylated targets.

      Strengths:

      An original molecular approach that has been established in a previous paper by the authors (PMID 34269176) to block Rac1 function exclusively at the presynapse is now utilised to characterise a link between presynaptic dysfunction and mouse behavior. The experiments and the data well support the conclusion that the function of Rac1 has distinct outcomes on mouse behavior, depending on its site of action.

      Weaknesses:

      A main limitation of the study is that it lacks physiological and biochemical analysis to follow up on hits identified in a BioID and phosphoproteomic analysis of presynaptic active and non-active Rac1 variants.

    1. Reviewer #1 (Public Review):

      Hippocampal place cells display a sequence of firing activities when the animal travels through a spatial trajectory at a behavioral time scale of seconds to tens of seconds. Interestingly, parts of the firing sequence also occur at a much shorter time scale: ~120 ms within individual cycles of theta oscillation. These so-called theta sequences were originally thought to naturally result from the phenomenon of theta phase precession. However, there is evidence that theta sequences do not always occur even when theta phase precession is present, for example, during the early experience of a novel maze. The question is then how they emerge with experience (theta sequence development). This study presents evidence that a special group of place cells, those tuned to fast-gamma oscillations, may play a key role in theta sequence development.

      The authors analyzed place cells, LFPs, and theta sequences as rats traveled a circular maze in repeated laps. They found that a group of place cells were significantly tuned to a particular phase of fast-gamma (FG-cells), in contrast to others that did not show such tuning (NFG-cells). The authors then omitted FG-cells or the same number of NFG-cells, in their algorithm of theta sequence detection and found that the quality of theta sequences, quantified by a weighted correlation, was worse with the FG-cell omission, compared to that with the NFG-cell omission, during later laps, but not during early laps. What made the FG-cells special for theta sequences? The authors found that FG-cells, but not NFG-cells, displayed phase precession to slow-gamma (25 - 45 Hz) oscillations (within theta cycles) during early laps (both FG- and NFG-cells showed slow-gamma phase precession during later laps). Overall, the authors conclude that FG-cells contribute to theta sequence development through slow-gamma phase precession during early laps.

      How theta sequences are formed and developed during experience is an important question, because these sequences have been implicated in several cognitive functions of place cells, including memory-guided spatial navigation. The identification of FG-cells in this study is straightforward. Evidence is also presented for the role of these cells in theta sequence development. However, given several concerns elaborated below, whether the evidence is sufficiently strong for the conclusion needs further clarification, perhaps, in future studies.

      (1) The results in Figure 3 and Figure 8 seem contradictory. In Figure 8, all theta sequences displayed a seemingly significant weighted correlation (above 0) even in early laps, which was mostly due to FG-cell sequences but not NFG-cell sequences (correlation for NFG-sequences appeared below 0). However, in Figure 3H, omitting FG-cells and omitting NFG-cells did not produce significant differences in the correlation. Conversely, FG-cell and NFG-cell sequences were similar in later laps in Figure 8 (NFG-cell sequences appeared even better than FG-cell sequences), yet omitting NFG-cells produced a better correlation than omitting FG-cells. This confusion may be related to how "FG-cell-dominant sequences" were defined, which is unclear in the manuscript. Nevertheless, the different results are not easy to understand.

      (2) The different contributions between FG-cells and NFG-cells to theta sequences are supposed not to be caused by their different firing properties (Figure 5). However, Figure 5D and E showed a large effect size (Cohen's D = 0.7, 0.8), although not significant (P = 0.09, 0.06). But the seemingly non-significant P values could be simply due to smaller N's (~20). In other parts of the manuscript, the effect sizes were comparable or even smaller (e.g. D = 0.5 in Figure 7B), but interpreted as positive results: P values were significant with large N's (~480 in Fig. 7B). Drawing a conclusion purely based on a P value while N is large often renders the conclusion only statistical, with unclear physical meaning. Although this is common in neuroscience publications, it makes more sense to at least make multiple inferences using similar sample sizes in the same study.

      (3) In supplementary Figure 2 - S2, FG-cells displayed stronger theta phase precession than NFG-cells, which could be a major reason why FG-cells impacted theta sequences more than NFG cells. Although factors other than theta phase precession may contribute to or interfere with theta sequences, stronger theta phase precession itself (without the interference of other factors), by definition, can lead to stronger theta sequences.

      (4) The slow-gamma phase precession of FG-cells during early laps is supposed to mediate or contribute to the emergence of theta sequences during late laps (Figure 1). The logic of this model is unclear. The slow-gamma phase precession was present in both early and late laps for FG-cells, but only present in late laps for NFG-cells. It seems more straightforward to hypothesize that the difference in theta sequences between early and later laps is due to the difference in slow-gamma phase precession of NFG cells between early and late laps. Although this is not necessarily the case, the argument presented in the manuscript is not easy to follow.

      (5) There are several questions on the description of methods, which could be addressed to clarify or strengthen the conclusions.

      (i) Were the identified fast- and slow-gamma episodes mutually exclusive?

      (ii) Was the task novel when the data were acquired? How many days (from the 1st day of the task) were included in the analysis? When the development of the theta sequence was mentioned, did it mean the development in a novel environment, in a novel task, or purely in a sense of early laps (Lap 1, 2) on each day?

      (iii) How were the animals' behavioral parameters equalized between early and later laps? For example, speed or head direction could potentially produce the differences in theta sequences.
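
      The concern in point (2) above, that significance can track sample size more than effect size, can be illustrated with a minimal sketch. It assumes a two-sample comparison with equal group sizes and uses a normal approximation to the t distribution; the function name and the group sizes are hypothetical and are not taken from the manuscript.

```python
import math
from statistics import NormalDist


def approx_two_sample_test(cohens_d, n_per_group):
    """Return (t, p) for a two-sample comparison with equal group sizes.

    t follows from Cohen's d as t = d * sqrt(n / 2). The two-sided p-value
    uses a normal approximation to the t distribution, so it is slightly
    too small for small samples.
    """
    t = cohens_d * math.sqrt(n_per_group / 2.0)
    p = 2.0 * (1.0 - NormalDist().cdf(abs(t)))
    return t, p


# Hypothetical group sizes, chosen only to mirror the orders of magnitude
# mentioned in the review (N ~ 20 versus N ~ 480 in total).
t_small, p_small = approx_two_sample_test(cohens_d=0.7, n_per_group=10)
t_large, p_large = approx_two_sample_test(cohens_d=0.5, n_per_group=240)

print(f"d=0.7, n=10 per group:  t={t_small:.2f}, p~{p_small:.3f}")
print(f"d=0.5, n=240 per group: t={t_large:.2f}, p~{p_large:.1e}")
```

      In this sketch the larger effect with the small sample does not reach the conventional 0.05 threshold, while the smaller effect with the large sample does, which is the asymmetry the reviewer describes.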

    2. Reviewer #2 (Public Review):

      This manuscript addresses an important question that has not yet been solved in the field: what is the contribution of different gamma oscillatory inputs to the development of "theta sequences" in the hippocampal CA1 region? Theta sequences have received much attention due to their proposed roles in encoding short-term behavioral predictions, mediating synaptic plasticity, and guiding flexible decision-making. Gamma oscillations in CA1 offer a readout of different inputs to this region and have been proposed to synchronize neuronal assemblies and modulate spike timing and temporal coding. However, the interactions between these two important phenomena have not been sufficiently investigated. The authors conducted place cell and local field potential (LFP) recordings in the CA1 region of rats running on a circular track. They then analyzed the phase locking of place cell spikes to slow and fast gamma rhythms, the evolution of theta sequences during behavior, and the interaction between these two phenomena. They found that place cells with the strongest modulation by fast gamma oscillations were the most important contributors to the early development of theta sequences and that they also displayed a faster form of phase precession within slow gamma cycles nested with theta. The results reported are interesting and support the main conclusions of the authors. However, the manuscript needs significant improvement in several aspects regarding data analysis, description of both experimental and analytical methods, and alternative interpretations, as I detail below.

      • The experimental paradigm and recordings should be explained at the beginning of the Results section. Right now, there is no description whatsoever which makes it harder to understand the design of the study.

      • An important issue that needs to be addressed is the very small fraction of CA1 cells phase-locked to slow gamma rhythms (3.7%). This fraction is much lower than in many previous studies, which typically report it in the range of 20-50 %. However, this discrepancy is not discussed by the authors. This needs to be explained and additional analysis considered. One analysis that I would suggest, although there are also other valid approaches, is to, instead of just analyzing the phase locking in two discrete frequency bands, compute the phase locking with all LFP frequencies from 25-100 Hz. This will offer a more comprehensive and unbiased view of the gamma modulation of place cell firing. Alternative metrics to mean vector length that are less sensitive to firing rates, such as the pairwise phase consistency index (Vinck et al., Neuroimage, 2010), could be implemented. This may reveal whether the low fraction of phase-locked cells could be due to a low number of spikes entering the analysis.

      • From the methods, it is not clear to me whether the reference LFP channel was consistently selected to be a different one from that on which the analyzed spikes were recorded. This is the better practice to reduce the contribution of spike leakage that could substantially inflate the coupling with faster gamma frequencies. These analyses need to be described in more detail.

      • The authors' initial framework of classifying cells into fast-gamma-modulated and non-fast-gamma-modulated implies a bimodality that may be artificial. The authors should discuss the nuances and limitations of this framework. For example, several previous studies have shown that the same place cell can couple to different gamma oscillations (e.g., Lasztóczi et al., Neuron, 2016; Fernandez-Ruiz et al., Neuron, 2017; Sharif et al., Neuron, 2021).

      • It would be useful to provide a more thorough characterization of the physiological properties of FG and NFG cells, as this distinction is the basis of the paper. Only very little characterization of some place cell properties is provided in Figure 5. Important characteristics that should be very feasible to compare include average firing rate, burstiness, estimated location within the layer (i.e., deep vs superficial sublayers) and along the transverse axis (i.e., proximal vs distal), theta oscillation frequency, phase precession metrics (given their fundamental relationship with theta sequences), etc.

      • It is not clear to me how the analysis in Figure 6 was performed. In Figure 6B I would think that the grey line should connect with the bottom white dot in the third panel, which would be the interpretation of the results.
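
      The suggestion above to replace mean vector length with pairwise phase consistency can be sketched as follows. This is a minimal illustration on synthetic, unmodulated spike phases, not the authors' pipeline; the function names are hypothetical, and the closed form used is algebraically equal to the average cosine of the phase difference over all distinct spike pairs.

```python
import cmath
import math
import random


def mean_vector_length(phases):
    """Length of the mean resultant vector of the spike phases (radians)."""
    return abs(sum(cmath.exp(1j * p) for p in phases)) / len(phases)


def pairwise_phase_consistency(phases):
    """PPC: average cosine of the phase difference over all distinct spike pairs.

    Uses the closed form (n * R**2 - 1) / (n - 1), which equals the explicit
    pairwise average; R is the mean vector length.
    """
    n = len(phases)
    if n < 2:
        raise ValueError("PPC needs at least two spikes")
    r = mean_vector_length(phases)
    return (n * r * r - 1.0) / (n - 1.0)


def ppc_explicit(phases):
    """Same quantity computed directly from all pairs (O(n^2), for checking)."""
    n = len(phases)
    total = sum(
        math.cos(phases[j] - phases[k])
        for j in range(n)
        for k in range(j + 1, n)
    )
    return 2.0 * total / (n * (n - 1))


# Synthetic, unmodulated "cells": phases drawn uniformly, so no true locking.
rng = random.Random(0)
for n_spikes in (10, 1000):
    mvl, ppc = [], []
    for _ in range(200):
        phases = [rng.uniform(-math.pi, math.pi) for _ in range(n_spikes)]
        mvl.append(mean_vector_length(phases))
        ppc.append(pairwise_phase_consistency(phases))
    print(n_spikes, round(sum(mvl) / len(mvl), 3), round(sum(ppc) / len(ppc), 3))
```

      With few spikes, the average mean vector length of unmodulated cells should sit well above zero, whereas the average PPC should stay near zero, which is why PPC is less confounded by spike count.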

    3. Reviewer #3 (Public Review):

      [Editors' note: This review contains many criticisms that apply to the whole sub-field of slow/fast gamma oscillations in the hippocampus, as opposed to this particular paper. In the editors' view, these comments are beyond the scope of any single paper. However, they represent a view that, if true, should contextualise the interpretation of this paper and all papers in the sub-field. In doing so, they highlight an ongoing debate within the broader field.]

      Summary:

      The authors aimed to elucidate the role of dynamic gamma modulation in the development of hippocampal theta sequences, utilizing the traditional framework of "two gammas," a slow and a fast rhythm. This framework is currently being challenged, necessitating further analyses to establish and secure the assumed premises before substantiating the claims made in the present article.

      The results are too preliminary and need to integrate contemporary literature. New analyses are required to address these concerns. However, by addressing these issues, it may be possible to produce an impactful manuscript.

      I. Introduction

      Within the introduction, multiple broad assertions are conveyed that serve as the premise for the research. However, equally important citations that are not mentioned potentially contradict the ideas that serve as the foundation. Instances of these are described below:

      (1) Are there multiple gammas? The authors launched the study on the premise that two different gamma bands are communicated from CA3 and the entorhinal cortex. However, recent literature suggests otherwise, offering that the slow gamma component may be related to theta harmonics:

      From a review by Etter, Carmichael and Williams (2023):

      "Gamma-based coherence has been a prominent model for communication across the hippocampal-entorhinal circuit and has classically focused on slow and fast gamma oscillations originating in CA3 and medial entorhinal cortex, respectively. These two distinct gammas are then hypothesized to be integrated into hippocampal CA1 with theta oscillations on a cycle-to-cycle basis (Colgin et al., 2009; Schomburg et al., 2014). This would suggest that theta oscillations in CA1 could serve to partition temporal windows that enable the integration of inputs from these upstream regions using alternating gamma waves (Vinck et al., 2023). However, these models have largely been based on correlations between shifting CA3 and medial entorhinal cortex to CA1 coherence in theta and gamma bands. In vivo, excitatory inputs from the entorhinal cortex to the dentate gyrus are most coherent in the theta band, while gamma oscillations would be generated locally from presumed local inhibitory inputs (Pernía-Andrade and Jonas, 2014). This predominance of theta over gamma coherence has also been reported between hippocampal CA1 and the medial entorhinal cortex (Zhou et al., 2022). Another potential pitfall in the communication-through-coherence hypothesis is that theta oscillations harmonics could overlap with higher frequency bands (Czurkó et al., 1999; Terrazas et al., 2005), including slow gamma (Petersen and Buzsáki, 2020). The asymmetry of theta oscillations (Belluscio et al., 2012) can lead to harmonics that extend into the slow gamma range (Scheffer-Teixeira and Tort, 2016), which may lead to a misattribution as to the origin of slow-gamma coherence and the degree of spike modulation in the gamma range during movement (Zhou et al., 2019)."

      And from Benjamin Griffiths and Ole Jensen (2023): "That said, in both rodent and human studies, measurements of 'slow' gamma oscillations may be susceptible to distortion by theta harmonics [53], meaning open questions remain about what can be attributed to 'slow' gamma oscillations and what is attributable to theta."

      This second statement should be heavily considered as it is from one of the original authors who reported the existence of slow gamma.

      Yet another instance from Schomburg, Fernández-Ruiz, Mizuseki, Berényi, Anastassiou, Koch, and Buzsáki (2014): "Note that modulation from 20-30 Hz may not be related to gamma activity but, instead, reflect timing relationships with non-sinusoidal features of theta waves (Belluscio et al., 2012) and/or the 3rd theta harmonic."

      One of this manuscript's authors is Fernández-Ruiz, a contemporary proponent of the multiple gamma theory. Thus, the modulation to slow gamma offered in the present manuscript may actually be related to theta harmonics.

      With the above emphasis from proponents of the slow/fast gamma theory on disambiguating harmonics from slow gamma, our first suggestion to the authors is that they A) address these statements (citing the work of these authors in their manuscript) and B) demonstrably quantify theta harmonics in relation to slow gamma prior to making assertions of phase relationships (methodological suggestions below). As the frequency of theta harmonics can extend as high as 56 Hz (PMID: 32297752), overlapping with the slow gamma range defined here (25-45 Hz), it will be important to decouple the two phenomena with a method other than an arbitrary frequency boundary.
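      To make the overlap concrete, the following is a minimal arithmetic sketch, not an analysis of the authors' data. It assumes theta fundamentals of 7, 8, and 9 Hz (the theta range used later in this review) and simply lists which integer multiples land inside the 25-45 Hz slow gamma band, stopping at the 56 Hz ceiling cited above. The function name and constants are illustrative.

```python
# Illustrative only: band limits and the 56 Hz ceiling come from the review
# text; the choice of 7, 8, and 9 Hz fundamentals is an assumption.
SLOW_GAMMA = (25.0, 45.0)   # slow gamma band as defined in the manuscript (Hz)
HARMONIC_CEILING = 56.0     # highest theta harmonic frequency cited above (Hz)


def harmonics_in_band(theta_hz, band=SLOW_GAMMA, ceiling=HARMONIC_CEILING):
    """Return (harmonic number, frequency) pairs of theta_hz inside the band."""
    hits = []
    n = 2  # n = 1 is the theta fundamental itself
    while n * theta_hz <= ceiling:
        freq = n * theta_hz
        if band[0] <= freq <= band[1]:
            hits.append((n, freq))
        n += 1
    return hits


for theta in (7.0, 8.0, 9.0):
    print(theta, harmonics_in_band(theta))
```

      Under these assumptions, every fundamental in the 7-9 Hz range places at least two harmonics inside the slow gamma band, which is why a fixed frequency boundary cannot separate the two phenomena on its own.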

      (2) Can gammas be segregated into different lamina of the hippocampus? This idea appears to be foundational in the premise of the research but is also undergoing revision.

      As discussed by Etter et al. above, the initial theory of gamma routing was launched on coherence values. However, the values reported by Colgin et al. (2009) lean more towards incoherence (a value of 0) rather than coherence (1), suggesting a weak to negligible interaction. Nevertheless, this theory is coupled with the idea that the different gamma frequencies are exclusive to the specific lamina of the hippocampus.

      Recently, Douchamps et al. (2024) suggested a broader, more nuanced understanding of gamma oscillations than previously thought, emphasizing their wide range and variability across hippocampal layers. This perspective challenges the traditional dichotomy of gamma sub-bands (e.g., slow vs. medium gamma) and their associated cognitive functions based on a more rigid classification according to frequency and phase relative to the theta rhythm. Moreover, they observed all frequencies across all layers.

      Similarly, the current source density plots from Belluscio et al. (2012) suggest that SG and FG can be observed in both the radiatum and lacunosum-moleculare.

      Therefore, if the initial coherence values are weak to negligible and both slow and fast gamma are observed in all layers of the hippocampus, can the different gammas be exclusively related to either anatomical inputs or psychological functions (as done in the present manuscript)? Do these observations challenge the premise of the authors' research? At the least, please discuss.

      (3) Do place cells, phase precession, and theta sequences require input from afferent regions? It is offered in the introduction that "Fast gamma (~65-100Hz), associated with the input from the medial entorhinal cortex, is thought to rapidly encode ongoing novel information in the context (Fernandez-Ruiz et al., 2021; Kemere, Carr, Karlsson, & Frank, 2013; Zheng et al., 2016)".

      Evidence that CA1 place fields remain fairly intact following MEC inactivation includes Ipshita Zutshi, Manuel Valero, Antonio Fernández-Ruiz, and György Buzsáki (2022), "CA1 place cells and assemblies persist despite combined mEC and CA3 silencing", and, from Hadas E Sloin, Lidor Spivak, Amir Levi, Roni Gattegno, Shirly Someck, and Eran Stark (2024), "These findings are incompatible with precession models based on inheritance, dual-input, spreading activation, inhibition-excitation summation, or somato-dendritic competition. Thus, a precession generator resides locally within CA1."

      These publications, at the least, challenge the inheritance model by which the afferent input controls CA1 place field spike timing. The research premise offered by the authors is couched in the logic of inheritance, when the effect that the authors are observing could be governed by local intrinsic activity (e.g., phase precession and gamma are locally generated, and the attribution to routed input is perhaps erroneous). Certainly, it is worth discussing these manuscripts in the context of the present manuscript.

      II. Results

      (1) Figure 2

      a. There is a bit of a puzzle here that should be discussed. If slow and fast frequencies modulate 25% of neurons, how can these rhythms serve as mechanisms of communication/support psychological functions? For instance, if fast gamma is engaged in rapid encoding (line 72) and slow gamma is related to the integration processing of learned information (line 84), and these are functions of the hippocampus, then why do these rhythms modulate so few cells? Is this to say 75% of CA1 neurons do not listen to CA3 or MEC input?

      b. Figure 2. It is hard to know if the mean vector lengths presented are large or small. Moreover, one can expect to find significance due to chance. For instance, it is challenging to find a frequency in which modulation strength is zero (please see Figure 4 of PMID: 30428340 or Figure 7 of PMID: 31324673).

      i. Please construct the histograms of Mean Vector Length as in the above papers, using 1 Hz filter steps from 1-120Hz and include it as part of Figure 2 (i.e., calculate the mean vector length for the filtered LFP in steps of 1-2 Hz, 2-3 Hz, 3-4 Hz,... etc). This should help the authors portray the amount of modulation these neurons have relative to the theta rhythm and other frequencies. If the theta mean vector length is higher, should it be considered the primary modulatory influence of these neurons (with slow and fast gammas as a minor influence)?

      ii. It is possible to infer a neuron's degree of oscillatory modulation without using the LFP. For instance, one can create an ISI histogram as done in Figure 1 here (https://www.biorxiv.org/content/10.1101/2021.09.20.461152v3.full.pdf+html; "Distinct ground state and activated state modes of firing in forebrain neurons"). The reciprocal of the ISI values would be the "instantaneous spike frequency". In favor of the Douchamps et al. (2024) results, the figure of the bioRxiv paper implies that there is a single gamma-frequency modulation, as there is only a single bump in the ISIs in the 10^-1.5 to 10^-2 range. Therefore, to vet the slow gamma results and the premise of two gammas offered in the introduction, it would be worth including this analysis as part of Figure 2.
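      The ISI-based analysis in item (ii) can be sketched as follows. This is a minimal illustration that assumes spike times are available in seconds; the function names and the log-spaced binning (10 bins per decade between 1 ms and 10 s) are assumptions, not the method of the cited preprint. For orientation, an ISI of 10^-1.5 s (about 31.6 ms) corresponds to about 31.6 Hz, and an ISI of 10^-2 s (10 ms) corresponds to 100 Hz.

```python
import math


def instantaneous_rates(spike_times_s):
    """Reciprocal of each positive inter-spike interval, in Hz."""
    isis = [b - a for a, b in zip(spike_times_s, spike_times_s[1:])]
    return [1.0 / isi for isi in isis if isi > 0]


def log_isi_histogram(spike_times_s, lo_exp=-3.0, hi_exp=1.0, bins_per_decade=10):
    """Count ISIs in log10-spaced bins from 10**lo_exp to 10**hi_exp seconds."""
    n_bins = int(round((hi_exp - lo_exp) * bins_per_decade))
    counts = [0] * n_bins
    for a, b in zip(spike_times_s, spike_times_s[1:]):
        isi = b - a
        if isi <= 0:
            continue  # skip duplicate or unsorted time stamps
        idx = int(math.floor((math.log10(isi) - lo_exp) * bins_per_decade))
        if 0 <= idx < n_bins:
            counts[idx] += 1
    return counts


# Hypothetical spike train firing regularly every 20 ms (about 50 Hz).
example_spikes = [0.0, 0.02, 0.04, 0.06]
print(instantaneous_rates(example_spikes))
print(log_isi_histogram(example_spikes))
```

      A neuron modulated at two distinct gamma frequencies would be expected to show two separate bumps in such a histogram, whereas a single bump would be more consistent with one gamma-range mode.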

      c. There are some things generally concerning about Figure 2.

      i. First, the raw trace does not seem to have clear theta epochs (it is challenging to ascertain the start and end of a theta cycle). Certainly, it would be worth highlighting the relationship between theta and the gammas and picking a nice theta epoch.

      ii. Also, in panel A, there looks to be a declining amplitude relationship between the raw, fast, and slow gamma traces, assuming that the scale bars represent 100 µV in all three traces. The raw trace is significantly larger than the fast gamma. However, this relationship does not seem to hold in panel B (in which both the raw and unfiltered examples of slow and fast gamma appear to be equal; the right panels of B suggest that fast gamma is larger than slow, appearing to contradict the A = 1/f organization of the power spectral density). Please explain why this occurs. Including the power spectral density (see below) should resolve some of this.

      iii. Within the example of spiking to phase in the left side of Panel B (fast gamma example)- the neuron appears to fire near the trough twice, near the peak twice, and somewhere in between once. A similar relationship is observed for the slow gamma epoch. One would conclude from these plots that the interaction of the neuron with the two rhythms is the same. However, the mean vector lengths and histograms below these plots suggest a different story in which the neuron is modulated by FG but not SG. Please reconcile this.

      iv. For calculating the MVL, it seems that the number of spikes that the neuron fires would play a significant role. Working towards our next point, there may be a bias of finding a relationship if there are too few spikes (spurious clustering due to sparse data) and/or higher coupling values for higher firing rate cells (cells with higher firing rates will clearly show a relationship), forming a sort of inverse Yerkes-Dodson curve. Also, without understanding the magnitude of the MVL relative to other frequencies, it may be that these values are indeed larger than zero, but not biologically significant.

      - Please provide a scatter plot of Neuron MVL versus the Neuron's Firing Rate for 1) theta (7-9 Hz), 2) slow gamma, and 3) fast gamma, along with their line of best fit.

      - Please run a shuffle control where the LFP trace is shifted by random values between 125-1000ms and recalculate the MVL for theta, slow, and fast gamma. Often, these shuffle controls are done between 100-1000 times (see cross-correlation analyses of Fujisawa, Buzsaki et al.).

      - To establish that firing rate does not play a role in uncovering modulation, it would be worth conducting a spike number control, reducing the number of spikes per cell so that they are all equal before calculating the phase plots/MVL.
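      The three controls requested above can be sketched as follows. This is a minimal illustration under stated assumptions: spike phases are given in radians, the LFP phase is a per-sample list, spikes are sample indices, and all function names are hypothetical. It is not the authors' pipeline.

```python
import cmath
import math
import random


def mean_vector_length(phases):
    """Length of the mean resultant vector of spike phases (radians)."""
    if not phases:
        return float("nan")
    return abs(sum(cmath.exp(1j * p) for p in phases)) / len(phases)


def chance_mvl(n_spikes, n_repeats=100, rng=None):
    """Average MVL of n_spikes uniformly random phases (no true locking)."""
    rng = rng or random.Random(0)
    total = 0.0
    for _ in range(n_repeats):
        phases = [rng.uniform(-math.pi, math.pi) for _ in range(n_spikes)]
        total += mean_vector_length(phases)
    return total / n_repeats


def shifted_mvls(lfp_phase, spike_idx, fs, n_shuffles=200, rng=None):
    """MVLs after circularly shifting the phase trace by 125-1000 ms."""
    rng = rng or random.Random(0)
    n = len(lfp_phase)
    mvls = []
    for _ in range(n_shuffles):
        shift = int(round(rng.uniform(0.125, 1.0) * fs))
        mvls.append(mean_vector_length([lfp_phase[(i + shift) % n] for i in spike_idx]))
    return mvls


def match_spike_count(spike_idx, n_keep, rng=None):
    """Randomly keep at most n_keep spikes so that all cells contribute equally."""
    rng = rng or random.Random(0)
    spikes = list(spike_idx)
    return sorted(rng.sample(spikes, min(n_keep, len(spikes))))


# Sparse data alone inflate the MVL: compare 10 versus 1000 unlocked spikes.
few, many = chance_mvl(10), chance_mvl(1000)
print(few, many)
```

      Two caveats apply to this sketch. First, the chance-level comparison shows why a nonzero MVL from a low-rate cell is not in itself evidence of locking. Second, a circular shift of a perfectly periodic phase trace only rotates every spike phase by the same angle and leaves the MVL unchanged, so the shift control is informative only to the extent that the instantaneous frequency of the real LFP drifts over time.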

      (2) Something that I anticipated to see addressed in the manuscript was the study from Grosmark and Buzsaki (2016): "Cell assembly sequences during learning are "replayed" during hippocampal ripples and contribute to the consolidation of episodic memories. However, neuronal sequences may also reflect preexisting dynamics. We report that sequences of place-cell firing in a novel environment are formed from a combination of the contributions of a rigid, predominantly fast-firing subset of pyramidal neurons with low spatial specificity and limited change across sleep-experience-sleep and a slow-firing plastic subset. Slow-firing cells, rather than fast-firing cells, gained high place specificity during exploration, elevated their association with ripples, and showed increased bursting and temporal coactivation during postexperience sleep. Thus, slow- and fast-firing neurons, although forming a continuous distribution, have different coding and plastic properties."

      My concern is that much of the reported results in the present manuscript appear to recapitulate the observations of Grosmark and Buzsaki, but without accounting for differences in firing rate. A parsimonious alternative explanation for what is observed in the present manuscript is that high firing rate neurons, more integrated into the local network and orchestrating local gamma activity (PING), exhibit more coupling to theta and gamma. In this alternative perspective, it's not something special about how the neurons are entrained to the routed fast gamma, but that the higher firing rate neurons are better able to engage and entrain their local interneurons and, thus modulate local gamma. However, this interpretation challenges the discussion around the importance of fast gamma routed from the MEC.

      a. Please integrate the Grosmark & Buzsaki paper into the discussion.

      b. Also, please provide data that refutes or supports the alternative hypothesis in which the high firing rate cells are just more gamma modulated as they orchestrate local gamma activity through monosynaptic connections with local interneurons (e.g., Marshall et al., 2002, Hippocampal pyramidal cell-interneuron spike transmission is frequency dependent and responsible for place modulation of interneuron discharge). Otherwise, the attribution to an MEC-routed fast gamma seems tenuous.

      c. It is mentioned that fast-spiking interneurons were removed from the analysis. It would be worth including these cells, calculating the MVL in 1 Hz increments as well as the reciprocal of their ISIs (described above).

      (3) Methods - Spectral decomposition and Theta Harmonics.

      a. It is challenging to interpret the exact parameters that the authors used for their multi-taper analysis in the methods (lines 516-526). Tallon-Baudry et al., (1997; Oscillatory γ-Band (30-70 Hz) Activity Induced by a Visual Search Task in Humans) discuss a time-frequency trade-off where frequency resolution changes with different temporal windows of analysis. This trade-off between time and frequency resolution is well known as the uncertainty principle of signal analysis, transcending all decomposition methods. It is not only a function of wavelet or FFT, and multi-tapers do not directly address this. (The multitaper method, by using multiple specially designed tapers -like the Slepian sequences- smooths the spectrum. This smoothing doesn't eliminate leakage but distributes its impact across multiple estimates). Given the brevity of methods and the issues of theta harmonics as offered above, it is worth including some benchmark trace testing for the multi-taper as part of the supplemental figures.

      i. Please spectrally decompose an asymmetric 8 Hz sawtooth wave showing the trace and the related power spectral density using the multiple taper method discussed in the methods.

      ii. Please also do the same for an elliptical oscillation (perfectly symmetrical waves, but also capable of casting harmonics). Matlab code on how to generate this time series is provided below:

      A = 1; % Amplitude
      T = 1/8; % Period corresponding to 8 Hz frequency
      omega = 2*pi/T; % Angular frequency
      C = 1; % Wave speed
      m = 0.9; % Modulus for the elliptic function (0 < m < 1)
      x = linspace(0, 2*pi, 1000); % temporal domain
      t = 0; % Time instant

      % Calculate B based on frequency and speed
      B = sqrt(omega/C);

      % Cnoidal wave equation using the Jacobi elliptic function
      u = A .* ellipj(B.*(x - C*t), m).^2;

      % Plotting the cnoidal wave
      figure;
      plot(x./max(x), u);
      title('8 Hz Cnoidal Wave');
      xlabel('time (x)');
      ylabel('Wave amplitude (u)');
      grid on;

      The Symbolic Math Toolbox needs to be installed and accessible in your MATLAB environment to use ellipj. Otherwise, I trust that, rather than plotting a periodic orbit around a circle (sine wave), the authors can trace the movement around an ellipse with significant eccentricity (the distance between the two foci should be twice the distance between the co-vertices).
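      For the sawtooth benchmark in item (i), the expected outcome can be previewed with a plain discrete Fourier transform rather than the multitaper method. The sketch below is illustrative only: the 1 kHz sampling rate, the 2-second window, and the 30% rise fraction are assumptions, and the helper names are hypothetical. It evaluates the DFT amplitude at theta harmonics and at one non-harmonic frequency (36 Hz) inside the slow gamma band.

```python
import cmath
import math

FS = 1000            # sampling rate in Hz (assumed)
DURATION_S = 2.0     # 2-second window, i.e. 0.5 Hz frequency resolution
THETA_HZ = 8.0       # fundamental of the synthetic wave
RISE_FRACTION = 0.3  # fraction of each cycle spent rising (assumed asymmetry)


def asymmetric_sawtooth(n_samples, fs=FS, f0=THETA_HZ, rise=RISE_FRACTION):
    """Asymmetric triangle (sawtooth-like) wave scaled to the range [-1, 1]."""
    wave = []
    for i in range(n_samples):
        p = (i * f0 / fs) % 1.0  # position within the current cycle, 0..1
        y = p / rise if p < rise else (1.0 - p) / (1.0 - rise)
        wave.append(2.0 * y - 1.0)
    return wave


def dft_amplitude(signal, freq_hz, fs=FS):
    """One-sided DFT amplitude of the signal at a single frequency."""
    n = len(signal)
    acc = sum(x * cmath.exp(-2j * math.pi * freq_hz * i / fs)
              for i, x in enumerate(signal))
    return 2.0 * abs(acc) / n


signal = asymmetric_sawtooth(int(FS * DURATION_S))
amps = {f: dft_amplitude(signal, f) for f in (8, 16, 24, 32, 36, 40)}
for freq, amp in amps.items():
    print(freq, amp)
```

      With these settings the wave contains no gamma oscillation at all, yet it carries energy at 32 and 40 Hz (the 4th and 5th harmonics) while the 36 Hz bin stays at numerical zero. Any decomposition method applied to the benchmark trace should reproduce this pattern.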

      iii. Line 522: "The power spectra across running speeds and absolute power spectrum (both results were not shown)...". Given the potential complications of multi-taper discussed above, and as each convolution further removes one from the raw data, it would be the most transparent, simple, and straightforward to provide power spectra using the simple fft.m code in Matlab. (We imagine that the authors will agree that the results should be robust against different spectral decomposition methods. Otherwise, it is concerning that the results depend on the algorithm implemented and should be discussed. If gamma transience is a concern, the authors should trigger to 2-second epochs in which slow/fast gamma exceeds 3-7 std. dev. above the mean, comparing those resulting power spectra to 2-second epochs with ripples, which are also transient events.) The time series should be at least 2 seconds in length (to avoid spectral leakage issues and the issues discussed in Tallon-Baudry et al., 1997 above).

      Please show the unmolested power spectra (Y-axis units in mV²/Hz, X-axis units in Hz) as a function of running speed (increments of 5 cm/s) for each animal. I imagine three of these PSDs for 3 of the animals will appear in supplemental methods while one will serve as a nice manuscript figure. With this plot, please highlight the regions that the authors are describing as theta, slow, and fast gamma. Also, any notable differences in power across animals or tetrodes should be addressed (issues with locations along proximal-distal CA1 in terms of MEC/LEC input and using a local reference electrode are discussed below).

      iv. Schomburg and colleagues (2014) suggested that the modulation of neurons in the slow gamma range could be related to theta harmonics (see above). Harmonics can extend almost indefinitely as they regress into the 1/f background (contributing to power, but without a peak above the power spectral density slope), making arbitrary frequency limits inappropriate. Therefore, in order to support the analyses and assertions regarding slow gamma, it seems necessary to calculate a "theta harmonic/slow gamma ratio". Aru et al. (2015; Untangling cross-frequency coupling in neuroscience) offer that: "The presence of harmonics in the signal should be tested by a bicoherence analysis and its contribution to CFC should be discussed." Please test both the synthetic signals above and the raw LFP, using temporal windows of greater than 4 seconds (again, the large window optimizes for frequency resolution in the time-frequency trade-off) to calculate the bicoherence. As harmonics are integer multiples of theta coupled to itself and slow gamma is also coupled to theta, a nice illustration and contribution to the field would be a method that uses the bispectrum to isolate and create a "slow gamma/harmonic" ratio.

      (4) I appreciate the inclusion of the histology for the 4 animals. Knierim and colleagues describe a difference in MEC projection along the proximal-distal axis of the CA1 region (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3866456/): "There are also differences in their direct projections along the transverse axis of CA1, as the LEC innervates the region of CA1 closer to the subiculum (distal CA1), whereas the MEC innervates the region of CA1 closer to CA2 and CA3 (proximal CA1)". From the histology, it looks like some of the electrodes are in the part of CA1 that would be dominated by LEC input while a few are closer to where the MEC would project.

      a. How do the authors control for these differences in projections? Wouldn't this change whether or not fast gamma is observed in CA1?

      b. I am only aware of one manuscript that describes slow gamma in the LEC which appeared in contrast to fast gamma from the MEC (https://www.science.org/doi/10.1126/science.abf3119). One would surmise that the authors in the present manuscript would have varying levels of fast gamma in their CA1 recordings depending on the location of the electrodes in the Proximal-distal axis, to the extent that some of the more medial tetrodes may need to be excluded (as they should not have fast gamma, rather they should be exclusively dominated by slow gamma). Alternatively, the authors may find that there is equal fast gamma power across the entire proximal-distal axis. However, this would pose a significant challenge to the LEC/slow gamma and MEC/fast gamma routing story of Fernandez-Ruiz et al. and require reconciliation/discussion.

      c. Is there a difference in neuron modulation to these frequencies based on electrode location in CA1?

      (5) Given a comment in the discussion (see below), it will be worth exploring changes in theta, theta harmonic, slow gamma, and fast gamma power with running speed, as no changes were observed with theta sequences or lap number. Notably, Czurkó et al. (1999) report an increase in theta and harmonic power with running speed, while Ahmed and Mehta (2012) report a similar effect for gamma.

      a. Please determine whether the power and frequency of the rhythms discussed above change with running speed, using the same parameters applied in the present manuscript. The specific concern is that the way the authors calculate running speed is not sensitive enough to evaluate changes.

      b. It is astounding that animals ran as fast as they did in what appears to be the first lap (Figure 3F), especially as rats' natural proclivity is thigmotaxis and inquisitive exploration in novel environments. Can the authors expand on why they believe their rats ran so quickly on the first lap in a novel environment and how to replicate this? Also, please include the individual values for each animal on the same plot.

      c. Can the authors explain how the statistics on line 169 (F(4,44)) work? Specifically, it is challenging to determine how the degrees of freedom were calculated in this case and throughout if there were only 4 animals (reported in methods) over 5 laps (depicted in Figure 3F. Given line 439, it looks like trials and laps are used synonymously). Four animals over 5 laps should have a DOF of 16.

      (6) Throughout the manuscript, I am concerned about an inflation of statistical power. For example, on line 162, F(2,4844). The large degrees of freedom indicate that the unit of analysis was theta sequences or cells rather than animals. Since multiple observations were obtained from the same animal, the statistical assumption of independence is violated. Therefore, the stats need to be conducted using a nested model as described in Aarts et al. (2014; https://pubmed.ncbi.nlm.nih.gov/24671065/). A statistical consult may be warranted.
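      The scale of the problem can be illustrated with the standard design-effect approximation for clustered data, in which the effective sample size is N / (1 + (m - 1) * ICC) for clusters of m observations each. The sketch below is purely illustrative: it assumes that F(2,4844) reflects roughly 4847 observations treated as independent, that they are split evenly across the 4 animals, and it uses hypothetical intraclass correlation (ICC) values, since the true ICC is unknown.

```python
def effective_sample_size(n_obs, n_clusters, icc):
    """Design-effect approximation for equally sized clusters."""
    per_cluster = n_obs / n_clusters               # observations per animal
    design_effect = 1.0 + (per_cluster - 1.0) * icc
    return n_obs / design_effect


N_OBS = 4847     # assumption: implied by F(2, 4844) with three groups
N_ANIMALS = 4    # number of animals reported in the methods

for icc in (0.0, 0.05, 0.1, 0.5, 1.0):  # hypothetical ICC values
    print(icc, round(effective_sample_size(N_OBS, N_ANIMALS, icc), 1))
```

      Even a modest within-animal correlation shrinks thousands of nominal observations to a few dozen effective ones under this approximation, and at an ICC of 1 the effective sample size equals the number of animals. A nested (mixed-effects) model handles this dependence directly and remains the recommended analysis.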

      (7) It is stated that one tetrode served as a quiet recording reference. The "quiet" part is an assumption when often, theta and gamma can be volume conducted to the cortex (e.g., Sirota et al., 2008; this is often why laboratories that study hippocampal rhythms use the cerebellum for the differential recording electrode and not an electrode in the corpus callosum). Generally, high frequencies propagate as well as low frequencies in the extracellular milieu (https://www.eneuro.org/content/4/1/ENEURO.0291-16.2016). For transparency, the authors should include a limitation paragraph in their discussion that describes how their local tetrode reference may be inadvertently diminishing and/or distorting the signal that they are trying to isolate. Otherwise, it would be worth hearing an explanation as to how the authors' approach avoids this issue.

      Apologetically, this review is already getting long. Moreover, I have substantial concerns that should be resolved prior to delving into the remainder of the analyses. For example, the analyses related to Figures 3-5 assert that FG cells are important for sequences. However, the relationship to gamma may be secondary to either their relationship to theta or, based on the Grosmark and Buzsaki paper, it may just be a phenomenon coupled to the fast-firing cells (fast-firing cells showing higher gamma modulation due to a local PING dynamic). Moreover, the observation of slow gamma is being challenged as theta harmonics, even by the major proponents of the slow/fast gamma theory. Therefore, the report of slow gamma precession would come as an unsurprising extension should they be revealed to be theta harmonics (however, no control for harmonics was implemented; suggestions were made above). Following these amendments, I would be grateful for the opportunity to provide further feedback.

      III. Discussion.

      a. Line 330- it was offered that fast gamma encodes information while slow gamma integrates in the introduction. However, in a task such as circular track running (from the methods, it appears that there is no new information to be acquired within a trial), one would guess that after the first few laps, slow gamma would be the dominant rhythm. Therefore, one must wonder why there are so few neurons modulated by slow gamma (~3.7%).

      b. Line 375: The authors contend that: "...slow gamma, related to information compression, was also required to modulate fast gamma phase-locked cells during sequence development. We replicated the results of slow gamma phase precession at the ensemble level (Zheng et al., 2016), and furthermore observed it at late development, but not early development, of theta sequences." In relation to the idea that slow gamma may be coupled to - if not a distorted representation of - theta harmonics, it has been observed that there are changes in theta relative to novelty.

      i. A. Jeewajee, C. Lever, S. Burton, J. O'Keefe, and N. Burgess (2008) report a decrease in theta frequency in novel circumstances that disappears with increasing familiarity.

      ii. One could surmise that this change in frequency is associated with alterations in theta harmonics (observed here as slow gamma), challenging the authors' interpretation.

      iii. Therefore, the authors have a compelling opportunity to replicate the results of Jeewajee et al., characterizing changes of theta along with the development of slow gamma precession, as the environment becomes familiar. It will become important to demonstrate, using bicoherence as offered by Aru et al., how slow gamma can be disambiguated from theta harmonics. Specifically, we anticipate that the authors will be able to quantify A) theta harmonics (the number, and their respective frequencies and amplitudes), B) the frequency and amplitude of slow gamma, and C) how they can be quantitatively decoupled. Through this, their discussion of oscillatory changes with novelty-familiarity will garner a significant impact.

      c. Broadly, it is interesting that the authors emphasize the gamma frequency throughout the discussion. Given that the power spectral density of the Local Field Potential (LFP) exhibits a log-log relationship between amplitude and frequency, as described by Buzsáki (2005) in "Rhythms of the Brain," and considering that the LFP is primarily generated through synaptic transmembrane currents (Buzsáki et al., 2012), it seems parsimonious to consider that the bulk of synaptic activity occurs at lower frequencies (e.g., theta). Since synaptic transmission represents the most direct form of inter-regional communication, one might wonder why gamma (characterized by lower amplitude rhythms) is esteemed so highly compared to the higher amplitude theta rhythm. Why isn't the theta rhythm, instead, regarded as the primary mode of communication across brain regions? A discussion exploring this question would be beneficial.

    1. eLife assessment

      This study provides important information about the formation of ribbon synapses in mouse cochlear hair cells, which facilitate the temporally-precise transmission of acoustic information to the auditory nerve. Live-cell imaging provides compelling evidence that ribbon precursor volume is dynamically modified by fission and fusion events on microtubules, but some of the other evidence included, particularly in relation to the directed transport of these precursors to the hair cell active zone, is incomplete. These findings will be of interest to neuroscientists studying synapse formation and function and should inspire further research into the molecular basis for synaptic ribbon maturation.

    2. Reviewer #1 (Public Review):

      Summary

      The manuscript by Voorn and collaborators aims at deciphering microtubule-dependent ribbon formation in mouse hair cells. Using STED/confocal imaging, pharmacological tools, and a mouse mutant, the group of Christian Vogl convincingly demonstrated that the ribbon, the organelle that tethers vesicles at the hair cell synapse, results from the fusion and fission of ribbon precursors moving along the microtubule network. This study goes hand in hand with a complementary paper (Hussain et al.) showing similar findings in zebrafish hair cells.

      Strengths

      This study demonstrated that i) ribbon precursors move along the microtubules, ii) ribbon precursors undergo multiple cycles of fusion-fission events, and iii) the kinesin Kif1a is critical for synaptic maturation. The results are solid and the images are mesmeric.

      Weaknesses

      As stated by the authors in the discussion, the mechanism underlying the threshold shift in the Kif1a mutant is unclear and may not be solely attributed to the reduction of the ribbon volume.

      Impact

      Synaptogenesis in the auditory sensory cell remains elusive. This study shows a high degree of plasticity in synaptogenesis: the formation of the synaptic organelle is a dynamic process consisting of several rounds of fusion-fission of presynaptic elements. This study will undoubtedly boost a new line of research aimed at identifying the specific molecular determinants that target ribbon precursors to the synapse and govern the fusion-fission process.

    3. Reviewer #2 (Public Review):

      Summary

      This manuscript makes use of live cell imaging to look at aggregates of the synaptic ribbon protein ribeye to explore synapse formation in an organotypic culture system. The authors find that microtubule disruption influences the motion of a subset of ribeye spots and changes to ribbon volume. Disruption of the microtubule motor is also found to change ribeye motion and ribbon volume, albeit in the opposite direction. Together these results support a role for microtubule-based transport in synapse assembly.

      Strengths

      (1) The use of the in vitro imaging approach provides a method for high-quality live cell imaging in a mammalian preparation.

      (2) The data characterizing the movement of Ribeye in the cochlea is new and exciting.

      (3) The role of motors in the delivery of Ribeye to the synapse had never been established. The effects of nocodazole on directional asymmetry for the subset of slow-moving particles are convincing, though it is unclear to this reviewer how frequently these objects undergo directed motion.

      (4) The effect of Kif1a on ribbon size is an interesting finding that doesn't rely on overexpression and supports the importance of motors on the delivery of ribeye to the synapse.

      Weaknesses

      (1) The analysis leaves unclear what fraction of ribeye spots make use of active transport mechanisms. The authors make the claim that 54% underwent targeted transport because fits of their MSD vs time were best-fit by an exponent >1. This overstates the reliability of this approach. Purely diffusive motion will not always fit perfectly with an exponent of exactly 1, and one would expect roughly half of the spots to have an exponent greater than 1 and half less than 1, which is what they observe. In point of fact, truly directed transport should have an exponent near 2 (Figure 2F), which only a handful of spots seem to exhibit. I should also note that none of the examples look like those that are typically associated with directed motion.

      (2) The imaging approach makes use of viral expression using a non-Ribeye promoter. This overexpression approach will likely exaggerate the number of ribeye spots and could saturate binding to other proteins or other factors. Also, the promoters aren't under the control of feedback mechanisms that would typically turn off expression at the appropriate time.

      (3) The effect of Kif1A removal on the ABR threshold is very unlikely to be due to ribbon size. Complete removal of the ribbon only has a modest effect on the ABR threshold, so these modest reductions in size are unlikely to contribute much.

      (4) Fusion and fission of small aggregates are difficult to resolve with light microscopy and the examples provided in Figure 3 are indistinguishable from two spots that happen to be too close to each other to resolve.

      (5) The "slight left shift" in the velocity distribution in Figure 5C does not look significant. Is it?

      (6) Nocodazole and elimination of Kif1a have opposite effects on ribbon volume, which might point to alternative roles for the microtubules.
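
      The statistical point in weakness (1) above can be checked with a small simulation. The following is a minimal sketch, not the authors' analysis: it generates purely diffusive 2D tracks, fits the exponent of MSD ~ lag^alpha on a log-log scale, and reports what fraction of fits exceed 1. The track length, number of lags, step sizes, and function names are all illustrative assumptions. A track with constant drift is included for comparison, since directed motion should give an exponent near 2.

```python
import math
import random


def random_walk(n_steps, rng, drift=0.0, step_sd=1.0):
    """2D walk: Gaussian steps plus an optional constant drift along x."""
    x = y = 0.0
    path = [(x, y)]
    for _ in range(n_steps):
        x += drift + rng.gauss(0.0, step_sd)
        y += rng.gauss(0.0, step_sd)
        path.append((x, y))
    return path


def msd(path, max_lag):
    """Time-averaged mean squared displacement for lags 1..max_lag."""
    values = []
    for lag in range(1, max_lag + 1):
        sq = [
            (path[i + lag][0] - path[i][0]) ** 2
            + (path[i + lag][1] - path[i][1]) ** 2
            for i in range(len(path) - lag)
        ]
        values.append(sum(sq) / len(sq))
    return values


def fit_exponent(msd_values):
    """Least-squares slope of log(MSD) against log(lag): alpha in MSD ~ lag**alpha."""
    xs = [math.log(lag) for lag in range(1, len(msd_values) + 1)]
    ys = [math.log(v) for v in msd_values]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


def fraction_above_one(n_tracks=200, n_steps=60, max_lag=15, seed=0):
    """Fraction of purely diffusive tracks whose fitted exponent exceeds 1."""
    rng = random.Random(seed)
    alphas = [
        fit_exponent(msd(random_walk(n_steps, rng), max_lag))
        for _ in range(n_tracks)
    ]
    return sum(a > 1.0 for a in alphas) / n_tracks


if __name__ == "__main__":
    print("diffusive tracks with alpha > 1:", fraction_above_one())
    # Assumed directed track: unit drift per step with small positional noise.
    directed = random_walk(60, random.Random(1), drift=1.0, step_sd=0.1)
    print("alpha for a directed track:", fit_exponent(msd(directed, 15)))
```

      Under these assumptions, a sizeable share of diffusive tracks is expected to fit with an exponent above 1 purely by chance, so an exponent threshold of 1 alone would not separate directed from diffusive motion.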

    4. Reviewer #3 (Public Review):

      Summary

      In this study, the authors addressed the question of how synaptic ribbons (specialized, electron-dense presynaptic structures) are formed from ribbon precursors in sensory hair cells. Specifically, the authors evaluated whether molecular motor-driven, microtubule-based transport plays a role in the directed transport of ribbon precursors to the active zone of cochlear hair cells and assessed whether there was a specific role for the microtubule motor Kinesin Family Member 1A (Kif1a). Using live imaging of cochlear explants and fixed images of both mature and developing cochlea, they provide evidence that ribbon precursors are actively transported on microtubules, that ribbon precursor volume is dynamically modified by fission and fusion events on microtubules, and that Kif1a plays a role in synaptic ribbon maturation.

      Strengths

      Overall, the data presented in this study support that the fission and fusion of ribbon precursors are dependent on microtubule-based translocation, and this dynamic assembly of precursors may involve Kif1a. Live-imaging data and analysis provide strong evidence for microtubule-based transport contributing to dynamic fission-fusion events of ribbon precursors. Further, fixed image analysis of Kif1a mutants supports that it plays a key role in synaptic ribbon maturation.

      Weaknesses

      While the authors clearly established the polarity and stability of microtubules in hair cells, they did not assess the net direction of putative slow microtubule-based movement (i.e. the ratios of plus to minus end-directed travel) in their analysis of ribbon precursor displacement. This information is critical in establishing a role for microtubule-based transport in localizing ribbon precursors to the active zones in the basolateral region of hair cells to form presynaptic ribbons. In addition, the discussion section did not elaborate on what is known about the coordination of molecular motor proteins during microtubule-based transport nor did it effectively incorporate the interpretation of the results with what has been described in previous studies on intracellular transport and the roles of Kif1a in synaptic vesicle precursor trafficking.

    1. eLife assessment

      This valuable study investigates the brain representations of Braille letters in blind participants and provides convincing evidence using EEG and fMRI that the decoding of letter identity across the reading hand takes place in the visual cortex. The evidence supporting the claims of the authors is solid, although the inclusion of a sighted control group and additional analyses would have strengthened the study. The work will be of interest to neuroscientists working on brain plasticity.

    2. Reviewer #1 (Public Review):

      Summary:

      The researchers examined how individuals who were born blind or lost their vision early in life process information, specifically focusing on the decoding of Braille characters. They explored the transition of Braille character information from tactile sensory inputs, based on which hand was used for reading, to perceptual representations that are not dependent on the reading hand.

      They identified tactile sensory representations in areas responsible for touch processing and perceptual representations in brain regions typically involved in visual reading, with the lateral occipital complex serving as a pivotal "hinge" region between them.

      In terms of temporal information processing, they discovered that tactile sensory representations occur prior to cognitive-perceptual representations. The researchers suggest that this pattern indicates that even in situations of significant brain adaptability, there is a consistent chronological progression from sensory to cognitive processing.

      Strengths:

      By combining fMRI and EEG, and focusing on the diagnostic case of Braille reading, the paper provides an integrated view of the transformation processing from sensation to perception in the visually deprived brain. Such a multimodal approach is still rare in the study of human brain plasticity and allows us to discern the nature of information processing in blind people's early visual cortex, as well as the time course of information processing in a situation of significant brain adaptability.

      Weaknesses:

      The lack of a sighted control group limits the interpretations of the results in terms of profound cortical reorganization, or simple unmasking of the architectural potentials already present in the normally developing brain. Moreover, the conclusions regarding the behavioral relevance of the sensory and perceptual representations in the putatively reorganized brain are limited due to the behavioral measurements adopted.

    3. Reviewer #2 (Public Review):

      Summary:

      Haupt and colleagues performed a well-designed study to test the spatial and temporal gradient of perceiving braille letters in blind individuals. Using cross-hand decoding of the read letters, and comparing it to the decoding of the read letter for each hand, they defined perceptual and sensory responses. Then they compared where (using fMRI) and when (using EEG) these were decodable. Using fMRI, they showed that low-level tactile responses specific to each hand are decodable from the primary and secondary somatosensory cortex as well as from IPS subregions, the insula, and LOC. In contrast, more abstract representations of the braille letter independent from the reading hand were decodable from several visual ROIs, LOC, VWFA, and surprisingly also EVC. Using a parallel EEG design, they showed that sensory hand-specific responses emerge in time before perceptual braille letter representations. Last, they used RSA to show that the behavioral similarity of the letter pairs correlates to the neural signal of both fMRI (for the perceptual decoding, in visual and ventral ROIs) and EEG (for both sensory and perceptual decoding).
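
      The logic of cross-hand decoding described in this summary can be illustrated with synthetic data. This is a hedged sketch under stated assumptions, not the authors' pipeline: a nearest-centroid classifier stands in for whatever classifier the study used, and the simulated "regions", pattern sizes, and noise levels are hypothetical. In a region with a hand-independent letter code, a classifier trained on one hand should transfer to the other; in a region where the letter code differs by hand, within-hand decoding can succeed while cross-hand decoding should stay near chance.

```python
import random


def make_trials(letter_patterns, hand_pattern, n_per_letter, noise_sd, rng):
    """Synthetic trials: letter pattern + hand pattern + Gaussian noise."""
    trials = []
    for label, pattern in enumerate(letter_patterns):
        for _ in range(n_per_letter):
            x = [p + h + rng.gauss(0.0, noise_sd)
                 for p, h in zip(pattern, hand_pattern)]
            trials.append((x, label))
    return trials


def fit_centroids(trials, n_classes):
    """Mean response pattern per letter."""
    n_features = len(trials[0][0])
    sums = [[0.0] * n_features for _ in range(n_classes)]
    counts = [0] * n_classes
    for x, label in trials:
        counts[label] += 1
        for j, v in enumerate(x):
            sums[label][j] += v
    return [[s / counts[c] for s in sums[c]] for c in range(n_classes)]


def accuracy(centroids, trials):
    """Fraction of trials assigned to the nearest (Euclidean) centroid of their own letter."""
    correct = 0
    for x, label in trials:
        dists = [sum((a - b) ** 2 for a, b in zip(x, c)) for c in centroids]
        correct += dists.index(min(dists)) == label
    return correct / len(trials)


def simulate_region(shared_code, seed=0, n_letters=8, n_features=20):
    """Return (within-hand accuracy, cross-hand accuracy) for one simulated region."""
    rng = random.Random(seed)

    def new_patterns():
        return [[rng.gauss(0.0, 1.0) for _ in range(n_features)]
                for _ in range(n_letters)]

    left_letters = new_patterns()
    # Assumption: a hand-independent region reuses the same letter patterns.
    right_letters = left_letters if shared_code else new_patterns()
    hands = [[rng.gauss(0.0, 0.5) for _ in range(n_features)] for _ in range(2)]

    train_left = make_trials(left_letters, hands[0], 20, 0.5, rng)
    test_left = make_trials(left_letters, hands[0], 20, 0.5, rng)
    test_right = make_trials(right_letters, hands[1], 20, 0.5, rng)

    centroids = fit_centroids(train_left, n_letters)
    return accuracy(centroids, test_left), accuracy(centroids, test_right)


if __name__ == "__main__":
    print("hand-independent code (within, cross):", simulate_region(True))
    print("hand-specific code (within, cross):", simulate_region(False))
```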

      Strengths:

      This is a very well-designed study and it is analyzed well. The writing clearly describes the analyses and results. Overall, the study provides convincing evidence from EEG and fMRI that the decoding of letter identity across the reading hand occurs in the visual cortex in blindness. Further, it addresses important questions about the visual cortex hierarchy in blindness (whether it parallels that of the sighted brain or is inverted) and its link to braille reading.

      Weaknesses:

      Although I have some comments and requests for clarification about the details of the methods, my main comment is that the manuscript could benefit from expanding its discussion. Specifically, I'd appreciate the authors drawing clearer theoretical conclusions about what this data suggests about the direction of information flow in the reorganized visual system in blindness, the role VWFA plays in blindness (revised from the original sighted role or similar to it?), how information arrives at the visual cortex, and what the authors' predictions would be if a parallel experiment were carried out in sighted people (is this a multisensory recruitment or reorganization?). The data has the potential to speak to a lot of questions about the scope of brain plasticity, and that would interest broad audiences.

      To aid in drawing even more concrete conclusions about the flow of information, I suggest that the authors also add at least another early visual ROI to plot more clearly whether EVC's response to braille letters arrives there through an inverted cortical hierarchy, intermediate stages from VWFA, or directly, as found in the sighted brain for spoken language.

      Similarly, it may be informative to look specifically at the occipital electrodes' time differences between decoding for the different parameters and their correlation to behavior.

      Regarding the methods, further detail on the ability to read with both hands equally and any residual vision of the participants would be helpful.

    1. eLife assessment

      This valuable study uses recently developed EEG analysis methods to investigate spatial distractor suppression in a combined visual search/working memory task. While the reported results are convincing, the combined task design leaves open alternative interpretations than those currently discussed in the manuscript, potentially limiting the generalisability of the findings to other task settings. The study will be of interest to cognitive neuroscientists and psychologists working on visual attention and memory.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors tested whether learning to suppress (ignore) salient distractors (e.g., a lone colored nontarget item) via statistical regularities (e.g., the distractor is more likely to appear in one location than any other) was proactive (prior to paying attention to the distractor) or reactive (only after first attending the distractor) in nature. To test between proactive and reactive suppression the authors relied on a recently developed and novel technique designed to "ping" the brain's hidden priority map using EEG inverted encoding models. Essentially, a neutral stimulus is presented to stimulate the brain, resulting in activity on a priority map which can be decoded and used to argue when this stimulation occurred (prior to or after attending to a distracting item). The authors found evidence that despite learning to suppress the high probability distractor location, the suppression was reactive, not proactive in nature.

      Overall, the manuscript is well-written, tests a timely question, and provides novel insight into a long-standing debate concerning distractor suppression.

      Strengths (in no particular order):

      (1) The manuscript is well-written, clear, and concise (especially given the complexities of the method and analyses).

      (2) The presentation of the logic and results is mostly clear and relatively easy to digest.

      (3) This question concerning whether location-based distractor suppression is proactive or reactive in nature is a timely question.

      (4) The use of the novel "pinging" technique is interesting and provides new insight into this particularly thorny debate over the mechanisms of distractor suppression.

      Weaknesses (in no particular order):

      (1) The authors tend to make overly bold claims without either A) mentioning the opposing claim(s) or B) citing the opposing theoretical positions. Further, the authors have neglected relevant findings regarding this specific debate between proactive and reactive suppression.

      (2) The authors should be more careful in setting up the debate by clearly defining the terms, especially proactive and reactive suppression which have recently been defined and were more ambiguously defined here.

      (3) There were some methodological choices that should be further justified, such as the choice of stimuli (e.g., sizes, colors, etc.).

      (4) The figures are often difficult to process. For example, the time courses are so far zoomed out (i.e., 0, 500, 100 ms with no other tick marks) that it makes it difficult to assess the timing of many of the patterns of data. Also, there is a lot of baseline period noise which complicates the interpretations of the data of interest.

      (5) Sometimes the authors fail to connect to the extant literature (e.g., by connecting to the ERP components, such as the N2pc and PD components, used to argue for or against proactive suppression) or when they do, overreach with claims (e.g., arguing suppression is reactive or feature-blind more generally).

    3. Reviewer #2 (Public Review):

      Summary:

      The authors investigate the mechanisms supporting learning to suppress distractors at predictable locations, focusing on proactive suppression mechanisms manifesting before the onset of a distractor. They used EEG and inverted encoding models (IEM). The experimental paradigm alternates between a visual search task and a spatial memory task, followed by a placeholder screen acting as a 'ping' stimulus, i.e., a stimulus to reveal how learned distractor suppression affects hidden priority maps. Behaviorally, their results align with the effects of statistical learning on distractor suppression. Contrary to the proactive suppression hypothesis, which predicts reduced memory-specific tuning of neural representations at the expected distractor location, their IEM results indicate increased tuning at the high-probability distractor location following the placeholder and prior to the onset of the search display.

      Strengths:

      Overall, the manuscript is well-written and clear, and the research question is relevant and timely, given the ongoing debate on the roles of proactive and reactive components in distractor processing. The use of a secondary task and EEG/IEM to provide a direct assessment of hidden priority maps in anticipation of a distractor is, in principle, a clever approach. The study also provides behavioral results supporting prior literature on distractor suppression at high-probability locations.

      Weaknesses:

      (1) At a conceptual level, I understand the debate and opposing views, but I wonder whether it might be more comprehensive to present also the possibility that both proactive and reactive stages contribute to distractor suppression. For instance, anticipatory mechanisms (proactive) may involve expectations and signals that anticipate the expected distractor features, whereas reactive mechanisms contribute to the suppression and disengagement of attention.

      (2) The authors focus on hidden priority maps in pre-distractor time windows, arguing that the results challenge a simple proactive view of distractor suppression. However, they do not provide evidence that reactive mechanisms are at play or related to the pinging effects found in the present paradigm. Is there a relationship between the tuning strength of CTF at the high-probability distractor location and the actual ability to suppress the distractor (e.g., behavioral performance)? Is there a relationship between CTF tuning and post-distractor ERP measures of distractor processing? While these may not be the original research questions, they emerge naturally and I believe should be discussed or noted as limitations.

      (3) How do the authors ensure that the increased tuning (which appears more as a half-split or hemifield effect rather than gradual fine-grained tuning, as shown in Figure 5) is not a byproduct of the dual-task paradigm used, rather than a general characteristic of learned attentional suppression? For example, the additional memory task and the repeated experience with the high-probability distractor at the specific location might have led to longer-lasting and more finely-tuned traces for memory items at that location compared to others.

      (4) It is unclear how IEM was performed on total vs. evoked power, compared to typical approaches of running it on single trials or pseudo-trials.

      (5) Following on point 1. What is the rationale for relating decreased (but not increased) tuning of CTF to proactive suppression? Could it be that proactive suppression requires anticipatory tuning towards the expected feature to implement suppression? In other terms, better 'tuning' does not necessarily imply a higher signal amplitude and could be observable even under signal suppression. The authors should comment on this and clarify.

      Minor:

      (1) In the Word file I reviewed, there are minor formatting issues, such as missing spaces, which should be double-checked.

      (2) Would the authors predict that proactive mechanisms are not involved in other forms of attention learning involving distractor suppression, such as habituation?

      (3) A clear description in the Methods section of how individual CTFs for each location were derived would help in understanding the procedure.

      (4) Why specifically 1024 resampling iterations?

    4. Reviewer #3 (Public Review):

      Summary:

      In this experiment, the authors use a probe method along with time-frequency analyses to ascertain the attentional priority map prior to a visual search display in which one location is more likely to contain a salient distractor. The main finding is that neural responses to the probe indicate that the high probability location is attended, rather than suppressed, prior to the search display onset. The authors conclude that suppression of distractors at high-probability locations is a result of reactive, rather than proactive, suppression.

      Strengths:

      This was a creative approach to a difficult and important question about attention. The use of this "pinging" method to assess the attentional priority map has a lot of potential value for a number of questions related to attention and visual search. Here as well, the authors have used it to address a question about distractor suppression that has been the subject of competing theories for many years in the field. The paper is well-written, and the authors have done a good job placing their data in the larger context of recent findings in the field.

      Weaknesses:

      The link between the memory task and the search task could be explored in greater detail. For example, how might attentional priority maps change because of the need to hold a location in working memory? This might limit the generalizability of these findings. There could be more analysis of behavioral data to address this question. In addition, the authors could explore the role that intertrial repetition plays in the attentional priority map as these factors necessarily differ between conditions in the current design. Finally, the explanation of the CTF analyses in the results could be written more clearly for readers who are less familiar with this specific approach (which has not been used in this field much previously).

    1. eLife assessment

      This important work uses in vivo foveal cone-resolved imaging and simultaneous microscopic photostimulation to investigate the relationship between ocular drift - eye movements long thought to be random - and visual acuity. The surprising result is that ocular drift is systematic - causing the object to move to the center of the cone mosaic over the course of each perceptual trial. The tools used to reach this conclusion are state-of-the-art and the evidence presented is convincing. This work advances our understanding of the visuomotor system and the interplay of anatomy, oculomotor behavior, and visual acuity.

    2. Reviewer #1 (Public Review):

      Summary:

      This paper investigates the relationship between ocular drift - eye movements long thought to be random - and visual acuity. This is a fundamental issue for how vision works. The work uses adaptive optics retinal imaging to monitor eye movements and where a target object is in the cone photoreceptor array. The surprising result is that ocular drift is systematic - causing the object to move to the center of the cone mosaic over the course of each perceptual trial. The tools used to reach this conclusion are state-of-the-art and the evidence presented is convincing.

      Strengths

      The central question of the paper is interesting and, as far as I know, has not been answered in past work. The approaches employed in this work are appropriate and provide clear answers.

      The central finding - that ocular drift is not a completely random process - is important and has a broad impact on how we think about the relationship between eye movements and visual perception.

      The presentation is quite nice: the figures clearly illustrate key points and have a nice mix of primary and analyzed data, and the writing (with one important exception) is generally clear.

      Weaknesses

      The handling of the Nyquist limit is confusing throughout the paper and could be improved. It is not clear (at least to me) how the Nyquist limit applies to the specific task considered. I think of the Nyquist limit as saying that spatial frequencies above a certain cutoff set by the cone spacing are being aliased and cannot be disambiguated from the structure at a lower spatial frequency. In other words, there is a limit to the spatial frequency content that can be uniquely represented by discrete cone sampling locations. Acuity beyond that limit is certainly possible with a stationary image - e.g. a line will set up a distribution of responses in the cones that it covers, and without noise, an arbitrarily small displacement of the line would change the distribution of cone responses in a way that could be resolved. This is an important point because it relates to whether some kind of active sampling or movement of the detectors is needed to explain the spatial resolution results in the paper. This issue comes up in the introduction, results, and discussion. It arises in particular in the two Discussion paragraphs starting on line 343.
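
      The distinction drawn in this paragraph can be made concrete with a toy one-dimensional sketch. It is an illustration under stated assumptions, not a model of the cone mosaic or of the study's task: detectors sit at unit spacing, the line is blurred by a Gaussian one spacing wide, responses are noise-free, and position is read out with a simple centroid. The first part shows aliasing (a grating above the Nyquist frequency yields the same samples as one below it). The second shows that a displacement much smaller than the detector spacing can still be recovered from the distribution of responses to a stationary line.

```python
import math


def nyquist_frequency(spacing):
    """Highest spatial frequency (cycles per unit length) that a 1D array
    with the given detector spacing can represent unambiguously."""
    return 1.0 / (2.0 * spacing)


def sample_grating(freq, positions):
    """Cosine grating sampled at the detector positions."""
    return [math.cos(2.0 * math.pi * freq * x) for x in positions]


def line_responses(line_pos, positions, blur_sd=1.0):
    """Noise-free detector responses to a line blurred by a Gaussian (assumed optics)."""
    return [math.exp(-((x - line_pos) ** 2) / (2.0 * blur_sd ** 2))
            for x in positions]


def centroid(responses, positions):
    """Response-weighted mean position: a simple population read-out."""
    return sum(r * x for r, x in zip(responses, positions)) / sum(responses)


# Assumed detector array: 21 detectors at unit spacing.
positions = [float(i) for i in range(-10, 11)]

if __name__ == "__main__":
    print("Nyquist frequency:", nyquist_frequency(1.0))

    # Aliasing: 0.7 cycles/unit is above the limit and matches 0.3 cycles/unit.
    above = sample_grating(0.7, positions)
    below = sample_grating(0.3, positions)
    print("max sample difference:", max(abs(a - b) for a, b in zip(above, below)))

    # Sub-spacing localization of a stationary line displaced by 0.05 spacings.
    estimate = centroid(line_responses(0.05, positions), positions)
    print("estimated line position:", estimate)
```

      In this sketch the two gratings are indistinguishable after sampling, while the small line displacement is recovered closely, which is consistent with the argument that resolving fine displacements does not by itself require moving the detectors. Adding response noise would limit the achievable precision.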

      One question that came up as I read the paper was whether the eye movement parameters depend on the size of the E. In other words, to what extent is ocular drift tuned to specific behavioral tasks?

    3. Reviewer #2 (Public Review):

      Summary:

      In this work, Witten et al. assess visual acuity, cone density, and fixational behavior in the central foveal region in a large number of subjects.

      This work elegantly presents a number of important findings, and I can see this becoming a landmark work in the field. First, it shows that acuity is determined by the cone mosaic, hence, subjects characterized by higher cone densities show higher acuity in diffraction-limited settings. Second, it shows that humans can achieve higher visual resolution than what is dictated by cone sampling, suggesting that this is likely the result of fixational drift, which constantly moves the stimuli over the cone mosaic. Third, the study reports a correlation between the amplitude of fixational motion and acuity, namely, subjects with smaller drifts have higher acuities and higher cone density. Fourth, it is shown that humans tend to move the fixated object toward the region of higher cone density in the retina, lending further support to the idea that drift is not a random process, but is likely controlled. This is a beautiful and unique work that furthers our understanding of the visuomotor system and the interplay of anatomy, oculomotor behavior, and visual acuity.

      Strengths:

      The work is rigorously conducted: it uses state-of-the-art technology to record fixational eye movements while imaging the central fovea at high resolution and examines exactly where the viewed stimulus falls on individuals' foveal cone mosaic with respect to different anatomical landmarks in this region. The figures are clear and nicely packaged. It is important to emphasize that this study is a real tour-de-force in which the authors collected a massive amount of data on 20 subjects. This is particularly remarkable considering how challenging it is to run psychophysics experiments using this sophisticated technology. Most of the studies using psychophysics with AO are, indeed, limited to a few subjects. Therefore, this work shows a unique set of data, filling a gap in the literature.

      Weaknesses:

      No major weakness was noted, but data analysis could be further improved by examining drift instantaneous direction rather than start-point-end-point direction, and by adding a statistical quantification of the difference in direction tuning between the three anatomical landmarks considered.

    4. Reviewer #3 (Public Review):

      Summary:

      The manuscript by Witten et al., titled "Sub-cone visual resolution by active, adaptive sampling in the human foveola," aims to investigate the link between acuity thresholds (and hyperacuity) and retinal sampling. Specifically, using in vivo foveal cone-resolved imaging and simultaneous microscopic photostimulation, the researchers examined visual acuity thresholds in 16 volunteers and correlated them with each individual's retinal sampling capacity and the characteristics of ocular drift.

      First, the authors found that although visual acuity was highly correlated with the individual spatial arrangement of cones, for all participants, visual resolution exceeded the Nyquist sampling limit - a well-known phenomenon in the literature called hyperacuity.

      Thus, the researchers hypothesized that this increase in acuity, which could not be explained in terms of spatial encoding mechanisms, might result from exploiting the spatiotemporal characteristics of visual input, which is continuously modulated over time by eye movements even during so-called fixations (e.g., ocular drift).

      The authors reported a between-subject correlation between acuity threshold and drift amplitude, suggesting that the visual system benefits from transforming spatial input into a spatiotemporal flow. Finally, they showed that drift, contrary to the traditional view of it as random involuntary movement, appears to exhibit directionality: drift tends to move stimuli to higher cone density areas, therefore enhancing visual resolution.

      Strengths:

      The work is of broad interest, the methods are clear, and the results are solid.

      Weaknesses:

      Literature (1/2): The authors do not appear to be aware of an important paper published in 2023 by Lin et al. (https://doi.org/10.1016/j.cub.2023.03.026), which nicely demonstrates that (i) ocular drifts are under cognitive influence, and (ii) specific task knowledge influences the dominant orientation of these ocular drifts even in the absence of visual information. The results of this article are particularly relevant and should be discussed in light of the findings of the current experiment.

      Literature (2/2): The hypothesis that hyperacuity is attributable to ocular movements has been proposed by other authors and should be cited and discussed (e.g., https://doi.org/10.3389/fncom.2012.00089, https://doi.org/10.1016/s0896-6273(01)00466-4).

      Drift Dynamic Characterization: The drift is primarily characterized as the "concatenated vector sum of all frame-wise motion vectors within the 500 ms stimulus duration." To better compare with other studies investigating the link between drift dynamics and visual acuity (e.g., Clark et al., 2022), it would be interesting to analyze the drift-diffusion constant, which might be the parameter most capable of describing the dynamic characteristics of drift.
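
      One common way to obtain such a diffusion constant is to treat drift as a 2D Brownian process, for which the mean squared displacement grows as MSD(tau) = 4 D tau, and to estimate D from the slope. The sketch below is an assumption-laden illustration, not the method of the manuscript or of Clark et al.: the frame interval, the true D used for the simulated trace, the number of lags, and the function names are all hypothetical, and real drift traces may deviate from pure Brownian motion.

```python
import math
import random


def simulate_drift(n_frames, diffusion, dt, rng):
    """2D Brownian drift trace; the per-axis step SD is sqrt(2 * D * dt)."""
    sd = math.sqrt(2.0 * diffusion * dt)
    x = y = 0.0
    path = [(x, y)]
    for _ in range(n_frames):
        x += rng.gauss(0.0, sd)
        y += rng.gauss(0.0, sd)
        path.append((x, y))
    return path


def estimate_diffusion(path, dt, max_lag=5):
    """Fit MSD(tau) = 4 * D * tau through the origin and return D."""
    num = den = 0.0
    for lag in range(1, max_lag + 1):
        sq = [
            (path[i + lag][0] - path[i][0]) ** 2
            + (path[i + lag][1] - path[i][1]) ** 2
            for i in range(len(path) - lag)
        ]
        mean_sq = sum(sq) / len(sq)
        tau = lag * dt
        num += tau * mean_sq
        den += tau * tau
    return num / den / 4.0


if __name__ == "__main__":
    # Hypothetical values: D in arcmin^2/s, 1 ms between position samples.
    trace = simulate_drift(20000, diffusion=10.0, dt=0.001, rng=random.Random(0))
    print("estimated D:", estimate_diffusion(trace, dt=0.001))
```

      Unlike the start-to-end vector sum, this estimate summarizes how quickly the retinal image spreads over time, which is why it may compare more directly across studies.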

      Possible inconsistencies: Binocular differences are not expected based on the hypothesis; the authors may speculate a bit more about this. Additionally, the fact that hyperacuity does not occur with longer infrared wavelengths but the drift dynamics do not vary between the two conditions is interesting and should be discussed more thoroughly.

      As a Suggestion: can the authors predict the accuracy of individual participants in single trials just by looking at the drift dynamics?

    1. eLife assessment

      This manuscript aims to unravel the contribution of cholesterol to aquaporin-0 (AQP0) tetramer array formation within lens membranes. Compelling electron crystallography data are combined with solid molecular dynamics experiments to identify a specific cholesterol binding site of significance to protein clustering within lipid rafts. The important work advances our understanding of membrane biology and will be of broad interest to membrane transport biologists, biochemists, and structural biologists.

    2. Reviewer #1 (Public Review):

      Aquaporin-0 forms 2D crystals in the lens of the eye. This propensity to form 2D crystals was originally exploited to solve the structure of aquaporin-0 reconstituted in membranes. Existing structures do not explain why the proteins spontaneously form these arrays, however. In this work, the authors investigate the hypothesis that the main lipids in the native membranes, sphingomyelin and cholesterol, contribute to lattice formation. By titrating the cholesterol:sphingomyelin ratio, the authors identify cholesterol binding sites of increasing stability. The authors identify a cholesterol that interacts with adjacent tetramers and is bound at an unusual membrane depth. Computational simulations suggest that this cholesterol is only stable in the context of adjacent tetramers (i.e., lattice formation) and that the presence of the cholesterol increases the stability of that interface. The exact mechanism is not clear, but the authors propose that the so-called "deep cholesterol" improves shape complementarity between adjacent tetramers and modulates the kinetics of protein-protein interactions. Finally, the authors provide a reasonable model for the role of cholesterol in

      Strengths of this manuscript include the analysis of multiple structures determined with different lipid compositions and lipid:cholesterol ratios. For each of these, multiple lipids can be modelled, giving a good sense of the lipid specificity at various favorable lipid binding positions. In addition, multiple hypotheses are tested in a very thorough computational analysis that provides the framework for interpreting the structural observations. The authors also provide a thorough scholarly discussion that connects their work with other studies of membrane protein-cholesterol interactions.

      The model presented by the authors is consistent with the data described.

    3. Reviewer #2 (Public Review):

      Summary:

      In the manuscript by Chiu et al., "Structure and dynamics of cholesterol-mediated aquaporin-0 arrays and implications for lipid rafts," the authors address the effect of cholesterol on array formation by AQP0. Using a combination of electron crystallography and molecular dynamics simulations, the authors show binding of a "deep" cholesterol molecule between AQP0 tetramers. Each AQP0 tetramer binds four deep cholesterols to form a crystallographic array of AQP0.

      Strengths:

      The combined approaches of electron crystallography and MD simulations under different lipid conditions (different sphingomyelin and cholesterol concentrations) are a strength of the study. The authors provide a thorough and convincing assessment of cholesterol binding, protein-protein interactions, and array formation by AQP0. The MD simulations allow the authors to consider the propensity of cholesterol to occupy the observed binding sites in the absence of crystal contacts. The combined methods and the breadth of analyses set a high standard in the field of membrane protein structural biology.

      The findings of the authors fit nicely into a growing body of literature on cholesterol binding sites that mediate membrane protein-protein interactions. Cholesterol interacts with a variety of membrane proteins via its smooth alpha face or rough beta face. AQP0 is somewhat unique in that it binds the rough face of cholesterol in a "deep" binding site that places cholesterol in the middle of the membrane bilayer. So-called "deep" cholesterol binding sites have been described for GPCRs and docking studies suggest they may exist on other ion channels and transporters. In the case of AQP0, the deep cholesterol acts as a glue that holds two tetramers together. Since each tetramer has four binding sites for deep cholesterol, the assembly and mechanical stability of an extended two-dimensional array of AQP0 tetramers is a natural consequence in lens membranes.

      Weaknesses:

      The authors report that the findings generally apply to raft formation in membranes. However, this point is less clear as the lens membrane in which AQP0 resides is rather unique in lipid and protein content and density. Nonetheless, the authors achieve the overall goal of evaluating cholesterol binding to AQP0, and there are many valuable and informative figures in the main manuscript and supplement that provide convincing results and interpretations.

    4. Reviewer #3 (Public Review):

      Summary:

      This manuscript aims to unravel the mechanisms behind Aquaporin-0 (AQP0) tetramer array formation within lens membranes. The authors utilized electron crystallography and molecular dynamics (MD) simulations to shed light on the role of cholesterol in shaping the structural organization of AQP0. The evidence suggests that cholesterol not only defines the positions and orientations of associated molecules but also plays a crucial role in stabilizing AQP0 tetramer arrays. This study provides valuable insights into the potential principles driving protein clustering within lipid rafts, advancing our understanding of membrane biology.

      In this review, I will focus on the MD simulations part, since this is my area of expertise. The authors conducted an impressive set of MD simulations aiming at understanding the role of cholesterol in structural organization of AQP0 arrays. These simulations clearly demonstrate the well-defined localization of cholesterol molecules around a single AQP0 tetramer, aligning with previous computational studies and the crystallographic structures presented in this manuscript. Interestingly, the authors identified an unusual position for one cholesterol molecule, located near the center of the lipid bilayer, which was stabilized by the adjacent AQP0 tetramers. The authors showed that these adjacent tetramers can withstand a larger lateral detachment force when deep cholesterol molecules are present at the interface compared to scenarios with sphingomyelin (SM) molecules at the interface between two AQP0 tetramers. Authors interpret that result as evidence that deep cholesterol molecules mechanically stabilize the interface of the AQP0 tetramers.

      Simple steered MD simulations are typically employed either to identify pathways for subsequent free energy calculations, such as umbrella sampling, or to perform numerous non-equilibrium simulations from which the free energy is extracted with the Jarzynski equation. In this paper, the authors conducted steered MD simulations to examine the maximum force required to separate tetramers, and they did not carry out the more rigorous but challenging free energy calculations. The observation that the maximum force needed to separate tetramers is larger in the presence of cholesterol (compared to the SM case) suggests a positive direction in the authors' work; however, free energy calculations would be needed to fully support the cholesterol stabilization effect.
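
      For readers less familiar with the approach mentioned above, the Jarzynski equality relates the work performed in repeated non-equilibrium pulling runs to the equilibrium free energy difference. The notation below is the standard textbook form and is not taken from the manuscript under review:

```latex
% W      : work performed along one non-equilibrium pulling trajectory
% <...>  : average over many trajectories started from equilibrium
% k_B, T : Boltzmann constant and absolute temperature
\[
  e^{-\Delta F / k_B T} \;=\; \left\langle e^{-W / k_B T} \right\rangle
  \qquad\Longleftrightarrow\qquad
  \Delta F \;=\; -k_B T \,\ln \left\langle e^{-W / k_B T} \right\rangle
\]
```

      Because the exponential average is dominated by rare low-work trajectories, many repetitions are typically needed for the estimate to converge, which is one reason such calculations are considered demanding.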

    5. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Public Review):

      The model presented by the authors is consistent with the data described. Further testing of this model, for example by mutating the deep cholesterol binding site, would strengthen the model. However, such experiments might be challenging due to the relatively non-specific/hydrophobic nature of the deep cholesterol binding site.

      We completely agree that testing of the deep cholesterol-binding site by mutagenesis would be ideal. However, as the reviewer points out, such experiments would be challenging, not only because of the non-specific/hydrophobic nature of the deep cholesterol-binding site but also because we have been purifying AQP0 from natural sources (sheep eyes) and because it would be very difficult to secure the substantial amount of cryo-EM time needed to generate an electron crystallographic structure.

      Reviewer #2 (Public Review):

      The authors report that the findings generally apply to raft formation in membranes. However, this point is less clear as the lens membrane in which AQP0 resides is rather unique in lipid and protein content and density.

      We agree that the lens membrane is quite unique in its lipid and protein content and density, but rafts are also characterized by the same lipids and high protein density. Nonetheless, we do agree that our suggested implications for lipid rafts are speculative and so we emphasize this more in the revised version of the manuscript by writing: “This model is specific for the formation of AQP0 arrays in lens membranes, but we speculate that similar principles may underlie the organization of lipid rafts”.

      Reviewer #3 (Public Review):

      The authors showed that these adjacent tetramers can withstand a larger lateral detachment force when deep cholesterol molecules are present at the interface compared to scenarios with sphingomyelin (SM) molecules at the interface between two AQP0 tetramers. Authors interpret that result as evidence that deep cholesterol molecules mechanically stabilize the interface of the AQP0 tetramers. This conclusion has minor weaknesses, and the rigor of the lateral detachment simulations could be increased by establishing a reference point for the detachment force needed to separate AQP0 tetramers in a scenario without lipids at the interface between tetramers, and by increasing the number of repeats for the non-equilibrium steered MD simulations. Thermodynamic integration might be a better approach to compute the stabilization energy in the presence of cholesterol compared to the SM case.

      In all electron crystallographic structures of AQP0 determined to date, lipids have always been observed sandwiched in between the AQP0 tetramers (see, for example, Gonen et al., Nature, 2005 and Hite et al., EMBO J., 2010). Therefore, considering a scenario without lipids at the interface would be unnatural and the AQP0 array would likely not be stable. Such a scenario would thus not be the most appropriate reference point for the lateral detachment simulations. In our view, comparison of a scenario with the deep cholesterol at the interface versus a scenario without it appeared a more realistic setup to investigate the stabilizing role the deep cholesterol has on the association of AQP0 tetramers. In the Results subsection regarding these simulations, we added the following sentence to further stress the rationale of our experimental setup: “Comparison of these two cases should allow us to assess the effect of the deep-binding Chol3 molecules on the mechanical stability of the associated AQP0 tetramers.”

      Concerning the second suggestion of the reviewer of increasing the number of repeats, we doubled the number of simulation replicas: now it is n=20 for each pulling velocity and lipid interface. The trend of higher detachment forces for the interface containing cholesterol prevailed in a statistically significant, robust fashion (see Figure 7 of the revised manuscript and the main text referring to it). In consequence, as the reviewer suggested, extension of the dataset increased the rigor of the lateral detachment simulations. In addition to Figure 7 and the Results section, the Methods section and Table 4 have been updated to reflect the expanded dataset. 

      Finally, concerning the usage of thermodynamic integration to compute the stabilization energy, we agree with the reviewer that calculation of the free energy would be better to determine the thermodynamic stabilization imparted by the cholesterol molecules. At an earlier stage of the project, we did indeed consider carrying out this type of simulations, but we decided against it because of the complexity and poor convergence of such calculations. Our choice is also based on a previous attempt in which it proved very challenging to use free energy calculations to assess the binding of lipids to a flippase (see Wang et al. BioRxiv, https://doi.org/10.1101/2020.06.24.169771, 2021). We now included this consideration in the revised manuscript by adding the following sentence in the Discussion: “Although we provide solid evidence here that deep cholesterol impart mechanical stabilization, free energy calculations would be required to obtain the full picture of thermodynamic stabilization. Such free energy calculations are challenging for lipids, due to the chemical complexity and poor convergence involved (Wang et al., 2021), and are thus beyond the scope of the current work.”

      Reviewer #1 (Recommendations For The Authors):

      Reorganizing a few concepts would make the story easier to follow. For example, the analysis of the bilayer thickness seems disjointed. Although Figure 4 shows measurements, it is not clear that the measurements represent bilayer thickness until the last paragraph of page 21 in the discussion, where "Hydrophobic thickness" is first introduced. Moving that first paragraph of page 22 that refers to Fig. 4A to the results would be helpful to understand the figure, and would prepare the reader for this part of the discussion.

      In response to the reviewer, we moved the description of the measurements of the hydrophobic thickness to the Results section (Page 12) and adjusted the Discussion to minimize repetition (page 22).

      Likewise, Figure 4E shows measurements of something, but it is not clear that these are the dimensions of a protein pocket until well into the discussion.

      In response to the reviewer’s comment, we added a sentence both in the Results section [It sits in a pocket between the two adjacent AQP0 tetramers that is wider in the extracellular leaflet than the cytoplasmic leaflet (Figure 4E)] as well as to the caption of Figure 4E [The dotted lines indicate the distance between the two adjacent AQP0 tetramers at the positions of the ring system (~8.5 Å) and the acyl chain (~2.5 Å)].

      Figure 2 - a comment for the non-specialists on what this region of the protein is would be helpful context. Is this the pore with part of the NPA motif?

      We agree with the referee and added the following sentence to the caption of Figure 2: “A region of the water-conducting pathway close to the NPA (asparagine-proline-alanine), the AQP signature motif, is shown”.

      Reviewer #2 (Recommendations For The Authors):

      There is only one recommendation: In the results section entitled "Cholesterol positions observed in the electron crystallographic structures are representative of those around single AQP0 tetramers" the authors do not describe their approach. They refer to a reference (Aponte-Santamaria et al., 2012). The authors state the problem (investigate cholesterol positions), but it would be helpful to the readers if they also described the experimental approach.

      We agree with the reviewer and made the following addition to the sentence “we performed MD simulations and calculated time-averaged densities to investigate ...”

      Reviewer #3 (Recommendations For The Authors):

      Technical comments:

      (1) Authors stated: "Equilibration simulations were then performed until bulk membrane properties, such as thickness and deuterium order parameters, became stable and congruent with previous reports such as those by (Doktorova et al., 2020) and others (Figure 5-figure supplement 2 and Figure 5-figure supplement 3)." However, bilayer thickness is not represented in these figures. Additionally, I observed that the area per lipid (APL) appeared to be somewhat variable. This variation was particularly noticeable in systems where SM:CHOL=2:1, which seem to be not fully equilibrated. Is the figure displaying APL data for only one repetition? Could you please include plots for the other repetitions?

      We thank the reviewer for pointing this out. We would like to clarify that we used CHARMM-GUI to generate one lipid bilayer configuration for each mixture and system size. These configurations (one per system) were extensively simulated to generate stable initial configurations of the lipid bilayers. Figure 5 – supplements 2 and 3 refer to this pre-equilibration step. The final pre-equilibrated configurations were then used in the subsequent multiple equilibrium MD runs that we performed, either with a single cholesterol molecule or with the AQP0 tetramer(s) inserted. We have clarified this procedure in the revised manuscript (see changes in the Methods section for the MD equilibrium simulations).

      Concerning this pre-equilibration step, we have chosen the area per lipid, not thickness, to characterize the equilibration of the pure lipid bilayers. Accordingly, the area per lipid is the quantity shown in Figure 5 – figure supplement 3. We no longer refer to the membrane thickness in the revised manuscript.

      Concerning the variability in the area per lipid, we note that the large changes occur within the first few tens of nanoseconds of the pre-equilibration step, after which the area per lipid stabilizes. We would like to also point out that in Figure 5 – figure supplement 3, we chose a logarithmic scale for the time axis to actually make it possible for the reader to see the major changes that occur at the beginning of the pre-equilibration step (which would otherwise be difficult to see). In the particular case of the SM:CHOL=2:1 mixture, the 64 lipids/leaflet system converged to a stable area per lipid value in the last 70 ns and the 244 lipids/leaflet system approached the same value in approximately the last 30 ns. This was a good indication that the large system had also converged. After equilibration of the membranes, a single cholesterol or AQP0 tetramer(s) were inserted and equilibrium simulations were initiated. However, the first 100 ns (or 300 ns in the case of the double tetramer system) were considered as a further equilibration time and were not included in the analysis. This is now explicitly stated in the revised manuscript: “The first 100 ns of each simulation replica (the first 300 ns for the two tetramer simulations) were considered as additional equilibration time and were not included in further analysis.”
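
      The two bookkeeping steps described in this response (dropping an initial equilibration window and checking that a quantity such as the area per lipid has settled over the final stretch of a run) could be sketched roughly as follows. This is only an illustration: the function names, the inclusive treatment of the cutoff frame, and the toy numbers are assumptions and do not come from the authors' analysis scripts.

```python
def discard_equilibration(times_ns, values, cutoff_ns=100.0):
    """Keep only frames at or after cutoff_ns (assumed inclusive cutoff)."""
    return [(t, v) for t, v in zip(times_ns, values) if t >= cutoff_ns]


def tail_mean(times_ns, values, window_ns):
    """Mean of the values recorded during the last window_ns of the run."""
    start = max(times_ns) - window_ns
    tail = [v for t, v in zip(times_ns, values) if t >= start]
    return sum(tail) / len(tail)


# Toy time series (made-up numbers, NOT simulation output):
# time in ns and an area-per-lipid-like quantity in arbitrary units.
times = [0.0, 50.0, 100.0, 150.0, 200.0]
apl = [60.0, 50.0, 45.0, 44.0, 46.0]

production = discard_equilibration(times, apl, cutoff_ns=100.0)
settled_value = tail_mean(times, apl, window_ns=100.0)
```

      Comparing the tail means of the small and large systems is one simple way to express the statement that both approached the same value.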

      (2) Could you clarify the reasoning behind conducting the simulations at 323 K?

      We conducted the simulations at 323 K to ensure that the lipid bilayers were in the liquid phase.

      SM:CHOL mixtures have been reported to be in the liquid phase above 314 K (Keyvanloo et al. Biophys. J. 114: 1344, 2018). 323 K was thus chosen to be well above this value. Note that this temperature was also chosen in a previous MD simulation study of pure sphingomyelin bilayers (Niemelä et al. Biophys. J. 87: 2976, 2004). This reasoning, as well as the two references, have been added to the Methods section in the revised manuscript.

      (3) There appears to be a discrepancy in Figure 7. Panel F does not align with the provided caption. 

      We apologize for this mistake. The captions for panels E and F were switched. We corrected this mistake.

      (4) Likewise, in Figure 8, there is a mismatch between the caption and the figures. Furthermore, in the text, the authors assert, "In the absence of cholesterol, the AQP0 surface is completely covered by sphingomyelin in the hydrophobic region of the membrane and by water outside this region (Figure 8A, left column). As noted before, there are essentially no direct protein-protein interactions between the adjacent tetramers. When cholesterol was present at the interface, it interacted with AQP0 at the center of the membrane and remained mostly in place (Figure 8A, right column)." However, the left column shows cholesterol density. Could you please clarify this inconsistency, especially regarding the absence of cholesterol?

      We apologize for this mistake. The panels in Figure 8A showing the AQP0 surfaces in the absence and presence of cholesterol were switched. We corrected this mistake.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This manuscript by Estevam et al. reports new insights into the regulation of the receptor tyrosine kinase MET gained from two deep mutational scanning (DMS) datasets. In this paper, the authors use a classic selection system for oncogenic kinase signaling, the murine Ba/F3 cell line, to assess the functional effects of thousands of mutations in the kinase domains of MET in two contexts: (1) fusion of the whole MET intracellular region to the dimerization domain TPR, and (2) the same fusion protein, but with exon 14, which encodes part of the juxtamembrane region of MET, skipped. Critically, exon 14 skipping yields a version of MET that is found in many cancers and has higher signaling activity than the canonical MET isoform. The authors extensively analyze their DMS data to very convincingly show that their selection assay reports on kinase activity, by illustrating that many functionally important structural components of the kinase domain are not tolerant of mutation. Then, they turn their attention to a helical region of the juxtamembrane region (αJM), immediately after exon 14, which is posited to play a regulatory role in MET. Their DMS data illustrate that the strength and mutational tolerance of interactions between αJM and the key αC helix in the kinase domain depends on the presence or absence of exon 14. They also identify residues in the N-lobe of the kinase, such as P1153, which are not conserved across tyrosine kinases but appear to be essential for MET and MET-like kinases. Finally, the authors analyze their DMS data in the context of clinically-observed mutations and drug-resistance mutations.

      Overall, this manuscript is exciting because it provides new insights into MET regulation in general, as well as the role of exon 14. It also reveals ways in which the JM region of MET is different from that of many other receptor tyrosine kinases. The exon 14-skipped fusion protein DMS data is somewhat underexplored and could be discussed in greater detail, which would elevate excitement about the work. Furthermore, some of the cell biological validation experiments and the juxtaposition with clinical data are perhaps not assessed/interpreted as clearly as they could be. Some constructive suggestions are given below to enhance the impact of the manuscript.

      Strengths:

      The main strengths of this paper, also summarized above in the summary, are as follows:

      (1) The authors very convincingly show that Ba/F3 cells can be coupled with deep mutational scanning to examine MET mutational effects. This is most clearly shown by highlighting how all of the known kinase structure and regulatory elements are highly sensitive to mutations, in accordance with a few other DMS datasets on other kinases.

      (2) A highlight of this paper is the juxtaposition of two DMS datasets for two different isoforms of the MET receptor. Very few comparisons like this exist in the literature, and they show how small changes to the overall architecture of a protein can impact its regulation and mutational sensitivity.

      (3) Another exciting advance in this manuscript is the deep structural analysis of the MET juxtamembrane region with respect to that of other tyrosine kinases - guided by the striking effect of mutations in the juxtamembrane helical region. The authors illustrate how the JM region of MET differs from that of other tyrosine kinases.

      (4) Overall, this manuscript will provide a resource for interpreting clinically relevant MET mutations.

      Weaknesses:

      (1) The manuscript is front-loaded with extensive analysis of the first DMS dataset, in which exon 14 is present, however, the discussion and analysis of the exon 14-skipped dataset is somewhat limited. In particular, a deeper discussion of the differences between the two datasets is warranted, to lay out the full landscape of mutations that have different functional consequences in the two isoforms. Rather, the authors only focus on differences in the JM region. What are the broader structural effects of exon 14 skipping across the whole kinase domain?

      Thank you for your feedback on our manuscript and our analysis of the exon 14 skipped mutational scanning data. The lack of a robust growth differential between the wild type MET intracellular domain and the exon 14 skipped isoform within the Ba/F3 system suggests that there is not a significant growth advantage related to exon 14 skipping, likely due to the constitutive activation of both constructs by the TPR domain, which also suggests that the assay is potentially less sensitive to nuanced JM-driven effects between these two isoforms, aside from the highly sensitive ⍺JM-helix. We also lose insight on membrane-related interactions imposed on the juxtamembrane that may be important to fully understand the differences between these two isoforms in the cytoplasmically-expressed context. Therefore, we can at most speculate about differences related to exon 14 skipping between these two datasets.

      With these caveats in mind, to further address exon 14 and juxtamembrane-driven differences between these two mutational landscapes, we calculated the absolute score difference between TPR-METΔEx14 and TPR-MET (|METΔEx14 - MET|) and plotted the |ΔScore| in a heatmap. Overall, the two landscapes, as noted in the text, are largely similar with differences emerging mostly for specific mutations. Where we see the largest secondary structural difference continues to be the ⍺JM-helix, where MET is more sensitive to helix-breaking mutations such as proline. Again L1062 has the greatest difference in sensitivity between these two datasets for the ⍺JM-helix, with the introduction of negative charge resulting in loss-of-function for the TPR-MET kinase domain but having a null effect in the TPR-METΔEx14 kinase domain. Other positions with strong differences include the ⍺G and APE motif.
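
      The |ΔScore| comparison described above amounts to an element-wise absolute difference between two position-by-substitution score matrices. A minimal sketch is given below; the function names, the matrix layout, and the toy values are assumptions made for illustration and are not the authors' code or data.

```python
import math


def abs_score_difference(scores_ex14, scores_met):
    """Element-wise |METdEx14 - MET| for two equally shaped score matrices.

    Rows are positions, columns are substitutions; a missing score (NaN)
    in either matrix stays NaN in the result.
    """
    if len(scores_ex14) != len(scores_met) or any(
        len(r1) != len(r2) for r1, r2 in zip(scores_ex14, scores_met)
    ):
        raise ValueError("score matrices must have the same shape")
    return [
        [abs(x - y) for x, y in zip(r1, r2)]
        for r1, r2 in zip(scores_ex14, scores_met)
    ]


def top_differences(delta, positions, substitutions, k=3):
    """Rank (|dScore|, position, substitution) entries, skipping NaN."""
    ranked = [
        (delta[i][j], positions[i], substitutions[j])
        for i in range(len(delta))
        for j in range(len(delta[i]))
        if not math.isnan(delta[i][j])
    ]
    ranked.sort(key=lambda item: item[0], reverse=True)
    return ranked[:k]


# Toy scores for two hypothetical positions (NOT values from the screen).
positions = [1, 2]
substitutions = ["D", "P"]
met = [[-2.0, -1.5], [0.25, float("nan")]]
ex14 = [[0.0, -0.5], [0.5, -1.0]]

delta = abs_score_difference(ex14, met)
top = top_differences(delta, positions, substitutions)
```

      The resulting matrix can be passed to any heatmap routine, and the ranked list gives a quick view of which substitutions diverge most between the two libraries.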

      We have incorporated more detailed discussion in text. 

      (2) It is unclear if gain-of-function mutations can actually be detected robustly in this specific system. This isn't a problem at face value, as different selection assays have different dynamic ranges. However, the authors don't discuss the statistical significance and reproducibility of gain- vs loss-of-function mutations, and none of the gain-of-function mutations are experimentally validated (some appear to show loss-of-function in their cellular validation assay with full-length MET). The manuscript would benefit from deeper statistical analysis (and discussion in the text) of gain-of-function mutations, as well as further validation of a broad range of activity scores in a functional assay. For the latter point, one option would be to express individual clones from their library in Ba/F3 cells and blot for MET activation loop phosphorylation (which is probably a reasonable proxy for activity/activation).

      Thank you for your comment on the statistical interpretations of gain-of-function (GOF) and loss-of-function (LOF) mutations. In this study we classify GOF and LOF based on the following metrics:

      (1) The difference between the missense mutation score and the wild type synonymous score for a given position must be smaller than the calculated propagated error, for both IL-3 withdrawal and IL-3 conditions

      (2) Missense mutations must be ≥ ±2 standard deviations (SD) from the mean of wild type synonymous mutations

      Given that our assay was conducted in a constitutively active kinase in the TPR-fusion context, gain-of-function mutations are expected to not only be rare, but also supersede baseline fitness. Within the IL-3 conditions, we expect that cells are not reliant or “addicted” to MET for growth proliferation. Nevertheless, due to the parallel nature of the screen, we can compare scores for variants in the IL-3 control and IL-3 withdrawal conditions to filter mutations that are solely exhibiting high fitness under selective pressure.

      To identify these mutations we 1) calculated the propagation of error between IL-3 and IL-3 withdrawal scores for the same variant 2) calculated the absolute difference between IL-3 and IL-3 withdrawal scores for the same variant 3) filtered variants if the IL-3 withdrawal score was ≥ +2 SDs, the IL-3 score was ≤ 0, and the absolute score difference between IL-3 and withdrawal conditions was larger than the propagated error.

      In analyzing mutations within the IL-3 withdrawal conditions, applying our statistical metrics, we find 33 mutations within the MET library, and 30 in the METΔEx14 library, that have a score of ≥ +2 SD and low propagated error. By increasing our boundary to ≥+2.5 SD, we can classify mutations with even higher confidence, identifying 10 mutations within the MET library, and 9 in the METΔEx14 library (Supplemental Data Figure 7).
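
      The three-part filter described above (withdrawal score at least +2 SD, IL-3 score at or below 0, and a score difference larger than the propagated error) could be expressed along the following lines. This is a sketch under stated assumptions: the names are hypothetical, the SD threshold is assumed to be measured from the mean of the wild type synonymous scores, and the propagated error is assumed to be the quadrature sum of the two per-condition standard errors, which the response does not spell out.

```python
import math
from statistics import mean, stdev


def propagated_error(se_a, se_b):
    """Error of a difference of two scores (assumed quadrature sum)."""
    return math.hypot(se_a, se_b)


def is_gof_candidate(withdrawal_score, il3_score, withdrawal_se, il3_se,
                     syn_mean, syn_sd, n_sd=2.0):
    """Apply the three criteria listed in the response to one variant."""
    high_under_selection = withdrawal_score >= syn_mean + n_sd * syn_sd
    not_fit_in_control = il3_score <= 0
    well_separated = abs(withdrawal_score - il3_score) > propagated_error(
        withdrawal_se, il3_se
    )
    return high_under_selection and not_fit_in_control and well_separated


# Toy wild type synonymous scores (made-up numbers, NOT data from the screen).
synonymous = [-0.5, 0.0, 0.5]
syn_mean, syn_sd = mean(synonymous), stdev(synonymous)

passes = is_gof_candidate(1.5, -0.1, 0.3, 0.4, syn_mean, syn_sd)
fit_in_control = is_gof_candidate(1.5, 0.4, 0.3, 0.4, syn_mean, syn_sd)
below_threshold = is_gof_candidate(0.8, -0.1, 0.3, 0.4, syn_mean, syn_sd)
too_noisy = is_gof_candidate(1.5, -0.1, 1.2, 1.2, syn_mean, syn_sd)
```

      Raising `n_sd` to 2.5 corresponds to the stricter boundary mentioned in the response.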

      (3) In light of point 2, above, much of the discussion about clinically-relevant gain-of-function mutations feels a bit stretched - although this section is definitely very interesting in premise. A clearer delineation of gain-of-function, with further statistical support and ideally also some validation, would greatly strengthen the claims in this section.

      To address this concern, we have provided additional analysis and details on gain-of-function (GOF) classification in Supplemental Data Figure 5 and the overlap between GOF and clinically associated mutations in Supplemental Data Figure 8. Within our gain-of-function classifications, we pick up on several mutations at positions that have been clinically detected and experimentally validated in previous studies in both libraries (D1228, G1163, L1195), and show that GOF mutations also have low variance.

      Reviewer #2 (Public Review):

      Summary:

      The authors describe a deep mutational scanning (DMS) study of the kinase domain of the c-MET receptor tyrosine kinase. The screen is conducted with a highly activated fusion oncoprotein - Tpr-MET - in which the MET kinase domain is fused to the Tpr dimerization element. The mutagenized region includes the entire kinase domain and an alpha-helix in the juxtamembrane region that is essentially part of the MET kinase domain. The DMS screen is carried out in two contexts, one containing the entire cytoplasmic region of MET, and the other with an "exon 14 deletion" which removes a large portion of the juxtamembrane region (but retains the aforementioned alpha-helix). The work provides a robust and essentially exhaustive catalog of the effect of mutations (within the kinase domain) on the ability of the Tpr-MET fusion oncoproteins to drive IL3-independent growth of Ba/F3 cells. Every residue in the kinase is mutated to every natural amino acid. Given the design of the screen, one would expect it to be a powerful tool for identifying mutations that impair catalytic activity and therefore impair IL3-independent proliferation, but not the right tool for identifying gain-of-function mutations that operate by shifting the kinase from an inactive to active state (because the Tpr-Met fusion construct is already very highly activated). This is borne out by the data, which reveal many many deleterious mutations and few "gain-of-function" mutations (which are of uncertain significance, as discussed below).

      Strengths:

      The authors take a very scholarly and thorough approach to interpreting the effect of mutations in light of available information for the structure and regulation of MET and other kinases. They examine the effect of mutations in the so-called catalytic (C) and regulatory (R) spines, the interface between the JM alpha-helix and the C-helix, the glycine-rich loop, and other key elements of the kinase, providing a structural rationale for the deleterious effect of mutations. Comparison of the panoply of deleterious mutations in the TPR-met versus TPR-exon14del-MET DMS screens reveals an interesting difference - the exon14 deletion MET is much more tolerant of mutations in the JM alpha-helix/C-helix interface. The reason for this is unclear, however.

      Weaknesses:

      Because the screens were conducted with highly active Tpr-MET fusions, they have limited power to reveal gain-of-function mutations. Indeed, to the extent that Tpr-MET is as active or even more active than ligand-activated WT MET, one could argue that it is "fully" activated and that any additional gain of fitness would be "super-physiologic". I would expect such mutations to be rare (assuming that they could be detected at all in the Ba/F3 proliferation assay). Consistent with this, the authors note that gain-of-function mutations are rare in their screen (as judged by being more fit than the average of synonymous mutations). In their discussion of cancer-associated mutations, they highlight several "strong GOF variants in the DMS". It is unclear what the authors mean by "strong GOF", indeed it is unclear to this reviewer whether the screen has revealed any true gain of function mutations at all. A few points in this regard:

      (1) More active than the average of synonymous mutations (nucleotide changes that have no effect on the sequence of the expressed protein) seems to be an awfully low bar for GOF - by that measure, several synonymous mutations would presumably be classified as GOF.

      We completely agree that classifying any mutation above the average synonymous score as GOF would not be a robust assessment, which is why we statistically filtered mutations throughout our analysis. To this point, and that of Reviewer 1, we have further outlined our statistical definitions. In classifying mutations as GOF or LOF, the following parameters were used:

      (1) The difference between the missense mutation score and the wild type synonymous score for a given position must be smaller than the calculated propagated error, for both IL-3 withdrawal and IL-3 conditions

      (2) Missense mutations must be ≥ ±2 standard deviations (SD) from the mean of wild type synonymous mutations

      Therefore, only variants at the tail-ends of the mutational distribution were assessed, and further filtered based on propagation of error. For this reason, a “strong GOF” mutation as noted in this study is one that improves the fitness of an already active kinase. As pointed out, within our analysis, these are very rare occurrences, and in focusing on cancer-associated mutations we find that the variants that meet these statistical parameters require a larger genetic “leap” in the codon space. Overall, we have also changed our language in reference to GOF mutations in text.

      We hope this concern has been addressed in the new Supplemental Data Figures.

      (2) In the +IL3 heatmap in supplemental Figure 1A, there is as much or more "blue" indicating GOF as in the -IL3 heatmap, which could suggest that the observed level of gain in fitness is noise, not signal.

      We hope this concern has been addressed in the previous responses and new Supplemental Data Figures.

      (3) And finally, consistent with this interpretation, in Supplemental Figure 1C, comparing the synonymous and missense panels in the IL3 withdrawal condition suggests that the most active missense mutations (characterized here as strong GOF) are no more active than the most active synonymous mutations.

      We hope this concern has been addressed in the previous responses and figures above.

      My other major concern with the work as presented is that the authors conflate "activity" and "activation" in discussing the effects of mutations. "Activation" implies a role in regulation - affecting a switch between inactive and active conformations or states - at least in this reviewer's mind. As discussed above, the screen per se does not probe activation, only activity. To the extent that the residues discussed are important for activation/regulation of the kinase, that information is coming from prior structural/functional studies of MET and other kinases, not from the DMS screen conducted here. Of course, it is appropriate and interesting for the authors to consider residues that are known to form important structural/regulatory elements, but they should be careful with the use of activity vs. activation and make it clear to the reader that the screen probes the former. One example - in the abstract, the authors rightly note that their approach has revealed a critical hydrophobic interaction between the JM segment and the C-helix, but then they go on to assert that this points to differences in the regulation of MET and other RTKs. There is no evidence that this is a regulatory interaction, as opposed to simply a structural element present in MET (and indeed the authors' examination of prior crystal structures shows that the interaction is present in both active and inactive states).

      Thank you. We completely agree that the distinction between “activity” and “activation” is important and that we can at most speculate and propose models for effects related to activation from this screen. We have edited the text to reflect these distinctions. With respect to activation and the second point, we believe the screen highlights the ⍺JM-C interface as a critical structural region, which may have a role in regulation based on the paradigm of juxtamembrane regulation in RTKs, the presence of a similar interface in TAM family kinases, the co-movement of the ⍺JM-helix and ⍺C-helix between active and inactive conformations in the structural ensemble, and the observation that within the TPR-METΔEx14 library there is a greater tolerance for mutations at interface positions than in the TPR-MET library. We hope that follow-up studies will directly probe the ⍺JM-C interface in the context of the entire juxtamembrane to establish what role, if any, this conserved motif plays in MET function. We have changed the language of the text to reflect how these differences contribute to our proposed model, rather than any unintended assertion on direct regulatory effects.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Suggested major points to address:

      (1) Although the authors show that several key functional residues in the kinase domain are highly sensitive to mutation, it would be nice if the authors further established a clear connection between kinase activity and enrichment in the Ba/F3 assay. Specifically, it is unclear to what extent there is a correlation between the extent of enrichment/depletion and kinase activity - is a larger activity score necessarily indicative of higher kinase activity? This is partly validated by the P1153L mutation autophosphorylation western blots in Figure 4B, but this correlation is somewhat undermined by the data in 5F. Autophosphorylation data (or phosphorylation data on a direct downstream substrate) for a few mutants would really solidify what the activity score is truly reporting. This might also clarify the extent to which the difference between the two screens can be interpreted, and the extent to which gain-of-function can be interpreted.

      The Ba/F3 assay was carefully chosen for its addiction to exogenous IL-3, which serves as a permissive signaling switch. Any mutation that prevents TPR-MET/ΔEx14 from properly functioning is therefore dampening its signaling ability. Nevertheless, it is possible that some mutations with high scores are truly improving activity and others are sustaining activity through more stable interactions than the wild type kinase domain or with downstream signaling partners, which would require careful biochemical dissection outside the scope of this study. To address these points, we now refer to the mutation score simply as “score” rather than “activity score” and further discuss these caveats in the text.

      (2) Overall, the exon 14-skipped dataset is under-discussed in the paper. The comparison of the two datasets is where most deep insights are likely to be found, and so a more thorough analysis/discussion of this dataset would really elevate the significance of the paper. For example, there appear to be a very large number of mutations that have divergent effects in the two screens (everything along the dashed lines in Figure 5D), but it's unclear where most of these mutations lie on the structure. It would be helpful if the residues with divergent mutational effects between the two screens (Supplementary Figure 5E) were mapped onto a structure of the JM-KD construct.

      To address this concern, new analysis has been added to the supplement, showing the score differences between MET and METΔEx14 mutations as a heatmap (Supplemental Data Figure 7A). Within this analysis we further applied our statistical filtering methods and structurally mapped positions with the greatest differential scores to show where divergent effects cluster (Supplemental Data Figure 7D). Consistent with our previous reports, the ⍺JM-helix and ⍺C-helix show the largest cluster of divergent effects, in addition to sites such as the ⍺G and APE motif. Further discussion of these points has been added to the text.

      (3) Based on the observations that αJM-αC interactions seem to be less strictly required in the exon 14 mutant, the hypothesis that exon 14 skipping merely removes a Cbl docking site seems largely unsatisfactory. There seem to be more direct structural alterations that could explain this change, but these are not really discussed or speculated on. Related to this, while L1062 mutations are more tolerated, as the authors showed in both the mutational heatmap and the cellular experiments, its binding counterpart L1125 still seems to be somewhat immutable based on the heatmaps. So, more hypotheses about, and exploration of, how exon 14 skipping affects MET KD structure would be a nice addition to the paper.

      We agree that loss of the Cbl docking site is an insufficient model to capture the full nature of JM regulation and exon 14 skipping effects, which was a major incentive for this study. The outstanding ⍺JM-⍺C-helix sensitivity also excites us because it points to a region of the JM that is potentially involved in kinase activity through ⍺C-helix interactions, much like the CDK models and other RTK-JM interactions. We observed that the ⍺JM-helix and ⍺C-helix retain contact, and propose that they move in unison between active and inactive conformations. However, it is possible that a more complicated mechanism also exists, in which these contacts are maintained to a larger degree in a homodimer. For instance, in Figure 3G, if you compare the ⍺JM-helix conformations, in both RON and AXL there is more distance and a pivot away from the ⍺C-helix. It is possible that there are shared mechanisms between the MET and TAM families that could further elucidate exactly how these ⍺JM-helices interact with the kinase domain during the activity transitions and what biophysical role JM truncations play.

      (4) The discussion about mutations S1122Q and L1062D is a bit confusing and incomplete. From the DMS data, it appears that L1062D should be mildly gain-of-function for the exon 14 deletion variant and very loss of function for wild-type MET. In the validation HeLa cell experiments L1062D is loss-of-function in both contexts, but a mention of this discrepancy is omitted. Then, when the discordance between DMS and HeLa cell experiments is observed again for S1122Q, it is explicitly called out for activation-loop phosphorylation, but then there is no mention of the fact that HGF stimulation leads to greater pERK levels for S1122Q in the exon 14 deletion context (the opposite of the DMS result). The Erk phosphorylation discrepancy should be mentioned. It is entirely reasonable, as the authors suggest, that there are differences between full-length MET and the TPR fusions, but the enhanced Erk phosphorylation by the S1122Q mutation is surprising (and intriguing!). This section could use some re-analysis/re-writing and further discussion.

      Thank you for this comment. As noted, L1062D shows slight GOF in METΔEx14 but LOF in MET. The blots show expression of L1062D and S1122Q in the full-length receptor in the absence and presence of HGF stimulation. L1062D is loss-of-function in both contexts only in -HGF conditions, and shows expression in phosphorylated METΔEx14 but not MET. For S1122Q, there is indeed a stronger pERK signal in METΔEx14, which highlights how probing all regions of phosphorylation (A-loop and C-tail) and many MET-associated pathways (ERK, AKT) may be important for understanding how these mutations affect MET phosphorylation and proliferation. We have included this point in the text.

      (5) Related to the previous point, one other thing to consider here is that perhaps gain-of-function mutations are simply not detectable in this particular DMS assay. The authors state that GOF and LOF are defined as 2 standard deviations from the mean of the WT-synonymous distribution. How many mutations are actually designated to be GOF based on this criterion? Are those GOF mutations as reproducible as the LOF mutations? It would be worthwhile to separately analyze the variance in activity scores for every loss-of-function mutation and gain-of-function mutation. It seems likely that loss-of-function scores are a lot more reproducible than gain-of-function ones, suggesting that the most apparent gain-of-function signal is just noise in the assay. The few outliers to this point (true gain-of-function mutations) may be some of the ones discussed in Figure 6. If this is true, it would lend confidence to the claims associated with Figure 6.

      In analyzing and classifying both GOF and LOF mutations, error was a main filtering parameter. Each fitness score, calculated by Enrich2, represents the slope of the mutation's read frequency across time points and biological replicates. The associated standard error (SE) reflects the variance for each mutation within the scoring framework (Rubin et al., 2017). Mutations were then further filtered based on low propagated error, calculated by comparing the SE of each missense mutation to the SE of the respective wild type synonymous mutation. Therefore, mutations were only classified as GOF or LOF if there was low error, in addition to the other score filters previously described. We have plotted the classified GOF mutations with their respective SE in the newly incorporated Supplemental Data Figure 8C.
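
      The classification logic discussed in this exchange (a cutoff of 2 standard deviations from the mean of the WT-synonymous distribution, plus an error filter) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `classify`, the `max_se` threshold, and the example scores are hypothetical, and the simple SE cutoff stands in for the propagated-error comparison described above.

```python
from statistics import mean, stdev

def classify(score, se, syn_scores, max_se=0.5, n_sd=2.0):
    """Label one missense mutation relative to the WT-synonymous distribution.

    score, se  : fitness score and standard error of the missense mutation
    syn_scores : scores of the WT-synonymous mutations (the reference)
    max_se     : hypothetical error cutoff (an assumption, not from the study)
    n_sd       : number of standard deviations used as the GOF/LOF cutoff
    """
    mu = mean(syn_scores)
    sd = stdev(syn_scores)
    # Noisy measurements are never called GOF or LOF.
    if se > max_se:
        return "unclassified"
    if score > mu + n_sd * sd:
        return "GOF"
    if score < mu - n_sd * sd:
        return "LOF"
    return "WT-like"

# Made-up synonymous scores centred near zero.
synonymous = [-0.1, 0.0, 0.1, 0.05, -0.05]
labels = [
    classify(0.5, 0.1, synonymous),   # high score, low error
    classify(-1.0, 0.1, synonymous),  # low score, low error
    classify(0.05, 0.1, synonymous),  # within the synonymous spread
    classify(0.5, 0.9, synonymous),   # high score but high error
]
```

      Under these assumptions, a high score alone is not enough for a GOF call; the same score with a large SE is left unclassified, which is the point of using error as a filter.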

      (6) In the discussion of panels 6C and 6D, the assertion is that the "clinical, not validated" category has more mutations that are low-fitness outliers than the "clinical, validated" category. From the graphs, it's actually hard to tell if this is the case for two reasons: (1) because of the way the graphs are normalized (to the largest value in each histogram), you cannot compare bar heights (and thus number of mutations) between two histograms on the same graph. (2) Just looking at the shapes of the distributions, or considering maybe the mean or median values, it's unclear whether the "validated" and "not validated" populations are actually different from one another.

      This is an important point, and we have added analysis showing the distribution and number of clinically-associated mutations within our libraries without normalization in the main text and in Supplemental Data Figure 8A-B.

      (7) This sentence in the last results section is somewhat unclear: "GOF resistance mutations may indicate an effect on the equilibrium of kinase activation, whereas LOF resistance mutations likely affect inhibitor-protein interactions directly." The first part makes sense, but it is not totally obvious how one can infer anything about inhibitor-protein interactions from mutations that are LOF with respect to kinase activity. Related to this, how are LOF mutations selected in the presence of an inhibitor? Is the assumption here that the mutation might totally abrogate inhibitor binding but only slightly impair the kinase? Perhaps this could be explained a bit more.

      Here, the idea we wanted to get across is that there are two models that can explain how a mutation can contribute to resistance: shift the activity equilibrium at baseline, or directly impair drug effects and restore baseline activity. Mutations that are labeled resistant and GOF favor the first model. Mutations that are labeled resistant and LOF favor the second model. In the presence of an inhibitor, which is outside the scope of this study, LOF mutations would be sensitive to the inhibitor (i.e., WT-like and sensitive).

      (8) Some additional details of the library preparation and sequencing should be given in the methods section. It appears that the variable region of the library is roughly 275 amino acid residues long, which means >800 bases. How was this sequenced? From the methods, it sounds like all of the variants were pooled into a single library, but then sequencing was done using a 300x300 paired-end Illumina kit, which would not cover the length of the whole variable region. Was the library actually screened in segments as sub-libraries and then separately sequenced? Alternatively, was the whole library screened at once, and then different segments were amplified out for sequencing? If the latter approach is used, this could yield confounding results for counting wild-type variants that have the parent wild-type coding sequence. For example, if you amplify your kinase library in three segments after a single selection on the whole library, and you sequence those three segments separately, you might find a read that appears as wild-type in the part you amplified/sequenced but has a mutation in a region that you did not sequence. If this approach is taken, the counts for the wild-type sequence would be inaccurate, in which case, how is the data normalized with WT as a reference? Regardless of the method used, some more details should be provided in the methods section.

      In this study, we used the Nextera XT DNA Library Preparation Kit (Illumina), which uses a tagmentation approach that randomly fragments our 861 bp amplicon into ~300 bp fragments with a transposase, resulting in a Poisson distribution of fragment sizes. This allows for direct sequencing of all amplicons and libraries with an SP300 paired-end run, which we ran on two lanes of a NovaSeq6000. Samples are demultiplexed and processed by our analysis pipeline with a lookup table that associates the unique dual index to the specific sample (library, time point, biological replicate, IL-3 condition).

      The TPR-MET and TPR-METΔEx14 libraries were prepared in parallel throughout the entire experiment, from cloning to virus generation to transductions, screening, cell harvesting, sequencing prep, and sequencing. In other words, the TPR-MET and TPR-METΔEx14 were transduced into their own, respective batch of cells for each biological replicate, then selected and screened on the same day for each replicate and time point. Each library and condition (time point, biological replicate, IL-3 condition) was prepared in parallel but still an independent sample. At the stage of tagmentation, each sample was arrayed, where each well corresponds to a library, biological replicate, and time point. At the stage of sequencing, samples across the two libraries were normalized to 10mM (library, biological replicate, time point, IL-3 condition) then pooled together and all run on two lanes of the same NovaSeq6000 flow cell.

      PCR and sequencing bias was one of the most important parameters for us, which is why we performed tagmentation in parallel and sequenced everything on the same run. We have added extra details to the methods and hope that we have clarified your questions on this matter.
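
      The lookup-table demultiplexing described in this response can be pictured with a minimal Python sketch. It is illustrative only: the index sequences, the `Sample` fields, and the `demultiplex` function are hypothetical stand-ins, not the authors' analysis pipeline.

```python
from collections import Counter
from typing import NamedTuple

class Sample(NamedTuple):
    library: str      # e.g. "TPR-MET" or "TPR-METdEx14"
    time_point: int
    replicate: int
    il3: bool         # True if IL-3 was present

# Hypothetical unique dual indices (i7, i5) mapped to sample metadata.
LOOKUP = {
    ("ATCACG", "TTAGGC"): Sample("TPR-MET", 0, 1, False),
    ("CGATGT", "TGACCA"): Sample("TPR-METdEx14", 0, 1, False),
}

def demultiplex(read_indices, lookup):
    """Count reads per sample; unknown index pairs are tallied as unassigned."""
    counts = Counter()
    for index_pair in read_indices:
        sample = lookup.get(index_pair)
        counts[sample if sample is not None else "unassigned"] += 1
    return counts

# Four made-up reads, each represented only by its (i7, i5) index pair.
reads = [
    ("ATCACG", "TTAGGC"),
    ("ATCACG", "TTAGGC"),
    ("CGATGT", "TGACCA"),
    ("GGGGGG", "AAAAAA"),  # not in the lookup table
]
counts = demultiplex(reads, LOOKUP)
```

      Because every pooled sample carries its own index pair, reads from both libraries can share a flow cell and still be assigned back to the correct library, replicate, time point, and IL-3 condition.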

      Suggested minor points to address:

      (1) TPR (as in TPR-MET fusion) is not defined in the text when it is first mentioned. And it wasn't immediately clear that this is not a membrane-associated domain (Figure 5E makes this way more obvious than Figure 1B does). Perhaps this could be made more explicit in the text or in Figure 1.

      We have incorporated a new schematic in Figure 1B to better illustrate the TPR-fusion constructs used within this study. The usage of the TPR-fusion is first mentioned in the introduction, paragraph 4, and we have revised the main text to delineate the usage of the TPR-fusion more clearly.

      (2) In Figure 2G, it would be helpful if the wild-type amino acid residue was listed underneath the position number in the two graphs (even though those residues are also highlighted in 2H).

      Thank you for this recommendation. We have added the wild type amino acid next to the position number in the x-axis label.

      (3) For Supplementary Data Figure 2, is it possible to calculate conservation scores at each position using some kind of evolutionary model, rather than relying on visual inspection of the sequence logo? Can one quantitatively assert that the C-spine is less conserved than the R-spine overall, or can this only be said for certain positions? Related to this, in comparing Figure 2G to Supplementary Data Figure 2, it is interesting that there isn't any obvious correspondence between mutational tolerance and conservation within the C-spine. For example, 1165 seems to be the most conserved position in the C-spine, but several substitutions are tolerated at this position, just like 1210, which is one of the least conserved positions in the C-spine. Finally, it's very likely that positions 1165, 1210, 1272, and 1276 co-vary, given that they all pack into the same hydrophobic cluster. This might be why they appear less conserved. These last few points might be worth discussing briefly if the authors want to relate mutational tolerance to evolutionary conservation.

      Thank you for this recommendation. To better quantitatively determine C-spine versus R-spine conservation, we performed a multiple sequence alignment of all RTK kinase domain sequences to properly identify corresponding R- and C-spine locations, as previously done in generating the spine logos, then used the Bio3D structural bioinformatics package in R to calculate the conservation score of each residue position by amino acid “similarity” with a BLOSUM62 matrix (Supplemental Data Figure 2B). In concordance with the logos, we find that C-spine positions 1092, 1108, and 1165 have the highest conservation scores, even compared to some R-spine mutations. We also see across the alignment that C-spine positions 1165, 1210, 1211, 1212, 1272, and 1276 indeed co-vary within RTK families. We have revised the text to reflect these points, and more specifically discuss position-level conservation rather than generalizing conservation for the C- and R-spines.

      (4) On Page 7 of the merged document, there appear to be some figure labeling errors. In the first and second paragraphs of the "Critical contacts between..." section, Figure 3B is referenced multiple times as a structural alignment/ensemble, but this is a heatmap.

      Thank you for catching this! The correct figure panels are now referenced.

      (5) In the text describing Figure 3A, it is stated that the structures were aligned to the N-lobe, but the figure legend says that all structures were aligned to alpha-C and alpha-JM.

      Thank you. A local alignment to the ⍺JM-helix and ⍺C-helix is correct, the idea being that if the ⍺JM-helix and ⍺C-helix are linked to an active/inactive conformation, as in the case of the insulin receptor, these two clusters could be revealed through the structural ensemble. However, we discovered this was not the case, which, combined with the DMS sensitivity to mutations at the packing interface, leads us to believe that the MET JM has a distinctive regulatory mechanism that relies on this ⍺C-helix interface. We have made this correction to the text.

      (6) It would be helpful if the alpha-C and alpha-JM helices in Figure 3D were labeled on the MET structures.

      The ⍺C-helix and ⍺JM-helix are now labeled in Figure 3D.

      (7) It appears that Figure 4E is never explicitly referenced in the text.

      Thank you, Figure 4E is now appropriately referenced in the text.

      (8) Throughout the Figure 6 legend, for the histograms, it is stated that "Counts are normalized to the total mutations in each screen dataset." This might not be the correct description of normalization, as this would mean that the sum of all of the bins should equal 1. Rather, the normalization appears to be to the bin with the largest number of mutants in it, which is given a value of 1. This difference is really critical to how one visually inspects the overlaid histograms.

      Thank you for this comment. Here, the intention was to aid in the visualization of the distribution of cancer-associated and resistance-associated mutations, which form a much smaller population than the whole library and are easily masked. We originally applied a “stat(ncount)” function in R, which as noted scales the data and sets the peak to 1, and which was only applied to the clinical and cancer-associated mutations plotted. Now, to better compare distributions, normalization has been removed, and we instead overlay the distributions of all missense mutations and the subset of clinical mutations directly, each with its own y-axis scale. This modification has been made throughout the Figure 6 panels, hopefully improving interpretability.
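
      The difference between the two normalizations at issue here (dividing by the total count versus scaling the tallest bin to 1) can be made concrete with a small Python sketch. The bin counts and function names are hypothetical and chosen only to illustrate the reviewer's point; they are not data from the study.

```python
def normalize_to_total(counts):
    """Scale bin counts so that all bins sum to 1 (a proportion per bin)."""
    total = sum(counts)
    return [c / total for c in counts]

def normalize_to_peak(counts):
    """Scale bin counts so that the tallest bin equals 1."""
    peak = max(counts)
    return [c / peak for c in counts]

# Made-up histograms: a large library and a much smaller clinical subset.
all_missense = [50, 400, 50]
clinical = [1, 8, 1]

total_scaled = normalize_to_total(clinical)
peak_all = normalize_to_peak(all_missense)
peak_clinical = normalize_to_peak(clinical)
```

      In this toy case the two peak-scaled histograms are identical even though one contains 500 mutations and the other 10, which is why bar heights cannot be compared across overlaid histograms once each is scaled to its own peak.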

      Reviewer #2 (Recommendations For The Authors):

      A few thoughts/suggestions:

      (1) Regarding kinase regulation, the "closing of N- and C-lobe" upon activation is an often mentioned component of activation, and I'm sure is true in many cases, but it is not a general feature of kinase activation.

      The text has been updated - we removed the description of N- and C-lobe closure. 

      (2) With respect to the inactive state of MET, the DFG-flipped structure discussed here is almost certainly an inhibitor-induced conformation. Again, DFG-flip is often discussed as a mechanism of kinase regulation, and while in some kinases this might be the case, more often it is a drug-induced or drug-stabilized inactive conformation. The SRC/CDK-like inactive conformation in 2G15 is more likely a physiologically relevant inactive state (or even better, the ATP-bound inactive state structure 3DKC, which exhibits a somewhat different SRC/CDK-like inactive conformation).

      The PDB 3R7O structure was chosen as the main representation because it was the clearest representation of a wild type structure with an aligned R- and C- spine, solvent-exposed, phosphorylated activation loop. Although 3DKC is bound to ATP, this structure is still in an inactive conformation and has stabilizing mutations (Y1234/F, Y1235D) and an atypical alpha helix structure in the activation loop. However, we agree the SRC/CDK-like inactive conformation is an important representation and we have incorporated our structural mapping on 2G15 in the new supplemental figures with further details on statistical analysis and comparison of libraries.

      (3) Following the comments above, I would describe the process of activation in a simpler way (in any case, it is peripheral to the work described here). Something along the lines of "phosphorylation on tyrosines XX and XX induces rearrangement of the activation segment and promotes and stabilizes the inward active position of the C-helix." Can go on to mention that this forms the E1127/K1110 salt bridge. (The DFG is already "in" in the SRC/CDK-like inactive state).

      We have changed the language to more simply describe activation. Thank you!

      (4) Would be great to see DMS with the intact receptor done in a way that could identify mutations that lead to activation in a ligand-independent manner. (but obviously beyond the scope of this paper).

      Agreed! This would be an excellent follow up for the future, especially to elucidate juxtamembrane regulation, as the membrane context is likely required.

      A typo or two:

      Boarded instead of bordered/outlined in legend to Fig. 1.

      P11553L in the 2nd line of the 2nd paragraph in that section.

      Thank you, we have addressed these typos!

    2. eLife assessment

      This manuscript describes a deep mutational scanning study of the kinase domain of the MET receptor tyrosine kinase. The study yields an important catalog of essentially all possible deleterious mutations in this portion of the receptor, with convincing evidence. The manuscript will be of interest to researchers working in the field of receptor tyrosine kinases.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors describe a deep mutational scanning (DMS) study of the kinase domain of the c-MET receptor tyrosine kinase. The screen is conducted with a highly activated fusion oncoprotein - Tpr-MET - in which the MET kinase domain is fused to the Tpr dimerization element. The mutagenized region includes the entire kinase domain and an alpha-helix in the juxtamembrane region that is essentially part of the MET kinase domain. The DMS screen is carried out in two contexts, one containing the entire cytoplasmic region of MET, and the other with an "exon 14 deletion" which removes a large portion of the juxtamembrane region (but retains the aforementioned alpha-helix). The work provides a robust and essentially exhaustive catalog of the effect of mutations (within the kinase domain) on the ability of the Tpr-MET fusion oncoproteins to drive IL3-independent growth of Ba/F3 cells. Every residue in the kinase is mutated to every natural amino acid. Given the design of the screen, one would expect it to be a powerful tool for identifying mutations that impair catalytic activity and therefore impair IL3-independent proliferation. This is borne out by the data, which reveal many deleterious mutations. The study reveals relatively few "gain-of-fitness" mutations, but this is not unexpected because it is carried out with an already-activated form of the MET kinase (the oncogenic Tpr-MET fusion).

      Strengths:

      The authors take a very scholarly and thorough approach in interpreting the effect of mutations in light of available information for the structure and regulation of MET and other kinases. They examine the effect of mutations in the so-called catalytic (C) and regulatory (R) spines, the interface between the JM alpha-helix and the C-helix, the glycine-rich loop and other key elements of the kinase, providing a structural rationale for the deleterious effect of mutations. Comparison of the panoply of deleterious mutations in the TPR-MET versus TPR-exon14del-MET DMS screens reveals an interesting difference - the exon14 deletion MET is much more tolerant of mutations in the JM alpha-helix/C-helix interface. The reason for this is unclear, however.

      An important qualification of the study is that it was carried out with the already highly activated Tpr-Met fusion. As a consequence, it is not expected to reveal mutations that activate the kinase -- activate in the sense of promoting a switch between physiologically-relevant inactive and active states. Consistent with this, the authors note that gain-of-fitness mutations are rare in their screen, and those that are identified induce modest but significant increases in fitness.

    1. Reviewer #2 (Public Review):

      Summary:

      The study focuses on the vomeronasal organ, the peripheral chemosensory organ of the accessory olfactory system, by employing single-cell transcriptomics. The authors analyzed the mouse vomeronasal organ, identifying diverse cell types through their unique gene expression patterns. Developmental gene expression analysis revealed that two classes of sensory neurons diverge in their maturation from common progenitors, marked by specific transient and persistent transcription factors. A comparative study between major neuronal subtypes, which differ in their G-protein sensory receptor families and G-protein subunits (Gnai2 and Gnao1, respectively), highlighted a higher expression of endoplasmic reticulum (ER) associated genes in Gnao1 neurons. Moreover, distinct differences in ER content and ultrastructure suggest some intriguing roles of ER in Gnao1-positive vomeronasal neurons. This work is likely to provide useful data for the community and is conceptually novel with the unique role of ER in a subset of vomeronasal neurons. This reviewer has some minor concerns and some suggestions to improve the manuscript.

      Strengths:

      (1) The study identified diverse cell types based on unique gene expression patterns, using single-cell transcriptomics.

      (2) The analysis suggests that two classes of sensory neurons diverge during maturation from common progenitors, characterized by specific transient and persistent transcription factors.

      (3) A comparative study highlighted differences in Gnai2- and Gnao1-positive sensory neurons.

      (4) Higher expression of endoplasmic reticulum (ER) associated genes in Gnao1 neurons.

      (5) Distinct differences in ER content and ultrastructure suggest unique roles of ER in Gnao1-positive vomeronasal neurons.

      (6) The research provides a conceptually novel view of the unique role of ER in a subset of vomeronasal neurons, offering valuable insights to the community.

      Weaknesses:

      (1) The connection between observations from scRNA-seq and EM is unclear.

      (2) The lack of quantification for the ER phenotype is a concern.

    2. eLife assessment

      This valuable study uses single-cell transcriptomics to explore the mouse vomeronasal organ and represents an advance that enhances our understanding of neural diversity within this sensory system. Findings suggest a unique endoplasmic reticulum (ER) structure in Gnao1 neurons and allow for the synthesis of a developmental trajectory from stem cells to mature vomeronasal sensory neurons. Convincing methods, data, and analyses broadly support the claims, although experiments supporting the main ER-related claim require additional quantification of co-expression and statistics on labeling intensity or coverage. Adding these data would greatly strengthen the conclusions of the paper.

    3. Reviewer #1 (Public Review):

      Devakinandan and colleagues present a manuscript analyzing single-cell RNA-sequencing data from the mouse vomeronasal organ. The main advances in this manuscript are to identify and verify the differential expression of genes that distinguish apical and basal vomeronasal neurons. The authors also identify the enriched expression of ER-related genes in Gnao1 neurons, which they verify with in situ hybridizations and immunostaining, and also explore via electron microscopy. Finally, the results of this manuscript are presented in an online R shiny app. Overall, these data are a useful resource to the community. I have a few concerns about the manuscript, which I've listed below.

      General Concerns:

      (1) The authors mention that they were unable to identify the cells in cluster 13. This cluster looks similar to the "secretory VSN" subtype described in a recent preprint from C. Ron Yu's lab (10.1101/2024.02.22.581574). The authors could try comparing or integrating their data with this dataset (or that in Katreddi et al. 2022) to see if this is a common cell type across datasets (or arises from a specific type of cell doublets). In situ hybridizations for some of the marker genes for this cluster could also highlight where in the VNO these cells reside.

      (2) I found the UMAPs for the neurons somewhat difficult to interpret. Unlike Katreddi et al. 2022 or Hills et al. 2024, it's tricky to follow the developmental trajectories of the cells in the UMAP space. Perhaps the authors could try re-embedding the data using gene sets that don't include the receptors? It would also be interesting to see if the neuron clusters still cluster by receptor-type even when the receptors are excluded from the gene sets used for clustering. Plots relating the original clusters to the neuronal clusters, or dot plots showing marker gene expression for the neuronal clusters might both be useful. For example, right now it's difficult to interpret clusters like n8-13.

    4. Reviewer #3 (Public Review):

      Summary:

      In this manuscript, Devakinandan and colleagues have undertaken a thorough characterization of the cell types of the mouse vomeronasal organ, focusing on the vomeronasal sensory neurons (VSNs). VSNs are known to arise from a common pool of progenitors that differentiate into two distinct populations characterized by the expression of either the G protein subunit Gnao1 or Gnai2. Using single-cell RNA sequencing followed by unsupervised clustering of the transcriptome data, the authors identified three Gnai2+ VSN subtypes and a single Gnao1+ VSN type. To study VSN developmental trajectories, Devakinandan and colleagues took advantage of the constant renewal of the neuronal VSN pool, which allowed them to harvest all maturation states. All neurons were re-clustered and a pseudotime analysis was performed. The analysis revealed the emergence of two pools of Gap43+ clusters from a common lineage, which differentiate into many subclusters of mature Gnao1+ and Gnai2+ VSNs. By comparing the transcriptomes of these two pools of immature VSNs, the authors identified a number of differentially expressed transcription factors in addition to known markers. Next, by comparing the transcriptomes of mature Gnao1+ and Gnai2+ VSNs, the authors report the enrichment of ER-related genes in Gnao1+ VSNs. Using electron microscopy, they found that this enrichment was associated with specific ER morphology in Gnao1+ neurons. Finally, the authors characterized chemosensory receptor expression and co-expression (as well as H2-Mv proteins) in mature VSNs, which recapitulated known patterns.

      Strengths:

      The data presented here provide new and interesting perspectives on the distinguishing features between Gnao1+ and Gnai2+ VSNs. These features include newly identified markers, such as transcription factors, as well as an unsuspected ER-related peculiarity in Gnao1+ neurons, consisting of a hypertrophic ER and an enrichment in ER-related genes. In addition, the authors provide a comprehensive picture of specific co-expression patterns of V2R chemoreceptors and H2-Mv genes.

      Importantly, the authors provide a browser (scVNOexplorer) for anyone to explore the data, including gene expression and co-expression, number and proportion of cells, with a variety of graphical tools (violin plots, feature plots, dot plots, ...).

      Weaknesses:

      The study still requires refined analyses of the data and rigorous quantification to support the main claims.

      The method description for filtering and clustering single-cell RNA-sequencing data is incomplete. The Seurat package has many available pipelines for single-cell RNA-seq analysis, with a significant impact on the output data. How did the authors pre-process and normalize the data? Was the pipeline used with default settings? What batch correction method was applied to the data to mitigate possible sampling or technical effects? Moreover, the authors do not describe how cell and gene filtering was performed. The data in Figure 7-Supplement 3 show that one-sixth of the V1Rs do not express any chemoreceptor, while over a hundred cells express more than one chemoreceptor. Do these cells have unusually high or low numbers of genes or counts? To exclude the possibility of a technical artifact in these observations, the authors should describe how they dealt with putative doublet cells or debris. Surprisingly, some clusters are characterized by the expression of specific chemoreceptors (VRs). Have these been used for clustering? If so, clustering should be repeated after excluding these receptors.

      The identification of the VSN types should be consistent across the different analyses and validated. The data presented in Figure 1 lists four mature VSN types, whereas the re-clustering of neurons presented in Figure 3 leads to a different subdivision. At present, it remains unclear whether these clusters reflect the biology of the system or are due to over-clustering of the data, and therefore correspond to either noise or arbitrary splitting of continua. Clusters should be merged if they do not correspond to discrete categories of cells, and correspondence should be established between the different clustering analyses. To validate the detected clusters as cell types, markers characteristic of each of these populations can be evaluated by ISH or IHC.

      There is a lack of quantification of imaging data, which provides little support for the ER-related main claim. Quantification of co-expression and statistics on labeling intensity or coverage would greatly strengthen the conclusions and the title of the paper.

    5. Author response:

      eLife assessment

      This valuable study uses single-cell transcriptomics to explore the mouse vomeronasal organ and represents an advance that enhances our understanding of neural diversity within this sensory system. Findings suggest a unique endoplasmic reticulum (ER) structure in Gnao1 neurons and allow for the synthesis of a developmental trajectory from stem cells to mature vomeronasal sensory neurons. Convincing methods, data, and analyses broadly support the claims, although experiments supporting the main ER-related claim are incomplete and lack quantification of co-expression and statistics on labeling intensity or coverage. Adding these data would greatly strengthen the conclusions of the paper.

      Public Reviews:

      Reviewer #1 (Public Review):

      Devakinandan and colleagues present a manuscript analyzing single-cell RNA-sequencing data from the mouse vomeronasal organ. The main advances in this manuscript are to identify and verify the differential expression of genes that distinguish apical and basal vomeronasal neurons. The authors also identify the enriched expression of ER-related genes in Gnao1 neurons, which they verify with in situ hybridizations and immunostaining, and also explore via electron microscopy. Finally, the results of this manuscript are presented in an online R shiny app. Overall, these data are a useful resource to the community. I have a few concerns about the manuscript, which I've listed below.

      General Concerns:

      (1) The authors mention that they were unable to identify the cells in cluster 13. This cluster looks similar to the "secretory VSN" subtype described in a recent preprint from C. Ron Yu's lab (10.1101/2024.02.22.581574). The authors could try comparing or integrating their data with this dataset (or that in Katreddi et al. 2022) to see if this is a common cell type across datasets (or arises from a specific type of cell doublets). In situ hybridizations for some of the marker genes for this cluster could also highlight where in the VNO these cells reside.

      Cluster 13 (Obp2a+) cells identified in our study share gene expression markers with the “putative secretory” cells described in the Hills et al. manuscript. By the time that manuscript became publicly available, our own manuscript had already been finalized and communicated. We welcome the suggestion to integrate the data, which we will attempt and address in our revision.

      (2) I found the UMAPs for the neurons somewhat difficult to interpret. Unlike Katreddi et al. 2022 or Hills et al. 2024, it's tricky to follow the developmental trajectories of the cells in the UMAP space. Perhaps the authors could try re-embedding the data using gene sets that don't include the receptors? It would also be interesting to see if the neuron clusters still cluster by receptor-type even when the receptors are excluded from the gene sets used for clustering. Plots relating the original clusters to the neuronal clusters, or dot plots showing marker gene expression for the neuronal clusters might both be useful. For example, right now it's difficult to interpret clusters like n8-13.

      We will revise the presentation of the UMAPs to make the developmental trajectory clearer. How neuron clusters are affected by the presence or exclusion of receptors is an interesting question that we will address in our revision, along with showing markers of each neuronal cluster, as suggested by the reviewer.

      Reviewer #2 (Public Review):

      Summary:

      The study focuses on the vomeronasal organ, the peripheral chemosensory organ of the accessory olfactory system, by employing single-cell transcriptomics. The author analyzed the mouse vomeronasal organ, identifying diverse cell types through their unique gene expression patterns. Developmental gene expression analysis revealed that two classes of sensory neurons diverge in their maturation from common progenitors, marked by specific transient and persistent transcription factors. A comparative study between major neuronal subtypes, which differ in their G-protein sensory receptor families and G-protein subunits (Gnai2 and Gnao1, respectively), highlighted a higher expression of endoplasmic reticulum (ER) associated genes in Gnao1 neurons. Moreover, distinct differences in ER content and ultrastructure suggest some intriguing roles of ER in Gnao1-positive vomeronasal neurons. This work is likely to provide useful data for the community and is conceptually novel with the unique role of ER in a subset of vomeronasal neurons. This reviewer has some minor concerns and some suggestions to improve the manuscript.

      Strengths:

      (1) The study identified diverse cell types based on unique gene expression patterns, using single-cell transcriptomics.

      (2) The analysis suggests that two classes of sensory neurons diverge during maturation from common progenitors, characterized by specific transient and persistent transcription factors.

      (3) A comparative study highlighted differences in Gnai2- and Gnao1-positive sensory neurons.

      (4) Higher expression of endoplasmic reticulum (ER) associated genes in Gnao1 neurons.

      (5) Distinct differences in ER content and ultrastructure suggest unique roles of ER in Gnao1-positive vomeronasal neurons.

      (6) The research provides conceptually novel findings on the unique role of ER in a subset of vomeronasal neurons, offering valuable insights to the community.

      Weaknesses:

      (1) The connection between observations from scRNA-seq and EM is unclear.

      (2) The lack of quantification for the ER phenotype is a concern.

      We would like to point out that the connection between scRNA-seq and EM was made in our experiments that investigated the localization of ER proteins via IHC (Figure 5). The intriguing observation that the levels of a number of ER luminal and membrane proteins were higher in Gnao1 than in Gnai2 neurons led us to hypothesize a differential ER content or ultrastructure, which was verified by EM. Quantification of the ER phenotype would definitely strengthen our observations, and we will add it to our revised manuscript.

      Reviewer #3 (Public Review):

      Summary:

      In this manuscript, Devakinandan and colleagues have undertaken a thorough characterization of the cell types of the mouse vomeronasal organ, focusing on the vomeronasal sensory neurons (VSNs). VSNs are known to arise from a common pool of progenitors that differentiate into two distinct populations characterized by the expression of either the G protein subunit Gnao1 or Gnai2. Using single-cell RNA sequencing followed by unsupervised clustering of the transcriptome data, the authors identified three Gnai2+ VSN subtypes and a single Gnao1+ VSN type. To study VSN developmental trajectories, Devakinandan and colleagues took advantage of the constant renewal of the neuronal VSN pool, which allowed them to harvest all maturation states. All neurons were re-clustered and a pseudotime analysis was performed. The analysis revealed the emergence of two pools of Gap43+ clusters from a common lineage, which differentiate into many subclusters of mature Gnao1+ and Gnai2+ VSNs. By comparing the transcriptomes of these two pools of immature VSNs, the authors identified a number of differentially expressed transcription factors in addition to known markers. Next, by comparing the transcriptomes of mature Gnao1+ and Gnai2+ VSNs, the authors report the enrichment of ER-related genes in Gnao1+ VSNs. Using electron microscopy, they found that this enrichment was associated with specific ER morphology in Gnao1+ neurons. Finally, the authors characterized chemosensory receptor expression and co-expression (as well as H2-Mv proteins) in mature VSNs, which recapitulated known patterns.

      Strengths:

      The data presented here provide new and interesting perspectives on the distinguishing features between Gnao1+ and Gnai2+ VSNs. These features include newly identified markers, such as transcription factors, as well as an unsuspected ER-related peculiarity in Gnao1+ neurons, consisting of a hypertrophic ER and an enrichment in ER-related genes. In addition, the authors provide a comprehensive picture of specific co-expression patterns of V2R chemoreceptors and H2-Mv genes.

      Importantly, the authors provide a browser (scVNOexplorer) for anyone to explore the data, including gene expression and co-expression, number and proportion of cells, with a variety of graphical tools (violin plots, feature plots, dot plots, ...).

      Weaknesses:

      The study still requires refined analyses of the data and rigorous quantification to support the main claims.

      The method description for filtering and clustering single-cell RNA-sequencing data is incomplete. The Seurat package has many available pipelines for single-cell RNA-seq analysis, with a significant impact on the output data. How did the authors pre-process and normalize the data? Was the pipeline used with default settings? What batch correction method was applied to the data to mitigate possible sampling or technical effects? Moreover, the authors do not describe how cell and gene filtering was performed.

      The data in Figure 7-Supplement 3 show that one-sixth of the V1Rs do not express any chemoreceptor, while over a hundred cells express more than one chemoreceptor. Do these cells have unusually high or low numbers of genes or counts? To exclude the possibility of a technical artifact in these observations, the authors should describe how they dealt with putative doublet cells or debris.

      Surprisingly, some clusters are characterized by the expression of specific chemoreceptors (VRs). Have these been used for clustering? If so, clustering should be repeated after excluding these receptors.

      The identification of the VSN types should be consistent across the different analyses and validated. The data presented in Figure 1 lists four mature VSN types, whereas the re-clustering of neurons presented in Figure 3 leads to a different subdivision. At present, it remains unclear whether these clusters reflect the biology of the system or are due to over-clustering of the data, and therefore correspond to either noise or arbitrary splitting of continua. Clusters should be merged if they do not correspond to discrete categories of cells, and correspondence should be established between the different clustering analyses. To validate the detected clusters as cell types, markers characteristic of each of these populations can be evaluated by ISH or IHC.

      There is a lack of quantification of imaging data, which provides little support for the ER-related main claim. Quantification of co-expression and statistics on labeling intensity or coverage would greatly strengthen the conclusions and the title of the paper.

      scRNA-seq data analysis methods: We agree with the reviewer and will elaborate on the various criteria, parameters, and methods in our revision. As described above, our revised manuscript will include an analysis of how inclusion or exclusion of VRs affects cell clusters, as well as quantification of the ER phenotype. We will also address the reviewer’s concern about over-clustering.

      We think that the cells expressing zero as well as two V1Rs are real and cannot be attributed to debris or doublets for the following reasons:

      a) Cells expressing no V1Rs are not necessarily debris because they express other neuronal markers at the same level as cells that express one or two V1Rs. The higher expression threshold values used in our analysis may have somewhat increased the proportion of cells with zero V1Rs. We will modify Figure 7-supplement 3c to add another group showing the Gnai2 level in cells expressing zero V1Rs.

      b) Cells co-expressing V1R genes: We listed the frequency of cells co-expressing V1R gene combinations in Supplementary Table 8. Among 134 cells that express two V1Rs, 44 cells express Vmn1r85+Vmn1r86, 21 express Vmn1r184+Vmn1r185, 13 express Vmn1r56+Vmn1r57, 6 express Vmn1r168+Vmn1r177, and so on. Doublets are generally a random combination of two cells. Here, each specific co-expression combination is represented by multiple cells, which is highly unlikely to occur by chance. Some of the co-expression combinations were identified earlier and verified experimentally in Lee et al., 2019 and Hills et al. Furthermore, Figure 7-supplement 3c shows that the level of Gnai2 expression is comparable across cells expressing one or two V1Rs. If the cells expressing two V1Rs were doublets, we would expect their Gnai2 level to be higher than that of cells expressing a single V1R. We will elaborate on this in our revised manuscript.

    1. eLife assessment

      This study compiles a wide range of results on the connectivity, stimulus selectivity, and potential role of the claustrum in sensory behavior. While most of the connectivity results confirm earlier studies, this valuable work provides incomplete evidence that the claustrum responds to multimodal stimuli and that local connectivity is reduced across cells that have similar long-range connectivity. The conclusions drawn from the behavioral results are weakened by the animals' poor performance on the designed task. This study has the potential to be of interest to neuroscientists.

    2. Reviewer #1 (Public Review):

      Summary:

      The paper by Shelton et al. investigates some of the anatomical and physiological properties of the mouse claustrum. First, they characterize the intrinsic properties of claustrum excitatory and inhibitory neurons and determine how these different claustrum neurons receive input from different cortical regions. Next, they perform in vitro patch clamp recordings to determine the extent of intraclaustrum connectivity between excitatory neurons. Following these experiments, in vivo axon imaging was performed to determine how claustrum-retrosplenial cortex neurons are modulated by different combinations of auditory, visual, and somatosensory input. Finally, the authors perform claustrum lesions to determine if claustrum neurons are required for performance on a multisensory discrimination task.

      Strengths:

      An important potential contribution the authors provide is the demonstration of intra-claustrum excitation. In addition, this paper provides the first experimental data where two cortical inputs are independently stimulated in the same experiment (using 2 different opsins). Overall, the in vitro patch clamp experiments and anatomical data provide confirmation that claustrum neurons receive convergent inputs from areas of the frontal cortex. These experiments were conducted with rigor and are of high quality.

      Weaknesses:

      The title of the paper states that claustrum neurons integrate information from different cortical sources. However, the authors did not actually test or measure integration in the manuscript. They do show physiological convergence of inputs on claustrum neurons in the slice work. Testing integration through simultaneous activation of inputs was not performed. The convergence of cortical input has been recently shown by several other papers (Chia et al), and the current paper largely supports these previous conclusions. The in vivo work did test for integration because simultaneous sensory stimulations were performed. However, integration was not measured at the single cell (axon) level because it was unclear how activity in a single claustrum ROI changes in response to (for example) visual, tactile, and visual-tactile stimulations. Reading the discussion, I also see the authors speculate that the sensory responses in the claustrum could arise from attentional or salience-related inputs from an upstream source such as the PFC. In this case, claustrum cells would not integrate anything (but instead respond to PFC inputs).

      The different experiments in different figures often do not inform each other. For example, the authors show in Figure 3 that claustrum-RSP cells (CTB cells) do not receive input from the auditory cortex. But then, in Figure 6 auditory stimuli are used. Not surprisingly, claustrum ROIs respond very little to auditory stimuli (the weakest of all sensory modalities). Then, in Figure 7 the authors use auditory stimuli in the multisensory task. It seems that these experiments were done independently and were not used to inform each other.

      One novel aspect of the manuscript is the focus on intraclaustrum connectivity between excitatory cells (Figure 2). The authors used wide-field optogenetics to investigate connectivity. However, the use of paired patch-clamp recordings remains the ground truth technique for determining the rate of connectivity between cell types, and paired recordings were not performed here. It is difficult to understand and gain appreciation for intraclaustrum connectivity when only wide-field optogenetics is used.

      In Figure 2, CLA-rsp cells express Chrimson, and the authors removed cells from the analysis with short latency responses (which reflect opsin expression). But wouldn't this also remove cells that express opsin and receive monosynaptic inputs from other opsin-expressing cells, therefore underestimating the connectivity between these CLA-rsp neurons? I think this needs to be addressed.

      In Figure 5J the lack of difference in the EPSC-IPSC timing in the RSP is likely due to 1 outlier EPSC at 30ms which is most likely reflecting polysynaptic communication. Therefore, I do not feel the argument being made here with differences in physiology is particularly striking.

      In the text describing Figure 5, the authors state "These experiments point to a complex interaction ....likely influenced by cell type of CLA projection and intraclaustral modules in which they participate". How does this slice experiment stimulating axons from one input relate to different CLA cell types or intra-claustrum circuits? I don't follow this argument.

      In Figure 6G and H, the blank condition yields a result similar to many of the sensory stimulus conditions. This blank condition (when no stimulus was presented) serves as a nice reference to compare the rest of the conditions. However, the remainder of the stimulation conditions were not adjusted relative to what would be expected by chance. For example, the response of each cell could be compared to a distribution of shuffled data, where time-series data are shuffled in time by randomly assigned intervals and a surrogate distribution of responses generated. This procedure is repeated 200-1000x to generate a distribution of shuffled responses. Then the original stimulus-triggered response (1s post) could be compared to shuffled data. Currently, the authors just compare pre/post-mean data using a Mann-Whitney test from the mean overall response, which could be biased by a small number of trials. Therefore, I think a more conservative and statistically rigorous approach is warranted here, before making the claim of a 20% response probability or 50% overall response rate.
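
      The shuffle procedure suggested above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis: it assumes one ROI's dF/F trace as a plain list, stimulus onsets given as sample indices, and equal pre- and post-stimulus windows of `w` samples. The function names (`mean_response`, `shuffle_test`) and all parameters are hypothetical.

```python
import random
from statistics import fmean


def mean_response(trace, onsets, w):
    """Mean over trials of (post-stimulus mean) - (pre-stimulus mean)."""
    diffs = [fmean(trace[t:t + w]) - fmean(trace[t - w:t]) for t in onsets]
    return fmean(diffs)


def shuffle_test(trace, onsets, w, n_shuffles=1000, seed=0):
    """Compare the observed response to a null built by circular time shifts.

    Returns the observed response and a one-sided p-value (with +1 correction).
    """
    rng = random.Random(seed)
    n = len(trace)
    # Keep only onsets that have a full pre and post window.
    onsets = [t for t in onsets if w <= t <= n - w]
    observed = mean_response(trace, onsets, w)

    exceed = 0
    for _ in range(n_shuffles):
        shift = rng.randrange(w, n - w)            # random interval
        shifted = trace[-shift:] + trace[:-shift]  # circular shift in time
        if mean_response(shifted, onsets, w) >= observed:
            exceed += 1
    p_value = (exceed + 1) / (n_shuffles + 1)
    return observed, p_value
```

      An ROI would then be called responsive only if its p-value falls below a chosen threshold, ideally after correcting for the number of ROIs and stimulus conditions tested.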

      Regarding Figure 6, a more conventional way to show sensory responses is to display a heatmap of the z-scored responses across all ROIs, sorted by their post-stimulus response. This enables the reader to better visualize and understand the claims being made here, rather than relying on the overall mean which could be influenced by a few highly responsive ROIs.

      For Figure 6, it would also help to display some raw data showing responses at the single ROI level and the population level. If these sensory stimulations are modulating claustrum neurons, then this will be observable on the mean population vector (averaged df/f across all ROIs as a function of time) within a given experiment and would add support to the conclusions being made.

      As noted by the authors, there is substantial evidence in the literature showing that motor activity arises in mice during these types of sensory stimulation experiments. It is foreseeable that at least some of the responses measured here arise from motor activity. It would be important to identify to what extent this is the case.

      All claims in the results for Figure 6 such as "the proportion of responsive axons tended to be highest when stimuli were combined" should be supported by statistics.

      In Figure 7, the authors state that mice learned the structure of the task. How is this the case, when the number of misses is 5-6x greater than the number of hits on audiovisual trials (S Figure 19)? I don't get the impression that mice perform this task correctly. As shown in Figure 7I, the hit rate is exceptionally low on the audiovisual port in controls. I just can't see how control and lesion mice can have the same hit rate and false alarm rate yet have different d'. Indeed, I might be missing something in the analysis. However, given that both groups of mice are not performing the task as designed, I fail to see how the authors' claim regarding multisensory integration by the claustrum is supported. Even if there is some difference in the d' measure, what does that matter when hits are the least likely trial outcome here for both groups?
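
      For reference, a minimal sketch of the standard signal-detection definition, d' = z(hit rate) - z(false-alarm rate). This definition is assumed here and may differ from the authors' exact computation (for example, if rates of 0 or 1 are corrected, or if d' is computed per animal before averaging). The function name and the rates below are illustrative and are not values from the manuscript.

```python
from statistics import NormalDist


def d_prime(hit_rate, fa_rate):
    """Sensitivity index: z(hit rate) - z(false-alarm rate).

    Rates must lie strictly between 0 and 1; inv_cdf raises an error otherwise.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


# Illustrative rates only (hypothetical, not taken from the manuscript).
control = d_prime(0.15, 0.05)
lesion = d_prime(0.15, 0.05)
print(control == lesion)
```

      Under this definition, two groups with identical pooled hit and false-alarm rates have identical d'. A group difference in d' despite matching mean rates would therefore have to come from how the rates are pooled, averaged, or corrected.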

      In the discussion, it is stated that "While axons responded inconsistently to individual stimulus presentations, their responsivity remained consistent between stimuli and through time on average...". I do not understand this part of the sentence. Does this mean axons are consistently inconsistent?

      In the discussion, the authors state their axon imaging results contrast with recent studies in mice. Why not actually do the same analysis that Ollerenshaw did, so this statement is supported by fact? As pointed out above, the criteria used to classify an axon as responsive to stimuli were very liberal in this current manuscript.

      I find the discussion wildly speculative and broad. For example, "the integrative properties of the CLA could act as a substrate for transforming the information content of its inputs (e.g. reducing trial-to-trial variability of responses to conjunctive stimuli...)". How would a claustrum neuron responding with 10% reliability to a stimulus (or set of stimuli) play any role in reducing trial-to-trial variability of sensory activity in the cortex?

    3. Reviewer #2 (Public Review):

      Summary:

      In this manuscript, Shelton et al. explore the organization of the Claustrum. To do so, they focus on a specific claustrum population, the one projecting to the retrosplenial cortex (CLA-RSP neurons). Using an elegant technical approach, they first described electrophysiological properties of claustrum neurons, including the CLA-RSP ones. Further, they showed that CLA-RSP neurons (1) directly excite other CLA neurons, in a 'projection-specific' pattern, i.e. CLA-RSP neurons mainly excite claustrum neurons not projecting to the RSP and (2) received excitatory inputs from multiple cortical territories (mainly frontal ones). To confirm the 'integrative' property of claustrum networks, they then imaged claustrum axons in the cortex during single- or multi-sensory stimulations. Finally, they investigated the effect of CLA-RSP lesion on performance in a sensory detection task.

      Strengths:

      Overall, this is a really good study, using state-of-the-art technical approaches to probe the local/global organization of the Claustrum. The in-vitro part is impressive, and the results are compelling.

      Weaknesses:

      One noteworthy concern arises from the terminology used throughout the study. The authors claimed that the claustrum is an integrative structure. Yet, integration has a specific meaning, i.e. the production of a specific response by a single neuron (or network) in response to a specific combination of several input signals. In this study, the authors showed compelling results in favor of convergence rather than integration. On a lighter note, the in-vivo data are less convincing, and do not entirely support the claim of "integration" made by the authors.

    4. Reviewer #3 (Public Review):

      The claustrum is one of the most enigmatic regions of the cerebral cortex, with a potential role in consciousness and integrating multisensory information. Despite extensive connections with almost all cortical areas, its functions and mechanisms are not well understood. In an attempt to unravel these complexities, Shelton et al. employed advanced circuit mapping technologies to examine specific neurons within the claustrum. They focused on how these neurons integrate incoming information and manage the output. Their findings suggest that claustrum neurons selectively communicate based on cortical projection targets and that their responsiveness to cortical inputs varies by cell type.

      Imaging studies demonstrated that claustrum axons respond to both single and multiple sensory stimuli. Extended inhibition of the claustrum significantly reduced animals' responsiveness to multisensory stimuli, highlighting its critical role as an integrative hub in the cortex.

      However, the study's conclusions at times rely on assumptions that may undermine their validity. For instance, the comparison between RSC-projecting and non-RSC-projecting neurons is problematic due to potential false negatives in the cell labeling process, which might not capture the entire neuron population projecting to a brain area. This issue casts doubt on the findings related to neuron interconnectivity and projections, suggesting that the results should be interpreted with caution. The study's approach to defining neuron types based on projection could benefit from a more critical evaluation or a broader methodological perspective.

      Nevertheless, the study sets the stage for many promising future research directions. Future work could particularly focus on exploring the functional and molecular differences between E1 and E2 neurons and further assess the implications of the distinct responses of excitatory and inhibitory claustrum neurons for internal computations. Additionally, adopting a different behavioral paradigm that more directly tests the integration of sensory information for purposeful behavior could also prove valuable.

  2. Jun 2024
    1. eLife assessment

      This valuable study uses dynamic metabolic models to compare perturbation responses in a bacterial system, analyzing whether they return to their steady state or amplify beyond the initial perturbation. The evidence linking the emergent properties of perturbed metabolic systems to network topology and sensitivity to specific metabolites is compelling. However, the mathematical explanation of the perturbation response is incomplete, and a more comprehensive metabolic and biosynthesis model would be beneficial.

    2. Reviewer #1 (Public Review):

      Summary

      The author studied metabolic networks for central metabolism, focusing on how system trajectories returned to their steady state. To quantify the response, systematic perturbation was performed in simulation and the maximal destabilization away from the steady state (compared with the initial perturbation distance) was characterized. The author analyzed the perturbation response and found that sparse networks and networks with more cofactors are more "stable", in the sense that the perturbed trajectories have smaller deviations along the path back to the steady state.

      Strengths and major contributions

      The author compared three metabolic models and performed systematic perturbation analysis in simulation. This is the first work to characterize how perturbed trajectories deviate from equilibrium in large biochemical systems, and it illustrates interesting findings about the difference between sparse biological systems and randomly simulated reaction networks.

      Weaknesses

      There are two main weaknesses in this study:

      First, the metabolic network in this study is incomplete. For example, amino acid synthesis and lipid synthesis are important for biomass and growth, but they are not included in the three models used in this study. NADH and NADPH are as important as ATP/ADP/AMP, but they are not included in the models. In the future, a more comprehensive metabolic and biosynthesis model is required.

      Second, this work does not provide a mathematical explanation of the perturbation response χ. Since the perturbation analysis is performed close to the steady state (or at least belongs to the attractor of single-steady-state), local linear analysis would provide useful information. By complementing with other analysis in dynamical systems (described below) we can gain more logical insights about perturbation response.
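
      To illustrate what a local linear analysis can reveal, here is a minimal sketch on a toy 2x2 system; the matrix is hypothetical and is not derived from any of the three metabolic models. Near a steady state the deviation δ obeys dδ/dt = Jδ, where J is the Jacobian. Even when every eigenvalue of J has a negative real part, a perturbation can grow transiently if the largest eigenvalue of the symmetric part (J + Jᵀ)/2 is positive, which is the kind of overshoot a maximal-deviation measure picks up. The function names are assumptions made for this example.

```python
import math


def symmetric_part_max_eig(J):
    """Largest eigenvalue of (J + J^T)/2 for a 2x2 matrix J (closed form)."""
    (a, b), (c, d) = J
    return (a + d) / 2 + math.hypot((a - d) / 2, (b + c) / 2)


def max_amplification(J, delta0, dt=1e-3, t_end=10.0):
    """Forward-Euler integration of d(delta)/dt = J delta.

    Returns max_t |delta(t)| / |delta(0)|, i.e. the peak relative deviation.
    """
    (a, b), (c, d) = J
    x, y = delta0
    norm0 = math.hypot(x, y)
    peak = norm0
    for _ in range(int(t_end / dt)):
        x, y = x + dt * (a * x + b * y), y + dt * (c * x + d * y)
        peak = max(peak, math.hypot(x, y))
    return peak / norm0


# Toy Jacobians: both have eigenvalues -1 and -2 (asymptotically stable).
J_coupled = ((-1.0, 10.0), (0.0, -2.0))   # strong one-way coupling
J_diagonal = ((-1.0, 0.0), (0.0, -2.0))   # no coupling

print(symmetric_part_max_eig(J_coupled), max_amplification(J_coupled, (0.0, 1.0)))
print(symmetric_part_max_eig(J_diagonal), max_amplification(J_diagonal, (0.0, 1.0)))
```

      In this toy case the coupled system overshoots to roughly 2.5 times the initial perturbation before relaxing, while the uncoupled one decays monotonically, although both share the same eigenvalues. Applying the same check to the Jacobians of the actual models could help explain which perturbation directions are amplified.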

      Discussion and impact for the field

      Metabolic perturbation is an important topic in cell biology and has important clinical implications in pharmacodynamics. The computational analysis in this study provides an initiative for future quantitative analysis on metabolism and homeostasis.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors have conducted a valuable comparative analysis of perturbation responses in three nonlinear kinetic models of E. coli central carbon metabolism found in the literature. They aimed to uncover commonalities and emergent properties in the perturbation responses of bacterial metabolism. They discovered that perturbations in the initial concentrations of specific metabolites, such as adenylate cofactors and pyruvate, significantly affect the maximal deviation of the responses from steady-state values. Furthermore, they explored whether the network connectivity (sparse versus dense connections) influences these perturbation responses. The manuscript is reasonably well written.

      Strengths:

      Well-defined and valuable research questions.

      Weaknesses:

      (1) In the study on determining key metabolites affecting responses to perturbations (starting from line 171), the authors fix the values of individual concentrations to their steady-state values and observe the responses. Such a procedure adds artificial constraints to the network because, in the natural responses of cells (and models) to perturbations, it is highly unlikely that metabolites will not evolve in time. By fixing the values of specific metabolites, the authors prohibit the metabolic network from evolving in the most optimal way to compensate for the perturbation. Instead of this procedure, have the authors considered for this task applying techniques from variance-based sensitivity analysis (Sobol, global sensitivity analysis), where they can calculate the first-order sensitivity index and total effect index? Using this technique, the authors would be able to determine the key metabolites while allowing for metabolic responses to perturbations without unnatural constraints.
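
      To make the suggestion concrete, here is a minimal sketch of variance-based sensitivity analysis, under stated assumptions. It uses the standard Saltelli (first-order) and Jansen (total-effect) Monte Carlo estimators on a hypothetical toy function, `toy_model`, with independent inputs uniform on [0, 1]. In the setting of the manuscript, the inputs would be perturbations of initial metabolite concentrations and the output would be the response coefficient from a full simulation; neither is modelled here.

```python
import random


def toy_model(x):
    """Hypothetical stand-in for a simulated response; x[2] has no effect."""
    return x[0] + 2.0 * x[1]


def sobol_indices(model, n_inputs, n_samples=20000, seed=0):
    """Estimate first-order and total-effect Sobol indices.

    Inputs are assumed independent and uniform on [0, 1].
    """
    rng = random.Random(seed)
    A = [[rng.random() for _ in range(n_inputs)] for _ in range(n_samples)]
    B = [[rng.random() for _ in range(n_inputs)] for _ in range(n_samples)]
    fA = [model(a) for a in A]
    fB = [model(b) for b in B]

    # Center the outputs on the pooled mean to reduce estimator variance.
    pooled = fA + fB
    mean = sum(pooled) / len(pooled)
    var = sum((y - mean) ** 2 for y in pooled) / (len(pooled) - 1)
    fA = [y - mean for y in fA]
    fB = [y - mean for y in fB]

    first, total = [], []
    for i in range(n_inputs):
        # Rows of A with column i taken from B, so only input i is resampled.
        fABi = [model(a[:i] + [b[i]] + a[i + 1:]) - mean for a, b in zip(A, B)]
        first.append(sum(yb * (yab - ya)
                         for ya, yb, yab in zip(fA, fB, fABi)) / (n_samples * var))
        total.append(sum((ya - yab) ** 2
                         for ya, yab in zip(fA, fABi)) / (2.0 * n_samples * var))
    return first, total


if __name__ == "__main__":
    S1, ST = sobol_indices(toy_model, 3)
    print("first-order:", S1)
    print("total-effect:", ST)
```

      For this additive toy function the analytical indices are 0.2, 0.8, and 0 for both the first-order and total-effect measures. A gap between the two for a given input would indicate interactions with other inputs, which is the kind of information the fixed-concentration procedure cannot provide.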

      (2) To follow up on the previous remark, the authors state that the metabolites that augment the response coefficient when their concentration is fixed tend to be allosteric regulators. The authors should report which allosteric regulations are implemented in each of the models so that one can compare against Figure 2. Again, the effect of allosteric regulation by a specific metabolite that is quantified the way the authors did is biased by fixing the concentration value - it is true that negative feedback is broken when the metabolite concentration is fixed, however, in the rate law, there is still the fixed inhibition term with its value corresponding to the inhibition at the steady state. To see the effect of allosteric regulation by a metabolite, one can change the inhibition constants instead of constraining the responses with fixed concentrations.

      (3) Given the role of ATP in metabolic processes, the authors' finding of the sensitivity of the three networks' responses to perturbations in the AXP concentrations seems reasonable. However, drawing such firm conclusions from only three models, with each of them built around one steady state and having one kinetic parameter set even though they were built for different physiologies, raises some questions. It is well known from studies of the basins of attraction of steady states that nonlinear responses also depend on the actual steady states, the values of the kinetic parameters, and the implemented kinetic rate laws, i.e., not only on the topology of the underlying systems. In a population of only three models, we cannot exclude the possibility of overlaps and strong similarities in the values of kinetic parameters, steady states, and enzyme saturations that all affect and might bias the observed responses. Ideally, to eliminate the possibility of such biases, one should simulate responses of a large population of models for multiple physiologies (and the corresponding steady states) and multiple parameter sets per physiology. This can be a difficult task, but having more kinetic models in this work would go a long way toward more convincing results. Recently, E. coli nonlinear kinetic models from several groups have appeared that might help in this task, e.g., Haiman et al., PLoS Comput Biol, 17(1): e1008208 (2021); Choudhury et al., Nat Mach Intell, 4, 710-719 (2022); Hu et al., Metab Eng, 82, 123-133 (2024); Narayanan et al., Nat Commun, 15:723 (2024).

      (4) Can the authors share their insights on what could be the underlying reasons for the bimodal distribution in Figure 1E? Even after adding random reactions, the distribution still has two modes - why is that?

      (5) Considering the effects of network sparsity on the perturbation responses (from line 223 onwards), when we compare the three analyzed models, it is clear that the Khodayari et al. model is a superset of the other two models. Therefore, this model can be considered as, e.g., the Chassagnole model with Nadd reactions (though not randomly added). Based on Figures 1b and S2, one can observe that the Khodayari model has stronger responses, which is exactly opposite to the authors' conclusion that adding reactions weakens the responses. The authors should comment on this.

    1. Author Response

      The following is the authors’ response to the current reviews.

      Reviewer #1 (Public Review):

      The manuscript by Sejour et al. tests the "translational ramp" model described previously by Tuller et al. in S. cerevisiae. The authors use bioinformatics and reporter-based experimental approaches to test whether "rare codons" in the first 40 codons of gene coding sequences increase translation efficiency and regulate the abundance of translation products in yeast cells. Using a new set of reporters and bioinformatics analyses, the authors conclude that the "translation ramp" model lacks support. The strength of bioinformatic evidence and experimental analyses (even very limited) of the rare codons insertion in the reporter make a compelling case for the authors claims. However, the major weakness of the manuscript is that the authors do not take into account other models that previously disputed the "rare or slow codon" model of Tuller et al., and that they overstate their own results, which are rather limited. This remains the weak part of the manuscript even in the revised form.

      We are glad the reviewer thinks our evidence makes “a compelling case for the authors claims”. This was our main aim, and we are satisfied with this.

      The reviewer believes the major weakness of the manuscript is that we do not take into account other models and do not (see below) cite numerous other relevant papers. The reviewer made essentially the same criticism at the first review, at which time we looked quite hard for papers generally meeting the reviewer’s description. We found a few, which we incorporated here. Still, we did not find the body of evidence whose existence the reviewer implies. We are citing every study we know to be relevant, though of course we will have inadvertently missed some, given the huge body of literature. After the first round of review, we wrote “the reviewer did not give specific references, and, though we looked, we weren’t always sure which papers the reviewer had in mind.” We hoped the reviewer would provide citations. But only two citations are provided here, both to A. Kochetov, and these don’t seem central to the reviewer’s points.

      The studies that authors do not mention argue with "translation ramp" model and show more thorough analyses of translation initiation to elongation transition as well as early elongation "slow down" in ribosome profiling data. Moreover several studies have used bioinformatical analyses to point out the evolution of N-terminal sequences in multiple model organisms including yeast, focusing on either upstream ORFs (uORFs) or already annotated ORFs. The authors did not mention multiple of these studies in their revised manuscript and did not comment on their own results in the context of these previous studies.

      Mostly, we do not know to what papers the reviewer is referring. This may be our failing, but it would have helped if the reviewer had cited one of them. There are papers discussing the evolution of N-terminal sequences, but as far as we know, these do not discuss translation speed or codon usage. Of course, we may have missed some papers.

      As such, the authors' approach to data presentation, writing, and data discussion makes the manuscript rather biased, focused on criticizing the Tuller et al. study and short on discussing multiple other possible reasons for slow translation elongation at the beginning of protein synthesis. Altogether, this makes the manuscript very limited in the end.

      We think the reviewer may be considering our paper as being generally about translation speeds, whereas in our minds, it is not. This difference in views as to what the paper is “about” is perhaps causing friction. To us, it is indeed a limited paper. We are narrowly focused on the finding of Tuller that there is an enrichment of rare, slow codons at the 5’ end of genes, and we have sought an explanation of this particular fact. This is not a paper about rates of translation generally—it is a limited paper about the reason for the 5’ enrichment of rare, slow codons.

      To expand on this, the encoded slow 5’ translation due to rare, slow codons (of Tuller et al.) is a small effect (1% to 3%). The possible unencoded slow 5’ translation of unknown mechanism discussed by some other papers (e.g., Weinberg et al. 2016, Shah et al. 2013) is a much larger effect (50% or more). Just from the different magnitudes, it seems likely these are different phenomena. And yet, despite the small size of the encoded effect, it is for some reason this paper by Tuller et al. that has captured the attention of the literature: as we point out below, Tuller et al. has been cited over 900 times. Partly because of the wide and continuing influence of this paper, it is worth specifically and narrowly addressing its findings.

      Reviewer #2 (Public Review):

      Tuller et al. first made the curious observation, that the first ∼30-50 codons in most organisms are encoded by scarce tRNAs and appear to be translated slower than the rest of the coding sequences (CDS). They speculated that this has evolved to pace ribosomes on CDS and prevent ribosome collisions during elongation - the "Ramp" hypothesis. Various aspects of this hypothesis, both factual and in terms of interpreting the results, have been challenged ever since. Sejour et al. present compelling results confirming the slower translation of the first ~40 codons in S. cerevisiae but providing an alternative explanation for this phenomenon. Specifically, they show that the higher amino acid sequence divergence of N-terminal ends of proteins and accompanying lower purifying selection (perhaps the result of de novo evolution) is sufficient to explain the prevalence of rare slow codons in these regions. These results are an important contribution in understanding how aspects of the evolution of protein coding regions can affect translation efficiency on these sequences and directly challenge the "Ramp" hypothesis proposed by Tuller et al.

      I believe the data is presented clearly and the results generally justify the conclusions.

      We thank the reviewer for his/her attention to the manuscript, and for his/her comments.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      As mentioned in the public review, the major weakness of the manuscript is the lack of analyses for confounding effects, overstatement of the results (using a single amino acid sequence reporter), and the lack of discussion of previous work that argues against the Tuller et al. model. In my previous review I mentioned multiple other studies that addressed the "slow codons" model in more detail.

      No, the reviewer did not cite any specific studies.

      While some of these studies are mentioned in the revised manuscript, the authors are still rather biased and selective in their discussions. I should also point out that previous studies, which the authors again fail to mention, focused on translation initiation, the initiation-to-elongation transition, or early elongation effects in relation to mRNA sequence, structure, and codons, as well as amino acid sequence. Additional studies with bioinformatic analyses of N-terminal conservation and the existence of start sites at the beginning of protein sequences in multiple model organisms were also omitted.

      Again, we do not know to what papers the reviewer is referring. But this sounds like a lot. Our paper is aimed at a specific, narrow topic: Why is there an excess of rare, slow codons in the 5’ region of genes? We are not trying to make general statements about all things affecting and affected by translation speed, we are just trying to explain the excess of rare, slow codons.

      In general, the manuscript seems to be focused too much on discussion of Tuller's paper . . .

      Yes, we are focused on the Tuller findings, the excess of rare slow codons in 5’ regions.

      . . . and arguing with the model that was already shown by multiple other studies to be limited and not correct.

      We find it unsatisfactory that the reviewer states in a public review that there are multiple other studies showing that the Tuller model is not correct, and yet does not cite any of them. Furthermore, for the reviewer to say that Tuller et al. is “not correct” is too sweeping. The core finding of Tuller et al. was the excess of rare, slow codons in the 5’ regions of genes. We confirm this; we believe it is correct; we are not aware of any literature disputing this. Then, Tuller interpreted this as an adaptation to promote translational efficiency. On the interpretation, we disagree with Tuller. But if one is to disagree with this interpretation, one needs an alternative explanation of the fact of the excess rare, slow codons. Providing such an alternative explanation, and doing an experiment to distinguish the explanations, is our contribution. We are not aware of any other paper making our interpretation.

      There are of course many papers that discuss various aspects of translation at the 5’ ends of genes, and we do cite quite a few such papers in our manuscript, though certainly not all. But papers of this general kind do not, and cannot, show that Tuller et al. is “not correct”. As far as we know, no paper provides an alternative explanation for the rare slow codons, and no paper does an experiment to modulate translation speed and look at the effect on gene expression. Notably, the slow translation phenomenon associated with the rare codons found by Tuller et al. is a very small effect—a change of about 1% to 3% of translation speed. Some other papers on translation speed are dealing with possible changes in the range of 50% or more. These are presumably some other phenomenon (if indeed they are even real changes in translation speed), and, whether they are true or not, the results and interpretations of Tuller et al. could still be true or not. Of course, if we knew of some previous paper showing the Tuller paper is not correct, we should and would cite it.

      To expand on the current view of Tuller in the literature, Tuller et al. has been cited 956 times according to Google Scholar. This makes it an extremely influential paper. After finding Tuller et al. in Entrez PubMed, one can look under “Cited by” and see the five most recent papers that cite Tuller et al. The five papers given on May 23, 2024 were Bharti . . . Ignatova 2024; Uddin 2024; Khandia . . . Choudhary 2024; Love and Nair 2024; and Oelschlaeger 2024. We went through these five most recent papers that cite Tuller et al., and asked, did these authors cite the Tuller results as fully correct, or did they mention any doubts about the results? All five of the papers cited the Tuller results as fully correct, with no mention of any kind of doubt. For instance, Khandia et al. 2024 state “The slow “ramp” present at 5’ end of mRNA forms an optimal and robust means to reduce ribosomal traffic jams, thus minimizing the cost of protein expression40.”, while Oelschlaeger (2024) states “Slow translation ramps have also been described elsewhere and proposed to prevent traffic jams along the mRNA [51,52,53].” Although Uddin (2024) cited Tuller as fully correct, Uddin seemed to think (it is a little unclear) that Tuller found an enrichment of highly-used codons, opposite to the actual finding. The multiple contrary studies mentioned by the reviewer do not seem to have been very influential.

      There are papers containing skepticism about the Tuller interpretation, and also papers with results that are difficult to reconcile in a common-sense way with the Tuller interpretation. But skepticism, and a difficulty to reconcile with common sense, are far from a demonstration that a paper is incorrect. Indeed, Tuller et al. may have been published in Cell, and may be so highly cited, exactly because the findings are counter-intuitive, colliding with common sense. Our contribution is to find a common-sense interpretation of the surprising but correct underlying fact of the 5’ enrichment of rare, slow codons.

      Having written that in the previous review, I have to admit that the main text of the Sejour et al. manuscript has a minimal amount of novelty in its experimental evidence; the conclusions are based on three reporters, with and without a stalling/collision sequence, that have the same amino acid sequence and varying codons. Some more novelty is seen in the bioinformatic analyses of multiple yeast sequences and of sequence conservation at the N-termini of proteins. However, even this part of the manuscript is not discussed fully or with correct comparison to previous studies. Based on my previous comments, the authors discuss further experimental shortcomings in their new and "expanded" discussion, but the use of a single reporter in this case cannot account for all the differences that may come from ORFs seen in the complete yeast transcriptome. There are multiple studies that used more reporters with more than one amino acid and mRNA sequence, as well as with similar variation of rare or common codons. The handwaving argument about the influence of all other mechanisms that can arise from different start sites, RNA structure, peptide interaction with the exit channel, peptidyl-tRNA drop-off, eIF3 complex initiation-elongation association, etc., just points to a manuscript that is more about bashing Tuller's model and old paper than about making a concise story out of the authors' own results and discussing their study within the plethora of studies that indicated multiple other models for slow early elongation.

      We don’t understand why the reviewer is so grudging.

      Discussion of ribosome collisions and the potential impact of such a scenario is left completely without citation in the authors' manuscript, even though such work has results relevant to the authors' conclusions and Tuller's model.

      This is not true. We cite Dao Duc and Song (2018) “The impact of ribosomal interference, codon usage, and exit tunnel interactions on translation elongation rate variation.” PLoS Genet 14, and Tesina, . . . and Green (2020) “Molecular mechanism of translational stalling by inhibitory codon combinations and Poly(A) tracts.” EMBO J., which are two excellent papers on this subject. We also cite Gamble et al. (2016), who found the underlying result, but at that time did not attribute it to ribosome collisions.

      Previous studies (not cited), for example, clearly indicate how the length from the stalling sequence to the start codon is related to ribosome collisions. Moreover, such studies point out differences in initiation vs. elongation rates that may impact ribosome collisions and protein expression. Both of these topics would be very valuable in discussions of evolutionary changes in the current yeast ORFs. Not to mention that the authors also do not really discuss possible differences in 5'UTRs and uORFs in relation to downstream ORF sequence and codon composition.

      It is not clear to us that such papers are highly relevant to the issue on which we are working.

      The argument about whether or not cycloheximide causes the 5' ribosome slowdown (lines 425-443) just rambles about Weinberg's paper from 2016 without any real conclusion. In this section the authors are just throwing out hypotheses that were more clearly explained in Weinberg's manuscript or shown experimentally in studies done after the Weinberg et al. paper was published.

      Earlier, the reviewer had the criticism that “The studies that authors do not mention argue with "translation ramp" model and show more thorough analyses of translation initiation to elongation transition as well as early elongation "slow down" in ribosome profiling data.” The main study we know of dealing with issues like these is that of Weinberg et al. 2016. In our opinion, this is a thoughtful paper on these issues. But now, at this point, the reviewer seems to criticize the fact that we do extensively cite results from Weinberg et al. It is true that there is no ultimate conclusion, but why there is no conclusion is a little bit interesting. Weinberg et al. show that even in studies that do not use cycloheximide as the first step in ribosome profiling, there is some left-over high density of ribosomes near 5’ ends. But all these ribosome profiling experiments do use cycloheximide at a later step in the procedure. Until someone does a ribosome profiling experiment without the use of any cycloheximide at any step, there will be no firm conclusion. This is not our fault, and it is also not the issue we are writing about. And the reason this paragraph is in the manuscript at all is that the reviewer (we thought) had asked for something like this in the first review.

      At the end, even in the limited novelty of evolutionary arguments about non-existing N-terminal conservation of codons or amino acids they fail to cite and discuss previous work by Kochetov (BioEssays, 2008 and NAR, 2011) which have additional explanation on evolution of N-terminal sequences in yeast, human or Drosophila.

      These two papers of Dr. Kochetov’s have some relevance and we now cite them. These are the only papers cited by the reviewer in his/her two reviews.

      Probably the reviewer would have preferred a paper on a different subject.


      The following is the authors’ response to the original reviews.

      Response to Reviewers:

      We thank the reviewers for their comments, and their evident close reading of the manuscript. Generally, we agree with the reviewers on the strengths and weaknesses of our manuscript. Our revised manuscript has a more extensive discussion of alternative explanations for initial high ribosome density as seen by ribosome profiling, and more specifically points out the limitations of our work.

      As a preface to specific responses to the reviewers, we will say that we could divide observations of slow initial translation into two categories, which we will call “encoded slow codons”, and “increased ribosome density”. With respect to the first category, Tuller et al. documented initial “encoded slow codons”, that is, there is a statistical excess of rare, slowly-translated codons at the 5’ ends of genes. Although the size of this effect is small, statistical significance is extremely high, and the existence of this enrichment is not in any doubt. At first sight, this appears to be a strong indication of a preference for slow initial translation. In our opinion, our main contribution is to show that there is an alternative explanation for this initial enrichment of rare, slow codons—that they are a spandrel, a consequence of sequence plasticity at the 5’ (and 3’) ends of genes. The reviewers seem to generally agree with this, and we are not aware that any other work has provided an explanation for the 5’ enrichment of rare codons.

      The second category of observations pertaining to slow initial translation is “increased ribosome density”. Early ribosome profiling studies used cycloheximide to arrest cell growth, and these studies showed a higher density of ribosomes near the 5’ end of genes than elsewhere. This high initial ribosome density helped motivate the paper of Tuller et al., though their finding of “encoded slow codons” could explain only a very small part of the increased ribosome density. More modern ribosome profiling studies do not use cycloheximide as the first step in arresting translation, and in these studies, the density of ribosomes near the 5’ end of genes is greatly reduced. And yet, there remains, even in the absence of cycloheximide at the first step, a significantly increased density of ribosomes near the 5’ end (e.g., Weinberg et al., 2016). (However, most or all of these studies do use cycloheximide at a later step in the protocol, and the possibility of a cycloheximide artefact is difficult to exclude.) Some of the reviewer’s concerns are that we do not explain the increased 5’ ribosome density seen by ribosome profiling. We agree; but we feel it is not the main point of our manuscript. In revision, we more extensively discuss other work on increased ribosome density, and more explicitly point out the limitations of our manuscript in this regard. We also note, though, that increased ribosome density is not a direct measure of translation speed—it can have other causes.

      Specific Responses.

      Reviewer 1 was concerned that we did not more fully discuss other work on possible reasons for slow initial translation. We discuss such work more extensively in our revision. However, as far as we know, none of this work proposes a reason for the 5’ enrichment of rare, slow codons, and this is the main point of our paper. Furthermore, it is not completely clear that there is any slow initial translation. The increase in ribosome density seen in flash-freeze ribosome profiling could be an artefact of the use of cycloheximide at the thaw step of the protocols; or it could be a real measure of high ribosome density that occurs for some other reason than slow translation (e.g., ribosomes might have low processivity at the 5’ end).

      Reviewer 1 was also concerned about confounding effects in our reporter gene analysis of the effects of different codons on efficiency of translation. We have two comments. First, it is important to remember that although we changed codons in our reporters, we did not change any amino acids. We changed codons only to synonymous codons. Thus at least one of the reviewer’s possible confounding effects—interactions of the nascent peptide chain with the exit channel of the ribosome—does not apply. However, of course, the mRNA nucleotide sequence is altered, and this would cause a change in mRNA structure or abundance, which could matter. We agree this is a limitation to our approach. However, to fully address it, we feel it would be necessary to examine a really large number of quite different sequences, which is beyond the scope of this work. Furthermore, mRNAs with low secondary structure at the 5’ end probably have relatively high rates of initiation, and also relatively high rates of elongation, and it might be quite difficult to disentangle these. But in neither case is there an argument that slow initial translation is efficient. Accurate measurement of mRNA levels would be helpful, but would not disentangle rates of initiation from rates of elongation as causes of changes in expression.

      Reviewer 2 was concerned that the conservation scores for the 5’ 40 amino acids, and the 3’ 40 amino acids were similar, but slow translation was only statistically significant for the 5’ 40 amino acids. As we say in the manuscript, we are also puzzled by this. We note that 3’ translation is statistically slow, if one looks over the last 100 amino acids. Our best effort at an explanation is a sort of reverse-Tuller explanation: that in the last 40 amino acids, the new slow codons created by genome plasticity are fairly quickly removed by purifying selection, but that in the first 40 amino acids, for genes that need to be expressed at low levels, purifying selection against slow codons is reduced, because poor translation is actually advantageous for these genes. To expand on this a bit, we feel that the 5000 or so proteins of the proteome have to be expressed in the correct stoichiometric ratios, and that poor translation can be a useful tool to help achieve this. In this explanation, slow translation at the 5’ end is bad for translation (in agreement with our reporter experiments), but can be good for the organism, when it occurs in front of a gene that needs to be expressed poorly. Whereas, in Tuller, slow translation at the 5’ end is good for translation.

      Reviewer 2 wondered whether the N-terminal fusion peptide affects GFP fluorescence in our reporter. This specific reporter, with this N-terminus, has been characterized by Dean and Grayhack (2012), and by Gamble et al. (2016), and the idea that a super-folder GFP reporter is not greatly affected by N-terminal fusions is based on the work of Pedelacq (2006). None of these papers show whether this N-terminal fusion might have some effect, but together, they provide good reason to think that any effect would be small. These citations have been added.

    1. Author response:

      Reviewer #1 (Public Review):

      Abbasi et al. assess in this MEG study the directed connectivity of both cortical and subcortical regions during continuous speech production and perception. The authors observed bidirectional connectivity patterns between speech-related cortical areas as well as subcortical areas in production and perception. Interestingly, during speaking they found low-frequency connectivity from subcortical (the right cerebellum) to cortical (left superior temporal) areas, while connectivity from the cortical to subcortical areas was in the high frequencies. During listening, a similar cortico-subcortical connectivity pattern was observed for the low frequencies, but the reversed connectivity in the higher frequencies was absent.

      The work by Abbasi and colleagues addresses a relevant, novel topic, namely understanding the brain dynamics between speaking and listening. This is important because traditionally production and perception of speech and language are investigated in a modality-specific manner. To have a more complete understanding of the neurobiology underlying these different speech behaviors, it is key to also understand their similarities and differences. Furthermore, to do so, the authors utilize state-of-the-art directed connectivity analyses on MEG measurements, providing a quite detailed profile of cortical and subcortical interactions for the production and perception of speech. Importantly, and perhaps most interesting in my opinion, is that the authors find evidence for frequency-specific directed connectivity, which is (partially) different between speaking and listening. This could suggest that both speech behaviors rely (to some extent) on similar cortico-cortical and cortico-subcortical networks, but different frequency-specific dynamics.

      These elements mentioned above (investigation of both production and perception, both cortico-cortical and cortico-subcortical connectivity is considered, and observing frequency-specific connectivity profiles within and between speech behaviors), make for important novel contributions to the field. Notwithstanding these strengths, I find that they are especially centered on methodology and functional anatomical description, but that precise theoretical contributions for neurobiological and cognitive models of speech are less transparent. This is in part because the study compares speech production and perception in general, but no psychophysical or psycholinguistic manipulations are considered. I also have some critical questions about the design which may pose some confounds in interpreting the data, especially with regard to comparing production and perception.

      (1) While the cortico-cortical and cortico-subcortical connectivity profiles highlighted in this study and the depth of the analyses are impressive, what these data mean for models of speech processing remains on the surface. This is in part due, I believe, to the fact that the authors have decided to explore speaking and listening in general, without targeting specific manipulations that help elucidate which aspects of speech processing are relevant for the particular connectivity profiles they have uncovered. For example, is the frequency-specific directed connectivity driven by low-level psychophysical attributes of the speech or by more cognitive linguistic properties? Does it relate to the monitoring of speech, timing information, and updating of sensory predictions? Without manipulations trying to target one or several of these components, as some of the referenced work has done (e.g., Floegel et al., 2020; Stockert et al., 2021; Todorović et al., 2023), it is difficult to draw concrete conclusions as to which representations and/or processes of speech are reflected by the connectivity profiles. An additional disadvantage of not having manipulations within each speech behavior is that it makes the comparison between listening and speaking harder. That is, speaking and listening have marked input-output differences which likely will dominate any comparison between them. These physically driven differences (or similarities for that matter; see below) can be strongly reduced by instead exploring the same manipulations/variables between speaking and listening. If possible (and if not, as something to consider for future work), it may be interesting to score psychophysical (e.g., acoustic properties) or psycholinguistic (e.g., lexical frequency) information of the speech and see whether and how the frequency-specific connectivity profiles are affected by it.

      We thank the reviewer for pointing this out. The current study is indeed part of a larger project investigating the role of the internal forward model in speech perception and production. In the original, more comprehensive study, we also included a masked condition where participants produced speech as usual, but their auditory perception was masked. This allowed us to examine how the internal forward model behaves when it doesn't receive the expected sensory consequences of generated speech. However, for the current study, we focused solely on data from the speaking and listening conditions due to its specific research question. We agree that further manipulations would be interesting. However, for this study our focus was on natural speech and we avoided other manipulations (beyond masked speech) so that we can have sufficiently long recording time for the main speaking and listening conditions.

      (2) Recent studies comparing the production and perception of language may be relevant to the current study and add some theoretical weight since their data and interpretations for the comparisons between production and perception fit quite well with the observations in the current work. These studies highlight that language processes between production and perception, specifically lexical and phonetic processing (Fairs et al., 2021), and syntactic processing (Giglio et al., 2024), may rely on the same neural representations, but are differentiated in their (temporal) dynamics upon those shared representations. This is relevant because it dispenses with the classical notion in neurobiological models of language where production and perception rely on (partially) dissociable networks (e.g., Price, 2010). Rather those data suggest shared networks where different language behaviors are dissociated in their dynamics. The speech results in this study nicely fit and extend those studies and their theoretical implications.

      We thank the reviewer for the suggestion and we will include these references and the points made by the reviewer in our revised manuscript.

      (3) The authors align the frequency-selective connectivity between the right cerebellum and left temporal speech areas with recent studies demonstrating a role for the right cerebellum for the internal modelling in speech production and monitoring (e.g., Stockert et al., 2021; Todorović et al., 2023). This link is indeed interesting, but it does seem relevant to point out that at a more specific scale, it does not concern the exact same regions between those studies and the current study. That is, in the current study the frequency-specific connectivity with temporal regions concerns lobule VI in the right cerebellum, while in the referenced work it concerns Crus I/II. The distinction seems relevant since Crus I/II has been linked to the internal modelling of more cognitive behavior, while lobule VI seems more motor-related and/or contextual-related (e.g., D'Mello et al., 2020; Runnqvist et al., 2021; Runnqvist, 2023).

      We thank the reviewer for their insightful comment. The reference was intended to provide evidence for the role of the cerebellum in internal modelling in speech. We do not claim that we have the spatial resolution with MEG to reliably spatially resolve specific parts of the cerebellum.

      (4) On the methodological side, my main concern is that for the listening condition, the authors have chosen to play back the speech produced by the participants in the production condition. Both the fixed order as well as hearing one's own speech as listening condition may produce confounds in data interpretation, especially with regard to the comparison between speech production and perception. Could order effects impact the observed connectivity profiles, and how would this impact the comparison between speaking and listening? In particular, I am thinking of repetition effects present in the listening condition as well as prediction, which will be much more elevated for the listening condition than the speaking condition. The fact that it also concerns their own voice furthermore adds to the possible predictability confound (e.g., Heinks-Maldonado et al., 2005). In addition, listening to one's speech which just before has been articulated may, potentially strategically even, enhance inner speech and "mouthing" in the participants, hereby thus engaging the production mechanism. Similarly, during production, the participants already hear their own voice (which serves as input in the subsequent listening condition). Taken together, both similarities or differences between speaking and listening connectivity may have been due to or influenced by these order effects, and the fact that the different speech behaviors are to some extent present in both conditions.

      This is a valid point raised by the reviewer. By listening to their own previously produced speech, our participants might have anticipated and predicted the sentences more easily. However, when designing our experiment, we took several steps to lower the chance of this anticipation. First, participants were measured in separate sessions for the speech production and perception tasks. There was always an interval of several days between the two conditions. Second, our questions were mainly about common, general topics. Consequently, participants may not have remembered their answers completely.

      Importantly, using the same stimulus material for speaking and listening guaranteed that there was no difference in the low-level features of the material for both conditions that could have affected the results of our statistical comparison.

      Due to bone conduction, hearing one’s unaltered own speech from a recording may seem foreign and could lead to unwanted emotional reactions (e.g., embarrassment), so participants were asked whether they had already heard their own voice in a recording (e.g., from a self-recorded voice message in WhatsApp), which most of them confirmed. Participants were also informed that they were going to hear themselves during the measurement to further reduce unwanted psychophysiological responses.

      (5) The ability of the authors to analyze the spatiotemporal dynamics during continuous speech is a potentially important feat of this study, given that one of the reasons that speech production is much less investigated compared to perception concerns motor and movement artifacts due to articulation (e.g., Strijkers et al., 2010). Two questions did spring to mind when reading the authors' articulation artifact correction procedure: If I understood correctly, the approach comes from Abbasi et al. (2021) and is based on signal space projection (SSP) as used for eye movement corrections, which the authors successfully applied to speech production. However, in that study, it concerned the repeated production of three syllables, while here it concerns continuous speech of full words embedded in discourse. The articulation and muscular variance will be much higher in the current study compared to three syllables (or compared to eye movements which produce much more stable movement potentials compared to an entire discourse). Given this, I can imagine that corrections of the signal in the speaking condition were likely substantial and one may wonder (1) how much signal relevant to speech production behavior is lost?; (2) similar corrections are not necessary for perception, so how would this marked difference in signal processing affect the comparability between the modalities?

      One of the results of our previous study (Abbasi et al., 2021) was that the artefact correction was not specific to individual syllables but generalised across syllables. Also, the repeated production of syllables was associated with substantial movements of the articulators mimicking those observed during naturalistic speaking. We therefore believe that the artefact rejection is effective during speaking. We also checked this by investigating speech-related coherence in brain parcels in spatial proximity to the articulators. In our previous study we also showed that the correction method retains neural activity to a very large degree. We are therefore confident that speaking and listening conditions can be compared and that the loss of true signals from correcting the speaking data will be minor.

      References:

      • Abbasi, O., Steingräber, N., & Gross, J. (2021). Correcting MEG artifacts caused by overt speech. Frontiers in Neuroscience, 15, 682419.

      • D'Mello, A. M., Gabrieli, J. D., & Nee, D. E. (2020). Evidence for hierarchical cognitive control in the human cerebellum. Current Biology, 30(10), 1881-1892.

      • Fairs, A., Michelas, A., Dufour, S., & Strijkers, K. (2021). The same ultra-rapid parallel brain dynamics underpin the production and perception of speech. Cerebral Cortex Communications, 2(3), tgab040.

      • Floegel, M., Fuchs, S., & Kell, C. A. (2020). Differential contributions of the two cerebral hemispheres to temporal and spectral speech feedback control. Nature Communications, 11(1), 2839.

      • Giglio, L., Ostarek, M., Sharoh, D., & Hagoort, P. (2024). Diverging neural dynamics for syntactic structure building in naturalistic speaking and listening. Proceedings of the National Academy of Sciences, 121(11), e2310766121.

      • Heinks‐Maldonado, T. H., Mathalon, D. H., Gray, M., & Ford, J. M. (2005). Fine‐tuning of auditory cortex during speech production. Psychophysiology, 42(2), 180-190.

      • Price, C. J. (2010). The anatomy of language: a review of 100 fMRI studies published in 2009. Annals of the New York Academy of Sciences, 1191(1), 62-88.

      • Runnqvist, E., Chanoine, V., Strijkers, K., Pattamadilok, C., Bonnard, M., Nazarian, B., ... & Alario, F. X. (2021). Cerebellar and cortical correlates of internal and external speech error monitoring. Cerebral Cortex Communications, 2(2), tgab038.

      • Runnqvist, E. (2023). Self-monitoring: The neurocognitive basis of error monitoring in language production. In Language production (pp. 168-190). Routledge.

      • Stockert, A., Schwartze, M., Poeppel, D., Anwander, A., & Kotz, S. A. (2021). Temporo-cerebellar connectivity underlies timing constraints in audition. eLife, 10, e67303.

      • Strijkers, K., Costa, A., & Thierry, G. (2010). Tracking lexical access in speech production: electrophysiological correlates of word frequency and cognate effects. Cerebral Cortex, 20(4), 912-928.

      • Todorović, S., Anton, J. L., Sein, J., Nazarian, B., Chanoine, V., Rauchbauer, B., ... & Runnqvist, E. (2023). Cortico-cerebellar monitoring of speech sequence production. Neurobiology of Language, 1-21.

      Reviewer #2 (Public Review):

      Summary:

      The authors re-analyse MEG data from a speech production and perception study and extend their previous Granger causality analysis to a larger number of cortical-cortical and in particular cortical-subcortical connections. Regions of interest were defined by means of a meta-analysis using Neurosynth.org and connectivity patterns were determined by calculating directed influence asymmetry indices from the Granger causality analysis results for each pair of brain regions. Abbasi et al. report feedforward signals communicated via fast rhythms and feedback signals via slow rhythms below 40 Hz, particularly during speaking. The authors highlight one of these connections between the right cerebellum lobule VI and auditory association area A5, where in addition the connection strength correlates negatively with the strength of speech tracking in the theta band during speaking (significant before multiple comparison correction). Results are interpreted within a framework of active inference by minimising prediction errors.
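      To make the asymmetry measure mentioned in this summary concrete, the following is a minimal sketch of how a directed asymmetry index could be computed from a pair of Granger causality spectra. It assumes the commonly used normalised-difference form, (GC(A→B) − GC(B→A)) / (GC(A→B) + GC(B→A)); the manuscript's exact definition is not given here, and the function name, argument names, and the small `eps` guard are hypothetical.

```python
import numpy as np

def directed_asymmetry_index(gc_ab, gc_ba, eps=1e-12):
    """Normalised difference between two directed GC spectra.

    Assumed form (not confirmed by the manuscript):
        DAI = (GC_ab - GC_ba) / (GC_ab + GC_ba)
    gc_ab, gc_ba: Granger causality per frequency bin for A->B and B->A.
    eps avoids division by zero where both directions are exactly zero.
    """
    gc_ab = np.asarray(gc_ab, dtype=float)
    gc_ba = np.asarray(gc_ba, dtype=float)
    return (gc_ab - gc_ba) / (gc_ab + gc_ba + eps)

# Illustrative values only: A->B dominates in the first bin, B->A in the second.
dai = directed_asymmetry_index([0.3, 0.1], [0.1, 0.3])
```

      Under this assumed definition the index lies between -1 and 1, with positive values indicating that the A→B direction dominates at that frequency. Note that the index is a ratio, so it can be large even when both underlying GC values are close to the bias level, which is the concern raised in major point (1)(ii).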

      While I find investigating the role of cortical-subcortical connections in speech production and perception interesting and relevant to the field, I am not yet convinced that the methods employed are fully suitable to this endeavour or that the results provide sufficient evidence to make the strong claim of dissociation of bottom-up and top-down information flow during speaking in distinct frequency bands.

      Strengths:

      The investigation of electrophysiological cortical-subcortical connections in speech production and perception is interesting and relevant to the field. The authors analyse a valuable dataset, where they spent a considerable amount of effort to correct for speech production-related artefacts. Overall, the manuscript is well-written and clearly structured.

      Weaknesses:

      The description of the multivariate Granger causality analysis did not allow me to fully grasp how the analysis was performed and I hence struggled to evaluate its appropriateness. Knowing that (1) filtered Granger causality is prone to false positives and (2) recent work demonstrates that significant Granger causality can simply arise from frequency-specific activity being present in the source but not the target area without functional relevance for communication (Schneider et al. 2021) raises doubts about the validity of the results, in particular with respect to their frequency specificity. These doubts are reinforced by what I perceive as an overemphasis on results that support the assumption of specific frequencies for feedforward and top-down connections, while findings not aligning with this hypothesis appear to be underreported. Furthermore, the authors report some main findings that I found difficult to reconcile with the data presented in the figures. Overall, I feel the conclusions with respect to frequency-specific bottom-up and top-down information flow need to be moderated and that some of the reported findings need to be checked and if necessary corrected.

      Major points

      (1) I think more details on the multivariate GC approach are needed. I found the reference to Schaum et al., 2021 not sufficient to understand what has been done in this paper. Some questions that remained for me are:

      (i) Does multivariate here refer to the use of the authors' three components per parcel or to the conditioning on the remaining twelve sources? I think the latter is implied when citing Schaum et al., but I'm not sure this is what was done here?

      If it was not: how can we account for spurious results based on indirect effects?

      Yes, multivariate refers to the three components.

      (ii) Did the authors check whether the GC of the source-target pairs was reliably above the bias level (as Schaum et al. did for each condition separately)? If not, can they argue why they think that their results would still be valid? Does it make sense to compute DAIs on connections that were below the bias level? Should the data be re-analysed to take this concern into account?

      We performed statistics on DAI and believe that this is a valid approach. We argue that random GC effects would not survive our cluster-corrected statistics.

      (iii) You may consider citing the paper that introduced the non-parametric GC analysis (which Schaum et al. then went on to apply): Dhamala M, Rangarajan G, Ding M. Analyzing Information Flow in Brain Networks with Nonparametric Granger Causality. Neuroimage. 2008; 41(2):354-362. https://doi.org/10.1016/j.neuroimage.2008.02.020

      Thanks, we will add this reference in the revised version.

      (2) GC has been discouraged for filtered data as it gives rise to false positives due to phase distortions and the ineffectiveness of filtering in the information-theoretic setting as reducing the power of a signal does not reduce the information contained in it (Florin et al., 2010; Barnett and Seth, 2011; Weber et al. 2017; Pinzuti et al., 2020 - who also suggest an approach that would circumvent those filter-related issues). With this in mind, I am wondering whether the strong frequency-specific claims in this work still hold.

      This must be a misunderstanding. We are aware of the problem with GC on filtered data. But GC was here computed on broadband data and not in individual frequency bands.

      (3) I found it difficult to reconcile some statements in the manuscript with the data presented in the figures:

      (i) Most notably, the considerable number of feedforward connections from A5 and STS that project to areas further up the hierarchy at slower rhythms (e.g. L-A5 to R-PEF, R-Crus2, L-CB6, L-Tha, L-FOP and L-STS to R-PEF, L-FOP, L-TOPJ or R-A5 as well as R-STS both to R-Crus2, L-CB6, L-Th) contradict the authors' main message that 'feedback signals were communicated via slow rhythms below 40 Hz, whereas feedforward signals were communicated via faster rhythms'. I struggled to recognise a principled approach that determined which connections were highlighted and reported and which ones were not.

      (ii) "Our analysis also revealed robust connectivity between the right cerebellum and the left parietal cortex, evident in both speaking and listening conditions, with stronger connectivity observed during speaking. Notably, Figure 4 depicts a prominent frequency peak in the alpha band, illustrating the specific frequency range through which information flows from the cerebellum to the parietal areas." There are two peaks discernible in Figure 4, one notably lower than the alpha band (rather theta or even delta), the other at around 30 Hz. Nevertheless, the authors report and discuss a peak in the alpha band.

      (iii) In the abstract: "Notably, high-frequency connectivity was absent during the listening condition." and p.9 "In contrast with what we reported for the speaking condition, during listening, there is only a significant connectivity in low frequency to the left temporal area but not a reverse connection in the high frequencies."

      While Fig. 4 shows significant connectivity from R-CB6 to A5 in the gamma frequency range for the speaking, but not for the listening condition, interpreting comparisons between two effects without directly comparing them is a common statistical mistake (Makin and Orban de Xivry). The spectrally-resolved connectivity in the two conditions actually looks remarkably similar and I would thus refrain from highlighting this statement and indicate clearly that there were no significant differences between the two conditions.

      (iv) "This result indicates that in low frequencies, the sensory-motor area and cerebellum predominantly transmit information, while in higher frequencies, they are more involved in receiving it."

      I don't think that this statement holds in its generality: L-CB6 and R-3b both show strong output at high frequencies, particularly in the speaking condition. While they seem to transmit information mainly to areas outside A5 and STS these effects are strong and should be discussed.

      We appreciate the reviewer's thoughtful comments. We acknowledge that not all connectivity patterns strictly adhere to the initial observation regarding feedback and feedforward communication. It's true that our primary focus was on interactions between brain regions known to be crucial for speech prediction, including auditory, somatosensory, and cerebellar areas. However, we also presented connectivity patterns across other regions to provide a more comprehensive picture of the speech network. We believe this broader perspective can be valuable for future research directions.

      Regarding the reviewer's observation about the alpha band peak in Figure 4, we agree that a closer examination reveals that the connectivity from the right cerebellum to the left parietal cortex spans a wider low-frequency range. We will refrain from solely emphasizing the alpha band and acknowledge the potential contribution of lower frequencies to cerebellar-parietal communication.

      We also appreciate the reviewer highlighting the need for a more nuanced interpretation of the listening condition connectivity compared to the speaking condition. The reviewer is correct in pointing out that while Figure 4 suggests a high-frequency connectivity from L-A5 to R-CB only in the speaking condition, a direct statistical comparison between conditions might not reveal a significant difference. We will revise the manuscript to clarify this point.

      Finally, a closer examination of Figure 3 revealed that the light purple and dark green edges in the speaking condition for R-CB6 and L-3b suggest outgoing connections at low frequencies, while other colored edges indicate information reception at high frequencies. We acknowledge that exceptions to this directional pattern might exist and warrant further investigation in future studies.

      (4) "However, definitive conclusions should be drawn with caution given recent studies raising concerns about the notion that top-down and bottom-up signals can only be transmitted via separate frequency channels (Ferro et al., 2021; Schneider et al., 2021; Vinck et al., 2023)."

      I appreciate this note of caution and think it would be useful if it were spelled out to the reader why this is the case so that they would be better able to grasp the main concerns here. For example, Schneider et al. make a strong point that we expect to find Granger-causality with a peak in a specific frequency band for areas that are anatomically connected when the sending area shows stronger activity in that band than the receiving one, simply because of the coherence of a signal with its own linear projection onto the other area. The direction of a Granger causal connection would in that case only indicate that one area shows stronger activity than the other in the given frequency band. I am wondering to what degree the reported connectivity pattern can be traced back to regional differences in frequency-specific source strength or to differences in source strength across the two conditions.

      This is indeed an important point. That is why we are discussing our results with great caution and specifically point the reader to the relevant literature. We are indeed thinking about a future study where we investigate this connectivity using other connectivity metrics and a detailed consideration of power.
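      The concern discussed in point (4) can be illustrated with a toy simulation. The sketch below is a time-domain simplification, not the spectral analysis used in the manuscript or by Schneider et al.: a "source" signal carries a strong rhythm, and a "target" signal is nothing more than a delayed linear projection of the source plus independent noise. All parameter values, the model order, and the function names are assumptions chosen for illustration.

```python
import numpy as np

def lagged(x, order):
    """Columns x[t-1], ..., x[t-order] for t = order .. len(x)-1."""
    n = len(x)
    return np.column_stack([x[order - k: n - k] for k in range(1, order + 1)])

def granger(source, target, order=5):
    """Time-domain Granger causality source -> target.

    Log ratio of residual variances of two least-squares AR fits:
    target predicted from its own past only (restricted) versus
    from its own past plus the past of the source (full).
    """
    y = target[order:]
    own = lagged(target, order)
    full = np.hstack([own, lagged(source, order)])
    res_restricted = y - own @ np.linalg.lstsq(own, y, rcond=None)[0]
    res_full = y - full @ np.linalg.lstsq(full, y, rcond=None)[0]
    return np.log(res_restricted.var() / res_full.var())

rng = np.random.default_rng(0)
n = 5000
r, w = 0.95, 2 * np.pi * 0.1          # pole radius and normalised frequency (assumed)

# Source: AR(2) process with a pronounced rhythm.
x = np.zeros(n)
e = rng.standard_normal(n)
for t in range(2, n):
    x[t] = 2 * r * np.cos(w) * x[t - 1] - r**2 * x[t - 2] + e[t]

# Target: delayed linear projection of the source plus independent noise.
y = np.zeros(n)
y[1:] = 0.5 * x[:-1]
y += rng.standard_normal(n)

gc_xy = granger(x, y)   # source -> target
gc_yx = granger(y, x)   # target -> source
```

      In this toy setting the source-to-target estimate is clearly larger than the reverse estimate even though the target has no rhythm of its own and merely inherits a copy of the source signal. This is why a directed GC effect in a given band may partly reflect where that band is strongest, which motivates the reviewer's question about regional and condition differences in source power.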

      Reviewer #3 (Public Review):

      In the current paper, Abbasi et al. aimed to characterize and compare the patterns of functional connectivity across frequency bands (1 Hz - 90 Hz) between regions of a speech network derived from an online meta-analysis tool (Neurosynth.org) during speech production and perception. The authors present evidence for complex neural dynamics from which they highlight directional connectivity from the right cerebellum to left superior temporal areas in lower frequency bands (up to beta) and between the same regions in the opposite direction in the (lower) high gamma range (60-90 Hz). Abbasi et al. interpret their findings within the predictive coding framework, with the cerebellum and other "higher-order" (motor) regions transmitting top-down sensory predictions to "lower-order" (sensory) regions in the lower frequencies and prediction errors flowing in the opposite direction (i.e., bottom-up) from those sensory regions in the gamma band. They also report a negative correlation between the strength of this top-down functional connectivity and the alignment of superior temporal regions to the syllable rate of one's speech.

      Strengths:

      (1) The comprehensive characterization of functional connectivity during speaking and listening to speech may be valuable as a first step toward understanding the neural dynamics involved.

      (2) The inclusion of subcortical regions and connectivity profiles up to 90Hz using MEG is interesting and relatively novel.

      (3) The analysis pipeline is generally adequate for the exploratory nature of the work.

      Weaknesses:

      (1) The work is framed as a test of the predictive coding theory as it applies to speech production and perception, but the methodological approach is not suited to this endeavor.

      We agree that we cannot provide definite evidence for predictive coding in speech production and perception and we believe that we do not make that claim in the manuscript. However, our results are largely consistent with what can be expected based on predictive coding theory.

      (2) Because of their theoretical framework, the authors readily attribute roles or hierarchy to brain regions (e.g., higher- vs lower-order) and cognitive functions to observed connectivity patterns (e.g., feedforward vs feedback, predictions vs prediction errors) that cannot be determined from the data. Thus, many of the authors' claims are unsupported.

      We will revise the manuscript to more clearly differentiate our results (e.g. directed Granger-Causality from A to B) from their interpretation (potentially indicating feedforward or feedback signals).

      (3) The authors' theoretical stance seems to influence the presentation of the results, which may inadvertently misrepresent the (otherwise perfectly valid; cf. Abbasi et al., 2023) exploratory nature of the study. Thus, results about specific regions are often highlighted in figures (e.g., Figure 2 top row) and text without clear reasons.

      Our connectograms reveal a multitude of results that we hope are interesting to the community. At the same time, the wealth of findings poses a problem for describing them. We did not see a better way than to highlight specific connections of interest.

      (4) Some of the key findings (e.g., connectivity in opposite directions in distinct frequency bands) feature in a previous publication and are, therefore, interesting but not novel.

      We actually see this as a strength of the current manuscript. The computation of connectivity is here extended to a much larger sample of brain areas. It is reassuring to see that the previously reported results generalise to other brain areas.

      (5) The quantitative comparison between speech production and perception is interesting but insufficiently motivated.

      We thank the reviewer for this comment. We have addressed this in detail in our responses to points (1) and (4) of Reviewer #1.

      (6) Details about the Neurosynth meta-analysis and subsequent selection of brain regions for the functional connectivity analyses are incomplete. Moreover, the use of the term 'Speech' in Neurosynth seems inappropriate (i.e., includes irrelevant works, yielding questionable results). The approach of using separate meta-analyses for 'Speech production' and 'Speech perception' taken by Abbasi et al. (2023) seems more principled. This approach would result, for example, in the inclusion of brain areas such as M1 and the BG that are relevant for speech production.

      We agree that there are inherent limitations in automated meta-analysis tools such as Neurosynth. Papers are used in the meta-analysis that might not be directly relevant. However, Neurosynth has proven its usefulness over many years and has been used in many studies. We also agree that our selection of brain areas is not complete. But Granger Causality analysis of every pair of ROIs leads to complex results and we had to limit our selection of areas.

      (7) The results involving subcortical regions are central to the paper, but no steps are taken to address the challenges involved in the analysis of subcortical activity using MEG. Additional methodological detail and analyses would be required to make these results more compelling. For example, it would be important to know what the coverage of the MEG system is, what head model was used for the source localization of cerebellar activity, and if specific preprocessing or additional analyses were performed to ensure that the localized subcortical activity (in particular) is valid.

      There is a large body of evidence demonstrating that MEG can record signals from deep brain areas such as the thalamus and cerebellum (e.g., Attal & Schwarz, 2013; Andersen et al., 2020; Piastra et al., 2020; Schnitzler et al., 2009). These and other studies provide evidence that state-of-the-art recording (with multichannel SQUID systems) and analysis is sufficient to allow reconstruction of subcortical areas. However, spatial resolution is clearly reduced for these deep areas. We will add a statement in the revised manuscript to acknowledge this limitation.

      (8) The results and methods are often detailed with important omissions (a speech-brain coupling analysis section is missing) and imprecisions (e.g., re: Figure 5; the Connectivity Analysis section is copy-pasted from their previous work), which makes it difficult to understand what is being examined and how. (It is also not good practice to refer the reader to previous publications for basic methodological details, for example, about the experimental paradigm and key analyses.) Conversely, some methodological details are given, e.g., the acquisition of EMG data, without further explanation of how those data were used in the current paper.

      We will revise the relevant sections of the manuscript.

      (9) The examination of gamma functional connectivity in the 60 - 90 Hz range could be better motivated. Although some citations involving short-range connectivity in these frequencies are given (e.g., within the visual system), a more compelling argument for looking at this frequency range for longer-range connectivity may be required.

      Given previous evidence of connectivity in the gamma band we think that it would be a weakness to exclude this frequency band from analysis.

      (10) The choice of source localization method (linearly constrained minimum variance) could be explained, particularly given that other methods (e.g. dynamic imaging of coherent sources) were specifically designed and might potentially be a better alternative for the types of analyses performed in the study.

      Both LCMV and DICS are beamforming methods. We used LCMV because we wanted to use Granger causality, which requires broadband signals. DICS would only provide frequency-specific, band-limited signals.

      (11) The mGC analysis needs to be more comprehensively detailed for the reader to be able to assess what is being reported and the strength of the evidence. Relatedly, first-level statistics (e.g., via estimation of the noise level) would make the mGC and DAI results more compelling.

      We perform group-level cluster-based statistics on mGC while correcting for multiple comparisons across frequency bands and brain parcels and report only significant results. This is an established approach that is routinely used in this type of study.

      (12) Considering the exploratory nature of the study, it is essential for other researchers to continue investigating and validating the results presented in the current manuscript. Thus, it is concerning that data and scripts are not fully and openly available. Data need not be in its raw state to be shared and useful, which circumvents the stated data privacy concerns.

      We acknowledge the reviewer's concern regarding the full availability of the dataset. Due to privacy limitations on the collected data, we are unable to share it publicly at this time. However, to promote transparency and enable further exploration, we have provided the script used for data analysis and an example dataset. This example dataset should provide a clear understanding of the data structure and variables used in the analysis. Additionally, we are happy to share the complete dataset upon request from research teams interested in performing in-depth secondary analyses.

    1. eLife assessment

      This important work offers a thorough exploration of the molecular features of different cell types within the mouse vomeronasal organ, including the expression of chemosensory receptors, using single-cell transcriptomics. The data are thoughtfully analyzed and presented, although the evidence is incomplete and only partially supports some of the claims made by the authors.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors comprehensively present data from single-cell RNA sequencing and spatial transcriptomics experiments of the juvenile male and female mouse vomeronasal organ, with a particular emphasis on the neuronal populations found in this sensory tissue. The use of these two methods effectively maps the locations of relevant cell types in the vomeronasal organ at a level of depth beyond what is currently known. Targeted analysis of the neurons in the vomeronasal organ produced several important findings, notably the common co-expression of multiple vomeronasal type 1 receptors (V1Rs), vomeronasal type 2 receptors (V2Rs), and both V1R+V2Rs by individual neurons, as well as the presence of a small but noteworthy population of neurons expressing olfactory receptors (ORs) and associated signal transduction molecules. Additionally, the authors identify transcriptional patterns associated with neuronal development/maturation, producing lists of genes that can be used and/or further investigated by the field. Finally, the authors report the presence of coordinated combinatorial expression of transcription factors and axon guidance molecules associated with multiple neuronal types, providing the framework for future studies aimed at understanding how these patterns relate to the complex glomerular organization in the accessory olfactory bulb. Several of these conclusions have been reached by previous studies, partially limiting the overall impact of the current work. However, when combined, these results provide important insights into the cellular diversity in the vomeronasal organ that are likely to support multiple future studies of the vomeronasal system.

      Strengths:

      The comprehensive analysis of the data provides a wealth of information for future research into vomeronasal organ function. The targeted analysis of neuronal gene transcription demonstrates the co-expression of multiple receptors by individual neurons and confirms the presence of a population of OR-expressing neurons in the vomeronasal organ. Although many of these findings have been noted by others, the depth of analysis here validates and extends prior findings in an effective manner. The use of spatial transcriptomics to identify the locations of specific cell types is especially useful and produces a template for the field's continued research into the various cell types present in this complex sensory tissue. Overall, the manuscript's biggest strength is found in the richness of the data presented, which will not only support future work in the broader field of vomeronasal system function but also provide insights into others studying complex sensory tissues.

      Weaknesses:

      As noted above, several previous studies have identified co-expression of vomeronasal receptors by vomeronasal sensory neurons, and the expression of non-vomeronasal receptors, and this was not adequately addressed in the manuscript as presented. The inherent weaknesses of single-cell RNA sequencing studies based on the 10x Genomics platforms (need to dissociate tissues, limited depth of sequencing, etc.) are acknowledged. However, the authors document their extensive attempts to avoid making false positive conclusions through the use of software tools designed for this purpose. Because of its complexity, there are some portions of the manuscript where the data are difficult to interpret as presented, but this is a relatively minor weakness. The data resulting from the use of the Resolve Biosciences spatial transcriptomics platform are somewhat difficult to interpret, and the methods are somewhat opaque. That said, the resulting data provide useful links between transcriptional identities and cellular locations, which is not possible without the use of such tools.

    3. Reviewer #2 (Public Review):

      In their paper entitled "Molecular, Cellular, and Developmental Organization of the Mouse Vomeronasal Organ at Single Cell Resolution" Hills Jr. et al. perform single-cell transcriptomic profiling and analyze tissue distribution of a large number of transcripts in the mouse vomeronasal organ (VNO). The use of these complementary tools provides a robust approach to investigating many aspects of vomeronasal sensory neuron (VSN) biology based on transcriptomics. Harnessing the power of these techniques, the authors present the discovery of previously unidentified sensory neuron types in the mouse VNO. Furthermore, they report co-expression of chemosensory receptors from different clades on individual neurons, including the co-expression of VR and OR. Finally, they evaluated the correlation between transcription factor expression and putative surface axon guidance molecules during the development of different neuronal lineages. Based on such correlation analysis, authors further propose a putative cascade of events that could give rise to different neuronal lineages and morphological organization.

      Taken together, Hills Jr. et al. present findings on (a) cell types in the VNO, (b) novel classes of sensory neurons, (c) developmental trajectories of the neuronal lineage, (d) receptor expression in VSNs, (e) co-expression of chemosensory receptors, (f) a surface molecule code for individual receptor types, and (g) transcriptional regulation of receptor and axon guidance cues. Before outlining the major strengths and weaknesses of the manuscript, we need to disclose that, while we are comfortable reviewing aspects (a) to (e) of their work, we lack the expertise to provide constructive criticism on the last two points, (f) and (g). Thus, we will not comment on these.

      In general, interpretations/claims put forward by Hills Jr. et al. appear striking at first glance. Upon careful review of the manuscript, however, it becomes apparent that many of the groundbreaking discoveries lack compelling support. Several (not all) of the results presented in this work lack novelty, accurate interpretability, and corroboration. A recurrent theme throughout the manuscript is an incomplete, and somewhat misleading account of the current knowledge in the field. This is perhaps most apparent in the introductory paragraphs, where the authors present a biased report of previously published work, largely including only those results that do not overlap with their own findings, but ignoring results that would question the novelty of the data presented here. For example: "...In contrast, transcriptomic information of the VNO is rather limited (Ref 24,25)...". Indeed, transcriptomic information of the mouse VNO is limited. Here, however, the authors ignore recent reports of robust single-cell transcriptomic analysis from adult and juvenile mice. These papers are, in part, cited later in this manuscript (ref 88, 89, 90, 91), or are completely missing (doi.org/10.7554/eLife.77259). Regardless, previously published results on the same topics have to be included in the Introduction to put the background and novelty of the findings into perspective.

      General comments on (a) cell types in the VNO

      The authors performed single-cell transcriptomic analysis of a large number of cells from both adult and juvenile VNO, creating the largest dataset of its kind to date. This dataset contains a wealth of information and, once made public, could be a valuable resource to the community. However, the analysis implemented in this paper raises several questions:

      Did the authors perform any cell selectivity, or any directed dissection, to obtain mainly neuronal cells? Previous studies reported a greater proportion of non-neuronal cells. For example, while Katreddi and co-workers (ref 89) found that the most populated clusters are identified as basal cells, macrophages, pericytes, and vascular smooth muscle, Hills Jr. et al. in this work did not report such types of cells. Did the authors check for the expression of marker genes listed in Ref 89 for such cell types?

      The authors should report the marker genes used for cell annotation. This is important for data validation, comparison with other publicly available datasets, as well as future use of this dataset.

      The authors reported no differences between juvenile and adult samples, and between male and female samples. It is not clear how they evaluate statistically significant differences, which statistical test was used, or what parameters were evaluated.

      "Based on our transcriptomic analysis, we conclude that neurogenic activity is restricted to the marginal zone." This conclusion is quite a strong statement, given that this study was not directed to carefully study neurogenesis distribution, and when neurogenesis in the basal zone has been proposed by other works, as stated by the authors.

      General comments on (b) novel classes of sensory neurons

      The authors report at least two new types of sensory neurons in the mouse VNO, a finding of huge importance that could have a substantial impact on the field of sensory physiology. However, the evidence for such new cell types is based solely on this transcriptomic dataset and, as such, is quite weak, since many crucial morphological and physiological aspects would be missing to clearly identify them as novel cell types. As stated before, many control and confirmatory experiments, and a careful evaluation of the results presented in this work must be performed to confirm such a novel and interesting discovery. The reported "novel classes of sensory neurons" in this work could represent previously undescribed types of sensory neurons, but also previously reported cells (see below) or simply possible single-cell sequencing artefacts.

      The authors report the co-expression of V2R and Gnai2 transcripts based on sequencing data. That could dramatically change classical classifications of basal and apical VSNs. However, did the authors find support for this co-expression in spatial molecular imaging experiments?

      Canonical OSNs: The authors report a cluster of cells expressing neuronal markers and ORs and call them canonical OSN. However, VSNs expressing ORs have already been reported in a detailed study showing their morphology and location inside the sensory epithelium (References 82, 83). Such cells are not canonical OSNs since they do not show ciliary processes, they express TRPC2 channels and do not express Golf. Are the "canonical OSNs" reported in this study and the OR-expressing VSNs (ref 82, 83) different? Which parameters, other than Gnal and Cnga2 expression, support the authors' bold claim that these are "canonical OSNs"? What is the morphology of these neurons? In addition, the mapping of these "canonical OSNs" shown in Figure 2D paints a picture of the negligible expression/role of these cells (see their prediction confidence).

      Secretory VSN: The authors report another novel type of sensory neurons in the VNO and call them "secretory VSNs". Here, the authors performed an analysis of differentially expressed genes for neuronal cells (dataset 2) and found several differentially expressed genes in the sVSN cluster. However, it would be interesting to perform a gene expression analysis using the whole dataset including neuronal and non-neuronal cells. Could the authors find any marker gene that unequivocally identifies this new cell type?

      When the authors evaluated the distribution of sVSN using the Molecular Cartography technique, they found expression of sVSN in both sensory and non-sensory epithelia. How do the authors explain such unexpected expression of sensory neurons in the non-sensory epithelium?

      The low total genes count and low total reads count, combined with an "expression of marker genes for several cell types" could indicate low-quality beads (contamination) that were not excluded with the initial parameter setting. It looks like cells in this cluster express a bit of everything V1R, V2R, OR, secretory proteins...

      General comments on (c) developmental trajectories of the neuronal lineage

      The authors evaluated a possible cascade of events leading to the development of different lineages of mature sensory neurons using GBCs as a starting point. They found the differential expression of several transcription factors at different stages of development. This analysis was performed correctly, and its interpretation is coherent. However, it is mysterious why the authors included only classical V1R and V2R-expressing neurons, while the novel sensory neurons, cOSN and sVSN, were not included. Furthermore, it is important to notice again the misreport of previously published works.

      The authors wrote "...the transcriptomic landscape that specifies the lineages is not known...". This statement is not completely true, or is at least misleading. There are still many undiscovered aspects of the transcriptomic landscape and lineage determination in VSNs. However, the authors cannot ignore previously reported data showing the landscape of neuronal lineages in VSNs (Refs 88, 89, 90, 91 and doi.org/10.7554/eLife.77259). Expression of most of the transcription factors reported by this study (Ascl1, Sox2, Neurog1, Neurod1...) was already reported, and for some of them, their role was investigated, during early developmental stages of VSNs (Refs 88, 89, 90, 91 and doi.org/10.7554/eLife.77259). In summary, the authors should fully include the findings from previous works (Refs 88, 89, 90, 91 and doi.org/10.7554/eLife.77259), and clearly state what has already been reported, what is contradictory, and what is new when compared with the results from this work.

      General comments on (d) receptor expression in VSNs

      The authors evaluated the expression of chemosensory receptors in the VNO and correlated receptor expression with the expression of transcription factors. The analysis of such correlation showed that, while the expression of V1Rs is mainly correlated with the expression of the transcription factor Meis2, the expression of V2Rs is correlated with the combination of many transcription factors. These results are interesting, however, the co-expression of specific V2Rs with specific transcription factors does not imply a direct implication in receptor selection. Directed experiments to evaluate the VR expression dependent on a specific transcription factor must be performed.

      This study reports that transcription factors, such as Pou2f1, Atf5, Egr1, or c-Fos could be associated with receptor choice in VSNs. However, no further evidence is shown to support this interaction. Based on these purely correlative data, it is rather bold to propose cascade model(s) of lineage consolidation.

      General comments on (e) co-expression of chemosensory receptors

      The authors use spatial molecular imaging to evaluate the co-expression of many chemosensory receptors in single VNO cells. Molecular Cartography is a powerful tool and the reported data in this work is truly interesting. The authors show some clear confirmation of previously reported V2R co-expression (Figure 5H), and new co-expression of chemosensory receptors including V1R, V2R, and Fpr (Figure 5G-K).

      However, it is difficult to evaluate and interpret the results due to the lack of cell borders in spatial molecular imaging. The inclusion of cell border delimitation in the reported images (membrane-stained or computer-based) could be tremendously beneficial for the interpretation of the results.

      It is surprising that the authors reported a new cell type expressing ORs but did not report the expression of ORs with the Molecular Cartography technique. Did the authors evaluate the expression of ORs using the cartography technique?

    4. Reviewer #3 (Public Review):

      This study presents a detailed examination of the molecular and cellular organization of the mouse VNO, unveiling new cell types, receptor co-expression patterns, lineage specification regulation, and potential associations between transcription factors, guidance molecules, and receptor types crucial for vomeronasal circuitry wiring specificity. The study identifies a novel type of VSN molecularly different from classic VSNs, which may serve as an accessory to other VSNs by secreting olfactory binding proteins and mucins in response to VNO activation. They also describe a previously undetected co-expression of multiple VRs in individual VSNs, providing an interesting view of the ongoing discussion on how receptor choice occurs in VSNs, either stochastic or deterministic. Finally, the study correlates the expression of axon guidance molecules associated with individual VRs, providing a putative molecular mechanism that specifies VSN axon projections and their connection with postsynaptic cells in the accessory olfactory bulb.

      The conclusions of this paper are well supported by data, but some aspects of data analysis and acquisition need to be clarified and extended.

      (1) The authors claim that they have identified two new classes of sensory neurons, one being a class of canonical olfactory sensory neurons (OSNs) within the VNO. This classification as canonical OSNs is based on expression data of neurons lacking the V1R or V2R markers but instead expressing ORs and signal transduction molecules, such as Gnal and Cnga2. Since OR-expressing neurons in the VNO have been previously described in many studies, it remains unclear to me why these OR-expressing cells are considered here a "new class of OSNs." Moreover, morphological features, including the presence of cilia, and functional data demonstrating the recognition of chemosignals by these neurons, are still lacking to classify these cells as OSNs akin to those present in the MOE. While these cells do express canonical markers of OSNs, they also appear to express other VSN-typical markers, such as Gnao1 and Gnai2 (Figure 2B), which are less commonly expressed by OSNs in the MOE. Therefore, it would be more precise to characterize this population as atypical VSNs that express ORs, rather than canonical OSNs.

      (2) The second new class of sensory neurons identified corresponds to a group of VSNs expressing prototypical VSN markers (including V1Rs, V2Rs, and ORs), but exhibiting lower ribosomal gene expression. Clustering analysis reveals that this cell group is relatively isolated from V1R- and V2R-expressing clusters, particularly those comprising immature VSNs. The question then arises: where do these cells originate? Considering their fewer overall genes and lower total counts compared to mature VSNs, I wonder if these cells might represent regular VSNs in a later developmental stage, i.e., senescent VSNs. While the secretory cell hypothesis is compelling and supported by solid data, it could also align with a late developmental stage scenario. Further data supporting or excluding these hypotheses would aid in understanding the nature of this new cell cluster, with a comparison between juvenile and adult subjects appearing particularly relevant in this context.

      (3) The authors' decision not to segregate the samples according to sex is understandable, especially considering previous bulk transcriptomic and functional studies supporting this approach. However, many of the highly expressed VR genes identified have been implicated in detecting sex-specific pheromones and triggering dimorphic behavior. It would be intriguing to investigate whether this lack of sex differences in VR expression persists at the single-cell level. Regardless of the outcome, understanding the presence or absence of major dimorphic changes would hold broad interest in the chemosensory field, offering insights into the regulation of dimorphic pheromone-induced behavior. Additionally, it could provide further support for proposed mechanisms of VR receptor choice in VSNs.

      (4) The expression analysis of VRs and ORs seems to have been restricted to the cell clusters associated with the neuronal lineage. Are VRs/ORs expressed in other cell types, i.e. sustentacular, HBC, or other cells?

    5. Author response:

      We would like to thank all reviewers for their time, critical evaluation, recognition, and constructive comments on the manuscript. We will revise the manuscript accordingly. Below is our point-by-point response to the comments.

      From Reviewer #1:

      “…several previous studies have identified co-expression of vomeronasal receptors by vomeronasal sensory neurons, and the expression of non-vomeronasal receptors, and this was not adequately addressed in the manuscript as presented.”

      We plan to add context and citations to the Introduction and Results sections relating to recent studies on the co-expression of vomeronasal receptors and the expression of non-vomeronasal receptors in VSNs.

      “The data resulting from the use of the Resolve Biosciences spatial transcriptomics platform are somewhat difficult to interpret, and the methods are somewhat opaque.”

      Unfortunately, detailed Molecular Cartography protocols remain proprietary at Resolve Biosciences and were not disclosed. We acknowledge this limitation. Our role in the acquisition and processing of data for this experiment is included in the current Methods section. We will clarify this in the revised manuscript. Additional figures produced by the Molecular Cartography analysis will also be added (See response to Reviewer #2, below) to the supplemental materials to help clarify interpretation of the results.

      From Reviewer #2:

      “…the authors present a biased report of previously published work, largely including only those results that do not overlap with their own findings, but ignoring results that would question the novelty of the data presented here.”

      We had no intention of misleading the readers. In fact, we have discussed discrepancies between our results with other studies. However, we inadvertently left out a critical publication in preparing the manuscript. We plan to add context and citations (where missing) relating to recent studies that use single cell RNA sequencing in the vomeronasal organ, studies relating to the co-expression of vomeronasal receptors, and studies discussing V1R/V2R lineage determination.

      “Did the authors perform any cell selectivity, or any directed dissection, to obtain mainly neuronal cells? Previous studies reported a greater proportion of non-neuronal cells. For example, while Katreddi and co-workers (ref 89) found that the most populated clusters are identified as basal cells, macrophages, pericytes, and vascular smooth muscle, Hills Jr. et al. in this work did not report such types of cells. Did the authors check for the expression of marker genes listed in Ref 89 for such cell types?”

      For VNO dissections, we removed bones and blood vessels from VNO tissue and only kept the sensory epithelium. This procedure removed vascular smooth muscle cells, pericytes, and other non-neuronal cell types, which explains differences in cell proportions between our study and previous studies. We used a DAPI/Draq5 assay to sort live/nucleated cells for sequencing and no specific markers were used for cell selection. All cells in the experiment were successfully annotated using the cell-type markers shown in Fig. 1B, save for cells from the sVSN cluster, which were novel and required further analysis to characterize.

      “The authors should report the marker genes used for cell annotation.”

      Marker genes used for cell annotation are shown in figure 1B. A full list of all marker genes used in the cell annotation process will be provided.

      “The authors reported no differences between juvenile and adult samples, and between male and female samples. It is not clear how they evaluate statistically significant differences, which statistical test was used, or what parameters were evaluated.”

      The claims made about male/female mice and P14/P56 mice directly pertain to the distribution of clusters and cells in UMAP space as seen in Figure 1 C & D. We have indeed performed differential gene expression analysis for male/female and P14/P56 comparisons using the FindMarkers function from the Seurat R package. Although we have found significant differential expression between male and female, and between P14 and P56 animals, the genes in this list do not appear to be influential for the neuronal lineage and cell type specification or related to cell adhesion molecules, which are the main focuses of this study. Nevertheless, we plan to add these results to the supplemental materials in a revised manuscript.

      “‘Based on our transcriptomic analysis, we conclude that neurogenic activity is restricted to the marginal zone.’ This conclusion is quite a strong statement, given that this study was not directed to carefully study neurogenesis distribution, and when neurogenesis in the basal zone has been proposed by other works, as stated by the authors.”

      Eighteen slides from whole VNO sections were used in Molecular Cartography analysis, while one representative slide was used to present findings. Across all slides, GBCs, INPs, and iVSNs show a pattern of proximity to the marginal zone (MZ), with GBCs nearest to the MZ and iVSNs furthest from it. We believe that the full scope of our results justifies our claim that neurogenesis is restricted to the MZ. This claim is also supported by the 2021 study by Katreddi & Forni. We will provide additional figures to further support this claim.

      “The authors report at least two new types of sensory neurons in the mouse VNO, a finding of huge importance that could have a substantial impact on the field of sensory physiology. However, the evidence for such new cell types is based solely on this transcriptomic dataset and, as such, is quite weak, since many crucial morphological and physiological aspects would be missing to clearly identify them as novel cell types. As stated before, many control and confirmatory experiments, and a careful evaluation of the results presented in this work must be performed to confirm such a novel and interesting discovery. The reported "novel classes of sensory neurons" in this work could represent previously undescribed types of sensory neurons, but also previously reported cells (see below) or simply possible single-cell sequencing artefacts.”

      The reviewer is correct that detailed morphological and physiological studies are needed to further understand these cells. This is an opinion we share. Our paper is primarily intended as a resource paper to provide access to a large-scale single-cell RNA-sequenced dataset and discoveries based on the transcriptomic data that can support and inspire ongoing and future experiments in the field. Nonetheless, we are confident that neither of the novel cell clusters is the result of sequencing artefacts. We performed a robust quality-control protocol, including count correction for ambient RNA with the R package SoupX, multiplet detection and removal with the Python module Scrublet, and a strict 5% mitochondrial gene expression cut-off. Furthermore, the cell clusters in question show no signs of being the result of sequencing artefacts, as they are physically connected in a reasonable orientation to the rest of the neuronal lineage in modular clusters in 2D and 3D UMAP space. The OSN and sVSN (S1H) cell clusters each show distinct and self-consistent expression of genes. Gene ontology (GO) analysis reveals significant GO term enrichment for both the sVSN (Fig. 2G) and mOSN clusters when compared to mature V1R and V2R VSNs, indicating functional differences. Additional figures for mOSN differential gene expression and gene ontology analysis results will be added to the supplemental figures.
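
      To make the last quality-control step concrete, the following is a minimal Python sketch of a 5% mitochondrial cut-off. It is not the authors' pipeline: the function names, the toy counts, and the assumption that mouse mitochondrial genes carry an "mt-" prefix are all illustrative, and ambient-RNA correction and multiplet removal are not shown.

```python
# Illustrative sketch only; not the authors' actual quality-control code.
MITO_PREFIX = "mt-"        # assumption: mitochondrial genes are named "mt-..."
MAX_MITO_FRACTION = 0.05   # the 5% cut-off mentioned in the response


def mito_fraction(counts):
    """Return the fraction of a cell's counts that come from mitochondrial genes."""
    total = sum(counts.values())
    if total == 0:
        # A barcode with no counts carries no information; report 1.0 so it is excluded.
        return 1.0
    mito = sum(n for gene, n in counts.items() if gene.startswith(MITO_PREFIX))
    return mito / total


def filter_cells(cells, max_fraction=MAX_MITO_FRACTION):
    """Keep only cells whose mitochondrial fraction does not exceed the cut-off."""
    return {
        cell_id: counts
        for cell_id, counts in cells.items()
        if mito_fraction(counts) <= max_fraction
    }


# Hypothetical toy data: gene -> count for each cell barcode.
cells = {
    "cell_A": {"Omp": 90, "Gnai2": 8, "mt-Co1": 2},    # 2% mitochondrial
    "cell_B": {"Omp": 40, "Gnao1": 40, "mt-Nd1": 20},  # 20% mitochondrial
}
kept = filter_cells(cells)
print(sorted(kept))
```

      In this toy example only the first cell passes the filter; a real analysis would apply the same threshold to the full count matrix after the other correction steps.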

      “The authors report the co-expression of V2R and Gnai2 transcripts based on sequencing data. That could dramatically change classical classifications of basal and apical VSNs. However, did the authors find support for this co-expression in spatial molecular imaging experiments?” 

      Genes with extremely high expression levels overwhelm signals from other genes, and therefore had to be removed from the experiment. This is a limitation of the Molecular Cartography platform. Unfortunately, Gnai2 was determined to be one of these genes and was not evaluated for this purpose.

      “Canonical OSNs: The authors report a cluster of cells expressing neuronal markers and ORs and call them canonical OSN. However, VSNs expressing ORs have already been reported in a detailed study showing their morphology and location inside the sensory epithelium (References 82, 83). Such cells are not canonical OSNs since they do not show ciliary processes, they express TRPC2 channels and do not express Golf. Are the "canonical OSNs" reported in this study and the OR-expressing VSNs (ref 82, 83) different? Which parameters, other than Gnal and Cnga2 expression, support the authors' bold claim that these are "canonical OSNs"? What is the morphology of these neurons? In addition, the mapping of these "canonical OSNs" shown in Figure 2D paints a picture of the negligible expression/role of these cells (see their prediction confidence).” 

      We observe OR expression in VSNs in our data; these cells cluster with VSNs. The putative mOSN cluster exhibits its own trajectory, distinct from VSN clusters. These cells express Gnal (Golf), which is not expressed in VSNs expressing ORs, nor in any other cell type in the data. After performing differential gene expression on the putative mOSN cluster, comparing with V1R and V2R VSNs independently, GO analysis returned the top significantly enriched GO molecular function, ‘olfactory receptor activity’, and the top significantly enriched cellular component, ‘cilium’. Because we were limited to a list of 100 genes in the Molecular Cartography probe panel, we prioritized the detection of canonical VNO cell types, vomeronasal receptor co-expression, and the putative sVSNs, and were not able to include a robust analysis of the putative OSNs.

      “Secretory VSN: The authors report another novel type of sensory neurons in the VNO and call them "secretory VSNs". Here, the authors performed an analysis of differentially expressed genes for neuronal cells (dataset 2) and found several differentially expressed genes in the sVSN cluster. However, it would be interesting to perform a gene expression analysis using the whole dataset including neuronal and non-neuronal cells. Could the authors find any marker gene that unequivocally identifies this new cell type?”

      We did not find unequivocal marker genes for sVSNs. We performed differential analysis of the sVSN cluster against the whole VNO data and against the neuronal subset, as well as against specific cell types. We could not find a single gene that was perfectly exclusive to sVSNs. We therefore used a combinatorial marker-gene approach to predict sVSNs in the Molecular Cartography data. This required a larger subset of our 100-gene panel to be dedicated to genes for detecting sVSNs.

      “When the authors evaluated the distribution of sVSN using the Molecular Cartography technique, they found expression of sVSN in both sensory and non-sensory epithelia. How do the authors explain such unexpected expression of sensory neurons in the non-sensory epithelium?” 

      In our scRNA-Seq experiment, blood vessels were removed, limiting the power to distinguish between certain cell types. Because of the limited number of genes that we can probe using Molecular Cartography, a number of the genes associated with sVSNs may also be expressed in the non-sensory epithelium. This could lead to the identification of cells in the non-neuronal epithelium that may or may not be identical to the sVSNs. Further studies will need to be conducted to determine the specificity of these cells.

      “The low total genes count and low total reads count, combined with an "expression of marker genes for several cell types" could indicate low-quality beads (contamination) that were not excluded with the initial parameter setting. It looks like cells in this cluster express a bit of everything V1R, V2R, OR, secretory proteins...”

      We are confident that the putative sVSN cell cluster is not the result of low-quality cells. We performed a robust quality-control protocol, including count correction for ambient RNA with the R package SoupX, multiplet detection and removal with the Python module Scrublet, and a strict 5% mitochondrial gene expression cut-off. Furthermore, the cell clusters in question show no signs of being the result of sequencing artefacts, as they are connected in a reasonable orientation to the rest of the neuronal lineage in modular clusters in 2D and 3D UMAP space. The OSN and sVSN cell clusters each show distinct and self-consistent gene expression (Fig. S1H). Gene ontology (GO) analysis reveals significant GO term enrichment for both the sVSN (Fig. 2G) and mOSN clusters when compared to mature V1R and V2R VSNs, indicating functional differences. Moreover, while some genes were expressed at a lower level than in the canonical VSNs, others were expressed at higher levels, which rules out an overall loss of gene counts as the cause of the discrepancy.
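
      The mitochondrial cut-off described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the toy count matrix, and the choice of "less than or equal to 5%" as the pass condition are assumptions, and the SoupX and Scrublet steps are not reproduced here. It relies only on the convention that mouse mitochondrial gene symbols begin with "mt-".

```python
import numpy as np

def mito_fraction(counts, gene_names, prefix="mt-"):
    """Fraction of each cell's total counts that comes from mitochondrial genes.

    counts: cells x genes matrix of raw counts.
    gene_names: gene symbols, one per column of `counts`.
    """
    counts = np.asarray(counts, dtype=float)
    is_mito = np.array([g.lower().startswith(prefix) for g in gene_names])
    totals = counts.sum(axis=1)
    mito = counts[:, is_mito].sum(axis=1)
    # Cells with zero total counts get a fraction of 0 instead of NaN.
    return np.divide(mito, totals, out=np.zeros_like(totals), where=totals > 0)

def passes_mito_cutoff(counts, gene_names, max_fraction=0.05):
    """Boolean mask of cells at or below the mitochondrial fraction cut-off."""
    return mito_fraction(counts, gene_names) <= max_fraction

# Hypothetical toy data: two cells, four genes (two mitochondrial).
genes = ["mt-Co1", "mt-Nd1", "Omp", "Gnal"]
toy_counts = [
    [1, 1, 50, 48],    # 2% mitochondrial counts
    [10, 10, 40, 40],  # 20% mitochondrial counts
]
keep = passes_mito_cutoff(toy_counts, genes)
```

      In this toy case the first cell is kept and the second is removed. A filter of this kind only removes cells with a high mitochondrial fraction; it does not by itself address the ambient RNA or multiplet concerns raised by the reviewer.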

      “The authors wrote ‘...the transcriptomic landscape that specifies the lineages is not known...’. This statement is not completely true, or at least misleading. There are still many undiscovered aspects of the transcriptomics landscape and lineage determination in VSNs. However, authors cannot ignore previously reported data showing the landscape of neuronal lineages in VSNs (Ref ref 88, 89, 90, 91 and doi.org/10.7554/eLife.77259). Expression of most of the transcription factors reported by this study (Ascl1, Sox2, Neurog1, Neurod1...) were already reported, and for some of them, their role was investigated, during early developmental stages of VSNs (Ref ref 88, 89, 90, 91 and doi.org/10.7554/eLife.77259). In summary, the authors should fully include the findings from previous works (Ref ref 88, 89, 90, 91 and doi.org/10.7554/eLife.77259), clearly state what has been already reported, what is contradictory and what is new when compared with the results from this work.“

      This is a difference in opinion about the terminology. Transcriptomic landscape in our paper refers to the genome-wide expression by individual cells, not just individual genes. The reviewer is correct that many of the genetic specifiers have been identified, which we cited and discussed. We consider these studies as providing a “genetic” underpinning, rather than the “transcriptomic landscape” in lineage progression. We will clarify this point in the revised manuscript. 

      “…the co-expression of specific V2Rs with specific transcription factors does not imply a direct implication in receptor selection. Directed experiments to evaluate the VR expression dependent on a specific transcription factor must be performed.” 

      The reviewer is correct, and we did not claim that the co-expression of specific transcription factors indicates a direct relationship with receptor selection. We agree that further directed experiments are required to investigate this question.

      “This study reports that transcription factors, such as Pou2f1, Atf5, Egr1, or c-Fos could be associated with receptor choice in VSNs. However, no further evidence is shown to support this interaction. Based on these purely correlative data, it is rather bold to propose cascade model(s) of lineage consolidation.”

      The reviewer is correct. As any transcriptomic study will only be correlative, additional studies will be needed to unequivocally determine the mechanistic link between the transcription factors and receptor choice. Our model provides a basis for these studies.

      “The authors use spatial molecular imaging to evaluate the co-expression of many chemosensory receptors in single VNO cells. […] However, it is difficult to evaluate and interpret the results due to the lack of cell borders in spatial molecular imaging. The inclusion of cell border delimitation in the reported images (membrane-stained or computer-based) could be tremendously beneficial for the interpretation of the results.”

      The most common practice for cell segmentation of spatial transcriptomics data is to determine cell borders based on nuclear staining with expansion. We have tested multiple algorithms based on recent studies, but each has its own caveat. We will clarify this point in the revised manuscript.

      “It is surprising that the authors reported a new cell type expressing OR, however, they did not report the expression of ORs in Molecular Cartography technique. Did the authors evaluate the expression of OR using the cartography technique?” 

      We were limited to a 100-gene probe panel and included only one OR; its expression was not high enough for us to substantiate any claims.

      From Reviewer #3:

      “(1) The authors claim that they have identified two new classes of sensory neurons, one being a class of canonical olfactory sensory neurons (OSNs) within the VNO. This classification as canonical OSNs is based on expression data of neurons lacking the V1R or V2R markers but instead expressing ORs and signal transduction molecules, such as Gnal and Cnga2. Since OR-expressing neurons in the VNO have been previously described in many studies, it remains unclear to me why these OR-expressing cells are considered here a "new class of OSNs." Moreover, morphological features, including the presence of cilia, and functional data demonstrating the recognition of chemosignals by these neurons, are still lacking to classify these cells as OSNs akin to those present in the MOE. While these cells do express canonical markers of OSNs, they also appear to express other VSN-typical markers, such as Gnao1 and Gnai2 (Figure 2B), which are less commonly expressed by OSNs in the MOE. Therefore, it would be more precise to characterize this population as atypical VSNs that express ORs, rather than canonical OSNs.”

      We observe OR expression in VSNs in our data; these cells cluster with VSNs. The putative mOSN cluster exhibits its own trajectory, distinct from VSN clusters. These cells express Gnal (Golf), which is not expressed in VSNs expressing ORs, nor in any other cell type in the data. We have performed differential gene expression analysis on the putative mOSN cluster to compare it with V1R and V2R VSNs. In the GO analysis, the top significantly enriched GO terms include “olfactory receptor activity” and “cilium”, further supporting that these are OSNs. Because we were limited to a list of 100 genes in the Molecular Cartography probe panel, we prioritized the detection of canonical VNO cell types, vomeronasal receptor co-expression, and the putative sVSNs, and were not able to include a robust analysis of the putative OSNs. With regard to Gnai2 and Go expression, we have examined our data from the OSNs dissociated from the olfactory epithelium and detected substantial expression of both. This new analysis provides additional support for our claim. We will update the information in a revised manuscript.

      “(2) The second new class of sensory neurons identified corresponds to a group of VSNs expressing prototypical VSN markers (including V1Rs, V2Rs, and ORs), but exhibiting lower ribosomal gene expression. Clustering analysis reveals that this cell group is relatively isolated from V1R- and V2R-expressing clusters, particularly those comprising immature VSNs. The question then arises: where do these cells originate? Considering their fewer overall genes and lower total counts compared to mature VSNs, I wonder if these cells might represent regular VSNs in a later developmental stage, i.e., senescent VSNs. While the secretory cell hypothesis is compelling and supported by solid data, it could also align with a late developmental stage scenario. Further data supporting or excluding these hypotheses would aid in understanding the nature of this new cell cluster, with a comparison between juvenile and adult subjects appearing particularly relevant in this context.” 

      We wholeheartedly agree with this assessment. Our initial thought was that these were senescent VSNs, but the trajectory analysis did not support this scenario, leading us to propose that these are putative secretory cells. Our analysis also shows that, overall, 46% of the putative sVSNs were from the P14 sample and 54% from P56. These cells comprise roughly 6.4% of all P14 cells and 8.5% of P56 cells. In comparison, 28.4% of all cells are mature V1R VSNs at P14, but the percentage rises to 46.7% at P56. The significant presence of sVSNs at P14, and the disproportionate increase when compared with mature VSNs, indicate that these are unlikely to be late developmental stage or senescent cells, although we cannot exclude these possibilities. We plan to clarify these points in the revised manuscript.
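
      As a reading aid for the percentages quoted above, the relative change in each population's share between P14 and P56 can be worked out directly. This is a sketch using only the four percentages stated in the response; the function name is illustrative, and the calculation says nothing about statistical significance or about absolute cell numbers.

```python
def fold_change(early_pct, late_pct):
    """Ratio of a population's share of all cells at the later vs. earlier age."""
    return late_pct / early_pct

# Percentages of all cells at P14 and P56, as stated in the author response.
svsn_fc = fold_change(6.4, 8.5)    # putative sVSNs
v1r_fc = fold_change(28.4, 46.7)   # mature V1R VSNs

print(f"sVSN: {svsn_fc:.2f}x, mature V1R VSN: {v1r_fc:.2f}x")
# prints "sVSN: 1.33x, mature V1R VSN: 1.64x"
```

      On these figures the sVSN share grows by about a third between the two ages, while the mature V1R VSN share grows by about two thirds.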

      We did not include sVSNs in the trajectory inference analysis because of inherent uncertainty about their developmental origins. However, PCA embeddings were the basis of the pseudotime analysis, and those embeddings that do include the sVSN cluster show that it is distributed evenly between the mature V1R and V2R clusters, with all mature clusters equidistant from GBC and INP clusters, indicating that they may indeed originate from the same stem cell populations. We plan to include trajectory analysis based on this assumption in the revised manuscript.

      (3) The authors' decision not to segregate the samples according to sex is understandable, especially considering previous bulk transcriptomic and functional studies supporting this approach. However, many of the highly expressed VR genes identified have been implicated in detecting sex-specific pheromones and triggering dimorphic behavior. It would be intriguing to investigate whether this lack of sex differences in VR expression persists at the single-cell level. Regardless of the outcome, understanding the presence or absence of major dimorphic changes would hold broad interest in the chemosensory field, offering insights into the regulation of dimorphic pheromone-induced behavior. Additionally, it could provide further support for proposed mechanisms of VR receptor choice in VSNs. 

      The reviewer raised a good point. We did not observe differences between male and female mice, or between P14 and P56 mice, in the distribution of clusters and cells in UMAP space. However, our differential expression analysis has revealed significantly differentially expressed genes in both comparisons. These genes have not been implicated in lineage or cell type determination, and we decided not to include the analysis in the current version. In the revised manuscript, we plan to include the results.

      “(4) The expression analysis of VRs and ORs seems to have been restricted to the cell clusters associated with the neuronal lineage. Are VRs/ORs expressed in other cell types, i.e. sustentacular, HBC, or other cells?” 

      Sparsely expressed low counts of VR and OR genes were observed in non-neuronal cell-types. When their expression as a percentage of cell-level gene counts is considered, however, the expression is negligible when compared to the neurons. The observed expression may be explained by stochastic base-level expression, or it may be the result of remnant ambient RNA that passed filtering. We will clarify this point in the revision.

    1. eLife assessment

      This important study demonstrates a novel method for imaging glutamate receptors in situ via cryo-ET. The use of cutting-edge methods is well-described and is convincing, but there are minor concerns as to how generally this approach can be used in imaging cell surface receptors. This paper is broadly relevant to biophysicists and neuroscientists.

    2. Reviewer #1 (Public Review):

      Summary:

      Matsui et al. present an experimental pipeline for visualizing the molecular machinery of synapses in the brain, which includes numerous techniques, starting with generating labeled antibodies and recombinant mice, continuing with HPF and FIB milling, and finishing with tilt series collection and 3D image processing. This pipeline represents a breakthrough in the preparation of brain tissue for high-resolution imaging and can be used in future tomographic research to reconstruct molecular details of synaptic complexes as well as pre- and post-synaptic assemblies. This methodology can also be adapted for a broader range of tissue preparations and signifies the next step towards a better structural understanding of how molecular machineries operate in natural conditions.

      Strengths:

      The manuscript is very well written, contains a detailed description of methodology, provides nice illustrations, and will be an outstanding guide for future research.

      Weaknesses:

      None noted.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors present a method that allows for the identification and localization of molecular machinery at chemical synapses in unstained, unfixed native brain tissue slices. They believe that this approach will provide a 3D structural basis for understanding different mechanisms of synaptic transmission, plasticity, and development. To achieve this, the group used genetically engineered mouse lines and generated thin brain slices that underwent high-pressure freezing (HPF) and focused ion beam (FIB) milling. Utilizing cryo-electron tomography (cryo-ET) and integrating it with cryo-fluorescence microscopy, they achieved micrometer resolution in identifying the glutamatergic synapses along with nanometer resolution to locate AMPA receptors GluA2-subunits using Fab-AuNP conjugates. The findings are summarized with detailed examples of successfully prepared substrates for cryo-ET, specific morphological identification and localization, and the detailed structural organization of excitatory synapses, including synaptic vesicle clusters close to the postsynaptic density and in the cleft.

      Strengths:

      The study advances previous work that used cultured neurons or synaptosomes. Combining cryo-electron tomography (cryo-ET) with fluorescence-guided targeting and labeling with Fab-AuNP conjugates enabled the study of synapses and molecular structures in their native environment without chemical fixation or staining. This preserves their near-native state, offering high specificity and resolution. The methods developed are generalizable, allowing adaptation for identifying and localizing other key molecules at glutamatergic synapses and potentially useful for studying a variety of synapses and cellular structures beyond the scope of this research.

      Weaknesses

      The preparation and imaging techniques are complex and require highly specialized equipment and expertise, potentially limiting their accessibility and widespread adoption.

      Additionally, the methods might need further modifications/tweaks to study other types of synapses or molecular structures effectively.

      The reliance on genetically engineered mouse lines may again impact the generalizability of the findings.

      Similarly, the requirement of monoclonal, high-affinity antibodies/Fab fragments to specifically label receptors/proteins would limit the wider employment of these methods.

    1. eLife assessment

      This study provides the first analysis of vascular stabilization on the critical and evolutionarily conserved structure around the Circle of Willis in the brain, strengthened by using parallel in vivo and in vitro experimental approaches. The evidence supporting the claims is solid and the work will be valuable for scientists studying developmental and disease-related vascular stabilization.

    2. Reviewer #2 (Public Review):

      Summary:

      Cheng et al. explore the development of the arteries that form the circle of Willis and investigate how blood flow pulsatility influences vascular smooth muscle cell (VSMC) differentiation. Using live confocal imaging of the developing zebrafish, the authors show that endothelial cells in circle of Willis arteries transition from venous to arterial identity between 54 hours post-fertilization (hpf) and 3 days post-fertilization (dpf), and that this coincides with pdgfrb+ mural cell progenitor differentiation into acta2+ arterial VSMCs. They find that the anterior portions of the circle of Willis, including the internal carotid arteries (CaDI), establish acta2 expression earlier than posterior aspects, likely due to faster flow rate and increased pulsatility through the CaDI. Then, using computational fluid dynamics, an in vitro co-culture assay, and genetic and drug manipulations of blood flow, the authors provide evidence that pdgfrb+ differentiation is dependent upon pulsatile blood flow and klf2a activation. The results add to our understanding of vascular development and suggest that deficits in pulsatile flow could be potential drivers of arteriopathies.

      Strengths:

      (1) Longitudinal confocal imaging of live developing zebrafish makes the timeline of arterial development in the circle of Willis easy to understand. This is a strong approach to studying how vascular networks are altered with genetic and pharmacological manipulations.

      (2) Rigorous use of multiple techniques to test the hypothesis that pulsatile blood flow is required for smooth muscle cell differentiation. The microangiography experiment, in vitro co-culture assay, and genetic and drug manipulations of heart rate at various developmental timepoints yield outcomes that are consistent with the hypothesis.

      Weaknesses:

      (1) The authors should provide more information on how blood flow velocity and wall shear stress are calculated from the circle of Willis vascular structure. It is presumed that these values are dependent upon the 3-D morphology of the vessel network, as labeled by intravenous dextran dye, but this is not clear. Small local differences in vessel diameter and shape will influence blood flow velocity, but these morphological changes are not clearly articulated. Further, it is unclear how flow input levels to the CaDI and basilar arteries are decided across time-points. In general, descriptions of the blood flow modeling are very sparse.

      (2) Is it possible to measure the blood flow speed empirically with line-scanning or high-speed tracking of labeled blood cells? This would provide some validation of the modeling results.

      (3) Does the cardiac injection of dextran itself affect the diameter or flow of the arteries, given the invasiveness of the procedure? This could be examined in fish with a transgenic endothelial label and with vs. without dextran.

      (4) The data from the microangiography experiment in Figure 3 does not fully support the stated results. The authors report that the CaDI had the highest blood flow speed starting from 54 hpf, but it does not appear to be higher than the other arteries at this time point. Additionally, there is not sufficient evidence that wall shear stress coincides with smooth muscle cell differentiation in the CaDI. Wall shear stress appears to be similar between 54 hpf and 3 dpf in the CaDI, only increasing between 3 dpf and 4 dpf, while differentiation is shown to begin at 3 dpf.

      (5) The genetic and drug manipulations of heart rate are important experiments, but more detail is required to understand the effects of the manipulations. At least, a discussion on the limitations of these manipulations is needed. For example, how does one separate the pulsatile versus nutritive effects of blood flow/heart rate reduction? It is possible that off-target or indirect effects of Nifedipine decrease smooth muscle cell proliferation, or that altered cardiac contractility fundamentally alters many aspects of vascular development other than blood flow. Nifedipine is also likely to act upon VSMC calcium handling in the circle of Willis, which may in turn affect cell maturation.

      (6) It is unclear if acta2 expression is conferring vascular tone, as would be expected if the cells are behaving as mature VSMCs. Does arterial diameter decrease with an increase in acta2 expression? Are acta2 positive mural cells associated with more dynamic changes in arteriole diameter under basal or stimulated conditions?

    3. Reviewer #3 (Public Review):

      Summary:

      Cheng et al. studied if and how blood flow regulates differentiation of vascular smooth muscle cells (VSMC) in the Circle of Willis (CW) in zebrafish embryos. They show that CW vessels gradually acquire arterial identity. VSMCs also undergo gradual differentiation, which correlates with blood flow velocity. Using cell culture they show that pulsatile blood flow promotes pericyte differentiation into smooth muscle cells. They further identify transcription factor klf2a as differentially regulated by blood flow, and show that klf2a inhibition results in VSMC differentiation. The authors conclude that pulsatile flow promotes VSMC differentiation through klf2a activation.

      Strengths:

      Overall this is an important study, because VSMC differentiation in CW has not been previously studied, although analogous observations regarding the role of blood flow and klf2 involvement have been previously made in other systems and other vascular beds, for example, mouse klf2 mutants, which have deficient VSMC coverage of the dorsal aorta (Wu et al., 2008, JBC 283: 3942-50). The results convincingly show that VSMC differentiation in CW depends on the blood flow, and that klf2a flow dependent function regulates VSMC differentiation.

      Weaknesses:

      (1) The provided data do not support a correlation between wall shear stress (WSS) and acta2+ cell number. The number of acta2+ cells in CaDI increases dramatically between 54 hpf and 3 dpf (Fig. 2F). However, the graph provided in the response to reviewers shows that WSS in CaDI is actually lower at 3 dpf compared to 54 hpf. The authors argue that Pearson correlation analysis shows that both variables increase together, but this is calculated over the stage between 54 hpf and 4 dpf. acta2+ cells appear by 3 dpf, and at this stage WSS in CaDI is not increased (or is even lower), which argues against WSS being the cause of acta2+ cell differentiation. Furthermore, data in Fig. 3I-K show that WSS actually decreases in BCA and PCS between 54 hpf and 4 dpf, while the number of acta2+ cells increases in BCA and PCS by 4 dpf. This also argues against the claim that WSS affects differentiation of acta2+ cells.

      (2) In multiple instances, results are based on a single independent experiment (Fig. 3, Fig. 4H, I, Fig. S2 and Fig. S3) with only a few embryos analyzed in many cases. This falls short of expected standards in the field, and it is unclear if these results are reproducible.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Weaknesses to be addressed: 

      (1) More detail is required to understand the effects of genetic and drug manipulations on heart rate as these are important experiments. At the very least, a discussion on the limitations of these manipulations is needed. 

      - For example, how does one separate the pulsatile versus nutritive effects of blood flow/heartrate reduction? 

      - The conclusion that arterial SMC differentiation is driven by pulsatile blood flow needs to be toned down. Indeed, this conclusion is mainly supported by in vitro cell co-cultures exposed to laminar versus pulsatile flow. In vivo, reducing Tnnt2a expression affects cardiac contractility and blood flow does not selectively affect pulsatility. To make this conclusion, the authors would need an experimental means to selectively dampen the pulsatility of blood flow.

      We understand this concern and have toned down the statements in our conclusions related to pulsatile flow by using 'flow' instead of 'pulsatile flow' throughout the text, except in the part on in vitro co-cultures. We also added a paragraph to discuss the limited capability to qualitatively reduce blood flow in vivo, and we acknowledge that the effects of nutrients and flow reduction could not be uncoupled in live zebrafish embryos. We proposed that, in the future, in vitro 3D vascular culture models may be combined with microfluidics to precisely calibrate nutrient composition in culture media, flow velocity and pulse; these methods would help address these questions more thoroughly. See pages 11-12, lines 312-322.

      (2) Since mural cells are sensitive to transmural pressure, could the authors elaborate on the potential role of raised intravascular pressure in SMC differentiation? This would better parallel rodents and humans. 

      We thank you for this suggestion. We added a paragraph to discuss the potential role of raised intravascular pressure in VSMC differentiation in the discussion section (see page 11 line 296-311).

      (3) The authors use nifedipine to reduce blood flow. Nifedipine is a specific and potent inhibitor of voltage-dependent calcium channels (VDCC) which are expressed in SMCs. Prior studies (PMID: 35588738) showed that VDCC blockers increased rather than inhibited SMC differentiation. Nifedipine is also likely to act upon VSMC calcium handling in the circle of Willis, which may in turn affect cell maturation. Could the authors comment on this seeming discrepancy?

      It is possible that off-target or indirect effects of Nifedipine decrease smooth muscle cell proliferation, or that altered cardiac contractility fundamentally alters aspects of vascular development other than blood flow. 

      - Additionally, it would be helpful to report the quantitative heart rate reduction achieved with Nifedipine. This would clear up concerns that the heart rate reduction is too large for normal vascular development to occur, and thus decrease proliferation rate independent of changes in blood flow pulsatility. 

      We concur with these comments, which is why our experimentation with Nifedipine is reinforced by an alternative, non-pharmacological strategy to inhibit blood flow: the use of a morpholino against the tnnt2a gene. The results with both Nifedipine and the tnnt2a morpholino show a lack of VSMC maturation. In addition, we provided the quantitative heart rate reduction achieved with Nifedipine in new Figure S2A-S2C, which shows that the drug does not completely halt the heartbeat but decreases the heart rate. Moreover, we report that zebrafish embryos can survive and develop a normal blood vascular system without any heartbeat. Hence, we exclude the possibility that the effect on VSMC maturation is linked to non-specific effects caused by the loss of heartbeat. Nevertheless, we now acknowledge in our discussion the limitation of nifedipine, as it may affect VSMCs through VDCCs (page 12, lines 323-334).

      We also added a paragraph in the discussion section to compare nifedipine, an L-type VDCC blocker, with ML218, a selective T-type VDCC inhibitor used in the previous study (Ando et al., 2022). We noted that in this previous study, the increase in VSMC differentiation occurs only on anterior metencephalic central arteries (AMCtAs) that are more than 40 mm away from the BCA; these AMCtAs are much smaller than CoW arteries and have a different geometry, and hence possibly different kinetics of VSMC maturation (Ando et al., 2022), as the findings in our manuscript would suggest.

      (4) The authors should provide more information on how blood flow velocity and wall shear stress are calculated from the Circle of Willis vascular structure. It is presumed that these values are dependent upon the 3-D morphology of the vessel network, as labeled by intravenous dextran dye, but this is not clear. (a second reviewer similarly comments: I was unclear how flow velocity values were obtained in Fig. 3E. Are they based on computational simulation, or are they experimentally calculated following the dextran injection?) Small local differences in vessel diameter and shape will influence blood flow velocity, but these morphological changes are not clearly articulated. Further, it is unclear how flow input levels to the CaDI and basilar arteries are decided across time points. For instance, is it possible to measure the blood flow speed empirically with line-scanning or high-speed tracking of labeled blood cells or particles? This would provide validation of the modeling results. 

      The computational fluid dynamic simulation was performed according to a previous study from our lab (Barak et al., 2021). Blood flow velocity and wall shear stress are dependent upon the 3D morphology of the vessel network labeled by intravascular dextran. Details on how the computational fluid dynamic simulation was performed have been added to the Methods section (page 17, lines 433-449).

      Moreover, to address this reviewer's concern, we have now provided new experimental measurements of blood flow, obtained as red blood cell (RBC) velocity via axial line scanning microscopy in Tg(kdrl:gfp;gata1:DsRed)zn1/sd2 zebrafish embryos at 54 hpf, 3 dpf, and 4 dpf. Using the experimental RBC velocity, we re-ran the computational fluid dynamic simulation. The new findings align with our conclusion and are further elaborated upon in our response to the reviewer comment listed as point 6. Details on how RBC velocity was calculated have been added to the Methods section (page 16, lines 414-431).
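
      To make the dependence of wall shear stress on velocity and vessel size concrete, a textbook approximation can be sketched as follows. This is not the authors' CFD method, which uses the 3D vessel geometry; it is the Poiseuille relation for steady laminar flow of a Newtonian fluid in a straight rigid tube, tau_w = 4 * mu * v_mean / R. The function name and all numerical values below are hypothetical placeholders, not measurements from the study.

```python
def poiseuille_wss(mean_velocity, radius, viscosity):
    """Wall shear stress (Pa) for Poiseuille flow: tau_w = 4 * mu * v_mean / R.

    mean_velocity: cross-sectional mean velocity in m/s.
    radius: inner vessel radius in m.
    viscosity: dynamic viscosity in Pa*s.
    """
    if radius <= 0:
        raise ValueError("radius must be positive")
    return 4.0 * viscosity * mean_velocity / radius

# Hypothetical placeholder inputs (not values from the manuscript):
v_mean = 5e-4   # 0.5 mm/s
radius = 5e-6   # 5 micrometres
mu = 3.5e-3     # Pa*s

wss = poiseuille_wss(v_mean, radius, mu)           # about 1.4 Pa
wss_narrow = poiseuille_wss(v_mean, radius / 2, mu)  # halving R doubles WSS
```

      Under this simplification, wall shear stress scales linearly with velocity and inversely with radius, which is one reason small local differences in vessel diameter matter when interpreting the simulated values.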

      (5) Does the cardiac injection of dextran itself affect the diameter of the arteries, given the invasiveness of the procedure? This could be examined in fish with a transgenic endothelial label with and without dextran. 

      Here, we performed an experiment on wildtype zebrafish at 5 days post-fertilization (dpf) with and without Dextran injection, examining the effects of Dextran injection on vessel diameters. As shown in the representative image below, the XZ panel clearly illustrates a Dextran-filled PCS vessel with no alteration in vessel size. Dextran microangiography, a technique employed to obtain vessel geometry with fluorescent microsphere, has been well established in zebrafish (Kamei et al., 2010). Our findings, demonstrating that Dextran does not affect vessel size, are consistent with previous studies utilizing Dextran microangiography.

      Author response image 1.

      (6) The data from the microangiography experiment in Figure 3 does not fully support the stated results. The authors report that the CaDI had the highest blood flow speed starting from 54 hpf, but it does not appear to be higher than the other arteries at this time point. Additionally, there is not sufficient evidence that wall shear stress coincides with smooth muscle cell differentiation in the CaDI. Wall shear stress appears to be similar between 54 hpf and 3 dpf in the CaDI, only increasing between 3 dpf and 4 dpf, while differentiation is shown to begin at 3 dpf. The authors need to address this and/or soften conclusions. 

      First, in response to this specific concern, we measured red blood cell (RBC) velocity using axial line scanning microscopy in Tg(kdrl:gfp;gata1:DsRed)zn1/sd2 zebrafish embryos (the detailed method has been added to the Methods section of the manuscript). We replaced the computationally simulated blood flow velocity with RBC velocity in new Figure 3E-3G, and re-ran the computational simulation of wall shear stress (WSS) using the RBC velocity in new Figure 3I-3K. We compared RBC velocity and WSS among different vessels at each time point. We confirmed that CaDI has the highest RBC velocity from 54 hpf to 4 dpf (new Figure 3A-3C and 3E-3G) and found an overall increase in average WSS from 54 hpf to 4 dpf (new Figure 3A-3C and 3H). Further, WSS in CaDI was significantly higher than in BCA and PCS at 54 hpf, 3 dpf, and 4 dpf (new Figure 3A-3C, 3I-3K). Altogether, the CFD simulation suggests that CoW arteries experience different hemodynamic WSS that is associated with the spatiotemporal pattern of VSMC differentiation on CoW arteries (page 6, lines 153-162).

      Second, to assess the correlation between WSS and VSMC differentiation in CaDI, we performed a Pearson correlation analysis. In the image provided here, we plotted a linear regression of the normalized number of acta2+ cells in CaDI and WSS across developmental stages (54 hpf, 3 dpf, and 4 dpf), and calculated the Pearson correlation coefficient using GraphPad Prism 10.0.3. The correlation coefficient was r = 0.595, suggesting that the two variables (acta2+ cells and WSS) tend to increase together across developmental stages.
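
      For readers who want to reproduce this kind of analysis outside GraphPad Prism, the Pearson coefficient can be computed with a few lines of standard-library Python. The sketch below is only an illustration: the function name and the input values are hypothetical placeholders, not the data behind the reported r = 0.595.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [xi - mean_x for xi in x]
    dy = [yi - mean_y for yi in y]
    covariance_sum = sum(a * b for a, b in zip(dx, dy))
    ss_x = sum(a * a for a in dx)
    ss_y = sum(b * b for b in dy)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance_sum / math.sqrt(ss_x * ss_y)


# Hypothetical placeholder values (NOT the manuscript's measurements):
wss_values = [1.0, 2.0, 3.0, 4.0]          # wall shear stress, arbitrary units
acta2_cell_counts = [0.0, 1.0, 1.0, 3.0]   # normalized acta2+ cell number

r = pearson_r(wss_values, acta2_cell_counts)
print(f"Pearson r = {r:.3f}, r^2 = {r * r:.3f}")
```

      Squaring the coefficient gives the shared variance; for the value reported above, 0.595 squared is about 0.35.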

      Author response image 2.

      Third, we softened our conclusion to state that RBC velocity was differentially distributed across CoW arteries while VSMC differentiation occurred in these vessels.

      (7) It is unclear if acta2 expression is conferring vascular tone, as would be expected if the cells are behaving as mature VSMCs. Does arterial diameter decrease with an increase in acta2 expression? Are acta2-positive mural cells associated with more dynamic changes in arteriole diameter under basal or stimulated conditions? 

      Thanks for this interesting question. VSMC maturation and vasoactivity could be investigated further in the future. Our study focused on the early stage of VSMC differentiation, in which pdgfrb+ progenitors start to express the VSMC marker acta2. We discussed the onset of transgelin expression and the loss of abcc9 expression as markers of VSMC maturation. In addition, a previous study found that VSMC-covered vessels in the zebrafish brain dilate as early as 4 dpf and constrict at 6 dpf (Bahrami & Childs, 2020). Future studies may focus on the association between the expression of different VSMC markers and VSMC functional maturation (page 10, lines 272-279).

      (8) The authors argue that CoW vessels transition from venous to arterial identity (Fig. 1). However, kdrl is not an ideal arterial marker for this experiment as it is expressed in both arteries and veins. While it is true that many arterial beds have stronger kdrl expression than the veins, its expression in both arteries and veins changes with developmental stage, and its expression level may vary depending on the type of vessel. Therefore, showing that kdrl increases from 32 hpf - 4 dpf in CoW vessels is not convincing because its expression may increase in both venous or arterial vasculature as the vessels mature. In addition, flt4 expression is not exclusively venous; for example, it has noticeable expression in the dorsal aorta at 24-32 hpf stages. It would be helpful to confirm this transition by analyzing additional arterial and venous markers. 

      We acknowledge this and we added a paragraph to discuss the limitation. We combined loss of flt4 and increase in kdrl to establish the temporal sequence of circle of Willis morphogenesis, arterial specification, and VSMC differentiation. We acknowledge that additional arterial and venous markers need to be analyzed for a more thorough characterization of arterial specification in vertebrate brain vascular development. See page 12 line 335-341.

      (9) The authors show that acta2+ VSMCs are absent in tnnt2a MO embryos, concluding that blood flow is required for their differentiation from pericytes. However, there is no data showing that pericytes are still present in tnnt2a MO embryos. Although this has been previously shown by Ando et al 2016, it would be beneficial to confirm in the current study as this is a critical piece of evidence needed for this conclusion. 

      To determine whether blood flow is dispensable for pdgfrb+ progenitor recruitment, we injected tnnt2a MO (0.35 ng/embryo) into Tg(pdgfrb:egfp, kdrl:ras-mcherry) ncv22/s896 embryos. Loss of blood flow did not affect pdgfrb+ progenitor emergence around the CoW at 3 days post-fertilization (dpf) (new Figure S2G-S2H). This is consistent with the previous observation in Figure S2C of Ando et al. (2016).

      (10) The authors show that klf2a MO injected embryos have a reduced number of VSMCs at 3 dpf but a normal number at 4 dpf (Fig. 6), concluding that klf2a is only important to initiate CaDI muscularization. If this is true, it would raise important questions about how VSMCs differentiate at a later stage in the absence of klf2a. For instance, is blood flow not required to differentiate at a later stage, or is there another factor that compensates in the absence of klf2a? The alternative explanation/ caveat is that klf2a MO loses efficacy with development, leading to the recovery of VSMCs at this stage. Therefore, it would be important to confirm this result using a genetic klf2a mutant. 

      Thank you for pointing this out. We note that, based on the klf2a reporter line, klf2a activity in CoW arterial endothelial cells is highly correlated with the number of acta2+ VSMCs in CaDI, BCA and PCS at 3 dpf (r = 0.974, new Figure S5J). Interestingly, however, klf2a activity remained stable from 3 dpf to 4 dpf, well beyond the initiation of VSMC differentiation. Thus, we speculate that sustained klf2a expression may support further maturation of VSMCs, as acta2+ VSMCs showed distinct morphology at 4 dpf compared with 3 dpf (page 10, lines 268-272). As for the observation that klf2a morphants have a normal number of VSMCs at 4 dpf, we think that, in addition to the temporary effect of the morpholino, a possible explanation is compensation by the paralogous klf2b in zebrafish. We acknowledge that further characterization of CoW VSMC development in klf2a and klf2b double genetic mutants (Rasouli et al., 2018; Steed et al., 2016) may help determine whether klf2b compensates for klf2a in CoW VSMC differentiation beyond 4 dpf. See pages 10-11, lines 292-295.

      (11) A large part of the discussion focuses on Notch and Wnt signaling, as downstream Klf2 effectors. While these are reasonable hypotheses to propose, there is no data on the involvement of these pathways in the current study. It seems excessive to speculate on detailed mechanisms of how Klf2 activates Notch and Wnt signaling in the absence of data showing that these pathways are affected in CoW vessels. Therefore, the discussion could be shortened here unless additional data can be obtained to demonstrate the involvement of these pathways in VSMCs in CoW.

      We concur and have condensed the discussion on Notch and Wnt signaling as downstream klf2 effectors.

      Minor comments: 

      (1) Line 138 "CaDI is the only vessels in the CoW receiving pulsatile arterial blood low ... ". Adding a reference to support this statement would be useful. 

      We agree and revised this sentence into ‘CaDI receive proximal arterial feed through lateral dorsal aorta from cardiac outflow tract (Isogai et al., 2001)’. It was also based on our general observation of zebrafish vascular anatomy and blood flow under a confocal microscope.

      (2) The image insets in Figs. 1A, 2A, 4E-L, 5A, 6A are quite small. Please make them larger to help the reader interpret the findings. 

      We agree. We maximized the image size to help the reader interpret the finding, and to visualize confocal images and schematics side-by-side.

      (3) The schematics in Figs. 1-2, and 4-6 are helpful, but the different cell types are difficult to see because they are small and their colors/shapes are not very distinct. 

      We agree. We increased the size and color contrast of the schematics in the new Figures 1-2 and 4-6 to improve their visualization.

      (4) It is stated that there are no diameter differences between different arteries, but statistics are not reported. 

      The statistics in Figure 3D were performed by ordinary two-way ANOVA followed by Tukey’s multiple comparisons test, with a single pooled variance. We have now added pairwise comparisons among vessels in the CoW. Where no significance is indicated, the differences are non-significant.

      (5) Figure 3F would be better visualized on a log scale, as it is difficult to see the differences between each post-fertilization timepoint. 

      We agree. In the new Figure 3H, the average wall shear stress (WSS) in CoW arteries is presented on a log-scale y axis to make the differences between post-fertilization timepoints easier to see.

      (6) Please provide more background and validation on the pericyte cell line, and their use for the questions in this study. 

      Thank you for the question. TgBAC(pdgfrb:egfp)ncv22 was generated and described by Ando et al. (2016) to clarify mural cell coverage of the vascular endothelium in zebrafish. We added a description to the Methods section to provide background and validation for this pericyte line (see page 13, lines 368-372).

      (7) Flow velocity and WSS changes are shown in each vessel in Figs. 3E,G. However, the comparison should be made between different types of vessels to see if there is a statistical difference between CaDI and PCS, for example, which would explain differences in VSMC coverage. 

      We agree. We compared the differences among arteries in the CoW at each developmental timepoint and performed ordinary one-way ANOVA with Tukey’s multiple comparisons test. Figure 3E is replaced by new Figure 3E-G, and Figure 3G is replaced by new Figure 3I-K.

      (8) Similarly, the number of klf2a cells in Fig. 5B should be compared between different vessels, not between different stages of the same vessel. 

      We agree. In new Figure 5B-E, the number of klf2a+ cells per 100 μm vessel length are compared among different vessels at each developmental stage and analyzed by ordinary one-way ANOVA with Tukey’s multiple comparisons test.

      (9) When quantifying klf2+ cells in Fig. 5, it would be helpful to quantify klf2 expression level between cells in different vessels. This could be done by quantifying GFP expression in existing images. The difference in expression level may explain the variation between CaDI and PCS more accurately than just the difference in cell number. 

      GFP expression reflects the stability of the GFP protein and labels discrete nuclei with active klf2a expression. Hence, quantifying the GFP level might not give an accurate readout of klf2a expression per se, but rather of its activity. For this reason, we do not think that this experiment would provide an accurate measurement of klf2a expression.

      (10) Do data points in Figure 4D correspond to different cells in the same chamber experiment? If so, they cannot be treated as independent replicates. Each data point should correspond to an independent replicate experiment. 

      We agree. Now in the figure legend, we report the number of cells analyzed.

      (11) Graph placement is confusing in Figs. 4I, M. An adjacent Fig. 4G shows nifedipine-treated embryos, while the graph next to it (Fig. 4I) shows the acta2+ cell number from the tnnt2a 4 dpf experiment. Similarly, the bottom Fig. 4K tnnt2a 4 dpf MO experiment has an adjacent graph, Fig. 4M, which shows nifedipine treatment quantification, which makes it very confusing. 

      We agree. We rearranged Figure 4E (representative images of control embryos at 3 dpf and 4 dpf), Figure 4F (tnnt2a MO embryos at 3 dpf and 4 dpf), and Figure 4G (nifedipine-treated embryos at 3 dpf and 4 dpf).

      References:

      Ando, K., Fukuhara, S., Izumi, N., Nakajima, H., Fukui, H., Kelsh, R. N., & Mochizuki, N. (2016). Clarification of mural cell coverage of vascular endothelial cells by live imaging of zebrafish. Development, 143(8), 1328-1339. https://doi.org/10.1242/dev.132654

      Ando, K., Tong, L., Peng, D., Vazquez-Liebanas, E., Chiyoda, H., He, L., Liu, J., Kawakami, K., Mochizuki, N., Fukuhara, S., Grutzendler, J., & Betsholtz, C. (2022). KCNJ8/ABCC9-containing K-ATP channel modulates brain vascular smooth muscle development and neurovascular coupling. Dev Cell, 57(11), 1383-1399 e1387. https://doi.org/10.1016/j.devcel.2022.04.019

      Bahrami, N., & Childs, S. J. (2020). Development of vascular regulation in the zebrafish embryo. Development, 147(10). https://doi.org/10.1242/dev.183061

      Barak, T., Ristori, E., Ercan-Sencicek, A. G., Miyagishima, D. F., Nelson-Williams, C., Dong, W., Jin, S. C., Prendergast, A., Armero, W., Henegariu, O., Erson-Omay, E. Z., Harmanci, A. S., Guy, M., Gultekin, B., Kilic, D., Rai, D. K., Goc, N., Aguilera, S. M., Gulez, B., . . . Gunel, M. (2021). PPIL4 is essential for brain angiogenesis and implicated in intracranial aneurysms in humans. Nat Med, 27(12), 2165-2175. https://doi.org/10.1038/s41591-021-01572-7

      Isogai, S., Horiguchi, M., & Weinstein, B. M. (2001). The vascular anatomy of the developing zebrafish: an atlas of embryonic and early larval development. Dev Biol, 230(2), 278-301. https://doi.org/10.1006/dbio.2000.9995

      Kamei, M., Isogai, S., Pan, W., & Weinstein, B. M. (2010). Imaging blood vessels in the zebrafish. In Methods in cell biology (Vol. 100, pp. 27-54). Elsevier.

      Rasouli, S. J., El-Brolosy, M., Tsedeke, A. T., Bensimon-Brito, A., Ghanbari, P., Maischein, H. M., Kuenne, C., & Stainier, D. Y. (2018). The flow responsive transcription factor Klf2 is required for myocardial wall integrity by modulating Fgf signaling. Elife, 7. https://doi.org/10.7554/eLife.38889

      Steed, E., Faggianelli, N., Roth, S., Ramspacher, C., Concordet, J. P., & Vermot, J. (2016). klf2a couples mechanotransduction and zebrafish valve morphogenesis through fibronectin synthesis. Nat Commun, 7, 11646. https://doi.org/10.1038/ncomms11646

    1. eLife assessment

      This important study demonstrates that combining AlphaFold2 with the authors' sampling method AF2-RAVE improves protein-ligand docking for three protein kinases and their inhibitors. The evidence is compelling but would benefit from a more complete description of the methodology and a clear assessment of the method's range of applicability. The work will be of interest to researchers who work on computer-aided drug design.

    2. Reviewer #1 (Public Review):

      The development of effective computational methods for protein-ligand binding remains an outstanding challenge to the field of drug design. This impressive computational study combines a variety of structure prediction (AlphaFold2) and sampling (RAVE) tools to generate holo-like protein structures of three kinases (DDR1, Abl1, and Src kinases) for binding to type I and type II inhibitors. Of central importance to the work is the conformational state of the Asp-Phe-Gly "DFG motif" where the Asp points inward (DFG-in) in the active state and outward (DFG-out) in the inactive state. The kinases bind to type I or type II inhibitors when in the DFG-in or DFG-out states, respectively.

      It is noted that while AlphaFold2 can be effective in generating ligand-free apo protein structures, it is ineffective at generating holo-structures appropriate for ligand binding. Starting from the native apo structure, structural fluctuations are necessary to access holo-like structures appropriate for ligand binding. A variety of methods, including reduced multiple sequence alignment (rMSA), AF2-cluster, and AlphaFlow may be used to create decoy structures. However, those methods can be limited in the diversity of structures generated and lack a physics-based analysis of Boltzmann weight critical to their relative evaluation.

      To address this need, the authors combine AlphaFold2 with the Reweighted Autoencoded Variational Bayes for Enhanced Sampling (RAVE) method, to explore metastable states and create a Boltzmann ranking. With that variety of structures in hand, grid-based docking methods Glide and Induced-Fit Docking (IFD) were used to generate protein-ligand (kinase-inhibitor) complexes.

      The authors demonstrate that using AlphaFold2 alone, there is a failure to generate DFG-out structures needed for binding to type II inhibitors. By applying the AlphaFold2 with rMSA followed by RAVE (using short MD trajectories, SPIB-based collective variable analysis, and enhanced sampling using umbrella sampling), metastable DFG-out structures with Boltzmann weighting are generated enabling protein-ligand binding. Moreover, the authors found that the successful sampling of DFG-out states for one kinase (DDR1) could be used to model similar states for other proteins (Abl1 and Src kinase). The AF2RAVE approach is shown to result in a set of holo-like protein structures with a 50% rate of docking type II inhibitors.
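
      To make the idea of Boltzmann weighting concrete, the sketch below converts relative free energies of metastable states into equilibrium populations, with each weight proportional to exp(-F / kT). It is a generic illustration, not the AF2RAVE implementation: the function name, the temperature, and the free-energy values are hypothetical.

```python
import math

# Boltzmann constant in kcal/(mol*K).
KB_KCAL_PER_MOL_K = 0.0019872041


def boltzmann_weights(free_energies_kcal_mol, temperature_k=300.0):
    """Normalized Boltzmann populations for a list of state free energies."""
    if not free_energies_kcal_mol:
        raise ValueError("at least one free energy is required")
    kt = KB_KCAL_PER_MOL_K * temperature_k
    # Shift by the minimum so the largest exponent is zero (numerical stability).
    f_min = min(free_energies_kcal_mol)
    unnormalized = [math.exp(-(f - f_min) / kt) for f in free_energies_kcal_mol]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Hypothetical relative free energies in kcal/mol (illustrative only).
state_free_energies = {"DFG-in": 0.0, "DFG-out": 1.5}
weights = dict(
    zip(state_free_energies, boltzmann_weights(list(state_free_energies.values())))
)
for name, weight in weights.items():
    print(f"{name}: {weight:.3f}")
```

      A ranking of this kind is what distinguishes physically plausible metastable structures from decoys that a structure generator produces without any thermodynamic weight.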

      Overall, this is excellent work and a valuable contribution to the field that demonstrates the strengths and weaknesses of state-of-the-art computational methods for protein-ligand binding. The authors also suggest promising directions for future study, noting that potential enhancements in the workflow may result from the use of binding site prediction models and free energy perturbation calculations.

    3. Reviewer #2 (Public Review):

      Summary:

      This manuscript explores the utility of AlphaFold2 (AF2) and the authors' own AF2-RAVE method for drug discovery. As has been observed elsewhere, the predictive power of docking against AF2 structures is quite limited, particularly for proteins like kinases that have non-trivial conformational dynamics. However, using enhanced sampling methods like RAVE to explore beyond AF2 starting structures leads to a significant improvement.

      Strengths:

      This is a nice demonstration of the utility of the authors' previously published RAVE method.

      Weaknesses:

      My only concern is the authors' discussion of induced fit. I'm quite confident the structures discussed are present in the absence of ligand binding, consistent with conformational selection. It seems the authors' own data also argue for an important role for conformational selection. It would be nice to acknowledge this instead of going along with the common practice in drug discovery of attributing any conformational changes to induced fit without thoughtful consideration of conformational selection.

    4. Reviewer #3 (Public Review):

      In this manuscript, the authors aim to enhance AlphaFold2 for protein conformation-selective drug discovery through the integration of AlphaFold2 and physics-based methods, focusing on improving the accuracy of predicting protein structure ensembles and small molecule binding to metastable protein conformations to facilitate targeted drug design.

      The major strength of the paper lies in the methodology, which includes the innovative integration of AlphaFold2 with all-atom enhanced sampling molecular dynamics and induced fit docking to produce protein ensembles with structural diversity. Moreover, the generated structures can be used as reliable crystal-like decoys to enrich metastable conformations of holo-like structures. The authors demonstrate the effectiveness of the proposed approach in producing metastable structures of three different protein kinases and perform docking with their type I and II inhibitors. The paper provides strong evidence supporting the potential impact of this technology in drug discovery. However, limitations may exist in the generalizability of the approach across other structures, especially complex structures such as protein-protein or DNA-protein complexes.

      The authors largely achieved their aims by demonstrating that the AF2RAVE-Glide workflow can generate holo-like structure candidates with a 50% successful docking rate for known type II inhibitors. This work is likely to have a significant impact on the field by offering a more precise and efficient method for predicting protein structure ensembles, which is essential for designing targeted drugs. The utility of the integrated AF2RAVE-Glide approach may streamline the drug discovery process, potentially leading to the development of more effective and specific medications for various diseases.

    1. eLife assessment

      The mechanisms that ensure accurate chromosome segregation are key for genome integrity and defects therein can cause human disease. Although the involvement of MAP kinases in modulating mitosis is known, this manuscript makes a valuable contribution by going to some lengths to reveal links between Spindle Assembly Checkpoint dynamics and stress-responsive MAP-kinase pathways. The strength of the evidence is solid but there are minor weaknesses, which need to be addressed.

    2. Reviewer #1 (Public Review):

      Summary:

      This manuscript addresses two main issues:
      (i) do MAPKs play an important role in SAC regulation in a single-cell organism such as S. pombe?
      (ii) what is the nature of their involvement and what are their molecular targets?

      The authors have extensively used the cold-sensitive β-tubulin mutant to activate or inactivate SAC employing an arrest-release protocol. Localization of Cdc13 (cyclin B) to the SPBs is used as a readout for the SAC activation or inactivation. The roles of two major MAPK pathways i.e. stress-activated pathway (SAP) and cell integrity pathway (CIP), have been explored in this context (with CIP more extensively than SAP). Sty1Δ or pmk1Δ mutants were used to inactivate the SAP or CIP pathways and wis1DD or pek1DD expression was utilized to constitutively activate these pathways, respectively. Lowering of Slp1Cdc20 abundance (by phosphorylation of Slp1-Thr 480) is revealed as the main function of MAPK to augment the robustness of the spindle assembly checkpoint.

      Strengths:

      The experiments are generally well-conducted, and the results support the interpretations in various sections. The experimental data clearly supports some of the key conclusions:

      (1) While inactivation of SAP and CIP compromises SAC-imposed arrest, their constitutive activation delays the release from the SAC-imposed arrest.
      (2) CIP signaling, but not SAP signaling, attenuates Slp1Cdc20 levels.
      (3) Pmk1 and Cdc20 physically interact and Pmk1-docking sequences in Slp1 (PDSS) are identified and confirmed by mutational/substitution experiments.
      (4) Thr480 (and also S76) is identified as the residue phosphorylated by Pmk1. S28 and T31 are identified as Cdk1 phosphorylation sites. These are confirmed by mutational and other related analyses.
      (5) Functional aspects of the phosphorylation sites have been elucidated to some extent: (a) phosphorylation of Slp1-T480 by Pmk1 reduces its abundance, thereby augmenting the SAC-induced arrest; (b) S28, T31 (also S59) are phosphorylated by Cdk1; (c) K472 and K479 residues are involved in ubiquitylation of Slp1.

      Weaknesses:

      (1) Cdc13 localization to SPBs has been used as a readout for SAC activation/inactivation throughout the manuscript. However, the only image showing such localization (Figure 1C) is of poor quality where the Cdc13 localization to SPBs is barely visible. This should be replaced by a better image.

      (2) The overlapping error bars in Cdc13-localization data in some figures (for instance Figure 3E and 4H) make the effect of various mutations on SAC activation/inactivation rather marginal. In some of these cases, Western-blotting data support the authors' conclusions better.

      (3) This specific point is not really a weakness but rather a loose end:
      One of the conclusions of this study is that MAPK (Pmk1) contributes to the robustness of SAC-induced arrest by lowering the abundance of Slp1Cdc20. The authors have used pmk1Δ or constitutive activation of the MAPK pathways (Pek1DD) and documented their effect on SAC activation/inactivation dynamics. It is not clear whether SAC activation also leads to activation of MAPK pathways, allowing them to contribute to SAC robustness. To tie up this loose end, the authors could have checked whether the MAPK pathway is also activated under the conditions in which SAC is activated. Unless this is shown, one must assume that the authors are attributing the effect they observe to the basal activity of MAPKs.

      (4) This is also a loose end:
      The authors show that activation of stress pathways (by addition of KCl for instance) causes phosphorylation-dependent Slp1Cdc20 downregulation (Figure 6) under the SAC-activating condition. Does activation of the stress pathway cause phosphorylation-dependent Slp1Cdc20 downregulation under the non-SAC-activation condition or does it occur only under the SAC-activating condition?

      (5) Although the authors have gone to some length to identify S28 and T31 (also S59) as phosphorylation sites for Cdk1, their functional significance in the context of MAPK involvement is not yet clear. Perhaps it is outside the scope of this study to dig deeper into this aspect more than the authors have.

      (6) In its current state, the Discussion section is quite disjointed. The first section "Involvement of MAPKs in cell cycle regulation" should be in the Introduction section (very briefly, if at all). It certainly does not belong to the Discussion section. In any case, the Discussion section should be more organized with a better flow of arguments/interpretations.

    3. Reviewer #2 (Public Review):

      Summary:

      This study by Sun et al. presents a role for the S. pombe MAP kinase Pmk1 in the activation of the Spindle Assembly Checkpoint (SAC) via controlling the protein levels of APC/C activator Cdc20 (Slp1 in S. pombe). The data presented in the manuscript is thorough and convincing. The authors have shown that Pmk1 binds and phosphorylates Slp1, promoting its ubiquitination and subsequent degradation. Since Cdc20 is an activator of APC/C, which promotes anaphase entry, constitutive Pmk1 activation leads to an increased percentage of metaphase-arrested cells. The authors have used genetic and environmental stress conditions to modulate MAP kinase signalling and demonstrate their effect on APC/C activation. This work provides evidence for the role of MAP kinases in cell cycle regulation in S. pombe and opens avenues for exploration of similar regulation in other eukaryotes.

      Strengths:

      The authors have done a very comprehensive experimental analysis to support their hypothesis. The data is well represented, and including a model in every figure summarizes the data well.

      Weaknesses:

      As mentioned in the comments, the manuscript does not establish that MAP kinase activity leads to genome stability when cells are subjected to genotoxic stressors. That would establish the importance of this pathway for checkpoint activation.

    1. eLife assessment

      The study is noteworthy for its effort to achieve a deeper understanding of PTH-1 Receptor signaling. This molecular pathway, which underpins the control of calcium and phosphate metabolism throughout life in land-dwelling animals, can be targeted to the therapeutic benefit of people with osteoporosis. We consider the significance of the findings in this paper to be valuable to the community of investigators working on PTH receptor and PTH ligand signaling. The strength of the evidence is solid and it could become even stronger by addressing a few shortcomings.

    2. Reviewer #1 (Public Review):

      Summary:

      In this work, the authors investigate the functional difference between the most commonly expressed form of PTH, and a novel point mutation in PTH identified in a patient with chronic hypocalcemia and hyperphosphatemia. The value of this mutant form of PTH as a potential anabolic agent for bone is investigated alongside PTH(1-84), which is a previously used anabolic therapy. The authors have achieved the aims of the study. Their conclusion, however, that this suggests a "new path of therapeutic PTH analog development" seems unfounded; the benefit of this PTH variant is not clear, but the work is still interesting.

      The work does not identify why the patient with this mutation has hypocalcemia and hyperphosphatemia; this was not the goal of the study, but the data are useful for helping to understand that.

      Strengths:

      The work is novel, as it describes the function of a novel, naturally occurring, variant of PTH in terms of its ability to dimerise, to lead to cAMP activation, to increase serum calcium, and its pharmacological action compared to normal PTH.

      Weaknesses:

      (1) The use of very young, 8-10 week old, mice as a model of postmenopausal osteoporosis is a major limitation of this study. At 8 weeks, the effect of ovariectomy leads to lack of new trabecular bone formation, rather than trabecular bone loss due to a defect in bone remodelling. Although the findings here provide a comparison between two forms of PTH, it is unlikely to be of direct relevance to the patient population. For example, the authors find an inhibitory effect of PTH on osteoclast surface, which is very unusual. Adding to this concern is that the authors have not described the regions used for histomorphometry, and from their figures (particularly the TRAP stain), it seems that the primary spongiosa (which is a region of growth) has been used for histomorphometry, rather than the secondary spongiosa (which more accurately reflects bone remodelling). Much further detail is needed to justify the use of this very young model, and a section on the limitations of this model is needed. Please provide that section in the revised manuscript.

      (2) It is also somewhat concerning that the age range is from 8-10 weeks, increasing the variability within the model. Did the age of mice differ between the groups analysed?

      (3) Methods are not sufficiently detailed. For example, the regions used for histomorphometry are not described, there is no information on micro-CT thresholds, no detail on the force used for mechanical testing. Please address this request.

      (4) Three things are unclear about the calvarial injection mouse model. Firstly, were the mice injected over the calvariae or with a standard subcutaneous injection (e.g. at the back of the neck)? If they were injected over the calvaria, why were both surfaces measured? Secondly, why was the dose of the R25C-PTH double that of PTH(1-34)? Thirdly, there is no justification for the use of "more intense coloration" as a marker of new bone; this requires calcein labelling to prove that it is new bone. It would be more reliable to measure and report the thickness of the calvaria. Please address these technical questions.

      (5) The presentation of mechanical testing data is not sufficient. Example curves should be shown, and data corrected for bone size need to be shown. The difference in mechanical behaviour is interesting, but does it stem from a difference in the amount of bone, or from a difference in the quality of the bone? Please explain this matter better in the manuscript.

      (6) The micro-CT analysis of the cortical bone in the OVX model is insufficient. Please indicate whether cross-sectional area has increased. Is there an increase in the size of the bones, or is the increase in cortical thickness due to a narrowing of the marrow space? This may help resolve the apparent contradiction between the cortical thickness data (where there is no difference between the two PTH formulations) and the mechanical testing data (where there is a difference). Please explain this matter better in the manuscript.

      (7) The evidence that dimeric PTH has a different effect to monomeric PTH is very slim; I am not sure this is a real effect. Such differences take a long time to sort out (e.g. the field is still trying to determine whether teriparatide and abaloparatide are different). I think the authors need to look more carefully at their data - almost all effects are the same. Ultimately, the statement that dimeric PTH may be a more effective anabolic therapy than monomeric PTH is not supported by the data, and this should be removed. There is little to no difference found between normal PTH and the variant in their effects on calcium and phosphate homeostasis or on bone mass. However, the analysis has been somewhat cursory, with insufficient mechanical testing or cortical data presented. Many of the effects seem to be the same (e.g. cortical thickness, P1NP, ALP, vertebral BV/TV and MAR), but the way it is written it sounds like there is a difference. Please remove some of the unfounded claims that you have made in this manuscript.

      (8) Statistical analysis used multiple t-tests. ANOVA would be more appropriate.
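
      The statistical concern in point (8) can be illustrated with a minimal sketch. It assumes three groups, made-up values that are not taken from the manuscript, and hypothetical helper names (`familywise_error`, `one_way_anova_f`). It shows how the chance of at least one false positive grows when every pair of groups is tested separately (treating the tests as independent, which is only an approximation), and how a single one-way ANOVA F statistic is computed across all groups instead.

```python
from itertools import combinations


def familywise_error(n_tests, alpha=0.05):
    """Chance of at least one false positive across n_tests tests.

    Assumes the tests are independent, which is only an approximation
    for pairwise comparisons that share groups.
    """
    return 1.0 - (1.0 - alpha) ** n_tests


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical values for three groups; purely illustrative, not study data.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]

n_pairs = len(list(combinations(range(len(groups)), 2)))
fwer = familywise_error(n_pairs)
f_stat, df_between, df_within = one_way_anova_f(groups)

print(f"pairwise tests needed: {n_pairs}")
print(f"approximate familywise error at alpha=0.05: {fwer:.4f}")
print(f"one-way ANOVA: F({df_between}, {df_within}) = {f_stat:.2f}")
```

      In practice the F statistic would be converted to a p-value and followed by a post hoc test that corrects for multiple comparisons; the sketch only covers the first step.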

    3. Reviewer #2 (Public Review):

      Summary:

      The study conducted by Noh et al. investigated the effects of parathyroid hormone (PTH) and a dimeric PTH peptide on bone formation and serum biochemistry in ovariectomized mice as a model for postmenopausal osteoporosis. The authors claimed that the dimeric PTH peptide has pharmacological benefits over PTH in promoting bone formation, despite both molecules having similar effects on bone formation and serum Ca2+. However, after careful evaluation, I am not convinced that this manuscript adds a significant contribution to the literature on bone and mineral research.

      Strengths:

      Experiments are well performed, but strengths are limited to the methodology used to evaluate bone formation and serum biochemical analysis.

      Weaknesses:

      (1) Limited significance of this study:
      • This study follows a previous study (not cited) reporting the effect of the dimeric R25CPTH(1-34) on bone regeneration in an osteoporotic dog (Beagle) model (Jeong-Oh Shin et al., eLife 13:RP93830, 2024). It's unclear why the authors tested the dimeric R25C-PTH peptide on a rodent animal model, which has limitations because the healing mechanism of human bone is more similar to that of dogs than to that of mice.
      • The authors should clarify why they tested the effects of dimeric R25CPTH(1-34) and not dimeric R25CPTH(1-84).
      • The study is descriptive with no mechanism.

      (2) Statistics are inadequately described or performed for the experimental design:
      • The statistical analysis in Figure 5 needs to be described more clearly: was a t-test or a one-way ANOVA used?
      • Statistics in Figures 6 and 7 should be performed by one-way ANOVA, which compares the mean values of one variable among three or more groups, and not by t-test.

      (3) Misleading and confused discussion:
      • The first paragraph lacks clarity in the PTH nomenclature and the authors should provide a clear statement that the PTH mutant found in patients is likely a monomeric R25CPTH(1-84), considering that there has been no proof of a dimeric form.
      • Moreover, the authors should discuss the study by White et al. (PNAS 2019), which shows that there are defective PTH1R signaling responses to monomeric R25CPTH(1-34). This results in faster ligand dissociation, rapid receptor recycling, a short cAMP time course, and a loss of calcium ion allosteric effect.
      • The authors should also clarify what they mean by "the dimeric form of R25CPTH can serve as a new peptide ...(lines 328-329)". The dimeric R25CPTH(1-34) induces similar bone anabolic effects and calcemic responses to PTH(1-34), so it is unclear what the new benefit of the dimeric PTH is.

      Please address these concerns.

    1. Author response:

      The following is the authors’ response to the original reviews.

      We are thankful for the handling of our manuscript. The following is a summary of our response and what we have done:

      (1) We are most thankful for the very thorough evaluation of our manuscript.

      (2) We were a bit shocked by the very negative commentary of referee 2.

      (3) We think what put referee 2 off so much is that we were overconfident in the strength of our conclusions. We consider such overconfidence a big mistake. We have revised the manuscript to fix this problem.

      (4) We respond in great depth to all criticism and also go into technicalities.

      (5) We consider the possibility of a mistake. Yet, we carefully weighed the evidence advanced by referee 2 and by us and found that a systematic review supports our conclusions. Hence, we also resist the various attempts to crush our paper.

      (6) We added evidence (peripherin-antibody staining; our novel Figure 2) that suggests we correctly identified the inferior olive.

      (7) The eLife format – in which critical commentary is published along with the paper – is a fantastic venue to publish what appears to be a surprisingly controversial issue.

      eLife assessment

      This potentially valuable study uses classic neuroanatomical techniques and synchrotron X-ray tomography to investigate the mapping of the trunk within the brainstem nuclei of the elephant brain. Given its unique specializations, understanding the somatosensory projections from the elephant trunk would be of general interest to evolutionary neurobiologists, comparative neuroscientists, and animal behavior scientists. However, the anatomical analysis is inadequate to support the authors' conclusion that they have identified the elephant trigeminal sensory nuclei rather than a different brain region, specifically the inferior olive.

      Comment: We are happy that our paper is considered to be potentially valuable. Also, the editors highlight the potential interest of our work for evolutionary neurobiologists, comparative neuroscientists, and animal behavior scientists. The editors are more negative when it comes to our evidence on the identification of the trigeminal nucleus vs the inferior olive. We have five comments on this assessment. (i) We think this assessment is heavily biased by the comments of referee 2. We show that the referee’s comments are more about us than about our paper. Hence, the referee failed to do their job (refereeing our paper) and should not have succeeded in leveling our paper. (ii) We have no ad hoc knock-out experiments to distinguish the trigeminal nucleus vs the inferior olive. Such experiments (extracellular recording & electrolytic lesions, viral tracing) would be done in a week in mice, but they cannot and should not be done in elephants. (iii) We have extraordinary evidence. Nobody has ever described a similarly astonishing match of body (trunk folds) and myeloarchitecture in the brain before. (iv) We show that our assignment of the trigeminal nucleus vs the inferior olive is more plausible than the current hypothesis about the assignment of the trigeminal nucleus vs the inferior olive as defended by referee 2. We think this is why it is important to publish our paper. (v) We think eLife is the perfect place for our publication because the deviating views of referee 2 are published along.

      Change: We performed additional peripherin-antibody staining to differentiate the inferior olive and trigeminal nucleus. Peripherin is a cytoskeletal protein that is found in peripheral nerves and climbing fibers. Specifically, climbing fibers of various species (mouse, rabbit, pig, cow, and human; Errante et al., 1998) are stained intensely with peripherin-antibodies. What is tricky for our purposes is that there is also some peripherin-antibody reactivity in the trigeminal nuclei (Errante et al., 1998). Such peripherin-antibody reactivity is weaker, however, and lacks the distinct axonal bundle signature that stems from the strong climbing fiber peripherin-reactivity as seen in the inferior olive (Errante et al., 1998). As can be seen in our novel Figure 2, we observe peripherin-reactivity in axonal bundles (i.e. in putative climbing fibers), in what we think is the inferior olive. We also observe weak peripherin-reactivity, in what we think is the trigeminal nucleus, but not the distinct and strong labeling of axonal bundles. These observations are in line with our ideas but are difficult to reconcile with the views of the referee. Specifically, the lack of peripherin-reactive axon bundles suggests that there are no climbing fibers in what the referee thinks is the inferior olive.

      Errante, L., Tang, D., Gardon, M., Sekerkova, G., Mugnaini, E., & Shaw, G. (1998). The intermediate filament protein peripherin is a marker for cerebellar climbing fibres. Journal of neurocytology, 27, 69-84.

      Reviewer #1 :

      Summary:

      This fundamental study provides compelling neuroanatomical evidence underscoring the sensory function of the trunk in African and Asian elephants. Whereas myelinated tracts are classically appreciated as mediating neuronal connections, the authors speculate that myelinated bundles provide functional separation of trunk folds and display elaboration related to the "finger" projections. The authors avail themselves of many classical neuroanatomical techniques (including cytochrome oxidase stains, Golgi stains, and myelin stains) along with modern synchrotron X-ray tomography. This work will be of interest to evolutionary neurobiologists, comparative neuroscientists, and the general public, with its fascinating exploration of the brainstem of an iconic sensory specialist.

      Comment: We are incredibly grateful for this positive assessment.

      Changes: None.

      Strengths: 

      - The authors made excellent use of the precious sample materials from 9 captive elephants. 

      - The authors adopt a battery of neuroanatomical techniques to comprehensively characterize the structure of the trigeminal subnuclei and properly re-examine the "inferior olive".

      - Based on their exceptional histological preparation, the authors reveal broadly segregated patterns of metabolic activity, similar to the classical "barrel" organization related to rodent whiskers. 

      Comment: The referee provides a concise summary of our findings.

      Changes: None.

      Weaknesses: 

      - As the authors acknowledge, somewhat limited functional description can be provided using histological analysis (compared to more invasive techniques). 

      - The correlation between myelinated stripes and trunk fold patterns is intriguing, and Figure 4 presents this idea beautifully. I wonder - is the number of stripes consistent with the number of trunk folds? Does this hold for both species? 

      Comment: We agree with the referee’s assessment. We note that cytochrome-oxidase staining is an at least partially functional stain, as it reveals constitutive metabolic activity. A significant problem of the work in elephants is that our recording possibilities are limited, which in turn limits functional analysis. As indicated in Figure 5 (our former Figure 4) for the African elephant Indra, there was an excellent match of trunk folds and myelin stripes. Asian elephants have more, and less conspicuous trunk folds than African elephants. As illustrated in Figure 7, Asian elephants have more, and less conspicuous myelin stripes. Thus, species differences in myelin stripes correlate with species differences in trunk folds.

      Changes: We clarify the relation of myelin stripe and trunk fold patterns in our description of Figure 7.

      Reviewer #2 (Public Review): 

      The authors describe what they assert to be a very unusual trigeminal nuclear complex in the brainstem of elephants, and based on this, follow with many speculations about how the trigeminal nuclear complex, as identified by them, might be organized in terms of the sensory capacity of the elephant trunk.

      Comment: We agree with the referee’s assessment that the putative trigeminal nucleus described in our paper is highly unusual in size, position, vascularization, and myeloarchitecture. This is why we wrote this paper. We think these unusual features reflect the unique facial specializations of elephants, i.e. their highly derived trunk. Because we have no access to recordings from the elephant brainstem, we cannot back up all our functional interpretations with electrophysiological evidence; it is therefore fair to call them speculative.

      Changes: None.

      The identification of the trigeminal nuclear complex/inferior olivary nuclear complex in the elephant brainstem is the central pillar of this manuscript from which everything else follows, and if this is incorrect, then the entire manuscript fails, and all the associated speculations become completely unsupported. 

      Comment: We agree.

      Changes: None.

      The authors note that what they identify as the trigeminal nuclear complex has been identified as the inferior olivary nuclear complex by other authors, citing Shoshani et al. (2006; 10.1016/j.brainresbull.2006.03.016) and Maseko et al (2013; 10.1159/000352004), but fail to cite either Verhaart and Kramer (1958; PMID 13841799) or Verhaart (1962; 10.1515/9783112519882-001). These four studies are in agreement, but the current study differs.

      Comment & Change: We were not aware of the papers of Verhaart and included them in the revised manuscript.

      Let's assume for the moment that the four previous studies are all incorrect and the current study is correct. This would mean that the entire architecture and organization of the elephant brainstem is significantly rearranged in comparison to ALL other mammals, including humans, previously studied (e.g. Kappers et al. 1965, The Comparative Anatomy of the Nervous System of Vertebrates, Including Man, Volume 1 pp. 668-695) and the closely related manatee (10.1002/ar.20573). This rearrangement necessitates that the trigeminal nuclei would have had to "migrate" and shorten rostrocaudally, specifically and only, from the lateral aspect of the brainstem where these nuclei extend from the pons through to the cervical spinal cord (e.g. the Paxinos and Watson rat brain atlases), to the spatially restricted ventromedial region of specifically and only the rostral medulla oblongata. According to the current paper, the inferior olivary complex of the elephant is very small and located lateral to their trigeminal nuclear complex, and the region from where the trigeminal nuclei are located by others appears to be just "lateral nuclei" with no suggestion of what might be there instead.

      Comment: We have three comments here:

      (1) The referee correctly notes that we argue the elephant brainstem underwent fairly major rearrangements. In particular, we argue that the elephant inferior olive was displaced laterally, by a very large cell mass, which we argue is an unusually large trigeminal nucleus. To our knowledge, such a large compact cell mass is not seen in the ventral brain stem of any other mammal.

      (2) The referee makes it sound as if it is our private idea that the elephant brainstem underwent major rearrangements and that the rest of the evidence points to a conventional ‘rodent-like’ architecture. This is far from the truth, however. Already from the outside appearance (see our Figure 1B and Figure 7A) it is clear that the elephant brainstem has huge ventral bumps not seen in any other mammal. An extraordinary architecture also holds at the organizational level of nuclei. Specifically, the facial nucleus – the most carefully investigated nucleus in the elephant brainstem – has an appearance distinct from that of the facial nuclei of all other mammals (Maseko et al., 2013; Kaufmann et al., 2022). If both the overall shape and the constituting nuclei of the brainstem are very different from other mammals, it is very unlikely if not impossible that the elephant brainstem follows in all regards a conventional ‘rodent-like’ architecture.

      (3) The inferior olive is an impressive nucleus in the partitioning scheme we propose (Figure 2). In fact – together with the putative trigeminal nucleus we describe – it’s the most distinctive nucleus in the elephant brainstem. We have not done volumetric measurements and cell counts here, but think this is an important direction for future work. What has informed our work is that the inferior olive nucleus we describe has the serrated organization seen in the inferior olive of all mammals. We will discuss these matters in depth below.

      Changes: None.

      Such an extraordinary rearrangement of brainstem nuclei would require a major transformation in the manner in which the mutations, patterning, and expression of genes and associated molecules during development occur. Such a major change is likely to lead to lethal phenotypes, making such a transformation extremely unlikely. Variations in mammalian brainstem anatomy are most commonly associated with quantitative changes rather than qualitative changes (10.1016/B978-0-12-804042-3.00045-2). 

      Comment: We have two comments here:

      (1) The referee claims that it is impossible that the elephant brainstem differs from a conventional brainstem architecture because this would lead to lethal phenotypes etc. Following our previous response, this argument does not hold. It is beyond question that the elephant brainstem looks very different from the brainstem of other mammals. Yet, it is also evident that elephants live. The debate we need to have is not if the elephant brainstem differs from other mammals, but how it differs from other mammals.

      (2) In principle we agree with the referee’s thinking that the model of the elephant brainstem that is most likely to be correct is the one that requires the least amount of rearrangements to other mammals. We therefore prepared a comparison of the model the referee is proposing (Maseko et al., 2013; see Referee Table 1 below) with our proposition. We scored these models on their similarity to other mammals. We find that the referee’s ideas (Maseko et al., 2013) require more rearrangements relative to other mammals than our suggestion.

      Changes: Inclusion of Referee Table 1, which we discuss in depth below.

      The impetus for the identification of the unusual brainstem trigeminal nuclei in the current study rests upon a previous study from the same laboratory (10.1016/j.cub.2021.12.051) that estimated that the number of axons contained in the infraorbital branch of the trigeminal nerve that innervate the sensory surfaces of the trunk is approximately 400 000. Is this number unusual? In a much smaller mammal with a highly specialized trigeminal system, the platypus, the number of axons innervating the sensory surface of the platypus bill skin comes to 1 344 000 (10.1159. Yet, there is no complex rearrangement of the brainstem trigeminal nuclei in the brain of the developing or adult platypus (Ashwell, 2013, Neurobiology of Monotremes), despite the brainstem trigeminal nuclei being very large in the platypus (10.1159/000067195). Even in other large-brained mammals, such as large whales that do not have a trunk, the number of axons in the trigeminal nerve ranges between 400,000 and 500,000 (10.1007. The lack of comparative support for the argument forwarded in the previous and current study from this laboratory, and that the comparative data indicates that the brainstem nuclei do not change in the manner suggested in the elephant, argues against the identification of the trigeminal nuclei as outlined in the current study. Moreover, the comparative studies undermine the prior claim of the authors, informing the current study, that "the elephant trigeminal ganglion ... point to a high degree of tactile specialization in elephants" (10.1016/j.cub.2021.12.051). While clearly, the elephant has tactile sensitivity in the trunk, it is questionable as to whether what has been observed in elephants is indeed "truly extraordinary".

      Comment: These comments made us think that the referee is not talking about the paper we submitted, but that the referee is talking about us and our work in general. Specifically, the referee refers to the platypus and other animals dismissing our earlier work, which argued for a high degree of tactile specialization in elephants. We think the referee’s intuitions are wrong and our earlier work is valid.

      Changes: We prepared Author response image 1 (below), which puts the platypus brain, a monkey brain, and the elephant trigeminal ganglion (which contains a large part of the trunk innervating cells) in perspective.

      Author response image 1.

      The elephant trigeminal ganglion is comparatively large. Platypus brain, monkey brain, and elephant ganglion. The elephant has two trigeminal ganglia, which contain the first-order somatosensory neurons. They serve mainly for tactile processing and are large compared to a platypus brain (from the comparative brain collection) and are similar in size to a monkey brain. The idea that elephants might be highly specialized for trunk touch is also supported by the analysis of the sensory nerves of these animals (Purkart et al., 2022). Specifically, we find that the infraorbital nerve (which innervates the trunk) is much thicker than the optic nerve (which mediates vision) and the vestibulocochlear nerve (which mediates hearing). Thus, not everything is large about elephants; instead, the data argue that these animals are heavily specialized for trunk touch.

      But let's look more specifically at the justification outlined in the current study to support their identification of the unusually located trigeminal sensory nuclei of the brainstem. 

      (1) Intense cytochrome oxidase reactivity.

      (2) Large size of the putative trunk module.

      (3) Elongation of the putative trunk module.

      (4) The arrangement of these putative modules corresponds to elephant head anatomy.

      (5) Myelin stripes within the putative trunk module that apparently match trunk folds.

      (6) Location apparently matches other mammals.

      (7) Repetitive modular organization apparently similar to other mammals.

      (8) The inferior olive described by other authors lacks the lamellated appearance of this structure in other mammals.

      Comment: We agree those are key issues.

      Changes: None.

      Let's examine these justifications more closely.

      (1) Cytochrome oxidase histochemistry is typically used as an indicative marker of neuronal energy metabolism. The authors indicate, based on the "truly extraordinary" somatosensory capacities of the elephant trunk, that any nuclei processing this tactile information should be highly metabolically active, and thus should react intensely when stained for cytochrome oxidase. We are told in the methods section that the protocols used are described by Purkart et al (2022) and Kaufmann et al (2022). In neither of these cited papers is there any description, nor mention, of the cytochrome oxidase histochemistry methodology, thus we have no idea of how this histochemical staining was done. To obtain the best results for cytochrome oxidase histochemistry, the tissue is either processed very rapidly after buffer perfusion to remove blood or in recently perfusion-fixed tissue (e.g., 10.1016/0165-0270(93)90122-8). Given: (1) the presumably long post-mortem interval between death and fixation - "it often takes days to dissect elephants"; (2) subsequent fixation of the brains in 4% paraformaldehyde for "several weeks"; (3) The intense cytochrome oxidase reactivity in the inferior olivary complex of the laboratory rat (Gonzalez-Lima, 1998, Cytochrome oxidase in neuronal metabolism and Alzheimer's diseases); and (4) The lack of any comparative images from other stained portions of the elephant brainstem; it is difficult to support the justification as forwarded by the authors. The histochemical staining observed is likely background reactivity from the use of diaminobenzidine in the staining protocol. Thus, this first justification is unsupported. 

      Comment: The referee correctly notes that the description of our cytochrome-oxidase reactivity staining was lacking. This is a serious mistake of ours for which we apologize very much. The referee then makes it sound as if we messed up our cytochrome-oxidase staining, which is not the case. All successful (n = 3; please see our technical comments in the recommendation section) cytochrome-oxidase stainings were done with elephants with short post-mortem times (≤ 2 days) to brain removal/cooling and only brief immersion fixation (≤ 1 day). Cytochrome-oxidase reactivity in elephant brains appears to be more sensitive to quenching by fixation than is the case for rodent brains. We think it is a good idea to include a cytochrome-oxidase staining overview picture because we understood from the referee’s comments that we need to compare our partitioning scheme of the brainstem with that of other authors. To this end, we add a cytochrome-oxidase staining overview picture (Author response image 2) along with an alternative interpretation from Maseko et al., 2013.

      Changes: (1) We added details on our cytochrome-oxidase reactivity staining protocol and the cytochrome-oxidase reactivity in the elephant brain in the manuscript and in our response to the general recommendations.

      (2) We provide a detailed discussion of the technicalities of cytochrome-oxidase staining below in the recommendation section, where the referee raised further criticisms.

      (3) We include a cytochrome-oxidase staining overview picture (Author response image 2) along with an alternative interpretation from Maseko et al., 2013.

      Author response image 2.

      Cytochrome-oxidase staining overview. Coronal cytochrome-oxidase staining overview from African elephant cow Indra; the section is taken a few millimeters posterior to the facial nucleus. Brown is putatively neural cytochrome-reactivity, and white is the background. Black is myelin diffraction and (seen at higher resolution, when you zoom in) erythrocyte cytochrome-reactivity in blood vessels (see our Figure 1E-G); such blood vessel cytochrome-reactivity is seen, because we could not perfuse the animal. There appears to be a minimal outside-in-fixation artifact (i.e. a more whitish/non-brownish appearance of the section toward the borders of the brain). This artifact is not seen in sections from Indra that we processed earlier or in other elephant brains processed at shorter post-mortem/fixation delays (see our Figure 1C).

      The same structures can be recognized in Author response image 2 and Supplementary figure 36 of Maseko et al. (2013). The section is taken at an anterior-posterior level, where we encounter the trigeminal nuclei in pretty much all mammals. Note that the neural cytochrome reactivity is very high, in what we refer to as the trigeminal-nuclei-trunk-module and what Maseko et al. refer to as inferior olive. Myelin stripes can be recognized here as white omissions.

      At the same time, the cytochrome-oxidase-reactivity is very low in what Maseko et al. refer to as trigeminal nuclei. The indistinct appearance and low cytochrome-oxidase-reactivity of the trigeminal nuclei in the scheme of Maseko et al. (2013) is unexpected because trigeminal nuclei stain intensely for cytochrome-oxidase-reactivity in most mammals and because the trigeminal nuclei represent the elephant’s most important body part, the trunk. Staining patterns of the trigeminal nuclei as identified by Maseko et al. (2013) are very different at more posterior levels; we will discuss this matter below.

      Justifications (2), (3), and (4) are sequelae from justification (1). In this sense, they do not count as justifications, but rather unsupported extensions. 

      Comment: These are key points of our paper that the referee does not discuss.

      Changes: None.

      (4) and (5) These are interesting justifications, as the paper has clear internal contradictions, and (5) is a sequela of (4). The reader is led to the concept that the myelin tracts divide the nuclei into sub-modules that match the folding of the skin on the elephant trunk. One would then readily presume that these myelin tracts are in the incoming sensory axons from the trigeminal nerve. However, the authors note that this is not the case: "Our observations on trunk module myelin stripes are at odds with this view of myelin. Specifically, myelin stripes show no tapering (which we would expect if axons divert off into the tissue). More than that, there is no correlation between myelin stripe thickness (which presumably correlates with axon numbers) and trigeminal module neuron numbers. Thus, there are numerous myelinated axons, where we observe few or no trigeminal neurons. These observations are incompatible with the idea that myelin stripes form an axonal 'supply' system or that their prime function is to connect neurons. What do myelin stripe axons do, if they do not connect neurons? We suggest that myelin stripes serve to separate rather than connect neurons." So, we are left with the observation that the myelin stripes do not pass afferent trigeminal sensory information from the "truly extraordinary" trunk skin somatic sensory system, and rather function as units that separate neurons - but to what end? It appears that the myelin stripes are more likely to be efferent axonal bundles leaving the nuclei (to form the olivocerebellar tract). This justification is unsupported.

      Comment: The referee cites some of our observations on myelin stripes, which we find unusual. We stand by the observations and comments. The referee does not discuss the most crucial finding we report on myelin stripes, namely that they correspond remarkably well to trunk folds.

      Changes: None.

      (6) The authors indicate that the location of these nuclei matches that of the trigeminal nuclei in other mammals. This is not supported in any way. In ALL other mammals in which the trigeminal nuclei of the brainstem have been reported they are found in the lateral aspect of the brainstem, bordered laterally by the spinal trigeminal tract. This is most readily seen and accessible in the Paxinos and Watson rat brain atlases. The authors indicate that the trigeminal nuclei are medial to the facial nerve nucleus, but in every other species, the trigeminal sensory nuclei are found lateral to the facial nerve nucleus. This is most salient when examining a close relative, the manatee (10.1002/ar.20573), where the location of the inferior olive and the trigeminal nuclei matches that described by Maseko et al (2013) for the African elephant. This justification is not supported. 

      Comment: The referee notes that we incorrectly state that the position of the trigeminal nuclei matches that of other mammals. We think this criticism is justified.

      Changes: We prepared a comparison of the Maseko et al. (2013) scheme of the elephant brainstem with our scheme of the elephant brainstem (see Author response table 1 below). Here we acknowledge the referee’s argument, and we also changed the manuscript accordingly.

      (7) The dual to quadruple repetition of rostrocaudal modules within the putative trigeminal nucleus as identified by the authors relies on the fact that in the neurotypical mammal, there are several trigeminal sensory nuclei arranged in a column running from the pons to the cervical spinal cord, these include (nomenclature from Paxinos and Watson in roughly rostral to caudal order) the Pr5VL, Pr5DM, Sp5O, Sp5I, and Sp5C. However, these nuclei are all located far from the midline and lateral to the facial nerve nucleus, unlike what the authors describe in the elephants. These rostrocaudal modules are expanded upon in Figure 2, and it is apparent from what is shown that the authors are attributing other brainstem nuclei to the putative trigeminal nuclei to confirm their conclusion. For example, what they identify as the inferior olive in Figure 2D is likely the lateral reticular nucleus as identified by Maseko et al (2013). This justification is not supported.

      Comment: The referee again compares our findings to the scheme of Maseko et al. (2013) and rejects our conclusions on those grounds. We agree that such a comparison with our scheme is indeed needed.

      Changes: We prepared a comparison of the Maseko et al. (2013) scheme of the elephant brainstem with our scheme of the elephant brainstem (see Author response table 1 below).

      (8) In primates and related species, there is a distinct banded appearance of the inferior olive, but what has been termed the inferior olive in the elephant by other authors does not have this appearance, rather, and specifically, the largest nuclear mass in the region (termed the principal nucleus of the inferior olive by Maseko et al, 2013, but Pr5, the principal trigeminal nucleus in the current paper) overshadows the partial banded appearance of the remaining nuclei in the region (but also drawn by the authors of the current paper). Thus, what is at debate here is whether the principal nucleus of the inferior olive can take on a nuclear shape rather than evince a banded appearance. The authors of this paper use this variance as justification that this cluster of nuclei could not possibly be the inferior olive. Such a "semi-nuclear/banded" arrangement of the inferior olive is seen in, for example, giraffe (10.1016/j.jchemneu.2007.05.003), domestic dog, polar bear, and most specifically the manatee (a close relative of the elephant) (brainmuseum.org; 10.1002/ar.20573). This justification is not supported. 

      Comment: We carefully looked at the brain sections referred to by the referee in the brainmuseum.org collection. We found, contrary to the referee’s claims, that dogs, polar bears, and manatees have a perfectly serrated appearance (a cellular arrangement in curved bands) of the inferior olive. Accordingly, we think the referee is not reporting the comparative evidence fairly and we wonder why this is the case.

      Changes: None.

      Thus, all the justifications forwarded by the authors are unsupported. Based on methodological concerns, prior comparative mammalian neuroanatomy, and prior studies in the elephant and closely related species, the authors fail to support their notion that what was previously termed the inferior olive in the elephant is actually the trigeminal sensory nuclei. Given this failure, the justifications provided above that are sequelae also fail. In this sense, the entire manuscript and all the sequelae are not supported.

      Comment: We disagree. To summarize:

      (1) Our description of the cytochrome oxidase staining lacked methodological detail, which we have now added; the cytochrome oxidase reactivity data are great and support our conclusions.

      (2)–(5) The referee does not really discuss our evidence on these points.

      (6) We were wrong and have now fixed this mistake.

      (7) The referee asks for a comparison to the Maseko et al. (2013) scheme (agreed, see Author response table 1).

      (8) The referee bends the comparative evidence against us.

      Changes: None.

      A comparison of the elephant brainstem partitioning schemes put forward by Maseko et al 2013 and by Reveyaz et al.

      To start with, we would like to express our admiration for the work of Maseko et al. (2013). These authors did pioneering work on obtaining high-quality histology samples from elephants. Moreover, they made a heroic neuroanatomical effort, in which they assigned 147 brain structures to putative anatomical entities. Most of their data appear to refer to staining in a single elephant and one coronal sectioning plane. The data quality and the illustration of results are excellent.

      We studied mainly two large nuclei in six (now seven) elephants in three (coronal, parasagittal, and horizontal) sectioning planes. The two nuclei in question are the two most distinct nuclei in the elephant brainstem, namely an anterior ventromedial nucleus (the trigeminal trunk module in our terminology; the inferior olive in the terminology of Maseko et al., 2013) and a more posterior lateral nucleus (the inferior olive in our terminology; the posterior part of the trigeminal nuclei in the terminology of Maseko et al., 2013).

      Author response image 3 gives an overview of the two partitioning schemes for inferior olive/trigeminal nuclei along with the rodent organization (see below).

      Author response image 3.

      Overview of the brainstem organization in rodents & elephants

      The strength of the Maseko et al. (2013) scheme is the excellent match of the position of elephant nuclei to the position of nuclei in the rodent (Author response image 3). We think this positional match reflects the fact that Maseko et al. (2013) mapped a rodent partitioning scheme onto the elephant brainstem. To us, this is a perfectly reasonable mapping approach. As the referee correctly points out, the positional similarity of both the elephant inferior olive and trigeminal nuclei to those of the rodent strongly argues in favor of the Maseko et al. (2013) scheme, because brainstem nuclei are positionally very conservative.

      Other features of the Maseko et al. (2013) scheme are less favorable. The scheme marries two cyto-architectonically very distinct divisions (an anterior indistinct part and a super-distinct serrated posterior part) to be the trigeminal nuclei. We think merging entirely distinct subdivisions into one nucleus is a byproduct of mapping a rodent partitioning scheme onto the elephant brainstem. Neither of the two subdivisions resembles the trigeminal nuclei of other mammals. The cytochrome oxidase staining patterns differ markedly across the anterior indistinct part (see our Author response image 3) and the posterior part of the trigeminal nuclei and do not match the intense cytochrome oxidase reactivity of other mammalian trigeminal nuclei (Author response image 2). Our anti-peripherin staining (the novel Figure 2 of our manuscript) indicates that there are probably no climbing fibers in what Maseko et al. think is the inferior olive; this is a potentially fatal problem for the hypothesis. The posterior part of the Maseko et al. (2013) trigeminal nuclei has a distinct serrated appearance that is characteristic of the inferior olive in other mammals. Moreover, the inferior olive of Maseko et al. (2013) lacks the serrated appearance of the inferior olive seen in pretty much all mammals; this is a serious problem.

      The partitioning scheme of Reveyaz et al. comes with poor positional similarity but avoids the other problems of the Maseko et al. (2013) scheme. Our explanation for the positionally deviating location of the trigeminal nuclei is that the elephant grew one of the largest, if not the largest, trigeminal systems of all mammals. As a result, the trigeminal nuclei grew through the floor of the brainstem. We understand this is a post hoc just-so explanation, but at least it is an explanation.

      The scheme of Reveyaz et al. was derived in an entirely different way from the Maseko model. Specifically, we were convinced that the elephant trigeminal nuclei ought to be very special because of the gigantic trigeminal ganglia (Purkart et al., 2022). Cytochrome-oxidase staining revealed a large distinct nucleus with an elongated shape. Initially, we were freaked out by the position of the nucleus and the fact that it was referred to as inferior olive by other authors. When we found an inferior-olive-like nucleus at a nearby (although at an admittedly unusual) location, we were less worried. We then optimized the visualization of myelin stripes (brightfield imaging etc.) and were able to collect an entire elephant trunk along with the brain (African elephant cow Indra). When we made the one-to-one match of Indra’s trunk folds and myelin stripes (former Figure 4, now Figure 5) we were certain that we had identified the trunk module of the trigeminal nuclei. We already noted at the outset of our rebuttal that we now consider such certainty a fallacy of overconfidence. In light of the comments of Referee 2, we feel that a further discussion of our ideas is warranted.

      A strength of the Reveyaz model is that nuclei look like single anatomical entities. The trigeminal nuclei look like trigeminal nuclei of other mammals, the trunk module has a striking resemblance to the trunk and the inferior olive looks like the inferior olive of other mammals.

      We evaluated the fit of the two models in the form of a table (Author response table 1; below). Unsurprisingly, Author response table 1 aligns with our views of elephant brainstem partitioning.

      Author response table 1

      Qualitative evaluation of elephant brainstem partitioning schemes

      ++ = Very attractive; + = attractive; - = unattractive; -- = very unattractive

      We scored features that are clear and shared by all mammals – as far as we know them – as very attractive.

      We scored features that are clear and are not shared by all mammals – as far as we know them – as very unattractive.

      Attractive features are either less clear or less well-shared features.

      Unattractive features are either less clear or less clearly not shared features.

      Author response table 1 suggests two conclusions to us. (i) The Reveyaz et al. model has mainly favorable properties. The Maseko et al. (2013) model has mainly unfavorable properties. Hence, the Reveyaz et al. model is more likely to be true. (ii) The outcome is not black and white, i.e., both models have favorable and unfavorable properties. Accordingly, we overstated our case in our initial submission and toned down our claims in the revised manuscript.

      What the authors have not done is to trace the pathway of the large trigeminal nerve in the elephant brainstem, as was done by Maseko et al (2013), which clearly shows the internal pathways of this nerve, from the branch that leads to the fifth mesencephalic nucleus adjacent to the periventricular grey matter, through to the spinal trigeminal tract that extends from the pons to the spinal cord in a manner very similar to all other mammals. Nor have they shown how the supposed trigeminal information reaches the putative trigeminal nuclei in the ventromedial rostral medulla oblongata. These are but two examples of many specific lines of evidence that would be required to support their conclusions. Clearly, tract tracing methods, such as cholera toxin tracing of peripheral nerves cannot be done in elephants, thus the neuroanatomy must be done properly and with attention to detail to support the major changes indicated by the authors. 

      Comment: The referee claims that Maseko et al. (2013) showed by ‘tract tracing’ that the structures they refer to as trigeminal nuclei receive trigeminal input. This statement is at least slightly misleading. There is nothing that amounts to proper ‘tract tracing’ in the Maseko et al. (2013) paper, i.e., tracing of tracts with post-mortem tracers. We tried proper post-mortem tracing but failed (no tracer transport), probably as a result of the limitations of our elephant material. What Maseko et al. (2013) actually did is look a bit for putative trigeminal fibers and where they might go. We also used this approach. In our hands, such ‘pseudo tract tracing’ works best in unstained material under bright field illumination, because myelin is very well visualized. In such material, we find: (i) Massive fiber tracts descend dorsoventrally roughly from where both Maseko et al. (2013) and we think the trigeminal tract runs. (ii) These fiber tracts run dorsoventrally and approach what we think are the trigeminal nuclei from the lateral side.

      Changes: Ad hoc tract tracing; see above.

      So what are these "bumps" in the elephant brainstem? 

      Four previous authors indicate that these bumps are the inferior olivary nuclear complex. Can this be supported?

      The inferior olivary nuclear complex acts "as a relay station between the spinal cord (n.b. trigeminal input does reach the spinal cord via the spinal trigeminal tract) and the cerebellum, integrating motor and sensory information to provide feedback and training to cerebellar neurons" (https://www.ncbi.nlm.nih.gov/books/NBK542242/). The inferior olivary nuclear complex is located dorsal and medial to the pyramidal tracts (which were not labeled in the current study by the authors but are clearly present in Fig. 1C and 2A) in the ventromedial aspect of the rostral medulla oblongata. This is precisely where previous authors have identified the inferior olivary nuclear complex and what the current authors assign to their putative trigeminal nuclei. The neurons of the inferior olivary nuclei project, via the olivocerebellar tract to the cerebellum to terminate in the climbing fibres of the cerebellar cortex.

      Comment: We agree with the referee that in the Maseko et al. (2013) scheme the inferior olive is exactly where we expect it from pretty much all other mammals. Hence, this is a strong argument in favor of the Maseko et al. (2013) scheme and a strong argument against the partitioning scheme suggested by us.

      Changes: Please see our discussion above.

      Elephants have the largest (relative and absolute) cerebellum of all mammals (10.1002/ar.22425), this cerebellum contains 257 × 10^9 neurons (10.3389/fnana.2014.00046; three times more than the entire human brain, 10.3389/neuro.09.031.2009). Each of these neurons appears to be more structurally complex than the homologous neurons in other mammals (10.1159/000345565; 10.1007/s00429-010-0288-3). In the African elephant, the neurons of the inferior olivary nuclear complex are described by Maseko et al (2013) as being both calbindin and calretinin immunoreactive. Climbing fibres in the cerebellar cortex of the African elephant are clearly calretinin immunopositive and also are likely to contain calbindin (10.1159/000345565). Given this, would it be surprising that the inferior olivary nuclear complex of the elephant is enlarged enough to create a very distinct bump in exactly the same place where these nuclei are identified in other mammals? 

      Comment: We agree with the referee that it is possible and even expected from other mammals that there is an enlargement of the inferior olive in elephants. Hence, a priori one might expect the ventral brainstem bumps to be the inferior olive; this is perfectly reasonable and is what was done by previous authors. The referee also refers to calbindin and calretinin antibody reactivity. Such antibody reactivity is indeed in line with the referee’s ideas and we considered these findings in our Author response table 1. The problem is, however, that neither calbindin nor calretinin antibody reactivity is highly specific, and indeed both nuclei under discussion (trigeminal nuclei and inferior olive) show such reactivity. Unlike the peripherin-antibody staining advanced by us, neither calbindin nor calretinin antibody reactivity can distinguish between the two hypotheses debated.

      Changes: Please see our discussion above.

      What about the myelin stripes? These are most likely to be the origin of the olivocerebellar tract and probably only have a coincidental relationship with the trunk. Thus, given what we know, the inferior olivary nuclear complex as described in other studies, and the putative trigeminal nuclear complex as described in the current study, is the elephant inferior olivary nuclear complex. It is not what the authors believe it to be, and they do not provide any evidence that discounts the previous studies. The authors are quite simply put, wrong. All the speculations that flow from this major neuroanatomical error are therefore science fiction rather than useful additions to the scientific literature. 

      Comment: It is unlikely that the myelin stripes are the origin of the olivocerebellar tract as suggested by the referee. Specifically, the lack of peripherin-reactivity indicates that these fibers are not climbing fibers (our novel Figure 2). In general, we feel the referee does not want to discuss the myelin stripes and obviously thinks we made up the strange correspondence of myelin stripes and trunk folds.

      Changes: Please see our discussion above.

      What do the authors actually have? 

      The authors have interesting data, based on their Golgi staining and analysis, of the inferior olivary nuclear complex in the elephant.

      Comment: The referee reiterates their views.

      Changes: None.

      Reviewer #3 (Public Review):

      Summary: 

      The study claims to investigate trunk representations in elephant trigeminal nuclei located in the brainstem. The researchers identified large protrusions visible from the ventral surface of the brainstem, which they examined using a range of histological methods. However, this ventral location is usually where the inferior olivary complex is found, which challenges the authors' assertions about the nucleus under analysis. They find that this brainstem nucleus of elephants contains repeating modules, with a focus on the anterior and largest unit which they define as the putative nucleus principalis trunk module of the trigeminal. The nucleus exhibits low neuron density, with glia outnumbering neurons significantly. The study also utilizes synchrotron X-ray phase contrast tomography to suggest that myelin-stripe-axons traverse this module. The analysis maps myelin-rich stripes in several specimens and concludes that based on their number and patterning they likely correspond with trunk folds; however, this conclusion is not well supported if the nucleus has been misidentified.

      Comment: The referee gives a concise summary of our findings. The referee acknowledges the depth of our analysis and also notes our cellular results. The referee – in line with the comments of Referee 2 – also points out that a misidentification of the nucleus under study is potentially fatal for our analysis. We thank the referee for this fair assessment.

      Changes: We feel that we need to alert the reader more broadly to the misidentification concern. We think the critical comments of Referee 2, which will be published along with our manuscript, will go a long way in doing so. We think the eLife publishing format is fantastic in this regard. We will also include pointers to these concerns in the revised manuscript.

      Strengths: 

      The strength of this research lies in its comprehensive use of various anatomical methods, including Nissl staining, myelin staining, Golgi staining, cytochrome oxidase labeling, and synchrotron X-ray phase contrast tomography. The inclusion of quantitative data on cell numbers and sizes, dendritic orientation and morphology, and blood vessel density across the nucleus adds a quantitative dimension. Furthermore, the research is commendable for its high-quality and abundant images and figures, effectively illustrating the anatomy under investigation.

      Comment: Again, a very fair and balanced set of comments. We are thankful for these comments.

      Changes: None.

      Weaknesses: 

      While the research provides potentially valuable insights if revised to focus on the structure that appears to be the inferior olivary nucleus, there are certain additional weaknesses that warrant further consideration. First, the suggestion that myelin stripes solely serve to separate sensory or motor modules rather than functioning as an "axonal supply system" lacks substantial support due to the absence of information about the neuronal origins and the termination targets of the axons. Postmortem fixed brain tissue limits the ability to trace full axon projections. While the study acknowledges these limitations, it is important to exercise caution in drawing conclusions about the precise role of myelin stripes without a more comprehensive understanding of their neural connections.

      Comment: The referee points out a significant weakness of our study, namely our limited understanding of the origin and targets of the axons constituting the myelin stripes. We are very much aware of this problem and this is also why we directed high-powered methodology like synchrotron X-ray tomograms to elucidate the structure of myelin stripes. Such analysis led to advances, i.e., we now think that what look like stripes are bundles, and we understand that the constituent axons tend to traverse the module. Such advances are insufficient, however, to provide a clear picture of myelin stripe connectivity.

      Changes: We think solving the problems raised by the referee will require long-term methodological advances and hence we will not be able to solve these problems in the current revision. Our long-term plans for confronting these issues are the following: (i) Improving our understanding of long-range connectivity by post-mortem tracing and MR-based techniques such as Diffusion-Tensor-Imaging. (ii) Improving our understanding of mid- and short-range connectivity by applying even larger synchrotron X-ray tomograms and possibly serial EM.

      Second, the quantification presented in the study lacks comparison to other species or other relevant variables within the elephant specimens (i.e., whole brain or brainstem volume). The absence of comparative data for different species limits the ability to fully evaluate the significance of the findings. Comparative analyses could provide a broader context for understanding whether the observed features are unique to elephants or more common across species. This limitation in comparative data hinders a more comprehensive assessment of the implications of the research within the broader field of neuroanatomy. Furthermore, the quantitative comparisons between African and Asian elephant specimens should include some measure of overall brain size as a covariate in the analyses. Addressing these weaknesses would enable a richer interpretation of the study's findings.

      Comment: The referee suggests another series of topics, which include the analysis of brain part volumes or overall brain size. We agree these are important issues, but we also think such questions are beyond the scope of our study.

      Changes: We hope to publish comparative data on elephant brain size and shape later this year.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      I realize that elephant brains are a limiting resource in this project, along with the ability to perform functional investigations. However, I believe that Prof. Jon Kaas (Vanderbilt University) has one or more series of Nissl-stained brainstems from elephants. These might be of potential interest, as they were previously used to explore general patterns of trigeminal brainstem organization in a comparative manner (see Sawyer and Sarko, 2017, "Comparative Anatomy and Evolution of the Somatosensory Brain Stem" in the Evolution of Nervous System series) and might shed light on the positioning of the trigeminal complex and IO, with parts of the trigeminal nerve itself still attached to these sections.

      Comment: The referee suggests adding data from more elephants and we think this is a great suggestion because our ns are small. We followed this advice. We agree we need more comparative neuroanatomy of elephants and the urgency of this matter is palpable in the heated debate we have with Referee 2. Specifically, we need more long-range and short-range analysis of elephant brains.

      Changes: We plan to include data in the revised manuscript about cytoarchitectonics (Nissl), cytochrome-oxidase reactivity, and possibly also antibody reactivity from an additional animal, i.e., from the African elephant cow Bibi. The quality of this specimen is excellent and the post-mortem time to brain extraction was very short.

      We also have further plans for connectivity analysis (see our response above), but such data will not become available fast enough for the revision.

      Other recommendations: 

      - A general schematic showing input from trunk to PrV to the trigeminal subnuclei (as well as possibly ascending connections) might be informative to the reader, in terms of showing which neural relay is being examined.

      Comment: We think this is a very good suggestion in principle, but we were not satisfied with the schematics we came up with.

      Changes: None.

      - Perhaps a few more sentences describing the significance of synchrotron tomography for those who may be unfamiliar.

      Comment & Change: We agree and implement this suggestion.

      - "Belly-shaped" trunk module description is unclear on page 9. 

      Comment & Change: We clarified this matter.

      - Typo on the last sentence of page 9. 

      Comment & Change: We fixed this mistake.

      Reviewer #2 (Recommendations For The Authors): 

      The data is only appropriate for a specialized journal and is limited to the Golgi analysis of neurons within the inferior olivary complex of the elephant. This reviewer considers that the remainder of the work is speculation and that the paper in its current version is not salvageable.

      Comment: Rather than suggesting changes, the referee makes it clear that the referee does not want to see our paper published. We think this desire to reject is not rooted in a lack of quality of our work. In fact, we did an immense amount of work: detailed cytoarchitectonic analysis of six (now seven) elephant brainstems (rather than one, as in the case of our predecessors), cell counts, and X-ray tomography. Instead, we think the problem is rooted in the fact that we contradict the referee. To us, such suppression of diverging opinions – provided they are backed up with data – is a scientifically deeply unhealthy attitude. Science lives on debate, and this is why we did not exclude any referees even though we knew that our results do not align with the views of all of the few actors in the field.

      Changes: We think the novel eLife publishing scheme was developed to prevent such abuse. We look forward to having our data published along with the harsh comments of the referee. The readers and subsequent scientific work will determine who’s right and who’s wrong.

      In order to convince readers of the grand changes to the organization of the brainstem in a species suggested by the authors the data presented needs to be supported. It is not. 

      Comment: Again, this looks to us like more of the ‘total-rejection-commentary’ than like an actual recommendation.

      Changes: None.

      The protocol for the cytochrome oxidase histochemistry is not available in the locations indicated by the authors, and it is very necessary to provide this, as I fully believe that the staining obtained is not real, given the state of the tissue used. 

      Comment: We apologize again for not including the necessary details on our cytochrome-oxidase staining.

      From these comments (and the initial comments above) it appears that the referee is uncertain about the validity of cytochrome-oxidase staining. We (M.B., the senior author) have been doing this particular stain for approximately three decades. The referee being unfamiliar with cytochrome-oxidase staining is fine, but we can’t comprehend how the referee then comes to the ‘full belief’ that our staining patterns are ‘not real’ when the visual evidence indicates the opposite. We feel the referee does not want to believe our data.

      From hundreds of permutations, we can assure the referee that cytochrome-oxidase staining can go wrong in many ways. The most common failure outcome in elephants is a uniform light brown stain after hours or days of the cytochrome-oxidase reaction. This outcome is closely associated with long (≥2 days) post-mortem/fixation times and reflects the quenching of cytochrome-oxidases by fixation. Interestingly, cytochrome-oxidase staining in elephant brains is distinctly more sensitive to quenching by fixation than cytochrome-oxidase staining in rodent brains. Another, rarer failure of cytochrome-oxidase staining comes as entirely white or barely colored sections; this outcome is usually associated with a bad reagent (most commonly old DAB, but occasionally also old or bad catalase, in case you are using a staining protocol with catalase). Another nasty cytochrome-oxidase staining outcome is smeary all-black sections. In this case, a black precipitate sticks to sections and screws up the staining (filtering and more gradual heating of the staining solution usually solve this problem). Thus, you can get uniformly white, uniformly light brown, and smeary black sections as cytochrome-oxidase staining failures. What you never get from cytochrome-oxidase staining as an artifact are sections with a strong brown to lighter brown differential contrast. All sections with strong brown to lighter brown differential contrast (staining successes) show one and the same staining pattern in a given brain area, i.e., brownish barrels in the rodent cortex, brownish barrelettes (trigeminal nuclei) in the rodent brainstem, brownish putative trunk modules/inferior olives (if we believe the referee) in the elephant brainstem. Cytochrome-oxidase reactivity is in this regard remarkably different from antibody staining. In antibody staining you can get all kinds of interesting differential contrast staining patterns, which mean nothing. Such differential contrast artifacts in antibody staining arise as a result of insufficient primary antibody specificity, non-specific binding of the secondary antibody, and a variety of other reasons. The reason that the brown differential contrast of the cytochrome-oxidase reaction is pretty much fool-proof relates to the histochemical staining mechanism, which is based on the supply of specific substrates to a universal mitochondrial enzyme. The ability to reveal mitochondrial metabolism and the universal and ‘fool-proof’ staining qualities make the cytochrome-oxidase reactivity a fantastic tool for comparative neuroscience, where you always struggle with insufficient information about antigen reactivity.

      We also note that the contrast of cytochrome-oxidase reactivity seen in the elephant brainstem is spectacular. As the referee can see in our Figure 1C, we observe a dark brown color in the putative trunk module, with the rest of the brain being close to white. Such striking cytochrome-oxidase reactivity contrast has been observed only very rarely in neuroanatomy: (i) In the rest of the elephant brain (brainstem, thalamus, cortex) we did not observe as striking a contrast as in the putative trunk module (the inferior olive according to the referee). (ii) In decades of work with rodents, we have rarely seen such differential activity. For example, cortical whisker-barrels (a classic CO-staining target) in rodents usually come out as dark brown against a light brown background.

      What all of this commentary means is that patterns revealed by differential cytochrome-oxidase staining in the elephant brain stem are real.

      Changes: We added details on our cytochrome-oxidase reactivity staining protocol and commented on cytochrome-oxidase reactivity in the elephant brain in general.

      The authors need to recognize that the work done in Africa on elephant brains is of high quality and should not be blithely dismissed by the authors - this stinks of past colonial "glory", especially as the primary author on these papers is an African female.

      Comment: The referee notes that we unfairly dismiss the work of African scientists and that our paper reflects a continuation of our horrific colonial past because we contradict the work of an African woman. We think such commentary is meant to be insulting and prefer to return to the scientific discourse. We are staunch supporters of diversity in science. It is simply untrue that we do not acknowledge African scientists or the excellent work done in Africa on elephant brains. For example, we cite no less than four papers from the Manger group. We refer countless times in the manuscript to these papers, because these papers are highly relevant to our work. We indeed disagree with two anatomical assignments made by Maseko et al., 2013. Such differences should not be overrated, however. As we noted before, such differences relate to only 2 out of 147 anatomical assignments made by these authors. More generally, discussing and even contradicting papers is the appropriate way to acknowledge scientists. We have already expressed that we greatly admire the pioneering work of the Manger group. In our view, the perfusion of elephants in the field is a landmark experiment in comparative neuroanatomy. We closely work with colleagues in Africa and find them fantastic collaborators. When the referee is accusing us of contradicting the work of an African woman, the referee is unfairly and wrongly accusing us of attacking a scientist’s identity. More generally, we feel the discussion should focus on the data presented.

      Changes: None.

      In addition, perfusing elephants in the field with paraformaldehyde shortly after death is not a problem "partially solved" when it comes to collecting elephant tissue (n.b., with the right tools the brain of the elephant can be removed in under 2 hours). It means the problem IS solved. This is evidenced by the quality of the basic anatomical, immuno-, and Golgi-staining of the elephant tissue collected in Africa.

      Comment: This is not a recommendation. We repeat: In our view, the perfusion of elephants in the field by the Manger group is a landmark experiment in comparative neuroanatomy. Apart from that, we think the referee got our ‘partially solved’ comment the wrong way. It is perhaps worthwhile to recall the context of this quote. We first describe the numerous limitations of our elephant material; admitting these limitations is about honesty. Then, we wanted to acknowledge previous authors who either paved the way for elephant neuroanatomy (Shoshani) or did a better job than we did (Manger; see the above landmark experiment). These citations were meant as an appreciation of our predecessors’ work and by far not meant to diminish their work. Why did we say that the problems of dealing with elephant material are only partially solved? Because elephant neuroanatomy is hard and the problems associated with it are by no means solved. Many previous studies rely on single specimens and our possibilities of accessing, removing, processing, and preserving elephant brains are limited and inferior to the conditions elsewhere. Doing a mouse brain is orders of magnitude easier than doing an elephant brain (because the problems of doing mouse anatomy are largely solved), yet it is hard to publish a paper with six elephant brains because the referees expect evidence at least half as good as what you get in mice.

      Changes: We replaced the ‘partially solved’ sentence.

      The authors need to give credit where credit is due - the elephant cerebellum is clearly at the core of controlling trunk movement, and as much as primary sensory and final stage motor processing is important, the complexity required for the neural programs needed to move the trunk either voluntarily or in response to stimuli, is being achieved by the cerebellum. The inferior olive is part of this circuit and is accordingly larger than one would expect.

      Comment: We think it is very much possible that the elephant cerebellum is important in trunk control.

      Changes: We added a reference to the elephant cerebellum in the introduction of our manuscript.

    2. Reviewer #2 (Public Review):

      Here I submit my previous review and a great deal of additional information following on from the initial review and the response by the authors.

      * Initial Review *

      Assessment:

      This manuscript is based upon the unprecedented identification of an apparently highly unusual trigeminal nuclear organization within the elephant brainstem, related to a large trigeminal nerve in these animals. The apparently highly specialized elephant trigeminal nuclear complex identified in the current study has been classified as the inferior olivary nuclear complex in four previous studies of the elephant brainstem. The entire study is predicated upon the correct identification of the trigeminal sensory nuclear complex and the inferior olivary nuclear complex in the elephant, and if this is incorrect, then the remainder of the manuscript is merely unsupported speculation. There are many reasons indicating that the trigeminal nuclear complex is misidentified in the current study, rendering the entire study, and associated speculation, inadequate at best, and damaging in terms of understanding elephant brains and behaviour at worst.

      Original Public Review:

      The authors describe what they assert to be a very unusual trigeminal nuclear complex in the brainstem of elephants, and based on this, follow with many speculations about how the trigeminal nuclear complex, as identified by them, might be organized in terms of the sensory capacity of the elephant trunk.

      The identification of the trigeminal nuclear complex/inferior olivary nuclear complex in the elephant brainstem is the central pillar of this manuscript from which everything else follows, and if this is incorrect, then the entire manuscript fails, and all the associated speculations become completely unsupported.

      The authors note that what they identify as the trigeminal nuclear complex has been identified as the inferior olivary nuclear complex by other authors, citing Shoshani et al. (2006; 10.1016/j.brainresbull.2006.03.016) and Maseko et al (2013; 10.1159/000352004), but fail to cite either Verhaart and Kramer (1958; PMID 13841799) or Verhaart (1962; 10.1515/9783112519882-001). These four studies are in agreement, the current study differs.

      Let's assume for the moment that the four previous studies are all incorrect and the current study is correct. This would mean that the entire architecture and organization of the elephant brainstem is significantly rearranged in comparison to ALL other mammals, including humans, previously studied (e.g. Kappers et al. 1965, The Comparative Anatomy of the Nervous System of Vertebrates, Including Man, Volume 1 pp. 668-695) and the closely related manatee (10.1002/ar.20573). This rearrangement necessitates that the trigeminal nuclei would have had to "migrate" and shorten rostrocaudally, specifically and only, from the lateral aspect of the brainstem where these nuclei extend from the pons through to the cervical spinal cord (e.g. the Paxinos and Watson rat brain atlases), to the spatially restricted ventromedial region of specifically and only the rostral medulla oblongata. According to the current paper the inferior olivary complex of the elephant is very small and located lateral to their trigeminal nuclear complex, and the region from where the trigeminal nuclei are located by others, appears to be just "lateral nuclei" with no suggestion of what might be there instead.

      Such an extraordinary rearrangement of brainstem nuclei would require a major transformation in the manner in which the mutations, patterning, and expression of genes and associated molecules during development occurs. Such a major change is likely to lead to lethal phenotypes, making such a transformation extremely unlikely. Variations in mammalian brainstem anatomy are most commonly associated with quantitative changes rather than qualitative changes (10.1016/B978-0-12-804042-3.00045-2).

      The impetus for the identification of the unusual brainstem trigeminal nuclei in the current study rests upon a previous study from the same laboratory (10.1016/j.cub.2021.12.051) that estimated that the number of axons contained in the infraorbital branch of the trigeminal nerve that innervate the sensory surfaces of the trunk is approximately 400 000. Is this number unusual? In a much smaller mammal with a highly specialized trigeminal system, the platypus, the number of axons innervating the sensory surface of the platypus bill skin comes to 1 344 000 (10.1159/000113185). Yet, there is no complex rearrangement of the brainstem trigeminal nuclei in the brain of the developing or adult platypus (Ashwell, 2013, Neurobiology of Monotremes), despite the brainstem trigeminal nuclei being very large in the platypus (10.1159/000067195). Even in other large-brained mammals, such as large whales that do not have a trunk, the number of axons in the trigeminal nerve ranges between 400 000 and 500 000 (10.1007/978-3-319-47829-6_988-1). The lack of comparative support for the argument forwarded in the previous and current study from this laboratory, and that the comparative data indicates that the brainstem nuclei do not change in the manner suggested in the elephant, argues against the identification of the trigeminal nuclei as outlined in the current study. Moreover, the comparative studies undermine the prior claim of the authors, informing the current study, that "the elephant trigeminal ganglion ... point to a high degree of tactile specialization in elephants" (10.1016/j.cub.2021.12.051). While clearly the elephant has tactile sensitivity in the trunk, it is questionable as to whether what has been observed in elephants is indeed "truly extraordinary".

      But let's look more specifically at the justification outlined in the current study to support their identification of the unusually located trigeminal sensory nuclei of the brainstem.

      (1) Intense cytochrome oxidase reactivity

      (2) Large size of the putative trunk module

      (3) Elongation of the putative trunk module

      (4) Arrangement of these putative modules correspond to elephant head anatomy

      (5) Myelin stripes within the putative trunk module that apparently match trunk folds

      (6) Location apparently matches other mammals

      (7) Repetitive modular organization apparently similar to other mammals.

      (8) The inferior olive described by other authors lacks the lamellated appearance of this structure in other mammals

      Let's examine these justifications more closely.

      (1) Cytochrome oxidase histochemistry is typically used as an indicative marker of neuronal energy metabolism. The authors indicate, based on the "truly extraordinary" somatosensory capacities of the elephant trunk, that any nuclei processing this tactile information should be highly metabolically active, and thus should react intensely when stained for cytochrome oxidase. We are told in the methods section that the protocols used are described by Purkart et al (2022) and Kaufmann et al (2022). In neither of these cited papers is there any description, nor mention, of the cytochrome oxidase histochemistry methodology, thus we have no idea of how this histochemical staining was done. In order to obtain the best results for cytochrome oxidase histochemistry, the tissue is either processed very rapidly after buffer perfusion to remove blood or in recently perfusion-fixed tissue (e.g., 10.1016/0165-0270(93)90122-8). Given: (1) the presumably long post-mortem interval between death and fixation - "it often takes days to dissect elephants"; (2) subsequent fixation of the brains in 4% paraformaldehyde for "several weeks"; (3) The intense cytochrome oxidase reactivity in the inferior olivary complex of the laboratory rat (Gonzalez-Lima, 1998, Cytochrome oxidase in neuronal metabolism and Alzheimer's disease); and (4) The lack of any comparative images from other stained portions of the elephant brainstem; it is difficult to support the justification as forwarded by the authors. It is likely that the histochemical staining observed is background reactivity from the use of diaminobenzidine in the staining protocol. Thus, this first justification is unsupported.

      Justifications (2), (3), and (4) are sequelae from justification (1). In this sense, they do not count as justifications, but rather unsupported extensions.

      (4) and (5) These are interesting justifications, as the paper has clear internal contradictions, and (5) is a sequela of (4). The reader is led to the concept that the myelin tracts divide the nuclei into sub-modules that match the folding of the skin on the elephant trunk. One would then readily presume that these myelin tracts are in the incoming sensory axons from the trigeminal nerve. However, the authors note that this is not the case: "Our observations on trunk module myelin stripes are at odds with this view of myelin. Specifically, myelin stripes show no tapering (which we would expect if axons divert off into the tissue). More than that, there is no correlation between myelin stripe thickness (which presumably correlates with axon numbers) and trigeminal module neuron numbers. Thus, there are numerous myelinated axons, where we observe few or no trigeminal neurons. These observations are incompatible with the idea that myelin stripes form an axonal 'supply' system or that their prime function is to connect neurons. What do myelin stripe axons do, if they do not connect neurons? We suggest that myelin stripes serve to separate rather than connect neurons." So, we are left with the observation that the myelin stripes do not pass afferent trigeminal sensory information from the "truly extraordinary" trunk skin somatic sensory system, and rather function as units that separate neurons - but to what end? It appears that the myelin stripes are more likely to be efferent axonal bundles leaving the nuclei (to form the olivocerebellar tract). This justification is unsupported.

      (6) The authors indicate that the location of these nuclei matches that of the trigeminal nuclei in other mammals. This is not supported in any way. In ALL other mammals in which the trigeminal nuclei of the brainstem have been reported they are found in the lateral aspect of the brainstem, bordered laterally by the spinal trigeminal tract. This is most readily seen and accessible in the Paxinos and Watson rat brain atlases. The authors indicate that the trigeminal nuclei are medial to the facial nerve nucleus, but in every other species the trigeminal sensory nuclei are found lateral to the facial nerve nucleus. This is most salient when examining a close relative, the manatee (10.1002/ar.20573), where the location of the inferior olive and the trigeminal nuclei matches that described by Maseko et al (2013) for the African elephant. This justification is not supported.

      (7) The dual to quadruple repetition of rostro-caudal modules within the putative trigeminal nucleus as identified by the authors relies on the fact that in the neurotypical mammal, there are several trigeminal sensory nuclei arranged in a column running from the pons to the cervical spinal cord, these include (nomenclature from Paxinos and Watson in roughly rostral to caudal order) the Pr5VL, Pr5DM, Sp5O, Sp5I, and Sp5C. But, these nuclei are all located far from the midline and lateral to the facial nerve nucleus, unlike what the authors describe in the elephants. These rostrocaudal modules are expanded upon in Figure 2, and it is apparent from what is shown that the authors are attributing other brainstem nuclei to the putative trigeminal nuclei to confirm their conclusion. For example, what they identify as the inferior olive in figure 2D is likely the lateral reticular nucleus as identified by Maseko et al (2013). This justification is not supported.

      (8) In primates and related species, there is a distinct banded appearance of the inferior olive, but what has been termed the inferior olive in the elephant by other authors does not have this appearance, rather, and specifically, the largest nuclear mass in the region (termed the principal nucleus of the inferior olive by Maseko et al, 2013, but Pr5, the principal trigeminal nucleus in the current paper) overshadows the partial banded appearance of the remaining nuclei in the region (but also drawn by the authors of the current paper). Thus, what is at debate here is whether the principal nucleus of the inferior olive can take on a nuclear shape rather than evince a banded appearance. The authors of this paper use this variance as justification that this cluster of nuclei could not possibly be the inferior olive. Such a "semi-nuclear/banded" arrangement of the inferior olive is seen in, for example, giraffe (10.1016/j.jchemneu.2007.05.003), domestic dog, polar bear, and most specifically the manatee (a close relative of the elephant) (brainmuseum.org; 10.1002/ar.20573). This justification is not supported.

      Thus, all the justifications forwarded by the authors are unsupported. Based on methodological concerns, prior comparative mammalian neuroanatomy, and prior studies in the elephant and closely related species, the authors fail to support their notion that what was previously termed the inferior olive in the elephant is actually the trigeminal sensory nuclei. Given this failure, the justifications provided above that are sequelae also fail. In this sense, the entire manuscript and all the sequelae are not supported.

      What the authors have not done is to trace the pathway of the large trigeminal nerve in the elephant brainstem, as was done by Maseko et al (2013), which clearly shows the internal pathways of this nerve, from the branch that leads to the fifth mesencephalic nucleus adjacent to the periventricular grey matter, through to the spinal trigeminal tract that extends from the pons to the spinal cord in a manner very similar to all other mammals. Nor have they shown how the supposed trigeminal information reaches the putative trigeminal nuclei in the ventromedial rostral medulla oblongata. These are but two examples of many specific lines of evidence that would be required to support their conclusions. Clearly tract tracing methods, such as cholera toxin tracing of peripheral nerves cannot be done in elephants, thus the neuroanatomy must be done properly and with attention to details to support the major changes indicated by the authors.

      So what are these "bumps" in the elephant brainstem?

      Four previous authors indicate that these bumps are the inferior olivary nuclear complex. Can this be supported?

      The inferior olivary nuclear complex acts "as a relay station between the spinal cord (n.b. trigeminal input does reach the spinal cord via the spinal trigeminal tract) and the cerebellum, integrating motor and sensory information to provide feedback and training to cerebellar neurons" (https://www.ncbi.nlm.nih.gov/books/NBK542242/). The inferior olivary nuclear complex is located dorsal and medial to the pyramidal tracts (which were not labelled in the current study by the authors but are clearly present in Fig. 1C and 2A) in the ventromedial aspect of the rostral medulla oblongata. This is precisely where previous authors have identified the inferior olivary nuclear complex and what the current authors assign to their putative trigeminal nuclei. The neurons of the inferior olivary nuclei project, via the olivocerebellar tract to the cerebellum to terminate in the climbing fibres of the cerebellar cortex.

      Elephants have the largest (relative and absolute) cerebellum of all mammals (10.1002/ar.22425); this cerebellum contains 257 × 10⁹ neurons (10.3389/fnana.2014.00046; three times more than the entire human brain, 10.3389/neuro.09.031.2009). Each of these neurons appears to be more structurally complex than the homologous neurons in other mammals (10.1159/000345565; 10.1007/s00429-010-0288-3). In the African elephant, the neurons of the inferior olivary nuclear complex are described by Maseko et al (2013) as being both calbindin and calretinin immunoreactive. Climbing fibres in the cerebellar cortex of the African elephant are clearly calretinin immunopositive and also are likely to contain calbindin (10.1159/000345565). Given this, would it be surprising that the inferior olivary nuclear complex of the elephant is enlarged enough to create a very distinct bump in exactly the same place where these nuclei are identified in other mammals?

      What about the myelin stripes? These are most likely to be the origin of the olivocerebellar tract and probably only have a coincidental relationship to the trunk. Thus, given what we know, the inferior olivary nuclear complex as described in other studies, and the putative trigeminal nuclear complex as described in the current study, is the elephant inferior olivary nuclear complex. It is not what the authors believe it to be, and they do not provide any evidence that discounts the previous studies. The authors are, quite simply put, wrong. All the speculations that flow from this major neuroanatomical error are therefore science fiction rather than useful additions to the scientific literature.

      What do the authors actually have?

      The authors have interesting data, based on their Golgi staining and analysis, of the inferior olivary nuclear complex in the elephant.

      * Review of Revised Manuscript *

      Assessment:

      There is a clear dichotomy between the authors and this reviewer regarding the identification of specific structures, namely the inferior olivary nuclear complex and the trigeminal nuclear complex, in the brainstem of the elephant. The authors maintain the position that in the elephant alone, irrespective of all the published data on other mammals and previously published data on the elephant brainstem, these two nuclear complexes are switched in location. The authors maintain that their interpretation is correct, this reviewer maintains that this interpretation is erroneous. The authors expressed concern that the remainder of the paper was not addressed by the reviewer, but the reviewer maintains that these sequelae to the misidentification of nuclear complexes in the elephant brainstem render any of these speculations irrelevant as the critical structures are incorrectly identified. It is this reviewer's opinion that this paper is incorrect. I provide a lot of detail below in order to provide support to the opinion I express.

      Public Review of Current Submission:

      As indicated in my previous review of this manuscript (see above), it is my opinion that the authors have misidentified, and indeed switched, the inferior olivary nuclear complex (IO) and the trigeminal nuclear complex (Vsens). It is this specific point only that I will address in this second review, as this is the crucial aspect of this paper - if the identification of these nuclear complexes in the elephant brainstem by the authors is incorrect, the remainder of the paper does not have any scientific validity.

      The authors, in their response to my initial review, claim that I "bend" the comparative evidence against them. They further claim that as all other mammalian species exhibit a "serrated" appearance of the inferior olive, and as the elephant does not exhibit this appearance, that what was previously identified as the inferior olive is actually the trigeminal nucleus and vice versa.

      For convenience, I will refer to IOM and VsensM as the identification of these structures according to Maseko et al (2013) and other authors and will use IOR and VsensR to refer to the identification forwarded in the study under review.

      The IOM/VsensR certainly does not have a serrated appearance in elephants. Indeed, from the plates supplied by the authors in response (Referee Fig. 2), the cytochrome oxidase image supplied and the image from Maseko et al (2013) show a very similar appearance. There is no doubt that the authors are identifying structures that closely correspond to those provided by Maseko et al (2013). It is solely a contrast in what these nuclear complexes are called and the functional sequelae of the identification of these complexes (are they related to the trunk sensation or movement controlled by the cerebellum?) that is under debate.

      Elephants are part of the Afrotheria, thus the most relevant comparative data to resolve this issue will be the identification of these nuclei in other Afrotherian species. Below I provide images of these nuclear complexes, labelled in the standard nomenclature, across several Afrotherian species.

      (A) Lesser hedgehog tenrec (Echinops telfairi)

      Tenrec brains are the most intensively studied of the Afrotherian brains, these extensive neuroanatomical studies undertaken primarily by Heinz Künzle. Below I append images (coronal sections stained with cresyl violet) of the IO and Vsens (labelled in the standard mammalian manner) in the lesser hedgehog tenrec. It should be clear that the inferior olive is located in the ventral midline of the rostral medulla oblongata (just like the rat) and that this nucleus is not distinctly serrated. The Vsens is located in the lateral aspect of the medulla skirted laterally by the spinal trigeminal tract (Sp5). These images and the labels indicating structures correlate precisely with those provided by Künzle (1997, 10.1016/S0168-0102(97)00034-5), see his Figure 1K,L. Thus, in the first case of a related species, there is no serrated appearance of the inferior olive, the location of the inferior olive is confirmed through connectivity with the superior colliculus (a standard connection in mammals) by Künzle (1997), and the location of Vsens is what is considered to be typical for mammals. This is in agreement with the authors, as they propose that ONLY the elephants show the variations they report.

      Decision letter image 1.

      (B) Giant otter shrew (Potomogale velox)

      The otter shrews are close relatives of the tenrecs. Below I append images of cresyl violet (left column) and myelin (right column) stained coronal sections through the brainstem with the IO, Vsens and Sp5 labelled as per standard mammalian anatomy. Here we see hints of the serration of the IO as defined by the authors, but we also see many myelin stripes across the IO. Vsens is located laterally and skirted by the Sp5. This is in agreement with the authors, as they propose that ONLY the elephants show the variations they report.

      Decision letter image 2.

      (C) Four-toed sengi (Petrodromus tetradactylus)

      The sengis are close relatives of the tenrecs and otter shrews, these three groups being part of the Afroinsectiphilia, a distinct branch of the Afrotheria. Below I append images of cresyl violet (left column) and myelin (right column) stained coronal sections through the brainstem with the IO, Vsens and Sp5 labelled as per standard mammalian anatomy. Here we see vague hints of the serration of the IO (as defined by the authors), and we also see many myelin stripes across the IO. Vsens is located laterally and skirted by the Sp5. This is in agreement with the authors, as they propose that ONLY the elephants show the variations they report.

      Decision letter image 3.

      (D) Rock hyrax (Procavia capensis)

      The hyraxes, along with the sirens and elephants form the Paenungulata branch of the Afrotheria. Below I append images of cresyl violet (left column) and myelin (right column) stained coronal sections through the brainstem with the IO, Vsens and Sp5 labelled as per the standard mammalian anatomy. Here we see hints of the serration of the IO (as defined by the authors), but we also see evidence of a more "bulbous" appearance of subnuclei of the IO (particularly the principal nucleus), and we also see many myelin stripes across the IO. Vsens is located laterally and skirted by the Sp5. This is in agreement with the authors, as they propose that ONLY the elephants show the variations they report.

      Decision letter image 4.

      (E) West Indian manatee (Trichechus manatus)

      The sirens are the closest extant relatives of the elephants in the Afrotheria. Below I append images of cresyl violet (top) and myelin (bottom) stained coronal sections (taken from the University of Wisconsin-Madison Brain Collection, https://brainmuseum.org, and while quite low in magnification they do reveal the structures under debate) through the brainstem with the IO, Vsens and Sp5 labelled as per standard mammalian anatomy. Here we see the serration of the IO (as defined by the authors). Vsens is located laterally and skirted by the Sp5. This is in agreement with the authors, as they propose that ONLY the elephants show the variations they report.

      Decision letter image 5.

      These comparisons and the structural identification, with which the authors agree as they only distinguish the elephants from the other Afrotheria, demonstrate that the appearance of the IO can be quite variable across mammalian species, including those with a close phylogenetic affinity to the elephants. Not all mammal species possess a "serrated" appearance of the IO. Thus, it is more than just theoretically possible that the IO of the elephant appears as described prior to this study.

      So what about elephants? Below I append a series of images from coronal sections through the African elephant brainstem stained for Nissl, myelin, and immunostained for calretinin. These sections are labelled according to standard mammalian nomenclature. In these complete sections of the elephant brainstem, we do not see a serrated appearance of the IOM (as described previously and in the current study by the authors). Rather the principal nucleus of the IOM appears to be bulbous in nature. In the current study, no image of myelin staining in the IOM/VsensR is provided by the authors. However, in the images I provide, we do see the reported myelin stripes in all stains - agreement between the authors and reviewer on this point. The higher magnification image to the bottom left of the plate shows one of the IOM/VsensR myelin stripes immunostained for calretinin, and within the myelin stripes axons immunopositive for calretinin are seen (labelled with an arrow). The climbing fibres of the elephant cerebellar cortex are similarly calretinin immunopositive (10.1159/000345565). In contrast, although not shown at high magnification, the fibres forming the Sp5 in the elephant (in the Maseko description, unnamed in the description of the authors) show no immunoreactivity to calretinin.

      Decision letter image 6.

      Peripherin Immunostaining

      In their revised manuscript the authors present immunostaining of peripherin in the elephant brainstem. This is an important addition (although it does replace the only staining of myelin provided by the authors which is unusual as the word myelin is in the title of the paper) as peripherin is known to specifically label peripheral nerves. In addition, as pointed out by the authors, peripherin also immunostains climbing fibres (Errante et al., 1998). The understanding of this staining is important in determining the identification of the IO and Vsens in the elephant, although it is not ideal for this task as there is some ambiguity. Errante and colleagues (1998; Fig. 1) show that climbing fibres are peripherin-immunopositive in the rat. But what the authors do not evaluate is the extensive peripherin staining in the rat Sp5 in the same paper (Errante et al, 1998, Fig. 2). The image provided by the authors of their peripherin immunostaining (their new Figure 2) shows what I would call the Sp5 of the elephant to be strongly peripherin immunoreactive, just like the rat shown in Errante et al (1998), and moreover in the precise position of the rat Sp5! This makes sense as this is where the axons subserving the "extraordinary" tactile sensitivity of the elephant trunk would be found (in the standard model of mammalian brainstem anatomy). Interestingly, the peripherin immunostaining in the elephant is clearly lamellated...this coincides precisely with the description of the trigeminal sensory nuclei in the elephant by Maseko et al (2013) as pointed out by the authors in their rebuttal. Errante et al (1998) also point out peripherin immunostaining in the inferior olive, but according to the authors this is only "weakly present" in the elephant IOM/VsensR. This latter point is crucial.
Surely if the elephant has an extraordinary sensory innervation from the trunk, with 400 000 axons entering the brain, the VsensR/IOM should be highly peripherin-immunopositive, including the myelinated axon bundles?! In this sense, the authors argue against their own interpretation - either the elephant trunk is not a highly sensitive tactile organ, or the VsensR is not the trigeminal nuclei it is supposed to be.

      Summary:

      (1) Comparative data of species closely related to elephants (Afrotherians) demonstrates that not all mammals exhibit the "serrated" appearance of the principal nucleus of the inferior olive.

      (2) The location of the IO and Vsens as reported in the current study (IOR and VsensR) would require a significant, and unprecedented, rearrangement of the brainstem in the elephants independently. I argue that the underlying molecular and genetic changes required to achieve this would be so extreme that it would lead to lethal phenotypes. Arguing that the "switcheroo" of the IO and Vsens does occur in the elephant (and no other mammals) and thus doesn't lead to lethal phenotypes is a circular argument that cannot be substantiated.

      (3) Myelin stripes in the subnuclei of the inferior olivary nuclear complex are seen across all related mammals as shown above. Thus, the observation made in the elephant by the authors in what they call the VsensR, is similar to that seen in the IO of related mammals, especially when the IO takes on a more bulbous appearance. These myelin stripes are the origin of the olivocerebellar pathway, and are indeed calretinin immunopositive in the elephant as I show.

      (4) What the authors see aligns perfectly with what has been described previously, the only difference being the names that nuclear complexes are being called. But identifying these nuclei is important, as any functional sequelae, as extensively discussed by the authors, are entirely dependent upon accurately identifying these nuclei.

      (5) The peripherin immunostaining scores an own goal - if peripherin is marking peripheral nerves (as the authors and I believe it is), then why is the VsensR/IOM only "weakly positive" for this stain? This either means that the "extraordinary" tactile sensitivity of the elephant trunk is non-existent, or that the authors have misinterpreted this staining. That there is extensive staining in the fibre pathway dorsal and lateral to the IOR (which I call the spinal trigeminal tract), supports the idea that the authors have misinterpreted their peripherin immunostaining.

      (6) Evolutionary expediency. The authors argue that what they report is an expedient way in which to modify the organisation of the brainstem in the elephant to accommodate the "extraordinary" tactile sensitivity. I disagree. As pointed out in my first review, the elephant cerebellum is very large and comprised of huge numbers of morphologically complex neurons. The inferior olivary nuclei in all mammals studied in detail to date, give rise to the climbing fibres that terminate on the Purkinje cells of the cerebellar cortex. It is more parsimonious to argue that, in alignment with the expansion of the elephant cerebellum (for motor control of the trunk), the inferior olivary nuclei (specifically the principal nucleus) have had additional neurons added to accommodate this cerebellar expansion. Such an addition of neurons to the principal nucleus of the inferior olive could readily lead to the loss of the serrated appearance of the principal nucleus of the inferior olive, and would require far fewer modifications in the developmental genetic program that forms these nuclei. This type of quantitative change appears to be the primary way in which structures are altered in the mammalian brainstem.

    3. eLife assessment

      This valuable study uses neuroanatomical techniques to investigate somatosensory projections from the elephant trunk to the brainstem. Given its unique specializations, understanding how the elephant trunk is represented within the brain is of general interest to evolutionary and comparative neuroscientists. The authors present solid evidence for the existence of a novel isomorphism in which the folds of the trunk are mapped onto the trigeminal nucleus; however, due to their unusual structure, some uncertainty remains about the identification and anatomical organization of nuclei within the elephant brainstem.

    4. Reviewer #1 (Public Review):

      This manuscript remains an intriguing investigation of the elephant brainstem, with particular attention drawn to possible sensory and motor representation of the renowned trunk of African and Asian elephants. As the authors note, this area has traditionally been identified as part of the superior olivary complex and associated with the fine motor control of the trunk; however, notable patterns within myelin stripes suggest that its parcellation may relate to specific regions/folds found along the long axis of the trunk, including elaborated regions for the trunk "finger" distal end.

      In this iteration of the manuscript, the researchers have provided peripherin antibody staining within the regions they have identified as the trigeminal nucleus and the superior olive. These data, with abundant peripherin expression within climbing fibers of the presumed superior olive and relatively lower expression within the trigeminal nucleus, bolster their interpretation of having comprehensively identified the trigeminal nucleus and trunk representation via a battery of neuroanatomical methods.

      All other conclusions remain the same, and these data have provoked intriguing and animated discussion on classification of neuroanatomical structure, particularly in species with relatively limited access to specimens. Most significantly, these discussions have underscored the fundamental nature of comparative methods (from protein to cellular to anatomical levels), including interpreting homologous structures among species of varying levels of relatedness.

    5. Reviewer #3 (Public Review):

      Summary:

      The study claims to investigate trunk representations in elephant trigeminal nuclei located in the brainstem. The researchers identify large protrusions visible from the ventral surface of the brainstem, which they examined using a range of histological methods. However, this ventral location is usually where the inferior olivary complex is found, which challenges the authors' assertions about the nucleus under analysis. They find that this brainstem nucleus of elephants contains repeating modules, with a focus on the anterior and largest unit which they define as the putative nucleus principalis trunk module of the trigeminal. The nucleus exhibits low neuron density, with glia outnumbering neurons significantly. The study also utilizes synchrotron X-ray phase contrast tomography to suggest that myelin-stripe-axons traverse this module. The analysis maps myelin-rich stripes in several specimens and concludes that, based on their number and patterning, they likely correspond with trunk folds; however, this conclusion is not well supported if the nucleus has been misidentified.

      Strengths:

      The strength of this research lies in its comprehensive use of various anatomical methods, including Nissl staining, myelin staining, Golgi staining, cytochrome oxidase labeling, and synchrotron X-ray phase contrast tomography. The inclusion of quantitative data on cell numbers and sizes, dendritic orientation and morphology, and blood vessel density across the nucleus adds a quantitative dimension. Furthermore, the research is commendable for its high-quality and abundant images and figures, effectively illustrating the anatomy under investigation.

      Weaknesses:

      While the research provides potentially valuable insights if revised to focus on the structure that appears to be inferior olivary nucleus, there are certain additional weaknesses that warrant further consideration. First, the suggestion that myelin stripes solely serve to separate sensory or motor modules rather than functioning as an "axonal supply system" lacks substantial support due to the absence of information about the neuronal origins and the termination targets of the axons. Postmortem fixed brain tissue limits the ability to trace full axon projections. While the study acknowledges these limitations, it is important to exercise caution in drawing conclusions about the precise role of myelin stripes without a more comprehensive understanding of their neural connections.

      Second, the quantification presented in the study lacks comparison to other species or other relevant variables within the elephant specimens (i.e., whole brain or brainstem volume). The absence of comparative data to different species limits the ability to fully evaluate the significance of the findings. Comparative analyses could provide a broader context for understanding whether the observed features are unique to elephants or more common across species. This limitation in comparative data hinders a more comprehensive assessment of the implications of the research within the broader field of neuroanatomy. Furthermore, the quantitative comparisons between African and Asian elephant specimens should include some measure of overall brain size as a covariate in the analyses. Addressing these weaknesses would enable a richer interpretation of the study's findings.

    6. Reviewer #4 (Public Review):

      Summary:

      The authors report a novel isomorphism in which the folds of the elephant trunk are recognizably mapped onto the principal sensory trigeminal nucleus in the brainstem. Further, they identify the enlarged nucleus as being situated in this species in an unusual ventral midline position.

      Strengths:

      The identity of the purported trigeminal nucleus and the isomorphic mapping with the trunk folds is supported by multiple lines of evidence: enhanced staining for cytochrome oxidase, an enzyme associated with high metabolic activity; dense vascularization, consistent with high metabolic activity; prominent myelinated bundles that partition the nucleus in a 1:1 mapping of the cutaneous folds in the trunk periphery; near absence of labeling for the anti-peripherin antibody, specific for climbing fibers, which can be seen as expected in the inferior olive; and a high density of glia.

      Weaknesses:

      Despite the supporting evidence listed above, the identification of the gross anatomical bumps, conspicuous in the ventral midline, is problematic. This would be the standard location of the inferior olive, with the principal trigeminal nucleus occupying a more dorsal position. This presents an apparent contradiction which at a minimum needs further discussion. Major species-specific specializations and positional shifts are well-documented for cortical areas, but nuclear layouts in the brainstem have been considered as less malleable.

    7. Reviewer #5 (Public Review):

      After reading the manuscript and the concerns raised by reviewer 2 I see both sides of the argument - the relative location of trigeminal nucleus versus the inferior olive is quite different in elephants (and different from previous studies in elephants), but when there is a large disproportionate magnification of a behaviorally relevant body part at most levels of the nervous system (certainly in the cortex and thalamus), you can get major shifting in location of different structures. In the case of the elephant, it looks like there may be a lot of shifting. Something that is compelling is that the number of modules separated by the myelin bands corresponds to the number of trunk folds, which is different in the different elephants. This sort of modular division based on body parts is a general principle of mammalian brain organization (demonstrated beautifully for the cuneate and gracile nucleus in primates, VP in most species, S1 in a variety of mammals such as the star-nosed mole and duck-billed platypus). I don't think these relative changes in the brainstem would require major genetic programming - although some surely exists. Rodents and elephants have been independently evolving for over 60 million years so there is a substantial amount of time for changes in each lineage to occur.

      I agree that the authors have identified the trigeminal nucleus correctly, although comparisons with more outgroups would be needed to confirm this (although I'm not suggesting that the authors do this). I also think the new figure (which shows previous divisions of the brainstem versus their own) allows the reader to consider these issues for themselves. When reviewing this paper, I actually took the time to go through atlases of other species and even look at some of my own data from highly derived species. Establishing homology across groups based only on relative location is tough, especially when there appear to be large shifts in relative location of structures. My thoughts are that the authors did an extraordinary amount of work on obtaining, processing and analyzing this extremely valuable tissue. They document their work with images of the tissue and their arguments for their divisions are solid. I feel that they have earned the right to speculate - with qualifications - which they provide.

    1. Author response:

      Thank you for organising the review and providing us with the reviewers' feedback. These comments are very useful, and we would like to express our gratitude to the reviewers for their efforts.

      The reviewers all point out a number of related improvements, relating to: 1) describing various processing steps more clearly, in the online documentation but also in the manuscript itself (e.g. for particle picking), 2) describing more clearly what features Ais offers, how these compare to those of other programmes, and how they might be interfaced with in third-party programmes (e.g. the expected format of models), and 3) a degree of subjectivity in discussion of the results presented in the manuscript (e.g. our statement that Pix2pix performed better in some cases than did other architectures).

      We will address these points, as well as the various other suggestions, in the upcoming revised manuscript and updates to Ais.

    1. eLife assessment

      The authors further corroborated their model that Netrin signaling promotes survival and dissemination of non-proliferating ovarian cancer cells. These valuable results were found to be of significant potential interest to cancer biologists in as much as they address gaps in knowledge pertinent to the mechanisms underpinning ovarian cancer spread. In general, it was thought that solid experimental evidence was provided to support the role of Netrin signaling in fueling ovarian cancer progression.

    2. Joint Public Review:

      In this article, the authors employed modified CRISPR screens ["guide-only (GO)-CRISPR"] in the attempt to identify the genes which may mediate cancer cell dormancy in the high grade serous ovarian cancer (HGSOC) spheroid culture models. Using this approach, they observed that abrogation of several of the components of the netrin (e.g., DCC, UNC5Hs) and MAPK pathways compromise survival of non-proliferative ovarian cancer cells. This strategy was complemented by the RNAseq approach which revealed that a number of the components of the netrin pathway are upregulated in non-proliferative ovarian cancer cells, and that their overexpression is lost upon disruption of DYRK1A kinase that has been previously demonstrated to play a major role in survival of these cells. Perampalam et al. then employed a battery of cell biology approaches to support the model whereby the Netrin signaling governs the MEK-ERK axis to support survival of non-proliferative ovarian cancer cells. Moreover, the authors show that overexpression of Netrins 1 and 3 bolsters dissemination of ovarian cancer cells in the xenograft mouse model, while also providing evidence that high levels of the aforementioned factors are associated with poor prognosis of HGSOC patients.

      Strengths:

      In this valuable study Perampalam et al. developed a CRISPR-based screening approach to identify key genes that are enriched in high grade serous ovarian cancer spheroids. This led to a discovery that Netrin signaling plays a prominent role in survival of ovarian cancer cells. During revision, the authors provide additional evidence to support their central claims and to this end, it was found that they now provide solid evidence to substantiate the proposed model. This work is anticipated to be of interest to cancer biologists specializing in ovarian cancer biology.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      Perampalam et al. describe novel methods for genome-wide CRISPR screening to identify and validate genes essential for HGSOC spheroid viability. In this study, they report that Netrin signaling is essential for maintaining disseminated cancer spheroid survival, wherein overexpression of Netrin pathway genes increases tumor burden in a xenograft model of ovarian cancer. They also show that high netrin expression correlates with poor survival outcomes in ovarian cancer patients. The study provides insights into the biology of netrin signaling in DTC cluster survival and warrants development of therapies to block netrin signaling for treating serous ovarian cancer.

      Strengths:

      - The study identifies Netrin signaling to be important in disseminated cancer spheroid survival

      - A Novel GO-CRISPR methodology was used to find key genes and pathways essential for disseminated cancer cell survival

      Thanks for the endorsement of our work and its importance to metastasis in ovarian cancer.

      Weaknesses:

      - The term dormancy is not fully validated and requires additional confirmation to claim the importance of Netrin signaling in "dormant" cancer survival.

      - Findings shown in the study largely relate to cancer dissemination and DTS survival rather than cancer dormancy.

      Much of the validation of dormancy and cell cycle arrest in HGSOC spheroids, as well as the culture model, has been published previously and hence was not repeated here. I think this reviewer will appreciate the updated citations and explanations to better illustrate the state of knowledge. We have also added new experiments that further emphasize the dormant state of spheroid cells in culture and xenografts, as well as patient derived spheroids used in this study.

      Reviewer #1 (Recommendations for Authors):

      (1) It is unclear what spheroid/adherent enrichment ratio is and how it ties into genes affecting cell viability. Why is an ER below 1 the criteria for selecting survival genes?

      Our screen uses the ‘guide only’ comparison in each culture condition to establish a gene score under that specific condition.  A low adherent score captures genes that are essential under standard culture conditions where cells are proliferating and this can include genes needed for proliferation or other basic functions in cell physiology.  A low spheroid score identifies the genes that are most depleted in suspension when cells are growth arrested and this is an indication of cell death in this condition.  Since gene knock outs are first established in adherent proliferating conditions, essential genes under these conditions will already start to become depleted from the population before suspension culture.  By selecting genes with a ratio of <1 we can identify those that are most relevant to dormant suspension culture conditions.  Ultimately, the lowest enrichment ratio scores represent genes whose loss of function is dispensable in the initial adherent condition, but critical for survival in suspension and this is what we aimed to identify. We’ve updated Figure 1B to illustrate this and we’ve updated the explanation of the enrichment ratio on page 6, lines 144 to 147 of the results.
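
      The selection rule described above can be sketched in a few lines of Python. This is only an illustration and not the authors' analysis code: it assumes each gene already has a positive score in each culture condition, and the gene names and values are hypothetical placeholders. The enrichment ratio is taken as the spheroid score divided by the adherent score, and genes with a ratio below 1 are kept.

```python
from typing import Dict, List


def enrichment_ratio(spheroid_score: float, adherent_score: float) -> float:
    """Spheroid/adherent enrichment ratio (ER) for a single gene."""
    if adherent_score <= 0:
        raise ValueError("adherent score must be positive")
    return spheroid_score / adherent_score


def select_survival_genes(
    scores: Dict[str, Dict[str, float]], cutoff: float = 1.0
) -> List[str]:
    """Return genes whose ER is below the cutoff, lowest ER first."""
    ratios = {
        gene: enrichment_ratio(s["spheroid"], s["adherent"])
        for gene, s in scores.items()
    }
    hits = [gene for gene, ratio in ratios.items() if ratio < cutoff]
    return sorted(hits, key=lambda gene: ratios[gene])


# Hypothetical scores for illustration only; these are not data from the study.
example_scores = {
    "GENE_A": {"adherent": 1.0, "spheroid": 0.4},  # depleted mainly in suspension
    "GENE_B": {"adherent": 0.3, "spheroid": 0.3},  # depleted equally in both
    "GENE_C": {"adherent": 0.9, "spheroid": 1.2},  # not depleted in suspension
}

print(select_survival_genes(example_scores))
```

      In this toy input only GENE_A falls below the cutoff. GENE_B, which is depleted to the same degree in both conditions, has a ratio of 1 and is excluded, mirroring the point that genes essential under adherent conditions are not the ones the ratio is meant to highlight.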

      (2) The WB for phospho-p38 in figure 1A for OVCAR8 line does not show increased phosphorylation in the spheroid relative to the adherent. If anything, phospho-p38 appears to be reduced in the spheroid. Can the authors provide a better western blot?

      We’ve updated this blot with a longer exposure, see Figure 1A.  Phosphorylation levels of p38 are essentially unchanged in OVCAR8 cells in suspension culture, although the overall levels of p38 may be slightly reduced in dormant culture conditions.

      (3) How did the authors confirm dormancy apart from western blot for phospho-ERK vs phospho-p38? Authors should add EdU/BrdU staining and/or Ki67 staining to confirm dormancy.

      Previous publications that appear as citations 7, 10, and 33 in the reference list established the growth arrest state of these cells in suspension culture. This included measuring other known markers of dormancy and quiescence such as p27, p130, and reduced cyclin/cdk activity and 3H-thymidine incorporation. In addition, other associated characteristics of dormancy such as EMT and catabolic metabolism have been demonstrated in these culture conditions (see citation 11 and Rafehi et al. Endocr. Relat. Cancer 23;147-59). We’ve added these additional citations to our descriptions of dormant spheroid culture to better clarify the status of these cells in our experiments (see page 6, lines 126-28). To ensure that cells are growth arrested in the experiments shown in this paper, we have updated Figure 1A to include blots of p130 and Ki67 to further emphasize that spheroid cells are not proliferating as the quiescence marker (p130) is high and the proliferative marker (Ki67) is lost in suspension culture.

      (4) Can the authors report spheroid volume over time in culture? How was viability measured?

      We’ve updated the methods (see page 27, line 574) to better highlight the description of cell survival that answers both of these questions. At the ends of experimental time points in both the screen and viability assays we captured live cells by replating on adherent plasticware. We fixed and stained with crystal violet and photographed plates to illustrate the sizes of spheroids (shown in Fig. 2 Supplement 1E, Fig. 6C, and 7D). We subsequently extracted the dye and quantitated it spectrophotometrically to quantitatively compare biomass of viable cells between experiments irrespective of the relatively random shapes of spheroids. We found reattachment and staining in this manner to match traditional viability assays such as CellTiter-Glo in a previous paper (10). Furthermore, biomass never increases in culture and diminishes gradually over time in culture consistent with the non-proliferative state of these experiments. Double checks of this equivalency of viability and reattached biomass measurements, as well as demonstrating that biomass is lost over time, are shown in Fig. 2 Supplement 1E that compares reattached crystal violet staining measurements with CellTiter-Glo for DYRK1A knock out cells over time in culture. In addition, we include a comparison of crystal violet staining of reattached spheroids with trypan blue dye exclusion in Fig. 5G and H. In both cases reattachment and more direct viability assays demonstrate the same conclusion that Netrin signaling supports viability in dormant culture.

      (5) Please show survival significance of Netrin signaling genes in recurrence/relapse free survival to claim importance in cancer dormancy.

      See Fig. 7 Supplement 1C where we include the recurrence free survival data. Netrin-1 and -3 high expressors also have a numerically shorter progression free survival but it is not statistically significant. Netrin-1 overexpression alone is also shown and it shows shorter survival with a P-value of 0.0735. Elevated survival of dormant cells in a residual disease state is expected to increase the chance of relapse and shorten this interval. Thus, this data is consistent with our model, but lacks statistical significance.

      There are many alternative ways to interpret what shorter progression free survival, or overall survival, may mean biologically. Since survival of dormant cells is but one of them, we also added new data to experimentally investigate the role of endogenous Netrin signaling in dormant residual disease in Fig. 6 and described on page 12, lines 266-87.  We used xenograft experiments to show OVCAR8 spheroids form and withdraw from the cell cycle equivalently to suspension culture following intraperitoneal injection.  Furthermore, loss of Netrin signaling due to receptor deletions compromises survival during this early window before disseminated lesions form.  This argues that Netrin signaling contributes to survival during this window of dormancy.  In addition, mice engrafted with mutant cells experience prolonged survival when Netrin signaling is blocked.  Together, these experiments further argue that Netrin signaling supports survival in the dormant, non-proliferative phase, and leads to reduced survival of mice.

      (6) The authors show IHC staining of patient ascites derived HGSOC spheroids. However, no marker for dormancy is shown in these spheroids. Adding Ki67 staining or phospho-ERK vs phospho-p38 would be necessary to confirm cancer dormancy.

      We have added new staining for Ki67 and p130 that compares these markers in HGSOC tumors where Ki67 is high and p130 is low with ascites derived spheroids where staining is the opposite. Importantly, expression of p130 is linked to cellular quiescence and is not found to accumulate in the nucleus of cells that are just transiting through G1.  This confirms that the ascites derived spheroids are dormant.  See Fig. 4A-E and described on page 9, lines 201-7.

      (7) Overall, the findings are interesting in the context of cancer dissemination. There is not enough evidence for cancer dormancy and the importance of Netrin signaling in the survival of cancer dormancy. Overexpression of Netrin increases phosphorylation of ERK, leading one to expect an increase in proliferation. This suggests that Netrin breaks cancer cells out of dormancy, into a proliferative state.

      We have found that the discovery of Netrin activation of MEK-ERK in growth arrested cells is counterintuitive to many cancer researchers. However, this axis exists in other paradigms of Netrin signaling in axon outgrowth that are not proliferation related (see citation 26, Forcet et al. Nature 417; 443-7 as an example). We have added Fig. 5D and descriptions on page 11, lines 244-52 to better clarify that Netrins CAN’T induce cell proliferation through ERK. Addition of recombinant Netrin-1 can only induce ERK phosphorylation in suspension culture conditions and not in quiescent adherent conditions. The small magnitude of ERK phosphorylation induced by Netrin-1 in suspension compared to treating adherent, quiescent cells with the same concentration of mitogenic EGF further emphasizes that this is not a proliferative signal. Lastly, the new xenograft experiment in Fig. 6A-D (described on page 12, lines 266-81) demonstrates the growth arrested context in which Netrin signaling in dormant spheroids supports viability.

      (8) If authors wish to claim cancer dormancy as the premise of their study, additional confirmatory experiments are required to support their claims. Alternatively, based on the current findings of the study, it would be best to change the premise of the article to Netrin signaling in cancer dissemination and survival of disseminated cancer spheroids rather than cancer dormancy.

      I expect that this reviewer will agree that we have added more than sufficient explanations of background work on HGSOC spheroid dormancy from the literature, as well as new experiments that address their questions about dormancy in our experiments.

      Reviewer #2 (Public Review):

      Summary:

      In this article, the authors employed modified CRISPR screens ["guide-only (GO)-CRISPR"] in the attempt to identify the genes which may mediate cancer cell dormancy in the high grade serous ovarian cancer (HGSOC) spheroid culture models. Using this approach, they observed that abrogation of several of the components of the netrin (e.g., DCC, UNC5Hs) and MAPK pathways compromise the survival of non-proliferative ovarian cancer cells. This strategy was complemented by the RNAseq approach which revealed that a number of the components of the netrin pathway are upregulated in non-proliferative ovarian cancer cells and that their overexpression is lost upon disruption of DYRK1A kinase that has been previously demonstrated to play a major role in survival of these cells. Perampalam et al. then employed a battery of cell biology approaches to support the model whereby the Netrin signaling governs the MEK-ERK axis to support survival of non-proliferative ovarian cancer cells. Moreover, the authors show that overexpression of Netrins 1 and 3 bolsters dissemination of ovarian cancer cells in the xenograft mouse model, while also providing evidence that high levels of the aforementioned factors are associated with poor prognosis of HGSOC patients.

      Strengths:

      Overall it was thought that this study is of potentially broad interest in as much as it provides previously unappreciated insights into the potential molecular underpinnings of cancer cell dormancy, which has been associated with therapy resistance, disease dissemination, and relapse as well as poor prognosis. Notwithstanding the potential limitations of cellular models in mimicking cancer cell dormancy, it was thought that the authors provided sufficient support for their model that netrin signaling drives survival of non-proliferating ovarian cancer cells and their dissemination. Collectively, it was thought that these findings hold a promise to significantly contribute to the understanding of the molecular mechanisms of cancer cell dormancy and in the long term may provide a molecular basis to address this emerging major issue in the clinical practice.

      Thanks for the kind words about the importance of our work in the broader challenges of cancer treatment.

      Weaknesses:

      Several issues were observed regarding methodology and data interpretation. The major concerns were related to the reliability of modelling cancer cell dormancy. To this end, it was relatively hard to appreciate how the employed spheroid model allows one to distinguish between dormant and e.g., quiescent or even senescent cells. This was in contrast to solid evidence that netrin signaling stimulates abdominal dissemination of ovarian cancer cells in the mouse xenograft and their survival in organoid culture. Moreover, the role of ERK in mediating the effects of netrin signaling in the context of the survival of non-proliferative ovarian cancer cells was found to be somewhat underdeveloped.

      Experiments previously published in citation 7 show that growth arrest in patient ascites derived spheroids is fully reversible and that argued against non-proliferative spheroids being a form of senescence and moved this work into the dormancy field.  We have added extensive new support for our model systems and data to address the counterintuitive aspects of MEK-ERK signaling in survival instead of proliferation. 

      Reviewer #1 Recommendations for Authors

      (1) A better characterization of the spheroid model may be warranted, including staining for the markers of quiescence and senescence (including combining these markers with staining for the components of the netrin pathway)

      See Figure 1A and page 6, lines 126-36 where we have added blots for Ki67 and p130 to better emphasize the arrested proliferative state of cells in our screening conditions. We have also added these same controls for patient ascites-derived spheroids in Figure 4 and described on page 9, lines 203-7. One realization from this CRISPR screen, and others in our lab, is that it identifies functionally important aspects of cell physiology and not necessarily ones that are easily explored using commercially available antibodies. Netrin-1 and -3 staining of patient derived spheroids in Fig. 4, as well as cell line spheroids stained in Fig. 4 Supplement 1 further support the relevance of this pathway in dormant cancer cells because Netrins are expressed in the right place at the right time. The Netrin-1 stimulation experiments in Fig. 5C were originally carried out to probe HGSOC cells for functionality of Netrin receptors since we couldn’t reliably detect them by blotting or staining with available antibodies. This demonstrates that this pathway is active in the various HGSOC cell lines we’ve used and specifically, using OVCAR8 cells, we show it is only active in suspension culture conditions.

      (2) In figure 1A it appears that total p38 levels are reduced in some cell lines in spheroid vs. adherent culture. The authors should comment on this.

      These blots have been updated to be more clear.  Overall p38 levels may be reduced in some cell lines and when compared with activation levels of phosphorylated p38 it suggests the fraction of activated p38 is higher. OVCAR8 cells may be an exception where the overall activity level remains approximately the same.

      (3) The authors should perhaps provide a clearer rationale for choosing to focus on the netrin signaling vs. e.g., GPCR signaling, and consider more explicit defining of "primary" vs. "tertiary" categories in Reactome gene set analysis.

      We’ve updated Fig. 1E and the text on page 7, lines 161-5 to illustrate which gene categories identified in the screen belong to which tiers of Reactome categories. It better visualizes why we have investigated the Axon guidance pathway that includes Netrin because it is a highly specific signaling pathway that scores similarly to the broader and less specific categories at the very top of the list. As an aside, the GPCR signaling and GPCR downstream signaling have proven to be fairly intractable categories. As best we can tell, the GPCR downstream signaling category is full of MAPK family members and likely represents some redundancy with MAPK further down.

      (4) In figure 3A-C, including factors whose expression did not appear to change between adherent and suspension conditions may be warranted as the internal control. Figure 3D-F may benefit from some sort of quantification.

      The mRNA expression levels are normalized to GAPDH as an internal control. We have updated this figure and re-plotted it as fold change relative to adherent culture cells with statistical comparisons to indicate which are significantly upregulated in suspension culture.
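
      As an illustration of the normalization described here, the following is a minimal sketch of one common way to express reference-gene-normalized expression as fold change relative to a control condition. It assumes qPCR Ct values and the 2^-ΔΔCt method, which the response does not state explicitly; the function name and all Ct values are hypothetical.

```python
def fold_change(ct_target_susp, ct_ref_susp, ct_target_adh, ct_ref_adh):
    """Fold change of a target gene in suspension relative to adherent culture.

    Assumes the 2^-ddCt method with a single reference gene (e.g. GAPDH).
    """
    # Normalize each condition to its own reference-gene Ct.
    d_ct_susp = ct_target_susp - ct_ref_susp
    d_ct_adh = ct_target_adh - ct_ref_adh
    # Compare the normalized values between conditions.
    dd_ct = d_ct_susp - d_ct_adh
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values, for illustration only.
example = fold_change(ct_target_susp=24.0, ct_ref_susp=18.0,
                      ct_target_adh=26.0, ct_ref_adh=18.0)
print(example)
```

      A value above 1 indicates higher expression in suspension than in adherent culture; the statistical comparison mentioned in the response would be run on replicate values of this kind.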

      The IHC experiments are now in Fig. 4D-F and show positive staining for Netrin-1 and -3.  Netrin-3 is easiest to see, while Netrin-1 is trickier because the difference with the no primary antibody control isn’t intensity, but the tint of the DAB stain.  We had to counter stain the patient spheroids with Hematoxylin in order for the slide scanner to find the best focal plane and make image registration between sections possible.  This unfortunately makes the Netrin-1 staining rather subtle.  For cell line spheroids in the Fig. 4, Supplement 1 we didn’t need the slide scanner and show negative controls without counter stain that are much more convincing of Netrin-1 detection and reassure us that our staining detects the intended target.  We’ve updated the labels in Fig. 4 and Fig. 4, Supplement 1 for this to be more intuitive.  Unfortunately, relying on the tint of the DAB stain leaves this as a qualitative experiment.

      - In figure 4C-E the authors show that Netrin-1 stimulation induces ERK phosphorylation whereby it is argued that this is a "low-level" stimulation of ERK signaling required for the survival of ovarian cells in the suspension. This is however hard to appreciate, and it was thought that having adherent cells in parallel would be helpful to gauge whether this indeed is a "low level" ERK activity. Moreover, the authors should likely include downstream substrates of ERK (e.g., RSKs) as well as p38 in these experiments. The control experiments for the effects of PD184352 on ERK phosphorylation also appear to be warranted. Finally, performing the experiments with PD184352 in the presence of Netrin-1 stimulation would also be advantageous.

      We have added a new Netrin-1 stimulation experiment in Fig. 4D (described on page 11, lines 244-52) that shows that Netrins can only activate very low levels of ERK phosphorylation in suspension when proliferation is arrested. Netrin-1 stimulation of quiescent adherent cells, where stimulation of proliferation is possible, shows that Netrins are unable to activate ERK phosphorylation in this condition. In contrast, we also stimulate quiescent adherent OVCAR8 cells with an equal concentration of EGF (a known mitogen) to offer high-level ERK phosphorylation as a side-by-side comparison. I think that this offers clear evidence that Netrin signaling is inconsistent with inducing cell proliferation. We’ve also updated citations in the introduction to include citation 26, which offers a previously reported paradigm of Netrin-ERK signaling in axon outgrowth, a non-cancer, non-proliferative context, to remind readers that Netrins utilize MEK-ERK differently.

      We highlight Netrin-MEK-ERK signaling as key to survival for a number of reasons. First, Netrin signaling in this paradigm does not fit the dependence receptor paradigm where loss of Netrin receptors protects against cell death. Fig. 5B rules this out as receptor loss never offers a survival advantage, but clearly receptor deletions compromise survival in suspension culture. Second, positive Netrin signaling is known to support survival by inactivating phosphorylation of DAPK1. We’ve added this experiment as Fig. 5 Supplement 1D and show that loss of Netrin receptors doesn’t reduce DAPK1 phosphorylation in a time course of suspension culture. Consequently, we conclude this isn’t the survival signal either. Since MEK and ERK family members scored in our screen, we investigated their role in survival. We now show two different MEK inhibitors with different inhibitory mechanisms to confirm that MEK inhibition induces cell death. In addition to the previous PD184352 inhibitor in our first submission, we’ve added Trametinib as well and this is shown in Fig. 5G. Since it is surprising that MEK inhibition can kill instead of just arrest proliferation, we’ve also added another cell death assay in which we show trypan blue dye exclusion as a second look at survival. This is now Fig. 5H. Lastly, we include Trametinib inhibition of ERK phosphorylation in these assays in Fig. 5I. While we leave open what takes place downstream of ERK, our model in Fig. 5J offers a very detailed look at the components upstream.

      - Does inhibition of ERK prevent the abdominal spread of ovarian cancer cells? The authors may feel that this is out of the scope of the study, which I would agree with, but then the claims regarding ERK being the major mediator of the effects of netrin signaling should be perhaps slightly toned down.

      We agree that loss-of-function xenograft experiments will enhance our discovery of Netrin’s role in dormancy and metastasis. We have added a new Fig. 6 that uses xenografts with Netrin receptor-deficient OVCAR8 cells (UNC5 4KO). It demonstrates that two weeks following IP engraftment we can isolate spheroids from abdominal washes and that cells have entered a state of reduced proliferation as determined by lowered Ki67 expression as well as other proliferation-inducing genes. In the case of UNC5 4KO cells, there is significant attrition of these cells as determined by recovering spheroids in adherent culture (Fig. 6C) and by Alu PCR to detect human cells in abdominal washes (Fig. 6D). Lastly, xenografts of UNC5 4KO cells cause much less aggressive disease and significantly extend survival of these mice (Fig. 6E,F). This is not exactly the experiment that the reviewer is asking for, but it is a clear indication that Netrin signaling supports survival in a xenograft model of dormancy.

      - Notwithstanding that this could be deduced from figures 6D and F, it would be helpful if the number of mice used in each experimental group is clearly annotated in the corresponding figure legends. Moreover, indicating the precise statistical tests that were used in the figures would be helpful (e.g., specifying whether ANOVA is one-way, two-way, or?)

      We have added labels to what is now Fig. 8B to indicate the number of animals used for each genotype of cells.  We have also updated figure legends to include more details of statistical tests used in each instance.

    1. Reviewer #2 (Public Review):

      Summary:

      The manuscript by Gitanjali Roy et al. applies deep transfer learning (DEGAS) to assign patient-level disease attributes (metadata) to single cells of T2D and non-diabetic patients, including obese patients. This led to the identification of a singular cluster of T2D-associated β-cells; and two subpopulations of obese- β-cells derived from either non-diabetic or T2D donors. The objective was to identify novel and established genes implicated in T2D and obesity. Their final goal is to validate their findings at the protein level using immunohistochemistry of pancreas tissue from non-diabetic and T2D organ donors.

      Strengths:

      This paper is well-written, and the findings are relevant for β-cell heterogeneity in T2D and obesity.

      Weaknesses:

      The validation they provide is not sufficiently strong: no DLK1 immunohistochemistry is shown of obese patient-derived sections. Additional presumptive relevant candidates from this transcriptomic analysis should be screened for, at the protein level.

    2. eLife assessment

      This is a useful study that used DEGAS, a deep transfer learning tool, to identify distinct pancreatic beta cell subpopulations that could be associated with type 2 diabetes (T2D) and/or obesity status. The data supporting the authors' findings is solid and demonstrates that DEGAS will be a helpful tool for analyzing cell-specific transcriptomic phenotypes. This study will be of interest to researchers studying the genetics of T2D.

    3. Reviewer #1 (Public Review):

      In this manuscript, Roy et al. used the previously published deep transfer learning tool, DEGAS, to map disease associations onto single-cell RNA-seq data from bulk expression data. The authors performed independent runs of DEGAS using T2D or obesity status and identified distinct β-cell subpopulations. β-cells with high obese-DEGAS scores contained two subpopulations derived largely from either non-diabetic or T2D donors. Finally, immunostaining using human pancreas sections from healthy and T2D donors validated the heterogeneous expression and depletion of DLK1 in T2D islets.

      Strengths:

      (1) This meta-analysis of previously published scRNA-seq data using a deep transfer learning tool.

      (2) Identification of novel beta cell subclusters.

      (3) Identified a relatively innovative role of DLK1 in T2D disease progression.

      Weaknesses:

      (1) There is little overlap between the DE list of the bulk RNA-seq analysis in Figures 1D and 1E and the DE list of the pseudo-bulk RNA-seq analysis of all cells in Figure S2C.

      (2) The biological meaning of "beta cells had the lowest scores compared to other cell types" is not clear.

      (3) The figures and supplemental figures were not cited following the sequence, which makes the manuscript very difficult to read. Some supplemental figures, such as Figures S1C-S1D, S2B-S2E, S3A-S3B, were not cited or mentioned in the text.

      (4) In Figure 7, the current resolution is too low to determine the localization of DLK1.

    4. Author response:

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this manuscript, Roy et al. used the previously published deep transfer learning tool, DEGAS, to map disease associations onto single-cell RNA-seq data from bulk expression data. The authors performed independent runs of DEGAS using T2D or obesity status and identified distinct β-cell subpopulations. β-cells with high obese-DEGAS scores contained two subpopulations derived largely from either non-diabetic or T2D donors. Finally, immunostaining using human pancreas sections from healthy and T2D donors validated the heterogeneous expression and depletion of DLK1 in T2D islets.

      Strengths:

      (1) This meta-analysis of previously published scRNA-seq data using a deep transfer learning tool.

      (2) Identification of novel beta cell subclusters.

      (3) Identified a relatively innovative role of DLK1 in T2D disease progression.

      We thank the reviewer for their constructive critiques and positive feedback. We hope to further apply deep transfer learning tools in future scRNA-seq meta-analyses.

      Weaknesses:

      (1) There is little overlap between the DE list of the bulk RNA-seq analysis in Figures 1D and 1E and the DE list of the pseudo-bulk RNA-seq analysis of all cells in Figure S2C.

      We thank the reviewer for this insightful thought and plan to perform additional analyses and comparisons to address this comment.

      (2) The biological meaning of "beta cells had the lowest scores compared to other cell types" is not clear.

      We agree with the reviewer and will amend this statement to clarify in the revised manuscript. In summary, the relatively lower T2D-DEGAS scores for beta cells overall compared to all other cell types (alpha cells, acinar cells, etc) reflects the fact that in T2D, beta cell-specific genes can be downregulated. This is also possibly due to beta cell loss in T2D and would be reflected in bulk islet RNAseq data. This affects the DEGAS model which is reflected in the scores of all cells in the scRNA-seq data (Fig 3A). For this reason, subsetting the beta cells and replotting them on their own (Fig 4B) is an important step to identify relative differences in DEGAS scores between different subsets of beta cells.

      (3) The figures and supplemental figures were not cited following the sequence, which makes the manuscript very difficult to read. Some supplemental figures, such as Figures S1C-S1D, S2B-S2E, S3A-S3B, were not cited or mentioned in the text.

      We apologize and thank the reviewer for pointing out these errors. All of the annotated errors will be amended in the revised manuscript.

      (4) In Figure 7, the current resolution is too low to determine the localization of DLK1.

      We will include the original highest-resolution confocal images in our resubmission. We will also improve the color combination to improve visibility of colocalization of DLK1 with Insulin.

      Reviewer #2 (Public Review):

      Summary:

      The manuscript by Gitanjali Roy et al. applies deep transfer learning (DEGAS) to assign patient-level disease attributes (metadata) to single cells of T2D and non-diabetic patients, including obese patients. This led to the identification of a singular cluster of T2D-associated β-cells; and two subpopulations of obese- β-cells derived from either non-diabetic or T2D donors. The objective was to identify novel and established genes implicated in T2D and obesity. Their final goal is to validate their findings at the protein level using immunohistochemistry of pancreas tissue from non-diabetic and T2D organ donors.

      Strengths:

      This paper is well-written, and the findings are relevant for β-cell heterogeneity in T2D and obesity.

      We thank the reviewer for their constructive critiques and positive feedback. We believe this study can improve our understanding of β-cell heterogeneity in the context of T2D and obesity.

      Weaknesses:

      The validation they provide is not sufficiently strong: no DLK1 immunohistochemistry is shown of obese patient-derived sections. Additional presumptive relevant candidates from this transcriptomic analysis should be screened for, at the protein level.

      We thank the reviewer for this suggestion. We are planning to perform new immunostaining of DLK1 in human pancreas tissue sections from non-diabetic lean, non-diabetic obese, T2D lean, and T2D obese donors. We also note that Table S6 contains the patient metadata for the pancreas samples we show in the current manuscript. Two of the T2D donors have BMI > 30 (obese). However, the non-diabetic donors have BMI between 26 and 29. Our new planned studies should address the question of differential DLK1 expression / beta cell heterogeneity in the context of both diabetes and obesity.

    1. eLife assessment

      ProtSSN is a valuable approach that generates protein embeddings by integrating sequence and structural information, demonstrating improved prediction of mutation effects on thermostability compared to sequence-only models. The work is currently incomplete as it lacks a thorough comparison against other recent top-performing methods that also incorporate structural data, such as SaProt, EVE-based models, and GEMME. Providing a comprehensive analysis benchmarking ProtSSN against these state-of-the-art structure-based approaches would significantly strengthen the evidence supporting the utility of ProtSSN's joint sequence-structure representations.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors introduce a denoising-style model that incorporates both structure and primary-sequence embeddings to generate richer embeddings of peptides. My understanding is that the authors use ESM for the primary sequence embeddings, take resolved structures (or use structural predictions from AlphaFold when they're not available), and then develop an architecture to combine these two with a loss that seems reminiscent of diffusion models or masked language model approaches. The embeddings can be viewed as ensemble-style embedding of the two levels of sequence information, or with AlphaFold, an ensemble of two methods (ESM+AlphaFold). The authors also gather external datasets to evaluate their approach and compare it to previous approaches. The approach seems promising and appears to out-compete previous methods at several tasks. Nonetheless, I have strong concerns about a lack of verbosity as well as the exclusion of relevant methods and references.

      Advances:

      I appreciate the breadth of the analysis and comparisons to other methods. The authors separate tasks, models, and sizes of models in an intuitive, easy-to-read fashion that I find valuable for selecting a method for embedding peptides. Moreover, the authors gather two datasets for evaluating embeddings' utility for predicting thermostability. Overall, the work should be helpful for the field as more groups choose methods/pretraining strategies amenable to their goals, and can do so in an evidence-guided manner.

      Considerations:

      Primarily, a majority of the results and conclusions (e.g., Table 3) are reached using data and methods from ProteinGym, yet the best-performing methods on ProteinGym are excluded from the paper (e.g., EVE-based models and GEMME). In the ProteinGym database, these methods outperform ProtSSN models. Moreover, these models were published over a year---or even 4 years in the case of GEMME---before ProtSSN, and I do not see justification for their exclusion in the text.

      Secondly, related to the comparison of other models, there is no section in the methods about how other models were used, or how their scores were computed. When comparing these models, I think it's crucial that there are explicit derivations or explanations for the exact task used for scoring each method. In other words, if the pre-training is indeed an important advance of the paper, the paper needs to show this more explicitly by explaining exactly which components of the model (and previous models) are used for evaluation. Are the authors extracting the final hidden layer representations of the model, treating these as features, and then using these features in a regression task to predict fitness/thermostability/DDG etc.? How are the model embeddings of other methods being used, since, for example, many of these methods output a k-dimensional embedding of a given sequence, rather than one single score that can be correlated with some fitness/functional metric? Summarily, I think the text lacks an explicit mention of how these embeddings are being summarized or used, as well as how this compares to the model presented.
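
      To make concrete the kind of evaluation pipeline being asked about, here is a minimal sketch, not the authors' method: per-residue embeddings are mean-pooled into one vector per sequence, used as features in a ridge regression, and scored by Spearman correlation on held-out variants. The embeddings and labels are synthetic stand-ins, and every name here (`mean_pool`, `ridge_fit`, `ridge_predict`, `spearman`) is hypothetical.

```python
import numpy as np


def mean_pool(per_residue):
    """Collapse an (L, k) per-residue embedding into one k-dimensional vector."""
    return per_residue.mean(axis=0)


def ridge_fit(X, y, alpha=0.1):
    """Closed-form ridge regression; the last weight is an unpenalized bias."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    penalty = alpha * np.eye(Xb.shape[1])
    penalty[-1, -1] = 0.0  # leave the bias term unregularized
    return np.linalg.solve(Xb.T @ Xb + penalty, Xb.T @ y)


def ridge_predict(X, w):
    return np.hstack([X, np.ones((X.shape[0], 1))]) @ w


def spearman(a, b):
    """Spearman correlation via Pearson correlation of ranks (assumes no ties)."""
    ranks_a = np.argsort(np.argsort(a)).astype(float)
    ranks_b = np.argsort(np.argsort(b)).astype(float)
    return float(np.corrcoef(ranks_a, ranks_b)[0, 1])


# Synthetic stand-ins: 200 variants with k=16 per-residue embeddings of varying length.
rng = np.random.default_rng(0)
k = 16
embeddings = [rng.normal(size=(int(rng.integers(50, 100)), k)) for _ in range(200)]
X = np.stack([mean_pool(e) for e in embeddings])

# Synthetic "fitness" labels that depend linearly on the pooled features, plus noise.
y = X @ rng.normal(size=k) + 0.01 * rng.normal(size=200)

# Fit on the first 150 variants, evaluate rank correlation on the remaining 50.
w = ridge_fit(X[:150], y[:150])
rho = spearman(ridge_predict(X[150:], w), y[150:])
print(f"held-out Spearman rho: {rho:.3f}")
```

      Stating choices of this kind (which layer is extracted, how it is pooled, which regressor and metric are used) for each compared method is the level of detail the comment asks for.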

      I think the above issues can mainly be addressed by considering and incorporating points from Li et al. 2024[1] and potentially Tang & Koo 2024[2]. Li et al.[1] make extremely explicit the use of pretraining for downstream prediction tasks. Moreover, they benchmark pretraining strategies explicitly on thermostability (one of the main considerations in the submitted manuscript), yet there is no mention of this work nor the dataset used (FLIP (Dallago et al., 2021)) in this current work. I think a reference and discussion of [1] is critical, and I would also like to see comparisons in line with [1], as [1] is very clear about what features from pretraining are used, and how. If the comparisons with previous methods were done in this fashion, this level of detail needs to be included in the text.

      To conclude, I think the manuscript would benefit substantially from a more thorough comparison of previous methods. Maybe one way of doing this is following [1] or [2], and using the final embeddings of each method for a variety of regression tasks---to really make clear where these methods are performing relative to one another. I think a more thorough methods section detailing how previous methods did their scoring is also important. Lastly, TranceptEVE (or a model comparable to it) and GEMME should also be mentioned in these results, or at the bare minimum, be given justification for their absence.

      [1] Feature Reuse and Scaling: Understanding Transfer Learning with Protein Language Models. Francesca-Zhoufan Li, Ava P. Amini, Yisong Yue, Kevin K. Yang, Alex X. Lu. bioRxiv 2024.02.05.578959; doi: https://doi.org/10.1101/2024.02.05.578959

      [2] Evaluating the representational power of pre-trained DNA language models for regulatory genomics. Ziqi Tang, Peter K Koo. bioRxiv 2024.02.29.582810; doi: https://doi.org/10.1101/2024.02.29.582810

    3. Reviewer #2 (Public Review):

      Summary:

      To design proteins and predict disease, we want to predict the effects of mutations on the function of a protein. To make these predictions, biologists have long turned to statistical models that learn patterns that are conserved across evolution. There is potential to improve our predictions however by incorporating structure. In this paper, the authors build a denoising auto-encoder model that incorporates sequence and structure to predict mutation effects. The model is trained to predict the sequence of a protein given its perturbed sequence and structure. The authors demonstrate that this model is able to predict the effects of mutations better than sequence-only models.

      As well, the authors curate a set of assays measuring the effect of mutations on thermostability. They demonstrate their model also predicts the effects of these mutations better than previous models and make this benchmark available for the community.

      Strengths:

      The authors describe a method that makes accurate mutation effect predictions by informing its predictions with structure.

      Weaknesses:

      It is unclear how this model compares to other methods of incorporating structure into models of biological sequences, most notably SaProt (https://www.biorxiv.org/content/10.1101/2023.10.01.560349v1.full.pdf).

      ProteinGym is largely made of deep mutational scans, which measure the effect of every mutation on a protein. These new benchmarks contain on average measurements of less than a percent of all possible point mutations of their respective proteins. It is unclear what sorts of protein regions these mutations are more likely to lie in; therefore it is challenging to make conclusions about what a model has necessarily learned based on its score on this benchmark. For example, several assays in this new benchmark seem to be similar to each other, such as four assays on ubiquitin performed at pH 2.25 to pH 3.0.
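
      The coverage figure mentioned here can be made concrete with a small calculation. The sketch below assumes single amino-acid substitutions over the 20 standard residues, so a protein of length L has 19 × L possible point mutations; the function name and the example numbers are hypothetical, not taken from the benchmark.

```python
def point_mutation_coverage(n_measured, seq_len, alphabet_size=20):
    """Fraction of all single-substitution variants that an assay measured."""
    # Each position can be changed to any residue other than the wild type.
    total_possible = seq_len * (alphabet_size - 1)
    return n_measured / total_possible


# Hypothetical assay: 40 measured variants on a 300-residue protein.
coverage = point_mutation_coverage(n_measured=40, seq_len=300)
print(f"{coverage:.2%} of possible point mutations measured")
```

      A deep mutational scan would give a value close to 1 by this measure, which is the contrast the comment draws.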

    1. Reviewer #1 (Public Review):

      Summary:

      The authors demonstrate that the immunosuppressive environment in pancreatic ductal adenocarcinoma (PDAC) can be mitigated by a combination of ionizing radiation (IR), CCR5 inhibition, and PD1 blockade. This combination therapy increases tissue-resident natural killer (trNK) cells that facilitate CD8 T cell activity, resulting in a reduction of E-cadherin positive tumor cells. They identify a specific "hypofunctional" NK cell population in both mouse and human PDAC that supports CD8 T cell involvement. A trNK signature is found to be associated with better survival outcomes in PDAC and other solid tumors.

      Overall, I think this is an interesting study that combines testing of therapeutic concepts in mice with bioinformatics analysis of single cell transcriptome data in primary tumors and exploration of clinical outcomes using signature genes in TCGA data. The key finding is that immunoregulatory properties of tumor infiltrating/resident CD56-bright NK cells (assumed to be non-cytotoxic) are beneficial for outcome through cross-talk with DC and recruitment of CD8 T cells. The latter is specifically induced by irradiation combined with CCR5i and PD1 blockade.

      These results support the notion that IR/CCR5i/αPD1 combination treatment alters immune infiltration by reducing Tregs and increasing NK and CD8 T cells, thereby resulting in greater local tumor control.

      Although the language was slightly modified in the revised version, I think it is important to point out that transcripts (not protein expression) of KLRC2 are common in CD56bright NK cells and do not really reflect "adaptive-like" NK cells.

    2. eLife assessment

      This valuable manuscript provides an interesting account documenting the role of resident CD56(br) NK cells in driving interaction with dendritic cells that attract CD8+ T cells to the pancreas cancer tumor microenvironment (TME). The work convincingly illustrates how irradiation combined with CCR5i and PD1 blockade leads to a reduction in pancreatic cancer growth that correlates with a reduction in Treg cells and enhancement of NK and CD8 T cells in the TME. The correlation of NKC1 signature with survival in pancreatic cancer patients is indeed of broader interest regarding potential relevance to other types of cancer.

    3. Reviewer #2 (Public Review):

      Summary:

      This work elaborates on a combined therapeutic approach comprising ionizing radiation and CCR5i/αPD1 immunotherapy as a promising strategy in pancreatic cancer. Previous research has established that NK cell-derived CCL5 and XCL1 play a crucial role in recruiting cDC1 cells to the tumor microenvironment, contributing to tumor control. In this study, by using a murine pancreatic cancer model, the authors propose that the addition of radiation therapy to CCR5i and αPD1 immunotherapy could upregulate CD8+ T cells and a subgroup of NK cells within the tumor and result in better tumor control. They further analyzed human single-cell sequencing data from pancreatic cancer patients and identified one subgroup of NK cells (NK C1) with tissue-resident features. Subsequent cell-cell contact analysis reveals the NK-cDC1-CD8 cell axis in pancreatic cancer. By analyzing TCGA data, they found that high NK C1 signature levels were associated with better survival in pancreatic cancer patients. Thus, radiotherapy could benefit the outcome of patients bearing low NK C1 signatures. Importantly, the positive correlation between NK C1 score with survival extends beyond pancreatic cancer, showing potential applicability across various solid cancers.

      Strengths:

      This study could add new insight into the clinical practice by introducing such novel combined therapy and shed light on the underlying immune cell dynamics. These findings hold potential for more effective and targeted treatment in the future. Mouse experiments nicely confirmed that such combined therapy could significantly reduce tumor volume. The elegant use of single-cell sequencing analysis and human database examination enriches the narrative and strengthens the study's foundation. Additionally, the notion that NK C1 signature correlates with patient survival in various solid cancers is of high interest and relevance.

      Weaknesses:

      The authors have addressed some of my concerns. However, others remain and should be discussed.

      (1) The role of CCR5i requires further clarification/discussion. While the authors demonstrated its capacity to reduce Treg in murine tumors, its impact on other cell populations, including NK cells and CD8+ T cells, was not observed. Nevertheless, the effect of CCR5i on tumor growth in Figure 2B seems pathogenic. If the combination of radiotherapy and αPD1 already can achieve good outcomes as shown in Figure 3A, the necessity to include CCR5i is questioned. Overall, a more comprehensive elucidation of the roles of CCL5 and CCR5i in this context would be good. Alternatively, this limitation should be discussed.

      (2) In line with this, spatial plots in Figure 4 did not include the group with only radiotherapy and αPD1. This inclusion would facilitate a clearer comparison and better highlight the essential role of CCR5i.

      (3) Human database analysis showed a positive correlation between NK C1 score and CCL5 in pancreatic cancer. Furthermore, radiotherapy could benefit the outcome of patients bearing low NK C1 scores. It would be interesting to test, if radiotherapy could also benefit patients with low CCL5 levels in this cohort. This is a key question since the role of CCL5/CCR5i is not well verified. Alternatively, this point could be mentioned and discussed.

    4. Reviewer #3 (Public Review):

      Summary:

      In the submitted manuscript by Go et al, the authors evaluated the tumor microenvironment in pancreatic ductal adenocarcinoma (PDAC) and made a number of interesting observations, including the following: 1) CCL5 expression within the tumor microenvironment negatively correlated with clinical outcomes in human patients with PDAC; 2) there were both positive and negative correlations between CCL5 expression and the expression of specific genes (e.g. those encoding CD56 and CD16, respectively) included among gene signature lists for Treg, MDSC, TAM, and NK cells; 3) CCR5 inhibition with the inhibitor, maraviroc, reduced Treg infiltration but not that of other immune cell types in an orthotopic murine model of PDAC; 4) CCR5 inhibition augmented anti-PD1 immunotherapy when combined with ionizing radiation (IR) therapy in the murine model; 5) the above therapy resulted in increased infiltration of CD8+ cytotoxic T cells as well as of a subset of NKG2D-negative, tissue-residency (tr) marker expressing NK cells (deemed Cluster 1 NK in their data sets) that inversely correlated with the number of E-cadherin+ cells (i.e. tumor cells) and showed predicted interactions with cDC1 dendritic cells (including XCL1/XCL2 expressed by the NK and XCR1 expressed by the cDC1); 6) the authors identified a number of putative signals stemming from the trNK (e.g. IL-16, TNFSF14, FASLG, CSF, MIF) as well as incoming from cDC1s to NK (e.g. BAG6-NKp30); 7) these trNK cells positively correlated with good outcomes and with CD8+ T cell infiltrations in human PDAC as well as in many other solid tumor types; and 8) importantly, the benefit of IR therapy was specific to the subset of PDAC patients (represented in the TCGA dataset) that were predicted to have low amounts of trNK cells. The authors used murine experimental models, multi-plexed imaging analyses, and a number of publicly available sequencing data sets from human tumor samples to perform their investigations. 

      Based on their findings, the authors proposed that combining IR with CCR5 inhibition and anti-PD1 immunotherapy is a promising strategy to treat solid cancers.

      Strengths:

      Overall, the collective analyses and conclusions appear to be novel and could be of high and rapid impact on the field, particularly in terms of directing clinical trials to incorporate IR with CCR5 inhibition and immunotherapy. The manuscript is well written; the figures are for the most part clear; and the Discussion is very thoughtful.

      Weaknesses:

      In the revised manuscript, the authors addressed my original concerns. I have no new major concerns with the study. One of the limitations is that the authors did not perform functional in vivo or ex vivo assays to address some of the major hypotheses that arose from the descriptive, correlative data; but overall, this does not detract from the enthusiasm for the work or the potential significance and impact of the study.

    5. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Public Review): 

      Summary:

      The authors demonstrate that the immunosuppressive environment in pancreatic ductal adenocarcinoma (PDAC) can be mitigated by a combination of ionizing radiation (IR), CCR5 inhibition, and PD1 blockade. This combination therapy increases tissue-resident natural killer (trNK) cells that facilitate CD8 T cell activity, resulting in a reduction of E-cadherin positive tumor cells. They identify a specific "hypofunctional" NK cell population in both mouse and human PDAC that supports CD8 T cell involvement. A trNK signature is found to be associated with better survival outcomes in PDAC and other solid tumors.   

      Strengths: 

      Overall, I think this is an interesting study that combines testing of therapeutic concepts in mice with bioinformatics analysis of single-cell transcriptome data in primary tumors and exploration of clinical outcomes using signature genes in TCGA data. The key finding is that immunoregulatory properties of tumor-infiltrating/resident CD56-bright NK cells (assumed to be non-cytotoxic) are beneficial for outcome through cross-talk with DC and recruitment of CD8 T cells. The latter is specifically induced by irradiation combined with CCR5i and PD1 blockade. 

      "These results collectively support the notion that IR/CCR5i/αPD1 combination treatment alters immune infiltration by reducing Tregs and increasing NK and CD8 T cells, thereby resulting in greater local tumor control." I agree with this conclusion.  

      Weaknesses:  

      There are a few points to discuss that the authors may want to address.

      (1)   "Notably, CCR5i significantly reduced Treg infiltration but had no effect on the infiltration of other immune cells, indicating the active recruitment of CCR5+ Tregs in PDAC (Figure 2B)." 

      CCR5i treatment seems to inhibit infiltration of CD8 T cells and NK cells to a greater extent, in relative terms, compared to Treg, although this is not statistically significant. If this visual inspection of the graph does not reflect reality, additional experiments may be needed to verify the selective targeting of Tregs or confirm that CD8 T cells and NK cells are also affected by single agent CCR5i. The reduced recruitment of Treg, NK cells, and CD8 T cells was completely reversed when combined with irradiation. In the data shown in Figure 3E it seems as if CCR5i induced infiltration of Tregs along with other immune cells. That said, I agree with the conclusion of the authors that this combined treatment leads to an altered immune composition and ratio between Tregs and effector cells (CD8 T cells and NK cells). Could this altered composition be displayed more clearly?

      We would like to thank the reviewer for their comments and agree that there is a trend for reduced NK and T-cell infiltration during CCR5i standalone treatment (as seen in Figure 2B), although it does not reach significance. To reflect this more clearly, we have added n.s. (not significant) for the NK cells and CD8+ T-cells and adjusted the text to reflect a trend for decreased NK and CD8+ T-cell infiltration (see Lines 162-165). Moreover, to reflect the data accurately, we have taken the Treg data out of the original Figure 2B and present it separately as a percentage of CD45+CD3+ T-cells.

      (2) The definition of active and hypofunctional NK cells based solely on NKG2D expression seems like an oversimplification. I realize it is not trivial to test tumor-infiltrating NK cells from these tumors functionally but perhaps scRNAseq of the tumors would allow for characterization of cytotoxicity scores using KEGG or GO analysis or reversed gene set enrichment in responders/non-responders.

      We agree that scRNA-seq of tumors would add to the overall characterization of the tumor-infiltrating NK cells; however, we are unfortunately not currently in a position to carry out this experiment. We did, however, immunophenotype the tumor-infiltrating NK cell population in more depth by also looking at NKp46 and NKG2D surface expression. These newly added data demonstrate not only increased infiltration of “bona-fide” trNK cells (based on surface expression of CD103+CD49a+) under the triple treatment combination but, more importantly, that these trNK cells have reduced levels of CD69, NKp46, and NKG2D and increased TIM-3 surface expression compared to conventional NK cells, suggesting that these trNK cells could be more hypoactive than conventional NK cells. These data have been added to the manuscript as Figure 4E, F; Figure supplement 4E-G and Lines 244-260 in the revised manuscript. To clarify this difference, we have replaced the word “hypofunctional” with “hypoactive” throughout the manuscript.

      (3) It seems as if the abstract refers to this phenotype incorrectly since the "hyporesponsive" subset is described as NKG2C-negative. 

      We apologize for the typographic confusion and have corrected our abstract and changed the subset to NKG2D-negative (as was intended).

      (4) "The NK_C1 cluster correlates best with the hypofunction NK phenotype observed in mice as similarly displayed reduced activation (reduced NKG7, NKp80, GZMA, and PRF1) with additional expression of tissue residency markers CD103, CD49a and, surprisingly, the adaptive activating receptor NKG2C (KLRC2) (Figure 5B, C)." 

      There is no doubt that NK_C1 represents tumor-infiltrating NK cells with a CD56bright gene signature with a strong tissue resident score. However, the transcriptional expression of KLRC2 on these is not surprising! It is well established that KLRC2 transcripts (but not protein) are highly expressed on conventional CD56bright NK cells. There are several published sources where the authors can find such data for confirmation. Thus, this is not to be confused with adaptive NK cells having an entirely different transcriptional signature and expressing high levels of NKG2C at the cell surface. I strongly recommend reinterpreting the results based on the fact that KLRC2 is expressed at high levels in conventional CD56bright NK cells. If not, it would be important to verify that these tissue-resident NK cells express NKG2C and not NKG2A at the cell surface.

      We agree with the reviewer and have modified the text accordingly in the revised manuscript (Lines 279-283), including references to tissue-resident adaptive-like cells as described previously in literature. 

      (5) NCAM1 transcript alone is not sufficient to deconvolute CD56bright NK cells in TCGA data (Figure 7A). As a single marker, it likely reflects NK cell infiltration without providing further evidence on the contribution of the bright/dim components. Therefore, the use of the bright Tr NK signature described in Table 1 is very important (Figure 7B). Table 1 is not provided. Nor Supplementary Table 1. There is only one supplementary figure in the ppt attached.

      We agree that a high NCAM1/CD56 single gene signature could also represent NK cell infiltration. We have rephrased this in the text accordingly (Lines 354-357). We apologize for the missing tables and Supplementary figures. We have added these now to the manuscript as Supplementary table 1.

      Reviewer #2 (Public Review)  

      Summary: 

      This work elaborates on a combined therapeutic approach comprising ionizing radiation and CCR5i/αPD1 immunotherapy as a promising strategy in pancreatic cancer. Previous research has established that NK cell-derived CCL5 and XCL1 play a crucial role in recruiting cDC1 cells to the tumor microenvironment, contributing to tumor control. In this study, by using a murine pancreatic cancer model, the authors propose that the addition of radiation therapy to CCR5i and αPD1 immunotherapy could upregulate CD8+ T cells and a subgroup of NK cells within the tumor and result in better tumor control. They further analyzed human single-cell sequencing data from pancreatic cancer patients and identified one subgroup of NK cells (NK C1) with tissue-resident features. Subsequent cell-cell contact analysis reveals the NK-cDC1-CD8 cell axis in pancreatic cancer. By analyzing TCGA data, they found that high NK C1 signature levels were associated with better survival in pancreatic cancer patients. Thus, radiotherapy could benefit the outcome of patients bearing low NK C1 signatures. Importantly, the positive correlation between NK C1 score with survival extends beyond pancreatic cancer, showing potential applicability across various solid cancers.  

      Strengths: 

      This study could add new insight into the clinical practice by introducing such novel combined therapy and shed light on the underlying immune cell dynamics. These findings hold potential for more effective and targeted treatment in the future. Mouse experiments nicely confirmed that such combined therapy could significantly reduce tumor volume. The elegant use of single-cell sequencing analysis and human database examination enriches the narrative and strengthens the study's foundation. Additionally, the notion that NK C1 signature correlates with patient survival in various solid cancers is of high interest and relevance.  

      Weaknesses: 

      The role of CCR5i requires further clarification. While the authors demonstrated its capacity to reduce Treg in murine tumors, its impact on other cell populations, including NK cells and CD8+ T cells, was not observed. Nevertheless, the effect of CCR5i on tumor growth in Figure 2B should be shown. If the combination of radiotherapy and αPD1 already can achieve good outcomes as shown in Figure 3A, the necessity to include CCR5i is questioned. Overall, a more comprehensive elucidation of the roles of CCL5 and CCR5i in this context would be good.  

      We would like to thank the reviewer for their comments and agree that standalone CCR5i also shows a trend of reduced infiltrating NK cells and CD8+ T-cells, although this does not reach significance. We have mentioned this trend in the manuscript (see Lines 162-165) and added n.s. to Figure 2B as well. With regard to adding CCR5i: although we observe volumetric control by radiotherapy and anti-PD1, we observe an increase in necrosis induction only in the triple combination compared to radiotherapy combined with anti-PD1, suggesting that there is an additive effect of CCR5i in our model only as a combination modality. We therefore believe that the addition of CCR5i to radiotherapy and anti-PD1 has a beneficial effect. The growth curves for CCR5i alone were already presented in Figure 3A, and we have modified our manuscript to refer to this (see Lines 165-167).

      (1) In line with this, spatial plots in Figure 4 did not include the group with only radiotherapy and αPD1. This inclusion would facilitate a clearer comparison and better highlight the essential role of CCR5i. 

      We agree with the reviewer that inclusion of radiotherapy and αPD1 would facilitate a clear comparison of our data, and our experiments did include single controls for radiotherapy and αPD1; unfortunately, the tissue slides were of poor quality and therefore not suitable for quantification. In line with this, we have added references to other studies that investigated the effect of immune checkpoint inhibitors in combination with radiotherapy (see Lines 169-172).

      (2) NK C1 cells should be also analyzed in the mouse model. The authors suggest that NKG2D-ve NK cells could be the cell population. Staining of inhibitory markers should be considered, for example, TIGIT and TIM3 as presented in Figure 5B.

      As per the reviewer's suggestion, we have now included additional data on the surface expression of inhibitory markers and activating receptors on tumor-infiltrating NK cells in our model under the triple combination. These additional data demonstrate increased infiltration of trNK cells under the triple combination that seem to be more ‘hypoactive’ than conventional NK cells. These data have been added as Figure 4E in the revised Figure.

      (3) While the cell-cell contact analysis generated from single-cell sequencing data is insightful, extending this analysis to the mouse model under therapy would be highly informative. NK and CD8 cells in the tumor increased upon the combined therapy. However, cDC1 was not characterized. Analysis regarding cDC1 would provide more information on the NK/cDC1/CD8 axis. 

      We agree that looking into cDC1 would be highly interesting in our treatment model and its characterization is currently under investigation. The importance of the interaction between cDC1 and NK cells has been described before by various groups, and we have provided additional references for that in our manuscript (see Lines 449-455).

      (4) Human database analysis showed a positive correlation between NK C1 score and CCL5 in pancreatic cancer. Furthermore, radiotherapy could benefit the outcome of patients bearing low NK C1 scores. It would be interesting to test if radiotherapy could also benefit patients with low CCL5 levels in this cohort. 

      We would like to thank the reviewer for their suggestion; please see the figure below for the comparison. Patients with CCL5high are enriched for NK_C1 (Figure 7D) and CCL5high patients with NK_C1high have significantly increased overall and disease-free survival compared to NK_C1low (Figure 7E), while those with NK_C1low significantly benefit from radiotherapy (Figure 7B). Accordingly, patients with CCL5high have significantly decreased overall survival compared to CCL5low patients, again confirming CCL5 as a prognostic marker (Figure 1A, Figure R1). When we look at CCL5low patients, however, there is no additional significant benefit of radiotherapy in the CCL5low group (see image below; not significant; only significant p-values are shown). These data collectively support the strong correlation between CCL5 levels and NK_C1 enrichment, and imply that radiotherapy alone is insufficient to drive NK_C1 cells in the absence of high CCL5 gradients to improve overall survival. However, given the increased overall survival of CCL5low compared to CCL5high, it is likely that other factors are at play. Future studies will be required to further elucidate the role of CCL5 gradients on NK_C1 cells and the beneficial effect of radiotherapy.

      Author response image 1.

      Overall survival of CCL5high versus CCL5low patients stratified into groups with and without radiotherapy using TCGA-PAAD. Log-rank p-value indicates the significance level across all groups while individual significant comparisons are shown as indicated.

      Reviewer #3 (Public Review):

      Summary

      In the submitted manuscript by Go et al, the authors evaluated the tumor microenvironment in pancreatic ductal adenocarcinoma (PDAC) and made a number of interesting observations, including the following: 1) CCL5 expression within the tumor microenvironment negatively correlated with clinical outcomes in human patients with PDAC; 2) there were both positive and negative correlations between CCL5 expression and the expression of specific genes (e.g. those encoding CD56 and CD16, respectively) included among gene signature lists for Treg, MDSC, TAM, and NK cells; 3) CCR5 inhibition with the inhibitor, maraviroc, reduced Treg infiltration but not that of other immune cell types in an orthotopic murine model of PDAC; 4) CCR5 inhibition augmented anti-PD1 immunotherapy when combined with ionizing radiation (IR) therapy in the murine model; 5) the above therapy resulted in increased infiltration of CD8+ cytotoxic T cells as well as of a subset of NKG2D-negative, tissue-residency (tr) marker expressing NK cells (deemed Cluster 1 NK in their data sets) that inversely correlated with the number of E-cadherin+ cells (i.e. tumor cells) and showed predicted interactions with cDC1 dendritic cells (including XCL1/XCL2 expressed by the NK and XCR1 expressed by the cDC1); 6) the authors identified a number of putative signals stemming from the trNK (e.g. IL-16, TNFSF14, FASLG, CSF, MIF) as well as incoming from cDC1s to NK (e.g. BAG6-NKp30); 7) these trNK cells positively correlated with good outcomes and with CD8+ T cell infiltrations in human PDAC as well as in many other solid tumor types; and 8) importantly, the benefit of IR therapy was specific to the subset of PDAC patients (represented in the TCGA dataset) that were predicted to have low amounts of trNK cells. The authors used murine experimental models, multiplexed imaging analyses, and a number of publicly available sequencing data sets from human tumor samples to perform their investigations.
Based on their findings, the authors proposed that combining IR with CCR5 inhibition and anti-PD1 immunotherapy is a promising strategy to treat solid cancers.  

      Strengths

      Overall, the collective analyses and conclusions appear to be novel and could be of high and rapid impact on the field, particularly in terms of directing clinical trials to incorporate IR with CCR5 inhibition and immunotherapy. The manuscript is well written; the figures are for the most part clear; and the Discussion is very thoughtful.   

      Weaknesses

      There were a number of minor typographical errors, missing references, or minor issues with the figures. In general, while many of the observations provided strong suggestive evidence of relationships, phenotypes, and functions, the authors often used language to indicate that such things were confirmed, validated, or proven. In fact, there was a paucity of such functional/confirmatory experiments. This does not necessarily detract from the overall significance, excitement for, and potential impact of the study; but the language could likely be adjusted to be more in keeping with the true nature of the findings. The main title and running title are a bit different; consider making them more similar.

      We apologize for the typographical errors, missing references and issues with the figures. We have revised our manuscript, with a major focus on adjusting our language to more carefully reflect our data, and hope to have addressed all the concerns of the reviewer. The slight discrepancy between the main title and running title is intended to convey the contents of this manuscript in a comprehensive way.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):  

      Please make sure all files are made available. Also please check available datasets describing KLRC2 transcripts in CD56brights. This is not to be confused with an adaptive-like signature. 

      We have added the missing table to the supplementary figures and revised the manuscript text with regard to the KLRC2 transcript in our NK_C1 cluster and its implications for an adaptive-like signature in the context of tissue-residency (see Lines 279-283; 465-474).

      Reviewer #2 (Recommendations For The Authors): 

      Additional experiments as mentioned in the 'weakness' section could help to further strengthen this study. Besides these points, I would recommend the following: 

      (1) The description in the figure should be more precise and clear. Especially in Figure 3A, it seems the addition of IR into CCR5i or CCR5i/aPD1 leads to a bigger tumor volume.  

      We have adjusted the figure descriptions to more clearly describe the figures. We apologise for the confusion in Figure 3A; this was a figure legend error and has been rectified in the revised Figures (i.e. closed symbols represent +IR conditions).

      (2) The definition of Tregs in figures should be described, e.g. it is not specified which population is shown in Figure S2c.  

      We have added a definition of Tregs (i.e. Live/CD45+CD3+CD4+FOXP3+) in our revised manuscript (see Lines 162-165). To avoid confusion, we have removed the subsequent gating of CCR5 and PD-1 of Tregs in our revised Supplementary Figures.

      (3) Please add a bar in all histology figures, for example, Figure 2A, S2A, S3E. It seems in Figure S3D, E, the green group is missing.  

      We have added the scale bar to all the indicated figures. As the reviewer correctly points out, the green group (i.e. IR+CCR5i) is missing: we felt that the excessive growth seen with CCR5i alone may have given a false impression of the extent of infiltration, so we did not include this group in the original analysis and do not have the data in the Figure.

      (4) Please check through the manuscript, there are some grammar mistakes.  

      We apologise for the grammar mistakes in our original manuscript and have carefully revised the current manuscript to avoid them.

      (5) Figure S7B, the left cell lacks a name.  

      We have annotated the left cell accordingly in our revised supplementary figure.

      Reviewer #3 (Recommendations For The Authors): 

      (1) Abbreviations (e.g. PDAC) should be spelled out the first time introduced in the manuscript.

      We have adjusted this in our revised manuscript.

      (2) Referring to the tissue-resident NK cells as "hypofunctional" may not be useful...they seem to be functional, just not in the conventional sense. The authors may want to consider another term, such as non-cytotoxic (given the low expression of cytolytic granules, etc) or immunoregulatory (as they actually refer to them on line 310).

      We agree with the reviewer and have revised the manuscript to refer to them as “immunoregulatory” or “hypoactive” when appropriate. The latter is supported by the additional experiments as shown in Figure 4E.

      (3) Barry et al 2018 Nat Med demonstrated that NK cells in melanoma could support cDC1s and promote positive clinical outcomes in the setting of immunotherapy. It would likely be beneficial to also cite this paper (e.g. on line 425). 

      Thank you for the suggestion, which would work in line with our hypothesis of crosstalk between NK_C1 and cDC1. We have looked for FLT3L in our NK_C1 cluster and did not find any enrichment for FLT3L transcript (see Figure 5E). Nevertheless, we have added the reference in the discussion of our manuscript to further support the importance of crosstalk between cDC1 and NK cells (see Lines 449-455).

      (4) Figure 2B: by eye, it looks like the difference between CD8+ T cells in the two conditions would be significantly different; is this not the case? Same thing for the NK cells...what are the p-values?

      We have added n.s. to our revised Figure 2B. The p-values for CD8+ T-cells and NK cells were 0.14 and 0.19 (two-tailed Student's t-test), respectively.

      (5) The murine data strongly suggest that the combination therapy promotes trNK cell infiltration into the tumors, in turn resulting in cDC1-mediated CD8+ T cell infiltration and/or activation. It could be highly valuable/useful to functionally determine (e.g. by depleting NK cells in this model) if NK cells are required for the effects seen. 

      We agree that depletion of NK cells could solidify the findings even more, and it is part of ongoing investigations for future projects. However, it would be imperative to first characterise these NK cells in more depth, as conventional global ablation of NK cells is expected to highly impact immunosurveillance as well. This is part of current ongoing work.

      (6) Figure 7B: how were "high" and "low" defined (for the NK signature)?

      An enrichment score of the NK_C1 gene signature (see Table supplement 1) was first calculated per patient sample in the TCGA RNA-seq dataset using the Gene Set Variation Analysis (GSVA) method. A cut-off value was then determined using the maximally selected rank statistics method (maxstat R package) to divide patients into “high” and “low”.
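
      As an illustration only (this is not the authors' code, which used the GSVA and maxstat R packages), the two steps described above can be sketched in plain Python. The sketch makes several simplifying assumptions: the per-sample score is a mean z-score of the signature genes rather than a true GSVA enrichment score, the cut-off is simply the candidate value with the largest two-group log-rank statistic, and no multiple-testing adjustment of the kind offered by maximally selected rank statistics is applied. The function names, gene names, and patient data are all hypothetical.

```python
from statistics import mean, pstdev


def signature_scores(expr, signature):
    """Mean per-gene z-score of the signature genes for each sample.

    A simplified stand-in for a GSVA enrichment score, not GSVA itself.
    expr maps gene name -> list of expression values (one per sample).
    """
    genes = [g for g in signature if g in expr]
    if not genes:
        raise ValueError("no signature gene found in expression data")
    n_samples = len(expr[genes[0]])
    scores = [0.0] * n_samples
    for g in genes:
        mu = mean(expr[g])
        sd = pstdev(expr[g]) or 1.0  # avoid division by zero for flat genes
        for i, value in enumerate(expr[g]):
            scores[i] += (value - mu) / sd / len(genes)
    return scores


def logrank_chi2(times, events, in_high):
    """Two-group log-rank chi-square statistic (1 degree of freedom)."""
    obs_minus_exp = 0.0
    variance = 0.0
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        at_risk = [i for i in range(len(times)) if times[i] >= t]
        n = len(at_risk)
        n_high = sum(1 for i in at_risk if in_high[i])
        deaths = [i for i in at_risk if times[i] == t and events[i]]
        d = len(deaths)
        d_high = sum(1 for i in deaths if in_high[i])
        obs_minus_exp += d_high - d * n_high / n
        if n > 1:
            p = n_high / n
            variance += d * p * (1 - p) * (n - d) / (n - 1)
    return obs_minus_exp ** 2 / variance if variance > 0 else 0.0


def best_cutoff(scores, times, events, min_frac=0.25):
    """Scan candidate cut-offs; keep the one with the largest log-rank statistic.

    Samples with score > cutoff are "high". Each group must hold at least
    min_frac of the samples. Returns (cutoff, statistic).
    """
    n = len(scores)
    min_size = max(1, int(n * min_frac))
    best_stat, best_cut = 0.0, None
    for cut in sorted(set(scores)):
        in_high = [s > cut for s in scores]
        k = sum(in_high)
        if k < min_size or n - k < min_size:
            continue
        stat = logrank_chi2(times, events, in_high)
        if stat > best_stat:
            best_stat, best_cut = stat, cut
    return best_cut, best_stat


# Hypothetical toy data: two made-up genes, eight made-up patients.
expr = {"GENE_A": [1, 2, 3, 4, 5, 6, 7, 8], "GENE_B": [2, 3, 4, 5, 6, 7, 8, 9]}
times = [2, 3, 4, 5, 10, 12, 14, 16]   # follow-up time (arbitrary units)
events = [1, 1, 1, 1, 1, 1, 1, 1]      # 1 = death observed, 0 = censored
scores = signature_scores(expr, ["GENE_A", "GENE_B"])
cutoff, stat = best_cutoff(scores, times, events)
labels = ["high" if s > cutoff else "low" for s in scores]
print(cutoff, round(stat, 3), labels)
```

      Because the cut-off is chosen to maximize the separation between groups, the resulting unadjusted log-rank statistic is optimistically biased; a real analysis would typically rely on the adjusted p-value from a dedicated package rather than on this raw value.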

      (7) Lines 164-165 of the Results: it would be good to include a reference supporting the statement.

      We have rephrased the manuscript and added corresponding references (see Lines 170-173 in revised manuscript).

      (8) There are many conclusions and very speculative language based only on sequencing results, and these have not been validated (e.g. in the Discussion, lines 447-453). As another example, it was concluded that a decrease in NKG2D+ NK cells implied a reduction in overall NK cell cytolytic activity and that NKG2D- NK cells were hypofunctional and did not kill well. This was not tested. Generally, it would be useful for the authors to use language that conveys that the data are primarily suggestive (rather than "confirmatory", line 447) of relationships, phenotypes, and functions at this point. 

      We thank the reviewer for raising these concerns and have adapted the manuscript text to present the findings more cautiously.

      (9) On lines 246-247 the authors refer to cluster 3 NK cells, which express CD16, as "immature". The rationale for this designation is not provided, and most human NK cell development models hold that CD16+ NK cells represent the most mature subset(s). 

      We apologize for the typographic error – later on we refer to the NK_C3 cluster as cytotoxic NK cells and we have corrected this in our revised manuscript (see Lines 273-275).

      (10) On line 351, the authors reference supplemental Figure 7C...but I don't see this figure in the accompanying powerpoint file. 

      This should have been Supplementary Figure 7B, and we have corrected it in the revised manuscript (see Lines 374-377).

      (11) On line 417, the authors reference NKp40; this is likely a typographical error. 

      This has been corrected in the revised manuscript to NKp46 (see Lines 439-442).

    1. eLife assessment

      The authors investigated the requirement and function of Blimp1/Prdm1 in murine natural killer (NK) cells and the ILC1 lineage of innate lymphoid cells, using a conditional knockout model. The single-cell mRNA-seq data provided here represent a valuable resource for the community, but the lack of mechanistic investigations leaves the study partially incomplete. The work will be of interest to the fields of innate lymphoid cell biology and tissue immunology.

    2. Reviewer #1 (Public Review):

      He et al. investigate the requirement and function of Blimp1 (encoded by Prdm1) in murine NK cells and ILC1. Employing a conditional knockout mouse model (Prdm1flox x Ncr1cre), the authors describe impaired abundance and maturation of Prdm1-deficient NK cells and ILC1 in different tissues. Blimp1-deficient NK cells have reduced expression of cytotoxic molecules (Gzmb, Prf1) and, in some instances, Ifng production, and Prdm1flox x Ncr1cre mice show impaired tumor control in experimental metastasis models. Using single cell RNA sequencing analysis, the authors propose that Prdm1 regulates JunB expression and NK cell maturation. Based on in silico analyses, the authors suggest manifold intercellular communication between NK/ILC1 and macrophages. Without following up on any of these potentially interesting suggestions, the authors conclude their study reiterating that Prdm1 regulates IFNg-production of tumor-infiltrating NK cells and ILC1.

      Many of the reported functions of Blimp1 in NK cells have previously been identified using a mixed-chimera strategy comparing Prdm1 WT and KO NK cells (Kallies et al., Blood 2011). Here, the authors expand on these findings using a conditional model to delete Prdm1 in NK/ILC1 and single cell sequencing, and provide a more refined analysis of the functions of Blimp1 in these cells. Cell-chat analysis suggests close interactions of Blimp-dependent NK/ILC1 subsets with hepatic macrophages, but these suggestions are not followed up by experiments. Potentially interesting differences in the macrophage compartment of Ncr1-Cre x Prdm1-fl/fl mice are suggested by the scRNA-Seq data, but are not validated e.g. by FACS. The study falls short in providing new mechanistic insights. Nevertheless, it is an interesting confirmation of "old" suggestions in a more refined setting, and the provided single-cell mRNA-Seq data represents a potentially valuable resource for the community.

    3. Reviewer #2 (Public Review):

      He and colleagues aimed to elucidate the role of the transcription factor Prdm1 in liver Type 1 ILCs (innate lymphoid cells), focusing on its regulatory mechanisms and potential implications for developing innovative immune therapy strategies against liver cancer.

      Strengths:

      The study effectively integrates omics analyses and cytometry to explore Prdm1's impact on the cellular composition and immune regulation within the liver, providing a comprehensive view of its biological role. Employing a conditional knockout mouse model adds specificity to their experiments, allowing for precise manipulation of the Prdm1 gene.

      Weaknesses:

      The study predominantly relies on limited mouse models, which may not fully represent the complexity of Type 1 ILC behavior across different cancer types. Some experimental designs, such as the limited in vitro killing assessments, and additional human data could be expanded to strengthen the findings and their interpretation.

      The authors have demonstrated that Prdm1 plays a critical role in the function of NK cells and ILC1s within the liver, particularly in the context of tumor resistance. However, due to the use of specific disease models and lack of direct human data, the application of these findings to clinical settings remains speculative. While the study advances our understanding of liver ILC biology, further research is necessary to validate these effects in human systems and across more diverse cancer models.

      Discussion on impact and utility:

      This study contributes significantly to the field of immunology and cancer therapy by revealing potential new targets for immunotherapy of liver cancer. The methods and data provided could serve as a valuable resource for further research aimed at enhancing immune-based cancer treatments.

      Additional Context for Interpretation:

      Understanding the role of Prdm1 in the broader context of immune cell regulation and its interaction with other cellular components in the tumor microenvironment could be crucial. Further studies should explore the dynamic between Prdm1 expression, NK cell functionality, and tumor resistance mechanisms to fully harness the therapeutic potential of targeting this pathway in liver cancer.

    4. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer 1 (Public Review):

      He et al. investigate the requirement and function of Blimp1 (encoded by Prdm1) in murine NK cells and ILC1. Employing a conditional knockout mouse model (Prdm1flox x Ncr1cre), the authors describe impaired abundance and maturation of Prdm1-deficient NK cells and ILC1 in different tissues. Blimp1-deficient NK cells have reduced expression of cytotoxic molecules (Gzmb, Prf1) and, in some instances, Ifng production, and Prdm1flox x Ncr1cre mice show impaired tumor control in experimental metastasis models. Using single-cell RNA sequencing analysis, the authors propose that Prdm1 regulates JunB expression and NK cell maturation. Based on in silico analyses, the authors suggest manifold intercellular communication between NK/ILC1 and macrophages. Without following up on any of these potentially interesting suggestions, the authors conclude their study reiterating that Prdm1 regulates IFNg-production of tumor-infiltrating NK cells and ILC1. Many of the reported functions of Blimp1 in NK cells have previously been identified using a mixed-chimera strategy comparing Prdm1 WT and KO NK cells (Kallies et al., Blood 2011). Here, the authors expand on these findings using a conditional model to delete Prdm1 in NK/ILC1 and single-cell sequencing and provide a more refined analysis of the functions of Blimp1 in these cells. Cell-chat analysis suggests close interactions of Blimp-dependent NK/ILC1 subsets with hepatic macrophages, but these suggestions are not followed up by experiments. Potentially interesting differences in the macrophage compartment of Ncr1-Cre x Prdm1-fl/fl mice are suggested by the scRNA-Seq data but are not validated e.g. by FACS. The study falls short in providing new mechanistic insights. Nevertheless, it is an interesting confirmation of "old" suggestions in a more refined setting, and the provided single-cell mRNA-Seq data represents a potentially valuable resource for the community. There are some control analyses that are required to support the conclusions of the authors, and I have a few suggestions that would help to improve the manuscript.

      We sincerely appreciate your careful review and insightful feedback on our manuscript. We have carefully considered your comments and present the results of new experiments conducted in response to your suggestions. Please find the detailed responses below.

      Major comments

      Comment 1: The authors do not control for the potential effects of Cre expression. Expression of Cre from within the Ncr1 locus (using the mouse model established by Narni-Mancinelli et al.) has significant effects on NK cells and especially ILC1s (reducing their frequency and absolute numbers and altering their functionality). The authors should characterize the Ncr1cre mice used here (developed by Shanghai Model Organism Center) in this regard and should use proper controls (Ncr1Cre+ Prdm1wt/wt as control for Ncr1Cre+ Prdm1fl/fl, instead of WT littermates) for all of their key data, e.g. those depicted in Fig 1FG, 2ADFH, 7D, S2,3,4.

      Response 1: This is a very insightful question that has posed a challenge for many researchers, including us, engaged in conditional knockout studies. Both the expression of Cre and the insertion of loxP sequences can influence gene expression, because the region where loxP is inserted may contain regulatory sequences for the gene of interest. Ncr1-Cre is a frequently used transgenic mouse model in our laboratory. In our prior research, we also had concerns about the possible impact of Cre on NKp46 expression, which could lead to a decline in NK cell function. Therefore, in our previous study of Smad4 expression in NK cells, we conducted similar experiments. In Figure 6 of our paper published in the Journal of Clinical Investigation (Wang et al., J Clin Invest, 2018), we compared NKp46-iCreTgfbr2fl/flSmad4fl/WT with NKp46-iCreTgfbr2fl/flSmad4fl/fl. Although the primary purpose of that comparison was to establish Smad4's independence from TGF-β, it also allows a comparison between Smad4fl/fl and Smad4fl/WT in the presence of Cre. In the critical phenotype we assessed, NKp46-iCreTgfbr2fl/flSmad4fl/fl (compared with NKp46-iCreTgfbr2fl/flSmad4fl/WT) exhibited the same phenotype as NKp46-iCreSmad4fl/fl (compared with NKp46WTSmad4fl/fl). This suggests that Cre's influence on NK cells may be within a reasonable and controllable range. Furthermore, compared with the decrease in Ncr1 expression caused by Cre, the reduction in the expression of genes targeted by loxP-mediated knockout, such as Prdm1 in this study (Figure 1E), is more significant. Therefore, with the current techniques and research methods, we believe that the data provided in this study support the role of Prdm1 in NK cells.

      Comment 2: Several of the phenotypic findings on NK cells have been described before by Kallies et al. in 2011 (Ref 29), although using a different genetic Prdm1-ablation model (Prdm1-GFP/GFP knockin/knockout model). This study reported impaired NK cell maturation, reduced Gzmb expression, impaired in vivo cytotoxicity against subcutaneous RMA-S cells, impaired in vitro proliferation, comparable in vitro killing, increase in BM NK cell numbers. The authors should discuss/mention this more prominently in their manuscript, and highlight where they confirm or refine these previous findings, and where they actually provide new information.

      Response 2: We appreciate your valuable suggestions. The article you referred to, published in Blood, is indeed an excellent work. While we had cited this article, our discussion regarding its specific content was limited. Based on your advice, we have made revisions and included the following content in our discussion section (page 24; line 489-493):

      “In a study involving systemic knockout combined with competitive transplantation, it was found that Prdm1 promotes NK cell maturation and the expression of Gzmb. On the contrary, the same study also found that NK cells with Prdm1 deficiency exhibit heightened proliferation, increased survival, enhanced migratory abilities towards tumors, and greater cytotoxicity against subcutaneously implanted RMAS tumors (31).”.

      Comment 3: What is the reason to refer to the enriched cluster in Blimp1-deficient NK cells as "Junbhi"? There is no follow-up for a function of Junb, and there are many other genes upregulated in these cells. Most critically, these cells seem to represent exactly the c-Kithi cells that Kallies et al. already showed and discussed in their paper. The authors should stain for Kit, and also refer to this. Also, MacKay et al. performed Blimp1-Chip-Seq (in T cells), maybe it would be interesting to check to which of the identified DEGs Blimp1 can bind.

      Response 3: We appreciate the suggestion from the reviewer. We think that a gene that supports the development of lymphocytes does not necessarily regulate their function positively. For example, JunB is essential for T cell development but can also induce T cell exhaustion (Lynn et al., Nature, 2019). Therefore, while Prdm1 has been shown to promote NK cell development, it cannot be assumed that it always positively regulates NK cell function, especially in anti-cancer immune surveillance. In this respect, we sought a driving factor behind the impaired anti-tumor ability of Prdm1ΔNcr1 NK cells. Although many other genes are upregulated in this cluster (e.g. Kit), JunB attracted more of our interest because of its potential to regulate NK cell function in cancer, whereas c-Kit is more likely a marker of NK cell maturation, as has been well demonstrated by Kallies et al. and other studies. Our previous study also showed that c-Kit expression was decreased in mature NK cells compared with immature NK cells (Wang et al., J Clin Invest, 2018).

      We did not perform follow-up experiments on Junb because we could not find suitable surface markers with which to isolate the Junbhi cNK cluster and investigate its function. Using intracellular markers would amount to an analysis of gene expression patterns, which is already well described by our RNA-seq data. As described above, our study did not aim to further investigate the role of Prdm1 in NK cell maturation, as c-Kit expression was upregulated in Prdm1-knockout NK cells and correlated with NK cell maturation, which has been validated by Kallies et al.

      We also have discussed the potential DEGs that could be bound and regulated by Prdm1 in our revised manuscript (page 27-28; line 561-571):

      “Prdm1 and Hobit directly bound and repressed Tcf7 (18), which encoded TCF-1, a TF binding and limiting the activity of Gzmb regulatory element (69). Gzmb has been demonstrated directly bound and activated by Junb in NK cells, which suggested Gzmb expression regulated by multiple Prdm1/Hobit downstream signals (26). In human T cells, binding motif of JUNB was enriched in the binding sites of PRDM1 (70), indicating the essential role of PRDM1-JUNB axis during NK cell and T cell development. In NK cells deficient in Prdm1 expression, we noted a decrease in Gzmb levels alongside with an elevation in Junb expression. This indicates that Prdm1 not only facilitates the expression of Gzmb in NK cells but also suppresses Junb expression. Given that Junb is recognized as a positive regulator of Gzmb (71), this presents a complex interplay that seems contradictory. Therefore, it is imperative to develop a theoretical framework to comprehensively understand and interpret this paradoxical relationship.”.

      Comment 4: cNK cells are considered circulating cells, that transiently pass through the liver. Previous studies have suggested almost identical gene expression patterns in hepatic and splenic NK cells. In functional tests, they often "perform" identically. I am therefore a bit surprised that the authors find a differential dependency of Blimp1 for the IFNg production of splenic (no role of Blimp1) versus hepatic (Blimp1 regulating IFNg production) NK cells (Fig S3). Do the authors have any suggestions on that? The analyses are performed by 12+4h stimulations with IL12/18, which could involve the effects of altered bystander cells (as suggested by Figure 6). Therefore, these analyses should be provided upon standard 4h stimulations with IL12/18 and also with PMA/I under BFA. Note: liver and splenic cNK cells look quite different in the chosen histograms in Figures 7 A, B, C, yet there is massive variability in these analyses - is there any systematic/technical problem?

      Response 4: We appreciate the valuable suggestion from the reviewer. Studies have suggested that, at the gene expression or transcriptomic level, liver NK cells exhibit more similarity to splenic NK cells while displaying greater divergence from liver ILC1s. However, we do not think that splenic NK cells or peripheral blood NK cells (which are more abundant in circulation) are entirely indistinguishable from liver NK cells. Notably, there are substantial differences in their maturity levels, with liver NK cells being more mature. Since we are examining the protein levels, a 4-hour stimulation period may not fully capture these distinctions. Even when considering the potential impact of bystander cells, the experimental design specifically targets Prdm1 knockout within NK cells, ensuring that the study accurately elucidates the role of Prdm1 in NK cells. For each experiment, we have implemented control measures, and any variances observed in the figures may be attributed to individual variations among the animals. It is also possible that the MFI values measured by flow cytometry exhibit larger variations than a percentage.

      Comment 5: Figure 4 H/I - In contrast to NK cells in Fig 4E, F, the KO and WT ILC1s seem to co-cluster largely. Authors should validate differentially expressed genes. How strong is the effect of Blimp1 in ILC1s? Or is Blimp1 a critical TF driving effector differentiation in NK cells, while it has only subtle effects in ILC1 (these may be regulated by Hobit?)? This seems an interesting finding that should at least be discussed. For these types of small differences in ILC1, FACS confirmation analyses should be performed and findings be reevaluated using Cre-expressing controls (see above).

      Response 5: We appreciate the suggestion from the reviewer. As requested, we analyzed the DEGs in liver cNK cells and ILC1s from our scRNA-seq data (revised Supplemental Figure 8, A and B). There were only a few notable DEGs in ILC1s compared with cNK cells. It is likely that Prdm1 has a more essential effect on the transcriptional program of cNK cells, while it plays a more important role in maintaining the homeostasis of the ILC1 population. We have discussed these points to better inform readers (page 27; line 554-561):

      “Previous studies have identified Hobit and Prdm1 as central regulators instructing tissue-dependent programs and retention of diverse tissue-resident lymphocytes (18, 51, 53). Liver ILC1s required Hobit, but not necessary for cNK cells (6). Expression of Prdm1 was remarkably higher in cNK cells versus ILC1s (18). While in our study, cNK cells and liver ILC1s reduced simultaneously in Prdm1ΔNcr1 mice, and even more significant in ILC1s. This indicates that while Prdm1 is expressed at lower levels in ILC1s, its role in preserving the quantity of ILC1s may be more crucial. Thus, Prdm1 and Hobit may have parallel program in instructing ILC1s functional development and maturation.”. 

      We could not find a suitable surface marker with which to evaluate the changes in ILC1s, as most of the changes involve intracellular markers.

      Comment 6: The authors describe and discuss some of Figure 1 and 2 data as if Blimp1 would be involved in alternative NK versus ILC1 fates, but there is no evidence for this.

      Response 6: There is no evidence that Prdm1 could alter the fate decision of the progenitor towards liver cNK or ILC1s. Although some studies reported the conversion between cNK cells and ILC1s in special contexts, it was widely accepted that liver cNK cells and ILC1s originated from different progenitors. While we observed changes in the proportions of liver cNK cells and ILC1 in Prdm1 KO mice, we still lack sufficient evidence to support the relative independence of NK and ILC1 development, as well as evidence to indicate that Prdm1 is exclusively responsible for NK and ILC1.

      Regarding the changes in NK and ILC1 proportions after Prdm1 KO, we believe that both NK and ILC1 cells require Prdm1 to maintain their populations, with ILC1 possibly requiring it to a greater extent. This is the reason for the altered balance between NK and ILC1 cells following Prdm1 KO. We wish to clarify this point to prevent any misconceptions among readers. To address this, we have added the following content to the discussion section (page 25; line 509-516):

      “Furthermore, although both liver NK cells and liver ILC1s require Prdm1 to maintain their quantity, liver ILC1s demonstrate a more pronounced dependency on Prdm1. However, it is currently widely believed that liver NK cells and liver ILC1s originate from different progenitors. It is worth noting that while we observed changes in the NK and ILC1 proportions after Prdm1 knockout, our data does not support the hypothesis that Prdm1 affects progenitor differentiation decisions, thereby influencing the fate selection of NK and ILC1. Further research may be needed to elucidate how Prdm1 regulates the balance between NK cells and ILC1s.”.

      Comment 7: There are several recent studies suggesting a role for Hobit, homologue of Blimp1, in NK cells and in ILC1, and in the control of liver metastases. The authors should discuss similar and unique functions of Hobit and Blimp1, also in the regulation of gene expression patterns, and should refer to these studies.

      Response 7: We would like to express our gratitude to the reviewer for your insightful comments, which bring forth a critical perspective. In accordance with the reviewer's suggestion, we have updated our discussion to include the diverse functions guided by Hobit and Prdm1 in regulating the development and function of cNK cells and ILC1s (page 27; line 554-561):

      “Previous studies have identified Hobit and Prdm1 as central regulators instructing tissue-dependent programs and retention of diverse tissue-resident lymphocytes (18, 51, 53). Liver ILC1s required Hobit, but not necessary for cNK cells (6). Expression of Prdm1 was remarkably higher in cNK cells versus ILC1s (18). While in our study, cNK cells and liver ILC1s reduced simultaneously in Prdm1ΔNcr1 mice, and even more significant in ILC1s. This indicates that while Prdm1 is expressed at lower levels in ILC1s, its role in preserving the quantity of ILC1s may be more crucial. Thus, Prdm1 and Hobit may have parallel program in instructing ILC1s functional development and maturation.”.

      As shown in Supplemental Figure 8, we analyzed two published scRNA-seq datasets generated with HobitKO mice and integrated the DEGs in cNK cells and ILC1s with our data. We observed overlapping DEGs between Prdm1ΔNcr1 and HobitKO cNK cells and ILC1s, such as Junb, Tcf7, Gzmb, and Prf1 (Supplemental Figure 8), indicating similar regulatory networks of Prdm1 and Hobit. These data are now described on page 19; lines 386-395:

      “We also compared the gene expression patterns between Prdm1 and Hobit (homologue of Blimp1) with two published scRNA-seq data (51, 53). Following the knockout of Hobit, the DEGs were primarily identified within ILC1s. Conversely, after the knockout of Prdm1, a greater number of DEGs were observed in cNK cells. This indicates that Prdm1 likely possesses a broader range of target genes within cNK cells, whereas Hobit appears to have a more pronounced impact on gene expression within ILC1s (Supplemental Figure 8, C-F). There are some overlaps between the downstream transcriptional profile of Prdm1 and Hobit in liver cNK cells and ILC1s (Supplemental Figure 8, G and H), such as Junb, Fosb, Tcf7, Kit, Gzmb, Prf1, and Cxcr6 was simultaneously upregulated or downregulated in both Prdm1ΔNcr1 and HobitKO liver cNK cells or ILC1s, indicating the similar regulatory networks of Prdm1 and Hobit.”.
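      The overlap described above amounts to intersecting two DEG tables and keeping the genes that change in the same direction in both knockouts. A minimal sketch of that logic follows; the gene names and log fold changes are placeholders invented for illustration (not values from either dataset), and the function name is hypothetical.

```python
def concordant_overlap(deg_a, deg_b):
    """Return genes found in both DEG tables whose log fold changes share a sign.

    deg_a, deg_b: dicts mapping gene name -> log2 fold change (knockout vs control).
    """
    shared = deg_a.keys() & deg_b.keys()
    # A positive product means both changes are up, or both are down.
    return sorted(g for g in shared if deg_a[g] * deg_b[g] > 0)


# Placeholder DEG tables (made-up genes and values, for illustration only).
prdm1_degs = {"GeneA": 1.2, "GeneB": -0.8, "GeneC": 0.5}
hobit_degs = {"GeneA": 0.9, "GeneB": 0.4, "GeneD": -1.1}

shared_same_direction = concordant_overlap(prdm1_degs, hobit_degs)
print(shared_same_direction)
```

      In this toy input, GeneB is differentially expressed in both tables but in opposite directions, so it is excluded from the concordant set.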

      Comment 8: Figure 4: The authors should discuss (and cross-validate) their liver gene expression analyses in the context of published datasets of NK and ILC1, such as the ones by Lopez et al, Friedrich et al, Ducimetiere et al and Yomogida et al.

      Response 8: We thank the reviewer for raising this important point. To address this question, we have now analyzed the gene expression of liver cNK cells and ILC1s in two of the published datasets mentioned above, also in the context of Hobit knockout. We compared gene expression across the different clusters and describe the results in our revised manuscript (page 19; lines 386-395):

      “We also compared the gene expression patterns between Prdm1 and Hobit (homologue of Blimp1) with two published scRNA-seq data (51, 53). Following the knockout of Hobit, the DEGs were primarily identified within ILC1s. Conversely, after the knockout of Prdm1, a greater number of DEGs were observed in cNK cells. This indicates that Prdm1 likely possesses a broader range of target genes within cNK cells, whereas Hobit appears to have a more pronounced impact on gene expression within ILC1s (Supplemental Figure 8, C-F). There are some overlaps between the downstream transcriptional profile of Prdm1 and Hobit in liver cNK cells and ILC1s (Supplemental Figure 8, G and H), such as Junb, Fosb, Tcf7, Kit, Gzmb, Prf1, and Cxcr6 was simultaneously upregulated or downregulated in both Prdm1ΔNcr1 and HobitKO liver cNK cells or ILC1s, indicating the similar regulatory networks of Prdm1 and Hobit.”.

      Recommendations For The Authors:

      Comment 9: The use of a paired t-test analysis when comparing cells/groups from different mice is not correct. Instead, the authors should consider using e.g. an unpaired t-test and re-test the indicated significance (e.g. Figure 1F, Figure 2H).

      Response 9: We thank the reviewer for this comment. Because we used littermates for the experiments and compared them side by side, we consider the paired t-test acceptable. We reanalyzed the significance of the results in Figure 1F and Figure 2H using an unpaired t-test. For Figure 1F, the statistical significance obtained with the unpaired t-test was the same as with the paired t-test. However, in Figure 2H, the reduction in IFN-γ production did not reach statistical significance with the unpaired t-test (Supplemental Figure 12B). This may be attributable to the variation among littermates, but the trend remains consistent with our conclusion. We believe that a paired t-test between littermates is also meaningful. We have therefore kept both statistical analyses to ensure a thorough evaluation.
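      To illustrate why the two tests can disagree on the same measurements, the sketch below computes the paired and the unpaired (Student, pooled-variance) t statistics with the Python standard library only. The values are made up for illustration and are not data from this study; the assumption is that index i in each list is one matched pair. When variation between pairs is large but the within-pair difference is consistent, the paired statistic is much larger than the unpaired one.

```python
import math
import statistics


def paired_t(a, b):
    """Paired t statistic and degrees of freedom for matched samples a[i], b[i]."""
    if len(a) != len(b):
        raise ValueError("paired samples must have equal length")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)  # standard error of the mean difference
    return statistics.mean(diffs) / se, n - 1


def unpaired_t(a, b):
    """Student's two-sample t statistic (pooled variance) and degrees of freedom."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / se, na + nb - 2


# Hypothetical % IFN-γ+ values; index i is one matched pair (made-up numbers).
control = [40.0, 55.0, 30.0, 62.0, 48.0]
knockout = [35.0, 49.0, 26.0, 55.0, 43.0]

t_paired, df_paired = paired_t(control, knockout)
t_unpaired, df_unpaired = unpaired_t(control, knockout)
print(f"paired:   t = {t_paired:.2f}, df = {df_paired}")
print(f"unpaired: t = {t_unpaired:.2f}, df = {df_unpaired}")
```

      The paired test removes between-pair variation by working on the differences, which is only justified when the pairing reflects the experimental design.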

      Comment 10: In several instances, it is unclear whether data are pooled or representative (and if so, of how many analyses). This information needs to be provided for all analyses. 

      Response 10: We apologize for the lack of detail and have now provided this information in our figure legends.

      For example, we deleted the numbers in the original histograms so that it is no longer unclear whether the data are pooled or representative (e.g. original Figure 7, A-C; revised Figure 7, A-C). Furthermore, we added “representative” to the figure legends of all flow cytometric plots to better inform readers (e.g. original Figure 2, D and F; revised Figure 2, B and D).

      Comment 11: In the title and abstract authors use "type 1 ILCs" for both NK cells and ILC1, and it is difficult to understand which phenotypes correspond to cNK cells versus ILC1. Most of the analyses clearly separate these two different cell types. I would appreciate a lot being more accurate in the abstract, and describing cNK and ILC1 phenotypes in a clear way.

      Response 11: We apologize for our inaccurate descriptions. Following Spits et al. (Nature Reviews Immunology, 2013) and other related studies, we have now adopted a more appropriate nomenclature: “cNK cells” for conventional NK cells, “ILC1s” for type 1 innate lymphoid cells, and “group 1 ILCs” as the collective name for cNK cells and ILC1s.

      The definition of these cells is given in the introduction (page 4, line 52-53; line 58-62):

      “Group 1 ILCs consist of cNK cells and ILC1s (1, 2), with distinct developmental trajectories and effect molecules (3).”, “In a state of homeostasis, liver group 1 ILCs (CD45+CD3-NK1.1+NKp46+) can be discriminated into cNK cells and ILC1s by the differential expression of CD49a and CD49b (2): cNK cells are marked by the expression of CD49b, while liver ILC1s exhibit a distinctive positivity for CD49a. Tumor Necrosis Factor Related Apoptosis Inducing Ligand (TRAIL) is also expressed on liver ILC1s, but not on cNK cells (10, 11).”. 

      We also describe cNK and ILC1 phenotypes in our scRNA-seq data, as shown in page 13; line 259-261: 

      “cNK cells expressed high levels of Itga2 (CD49b) and Eomes, while ILC1s had high levels expression of Itga1 (CD49a) and Tnfsf10 (Supplemental Figure 5, F and G).”.

      Comment 12: In the abstract authors state "The present study unveiled a novel regulatory mechanism of Prdm1 in liver Type 1 ILCs, showing promising potential for developing innovative immune therapy strategies against liver cancer." - maybe authors should discuss how their findings could be used for therapeutic approaches?

      Response 12: We appreciate comments from the reviewer. As there hasn't been a clear consensus on the role of Prdm1 in NK cells prior to this, some studies have suggested that Prdm1 can inhibit cytokine secretion by NK cells. Particularly, Kallies et al. in their 2011 article in Blood found that Prdm1 might suppress NK cell anti-tumor activity. Hence, there hasn't been any immunotherapy targeting Prdm1 in NK cells for cancer treatment. Our research demonstrates the enhancing role of Prdm1 in NK cell anti-tumor activity, providing theoretical support for NK cell therapy targeting Prdm1. 

      We added the following content to the discussion section (page 29; line 605-609): 

      “Further research may provide deeper insight into the role of PRDM1 in the anti-tumor function of human NK cells, enabling a more direct investigation of its application in cancer therapies. Given its important role in preserving liver cNK cells and ILC1s functional heterogeneity, enhancing Prdm1 function in human NK cells could potentially be a strategy to promote NK cell-based immunotherapy for cancer.”.

      Comment 13: The authors should explain or interpret their data a bit more, e.g. what is the consequence of GSEA enriched in negative regulation of Il6 production (Fig. 3D)? Do NK cells produce Il6 (Figure 3)? What's the impact of Il17 signaling in NK/ILC1 (Figure 5)? Do the authors suggest JunB-driven metabolic reprogramming (Suppl. Fig 6D-F)?

      Response 13: We appreciate the comments from the reviewer. The question of IL-6 production in NK cells was also raised by another reviewer. We checked the GSEA results and found no relevant genes for IL-6 production in NK cells. Following the suggestion of the other reviewer (Response to Reviewer 2, Comment 14), it may be prudent to omit this figure.

      IL-17 signaling points to the plasticity of ILC1s, which may originate from the differentiation of ILC3s. We have added more discussion of this point (page 17; line 341-344):

      “Several ILC3 signature genes, such as Rora, Tmem176a, and Tmem176b (45), highly expressed in this cluster (Supplemental Figure 7D). Considering the close relationship between IL-17 mediated immunity response and ILC3 (1, 46), it is plausible that Il7rhi ILC1 cluster may be attributed, at least in part, to potential plasticity between ILC1 and ILC3 subsets.”.

      The decreased mitochondrial function may have more relevance to NK cell exhaustion in tumors. Our data suggest that the elevated expression of JunB in NK cells may predispose them to exhaustion. Currently, our hypothesis regarding the promotion of NK cell exhaustion by high JunB expression is based on the observed correlation between JunB expression levels and exhaustion phenotypes (at the gene expression and IFN-γ secretion levels) and the findings in reference 67 (Lynn et al., Nature, 2019), where JunB was found to promote T cell exhaustion. However, we have not demonstrated causation between high JunB expression and exhaustion in NK cells. We propose that in NK cells, especially mature NK cells, excessive JunB expression may make them more sensitive to exhaustion inducers. Nevertheless, further research is needed to confirm this. To clarify this, we added the following content in the discussion section (page 26; line 537-543): 

      “While our current data is not sufficient to definitively classify these cells as exhausted NK cells, it supports that a subpopulation, referred to Junbhi cluster, demonstrates an exhaustion-like phenotype.

      The significant increase in this cell population following Prdm1 knockout in NK cells may potentially be one of the reasons why Prdm1ΔNcr1 mice lose their tumor-killing capacity. Whether the excessive expression of JunB in NK cells is also a contributing factor to their exhaustion, similar to T cells(65), requires further investigation.”.

      Comment 14: Ref 25 and Ref 57 are the same publication?

      Response 14: We apologize for this careless mistake. We have checked all the references and corrected the formatting errors.

      Comment 15: Figure 1, E - The method description of RT-PCR is missing. I apologize if I have overlooked this information.

      Response 15: We have now added the description of RT-PCR in our revised method section (page 31; line 638-644):

      “RNA was extracted from FACS-sorted NK cells or splenocytes using RNASimple Total RNA Kit (TIANGEN Biotech, 4992858) and subsequently reverse transcribed to cDNA with SuperScript VILO Master Mix (Thermo Fisher Scientific, 11755050) according to manufacturer’s instructions. qPCR was performed with SYBR Green Mix (Thermo Fisher Scientific, A25742) and CFX Opus 96 Real-Time PCR System (Bio-Rad). The relative mRNA expression level was calculated using 2^(-ddCt) method. Primer sequences: Prdm1: 5’-CAGAAACACTACTTGGTACA-3’; 5’-GATTGCTTGTGCTGCTAA-3’.”
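      For readers unfamiliar with the 2^(-ddCt) calculation named in the methods text above, a minimal sketch follows. The Ct values are invented for illustration, the function name is hypothetical, and the use of a single reference (housekeeping) gene is an assumption, since the quoted methods do not name one.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Relative expression of a target gene by the 2^(-ddCt) method.

    dCt  = Ct(target) - Ct(reference gene), computed per condition
    ddCt = dCt(sample) - dCt(control)
    """
    d_ct_sample = ct_target_sample - ct_ref_sample
    d_ct_control = ct_target_control - ct_ref_control
    dd_ct = d_ct_sample - d_ct_control
    return 2.0 ** (-dd_ct)


# Made-up Ct values: the target amplifies 3 cycles later in the sample than
# in the control, while the reference gene is unchanged.
fold = relative_expression(
    ct_target_sample=27.0, ct_ref_sample=18.0,
    ct_target_control=24.0, ct_ref_control=18.0,
)
print(fold)
```

      A ddCt of 3 cycles corresponds to an eightfold lower relative expression in the sample, assuming roughly 100% amplification efficiency for both primer pairs.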

      Comment 16: Figure 1, F - The NKp46+CD3- gate for the liver seems to cut the population, not all cells are included.

      Response 16: We appreciate the reviewer’s comment and apologize for our carelessness. We expanded our data with more samples and reanalyzed them with a more convincing gating strategy. We have updated our figures (revised Figure 1G; revised Supplemental Figure 2A). The data and conclusions changed in several places, and we have revised the corresponding content in our manuscript accordingly.

      The original text is:

      “Proportion and absolute number of cNK cells in blood, bone marrow, lung, liver, spleen, and lymph nodes were analyzed by flow cytometry. Compared with Prdm1+/+ mice, the percentage of cNK cells (CD3-NK1.1+NKp46+) among lymphocytes was decreased in all of these tissues except bone marrow and lymph nodes (Figure 1F; Supplemental Figure 2A). However, no significant difference was observed in the percentage of cNK cells among bone marrow-derived lymphocytes between Prdm1ΔNcr1 and Prdm1+/+ mice. The absolute number of cNK cells in blood, lung, liver, and spleen also decreased in Prdm1ΔNcr1 mice (Figure 1F; Supplemental Figure 2A). Only a slight decrease in the number of cNK cells was observed in the lymph nodes of Prdm1ΔNcr1 mice, which did not reach statistical significance either (Supplemental Figure 2A). In contrast, the absolute number of cNK cells in Prdm1fl/fl mice bone marrow is moderately higher than Prdm1ΔNcr1 mice (Figure 1F).”

      The revised text is (page 8; line 142-146):

      “Proportion and absolute number of cNK cells in blood, bone marrow, lung, liver, spleen, and lymph nodes were analyzed by flow cytometry. Compared with Prdm1+/+ mice, the percentage and absolute number of NK cells (CD45+CD3-NK1.1+NKp46+) among lymphocytes was decreased in all of these tissues, whereas increased number of NK cells were observed in bone marrow (Figure 1G; Supplemental Figure 2A).”

      Comment 17: Figure 1, The y-axis labeling of lung CD3-NKp46+ cells (x10^3) is not correct.

      Response 17: We apologize for our carelessness. We have now checked the labels and made sure they are correct.

      Comment 18: Figure 1, The statistical significance of absolute numbers of NKp46+ cells in the bone marrow should be reviewed.

      Response 18: We expanded our data with more samples and reanalyzed them with a more convincing gating strategy. We observed a significant increase in the number of bone marrow NK cells in our updated data. These changes are now described in our revised manuscript.

      The original text is: 

      “However, no significant difference was observed in the percentage of cNK cells among bone marrow-derived lymphocytes between Prdm1ΔNcr1 and Prdm1+/+ mice”, “In contrast, the absolute number of cNK cells in Prdm1fl/fl mice bone marrow is moderately higher than Prdm1ΔNcr1 mice (Figure 1F).”

      The revised text is (page 8; line 142-146):

      “Proportion and absolute number of cNK cells in blood, bone marrow, lung, liver, spleen, and lymph nodes were analyzed by flow cytometry. Compared with Prdm1+/+ mice, the percentage and absolute number of NK cells (CD45+CD3-NK1.1+NKp46+) among lymphocytes was decreased in all of these tissues, whereas increased number of NK cells were observed in bone marrow (Figure 1G; Supplemental Figure 2A).”

      Comment 19: Figure 1, G - CD27 and CD11b are used to define maturation stages within NK cells. Here the authors are analyzing group 1 ILC instead (containing both NK cells and ILC1, especially in the liver). It would be better to pre-gate on Eomes+ or CD49b+ NK cells for this analysis.

      Response 19: We apologize for the lack of detail in this analysis. We pre-gated on CD49b+ NK cells for the analysis of maturation stages. We have now added this statement to our revised manuscript and figure legend (page 8; line 149-151):

      “The maturation of cNK cells (gated by CD45+CD3-NK1.1+NKp46+CD49b+) from blood, bone marrow, lung, liver, spleen, and lymph nodes were assessed, based on the expression of CD11b and CD27.”.

      Comment 20: Supplementary Figure 1, A - The NKp46+CD3- gate seems to cut the population, not all cells are included. y-axis labeling of spleen CD3-NKp46+ cells (%) is not correct.

      Response 20: Thank you. We have corrected these errors, as shown in our revised Supplemental Figure 2A.

      Comment 21: Figure 2, D-G - Did the authors analyse the ILC1/NK compartment of the tumor? What is the abundance and phenotype of these cells dependent on Prdm1 expression? Proper Cre controls should be used (see above).

      Response 21: We appreciate the suggestions from the reviewer. As requested, we have now added an analysis of the cNK/ILC1 populations in the tumor context. The changes in the proportions of cNK cells and ILC1s in Prdm1ΔNcr1 mice were similar to those seen without tumor burden, while the numbers of both cNK cells and ILC1s decreased in tumor-bearing liver (revised Figure 7D). This content has been added to our revised manuscript (page 23; line 479-481):

      “The proportion changes of cNK cells and ILC1s in Prdm1ΔNcr1 mice was similar with the no tumor-burden condition, while the number of both cNK cells and ILC1s have significant decreased in tumor-bearing liver (Figure 7D).”.

      The reason why we did not use Cre controls is given in Response 1.

      Comment 22: Figure 2, H - Prdm1-deficient NK and ILC1 produce less Ifng in response to in vitro stimulations with Il-12 and /or Il-18, and bulk Seq analysis (Fig 3F) shows reduced Il12rb2 expression. Does the expression of cytokine receptors correlate with the maturation of NK cells? This could be analyzed from the single-cell RNA-seq dataset. The statistical significance of %Ifng after Il12/Il18 stimulation should be revisited (see above).

      Response 22: We thank the reviewer for the suggestions. To address this question, we examined the expression of IL-12 and IL-18 receptors in the cNK and ILC1 clusters. Within cNK clusters, Il12rb2, Il18r1 and Il18rap were highly expressed in the Prf1hi and Cxcr3hi cNK clusters (revised Supplemental Figure 6H), indicating that IL-18 receptor expression correlates with NK cell maturation. In ILC1s, these receptors were mostly expressed in the Il7rhi and Gzmbhi ILC1 clusters (revised Supplemental Figure 7C). The significant decrease in Il18r1 expression in Prdm1ΔNcr1 cNK cells and ILC1s may be associated with their impaired ability to produce IFN-γ. We have now added this analysis (page 18; line 364-368):

      “Within cNK cells, Il12rb2, Il18r1 and Il18rap were highly expressed in Prf1hi and Cxcr3hi cNK clusters (Supplemental Figure 6I), indicating that IL-18 receptor expression correlated with NK cell maturation. In ILC1s, these receptors were mostly expressed in the Il7rhi and Gzmbhi ILC1 clusters (Supplemental Figure 7D). The significant decrease in Il18r1 expression in Prdm1ΔNcr1 cNK cells and ILC1s may be associated with the impaired ability to produce IFN-γ.”.

      The unpaired t-test of IFN-γ production is shown in revised Supplemental Figure 12B. The difference in IFN-γ production in the original Figure 2H was not significant when analyzed with an unpaired t-test. However, significance was observed in tumor-bearing liver cNK cells and ILC1s, specifically under IL-12/IL-18 stimulation, as shown in the original Figure 7E using an unpaired t-test. These variations may be attributable to differences between littermates. Despite these variations, the trend remains consistent with our overall conclusions. We believe that a paired t-test between littermates is also meaningful. We have therefore kept both statistical approaches to ensure a thorough evaluation.
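      The contrast between the two tests can be illustrated with a minimal sketch. The values below are hypothetical percentages of IFN-γ+ cells for four littermate pairs, not data from the study, and the function names are our own. When baseline levels vary strongly between litters but the within-pair difference is stable, the paired t statistic is large while the unpaired (pooled-variance) statistic stays small.

```python
import math
import statistics


def unpaired_t(a, b):
    """Student's two-sample t statistic, assuming equal variances."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(
        pooled_var * (1 / na + 1 / nb)
    )


def paired_t(a, b):
    """Paired t statistic computed on the within-pair differences."""
    diffs = [x - y for x, y in zip(a, b)]
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(len(diffs)))


# Hypothetical %IFN-g+ values; index i in both lists is one littermate pair.
wild_type = [60.0, 45.0, 30.0, 52.0]
knockout = [52.0, 38.0, 24.0, 45.0]

t_unpaired = unpaired_t(wild_type, knockout)
t_paired = paired_t(wild_type, knockout)
print(f"unpaired t = {t_unpaired:.2f}, paired t = {t_paired:.2f}")
```

      The sketch reports only the t statistics; converting them to p-values requires the t distribution with 6 (unpaired) or 3 (paired) degrees of freedom.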

      Comment 23: Figure 3, A-E - For bulk sequencing analysis, splenic CD3-NK1.1+NKp46+ were isolated. This population also contains ILC1 in the spleen (e.g. Flommersfeld et al.), although much less abundant compared to NK cells, and compared to the liver compartment. However, have the authors tested the abundance of splenic ILC1 in Prdm1-deficient mice, which may impact the gene expression data? In line with this the detection of altered Cxcr6 expression in Figure F, which is usually expressed by ILC1 rather than NK cells, may indicate an alteration in ILC1 numbers. The authors should validate the altered expression of CXCR6, Itga1, and Cx3cr1 on NK cells by flow cytometry.

      Response 23: We have cited the work of Flommersfeld et al. in our manuscript and expanded our Results section to include the following information (page 19; line 377-385):

      “Previous research found that spleen NK cells could be divided into three distinct groups based on their expression levels of CD27, CD62L, CD49a, and CD49b (52). CD27+CD62L- NK cells have remarkably high expression of Batf3, while it was only barely expressed in CD27+CD62L+ and CD27-CD62L+ NK cells (52). Based on the sequencing data published by Flommersfeld et al. (GSE180978), a notable negative correlation was observed between the expression levels of Prdm1 and Batf3 (Supplemental Figure 8I). On top of that, our findings unveiled the negative regulatory influence of Prdm1 on Batf3 within both spleen and liver NK cells. This discovery highlights a potential upstream mechanism that may influence the homeostasis of the spleen NK cell subpopulations through Batf3.”.

      We validated the expression of CD49a (Itga1) and CX3CR1 in liver cNK cells and ILC1s, as described in our revised manuscript (page 9; line 170-174, page 14; line 231-233):

      “Increased CD49a expression was also observed in Prdm1ΔNcr1 liver ILC1s, while it showed decreased expression in NKp46+ cells in the liver, bone marrow, and lymph nodes (Supplemental Figure 2, F and G).”, “The percentage of CX3CR1+ cNK cells was significantly decreased in multiple tissues of Prdm1ΔNcr1 mice, while the proportion of CX3CR1+ ILC1 was increased in the liver (Figure 3F).”

      Comment 24: Figure 3, F - Tnfsf26: which gene is this? is this a typo? Is a function of this gene in NK cells reported? Altered Batf3 expression suggests an impact on ILC1-like NK cells (Flommersfeld et al).

      Response 24: We apologize for this mistake. We have removed Tnfrsf26 from the heatmap.

      Comment 25: Figure 3, G-J refer to Kallies data?! 

      Response 25: Kallies’s study reported reduced GzmB expression in Blimp1gfp/gfp mice. Extending that work, we analyzed GzmB and perforin expression at different maturation stages of NK cells. The reduced GzmB expression was not solely due to the less mature phenotype of Prdm1-deficient NK cells, highlighting the role of Prdm1 in regulating NK cell function. We have therefore added the following to the revised manuscript (page 12; line 233-242):

      “Lower GZMB and PRF1 production was observed in Prdm1-deficient splenic cNK cells, liver cNK cells and ILC1s (Figure 3, H-K; Supplemental Figure 4, A-I). Notably, the proportion of GZMB+ and PRF1+ cNK cells was decreased among almost all of the maturation stages of cNK cells (Figure 3, J and K). The relative mean fluorescent intensities (MFIs) of GZMB and PRF1 consistently show a reduction across all developmental stages in Prdm1ΔNcr1 NK cells (Supplemental Figure 4, H and I). Yet, no statistical difference of PRF1 was found within the CD11b-CD27+ and CD11b+CD27+ subsets, likely due to the relatively lower perforin levels in these populations (Supplemental Figure 4I). These findings suggest that Prdm1 may directly influence cytotoxic molecules in NK cells, rather than impacting their anti-tumor abilities solely by affecting the maturation phenotype of Prdm1-deficient NK cells.”

      In the Discussion section (Kallies’s work is cited here in the revised manuscript) (page 24; line 500-502):

      “Our results not only confirmed a decrease in cytotoxic molecules in Prdm1-deficient NK cells (31) but also showed that the reduction in Gzmb and perforin is not solely attributable to the diminished maturation of these cells.”

      Comment 26: Figure 3, G, I - How do the authors explain the high variability of GzmB and Prf1 in Prdm1+/+ cells? 2 samples have comparable values to Prdm1-deficient cells.

      Response 26: This may be due to the inherent differences in MFI among different samples. In the revised version, we have added data on percentages, which exhibit much less variability (Figure 3, H and I). The MFIs of GZMB and PRF1 have been moved to Supplemental Figure 4, E and F.

      Comment 27: Did the authors test the mice for potential germline recombination of the floxed allele, which has been suggested as a potential problem of Ncr1cre?

      Response 27: We appreciate the reviewer's insightful comment; this is a very good question. In Prdm1fl/fl mice, germline recombination typically results in a systemic knockout of Prdm1, which can lead to embryonic lethality. Given that mice were successfully born in the current study, it is highly unlikely that germline recombination of Prdm1 occurred due to leaky expression of Cre.

      To address this point, we isolated splenocytes and assessed Prdm1 expression using qPCR. We observed no significant difference in Prdm1 expression between splenocytes from Prdm1+/+ and Prdm1ΔNcr1 mice (revised Figure 1F). This also indicates that germline recombination is unlikely to be present in the Prdm1ΔNcr1 mice.

      Comment 28: Histograms do not show MFI

      Response 28: We appreciate the reviewer's comment. The MFI values have been removed.

      Comment 29: Supplementary Figure 4, B - FACS plot labelling: Typo, Histograms do not show MFI.

      Response 29: We sincerely thank the reviewer for the careful reading. The typo in this figure has been corrected, and the MFI values have been removed.

      Comment 30: Figure 4, A - What are the cells in the red cluster in the middle of the UMAP, do they belong to B cells? Why do they cluster so separately? It is interesting, but also surprising that NK and ILC1 cluster map so far apart from each other (rather with CD8 or B cells? or NKT cells) - do the authors have any comments?

      Response 30: We sincerely apologize for the mistakes in labeling a group of cells in our previous analysis. Upon a thorough re-evaluation, we have corrected the labels of several cell clusters that were previously misidentified. The revised heatmap (revised Supplemental Figure 5C) represents the marker genes for each cluster. Additionally, in our updated analysis (revised Figure 4A), we have included clusters for Epithelial cells, CD4+ T cells, NKT cells, and Kupffer cells. Please note, the red cluster identified in the center of the original heatmap corresponds to the CD4+ T cells.

      We checked the markers of the cNK cell and ILC1 clusters and confirmed that they are labeled correctly, as Ncr1 and Klrb1c (NK1.1) were highly expressed in these clusters compared to others (revised Supplemental Figure 5E).

      Comment 31: Does Junb expression correlate with the maturation stages of NK cells?

      Response 31: Our previous research indicated that during the maturation process of NK cells, there was a decrease in the expression levels of Junb (negative correlation), whereas there was an increase in the expression levels of Prdm1 (Wang et al., J Clin Invest, 2018; Supplemental Figure 5c and Supplemental Figure 11).

      Comment 32: The authors may consider validating their scRNA-seq data (e.g. by FACS analysis for highlighted markers, eg. cKit, Tcf7, Gzma, Cxcr3).

      Response 32: We appreciate the suggestion from the reviewer. We validated several marker genes, including Gzmb, Prf1, and Cx3cr1, by FACS, as shown in the revised Figure 3, F-K. Currently, FACS cannot resolve liver NK cells into as many distinct clusters as scRNA-seq analysis can. However, we expect that as technology progresses, we will be able to enhance our validation of the scRNA-seq data.

      Comment 33: It is a bit unclear to me why authors refer to Cxcr3hi NK cells as tissue-resident. This is based on Cxcr3 and Ccr2 expression. To make this statement, a much more detailed analysis would be required. How are CD69, CD49a, or CXCR6 expression of these cells?

      Response 34: We appreciate the suggestion from the reviewer. The primary reason for classifying this specific cluster of NK cells as tissue-resident comes from the differentially expressed genes (DEGs) and Gene Ontology (GO) analysis, which demonstrate significant chemokine receptor activity within this cluster.

      To support this statement, we checked the expression of the above markers. Only Cd69 was expressed in cNK clusters, where it was highly expressed in Junbhi and Cxcr3hi cNK cells (revised Supplemental Figure 6D). We also used the top 30 DEGs in ILC1s versus cNK cells to calculate a module score in all cNK clusters; Cxcr3hi cNK cells had the highest score among these clusters (revised Supplemental Figure 6D). This part has been updated in our manuscript (page 15; line 298-308):

      “The tissue-resident marker Cd69 was also highly expressed in this cluster (Supplemental Figure 6D). The enrichment of chemokine receptors among the genes upregulated in the Cxcr3hi cluster implies a greater likelihood of this cluster being tissue-resident compared with other cNK cell clusters (Figure 4H). To further confirm the tissue-resident properties of this cluster, we calculated the module score based on the top 30 DEGs in ILC1 versus cNK clusters, including Cxcr6, Itga1, Cd160, Cd226, etc. The Cxcr3hi cNK cluster had the highest score among all cNK clusters (Supplemental Figure 6H), indicating its similarity to liver ILC1s. In the tumor microenvironment, reports indicated that NK cells could transform into ILC1s (25). If this conversion of cNK cells into ILC1s also occurred under normal physiological conditions, then the Cxcr3hi cNK cell cluster might be the most susceptible to such transformation.”

      Comment 35: The authors suggest that Prdm1 regulates chemokine receptor expression. An alternative explanation could be that this is an indirect effect of altering the abundance of NK cell subsets.

      Response 35: We apologize for the lack of detail in these figures. The input cell number of each genotype has now been added to the following figure legends.

      Figure 4F, “Proportions of cNK cells among total cNK cells (left; 211 cells in Prdm1+/+, and 141 cells in Prdm1ΔNcr1) and within clusters (right).”; Figure 5C, “Proportions of ILC1s among total ILC1s in different genotypes (left; 114 cells in Prdm1+/+, and 63 cells in Prdm1ΔNcr1) and within each cluster (right).”; Figure 6C, “Proportions of MDMs and KCs among total macrophages in different genotypes (510 cells in Prdm1+/+, and 624 cells in Prdm1ΔNcr1).”

      To minimize the effects of discrepancies in input numbers between samples with different genotypes, we represented the relative proportions of each cluster within its specific genotype (e.g. Supplemental Figure 6B; Supplemental Figure 7B; Supplemental Figure 9B).
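      The normalization described above can be sketched as follows. The per-cluster counts are hypothetical placeholders chosen only so that they sum to the cNK input numbers given in the legend of Figure 4F (211 and 141 cells); the cluster names and the function name are likewise our own assumptions.

```python
def cluster_proportions(counts):
    """Return each cluster's share of the cells within one genotype."""
    total = sum(counts.values())
    return {cluster: n / total for cluster, n in counts.items()}


# Hypothetical per-cluster cell counts; only the totals come from the text.
wt_counts = {"cluster_a": 100, "cluster_b": 61, "cluster_c": 50}  # 211 cells
ko_counts = {"cluster_a": 40, "cluster_b": 71, "cluster_c": 30}   # 141 cells

wt_props = cluster_proportions(wt_counts)
ko_props = cluster_proportions(ko_counts)

for cluster in wt_counts:
    print(f"{cluster}: WT {wt_props[cluster]:.1%}, KO {ko_props[cluster]:.1%}")
```

      Because each genotype is divided by its own total, the comparison between genotypes is not driven by the difference in input cell numbers.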

      Comment 36: Supplementary Figures 6 and 7, A - The formatting of gene annotations does not fit the heat maps (the gene names on the last rows are missing).

      Response 36: We apologize for these errors, which have now been corrected.

      Comment 37: Supplementary Figures 6 and 7, What is the consequence of compromised mitochondrial function? Increase apoptosis?

      Response 37: In our experiments, we did not find that Prdm1 affects the apoptosis of NK cells. Conversely, previous studies have found that Prdm1 might inhibit the proliferation of NK cells (Kucuk et al., PNAS, 2011). We acknowledge that there is ongoing debate regarding the precise definition of NK cell exhaustion. In our experiments, no changes were detected in the expression of an exhaustion-associated surface marker (TIGIT) on NK cells following Prdm1 knockout. However, we did note a significant reduction in the cytokine secretion capacity and tumor control efficacy of NK cells after Prdm1 knockout. We therefore suggest that the consequence of compromised mitochondrial function might be increased exhaustion. As mentioned in the Discussion (line 482-483), mitochondrial fragmentation has been shown to be closely associated with NK cell exhaustion in tumors (Zheng et al., Nature Immunology, 2019). Although the evidence is not sufficient to define Prdm1ΔNcr1 NK cells as exhausted, our data suggest that compromised mitochondrial function is associated, at least in part, with the exhausted phenotype of Prdm1ΔNcr1 NK cells in cancer.

      We have discussed these points in our revised manuscript (page 26; line 529-543): 

      “Mitochondria are pivotal organelles crucial for cellular metabolism. Disruptions in mitochondrial function have been linked to T Cell exhaustion, attributed to glycolytic reprogramming (66). Similarly, mitochondrial fragmentation has been closely associated with NK cell exhaustion (67).

      However, the concept of NK cell exhaustion isn't as firmly established as it is for T cells. Exhausted NK cells should primarily exhibit diminished functions. This is characterized by a diminished ability to destroy tumor cells, a reduced capability to activate other components of the immune system, and compromised proliferation and survival rates. Additionally, this reduced functionality is associated with a decline in the expression of molecules responsible for cytotoxic activity, lower production of IFN-γ, and metabolic disturbances that may arise from mitochondrial dysfunction. While our current data is not sufficient to definitively classify these cells as exhausted NK cells, it supports that a subpopulation, referred to as the Junbhi cluster, demonstrates an exhaustion-like phenotype. The significant increase in this cell population following Prdm1 knockout in NK cells may potentially be one of the reasons why Prdm1ΔNcr1 mice lose their tumor-killing capacity. Whether the excessive expression of JunB in NK cells is also a contributing factor to their exhaustion, similar to T cells (65), requires further investigation.”.

      Comment 38: Figure 5, Describing the scRNA Seq data, the authors are switching a lot between Figure 4 and Figure 5. Maybe a reorganization of the Figures (Figure 4: NK cell; Figure 5: ILC1) could help.

      Response 38: We appreciate the reviewer’s suggestion. We have now reorganized Figures 4 and 5.

      Comment 39: Figure 5, We suggest naming one of the ILC1 clusters "Gzmbhi" to keep it consistent with the FACS data.

      Response 39: We agree with this excellent suggestion and have now renamed the “Gzmahi” ILC1 cluster the “Gzmbhi” ILC1 cluster.

      Comment 40: Figure 5, C - How was the JunB score derived (which genes were used)?

      Response 40: The JunB score was calculated from the expression of marker genes of the Junbhi cNK cluster (DEGs in the Junbhi cNK cluster compared with other clusters, as shown in revised Supplemental Figure 6A). The score was calculated using the AddModuleScore function of the Seurat R package.
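      The idea behind a module score can be sketched in simplified form: for each cell, the mean expression of the signature genes minus the mean expression of a set of control genes. This is an illustration only and not Seurat's implementation, which additionally bins genes by average expression and samples control genes from matching bins. The gene names and expression values below are hypothetical placeholders.

```python
import statistics


def module_score(cell_expression, signature_genes, control_genes):
    """Mean signature expression minus mean control expression for one cell."""
    signature_mean = statistics.mean(cell_expression[g] for g in signature_genes)
    control_mean = statistics.mean(cell_expression[g] for g in control_genes)
    return signature_mean - control_mean


# Hypothetical normalized expression values for two cells.
cells = {
    "cell_1": {"sig_a": 3.0, "sig_b": 2.0, "ctl_a": 1.0, "ctl_b": 1.0},
    "cell_2": {"sig_a": 0.5, "sig_b": 0.5, "ctl_a": 1.0, "ctl_b": 2.0},
}
signature = ["sig_a", "sig_b"]  # stand-ins for the cluster's marker genes
controls = ["ctl_a", "ctl_b"]   # stand-ins for expression-matched control genes

scores = {name: module_score(expr, signature, controls) for name, expr in cells.items()}
print(scores)
```

      A positive score means the signature genes are expressed above the control background in that cell; a negative score means they are expressed below it.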

      Comment 41: Figure 5, G, I - The authors highlight Il17 signaling pathway, what is the impact of Il17 on NK/ILC1? Did the authors check for ILC3 (Rorc expression) within the ILC1 cluster?

      Response 41: The enrichment of the IL-17 signaling pathway in Il7rhi ILC1s suggests that this cluster includes ILC1s that originate from the conversion of Rorγt+ ILC3s. Although Rorc expression was undetectable in all ILC1 clusters, we found several ILC3 marker genes highly expressed in this cluster (e.g. Rora, Tmem176a, Tmem176b), according to published ILC3 transcriptomes (Robinette et al., Nature Immunology, 2015).

      We have added these contents in our revised manuscript (page 17; line 341-344): 

      “Several ILC3 signature genes, such as Rora, Tmem176a, and Tmem176b (45), were highly expressed in this cluster (Supplemental Figure 7D). Considering the close relationship between IL-17 mediated immunity response and ILC3 (1, 46), it is plausible that the Il7rhi ILC1 cluster may be attributed, at least in part, to potential plasticity between ILC1 and ILC3 subsets.”.

      Comment 42: Figure 5, The authors detect more Ly49E+ cytotoxic ILC1 in Prdm1fl Ncr1cre mice.

      How does this observation fit to the reduced cytotoxicity of NK cells?

      Response 42: The proportion of Klrahi ILC1s was increased, while that of Gzmbhi ILC1s was decreased, in Prdm1ΔNcr1 mice. Moreover, the total number of cells across the three ILC1 clusters was reduced in Prdm1ΔNcr1 mice.

      Comment 43: Line 350/351: Citation required.

      Response 43: We have added the respective references (references 55 and 56).

      Comment 44: Figure 6, The Cell-chat analysis provides interesting suggestions, but none are experimentally addressed. It is also difficult to evaluate these analyses: are any of the Mac subsets altered in frequency or phenotype in either genotype? This could be analyzed from the single-cell data in Fig 4. At the very least, flow cytometric validation of predicted shifts in the Mac compartment should be confirmed.

      Response 44: We are grateful for these valuable suggestions. As requested, we analyzed macrophages and validated some of the scRNA-seq data by flow cytometry. We have rewritten this part to include the analysis of the altered proportions of the two macrophage clusters (Kupffer cells and monocyte-derived macrophages) (page 20-21; line 399-436):

      “The scRNA sequencing analysis identified two well-established subpopulations of liver macrophages: the resident Kupffer Cells (KCs) and the Monocyte-Derived Macrophages (MDMs) (Figure 6, A-C; Supplemental Figure 9A). When comparing the total proportion of macrophages within the immune cell population of the liver between WT and Prdm1ΔNcr1 mice, there is an increase in Prdm1ΔNcr1 mice (Figure 6C). To confirm these findings, we utilized flow cytometry to define macrophages, including both KCs and MDMs, gating by CD45+Ly6G-F4/80+CD11b+ (Figure 6D).

      Our analysis showed that, following the deletion of Prdm1 in Group 1 ILCs, there is a significant increase in both the proportion and number of macrophages in the liver (Figure 6D).

      According to the transcriptional profile, liver macrophages were further clustered and labeled as “Ly6c2hi”, “Cxcl2hi”, and “Ear2hi” MDMs, and “Mrc1hi” and “C1qhi” KCs (Figure 6A, Supplemental Figure 9, A-E). An increased proportion of MDMs and KCs was observed in Prdm1ΔNcr1 cells (Supplemental Figure 9B). Within MDM clusters, Ly6c2hi MDMs were mainly composed of Prdm1+/+ cells, while Prdm1ΔNcr1 cells were concentrated in the Cxcl2hi cluster (Figure 6C). The scRNA-seq data reveal that following Prdm1 knockout in NKp46+ cells, there is a decrease in the proportion of KCs within the macrophage population, while the proportion of MDMs increases (Figure 6D). CX3CR1, a chemokine receptor, is extensively utilized to distinguish KCs and MDMs within macrophages. Cells expressing CX3CR1 are identified as MDMs, whereas those without CX3CR1 expression are categorized as KCs (56). Employing flow cytometry and leveraging CX3CR1 expression, we assessed the ratios of KCs and MDMs. However, diverging from the scRNA-seq findings, flow cytometry indicates that post-Prdm1 knockout in group 1 ILCs, there is a minor increase in the proportion of KCs within the total liver macrophages, and a decrease in the proportion of MDMs (Figure 6D; Supplemental Figure 9B). This discrepancy could stem from the different bases of classification: scRNA-seq defines KCs based on gene expression profiles, whereas flow cytometry differentiates between KCs and MDMs using the single surface marker, CX3CR1. Analysis of the macrophage subsets identified by scRNA-seq reveals that, while MDM clusters generally show high CX3CR1 expression, there exists a subset within MDMs, labeled Mrc1hi, that also exhibits high levels of CX3CR1 (Supplemental Figure 9C). Consequently, if flow cytometry solely employs CX3CR1 for differentiating between KCs and MDMs, it could result in disparities when compared to scRNA-seq outcomes.

      Both KCs and MDMs were significantly increased in Prdm1ΔNcr1 mice, which was consistent with the scRNA-seq data (Supplemental Figure 9, B and F). Despite the decrease in the proportion of Ly6c2hi MDMs in Prdm1ΔNcr1 mice, the expression levels of Ly6c2 exhibited minimal variation between WT and Prdm1ΔNcr1 mice (Supplemental Figure 9D). Intriguingly, within certain cellular subsets, notably the Ear2hi cluster, the Ly6c2 expression levels in KO mice were found to be higher than those in WT mice. Additionally, we employed flow cytometry to examine Ly6C expression within the macrophages. Similar to the scRNA-seq findings, there were no notable differences in Ly6C expression levels between WT and KO mice (Figure 6E; Supplemental Figure 9G).”.

      The changes in the macrophage compartment indicate a potential influence of functional NK cells on macrophages. We have revised these parts in our Results and Discussion (line 590-601). Further analysis of macrophages would be worthwhile but goes beyond the scope of this manuscript; it will be a direction of our future work.

      Comment 45: Figure 6, C1qhi Mac only are few cells/events, and interactions (or cells?) seem to be gone in the Prdm1-floxed mice. Is that true? Does it make sense to perform cell-chat analysis on so few cells?

      Response 45: We have now added KCs to the cell-chat analysis; this cluster belongs to the C1qhi KCs. We have revised the corresponding parts of the analysis in our manuscript (page 20-21; line 408-428):

      “According to the transcriptional profile, liver macrophages were further clustered and labeled as “Ly6c2hi”, “Cxcl2hi”, and “Ear2hi” MDMs, and “Mrc1hi” and “C1qhi” KCs (Figure 6A, Supplemental Figure 9, A-E). An increased proportion of MDMs and KCs was observed in Prdm1ΔNcr1 cells (Supplemental Figure 9B). Within MDM clusters, Ly6c2hi MDMs were mainly composed of Prdm1+/+ cells, while Prdm1ΔNcr1 cells were concentrated in the Cxcl2hi cluster (Figure 6C). The scRNA-seq data reveal that following Prdm1 knockout in NKp46+ cells, there is a decrease in the proportion of KCs within the macrophage population, while the proportion of MDMs increases (Figure 6D). CX3CR1, a chemokine receptor, is extensively utilized to distinguish KCs and MDMs within macrophages. Cells expressing CX3CR1 are identified as MDMs, whereas those without CX3CR1 expression are categorized as KCs (56). Employing flow cytometry and leveraging CX3CR1 expression, we assessed the ratios of KCs and MDMs. However, diverging from the scRNA-seq findings, flow cytometry indicates that post-Prdm1 knockout in group 1 ILCs, there is a minor increase in the proportion of KCs within the total liver macrophages, and a decrease in the proportion of MDMs (Figure 6D; Supplemental Figure 9B). This discrepancy could stem from the different bases of classification: scRNA-seq defines KCs based on gene expression profiles, whereas flow cytometry differentiates between KCs and MDMs using the single surface marker, CX3CR1. Analysis of the macrophage subsets identified by scRNA-seq reveals that, while MDM clusters generally show high CX3CR1 expression, there exists a subset within MDMs, labeled Mrc1hi, that also exhibits high levels of CX3CR1 (Supplemental Figure 9C). Consequently, if flow cytometry solely employs CX3CR1 for differentiating between KCs and MDMs, it could result in disparities when compared to scRNA-seq outcomes.”.

      Comment 46: Figure 6, C - Here the interactions of both Mac+ILC1 and Mac+NK are shown together. It would be interesting to separate this analysis (also Suppl. Fig 9A-B) into comparisons of Mac+ILC1 vs Mac1+NK from WT or Prdm1fl Ncr1 mice.

      Response 46: As requested, we re-analyzed this part for each genotype; the results are shown in Supplemental Figure 10. These data are now described in the manuscript (page 22; line 445-447):

      “The reduction of interaction mostly occurred in the cross-talk of ILC1-MDM and ILC1-KC, whereas no difference was observed in cNK-MDM and cNK-KC interaction (Supplemental Figure 10, A-H)”

      Comment 47: Supplementary Figure 9, A, B - Is this analysis using WT and Prdm1fl Ncr1cre dataset together? 

      Response 47: Yes, we used the WT and Prdm1ΔNcr1 data together. As requested above, we have now separated this analysis into WT and Prdm1ΔNcr1 mice. These data are now described in the manuscript (page 22; line 445-460):

      “The reduction of interaction mostly occurred in the cross-talk of ILC1-MDM and ILC1-KC, whereas no difference was observed in cNK-MDM and cNK-KC interaction (Supplemental Figure 10, A-H). A reduction in the interaction of ligand-receptor, such as Mif-CD74, Cxcl16-Cxcr6, and Cxcl10-Cxcr3 was observed in Prdm1ΔNcr1 mice compared to Prdm1+/+ mice (Supplemental Figure 11). Compared to Prdm1+/+ mice, the information flow of CXCL and MIF pathways significantly decreased in Prdm1ΔNcr1 mice (Figure 6, H and I; Supplemental Figure 10, B, D, F, and H). These pathways play a crucial role in facilitating macrophage migration. The CXCL signaling was sent from Ly6c2hi Cxcl2hi MDMs and C1qhi KC, targeting all ILC1 clusters and Cxcr3hi cNK cell clusters (Figure 6J). Of note, although the population of Cxcl2hi macrophage primarily comprised cells from Prdm1ΔNcr1 mice, the interaction within the CXCL pathway between macrophages and group 1 ILCs was obviously less than Prdm1+/+ sample (Figure 6J). These changes could be linked to a decreased population of ILC1s and Cxcr3hi cNK cell cluster in Prdm1ΔNcr1 mice, implying that the homeostasis of Cxcl2hi macrophages required sufficient signals from cNK cells and ILC1s. The impaired CXCL-CXCR interactions might subsequently lead to reduced recruitment and activation of group 1 ILCs and macrophages within the tumor microenvironment.”.

      Comment 48: Figure 7, A-C -What is the consequence/interpretation of reduced Mitotracker staining? Any metabolic assays performed? The definition of NK cell "exhaustion" is unclear, is reduced IFNg enough for that? Is the concept of NK cell exhaustion clearly established? Only shortly touched upon in the discussion, the rationale for suggesting an exhausted phenotype, should be explained.

      Response 48: MitoTracker was used to assess mitochondrial mass. The reduced staining indicates compromised mitochondrial function, which is associated with mitochondrial fragmentation.

      We believe that the exhaustion of NK cells is not as well-established a concept as it is for T cells. The purpose of detecting mitochondria in this study is to provide evidence for the relationship between Prdm1 and the exhaustion of NK cells. In the discussion section, we have added the following content (page 26; line 529-543):

      “Mitochondria are pivotal organelles crucial for cellular metabolism. Disruptions in mitochondrial function have been linked to T Cell exhaustion, attributed to glycolytic reprogramming (66). Similarly, mitochondrial fragmentation has been closely associated with NK cell exhaustion (67).

      However, the concept of NK cell exhaustion isn't as firmly established as it is for T cells. Exhausted NK cells should primarily exhibit diminished functions. This is characterized by a diminished ability to destroy tumor cells, a reduced capability to activate other components of the immune system, and compromised proliferation and survival rates. Additionally, this reduced functionality is associated with a decline in the expression of molecules responsible for cytotoxic activity, lower production of IFN-γ, and metabolic disturbances that may arise from mitochondrial dysfunction. While our current data is not sufficient to definitively classify these cells as exhausted NK cells, it supports that a subpopulation, referred to as the Junbhi cluster, demonstrates an exhaustion-like phenotype. The significant increase in this cell population following Prdm1 knockout in NK cells may potentially be one of the reasons why Prdm1ΔNcr1 mice lose their tumor-killing capacity. Whether the excessive expression of JunB in NK cells is also a contributing factor to their exhaustion, similar to T cells (65), requires further investigation.”.

      Comment 49: Figure 7, x-axis labelling (MFI) of histograms is not correct. Do bar graphs and FACS plots show the same data? Does the number in the FACS plots indicate the MFI? If so, the FACS plots do not show representative samples?

      Response 49: We appreciate the reviewer's valuable comments. In the revised Figure 7, the MFI values have been removed. Bar graphs now display summary data from the FACS histograms.

      A representative sample close to the group's mean value was chosen for display in the histograms.

      Comment 50: Figure 7, D - How are these data different from Figure 2H? Why is it now called "exhaustion", but not in 2H? Is the detected IFNg only driven by ex vivo stimulation with Il12/Il18? As above, a "standard" 4h assay should also be provided to allow better interpretation of potential differences. In the introduction, the authors cite the Ducimetiere study (Ref 5) highlighting "the primary function of ILC1 in suppressing the seeding of metastatic tumor cells in liver tissue". Thus, it would be interesting to test Ifng production by liver ILC1 and NK cells ex vivo at early time points of tumor inoculation.

      Response 50: Tumors grow and proliferate within tissues, constituting one of the major causes of lymphocyte exhaustion. This part of the current study aims to investigate whether Prdm1 helps NK cells or ILC1s resist the exhaustion induced by malignant tumors. Specifically, we seek to ascertain whether the absence of Prdm1 renders NK cells or ILC1s more susceptible to exhaustion within the tumor microenvironment. We therefore consider the capacity to secrete IFN-γ upon IL-12/IL-18 stimulation as one indicator of exhaustion. It is important to emphasize that this assessment serves as only one piece of evidence, not the sole determinant. Overnight stimulation is a conventional method for studying NK cells and has been widely used across different laboratories, including our lab (e.g. Bream et al., Blood, 2003; Yu et al., Immunity, 2006; Wang et al., J Clin Invest, 2018). We also wish to clarify that our approach does not involve stimulation with tumor cells to evaluate the capacity of NK cells or ILC1s to secrete IFN-γ.

      Reviewer 2 (Public Review):

      Summary:

      This study offers a significant advancement in understanding liver innate lymphoid cell (ILC) biology by elucidating the role of the transcription factor Prdm1. It shows that Prdm1 is crucial in maintaining the balance between conventional natural killer (cNK) cells and ILC1s in the liver, with knockout models revealing a vital role in cancer defense mechanisms. Despite not affecting direct cytotoxicity, Prdm1 deficiency leads to increased cancer metastasis and reduced secretion of key molecules like IFN-γ, pointing to its importance in immune regulation. The use of single-cell RNA sequencing further underscores Prdm1's role in cellular communication within the liver's immune milieu. This study is a robust contribution to the field, providing insights that could inform new immunotherapy approaches for liver cancer.

      Strengths:

      The study's strength lies in its comprehensive approach, combining the specificity of Prdm1 conditional deletion in Ncr1-cre mice with integrative omics analyses and cutting-edge cytometry to delineate Prdm1's role in liver Type 1 ILC biology and its functional implications in tumor immunity. This multifaceted strategy not only clarifies Prdm1's influence on ILC composition and maturation but also conveys potential therapeutic insights for liver cancer immunotherapy.

      We sincerely appreciate your interest in and critical assessment of our manuscript. We have carefully read your comments and suggestions, and we are truly grateful for your expert guidance. We have addressed each of your concerns, and a point-by-point response is provided below:

      Weakness

      Comment 1: A notable weakness of the study is the limited scope of in vivo disease models, primarily relying on the B16F10 melanoma model, which may not fully capture the complex behavior of Type 1 ILCs across diverse cancer types. Furthermore, the absence of direct human data, such as the effects of PRDM1 deletion in human NK cells or stem cells during their differentiation into NK and ILC1, leaves a gap in translating these findings to clinical settings.

      Response 1: We thank the reviewer for raising these important points, which we see as an opportunity for future work to extend our understanding of Prdm1 and its targets rather than as a weakness of the present study.

      In our revised manuscript, we have discussed these limitations of our study (page 29; line 602-609):

      “While our findings underscore the importance of Prdm1 in tumor immune surveillance by liver cNK cells and ILC1s, they have not been validated in human NK cells, whereas previous studies have found that PRDM1 might inhibit the proliferation and function of human NK cells (33, 73). Furthermore, we did not provide an in-depth evaluation in multiple tumor models. Further research may provide deeper insight into the role of PRDM1 in the anti-tumor function of human NK cells, enabling a more direct investigation of its application in cancer therapies. Given its important role in preserving the functional heterogeneity of liver cNK cells and ILC1s, enhancing Prdm1 function in human NK cells could potentially be a strategy to promote NK cell-based immunotherapy for cancer.”.

      Recommendations For The Authors:

      (Introduction) 

      Comment 2: Reference 1 appears slightly misplaced. You might find the nomenclature discussion in Spits et al., Nature Reviews Immunology, 2013, more appropriate.

      Response 2: We apologize for our inaccurate descriptions. Following Spits et al. (Nature Reviews Immunology, 2013) and other related studies, we have now adopted a more appropriate nomenclature: “conventional NK cells” are referred to as “cNK cells”, “type 1 innate lymphoid cells” as “ILC1s”, and “group 1 ILCs” is used as the collective name for cNK cells and ILC1s.

      The definition of these cells is given in the introduction (page 4, line 52-53; line 58-62):

      “Group 1 ILCs consist of cNK cells and ILC1s (1, 2), with distinct developmental trajectories and effect molecules (3).”, “In a state of homeostasis, liver group 1 ILCs (CD45+CD3-NK1.1+NKp46+) can be discriminated into cNK cells and ILC1s by the differential expression of CD49a and CD49b (2): cNK cells are marked by the expression of CD49b, while liver ILC1s exhibit a distinctive positivity for CD49a. Tumor Necrosis Factor Related Apoptosis Inducing Ligand (TRAIL) is also expressed on liver ILC1s, but not on cNK cells (10, 11).”. 

      We also describe cNK and ILC1 phenotypes in our scRNA-seq data, as shown on page 13; line 259-261:

      “cNK cells expressed high levels of Itga2 (CD49b) and Eomes, while ILC1s had high expression levels of Itga1 (CD49a) and Tnfsf10 (Supplemental Figure 5, F and G).”.

      Comment 3: It has come to my attention that Reference 9 has been retracted. I recommend removing this citation to maintain the integrity of your references (https://doi.org/10.1182/blood.2023022801).

      Response 3: We thank the reviewer for this comment and have now removed this citation.

      Comment 4: For a more comprehensive context around reference 15, consider citing Thierry Walzer's work (https://rupress.org/jem/article/211/3/563/41636/T-bet-and-Eomes-instruct-thedevelopment-of-two), which aligns closely with your discussion.

      Response 4: We agree with the reviewer’s suggestion and have added this citation in our introduction (page 4; line 64-66):

      “Liver environment facilitated T-bet expression in the early stage of NK cells development, which results in Eomes repression. The repression of T-bet is required for Eomes+ NK cells (17).”.

      (Results) 

      Comment 5: The NK cell signature referenced in 32 has been questioned for its reliability as discussed by Cursons et al., CRI 2019 (https://pubmed.ncbi.nlm.nih.gov/31088844/). Reanalysis of data in Figure 1 B/C and Supplementary Figure 1 with the refined NK cell signature from Cursons' work would be advantageous.

      Response 5: We thank the reviewer for this comment. As requested, we reanalyzed our data using the refined NK cell signature from Cursons et al. (revised Figure 1 A-C; revised Supplemental Figure 1). Of note, the overall survival of liver cancer (LIHC) patients reached statistical significance only when high and low expression of the refined PRDM1-NK signature were compared using a median cutoff (Figure 1, A-C). The overall survival analysis based on the top and bottom quartiles of the refined PRDM1-NK signature was moved to Supplemental Figure 1, G-I.

      The original text is: “Examination of 363 liver hepatocellular carcinoma (LIHC) patient samples from The Cancer Genome Atlas (TCGA) revealed a positive correlation between the expression of NK cell-associated genes (NCR1, NCR3, KLRB1, CD160, and PRF1) (32) and PRDM1 expression (Figure 1A). Patients with top and bottom quartiles of NK-PRDM1 signature expression were chosen for survival analysis (Figure 1B). Notably, patients with the NK-PRDM1_hi signature had better overall survival compared to those with the NK-PRDM1_lo signature (Figure 1C). Similar results were also found in skin cutaneous melanoma (SKCM, n=454) and lung adenocarcinoma (LUAD, n=497) patients (Supplemental Figure 1, A-F). These data suggested that PRDM1 in NK cells might be essential for immune surveillance in some solid tumors, including liver cancer. These findings prompted us to investigate the impact and mechanism of PRDM1 in NK cells and ILC1 within the context of liver cancer.”

      We have rewritten this part in our revised manuscript (page 7; line 119-132): 

      “Examination of 363 liver hepatocellular carcinoma (LIHC) patient samples from The Cancer Genome Atlas (TCGA) revealed a positive correlation between the expression of NK cell-associated genes (34) (NCR1, KLRB1, CD160, PRF1, etc.) and PRDM1 expression (Figure 1A). The patients were ordered from highest to lowest based on the expression of NK-PRDM1 for survival analysis (Figure 1B). Notably, patients exhibiting higher levels of NK-PRDM1 expression (above the median) experienced better survival outcomes compared to those with lower levels of NK-PRDM1 expression (below the median) (Figure 1C). Similar results were also found in skin cutaneous melanoma (SKCM, n=454) and lung adenocarcinoma (LUAD, n=497) patients (Supplemental Figure 1, A-F). Patients within the highest quartile of NK-PRDM1 signature expression demonstrated enhanced overall survival, a result that achieved statistical significance in LUAD and SKCM patients (Supplemental Figure 1, G-I). These data suggested that PRDM1 in NK cells might be essential for immune surveillance in solid tumors, including liver cancer, and prompted us to investigate the function and mechanism of PRDM1 in NK cells and ILC1 within the context of liver cancer.”.

      Comment 6: The origin of the Ncr1-cre mice utilised should be clarified; is this the line developed by Eric Vivier? (https://www.pnas.org/doi/10.1073/pnas.1112064108).

      Response 6: We did not use the line developed by Eric Vivier; our Ncr1-cre mice were purchased from Shanghai Model Organisms Center, Inc. We describe this in the Methods section (page 29-30; line 612-614):

      “Prdm1fl/fl mice were purchased from The Jackson Laboratory. Ncr1-iCre and B2m-/- mice were purchased from Shanghai Model Organisms Center, Inc. Six- to twelve-week-old littermates were used for the experiment.”

      Comment 7: Considering the known reduction of Ncr1 expression in Ncr1-cre mice and its implications, it is recommended to repeat the B16F10 experiments with the correct control, Ncr1cre/+ Prdm1+/+.

      Response 7: This is an excellent question, and it has been raised by another reviewer and comprehensively answered (Reviewer 1, Comment 1). The answer is below: 

      The expression of Cre and the insertion of loxP sequences both have the potential to influence gene expression. This is because the region where loxP is inserted may contain regulatory sequences for the gene of interest. Ncr1-Cre is a frequently used transgenic mouse model in our laboratory. In our prior research, we also had concerns about the possible impact of Cre on NKp46 expression, which could lead to a decline in NK cell function. Therefore, in our previous studies focused on Smad4 expression in NK cells, we conducted similar experiments. In Figure 6 of our published paper in the Journal of Clinical Investigation (Wang et al., J Clin Invest, 2018), we compared NKp46-iCreTgfbr2fl/flSmad4fl/WT with NKp46-iCreTgfbr2fl/flSmad4fl/fl. Although the primary purpose was to establish Smad4's independence from TGF-β, the experiment also allows for a comparison between Smad4fl/fl and Smad4fl/WT in the presence of Cre. In the critical phenotype we assessed, NKp46-iCreTgfbr2fl/flSmad4fl/fl (compared with NKp46-iCreTgfbr2fl/flSmad4fl/WT) exhibited the same phenotype as NKp46-iCreSmad4fl/fl (compared with NKp46WTSmad4fl/fl). This suggests that Cre's influence on NK cells may be within a reasonable and controllable range. Furthermore, compared with the decrease in Ncr1 expression caused by Cre, the reduction in the expression of genes targeted by loxP-mediated knockout, such as Prdm1 in this study (Figure 1 E), is more pronounced. Therefore, with the current techniques and research methods, we believe that the data provided in this study can support the role of Prdm1 in NK cells.

      Comment 8: The proportion of ILC1 in wild-type mouse livers is notably higher than standard references. Could you confirm whether liver perfusion was performed before analysis? This procedure was not clearly detailed in the methods section.

      Response 8: We apologize for not providing enough detail on this point in our original Methods. Liver perfusion was performed before analysis. This has now been clarified in the Methods section of the revised text (page 30-31; line 630-636):

      “Mice were perfused with 1× PBS by portal vein puncture before harvesting tissues. Liver and lung were digested with 0.05% collagenase II for 30 minutes and filtered through 70 µm cell strainers, and mononuclear cells were isolated after being subjected to a density gradient using 30% and 70% Percoll. Spleens were also removed and pressed through 70 µm filters to obtain splenocytes. Peripheral blood mononuclear cells were obtained from peripheral blood after lysis of red blood cells (Biolegend, 420301). Flushing femurs and mechanical disruption of inguinal lymph nodes were performed to obtain cells from bone marrow and lymph nodes.”.

      The lymphocyte proportions in mice from different laboratories may exhibit slight variations, possibly due to genetic background disparities. To minimize the influence of genetic backgrounds, paired littermates were used in the current study, wherein one is Prdm1 WT and the other has the Prdm1 gene knocked out in NK cells.

      Comment 9: There appears to be inconsistency in reference formatting; for instance, Ref 39 does not match the formatting of other references. A thorough review of your citation format is suggested.

      Response 9: We apologize for the inadvertent errors and have reviewed the citation format throughout.

      Comment 10: The information in Figures 2B and C may be better suited to the supplementary section as it does not significantly contribute to the main text.

      Response 10: We agree with the reviewer’s suggestion, and these panels have now been moved to the supplementary figures (Supplemental Figure 2).

      Comment 11: The citation of reference 40 could be strengthened by including Sathe et al., 2014, which directly pertains to your findings (https://www.nature.com/articles/ncomms5539).

      Response 11: We added the suggested reference.

      Comment 12: Can the findings presented in Figure 2D/F be replicated using alternative models? This would substantiate the versatility of your results.

      Response 12: The current predominant in vivo tumor model for NK cells is primarily based on the use of B16F10 melanoma cells. These melanoma cells, with their low expression of MHC-I molecules, evade T cell-mediated immune surveillance, rendering them ideal targets for NK cells. Typically, this experimental melanoma metastasis assay involves tail vein injection, followed by detection of nodules in the lungs. To align with our investigation of liver-resident cNK cells and ILC1s, we introduced splenic injection (via the portal vein) and evaluated melanoma metastasis in the liver to reflect the anti-tumor capabilities of liver group 1 ILCs. We also explored subcutaneous tumor models, but we believe they may not effectively capture Prdm1's role in cNK cells, particularly liver-resident NK cells and ILC1s. While we have experimented with models using mouse liver tumor cells such as Hepa 1-6, we found them less stable than B16F10 and less amenable to quantification. Should more suitable models or cell lines emerge, we remain open to exploring them in future research.

      Comment 13: The absence of in vitro killing assessments against B16F10 and YAC-1 leaves a gap in the NK cell characterisation which would be valuable to address.

      Response 13: Isolating NK cells for ex vivo cytotoxicity assays typically requires stimulation with high concentrations of IL-2. Under such high IL-2 stimulation, many intracellular differences that contribute to differences in cytotoxicity, such as changes in transcription factors, are often masked. Another issue is that current ex vivo NK cell cytotoxicity assays often use only NK cells isolated from the spleen. Liver-resident NK cells, on the other hand, are limited in number and by the available isolation methods, making it challenging to conduct ex vivo cytotoxicity assays effectively. If more sensitive detection methods become available, we will incorporate ex vivo data into our future research.

      Comment 14: The suggestion that NK cells produce IL-6 is indeed a bold one, and without additional validation through intracellular cytokine detection or ELISA, it may be prudent to omit these claims.

      Response 14: We have checked the GSEA results and found no meaningful genes associated with IL-6 production. Therefore, we have removed this figure.

      Comment 15: The lack of fluorescence minus one (FMO) controls in Figure 3 and Supplementary Figure 4 is noted; including these would enhance the validity of your gating strategies.

      Response 15: As requested, we have added FMO controls to the aforementioned figures.

      Comment 16: There seems to be a minor mix-up in referring to Figure 4A in the scRNAseq results section, perhaps it was intended to refer to Figure 3A?

      Response 16: We have corrected this part (line 247). We also double-checked and corrected inaccuracies in the references to the figures. We apologize for the inadvertent errors.

      Comment 17: The rich datasets generated from bulk and scRNAseq are commendable. However, I urge you to make these datasets publicly accessible with a GEO accession number.

      Response 17: We appreciate the suggestion from the reviewer. We plan to upload our datasets with the final version of our manuscript, as is also required by eLife policy.

      Comment 18: Figure 4K is insightful, yet a similar analysis of the ILC1 cluster could provide a more rounded understanding.

      Response 18: We thank the reviewer for the comment. We now provide a similar analysis of ILC1s, as shown in revised Figure 5H.

      Comment 19: The metabolic RNA signatures featured in Supplementary Figure 6 are intriguing and warrant further validation, perhaps through Seahorse analysis. Such validation could merit their inclusion in the main figures.

      Response 19: This is a very good suggestion. Currently, our data offer only limited indications in this context. We have chosen to validate some aspects of Prdm1's influence on cytotoxicity molecules. As for Prdm1's impact on other aspects of NK cells, such as metabolic functions, we may explore these further in future research. Additionally, we hope that by publishing our research findings, laboratories worldwide can draw insights for their own studies and conduct relevant research based on these data.

      Comment 20: It is difficult to discern whether the cells depicted in Figure 7D are truly tumor-infiltrating ILC1 or NK cells that have adopted ILC1-like characteristics. Intravenous injection of CD45-PE could clarify this distinction, and if they are the latter, it may be more appropriate to refer to them as ILC1-like cells.

      Response 20: We completely agree with the reviewer's suggestion that "tumor-infiltrating lymphocytes" may not be accurate for the current experiment. Therefore, in the revised manuscript, we have changed it to "liver cNK or ILC1 from tumor-bearing livers".

    1. eLife assessment

      This important study demonstrates a link between an acute high fat diet, microglial metabolism and improved higher cognitive function. The evidence supporting the proposed mechanism in vivo is incomplete at this stage due to non-trivial technical limitations but the authors provide convincing in vitro metabolic characterization of primary microglia cultures to support the model. This work will be of interest to a broad audience in the field of neuroscience, metabolism, and immunology.

    2. Reviewer #1 (Public Review):

      In this study, Drougard et al. examined the consequences of an acute high fat diet (HFD) on microglia in mice. 3-day HFD influenced the regulation of systemic glucose homeostasis in a microglia-dependent and independent manner, as determined using microglial depletion with PLX5622. 3-day HFD increased microglial membrane potential and the levels of palmitate and stearate in cerebrospinal fluid in vivo. Using confocal imaging, respirometry and stable isotope-assisted tracing in primary microglial cultures, the authors suggest an increase in mitochondrial fission and metabolic remodelling occurs when exposed to palmitate, which increases the release of glutamate, succinate and itaconate that may alter neuronal metabolism. This acute microglial metabolic response following acute HFD is subsequently linked to improved higher cognitive function (learning and memory) in a microglia and DRP1-dependent manner.

      Strengths:

      Overall, this study is interesting and novel in linking acute high fat diet to changes in microglia and improved learning and memory in mice. The role for microglia and DRP1 in regulating glucose homeostasis and memory in vivo appears to be supported by the data. Palmitate (which is elevated in the CSF following acute HFD) is clearly used as a fuel by primary microglia ex vivo as determined using U-13C-palmitate tracing and metabolomics.

      Weaknesses:

      The authors suggest that utilisation of palmitate by microglia following HFD is the driver of the acute metabolic changes and that the release of microglial-derived lactate, succinate, glutamate and itaconate are causally linked to improvements in learning and memory. A weakness is that the authors provide no mechanistic link between beta-oxidation of palmitate (or other fatty acids) in microglia in vivo and the observed systemic metabolic and memory phenotypes. However, this reviewer acknowledges the technical difficulties of providing this evidence and approaches, such as microglia-specific deletion of CPT1a, will be an exciting avenue of research to explore for a subsequent study.

    3. Reviewer #2 (Public Review):

      The study by Drougard et al. aimed to answer a critical question on how high-fat diets trigger metabolic issues like obesity and diabetes. Their study revealed that an acute response by microglial cells in the brain to high-fat intake surprisingly benefits metabolism and cognitive function by rapidly metabolizing harmful fatty acids into alternative energy substrates like lactate and itaconate. Thus, short-term HFD intake seems to prompt a distinct beneficial response, suggesting a need for further exploration into the transition from acute to chronic effects.

    4. Reviewer #3 (Public Review):

      Drougard et al. explore microglial detection of a switch to high-fat diet and a subsequent metabolic response that benefits memory. The findings are both surprising and novel in the context of acute high-fat intake, with convincing evidence of increased CSF palmitate after 3 days of HFD. While the authors demonstrate compelling signs of microglial activation in multiple brain regions and unique metabolite release in tracing studies, they should address the following areas.

      Major Points:

      (1) It appears that the authors perform key metabolic assays in vitro/ex vivo using primary microglia from either neonatal or adult mice, which should be more clearly delineated especially for the 13C-palmitate tracing. In the case of experiments using primary microglia derived from mixed glial cultures stimulated with M-CSF, this system relies on neonatal mice. This is understandable given the greater potential yield from neonatal mice, but the metabolic state and energetic demands of neonatal and adult microglia differ as their functional roles change across the lifespan. The authors should either show that the metabolic pathways they implicate in neonatal microglia are also representative of adult microglia or perform additional experiments using microglia pooled from adult mice, especially because they link metabolites derived from neonatal microglia (presumably not under the effects of acute HFD) to improved performance in behavioral assays that utilize adult mice.

      (2) The authors demonstrate that 3 days of HFD increases circulating palmitate by CSF metabolomics and that microglia can readily metabolize palmitate, but the causal link between palmitate metabolism specifically by microglia and improved performance in behavioral paradigms remains unclear. A previous body of research, alluded to by the authors, suggests that astrocyte shuttling of lactate to neurons improves long-term and spatial memory. The authors should account for palmitate that also could be derived from astrocyte secretion into CSF, and the relative contribution compared to microglia-derived palmitate. Specifically, although microglia can metabolize the palmitate in circulation, there is no direct evidence that the palmitate from the HFD is directly shuttled to microglia and not, for example, to astrocytes (which also express CX3CR1). Thus, the Barnes Maze results could be attributed to multiple cell types. Furthermore, the evidence provided in Figure 5J is insufficient to claim a microglia-dependent mechanism without showing data from mice on HFD with and without microglia depletion (analogous to the third and fourth bars in panel K).

      (3) Given the emphasis on improved cognitive function, there is minimal discussion of the actual behavioral outcomes in both the results and discussion sections. The data that HFD-treated animals outperform controls should be presented in more detail both in the figure and in the text. For example, data from all days/trials of the Barnes Maze should be shown, including the day(s) HFD mice outperform controls. Furthermore, the authors should either cite additional literature or provide experimental evidence supporting the notion that microglia release of TCA-associated substrates into the extracellular milieu after HFD specifically benefits neuronal function cellularly or regionally in the brain, which could translate to improved performance in classical behavioral paradigms. The single reference included is a bit obscure, given the study found that increased lactate enhances fear memory which is a neural circuit not studied in the current manuscript. Are there no additional studies on more relevant metabolites (e.g., itaconate, succinate)?

    5. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      In this study, Drougard et al. examined the consequences of an acute high fat diet (HFD) on microglia in mice. 3-day HFD influenced the regulation of systemic glucose homeostasis in a microglia-dependent and independent manner, as determined using microglial depletion with PLX5622. 3-day HFD increased microglial membrane potential and the levels of palmitate and stearate in cerebrospinal fluid in vivo. Using confocal imaging, respirometry and stable isotope-assisted tracing in primary microglial cultures, the authors suggest an increase in mitochondrial fission and metabolic remodeling occurs when exposed to palmitate, which increases the release of glutamate, succinate and itaconate that may alter neuronal metabolism. This acute microglial metabolic response following acute HFD is subsequently linked to improved higher cognitive function (learning and memory) in a microglia and DRP1-dependent manner.

      Strengths:

      Overall, this study is interesting and novel in linking acute high fat diet to changes in microglia and improved learning and memory in mice. The role for microglia and DRP1 in regulating glucose homeostasis and memory in vivo appears to be supported by the data.

      Weaknesses:

      The authors suggest that utilization of palmitate by microglia following HFD is the driver of the acute metabolic changes and that the release of microglial-derived lactate, succinate, glutamate and itaconate are causally linked to improvements in learning and memory. A major weakness is that the authors provide no mechanistic link between beta-oxidation of palmitate (or other fatty acids) in microglia and the observed systemic metabolic and memory phenotypes in vivo. Pharmacological inhibition of CPT1a could be considered or CPT1a-deficient microglia.

      We thank Reviewer #1 for their time, effort and critique. Indeed, we suggest that palmitate drives the aMMR response and associated improvements in learning and memory. In response to acute HFD we observe 1) increased palmitate in CSF; 2) impaired mitochondrial ETC activity in primary microglia (within 12 hours of HFD); and 3) improved learning and memory. The greatest barrier to proving how acute palmitate uptake in microglia improves learning and memory in vivo is the protracted methodology required for microglial isolation and purification. The timeframes and relatively harsh digestion protocols required are currently incompatible with metabolomic tracing and well beyond those required for most cell types used for metabolomic investigation. We tested numerous in vivo protocols without obtaining reproducible data and finally settled on in vitro 13C-palmitate-treated neonatal microglia as the best current option. Primary neonatal microglia are accepted as one of the current best culture models by the microglial community (Valdearcos, Cell Rep 2014; Kim, Cell Metab 2019). Using neonatal microglia we demonstrate that the 13C-palmitate label is processed to palmitoylcarnitine (Fig 4C) and acetylcarnitine (Fig 4D), indicating that microglial fatty acid metabolism acts via the canonical CPT1/CPT2 pathway. These experiments highlight that microglia process palmitate via beta-oxidation, generating acetyl-CoA and engaging the TCA cycle (Fig 4G-I).

      We now acknowledge these technical limitations more clearly and highlight their impact on any conclusions regarding adult microglia in vivo:

      Results “Microglia take up and metabolize free fatty acids”; 

      “Due in part to the long isolation times required to generate pure primary adult microglia, metabolite tracing experiments on primary adult microglia are not currently feasible. We therefore chose primary murine neonatal microglia as our model of choice for more mechanistic experiments (Valdearcos, Cell Rep 2014)”

      And,

      Discussion:

      “We propose that aMMR could result from direct uptake, processing, and release of fatty acid derived carbons, and demonstrate that microglia are capable of metabolizing fatty acids towards diverse intracellular and extracellular pools.”

      While acute ICV injection of a CPT1a blocker would be of potential interest, the caveats associated with CPT1a inhibition in other cell types (neurons, astrocytes, etc.) and with targeting the appropriate brain region (currently unknown) preclude the effective use of this approach to generate clear additional mechanistic insights. Similarly, given the time and resources required to generate, validate, optimize and experiment on a clean model of in vivo adult microglia-specific CPT1a knockout, this approach was deemed beyond the scope of this study. That said, the critique is important, and it should form the basis of a follow-up project.

      Comment: Another major weakness is that the authors also suggest that 3-day HFD microglial response (increase membrane potential) is likely driven by palmitate-induced increases in itaconate feedforward inhibition of complex II/SDH. Whilst this is an interesting hypothesis, the in vitro metabolic characterization is not entirely convincing.

      The reviewer is correct: we suggest that our data are consistent with a model where a palmitate-induced increase in itaconate inhibits complex II/SDH. While our findings do not constitute mechanistic proof, the hypothesis is supported by our Seahorse studies (Fig 2E), which show that combined Palmitate + Succinate stimulation does not increase OCR beyond that of Palmitate alone; by primary microglial cell experiments showing that 3d-HFD-treated adult primary microglia are refractory to succinate-induced mitochondrial membrane depolarization (Fig 2F); and by the identification of increased palmitate-induced itaconate production/release in cultured primary neonatal microglia (Fig 4H). The data are consistent with an inhibition of complex II/SDH and with increased itaconate secretion. They are also consistent with literature on more easily accessible myeloid lineages (Lampropoulou V, Cell Metab 2016).

      Comment: The authors suggest that acute palmitate appears to rapidly compromise or saturate complex II activity. Succinate is a membrane impermeable dicarboxylate. It can enter cells via MCT transporters at acidic pH. It is not clear that I) Succinate is taken up into microglia, II) If the succinate used was pH neutral sodium succinate or succinic acid, and III) If the observed changes are due to succinate oxidation, changes in pH or activation of the succinate receptor SUCNR1 on microglia. In the absence of these succinate treatments, there are no alterations in mitochondrial respiration or membrane potential following palmitate treatment, which does not support this hypothesis.

      We thank Reviewer #1 for highlighting a lack of information in the material and methods. We have updated them accordingly as follows:

      “For the electron transport chain (ETC) experiments, the protocol was based on Salabei et al. The cell suspension was incubated with the mitochondrial probe Tetramethylrhodamine TMRM (10mM; Abcam, Cat# ab228569) and the fluorescent glucose analog 2-NBDG (Abcam, Cat# 235976) for 30 min at 37 °C before FACS acquisition. To challenge the ETC, the cell pellet was resuspended in 500 µl of warm MAS buffer solution + 1nM Plasma Membrane Permeabilizer (Agilent Seahorse XF PMP) in order to permeabilize the cells. Microglial cells were gated as CD45low-CD11b+ cells, followed by singlet selection based on forward and side scatter pattern. The following drugs were added sequentially at 90-second intervals: 0.5 µl of 100 µM Rotenone (Sigma), 2 µl of 2.5 M Succinate adjusted to pH 7.4 with NaOH (succinic acid, Sigma) and 0.5 µl of 1 mM Antimycin (Sigma). Cytometry was performed on a Fortessa (BD Bioscience) and analyzed with FlowJo v10 (Treestar).”

      With the updated protocol, we hope to have made clear that the succinate (a solution of succinic acid at pH 7.4) reaches the ETC directly, since the microglial cells were permeabilized with the Plasma Membrane Permeabilizer (Agilent Seahorse XF PMP).

      Comment: Intracellular itaconate measurements and quantification are lacking and IRG1 expression is not assessed. There also appears to be more labelled itaconate in neuronal cultures from control (BSA) microglia conditioned media, which is not discussed. What is the total level of itaconate in neurons from these conditioned media experiments? No evidence is provided that the in vivo response is dependent on IRG1, the mitochondrial enzyme responsible for itaconate synthesis, or itaconate. To causally link IRG1/itaconate, IRG1-deficient mice could be used in future work. 

      We appreciate the interest, the exciting question, and the suggested future experiment. Indeed, our results suggest a difference in metabolite release between the BSA-treated microglia and palmitate-treated microglia, and their impact on neurons comprises a prime question for future work. We have highlighted this in the discussion and added a comment regarding relative levels of labelled itaconate as follows:

      Results; Acute HFD induces widespread MMR and rapid modulation (…) memory  

      “As a control for the direct uptake of 13C-glucose, we treated parallel neuronal cultures with the same fresh 13C-glucose tracing media originally added to the microglia. Intriguingly, and consistent with literature documenting poor direct glucose utilization by neurons [29], we found substantial m+3 lactate (as well as other metabolites) in neurons treated with microglial conditioned media, and at levels that far exceeded labelling triggered by glucose tracer alone (Fig 5A, middle column vs left column)(Suppl Fig S5B). The data indicate higher uptake of citrate and itaconate from the control microglia-conditioned media, further supporting the hypothesis that neuronal metabolism is reproducibly impacted by palmitate-triggered changes in microglial products. These data demonstrate that palmitate metabolism by microglia modulates neuronal carbon substrate use in vitro, and, they highlight the relative importance of this process compared to uptake of pure glucose. The data identify a candidate mechanism by which aMMR may alter neuronal function in vivo.”

      Comment: While microglial DRP1 is causally implicated the role of palmitate is not convincing. Mitochondrial morphology changes are subtle including TOMM20 and DRP1 staining and co-localization - additional supporting data should be provided. Electron microscopy of mitochondrial structure would provide more detailed insight to morphology changes. Western blot of fission-associated proteins Drp1, phospho-Drp1 (S616), MFF and MiD49/51. Higher magnification and quality confocal imaging of DRP1/TOMM20. Drp1 recruitment to mitochondrial membranes can be assessed using subcellular fractionation.

      We appreciate the reviewer’s comment. Previous work by others, already cited elsewhere in our manuscript (PMCID: PMC7251564), has clearly demonstrated increased mitochondrial fragmentation and phosphorylated DRP1 in 3d HFD animals. This very specific result can therefore be considered confirmatory / validating of existing literature, and important for inclusion of DRP1 in our overall model. We have made sure to better highlight this important literature accordingly:

      Results; A rapid Microglial Mitochondria response to high fat diet

      “Consistent with the in vivo observations above, in vitro palmitate exposure decreased microglial mitochondrial length within 24 hours, indicating that fatty acid exposure itself is sufficient to trigger mitochondrial fission in a cell autonomous manner (Fig 2G upper panels). This result also confirms observations by Kim et al., who observed mitochondrial fission and DRP1 phosphorylation in 3d-HFD treated mice [Kim JD et al, Microglial UCP2 mediates Inflammation and Obesity induced by High Fat feeding, Cell Metab 2019].”

      Comment: No characterization of primary microglia from DRP1-knockout mice is performed with palmitate treatment. Authors demonstrate an increase in both stearate and palmitate in CSF following 3day HFD. Only palmitate was tested in the regulation of microglial responses, but it may be more informative to test stearate and palmitate combined.

      Testing stearate and palmitate combined is an interesting experiment that would mimic the global effect of HFD, which is highly enriched in these two saturated fatty acids, and would therefore be more informative. In vitro stimulation of microglia model cells was previously published by Valdearcos et al. (Cell Reports 2014), who studied the effect of a mix of stearate and palmitate on mediobasal hypothalamus inflammation. Here, we build on their important findings by demonstrating that these 2 compounds are actually found in the CSF of 3d-HFD mice. Studies from other labs have also shown the presence of stearate and palmitate in the CSF of chronically obese and diabetic patients, which highlights the importance of these findings (Melo HM et al. cell report 2020). While a systematic dissection of the roles of HFD-regulated CSF metabolites (including direct (diet containing) and indirect (secondary)) is beyond the scope of this study, this point is important, not least because it highlights less well-studied metabolites and the potential for combinatorial interactions. We have highlighted this idea in the results as follows:

      Results; A rapid Microglial Mitochondria response to high fat diet

      “To test whether these observed fatty acid changes in the CSF might directly trigger aMMR, we switched to an in vitro primary neonatal microglia model and examined the effects of the more abundant of these, palmitate (Fig S2A-B).”

      and, in the discussion as follows:

      “Studies have identified stearate and palmitate in the CSF of patients with chronic obesity and with diabetes, reports that highlight the importance of these findings (Melo HM et al. cell report 2020). While a systematic dissection of the roles of HFD-regulated CSF metabolites (including direct (diet containing) and indirect (secondary)) is beyond the scope of this study, they represent priority areas for future investigation, particularly given the wide-range of fatty-acid specific biological effects in the literature, and the potential for combinatorial interactions.” 

      Reviewer #1 (Recommendations For The Authors):

      Congratulations on this interesting and novel work. Please see public review for details on potential experiments. While I would not expect all the experiments to be performed for this current study, it’s important to not overstate what the data is showing. For example, there is no causal link between palmitate oxidation in microglia or released metabolites (itaconate etc) from microglia in the effect on systemic glucose metabolism or memory. To make such claims more supporting data would be required.

      We thank Reviewer #1 for their highly constructive critique.

      Reviewer #2 (Public Review):

      The study "A rapid microglial metabolic response controls metabolism and improves memory" by Drougard et al. provides evidence that short-term HFD has a beneficial effect on spatial and learning memory through microglial metabolic reprogramming. The manuscript is well-written and the statistics were properly performed with all the data. However, there are concerns regarding the interpretation of the data, particularly the gap between the in vivo observations and the in vitro mechanistic studies.

      In the PLX-5622 microglial depletion study, it is unclear what happened to the body weight, food intake, and day-night behavior of these mice compared to the vehicle control mice. It is important to address the innate immunity-dependent physiology affected by a long period of microglial depletion in the brain (also macrophages in the periphery). Furthermore, it would be beneficial to validate the images presented in Fig.1F by providing iba1 staining in chow diet-fed mice with or without PLX-5622 for 7-10 days. Additionally, high-quality images, with equal DAPI staining and comparable anatomical level, should be provided in both chow diet-fed mice and HFD-fed mice with or without PLX-5622 in the same region of hypothalamus or hippocampus. These are critical evidences for this project, and it is suggested that the authors provide more data on the general physiology of these mice, at least regarding body weight and food intake.

      We are grateful to Reviewer #2 for their constructive comments, for their time and effort, and for highlighting the lack of experimental details regarding the PLX-5622 microglial depletion study. We followed the protocol established in Feng et al JCI 2017. No adverse effects on body weight, food intake or day-night behavior were described in that study, nor in other studies using longer treatment (Sonia George et al Molecular Neurodegeneration 2019). We did not observe any differences in body weight or food intake within or between groups upon PLX administration. These data have been included as new Supplementary Fig 6 A-B.

      The Material and Methods section was updated as follows:

      “Animals were administered PLX5622-containing diet for 7-9 days without observable impact on the body weight or food intake (Fig S6A-B), using protocols adopted from [Feng et al JCI 2017, Sonia George et al Molecular Neurodegeneration 2019].”

      Comment: It is also unclear whether the microglia shown in Fig.3A were isolated from mice 4 weeks after Tamoxifen injection. It is suggested that the authors provide more evidence, such as additional images or primary microglia culture, to demonstrate that the mitochondria had more fusion upon drp1 KO. It is recommended to use mito-tracker green/red to stain live microglia and provide good resolution images.

      We thank Reviewer #2 for pointing out the lack of detailed information about Fig 3A. Microglial cells were indeed isolated from mice after the tamoxifen injection in order to confirm the deletion. We updated the Material and Methods with the text below:

      “For the colocalization experiment, microglia were isolated from 10 to 12-week old drp1ko mice and their littermate controls, immediately fixed in PFA and stained with DRP1 (diluted 1:50, Cell Signaling; Cat#8570) and TOMM20 antibodies (diluted 1:1000, Santa Cruz; Cat#sc177615).”

      This experiment was performed as an additional control of the drp1 deletion in our knockout mice. For this experiment we used TOMM20 staining because the microglial cells were no longer alive after PFA fixation.

      Comment: Regarding the data presented in Fig.5A, it is suggested that the authors profile the metabolomics of the microglial conditioned media (and provide the methods on how this conditioned media was collected) to determine whether there was already abundant lactate in the media. Any glucose-derived metabolites, e.g. lactate, are probably more preferred by neurons as energy substrates than glucose, especially in embryonic neurons (which are ready to use lactate in newborn brain).

      With regard to Fig 5A, metabolomics of microglia-conditioned media are provided in Fig 5A, Supp Figure 5B, and Supplementary Table 2.

      We thank Reviewer #2 for noting the lapse of technical detail. We updated the Material and Methods with the following:

      “For conditioned media experiments, microglial cells were incubated with DMEM (Gibco) without lactate, supplemented with BSA-conjugated palmitate or Control BSA. Conditioned media was collected after the incubation, centrifuged 15 min at 300g (4 °C) and the supernatant transferred to a fresh tube, avoiding the cells and debris pellet. Samples were immediately snap-frozen or used for the neuron incubation.”

      Glucose-derived metabolites such as lactate are preferred over glucose by neurons as energy substrates, as first described in the literature by Prof. Pellerin and Prof. Magistretti via the astrocyte-neuron cooperation (PNAS 1994). Since their discovery, lactate has been explored and is well known as a key signaling molecule (Magistretti PJ Nat Rev Neurosciences 2018). We explored the role of lactate released from the microglia, and we demonstrated that it is taken up by neurons independently of any microglial pretreatment. This experiment highlights microglia as another lactate provider for the neurons (Fig 4N and Fig 5A).

      Comment: Finally, it is important to address whether PLX-5622 affects learning and spatial memory in chow diet-fed animals. Following the findings shown in Fig 5J and 5K, the authors should confirm these by any morphological studies on synapse, e.g. by synaptophysin staining or ultrastructure EM study in the area shown in Fig 5I.

      We appreciate the comment and question. We performed the controls and included them now as Fig 5J and Fig S5 E-F-G. We do not observe any adverse effects of PLX5622 on learning and spatial memory in normal chow-fed animals. 

      While we were unable to study the synapses as requested, it is important to note that no changes are expected given publications from other labs using the same protocol (Feng X, JCI 2017; Spangenberg E, Nat Com 2019) or longer PLX5622 treatment (Niiyama T, eNeuro 2023; Witcher KG, J neurosciences 2021), none of which found morphological differences at synapses.

      Reviewer #2 (Recommendations For The Authors):

      The authors should provide more evidence that palmitate is derived from HFD to prove that it mediates the HFD effects on the microglial mitochondria response. This could be done by adding 13C-palmitate into the HFD and performing metabolomics in isolated microglia from control mice (and Drp1-MG-KO mice, if possible).

      We thank Reviewer #2 for the enthusiastic review. Unfortunately, we were unable to attempt this final suggested experiment. We have adjusted our wording accordingly and appreciate the reviewer’s understanding.

      Reviewer #3 (Public Review):

      Drougard et al. explore microglial detection of a switch to high-fat diet and a subsequent metabolic response that benefits memory. The findings are both surprising and novel in the context of acute high-fat intake, with convincing evidence of increased CSF palmitate after 3 days of HFD. While the authors demonstrate compelling signs of microglial activation in multiple brain regions and unique metabolite release in tracing studies, they should address the following areas prior to acceptance of this manuscript.

      Major Points:

      (1) It appears that the authors perform key metabolic assays in vitro/ex vivo using primary microglia from either neonatal or adult mice, which should be more clearly delineated especially for the 13C-palmitate tracing. In the case of experiments using primary microglia derived from mixed glial cultures stimulated with M-CSF, this system relies on neonatal mice. This is understandable given the greater potential yield from neonatal mice, but the metabolic state and energetic demands of neonatal and adult microglia differ as their functional roles change across the lifespan. The authors should either show that the metabolic pathways they implicate in neonatal microglia are also representative of adult microglia or perform additional experiments using microglia pooled from adult mice, especially because they link metabolites derived from neonatal microglia (presumably not under the effects of acute HFD) to improved performance in behavioral assays that utilize adult mice.

      We thank Reviewer #3 for their constructive critique and encouraging words. As indicated, the 13C-palmitate experiments were performed with primary microglia derived from mixed glial cultures stimulated with M-CSF and we demonstrated our primary cultures were almost pure by the supplementary experiments (supp Fig2A and B). Additional minor details in these contexts have been added to the Material and Methods.

      The experiments focusing on the mitochondrial ETC were performed on sorted microglia from adult mice, and parallels were demonstrated with the neonatal cultures (the primary model for metabolic tracing). Compromised complex II activity under conditions of acute HFD/palmitate stimulation, for instance, was shown in both systems. Unfortunately, despite best efforts, attempts to run 13C-palmitate tracing experiments on primary adult microglia failed, attributable in large part to the long (~4 hour) and harsh microglial extraction and sorting process. These experiments will require substantial follow-up efforts, ideally including the establishment and validation of an adult microglia-neuron co-culture model that faithfully recapitulates most aspects of in vivo metabolic cross-talk. This noble aim is beyond the scope of this study. We have made sure to temper the conclusions made in the manuscript and to not overstate the impact and interpretation of the in vitro work, including updating the following sentences.

      Results “Microglia take up and metabolize free fatty acids”; 

      “Due in part to the long isolation times required to generate pure primary adult microglia, metabolite tracing experiments on primary adult microglia are not currently feasible. We therefore chose primary murine neonatal microglia as our model of choice for more mechanistic experiments (Valdearcos, Cell Reports 2014)”

      and Discussion:

      “We propose that aMMR could result from direct uptake, processing, and release of fatty acid derived carbons, and demonstrate that microglia are capable of metabolizing fatty acids towards diverse intracellular and extracellular pools.”

      Comment: The authors demonstrate that 3 days of HFD increases circulating palmitate by CSF metabolomics and that microglia can readily metabolize palmitate, but the causal link between palmitate metabolism specifically by microglia and improved performance in behavioral paradigms remains unclear. A previous body of research, alluded to by the authors, suggests that astrocyte shuttling of lactate to neurons improves long-term and spatial memory. The authors should account for palmitate that also could be derived from astrocyte secretion into CSF, and the relative contribution compared to microglia-derived palmitate. Specifically, although microglia can metabolize the palmitate in circulation, there is no direct evidence that the palmitate from the HFD is directly shuttled to microglia and not, for example, to astrocytes (which also express CX3CR1). 

      We appreciate the comment. Indeed, this issue highlights one of the greatest challenges for efforts aimed at demonstrating (beyond doubt) that a single minor cell population contributes towards metabolic cross-talk in vivo. Our experiments show: increased CSF palmitate levels within one feeding cycle of HFD; rapidly induced microglial metabolic activation (characterized by increased mitochondrial membrane potential and impaired complex II activity); and that microglia mount a comparable mitochondrial activation profile in vitro when exposed to palmitate. They show in vitro, using neonatal microglia, that microglia take up and metabolize palmitate; that they release metabolites with neuro-modulatory potential; and that neurons take these metabolites up and modulate their function differentially when exposed to control vs palmitate-treated microglia-conditioned media (in the absence of astrocytes). The experiments show through acute PLX-induced elimination of microglia, however crude, that this compartment impacts the acute HFD response, and, using conditional deletion, that full DRP1 expression is required in CX3CR1-CreERT2 targeted cells (primarily microglia deleting; Zhao et al 2019). While these experiments cannot rule out a contribution of astrocytes to the observations in vivo, comparable experiments rarely can, and we cannot rationalize why microglia should not have equal access to CSF palmitate for uptake or to neurons for substrate provisioning. We now better highlight this important issue, and temper our conclusions accordingly:

      “Tanycytes and astrocytes have both been documented to release select metabolites into the extracellular environment [33, 34]. While suggestive, the experiments highlighted here do not rule out a contribution of these or other cell types in coupling acute HFD intake to memory and learning.”

      Comment: Thus, the Barnes Maze results could be attributed to multiple cell types. Furthermore, the evidence provided in Figure 5J is insufficient to claim a microglia-dependent mechanism without showing data from mice on HFD with and without microglia depletion (analogous to the third and fourth bars in panel K).

      Agreed. We appreciate the comment. We have now added the requested HFD condition to Figure 5J. The new data support our previous interpretation.

      Comment: Given the emphasis on improved cognitive function, there is minimal discussion of the actual behavioral outcomes in both the results and discussion sections. The data that HFD-treated animals outperform controls should be presented in more detail both in the figure and in the text. For example, data from all days/trials of the Barnes Maze should be shown, including the day(s) HFD mice outperform controls. Furthermore, the authors should either cite additional literature or provide experimental evidence supporting the notion that microglia release of TCA-associated substrates into the extracellular milieu after HFD specifically benefits neuronal function cellularly or regionally in the brain, which could translate to improved performance in classical behavioral paradigms. The single reference included is a bit obscure, given the study found that increased lactate enhances fear memory which is a neural circuit not studied in the current manuscript. Are there no additional studies on more relevant metabolites (e.g., itaconate, succinate)?

      We agree. We have now re-plotted the behavioral test to better highlight that the HFD-treated animals outperform controls, as requested (Fig S7 and S8). We also added the requested literature. While we cannot be sure our search captured all relevant studies, we find a relative paucity of studies that characterize CSF metabolite changes in the context of acute high fat feeding or that demonstrate the ability of CSF substrates to convincingly improve memory and learning in vivo at physiological levels. Indeed, while simple, we feel the findings are of substantial novelty and highlight an area for significant future research. We have tempered our conclusions throughout and added to the discussion as follows:

      “Such substrate release could mediate the learning and memory effects that accompany aMMR; they are consistent with the data of other studies that have examined metabolite associations with learning and memory (itaconate [Morgunov IG, microorganisms 2020; Xiong J, Neuromolecular med 2023], succinate [Serra FT neurosciences letter 2022; Cline BH, BMC neurosciences 2012]).”

      Minor Points:

      (1) In Figure 5J the latency to find the hole was noticeably higher (mean around 150s) than the latency in panel K (mean around 100s for controls, and 60s for Drp1MGWT on HFD). This suggests high variability between experiments using this modified version of the Barnes Maze, despite the authors' assertion that a standard Barnes Maze was employed and the results were reproducible at multiple institutions. Why do Drp1MGWT mice on control diet find the escape hole significantly faster than WT mice on control diet in panel J? Given the emphasis on cognitive improvement following acute HFD as a novel finding, the authors should explain this discrepancy.

      We appreciate this question and comment. Indeed, as the reviewer knows, behavioral tests including the Barnes test show variation with genetic background, and with environment and context (e.g. age, caging density, litter size, behavioral state and more) (Inglis A, Physiol Behavior 2019; Loos M Mamm Genome 2015; and unpublished observations). We do not know the exact origin of the difference mentioned above, but our best guess is that it stems from environmental differences that are ever present in vivaria (seasonal, mouse house room, cage-changing cycles, etc.), from differences in background genetics (e.g. presence of Cre transgene and linked genome, genetic drift), or from precise experimental differences between the cohorts (e.g. repeated tamoxifen-injection paradigm for the deletion group). All of our experiments were performed in parallel, with all relevant animal groups equally represented in every run, and used age- and sex-matched individuals from congenic strains. Wherever possible, controls and test animals were littermates to minimize within-strain variance attributable to litter effects (litter size, maternal and paternal effects). Given our lab’s interest and focus on the mechanistic and developmental origins of variance heterogeneity, these differences are of high interest for future study.

      Comment: The authors highlight in the graphical abstract and again in Figure 4A the formation of lipid droplets following palmitate exposure as evidence that microglia can process fatty acids. They later suggest that a lack of substantial induction of lipid droplet accumulation suggests that microglia are metabolically wired to release carbon substrates to neighboring cells. Clarification as to the role of lipid droplet formation/accumulation in explaining the results would eliminate any possible confusion.

      We modified the wording in the manuscript accordingly:

      Results “Microglia take up and metabolize free fatty acids”;

      “Based on BODIPY fluorescence, we found that primary microglia increase lipid droplet numbers within 24h of in vitro exposure to palmitate (200uM; Fig 4A), demonstrating a capacity to take up fatty acids.”

      Comment: In many bar graphs showing relatively modest effects, it would be helpful to use symbols to also show the distribution of sample and animal replicates (especially behavioral paradigms).

      Agreed. Indeed, the results are both modest and impressive given the nature of the intervention (simple change in dietary macronutrient composition). We have now re-plotted the results from the behavioral experiments, accordingly (Fig S7 and Fig S8).

      Reviewer #3 (Recommendations For The Authors):

      This is a good manuscript deserving of publication assuming some of the concerns posed above are addressed.

      We thank Reviewer #3 again for their time, effort, and dedication, and for their objective review of the manuscript.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      All of the reviewers indicate that their major concerns have been adequately addressed, but they each have a few comments that the authors should consider before submitting a final version (without further review) for publication. For example, a statement about the sex of the mice used in the studies and whether any differences were noted if both sexes were used. The idea that the loss of glutamate transport might affect NA loading into vesicles is also worth considering. Finally, the authors might want to mention that the role of neuropeptide release from NA neurons needs further examination. 

      As noted in the prior submitted revision, all experiments contained both males and females and this was addressed in our re-submission. In our analysis of breathing and metabolism, sex was included in the analysis and no significant phenotypic difference was observed (The statement of no sex difference is in line 451-456). For the fate map and in situ experiments, although the group size is small, we did not see obvious differences in the expression patterns in the three glutamate transporters between females and males (line 347-350). All the anatomical and phenotypic data in this manuscript are presented as combined graphs (figure 1, figure 1 supplement 1, figure 2, figure 2 supplement 2, figure 4,5,6,7) and we had differentially labeled our data points by sex (female data is pink and male data is blue).

      The possibility that loss of Vglut2 might affect NA release has been added in the discussion (line 485-491) of the current revision. Dopamine Beta Hydroxylase (DBH) converts dopamine to noradrenaline in the vesicles, thus, glutamate may not directly affect noradrenaline loading into vesicles. However, since loss of Vglut2 reduced dopamine release in subsets of dopaminergic neurons, it remains possible that glutamate affects dopamine loading in NA neurons and in turn perturbs DA to NA conversion in the vesicle by DBH and subsequent noradrenaline release. Future work could examine this hypothesis using fast-scan cyclic voltammetry (FSCV) or microdialysis.

      The further examination of the role of neuropeptide release from NA neurons is mentioned in the discussion (line 491-494 and line 497-499 of the pre).

      eLife assessment

      Chang et al. provide glutamate co-expression profiles in the central noradrenergic system and test the requirement of Vglut2-based glutamatergic release in respiratory and metabolic activity under physiologically relevant gas challenges. Their experiments provide compelling evidence that conditional deletion of vesicular glutamate transporters from noradrenergic neurons does not impact steady-state breathing or metabolic activity in room air, hypercapnia, or hypoxia. This study provides an important contribution to our understanding of how noradrenergic neurons regulate respiratory homeostasis in conscious adult mice. 

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Chang et al. provide glutamate co-expression profiles in the central noradrenergic system and test the requirement of Vglut2-based glutamatergic release in respiratory and metabolic activity under physiologically relevant gas challenges. Their experiments show that conditional deletion of Vglut2 in NA neurons does not impact steady-state breathing or metabolic activity in room air, hypercapnia, or hypoxia. Their observations challenge the importance of glutamatergic signaling from Vglut2 expressing NA neurons in normal respiratory homeostasis in conscious adult mice. 

      Strengths:

      The comprehensive Vglut1, Vglut2, and Vglut3 co-expression profiles in the central noradrenergic system and the combined measurements of breathing and oxygen consumption are two major strengths of this study. Observations from these experiments provide previously undescribed insights into (1) expression patterns for subtypes of the vesicular glutamate transporter protein in the noradrenergic system and (2) the dispensable nature of Vglut2-dependent glutamate signaling from noradrenergic neurons to breathing responses to physiologically relevant gas challenges in adult conscious mice.

      Weaknesses:

      Although the cellular expression profiles for the vesicular glutamate transporters are provided, the study does not document that glutamatergic-based signaling originating from noradrenergic neurons is evident at the cellular level under normal, hypoxic, and/or hypercapnic conditions. The authors effectively recognize this issue and appropriately discuss their findings in this context. 

      We thank the reviewer for the positive evaluation of our work.

      Reviewer #2 (Public Review):

      The authors characterized the recombinase-based cumulative fate maps for vesicular glutamate transporters (Vglut1, Vglut2 and Vglut3) expression and compared those maps to their real-time expression profiles in central NA neurons by RNA in situ hybridization in adult mice. Authors have revealed a new and intriguing expression pattern for Vglut2, along with an entirely uncharted co-expression domain for Vglut3 within central noradrenergic neurons. Interestingly, and in contrast to previous studies, the authors demonstrated that glutamatergic signaling in central noradrenergic neurons does not exert any influence on breathing and metabolic control either under normoxic/normocapnic conditions or after chemoreflex stimulation. Also, they showed for the first time the Vglut3-expressing NA population in C2/A2 nuclei. In addition, they were also able to demonstrate Vglut2 expression in anterior NA populations, such as LC neurons, by using more refined techniques, unlike previous studies.

      A major strength of the study is the use of a set of techniques to investigate the participation of NA-based glutamatergic signaling in breathing and metabolic control. The authors provided a full characterization of the recombinase-based cumulative fate maps for Vglut transporters. They performed real-time mRNA expression of Vglut transporters in central NA neurons of adult mice. Further, they evaluated the effect of knocking down Vglut2 expression in NA neurons using a DBH-Cre; Vglut2cKO mice on breathing and control in unanesthetized mice. Finally, they injected the AAV virus containing Cre-dependent Td tomato into LC of v-Glut2 Cre mice to verify the VGlut2 expression in LC-NA neurons. A very positive aspect of the article is that the authors combined ventilation with metabolic measurements. This integration holds particular significance, especially when delving into the exploration of respiratory chemosensitivity. Furthermore, the sample size of the experiments is excellent.

      Despite the clear strengths of the paper, some weaknesses exist. It is not clear in the manuscript if the experiments were performed in males and females and if the data were combined. I believe that the study would have benefited from a more comprehensive analysis exploring the sex-specific differences. The reason I think this is particularly relevant is the developmental disorders mentioned by the authors, such as SIDS and Rett syndrome, which could potentially arise from disruptions in central noradrenergic (NA) function, exhibit varying degrees of sex predominance. Moreover, some of the noradrenergic cell groups are sexually dimorphic. For instance, female Wistar rats exhibit a larger LC size and more LC-NA neurons than male subjects (Pinos et al., 2001; Garcia-Falgueras et al., 2005). More recently, a detailed transcriptional profiling investigation has unveiled the identities of over 3,000 genes in the LC. This revelation has highlighted significant sexual dimorphisms, with more than 100 genes exhibiting differential expression within LC-NA neurons at the transcript level. Furthermore, this investigation has convincingly showcased that these distinct gene expression patterns have the capacity to elicit disparate behavioral responses between sexes (Mulvey et al., 2018).

      Therefore, the authors should compare the fate maps, Vglut transporters in males and females, at least considering LC-NA neurons. Even in the absence of identified sex differences, this information retains significant importance. 

      An important point well raised by the authors is that although suggestive, these experiments do not definitively rule out that NA-Vglut2 based glutamatergic signaling has a role in breathing control. Subsequent experiments will be necessary to validate this hypothesis. 

      An improvement could be made in terms of measuring body temperature. Opting for implanted sensors over rectal probes would circumvent the need to open the chamber, thereby preventing alterations in gas composition during respiratory measurements. Further, what happens to body temperature phenotype in these animals under different gas exposures? These data should be included in the Tables. 

      Is it plausible that another neurotransmitter within NA neurons might be released in higher amounts in DBH-Cre; Vglut2 cKO mice to compensate for the deficiency in glutamate and prevent changes in ventilation? 

      Continuing along the same line of inquiry, is there a possibility that Vglut2 cKO from NA neurons not only eliminates glutamate release but also reduces NA release? A similar mechanism was found for VGLUT2 cKO from DA neurons in previous studies (Alsio et al., 2011; Fortin et al., 2012; Hnasko et al., 2010). Additionally, does glutamate play a role in the vesicular loading of NA? Therefore, could the lack of effect on breathing be explained by the lack of noradrenaline and not glutamate?

      We thank the reviewer for the positive evaluation and further suggestions. Please see our response in “Author Response” to the previous version of Reviewer #2 (Public review).

      Reviewer #4 (Public Review): 

      Summary:

      Although previous research suggested that noradrenergic glutamatergic signaling could influence respiratory control, the work performed by Chang and colleagues reveals that Vglut2 is dynamically and widely expressed throughout the central noradrenergic system but is not crucial for baseline breathing or for the hypercapnic and hypoxic ventilatory responses. The central point that will make a significant change in the field is how NA-glutamate transmission may influence breathing control and the dysfunction of NA neurons in respiratory disorders.

      Strengths:

      There are several strengths such as the comprehensive analysis of Vglut1, Vglut2, and Vglut3 expression in the central noradrenergic system and the combined measurements of breathing parameters in conscious unrestrained mice. 

      Other considerations:

      These results strongly suggest that glutamate may not be necessary for modulating breathing under normal conditions or even when faced with high levels of carbon dioxide (hypercapnia) or low oxygen levels (hypoxia). This finding is unexpected, considering many studies have underscored glutamate's vital role in respiratory regulation, more so than catecholamines. This leads us to question the significance of catecholamines in controlling respiration. Moreover, if glutamate is not essential for this function, we need to explore its role in other physiological processes such as sympathetic nerve activity (SNA), thermoregulation, and sensory physiology. 

      We thank the reviewer for the positive evaluation and further suggestions. The potential role of noradrenergic-derived glutamate in other processes, which is beyond the scope of this study, should be addressed in the future.

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors):

      All of my concerns were effectively resolved, leading me to accept the paper. However, I suggest that the authors consider investing in a more reliable system for measuring body temperature, as accurate measurements of this parameter are crucial for whole body plethysmography. 

      Thank you for the suggestion. The real-time measurement of body temperature is a goal in future studies.

      Reviewer #4 (Recommendations For The Authors):

      Because I am reviewing a revised version, I believe the authors have addressed most, if not all, of the concerns raised by the three previous reviewers. In my understanding, the authors achieved their aims and the conclusions are fully supported by the results. The impact of this work on the respiratory field is significant and is likely to advance the field. The methods and data utilized, which combine standard techniques with genetic tools, will be highly beneficial to the research community.

      I still have one concern: if glutamate is not critical, then what is? Could we potentially disable the noradrenergic (NA) system while preserving glutamate functionality to determine if the NA system is indeed crucial for respiratory physiology? This approach might provide clearer insights into the mechanisms underlying respiratory control.

      We agree that there remain several exciting questions about the respective roles of noradrenaline, glutamate, and other neuropeptides such as Neuropeptide Y (NPY) and galanin. We are currently devising strategies to address the respective and combinatorial roles for all these candidates in breathing control. Most simply, we can conditionally mutagenize each of them in the central noradrenergic system in an acute manner using DBH-CreER mice to determine if any of them are critical to respiratory control, with the advantage of minimizing developmental compensatory events.

    2. eLife assessment

      Chang et al. provide glutamate co-expression profiles in the central noradrenergic system and test the requirement of Vglut2-based glutamatergic release in respiratory and metabolic activity under physiologically relevant gas challenges. Their experiments provide compelling evidence that conditional deletion of vesicular glutamate transporters from noradrenergic neurons does not impact steady-state breathing or metabolic activity in room air, hypercapnia, or hypoxia. This study provides an important contribution to our understanding of how noradrenergic neurons regulate respiratory homeostasis in conscious adult mice.

    3. Reviewer #1 (Public Review):

      Summary:

      Chang et al. provide glutamate co-expression profiles in the central noradrenergic system and test the requirement of Vglut2-based glutamatergic release in respiratory and metabolic activity under physiologically relevant gas challenges. Their experiments show that conditional deletion of Vglut2 in NA neurons does not impact steady-state breathing or metabolic activity in room air, hypercapnia, or hypoxia. Their observations challenge the importance of glutamatergic signaling from Vglut2 expressing NA neurons in normal respiratory homeostasis in conscious adult mice.

      Strengths:

      The comprehensive Vglut1, Vglut2, and Vglut3 co-expression profiles in the central noradrenergic system and the combined measurements of breathing and oxygen consumption are two major strengths of this study. Observations from these experiments provide previously undescribed insights into (1) expression patterns for subtypes of the vesicular glutamate transporter protein in the noradrenergic system and (2) the dispensable nature of Vglut2-dependent glutamate signaling from noradrenergic neurons to breathing responses to physiologically relevant gas challenges in adult conscious mice.

      Weaknesses:

      Although the cellular expression profiles for the vesicular glutamate transporters are provided, the study does not document that glutamatergic-based signaling originating from noradrenergic neurons is evident at the cellular level under normal, hypoxic, and/or hypercapnic conditions. The authors effectively recognize this issue and appropriately discuss their findings in this context.

    4. Reviewer #2 (Public Review):

      The authors characterized the recombinase-based cumulative fate maps for vesicular glutamate transporters (Vglut1, Vglut2 and Vglut3) expression and compared those maps to their real-time expression profiles in central NA neurons by RNA in situ hybridization in adult mice. Authors have revealed a new and intriguing expression pattern for Vglut2, along with an entirely uncharted co-expression domain for Vglut3 within central noradrenergic neurons. Interestingly, and in contrast to previous studies, the authors demonstrated that glutamatergic signaling in central noradrenergic neurons does not exert any influence on breathing and metabolic control either under normoxic/normocapnic conditions or after chemoreflex stimulation. Also, they showed for the first time the Vglut3-expressing NA population in C2/A2 nuclei. In addition, they were also able to demonstrate Vglut2 expression in anterior NA populations, such as LC neurons, by using more refined techniques, unlike previous studies.

      A major strength of the study is the use of a set of techniques to investigate the participation of NA-based glutamatergic signaling in breathing and metabolic control. The authors provided a full characterization of the recombinase-based cumulative fate maps for Vglut transporters. They performed real-time mRNA expression of Vglut transporters in central NA neurons of adult mice. Further, they evaluated the effect of knocking down Vglut2 expression in NA neurons using a DBH-Cre; Vglut2cKO mice on breathing and control in unanesthetized mice. Finally, they injected the AAV virus containing Cre-dependent Td tomato into LC of v-Glut2 Cre mice to verify the VGlut2 expression in LC-NA neurons. A very positive aspect of the article is that the authors combined ventilation with metabolic measurements. This integration holds particular significance, especially when delving into the exploration of respiratory chemosensitivity. Furthermore, the sample size of the experiments is excellent.

      Despite the clear strengths of the paper, some weaknesses exist. It is not clear in the manuscript if the experiments were performed in males and females and if the data were combined. I believe that the study would have benefited from a more comprehensive analysis exploring the sex specific differences. The reason I think this is particularly relevant is the developmental disorders mentioned by the authors, such as SIDS and Rett syndrome, which could potentially arise from disruptions in central noradrenergic (NA) function, exhibit varying degrees of sex predominance. Moreover, some of the noradrenergic cell groups are sexually dimorphic. For instance, female Wistar rats exhibit a larger LC size and more LC-NA neurons than male subjects (Pinos et al., 2001; Garcia-Falgueras et al., 2005). More recently, a detailed transcriptional profiling investigation has unveiled the identities of over 3,000 genes in the LC. This revelation has highlighted significant sexual dimorphisms, with more than 100 genes exhibiting differential expression within LC-NA neurons at the transcript level. Furthermore, this investigation has convincingly showcased that these distinct gene expression patterns have the capacity to elicit disparate behavioral responses between sexes (Mulvey et al., 2018). Therefore, the authors should compare the fate maps, Vglut transporters in males and females, at least considering LC-NA neurons. Even in the absence of identified sex differences, this information retains significant importance.

      An important point well raised by the authors is that although suggestive, these experiments do not definitively rule out that NA-Vglut2 based glutamatergic signaling has a role in breathing control. Subsequent experiments will be necessary to validate this hypothesis.

      An improvement could be made in terms of measuring body temperature. Opting for implanted sensors over rectal probes would circumvent the need to open the chamber, thereby preventing alterations in gas composition during respiratory measurements. Further, what happens to body temperature phenotype in these animals under different gas exposures? These data should be included in the Tables.

      Is it plausible that another neurotransmitter within NA neurons might be released in higher amounts in DBH-Cre; Vglut2 cKO mice to compensate for the deficiency in glutamate and prevent changes in ventilation?

      Continuing along the same line of inquiry, is there a possibility that Vglut2 cKO from NA neurons not only eliminates glutamate release but also reduces NA release? A similar mechanism was found for VGLUT2 cKO from DA neurons in previous studies (Alsio et al., 2011; Fortin et al., 2012; Hnasko et al., 2010). Additionally, does glutamate play a role in the vesicular loading of NA? Therefore, could the lack of effect on breathing be explained by the lack of noradrenaline and not glutamate?

    5. Reviewer #4 (Public Review):

      Summary:

      Although previous research suggested that noradrenergic glutamatergic signaling could influence respiratory control, the work performed by Chang and colleagues reveals that Vglut2 is dynamically and widely expressed throughout the central noradrenergic system but is not crucial for baseline breathing or for the hypercapnic and hypoxic ventilatory responses. The central point that will make a significant change in the field is how NA-glutamate transmission may influence breathing control and the dysfunction of NA neurons in respiratory disorders.

      Strengths:

      There are several strengths such as the comprehensive analysis of Vglut1, Vglut2, and Vglut3 expression in the central noradrenergic system and the combined measurements of breathing parameters in conscious unrestrained mice.

      Other considerations:

      These results strongly suggest that glutamate may not be necessary for modulating breathing under normal conditions or even when faced with high levels of carbon dioxide (hypercapnia) or low oxygen levels (hypoxia). This finding is unexpected, considering many studies have underscored glutamate's vital role in respiratory regulation, more so than catecholamines. This leads us to question the significance of catecholamines in controlling respiration. Moreover, if glutamate is not essential for this function, we need to explore its role in other physiological processes such as sympathetic nerve activity (SNA), thermoregulation, and sensory physiology.

    1. eLife assessment

      This study presents solid evidence to support the effectiveness of the novel eIF2B activator DNL343 in mitigating the integrated stress response (ISR) and reducing neurodegeneration associated with ISR activation in two mouse models. These important findings offer promise for the potential use of DNL343 in treating vanishing white matter disease (VWMD), a rare condition resulting from eIF2B loss of function, and in addressing other neurodegenerative disorders characterized by ISR involvement. The study also identified potential VWMD biomarkers, which hold significance for assessing disease progression and evaluating treatment responses.

    2. Reviewer #3 (Public Review):

      Summary:

      ISR contributes to the pathogenesis of multiple neurodegenerative diseases, such as ALS, FTD, VWMD, etc. Targeting ISR is a promising avenue for therapeutic intervention. However, all previously identified ways to target ISR have problems. PERK inhibitors suppress ISR by inhibiting eIF2alpha phosphorylation and cause pancreatic toxicity in mice. In order to bypass eIF2alpha, previous studies have identified ISR suppressors that target eIF2B, such as ISRIB and 2BAct. These molecules suppress neurodegeneration but do not cause detrimental effects in mouse models. However, ISRIB is water-insoluble, and 2BAct causes cardiovascular complications in dogs, preventing their use in clinics. Here, the authors showed that DNL343, a new ISR inhibitor targeting eIF2B, suppresses features that can be related to neurodegeneration in mouse models. Combined with their previous results of a clinical phase I trial showing the safety of DNL343, these findings suggest the promise of DNL343 as a potential drug for neurodegenerative diseases in which ISR contributes to pathogenesis.

      Strengths:

      The finding is important and has disease implications.

      Weakness:

      The authors did not provide evidence that DNL343 suppresses the demise of nervous systems in their VWMD model.

    3. Reviewer #1 (Public Review):

      Summary:

      In this study, the authors evaluated a novel eIF2B activator, DNL343, in two mouse models representing different integrated stress response (ISR) forms. They first assessed the pharmacokinetics of DNL343, demonstrating its ability to cross the blood-brain barrier and exhibit good bioavailability. In an acute ISR model induced by optic nerve crush (ONC) injury, DNL343 treatment reduced ISR-induced transcriptional changes and neuronal loss, demonstrating neuroprotective effects. Next, the authors generated an eIF2B loss-of-function mouse model by knocking in disease-causing Eif2b5 variants. The model presents a chronic ISR and mimics vanishing white matter disease (VWMD). DNL343 treatment from the pre-symptomatic stage improved body weight and motor functions, corrected transcriptional changes, and reversed proteomic and metabolomic alterations in the brain and cerebrospinal fluid. DNL343 treatment initiated at an advanced disease stage also showed positive effects, restoring body weight gain, suppressing ISR, reducing neurodegeneration biomarkers, and extending lifespan. These findings highlight DNL343 as an effective ISR inhibitor with potential applications in treating VWMD and other neurodegenerative disorders involving ISR.

      Strengths:

      The study's findings regarding the novel compound DNL343 offer significant promise in addressing VWMD, a condition currently lacking disease-modifying treatment. DNL343 directly targets eIF2B, the disease-causing complex in VWMD, and demonstrates notable efficacy in reversing the integrated stress response (ISR) and mitigating neurodegeneration in a VWMD mouse model. These results raise hope for the potential application of DNL343 in VWMD treatment, a development eagerly anticipated by patients and the VWMD research community. Moreover, the study hints at the broader potential of DNL343 in treating other ISR-related neurodegenerative disorders, such as ALS, a prospect that holds broader interest. Additionally, the study's identification of potential biomarkers for VWMD represents a notable strength, potentially leading to improved disease progression assessment pending further confirmation in future research.

      Weaknesses:

      Direct biochemical evidence confirming DNL343's activity in eIF2B activation and its toxicity profile have been previously documented in a separate study. It would be beneficial to provide a more detailed introduction to this information, establishing a robust knowledge foundation for the in vivo study described in this work.

    4. Reviewer #2 (Public Review):

      Summary:

      The authors developed DNL343, a CNS-penetrant small molecule integrated stress response (ISR) inhibitor, to treat neurodegenerative diseases caused by ISR.

      Strengths:

      DNL343 is an investigational CNS-penetrant small molecule integrated stress response (ISR) inhibitor designed to activate the eukaryotic initiation factor 2B (eIF2B) and suppress aberrant ISR activation. The therapeutic efficacy of DNL343 has been extensively characterized in two animal models. Importantly, plasma biomarkers of neuroinflammation and neurodegeneration can be reversed with DNL343 treatment. Remarkably, several of these biomarkers show differential levels in CSF and plasma from patients with vanishing white matter disease (VWMD) upon DNL343 treatment. Overall, this is a very exciting study that targets the ISR for therapeutic intervention.

      Weaknesses:

      My main questions center around the characterization of DNL343.

      (1) Is there any biochemical evidence showing DNL343 activates eIF2B, such as binding and in vitro biochemical activity assays? A conference presentation was cited. "Osipov, M. (2022). Discovery of DNL343: a Potent Selective and Brain-penetrant eIF2B Activator Designed for the Treatment of Neurodegenerative Diseases. Medicinal Chemistry Gordon Research Conference. New London, NH." However, there is no public information about this presentation.

      (2) How was the selectivity of DNL343 demonstrated? What are the off-targets of DNL343, particularly when DNL343 is administered at a high dose? Thermal proteome profiling or photoaffinity labeling experiments could be considered.

      (3) What are the total drug concentrations in the brain and plasma? What are the unbound ratios?

      (4) If DNL343 is given intravenously, what are the concentrations in the brain and plasma after 5 minutes and 1 h or longer time points? In other words, does DNL343 cross BBB through passive diffusion or an active process?

      (5) What is the full PK profile of DNL343 for intravenous and oral dosing?

      (6) Are there any major drug metabolites that could be concerns?

      Review for Revision:

      The companion JMC paper, doi.org/10.1021/acs.jmedchem.3c02422, addressed most of my questions. However, I was unable to find the total concentrations of DNL343 in the brain and plasma or the raw data for the full PK in the JMC paper. Otherwise, the JMC publication addressed all my questions.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this study, the authors evaluated a novel eIF2B activator, DNL343, in two mouse models representing different forms of the integrated stress response (ISR). They first assessed the pharmacokinetics of DNL343, demonstrating its ability to cross the blood-brain barrier and exhibit good bioavailability. In an acute ISR model induced by optic nerve crush (ONC) injury, DNL343 treatment reduced ISR-induced transcriptional changes and neuronal loss, demonstrating neuroprotective effects. Next, the authors generated an eIF2B loss-of-function mouse model by knocking in disease-causing Eif2b5 variants. The model presents a chronic ISR and mimics vanishing white matter disease (VWMD). DNL343 treatment from the pre-symptomatic stage improved body weight and motor functions, corrected transcriptional changes, and reversed proteomic and metabolomic alterations in the brain and cerebrospinal fluid. DNL343 treatment initiated at an advanced disease stage also showed positive effects, restoring body weight gain, suppressing ISR, reducing neurodegeneration biomarkers, and extending lifespan. These findings highlight DNL343 as an effective ISR inhibitor with potential applications in treating VWMD and other neurodegenerative disorders involving ISR.

      Strengths:

      The study's findings regarding the novel compound DNL343 offer significant promise in addressing VWMD, a condition currently lacking disease-modifying treatment. DNL343 directly targets eIF2B, the disease-causing complex in VWMD, and demonstrates notable efficacy in reversing the integrated stress response (ISR) and mitigating neurodegeneration in a VWMD mouse model. These results raise hope for the potential application of DNL343 in VWMD treatment, a development eagerly anticipated by patients and the VWMD research community. Moreover, the study hints at the broader potential of DNL343 in treating other ISR-related neurodegenerative disorders, such as amyotrophic lateral sclerosis, a prospect that holds broader interest. Additionally, the study's identification of potential biomarkers for VWMD represents a notable strength, potentially leading to improved disease progression assessment pending further confirmation in future research.

      Weaknesses:

      There are a couple of notable concerns in this study. Firstly, while the in vivo evidence strongly supports the efficacy of DNL343 in mitigating ISR and neurodegeneration, there is a lack of direct biochemical evidence to confirm its activity in eIF2B activation. Secondly, the potential for cardiovascular toxicity, which has been reported for a related eIF2B activator in a canine model (as mentioned in the manuscript), has not been evaluated for DNL343 in this study. This data gap regarding toxicity could be crucial for informing the future development of DNL343 for potential human use. Further investigation into these areas would be valuable for a comprehensive understanding of the compound's mechanisms and safety profile.

      We thank the reviewer for the thoughtful feedback and an opportunity to provide further clarification. To address the first question regarding biochemical evidence of the mechanism of action of DNL343, we agree that additional data is helpful to interpreting the results presented in this manuscript. We now include a citation to Craig et al (Craig, R.A., 2nd, J. De Vicente, A.A. Estrada, J.A. Feng, K.W. Lexa, M.J. Canet, W.E. Dowdle, R.I. Erickson, B.N. Flores, P.C.G. Haddick, L.A. Kane, J.W. Lewcock, N.J. Moerke, S.B. Poda, Z. Sweeney, R.H. Takahashi, V. Tong, J. Wang, E. Yulyaningsih, H. Solanoy, K. Scearce-Levie, P.E. Sanchez, L. Tang, M. Xu, R. Zhang and M. Osipov (2024). "Discovery of DNL343: A Potent, Selective, and Brain-Penetrant eIF2B Activator Designed for the Treatment of Neurodegenerative Diseases." J Med Chem.) which includes the full details on the discovery and characterization of DNL343.

      On the question of cardiovascular toxicity observed with previous eIF2B activating compounds, Craig et al also provide evidence in a non-human primate (cynomolgus monkey) model that DNL343 dosing did not result in QT prolongation or any functional cardiac changes. We have also completed Phase 1 (NCT04268784) and Phase 1B double-blind (NCT05006352) trials in healthy participants and participants with ALS, respectively, and these trials are referenced on page 4, lines 102-103. The safety profile observed in these clinical studies supported further development of DNL343 for ALS in the Healey Platform trial (NCT04297683, Regimen G).

      Reviewer #2 (Public Review):

      Summary:

      The authors developed DNL343, a CNS-penetrant small molecule integrated stress response (ISR) inhibitor, to treat neurodegenerative diseases caused by ISR.

      Strengths:

      DNL343 is an investigational CNS-penetrant small molecule integrated stress response (ISR) inhibitor designed to activate the eukaryotic initiation factor 2B (eIF2B) and suppress aberrant ISR activation. The therapeutic efficacy of DNL343 has been extensively characterized in two animal models. Importantly, plasma biomarkers of neuroinflammation and neurodegeneration can be reversed with DNL343 treatment. Remarkably, several of these biomarkers show differential levels in CSF and plasma from patients with vanishing white matter disease (VWMD) upon DNL343 treatment. Overall, this is a very exciting study to target ISR for therapeutic interventions.

      Weaknesses:

      My main questions center around the characterization of DNL343.

      (1) Is there any biochemical evidence showing DNL343 activates eIF2B, such as binding assays or in vitro biochemical activity assays? A conference presentation was cited - "Osipov, M. (2022). Discovery of DNL343: a Potent Selective and Brain-penetrant eIF2B Activator Designed for the Treatment of Neurodegenerative Diseases. Medicinal Chemistry Gordon Research Conference. New London, NH." However, there needs to be public information about this presentation.

      Information from this presentation and more details on the discovery and characterization of DNL343 can be found in Craig et al J Med Chem (2024) and this citation has been replaced.

      (2) How was the selectivity of DNL343 demonstrated? What are the off-targets of DNL343, in particular when DNL343 is administered at a high dose? Thermal proteome profiling or photoaffinity labeling experiments could be considered.

      Please see Craig et al J Med Chem (2024) for full details. In brief, no significant off-target effects were observed for DNL343 in a Cerep panel.

      (3) What are the total drug concentrations in the brain and plasma? What are the unbound ratios?

      Following a single oral dose of DNL343 in mice, unbound brain-to-unbound plasma exposure ratios (Kp,uu) of 0.8 to 1.1 were observed, indicating high CNS penetrance. This was further supported by CSF-to-unbound plasma exposure ratios of 0.9 in the same mouse study. CNS penetrance was also confirmed in rats and NHP by CSF-to-unbound plasma ratios near unity, as reported in Craig et al J Med Chem (2024).
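
      As a rough illustration of how an unbound brain-to-unbound plasma ratio is formed, the sketch below uses the standard definition: total exposure in each compartment is multiplied by its unbound fraction before taking the ratio. The function name, argument names, and all numbers are hypothetical and are not taken from the study.

```python
def kp_uu(total_brain, fu_brain, total_plasma, fu_plasma):
    """Unbound brain-to-unbound plasma exposure ratio (Kp,uu).

    total_* are total exposures (e.g. AUC or concentration, same units),
    fu_* are the unbound fractions in brain and plasma (0 to 1).
    """
    unbound_brain = total_brain * fu_brain
    unbound_plasma = total_plasma * fu_plasma
    if unbound_plasma <= 0:
        raise ValueError("unbound plasma exposure must be positive")
    return unbound_brain / unbound_plasma


# Hypothetical values for illustration only, not data from the study.
example = kp_uu(total_brain=500.0, fu_brain=0.02,
                total_plasma=200.0, fu_plasma=0.05)
print(round(example, 2))
```

      A value near 1 means that unbound drug exposure is similar on both sides of the blood-brain barrier, which is how ratios of 0.8 to 1.1 are read above.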

      (4) If DNL343 is given intravenously, what are the concentrations in the brain and plasma after 5 minutes and 1 hour or longer time points? In other words, does DNL343 cross BBB through passive diffusion or an active process?

      Unbound brain-to-unbound plasma exposure ratios following a single oral dose in the mouse were 0.8 to 1.1 and showed no time dependence. These measurements were made prior to, near, and following the plasma tmax of DNL343, indicating that unbound DNL343 crosses the BBB through passive diffusion and rapidly reaches equilibrium between the brain and systemic circulation. Details can be found in Craig et al J Med Chem (2024).

      (5) What is the complete PK profile of DNL343 for intravenous and oral dosing?

      DNL343 administered orally to mice as a suspension formulation showed plasma PK consistent with prolonged absorption with tmax ranging from 3 to 4 h, and a terminal elimination half-life (t1/2) of ~10 h. Details can be found in Craig et al J Med Chem (2024).
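
      To give a sense of what a terminal half-life of about 10 h implies, the sketch below converts a half-life into a first-order elimination rate constant and the fraction of drug remaining at a later time. It assumes simple mono-exponential decline in the terminal phase, which is a simplification and not a model reported by the authors; the function names are hypothetical.

```python
import math


def elimination_rate_constant(half_life_h):
    """First-order rate constant (per hour) for a given half-life in hours."""
    return math.log(2) / half_life_h


def fraction_remaining(t_h, half_life_h):
    """Fraction of the starting concentration left after t_h hours,
    assuming mono-exponential (first-order) elimination."""
    return math.exp(-elimination_rate_constant(half_life_h) * t_h)


# With a 10 h half-life, one half-life leaves half of the starting level.
print(fraction_remaining(10.0, 10.0))
print(fraction_remaining(24.0, 10.0))
```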

      (6) Are there any major drug metabolites that could be of concern?

      DNL343 metabolism is through Phase 1 biotransformation pathways. None of the in vivo circulating metabolites show potency towards eIF2B activation. Given that none of these metabolites are of concern, we believe this information is beyond the scope of the current manuscript.

      Reviewer #3 (Public Review):

      Summary:

      ISR contributes to the pathogenesis of multiple neurodegenerative diseases, such as ALS, FTD, VWMD, etc. Targeting ISR is a promising avenue for potential therapeutics. However, previously identified ways to target ISR present some challenges. PERK inhibitors suppress ISR by inhibiting eIF2alpha phosphorylation and cause pancreatic toxicity in mice. In order to bypass eIF2alpha, previous studies have identified ISR suppressors that target eIF2B, such as ISRIB and 2BAct. These molecules suppress neurodegeneration but do not cause detrimental effects in mouse models. However, ISRIB is water-insoluble, and 2BAct causes cardiovascular complications in dogs, preventing their use in clinics. Here, the authors showed that DNL343, a new ISR inhibitor targeting eIF2B, suppresses neurodegeneration in mouse models. Combined with their previous results of a clinical phase I trial showing the safety of DNL343, these findings suggest the promise of DNL343 as a potential drug for neurodegenerative diseases in which ISR contributes to pathogenesis.

      Strengths:

      The finding is important and has disease implications, and the conclusion is not surprising.

      Weaknesses:

      The experimental design and data are hard to comprehend for an audience with a basic research background. This reviewer suggests that the authors design experiments and interpret data in the same way as previous studies on ISRIB and 2BAct (e.g., Wong et al; eLife, 2019).

      We thank this reviewer for their feedback and recognition that DNL343 has promising potential as a treatment for neurodegenerative diseases. While our studies share some similarities with Wong et al., eLife (2019) and Abbink et al., ACTN (2019), our study design is intentionally distinct (e.g. inclusion of both prevention and treatment dosing paradigms, determining dose-response impact of drug treatment across biomarkers), which necessitates tailored data visualization to effectively communicate our findings. However, we understand the importance of clarity for a broader audience and, to this end, we have made a number of changes to the data figures, in particular data from omics experiments in Figures 3 and 5. We also provided additional supplemental tables to aid data interpretation. We hope this caters both to audiences familiar with previous work and to those with a less specialized background.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) Demyelination is a significant pathological feature in the VWMD mouse model. The authors should clarify whether they observed similar demyelination in their study and if DNL343 had any impact on reversing this demyelination. These findings are crucial for assessing the compound's effectiveness in mitigating neurodegeneration.

      Demyelination is indeed an important feature in the eIF2B LOF (VWMD) mouse model. Because this phenotype, and the ability to rescue it histologically with this MOA (Wong et al; eLife, 2019, cited in introduction), is very well characterized, and because the size and number of mouse tissues were limited, we prioritized non-histological targeted and unbiased analyses aimed at identifying translatable biomarkers. Nonetheless, the totality of our data, in different mouse models and cell types, strongly supports DNL343 as a potent ISR inhibitor that is effective in attenuating neurodegeneration:

      · In the optic nerve crush model, DNL343 dose-dependently reduced retinal cell degeneration.

      · In the VWMD mouse model, DNL343 attenuated the increase in a plasma biomarker of neurodegeneration, neurofilament-light, which corresponded to normalization in motor function.

      · Metabolomic and lipidomic analyses in the VWMD mouse model brain showed increases in oxysterols, such as 7-ketocholesterol, and cholesterol esters and these lipids are associated with demyelination (Nugent et al, 2020). DNL343 treatment attenuated the levels of these oxysterols, indicating decreased demyelination.

      · When initiated at an advanced disease stage, reversal of plasma biomarkers of neurodegeneration (Nf-L) and neuroinflammation (GFAP) by DNL343 in this model was accompanied by an extension of the lifespan, which is otherwise shortened as the mutant animals succumb to disease.

      These data highlight the potential therapeutic benefits of DNL343 in the broader context of ISR-mediated neurodegeneration which can include but may not be limited to VWMD.

      (2) Figure 6 presents several biomarkers with significantly increased levels in VWMD mice and patient biofluids. However, these biomarkers are not reflected in the brain proteomics data presented in Figure 3. The discrepancy between these findings should be addressed and discussed in the manuscript to provide a more comprehensive understanding.

      Proteins detected in Figure 6 were not detected by TMT proteomics in the CSF. In the brain, only GFAP was detected, and its overall abundance in tissue was similar in both genetic groups. Cytokines such as TIMP1 and MCP1 are usually present in low abundance and are therefore challenging to detect with the broad discovery proteomics method applied in this study. Antibody-based immunoassays are better suited to specifically measuring low-abundance proteins than mass-spectrometry-based proteomics, while mass-spectrometry-based methods offer a wider dynamic range for detecting more highly abundant proteins. Differences in detection sensitivity between immunoassays and mass spectrometry assays have been previously noted (Petrera et al, J Proteome Res, 2021). We have added new text to address this point in the revised manuscript (page 7, line 274-277).

      (3) Figure 7 discusses the effects of DNL343 treatment initiated at an advanced disease stage. Since the 4-week treatment did not rescue performance in the balance beam test (as shown in Figure 6A), it is important to clarify if a 20-week treatment had any impact on this parameter.

      This reviewer raised an important question that we were unfortunately unable to test. When the balance beam training was administered after 8 (out of 20) weeks of dosing, most animals of both wild-type and mutant genotypes struggled to remain on or maintain balance on the beam and were unable to progress in traversing it, making the assay unsuccessful in this cohort. This impairment appeared to be driven by distinct factors in the two genotypes: age-associated obesity in wild-type animals and severe motor impairment in the eIF2B HOM mice, irrespective of treatment. While it is possible that other less demanding and more sensitive assays could reveal more nuanced differences, this, and our earlier data (Figure 4G-I), suggest that DNL343 could prevent but not reverse functional deterioration. This is in line with our understanding of the DNL343 mechanism of action, which does not include neuronal regeneration, a therapeutic effect that is likely required for functional recuperation. We have added this point to the manuscript (page 8, line 319-326).

      Additionally, considering the significant increase in Gdf15 levels in the disease model, it would be valuable to know if DNL343 treatment affected Gdf15 levels. If these assays were conducted, reporting the data would greatly assist in evaluating the compound's efficacy when administered at an advanced disease stage.

      We were not able to measure GDF15 levels in the 20-week study due to limitations in the in-life collected plasma samples, which were dedicated to assessing biomarkers of neurodegeneration (Figure 7E-F). However, data from our 4-week treatment study, which was initiated at a similar age range to the 20-week treatment study (19-26 and 24-33 weeks of age, respectively), showed that DNL343 was able to reduce GDF15 levels in the brain (mRNA and protein) and CSF (protein) (Supplemental Figure 5A-C), suggesting that DNL343 reduces ISR activation at an advanced disease stage in the model. We expect that this reduction observed at 4 weeks of treatment would persist for the duration of the extended treatment in the 20-week cohort.

      (4) A minor point. In Figures 5A, 5C, and 5E, it appears that the red-colored group should likely be labeled as "HOM 0 mg/kg" instead of "HOM 3 mg/kg".

      This has been amended, thank you.

      Reviewer #3 (Recommendations For The Authors):

      Major concerns:

      (1) The cellular function of DNL343 needs to be clarified. The authors claim that it activates eIF2B, but no cellular or molecular evidence is provided. Does it bind to eIF2B? Does it not affect eIF2alpha phosphorylation? Does it restore translation upon stress that causes eIF2alpha phosphorylation? Does it suppress stress granule assembly? The authors cited Sun, Tsai et al. 2023 and Osipov et al., 2022. However, these citations are conference abstracts with no published figures available for review.

      We agree that additional data outlining the biochemical evidence of the mechanism of action of DNL343 was needed. We now include a citation to Craig et al J Med Chem (2024) that includes the full details on the discovery and molecular characterization of DNL343.

      (2) It needs to be clarified how the authors selected the ISR marker genes. ISR genes are more than those selected. How about others? How did the authors measure the mRNA levels, bulk RNA-seq or RT-PCR? If the former, have the authors verified their results using RT-PCR? Have the authors measured the protein levels for nerve crush experiments (by both proteomic and individual protein analyses)? Also, no statistical analyses were found for the heat maps.

      The ISR marker genes were selected by a combination of experimental and literature data. Transcriptomics analysis of the eIF2B HOM brains was conducted using untargeted RNAseq (Supplemental Figure 1B). Here, we found an enrichment of transcripts previously reported to be ISR dependent, namely Atf4, Chac1, Ddit3, Eif4ebp1, Ppp1r15a (Larhammar et al., 2017), Atf3, Asns, Mthfd2, Psat1, Sesn2, Slc1a5, Slc7a5, Slc7a11, Trib3 (Wong et al., 2019, Abbink et al., 2019). These transcripts were assayed using targeted qPCR in the eIF2B HOM brains, spleen and PBMC (Supplemental Figure 1A, C, D) and in the retinas from the ONC experiments (Figure 2C). We have further clarified the analysis method for the gene expression data in the figure legends.

      We did not interrogate the proteome of the retina in the ONC model. Our study in this model was intended as a proof-of-concept evaluation of DNL343 effects in this acute ISR-dependent model of neurodegeneration. To this end, we performed gene expression (Figure 2C) and immunofluorescence analyses (Figure 2D-F). Each of these analyses was conducted using dedicated whole retinas; conducting additional protein analyses would necessitate a separate cohort of animals.

      We believe that heatmaps provide the best visualization of the data, particularly the dose dependent effects of DNL343 on multiple genes, but we understand the value for also providing statistical analyses. To address this, we provide additional Supplemental tables to show the outcome of statistical analyses undertaken. Statistical data relating to Figure 2C can be found on new Supplemental Tables 1 & 2; those relating to Supplemental Figures 1A, C, and D on new Supplemental Tables 3, 5, 6, respectively; that from Figure 4D on new Supplemental Table 8, and that from Figure 7D on new Supplemental Table 11.

      (3) Both the authors and Wong et al. (eLife, 2019) performed transcriptomic analyses on HOM mice. How do the authors compare the two data sets? Are they the same?

      In this work, a transcriptomic approach was applied to confirm induction of the ISR in our in vivo model. While the data are not identical, all of the top annotated genes shown in Supplementary Figure 1B were also deemed to be significant by Wong and coworkers (Bayes factor > 10). More importantly, as explained in our response to question #2 from Reviewer 3, the ISR genes highlighted in Supplementary Figure 1B were also confirmed in two other studies (Larhammar et al., 2017, Abbink et al., 2019). These data support our interpretation that eIF2B HOM mice have elevated ISR relative to WT mice. We have added new text to line 164 on page 5 to clarify this point.

      (4) Can the authors interpret their omic data using volcano plots for HOM rescue experiments, as Wong et al. did in eLife 2019? Heat maps with statistical analyses are more straightforward to comprehend. Can the authors verify some of these data using RT-PCR, Western blot, etc.?

      We added additional pathway interpretation in Figures 3 and 5 to highlight key biological processes altered in the brain and the cellular compartment of origin of CSF proteins changed in eIF2B HOM at baseline and following treatment with DNL343. Our treatment design employed multiple dosing levels, so summarization by volcano plot would have required many figures, whereas the same information can be captured more easily in a single heat map. However, to provide additional quantitative information, we have now added supplementary tables showing the full statistical analysis for all heat maps for added clarity and transparency.

      We demonstrated 100% correlation between the select genes we examined by qPCR in Supplemental Figure 1A and those identified from brain by RNA-seq. In addition, the reliability of RNA-seq data has previously been examined in great detail (Everaert et al., Sci Rep 2017); that study found ~85% concordance between RNA-seq and qPCR data, and the discordant genes tended to have < 2 log2FC and to be present in low abundance. The top core ISR genes identified in our study have > 2 log2FC and have been verified by other independent labs (Larhammar et al., 2017, Abbink et al., 2019, Wong et al., 2019). On this basis, we do not think there is a rational need for technical confirmation of the RNA-seq data.

      Risks of mis-annotation of proteins in the TMT data were further mitigated by removing proteins with coverage < 20% and fewer than 8 unique peptides detected, and by setting the protein annotation FDR to < 1%.

      Additionally, TMT-labelling based proteomics offers wider dynamic range and sensitivity than western blotting. Validation of TMT logFC data with the western blot technique, which is less quantitative and has a lower dynamic range of detection, may not be very informative. Furthermore, similar trends of changes in key ISR genes and proteins shown in Figures 4D and 5A (e.g., PSAT, SLC7A11, SLC7A5) provide additional support for the authenticity of proteins identified in this work.
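
      The filtering rule described above can be sketched as a simple predicate. This is an illustration only: the record fields (`coverage_pct`, `unique_peptides`, `fdr`) and the function name are hypothetical, and the `require_both` flag reflects an assumption about how the coverage and peptide-count criteria combine. The default follows the literal wording (a protein is removed only if it fails both), while `require_both=False` removes a protein that fails either one.

```python
def passes_tmt_filter(protein, min_coverage=20.0, min_unique_peptides=8,
                      max_fdr=0.01, require_both=True):
    """Return True if a protein record survives the quality filter.

    protein: dict with hypothetical keys 'coverage_pct', 'unique_peptides', 'fdr'.
    """
    low_coverage = protein["coverage_pct"] < min_coverage
    few_peptides = protein["unique_peptides"] < min_unique_peptides
    if require_both:
        weak_evidence = low_coverage and few_peptides
    else:
        weak_evidence = low_coverage or few_peptides
    return (not weak_evidence) and protein["fdr"] < max_fdr


# Hypothetical records for illustration only.
proteins = [
    {"id": "P1", "coverage_pct": 45.0, "unique_peptides": 12, "fdr": 0.001},
    {"id": "P2", "coverage_pct": 10.0, "unique_peptides": 3, "fdr": 0.001},
    {"id": "P3", "coverage_pct": 10.0, "unique_peptides": 15, "fdr": 0.001},
]
kept = [p["id"] for p in proteins if passes_tmt_filter(p)]
print(kept)
```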

      Also, for Figures 4E and F, it is assumed that each line represents an individual animal, but why are the body weight gains so different among the wild-type animals? Can the authors plot the mean and s.e.m.? Also, there are no data about neurodegeneration. The authors need to show microscopy images, count the numbers, and assess the morphology of nerve cells.

      The large data spread in the body weight gain in our wild-type mice reflects the normal variability of this endpoint, which can be influenced by sex and age. Indeed, both factors are present in our cohorts, as animals of both sexes were included and there was a 7-week age range (10-17 weeks of age at dosing start). Each line in Figures 4E-F indeed represents data sampled from an individual animal over time. We chose to represent the data this way for transparency and have provided additional visualization (new Supplemental Figure 3) showing both body weight gain and plasma Nf-L levels as mean ± SEM, as requested by this reviewer.

      In this study we chose to use a clinically relevant biomarker of neurodegeneration, plasma neurofilament light chain (NfL) (Figure 4F). This allowed us to prioritize the tissue samples from these studies to execute comprehensive unbiased analyses for more complete characterization of the phenotype of these eIF2B LoF mice. NfL is a biomarker that has been recognized as a sensitive measurement of neuronal/axonal damage regardless of cause (Gaetani et al., 2018, Khalil et al., 2018). Elevated plasma (and CSF) NfL levels have been demonstrated across neurodegenerative conditions such as Alzheimer’s disease (Giacomucci et al., 2022), multiple sclerosis (Ferreira-Atuesta et al., 2021), and ALS (Huang et al., 2018).

      (5) How is ISR connected to metabolomic changes? Can the authors explain it?

      ISR caused significant increases in the transcript and protein abundances of amino acid transporters and serine/glycine/1-carbon metabolism enzymes, as highlighted in Figure 3A and C and lines 237-255 of the main text. Similar patterns were also observed in prior published studies (Larhammar et al., 2017, Abbink et al., 2019, Wong et al., 2019). Consistent with these changes, we observed increased levels of alanine (transported by SLC3A2, SLC7A11, SLC7A3) and decreased cystathionine levels (associated with increased expression of CTH). ATF4 is one of the main orchestrators of the ISR to stress (e.g., amino acid deprivation), and it is required for expression of amino acid transporters and of enzymes required for synthesis of non-essential amino acids (PMID: 28494858). ATF4 increases cellular amino acid uptake and delivers the amino acids needed for synthesis of proteins and glutathione needed for survival.

      We also observed prominent changes in CE in eIF2B HOM and their normalization with DNL343 treatment, shown in Figure 5C. We checked for changes in expression levels of the CEL, CES1, LCAT, LIPA, SOAT1, and NCEH1 proteins involved in CE metabolism and failed to detect any changes in protein or RNA abundances. This suggests that rapid demyelination is a more likely trigger for CE accumulation, as reported in FTD-GRN (Marian OC et al., 2023 Acta Neuropathol Commun 11, 52) and in experimental demyelination models (Nugent AA et al., 2020 Neuron). We have added new text to the discussion section of the manuscript (page 9, lines 408-411) to discuss how these results relate to each other.

      (6) It is hard to understand the biomarker part. The authors said "potential translational biomarkers are elevated..." Do the authors mean they are elevated so they can be potential biomarkers? If their levels are unchanged (e.g., TIMP-1), how can they be biomarkers? Also, this part needs a conclusion/summary. Also, what does "reversed biomarkers..." mean?

      We have modified the text to clarify and included a concluding sentence for this section of the results (page 7, lines 297-299). In assessing whether a given protein could be a potential translational biomarker for human disease, we evaluated whether the following two conditions were met: (1) gene expression or protein levels of the biomarker are increased or decreased in the brain or biofluids (CSF or plasma) of Eif2b5 R191H homozygote mice relative to wild-type controls, and this change is modulated or normalized by administration of DNL343; and (2) protein levels in biofluids from VWMD patients differ from those in healthy controls in the same direction as seen in the mouse model. GDF-15, GFAP, and NfL meet these criteria, but TIMP-1 and MCP-1 do not.
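
      The two conditions above amount to a small piece of decision logic, sketched here for readers who find that form easier to follow. The class, field names, and example entries are hypothetical placeholders and do not encode any measurements from the study.

```python
from dataclasses import dataclass


@dataclass
class BiomarkerEvidence:
    mouse_change: int        # +1 increased, -1 decreased, 0 unchanged (mutant vs wild-type)
    modulated_by_drug: bool  # change is modulated or normalized by treatment
    patient_change: int      # +1 / -1 / 0 (patients vs healthy controls)


def is_translational_candidate(e):
    """Apply both conditions: a drug-responsive change in the mouse model,
    and a change in patient biofluids in the same direction."""
    condition_1 = e.mouse_change != 0 and e.modulated_by_drug
    condition_2 = e.patient_change != 0 and e.patient_change == e.mouse_change
    return condition_1 and condition_2


# Hypothetical placeholder markers, not results from the study.
marker_a = BiomarkerEvidence(mouse_change=+1, modulated_by_drug=True, patient_change=+1)
marker_b = BiomarkerEvidence(mouse_change=+1, modulated_by_drug=True, patient_change=0)
print(is_translational_candidate(marker_a), is_translational_candidate(marker_b))
```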

      Minor concerns:

      (1) Please explain which multiple comparison tests the authors used.

      This information has been further clarified in the figure legends.

      (2) Administrating the drug at an advanced stage led to a trend of NfL reduction but did not rescue function. Can the authors discuss what this means?

      Further elaboration and discussion about this finding have been added to the results section on page 8, line 319-325.

      (3) For statistical analyses on the bar graphs, it would be better if the authors labeled the comparison pairs on the graphs.

      We agree that labelling comparisons in bar graphs could aid the readership and have added this modification. Additionally, comparisons are indicated in the figure legend.

      (4) The authors need to state clearly that 2BAct's cardiovascular toxicity was observed in dogs, not mice. The current study does not exclude similar DNL343 toxicity. However, previous clinical trials suggest that DNL343 may be safe for humans.

      The suggestion to specify cardiovascular toxicity in dogs has been incorporated (page 3, line 101), thank you. We now include a citation to Craig et al J Med Chem (2024), which provides evidence in a non-human primate (cynomolgus monkey) model that DNL343 dosing did not result in QT prolongation or any functional cardiac changes. We have also completed Phase 1 (NCT04268784) and Phase 1B double-blind (NCT05006352) trials in healthy and ALS participants, respectively, and now include reference to these trials on page 4, lines 102-104. The safety profile observed in these clinical studies supported further development of DNL343 for ALS in the Healey Platform trial (NCT04297683, Regimen G).

    1. eLife assessment

      This fundamental study addresses the question of how certain zooplankton achieve barotaxis, directed locomotion in response to changes in hydraulic pressure. The authors provide compelling evidence that the response involves ciliary photoreceptors interacting with motoneurons. This work should be of broad interest to scientists working on mechanosensation, cilia, locomotion, and photoreceptors.

    2. Joint Public Review:

      In this work, the authors address a fundamental question in the biological physics of many marine organisms, across a range of sizes: what is the mechanism by which they measure and respond to pressure? Such responses are classed under the term "barotaxis", with a specific response termed "barokinesis", in which swimming speed increases with depth (hence with pressure). While macroscopic structures such as gas-filled bladders are known to be relevant in fish, the mechanism for smaller organisms has remained unclear. In this work, the authors use ciliated larvae of the marine annelid Platynereis dumerilii to investigate this question. This organism has previously been of great importance in unravelling the mechanism of multicellular phototaxis associated with a ciliated band of tissue directed by light falling on photoreceptors.

      In the present work, the authors use a bespoke system to apply controlled pressure changes to organisms in water and to monitor their transient response in terms of swimming speed and characteristics of swimming trajectories. They establish that those changes are based on relative pressure, and are reflected in changes in the ciliary beating. Significantly, by imaging neuronal activity during pressure stimulation, it was shown that ciliary photoreceptor cells are activated during the pressure response. That these photoreceptors are implicated in the response was verified by the reduced response of certain mutants, which appear to have defective cilia. Finally, serotonin was implicated in the synaptic response of those neurons.

      This work is an impressive and synergistic combination of a number of different biological and physical probes into this complex problem. The ultimate result, that ciliary photoreceptors are implicated, is fascinating and suggests an interesting interplay between photoreception and pressure detection.

      Future studies ought to address the following three questions opened by this work:

      (1) How the off response to a decrease in pressure is mediated,

      (2) Which receptor/channel in photoreceptors mediates the response to increased pressure,

      (3) How light and pressure information is integrated by photoreceptors in order to guide the behavior of the larvae.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1:

      We thank Reviewer #1 for the assessment of our study.

      Reviewer #2:

      “The authors should use DF/F to quantify the calcium response in photoreceptors over time. Furthermore, they should show that there is no concern of motion artifact when the pressure changes, as it could be a concern.”

      We used the ΔR/R measure (as defined in Böhm et al. 2016) to correct for motion artifacts due to the larvae moving out of the focal plane at the onset of pressure stimulation. This measure calculates the ratio of the GCaMP signal and a reference fluorescent signal (tdTomato in our case). This ratiometric quantification can better correct for changes in fluorescence that are not related to changes in calcium concentration than the ΔF/F metric, which does not use an independent reference channel.
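
      The benefit of a ratiometric measure can be shown with a short sketch. It assumes ΔR/R is computed as (R − R0)/R0, where R is the GCaMP signal divided by the reference signal at each frame and R0 is the mean of R over a baseline window; the exact baseline definition used in Böhm et al. 2016 is not restated here, so that choice, along with the function name and the toy traces, is an assumption for illustration.

```python
def delta_r_over_r(gcamp, reference, n_baseline):
    """Ratiometric response: (R - R0) / R0, with R = gcamp / reference per frame
    and R0 the mean ratio over the first n_baseline frames (assumed baseline)."""
    ratios = [g / r for g, r in zip(gcamp, reference)]
    r0 = sum(ratios[:n_baseline]) / n_baseline
    return [(r - r0) / r0 for r in ratios]


# Toy traces: the calcium signal rises in frames 3 and 4; the reference is flat.
gcamp = [100.0, 100.0, 150.0, 200.0]
reference = [50.0, 50.0, 50.0, 50.0]

# Simulated motion artifact: both channels dim by the same factor in each frame.
motion = [1.0, 1.0, 0.5, 0.8]
gcamp_moved = [g * m for g, m in zip(gcamp, motion)]
reference_moved = [r * m for r, m in zip(reference, motion)]

clean = delta_r_over_r(gcamp, reference, n_baseline=2)
moved = delta_r_over_r(gcamp_moved, reference_moved, n_baseline=2)
print(clean)
print(moved)
```

      Because a change in focus scales both channels together, the factor cancels in the ratio, so the two printed traces agree. A ΔF/F computed on the GCaMP channel alone would instead carry the motion factor into the result.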

      “The authors have not shown

      (1) how the off response to decrease of pressure is mediated

      (2) which receptor/channel mediates in photoreceptors the response to increased pressure,

      (3) nor how light and pressure information is integrated by photoreceptors in order to guide the behavior of the larvae.

      These points are beyond the scope of the study. However, if possible within a short time frame, it would be really interesting to find out whether conflicting stimuli or converging stimuli (light & pressure) can cancel each other out or synergize. In particular since the authors cite unpublished results in the discussion: "Our unpublished results indeed suggest that green light determines the direction of swimming and can override upward swimming induced by pressure, which only influences the speed of swimming (LABC and GJ, unpublished)." Showing in one panel this very cool phenomenon would be exciting & open tons of questions for the field.”

      We agree that investigating the interaction of light and pressure is a very exciting direction. However, doing it properly with the rigour we characterised pressure sensation here (across stages, pressure levels and genotypes) and phototaxis and UV avoidance in previous work (across stages, wavelengths, genotypes and stimulus direction; see Randel et al. 2014, Gühmann et al. 2015, Verasztó et al. 2018, Jokura et al. 2023) would require a separate in-depth study.

      We agree with points 1-3 regarding the limitations and mentioned these in the discussion.

      (1) Although we carried out pressure-release experiments to characterise in more detail the response to pressure OFF, our setup did not allow us to control pressure release as accurately as we could for pressure increase. Therefore, we decided not to address this aspect of the response in more detail in this study.

      “Upon a decrease in pressure, three-day-old (but not two-day-old) larvae also show an off-response characterised by downward swimming. We have not analysed in detail the neuronal mechanisms of this response but it may depend on an inverted activation of the cPRC circuit, as happens during UV avoidance (Jokura et al., 2023)”

      (2) We decided not to explore this important question in this study, due to the significant effort it would take to test the expression and function of potential candidate channels in the pressure transduction mechanism. “The cellular and molecular mechanisms by which cPRCs sense and transduce changes in hydrostatic pressure deserve further enquiry.” and “The molecular mechanisms of pressure detection remain unclear. Components of the phototransduction cascade may be involved in pressure sensation. Our results indicate that the ciliary opsin required for detecting UV light is not essential for pressure sensation.” We hypothesise in the discussion that TRP channels may play a role in pressure transduction, due to their diversity, multiple modalities and participation in phototransduction cascades.

      (3) We considered that the complexity of this question merits a separate study, where both cues can be accurately titrated and temporally combined to dissect the mechanisms of sensory integration. We have therefore removed the sentence referring to the interaction of phototaxis and the pressure response from the discussion.

      “How UV and pressure signals are integrated by the cPRC and how other light responses such as phototaxis interact with pressure responses remain exciting avenues for future research.”

    1. eLife assessment

      This valuable study elucidates the essential role of the chromatin regulator KDM6B in the establishment and maintenance of neural stem cells (NSCs) in the mouse hippocampus. While the evidence supporting the authors' claims is largely solid, a more comprehensive investigation into the cellular and molecular events underlying the loss of hippocampal NSCs would have further strengthened the study. Nonetheless, the findings will be of interest to biologists studying neural development and NSCs.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors have previously studied the function of the lysine demethylase Kdm6b as a positive regulator of neurogenesis from subventricular zone neural precursors. Here they knockout Kdm6b in progenitors of the dentate gyrus and show convincingly that deletion causes precocious differentiation of these stem cells. These data are valuable and show that Kdm6b can have very different functions in distinct populations of neuronal progenitors.

      Strengths:

      Kdm6b has repeatedly been implicated as a positive regulator of differentiation in the cellular transitions where it has been studied before. By contrast, here the authors show convincingly that it is required for maintenance of the stem cell state in the hippocampus, and that Kdm6b deletion is associated with premature stem cell differentiation and a small dentate gyrus in the adult hippocampus. Inducible deletion of Kdm6b in adult hippocampal stem cells confirms the precocious differentiation and loss of this population in the absence of Kdm6b even when induced at this later age.

      Weaknesses:

      This is a surprising finding in light of many other papers that are well-cited by the authors, including their own studies of SVZ progenitors where Kdm6b promotes neuronal differentiation. However, the weakness of the study is that the authors shed very little light on why the effects of Kdm6b would be so different (in fact, largely opposite) in the two stem cell populations they have studied.

    3. Reviewer #2 (Public Review):

      Summary:

      Gil & Lim et al. applied mouse genetic models to study the roles of chromatin regulator KDM6B in regulating the development of the hippocampal dentate gyrus (DG), as well as the establishment and maintenance of DG NSCs. KDM6B is expressed in postnatal DGs. Importantly, conditional knockout of Kdm6b in embryonic DG progenitors leads to a significantly smaller DG with loss of DG NSCs. Hippocampal-dependent behaviors are defective in Kdm6b-cKO mice. Deletion of Kdm6b results in precocious neuronal differentiation and loss of the NSC population in both postnatal and adult DGs. Single-cell RNA-seq reveals a disrupted stem cell maintenance gene signature in Kdm6b-deleted NSCs. Moreover, CUT&RUN studies showed that Kdm6b deletion increases H3K27me3 levels at a few NSC maintenance genes.

      Strengths:

      The conclusions of this paper are mostly well supported by data. The discussion is thorough.

      Weaknesses:

      I concur with the two reviewing editors who noted that the paper lacks insights into how KDM6B regulates the expression of NSC genes in DG precursors. Additionally, the authors did not provide evidence regarding whether the function of KDM6B is enzymatically dependent.

      The Kdm6b-cKO brain exhibited apparently smaller DGs, indicating compromised neurogenesis. While the authors observed an increased number of IPCs in the E17.5 DGs (Figure 4B-4C) and an increased number of BrdU+TBR2+PROX1+ cells in the P0.5 DGs (Figure 5B-5C), it is perplexing that this does not lead to an increased number of PROX1+ DG neurons. Further investigation into the cellular mechanisms underlying these events would enhance the understanding of Kdm6b's role in neurogenesis.

      Many data were not of sufficient quality and should be improved.

    4. Reviewer #3 (Public Review):

      Gil et al provide novel evidence that the chromatin regulator KDM6B is important for establishing and maintaining the neural stem cell (NSC) pool within the dentate gyrus in development and adulthood. They show compelling evidence that loss of KDM6B promotes precocious neuronal differentiation, resulting in a failure to establish and maintain the dentate gyrus NSC pool. The strongest evidence they provide is their immunohistochemistry analysis, in which they observed precocious expression of later differentiation markers from cells marked by BrdU. However, given that KDM6B is ubiquitously expressed, it is difficult to ascertain if their dysregulation is due to a direct loss of KDM6B within NSCs or caused by dysregulation of other glial cells impacted by KDM6B loss through the hGFAP-Cre. Characterization of mature glia would strengthen the work.

      They additionally provide evidence of precocious differentiation through scRNA-seq by highlighting key genes that are dysregulated with KDM6B loss. It appears the clustering analysis into cell types was done with WT and KDM6b-depleted cells together. The evidence for precocious differentiation would be greatly strengthened if they instead determined cell-type specific clusters using their WT samples and then observed if fewer cells are characterized as NSCs and more cells align to later developmental stage clusters with KDM6B depletion.

      Gil et al propose that KDM6B loss leads to hippocampus-specific impairments in learning and memory. While KDM6B-depleted mice do show a significant decrease in freezing time in contextual fear conditioning, Figure 2 Supplement 1 shows KDM6B-depleted mice are hyperactive compared to WT in the open field test. Thus, the reduction in freezing could be due to hyperactivity. Plotting freezing time in short bins throughout the duration of the test can help clarify this. It would be additionally helpful to plot the training baseline and the test on the same graph and compare their freezing from baseline to clarify if they completely fail to freeze or show a reduction in freezing compared to the wild-type.
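
      The binned analysis suggested above could be sketched as follows. This is only a minimal illustration, assuming freezing has been scored frame by frame as true/false at a known frame rate; the function name `freezing_by_bin`, the frame rate, and the bin length are hypothetical and are not taken from the authors' pipeline.

```python
def freezing_by_bin(is_freezing, fps, bin_s):
    """Percent time freezing in consecutive bins of `bin_s` seconds.

    is_freezing : sequence of per-frame freezing scores (truthy = freezing)
    fps         : frames per second of the scoring (assumed known)
    bin_s       : bin length in seconds (illustrative choice)
    An incomplete trailing bin is dropped.
    """
    frames_per_bin = int(round(fps * bin_s))
    if frames_per_bin <= 0:
        raise ValueError("each bin must span at least one frame")
    n_bins = len(is_freezing) // frames_per_bin
    percents = []
    for b in range(n_bins):
        chunk = is_freezing[b * frames_per_bin:(b + 1) * frames_per_bin]
        frozen = sum(1 for frame in chunk if frame)
        percents.append(100.0 * frozen / frames_per_bin)
    return percents
```

      Applying the same function to the training baseline and to the test session would allow both to be plotted on one axis, so that a reduction in freezing can be distinguished from a complete failure to freeze.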

    5. Author response:

      We thank the reviewers for their positive evaluation and constructive comments.  In our revision, we will aim to improve the analysis of our existing data and perform new experiments to address questions raised by the reviewers. 

      Reviewer 1 found it interesting that Kdm6b-deletion in hippocampal dentate gyrus (DG) neural stem cells causes precocious neuronal differentiation, whereas in contrast, Kdm6b is required for the maturation of neural progenitors in the ventricular-subventricular zone (V-SVZ). In the submitted manuscript, we did not provide much insight into the differences in Kdm6b function in these two neural stem cell populations. We plan on performing new experiments and expanding on our prior V-SVZ studies in a way that allows a direct comparison to the analyses of the DG. We hope that the addition of this data will shed light on why Kdm6b-deletion produces such different phenotypes in postnatal neural stem cells of the mouse brain. 

      Reviewer 2 noted that our submitted manuscript lacked insight into how KDM6B regulates gene expression. In particular, this reviewer asked whether the function of KDM6B is mediated by its enzymatic activity. The CUT&RUN experiment in our manuscript revealed an increase in H3K27me3 levels at select neural maintenance genes in the DG of Kdm6b-deleted mice. However, we agree that this data is insufficient to assess the significance of KDM6B-mediated H3K27me3 demethylation in regulating the NSC transcriptome. To address this point, we are performing experiments that can directly test this mechanistic model of KDM6B function and answer the question of whether the H3K27me3 demethylase activity of KDM6B is required for its ability to activate transcription.  Reviewer 2 also had a specific question about the cell types observed in the developing hippocampus after Kdm6b-deletion, and we believe that additional analyses will provide clarity to the overall phenotype.  More generally, we will aim to improve data quality and visualization. 

      Reviewer 3 raised the concern that because Kdm6b is not exclusively expressed in neural stem cells, the phenotype of precocious neuronal differentiation in mice with Kdm6b-deletion driven by the hGFAP-Cre transgene may be indirect, such as through changes in mature glial populations.  We will study the mature glia, as suggested by the reviewer.  We will also more thoroughly describe how our experiments targeting Kdm6b-deletion to adult neural stem cells with the tamoxifen-inducible Nestin-CreER provide evidence for the precocious neuronal differentiation phenotype being cell autonomous, at least in adult mice.  Reviewer 3 also had helpful suggestions for analyzing our single-cell RNA-seq data and behavioral studies, and we will address these comments in the revision. 

      Again, we thank the editors and reviewers for their time and consideration.  We believe that our manuscript will be greatly improved through this review process and hope to construct a stronger understanding of the role of KDM6B in DG neurogenesis.

    1. Reviewer #3 (Public Review):

      Summary:

      The goal of this study was to carry out an in-depth granular and unbiased phenotyping of peripheral blood circulating Tfh specific to two malaria vaccine candidates, PfSEA-1A and PfGARP, and correlate these with age (children vs adults) and protection from malaria (antibody titers against Plasmodium antigens). The authors further attempted to identify any specific differences in the Tfh responses to these two distinct malaria antigens.

      Strengths:

      The authors had access to peripheral blood samples from children and adults living in a malaria-endemic region of Kenya. The authors studied these samples using in vitro restimulation in the presence of specific malaria antigens. The authors generated a very rich data set from these valuable samples using cutting-edge spectral flow cytometry and a 21-plex panel that included a variety of surface markers, cytokines, and transcription factors.

      Weaknesses:

      - Quantifying antigen-specific T cells by flow cytometry requires the use of either (1) tetramers or (2) in vitro restimulation with specific antigens followed by identification of TCR-activated cells based on de-novo expression of activation markers (e.g. intracellular cytokine staining and/or surface marker staining). Although the authors use an in vitro restimulation strategy, they do not focus their study on cells de-novo expressing activation markers as a result of restimulation; therefore, their study is not really on antigen-specific cTfh. Moreover, the authors report no changes in the expression of activation markers commonly used to identify antigen-specific T cells upon in vitro restimulation (including IFNg and CD40L); therefore, it is not clear if their in vitro restimulation with malaria antigens actually worked.

      - CXCR5+CD4+ memory T cells have been shown to present multi-potency and plasticity, capable of differentiating to non-Tfh subsets upon re-challenge. Although the authors included in their flow panel a good number of markers commonly used in combination to identify Tfh (CXCR5, PD-1, ICOS, Bcl-6, IL-21), they used only a single marker (CXCR5) as their basis to define Tfh, thus providing a weak definition of Tfh cells for the downstream analysis.

      - Previous works have used FACS-sorting and in vitro assays for cytokine production and B cell help to study the functional capacity of different cTfh subsets in blood from Plasmodium-infected individuals. In this study, the authors do not carry out any such assays to isolate and evaluate the functional capacity of the different Tfh subsets identified. Thus, all the suggestions for the role that these different cTfh subsets may have in vivo in the context of malaria remain highly hypothetical.

      - The authors have not included malaria unexposed control groups in their study, and experimental groups are relatively small (n=13).

    2. eLife assessment

      Using multiparameter spectral flow cytometry and unbiased clustering analysis, this study provides useful insights into the heterogeneity of antigen-specific circulating T follicular helper cells from children and adults living in malaria-endemic areas of Kenya. Although the study is well-designed, the analysis and interpretation of the potential functional roles for PfSEA-1A- and PfGARP-specific subsets of circulating T follicular helper cells are incomplete.

    3. Reviewer #1 (Public Review):

      Summary:

      This study aims to understand the malaria antigen-specific cTfh profile of children and adults living in a malaria holoendemic area. PBMC samples from children and adults were unstimulated or stimulated with PfSEA-1A or PfGARP in vitro for 6h and analysed by a cTfh-focused panel. Unsupervised clustering and analysis on cTfh were performed.

      The main conclusions are:
      (1) the cohort of children has more diverse (cTfh1/2/17) recall responses compared to the cohort of adults (mainly cTfh17) and
      (2) PfGARP stimulates better cTfh17 responses in adults, and is thus a promising vaccine candidate.

      Strengths:

      This study is in general well-designed and with excellent data analysis. The use of unsupervised clustering is a nice attempt to understand the heterogeneity of cTfh cells. Figure 9 is a beautiful summary of the findings.

      Weaknesses:

      (1) Most of my concerns are related to using PfSEA-1A and PfGARP to analyse the cTfh in vitro stimulation response. In vitro stimulation of cTfh cells has been frequently used (e.g. Dan et al, PMID: 27342848), usually by antigen stimulation for 9h followed by analysis of CD69/CD40L expression, or for 18h followed by analysis of CD25/OX40. However, the authors use a different strategy that has not been validated to analyse in vitro stimulated cTfh. Also, they excluded CD25+ cells, which might be activated cTfh. I am concerned about whether the conclusions based on these results are reliable.

      Dan et al. have shown that cTfh cells can hardly produce cytokines. However, in this paper, the authors report significant secretion of IL-4 and IFNg by some cTfh clusters after 6h stimulation. If the stimulation is antigen-specific through the TCR, why do cTfh1 cells upregulate IL-4 but not IFNg in Figure 6? I believe including representative FACS plots of IL-4, IFNg, and IL-21 staining, and using %positive rather than MFI, can make the conclusion more convincing. Similarly, the authors should validate whether TCR stimulation under their system for 6h can induce robust BCL6/cMAF expression in cTfh cells. Moreover, there is no CD40L expression. Does this mean TCR stimulation-mediated BCL6/cMAF upregulation and cytokine secretion precede CD40L expression?

      In summary, I am particularly concerned about the method used to analyse PfSEA-1A and PfGARP-specific cTfh responses because it lacks proper validation. I am unsure if the conclusions related to PfSEA-1A/PfGARP-specific responses are reliable.

      (2) The section between lines 246-269 is confusing. In line 249, comparing the abundance after antigen stimulation is improper because 6h stimulation (under Golgi stop) should not induce cell division. I think the major conclusions are contained in Figure 5e: (A) antigen stimulation does not alter cell number in each cluster and (B) children have more MC03 and MC06 and fewer MC02, etc. The authors should consider removing the statements between lines 255-259 because the trends are the same regardless of stimulation.

    4. Reviewer #2 (Public Review):

      Summary:

      Forconi et al explore the heterogeneity of circulating Tfh cell responses in children and adults from malaria-endemic Kenya, and further compare such differences following stimulation with two malaria antigens. In particular, the authors also raised an important consideration for the study of Tfh cells in general, which is the hidden diversity that may exist within the current 'standard' gating strategies for these cells. The utility of multiparametric flow cytometry as well as unbiased clustering analysis provides a potentially potent methodology for exploring this hidden depth. However, the current state of analysis presented does not aid the understanding of this heterogeneity. This main goal of the study could hopefully be achieved by putting all the parameters used in one context, before dissecting such differences into their specific clinical contexts.

      Strengths:

      Understanding the full heterogeneity of Tfh cells in the context of infection is an important topic of interest to the community. The study included clinical groupings such as age group differences and differences in response to different malaria antigens to further highlight context-dependent heterogeneity, which offers new knowledge to the field. However, improvements in data analyses and presentation strategies should be made in order to fully utilize the potential of this study.

      Weaknesses:

      In general, most studies using multiparameter analysis coupled with an unbiased grouping/clustering approach aim to describe differences between all the parameters used for defining groupings, prior to exploring differences between these groupings in specific contexts. However, the authors have opted to separate these into sections using "subset chemokine markers", "surface activation markers" and then "cytokine responses", yet nuances within all three of these major groups were taken into account when defining the various Tfh identities. Thus, it would make sense to show how all of these parameters are associated with one another within one specific context, to first logically establish for the readers how we can better define Tfh heterogeneity. When presented this way, some of the less clear identities, such as "MC03/MC04/MC05/MC08", may even be better revealed. Once established, all of these clusters can then be explored in further detail to understand cluster-specific differences in children vs adults, and in the various stimulation conditions. Since the authors also showed that many of the activation markers were not significantly altered post-stimulation, there is no real obstacle to merging the entire dataset for the first part of this study, which is to define Tfh heterogeneity in an unbiased manner regardless of age groups or stimulation conditions. Other studies using similar approaches such as Mathew et al 2020 (doi: 10.1126/science.abc8) or Orecchioni et al 2017 (doi: 10.1038/s41467-017-01015-3) can be referred to for more effective data presentation strategies.

      Accordingly, the expression of cytokines and transcription factors can only be reliably detected following stimulation. However, the underlying background responses need to be taken into account for understanding "true" positive signals. The only raw data for this was shown in the form of a heatmap where no proper ordering was given to ensure that readers can easily interpret the expression of these markers following stimulation relative to no stimulation. Thus, it is difficult to reliably interpret any real differences reported without this. Finally, the authors report differences in either cluster abundance or cluster-specific cytokine/transcription factor expression in Tfh cell subsets when comparing children vs adults, and between the two malaria antigens. The comparisons of cytokine/transcription factor expression between groups will be more clearly highlighted by appropriately combining groupings rather than keeping them separate as in Figures 6 and 7.

    1. eLife assessment

      This manuscript makes valuable contributions to our understanding of cell polarisation dynamics and its underlying mechanisms. Through the development of a computational pipeline, the authors provide solid evidence that compensatory actions, whether regulatory or spatial, are essential for the robustness of the polarisation pattern. However, a more comprehensive validation against experimental data and a proper estimation of model parameters are required for further characterization and predictions in natural systems, such as the C. elegans embryo.

    2. Joint Public Review:

      The polarisation phenomenon describes how proteins within a signalling network segregate into different spatial domains. This phenomenon holds fundamental importance in biology, contributing to various cellular processes such as cell migration, cell division, and symmetry breaking in embryonic morphogenesis. In this manuscript, the authors assess the robustness of stable asymmetric patterns using both a previously proposed minimal model of a 2-node network and a more realistic 5-node network based on the C. elegans cell polarisation network, which exhibits anterior-posterior asymmetry. They introduce a computational pipeline for numerically exploring the dynamics of a given reaction-diffusion network and evaluate the stability of a polarisation pattern. Typically, the establishment of polarisation requires the mutual inhibition of two groups of proteins, forming a 2-node antagonistic network. Through a reaction-diffusion formulation, the authors initially demonstrate that the widely-used 2-node antagonistic network for creating polarised patterns fails to maintain the polarised pattern in the face of simple modifications. However, the collapsed polarisation can be restored by combining two or more opposing regulations. The position of the interface can be adjusted with spatially varied kinetic parameters. Furthermore, the authors show that the 5-node network utilised by C. elegans is the most stable for maintaining polarisation against parameter changes, identifying key parameters that impact the position of the interface. While the results offer novel and insightful perspectives on the network's robustness for cell polarisation, the manuscript lacks comprehensive validation against experimental data, justified node-node network interactions, and proper estimation of model parameters (based on quantitative measurements or molecular intensity distributions). 
These limitations significantly restrict the utility of the model in making meaningful predictions or advancing our understanding of cell polarisation and pattern formation in natural systems, such as the C. elegans embryo.

      In more detail, the authors demonstrate that the simplified 2-node model requires precise parameter fine-tuning to maintain stable polarisation. Any single modification to this 2-node network disrupts the polarisation pattern, highlighting the model's lack of robustness. However, stability is achieved when two opposite modifications are applied, which also increases the number of parameter sets that sustain the pattern. This robustness is contingent on monotonic correlations between all system parameters.

      The study extends its significance by examining how cells maintain pattern stability amid spatial parameter variations, which are common in natural systems due to extracellular and intracellular fluctuations. The authors found that in the 2-node network, spatially varying an individual parameter disrupts the pattern, but stability is restored with compensatory variations. Additionally, the polarisation interface stabilises around the step transition between parameter values, making its localisation tunable. This suggests a potential biological mechanism where localisation might be regulated through signalling perception.

      Focusing on the C. elegans cell polarisation network, the authors propose a 5-node network based on an exhaustive literature review, summarised in a supplementary table. Using their computational pipeline, they identify several parameter sets capable of achieving stable polarisation and claim that their model replicates experimental behaviour, even when simulating mutants. They also found that among 34 possible network structures, the wild-type network with mutual inhibition is the only one that proves viable in the computational pipeline. Compared with previous studies, which typically considered only 2- or 3-node networks, this analysis provides a more complete and realistic picture of the signalling network behind polarisation in the C. elegans embryo. In particular, the model for C. elegans cell polarisation paves the way for further in silico experiments to investigate the role of the network structure in the polarisation dynamics. The authors suggest that the natural 5-node network of C. elegans is optimised for maintaining cell polarisation, demonstrating the elegance of evolution in finding the optimal network structure to achieve certain functions.

      Noteworthy limitations are also found in this work. To simplify the model for numerical exploration, the authors assume several reactions have equivalent dynamics, reducing the parameter space to three independent dimensions. While the authors briefly acknowledge this limitation in the "Discussion and Conclusion" section, further analysis might be required to understand the implications. For instance, it is not clear how the results depend on the particular choice of parameters. The authors showed that adding additional regulation might disrupt the polarised pattern, with the conclusion apparently depending on the strength of the regulation. Even for the 5-node wild-type network, which is the most robust, adding a strong enough self-activation of [A], as done in the 2-node network, will probably cause the polarised pattern to collapse as well.

      Additionally, the authors utilise parameter values that are unrealistic, fail to provide units for some of them, and assume unknown parameter values without justification. The model appears to have non-dimensionalised length but not time, resulting in a mix of dimensional and non-dimensional variables that can be confusing. Furthermore, they assume equal values for Hill coefficients and many parameters associated with activation and inhibition pathways, while setting inhibition intensity parameters to 1. These arbitrary choices raise concerns about the fidelity of the proposed model in representing the real system, as their selected values could potentially differ by many orders of magnitude from the actual parameters.

      The definition of stability and its evaluation in the proposed pipeline might also be too narrow. Throughout the paper, the authors discuss the stability of the polarised pattern, checked by an exhaustive search of the parameter space where the system reaches a steady state with a polarised pattern instead of a homogeneous pattern. It is not clear if the stability is related to the linear stability analysis of the reaction terms, as conducted in Goehring et al. (Science, 2011), which could indicate if a homogeneous state exists and whether it is stable or unstable. The stability test is performed through a pipeline procedure where they always start from a polarised pattern described by their model and observe how it evolves over time. It is unclear if the conclusions depend on the chosen initial conditions. Particularly, it is unclear what would happen if the initial distribution of posterior molecules is not exactly symmetric with respect to the anterior molecules, or if the initial polarisation is not strong.
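
      The dependence on initial conditions raised above could be probed with a test of the following kind. This is a minimal sketch under stated assumptions and is not the authors' pipeline: the generic two-node mutual-antagonism equations, all parameter values, and the names `simulate` and `polarity_index` are illustrative choices. The arguments `boundary` and `contrast` set how asymmetric and how strong the starting polarisation is.

```python
import math


def simulate(n=60, steps=2000, dt=0.01, length=1.0, diff=1e-3,
             on=1.0, off=1.0, k_ap=10.0, k_pa=10.0, hill=2,
             boundary=0.5, contrast=1.0):
    """Explicit-Euler integration of a generic 1D two-node antagonism model.

    Assumed (illustrative) equations, with no-flux boundaries:
        dA/dt = diff * A_xx + on - off * A - k_ap * P**hill * A
        dP/dt = diff * P_xx + on - off * P - k_pa * A**hill * P
    Returns the final profiles (A, P) as lists of length n.
    """
    dx = length / n
    if diff * dt / dx ** 2 >= 0.5:
        raise ValueError("dt too large for a stable explicit diffusion step")
    # Step-like start: A high left of `boundary`, P high right of it.
    # `contrast` in [0, 1] scales how strongly polarised the start is.
    a = [0.5 + 0.5 * contrast if (i + 0.5) * dx < boundary
         else 0.5 - 0.5 * contrast for i in range(n)]
    p = [1.0 - v for v in a]
    for _ in range(steps):
        a_new, p_new = [], []
        for i in range(n):
            left, right = max(i - 1, 0), min(i + 1, n - 1)  # mirrored = no flux
            lap_a = (a[left] - 2.0 * a[i] + a[right]) / dx ** 2
            lap_p = (p[left] - 2.0 * p[i] + p[right]) / dx ** 2
            a_new.append(a[i] + dt * (diff * lap_a + on - off * a[i]
                                      - k_ap * p[i] ** hill * a[i]))
            p_new.append(p[i] + dt * (diff * lap_p + on - off * p[i]
                                      - k_pa * a[i] ** hill * p[i]))
        a, p = a_new, p_new
    return a, p


def polarity_index(profile):
    """(front mean - back mean) / (front mean + back mean); 0 if uniform."""
    half = len(profile) // 2
    front = sum(profile[:half]) / half
    back = sum(profile[half:]) / (len(profile) - half)
    total = front + back
    return 0.0 if total == 0 else (front - back) / total
```

      Scanning `boundary` and `contrast`, and comparing `polarity_index` of the final profiles for equal and unequal `k_ap` and `k_pa`, would indicate whether the reported stability is specific to a strongly polarised, symmetric starting pattern.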

      Regarding the biological interpretation and relevance of the model, it overlooks some important aspects of the C. elegans polarisation system. The authors focus solely on a reaction-diffusion formulation to reproduce the polarisation pattern. However, the polarisation of the C. elegans zygote consists of two distinct phases: establishment and maintenance, with actomyosin dynamics playing a crucial role in both phases (see Munro et al., Dev Cell 2004; Shivas & Skop, MBoC 2012; Liu et al., Dev Biol 2010; Wang et al., Nat Cell Biol 2017). Both myosin and actin are crucial to maintaining the localisation of PAR proteins during cell polarisation, yet the authors neglect cortical flows during the establishment phase and any effects driven by myosin and actin in their model, failing to capture the system's complexity. How this affects the proposed model and conclusions about the establishment of the polarisation pattern needs careful discussion. Additionally, they assume that diffusion in the cytoplasm is infinitely fast and that cytoplasmic flows do not play any role in cell polarity. Finite cytoplasmic diffusion combined with cytoplasmic flows could compromise the stability of the anterior-posterior molecular distributions. The authors claim that cytoplasmic diffusion coefficients are two orders of magnitude higher than membrane diffusion coefficients, but they seem to differ by only one order of magnitude (Petrášek et al., Biophys. J. 2008). The strength of cytoplasmic flows has been quantified by a few studies, including Cheeks et al., Curr Biol 2004.

      Although the authors compare their model predictions to experimental observations, particularly in reproducing mutant behaviours, they do not explicitly show or discuss these comparisons in detail. Diffusion coefficients and off-rates for some PAR proteins have been measured (Goehring et al., JCB 2011), but the authors seem to use parameter values that differ by many orders of magnitude, perhaps due to applied scaling. To ensure meaningful predictions, whether their proposed model captures the extensive published data should be evaluated. Various cellular/genetic perturbations have been studied to understand their effects on anterior-posterior boundary positioning. Testing these perturbations' responses in the model would be important. For example, comparing the intensity distribution of PAR-6 and PAR-2 with measurements during the maintenance phase by Goehring et al., JCB 2011, or comparing the normalised intensity of PAR-3 and PKC-3 from the model with those measured by Wang et al., Nat Cell Biol 2017, during establishment and maintenance phases (in both wild-type and cdc-42 (RNAi) zygotes) could provide insightful validation. Additionally, in the presence of active CDC-42, it has been observed that PAR-6 extends further into the posterior side (Aceto et al., Dev Biol 2006). Conducting such validation tests is essential to convince readers that the model accurately represents the actual system and provides insights into pattern formation during cell polarisation.

      A clear justification, with references, for each network interaction between nodes in the five-node model is needed. Some of the activatory/inhibitory signals proposed by the authors have not been demonstrated (e.g. CDC-42 directly inhibiting CHIN-1). Table S2 provided by the authors is insufficient to justify each node-node interaction, requiring additional explanations. (See the review by Gubieda et al., Phil. Trans. R. Soc. B 2020, for a similar node network that differs from the authors' model.) Additionally, the intensity distributions of cortical PAR-3 and PKC-3 seem to vary significantly during both establishment and maintenance phases (Wang et al., Nat Cell Biol 2017), yet the authors consider PAR-3/PAR-6/PKC-3 as a single complex. The choices in the model should be justified, as the presence or absence of clustering of these PAR proteins can be crucial during cell polarisation (Wang et al., Nat Cell Biol 2017; Dawes & Munro, Biophys J 2011).

      In summary, the authors successfully demonstrate the importance of compensatory actions in maintaining polarisation robustness. Their computational pipeline offers valuable insights into the dynamics of reaction-diffusion networks. However, the lack of detailed experimental validation and realistic parameter estimation limits the model's applicability to real biological systems. While the study provides a solid foundation, further work is needed to fully characterise and validate the model in natural contexts. This work has the potential to significantly impact the field by providing a new perspective on the robustness of cell polarisation networks.

      The computational pipeline developed could be a valuable tool for further in silico experiments, allowing researchers to explore the dynamics of more complex networks. To maximise its utility, the model needs comprehensive validation and refinement to ensure it accurately represents biological systems. Addressing these limitations, particularly the need for more detailed experimental validation and realistic parameter choices, will enhance the model's predictive power and its applicability to understanding cell polarisation in natural systems.

    1. eLife assessment

      This valuable work advances our understanding of the foraging behaviour of aerial insectivorous birds. Its major strength is the large volume of tracking data and the accuracy of those data. However, the evidence supporting the main claim of optimal foraging is incomplete.

    2. Reviewer #1 (Public Review):

      This study tests whether Little Swifts exhibit optimal foraging, which the data seem to indicate is the case. This is unsurprising as most animals would be expected to optimize the energy income:expenditure ratio; however, it hasn't been explicitly quantified before the way it was in this manuscript.

      The major strength of this work is the sheer volume of tracking data and the accuracy of those data. The ATLAS tracking system really enhanced this study and allowed for pinpoint monitoring of the tracked birds. These data could be used to ask and answer many questions beyond just the one tested here.

      The major weakness of this work lies in the sampling of insect prey abundance at a single point on the landscape, 6.5 km from the colony. This sampling then requires the authors to work under the assumption that prey abundance is simultaneously even across the study region - an assumption that is certainly untrue. The authors recognize this problem and say that sampling in a spatially explicit way was beyond their scope, which I understand, but then at other times try to present this assumption as not being a problem, which it very much is. Further, it is uncertain whether other aspects of the prey data are problematic. For example, the radar only samples insects at 50 m or higher from the ground - how often do Little Swifts forage under 50 m high? Another example might be that the phrases "high abundance" and "low abundance" are often used in the manuscript, but never defined.

      It may be fair to say that prey populations might be correlated over space but are not equal. It is this unknown degree of spatial correlation that lends confidence to the findings in the Results. As such, the finding that Little Swifts forage optimally is indeed supported by the data, notwithstanding some of the shortcomings in the prey abundance data. The authors achieved their aims and the results support their conclusions.

      At its centre, this work adds to our understanding of Little Swift foraging and extends to a greater understanding of aerial insectivores in general. While unsurprising that Little Swifts act as optimal foragers, it is good to have quantified this and show that the population declines observed in so many aerial insectivores are not necessarily a function of inflexible foraging habits. Further, the methods used in this research have great potential for other work. For example, the ATLAS system poses some real advantages and an exciting challenge to existing systems, like MOTUS. The radar that was used to quantify prey abundance also presents exciting possibilities if multiple units could be deployed to get a more spatially-explicit view.

      To improve the context of this work, it is worth noting that the authors suggest that this work is important because it has never been done before for an aerial insectivore; however, that justification is untrue as it has been assessed in several flycatcher and swallow species. A further justification is that this research is needed due to dramatic insect population declines, but the magnitude and extent of such declines are fiercely debated in the literature. Perhaps these justifications are unnecessary, and the work can more simply be couched as just a test of optimality theory.

    3. Reviewer #2 (Public Review):

      Summary:

      Bloch et al. investigate the relationships between aerial foragers (little swifts) tracked with an automated radio-telemetry system (ATLAS) and their prey (flying insects) monitored with a small-scale vertical-looking radar device (BirdScan MR1). The aim of the study was to test whether little swifts optimise their foraging with the abundance of their prey. However, the results provided little evidence of optimal foraging behaviour.

      Strengths:

      This study addresses fundamental knowledge gaps on the prey-predator dynamics in the airspace. It describes the coincidence between the abundance of flying insects and features derived from tracking individual swifts.

      Weaknesses:

      The article uses hypotheses broadly derived from optimal foraging theory, but mixes the form of natural selection: parental energetics, parental survival (predation risks), nestling foraging, and breeding success. Results are partly incoherent (e.g., "Thus, even when the birds foraged close to the colony under optimal conditions, the shorter traveling distance is not thought to not confer lower flight-related energetic expenditure because more return trips were made.", L285-287), and confounding factors (e.g., brooding vs. nestling phase) are ignored. Some limits are clearly recognised by the authors (L329 and ff). To illustrate potential confounding effects, the daily flight duration (Prediction 4) should decrease with prey abundance, but how far does the daily flight duration coincide with departure and arrival at sunrise and sunset (note that day length increases between March and May), respectively, and how much do parents vary in the duration of nest attendance during the day across chick ages? To conclude, insufficient analyses are performed to rigorously assess whether little swifts optimize their foraging.

      Filters applied on tracking data are necessary but may strongly influence derived features based on maximum or mean values. Providing sensitivity tests or using features less dependent on extreme values may provide more robust results.

      Radar insect monitoring is incomplete and strongly size-dependent. What is the favourite prey size of swifts? How does it match with BirdScan MR1 monitoring capability?

    1. eLife assessment

      This study investigates the role of Caspar (Casp), an orthologue of human Fas-associated factor-1, in regulating the number of primordial germ cells that form during Drosophila embryogenesis. The findings are important in that they reveal an additional pathway involved in germ cell specification and maintenance. The evidence supporting the conclusions is solid, as the authors identify Casp and its binding partner Transitional endoplasmic reticulum 94 (TER94) as factors that influence germ cell numbers. Minor changes to the title, text, and experimental design are recommended.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors were seeking to define the roles of the Drosophila caspar gene in embryonic development and primordial germ cell (PGC) formation. They demonstrate that PGC number, and the distribution of the germ cell determinant Oskar, change as a result of changes in caspar expression; reduction of caspar reduces PGC number and the domain of Oskar protein expression, while overexpression of caspar does the reverse. They also observe defects in syncytial nuclear divisions in embryos produced from caspar mutant mothers. Previous work from the same group demonstrated that Caspar protein interacts with two partners, TER94 and Vap33. In this paper, they show that maternal knockdown of TER94 results in embryonic lethality and some overlap of phenotypes with reduction of caspar, supporting the idea that the two proteins may work together in their developmental roles. The authors propose models for how Caspar might carry out its developmental functions. The most specific of these is that Caspar and its partners might regulate oskar mRNA stability by recruiting ubiquitin to the translational regulator Smaug.

      Strengths:

      The work identifies a new factor that is involved in PGC specification and points toward an additional pathway that may be involved in establishing and maintaining an appropriate distribution of Oskar at the posterior pole of the embryo. It also ties together earlier observations about the presence of TER94 in the pole plasm that have not heretofore been linked to a function.

      Weaknesses:

      (1) A PiggyBac insertion allele casp[c04227] is used throughout the paper and referred to as a loss-of-function allele (casp[lof]). However, this allele does not appear to act strictly as a loss-of-function. Figure 1E shows that some residual Casp protein is present in early embryos produced by casp[lof]/Df females, and this protein is presumably functional as the PiggyBac insertion does not affect the coding region. Also, Figures 1B and 1C show that the phenotypes of casp[lof] homozygotes and casp[lof]/Df are not the same; surprisingly, the homozygous phenotypes are more severe. These observations are unexplained and inconsistent with the insertion being simply a loss-of-function allele. Might there be a second-site mutation in casp[c04227]?

      (2) TER94 knockdown phenotypes have been previously published (Zhang et al 2018 PMID 30012668), and their effects on embryonic viability and syncytial mitotic divisions were described there. This paper is inappropriately not cited, and the data in Figure 4 should be presented in the context of what has been published before.

      (3) The peptide counts in the mass spectrometry experiment aimed at finding protein partners for Casp are extremely low, except for Casp itself and TER94. Peptide counts of 1-2 seem to me to be of questionable significance.

      (4) The pole bud phenotypes from TER94 knockdown and casp mutant shown in Fig 5 appear to be quite different. These differences are unexplained and seem inconsistent with the model proposed that the two proteins work in a common pathway. Whole embryos should also be shown, as the TER94 KD phenotype could result from a more general dysmorphism.

      (5) Figure 6 is not quantitative, lacking even a second control staining to check for intensity variation artifacts. Therefore it shows that the distribution of Oskar protein changes in the various genotypes, but not convincingly that the level of Oskar changes as the paper claims.

      (6) The error bars are huge in the graphs in Figure 7H, I, and J, leading me to question whether these changes are statistically significant. Calculations of statistical significance are missing from these graphs and need to be added.

      (7) There are many instances of fuzzy and confusing language when describing casp phenotypes. For example, on lines 211-212 it is stated that 'casp[lof] adults are only partially homozygous viable as ~70% embryos laid by the homozygous mutant females failed to hatch into larvae'. Isn't this more accurately described as 'casp[c04227] is a maternal-effect lethal allele with incomplete penetrance'? Another example is on line 1165, what exactly is a 'semi-vital function'?

    3. Reviewer #2 (Public Review):

      Summary:

      This study investigated the role of the Caspar (Casp) gene, a Drosophila homolog of human Fas-associated factor-1. It revealed that maternal loss of Casp led to centrosomal and cytoskeletal abnormalities during nuclear cycles in Drosophila early embryogenesis, resulting in defective gastrulation. Moreover, Casp regulates PGC numbers, likely by regulating the levels of Smaug and then Oskar. They demonstrate that Casp protein levels are linearly correlated to the PGC number. The partner protein TER94, an ER protein, shows similar but slightly distinct phenotypes. Based on the deletion mutant analysis, TER94 seems functionally relevant for the observed Casp phenotype. Additionally, it is likely involved in regulating protein degradation during PGC specification.

      Strengths:

      The paper reveals an unexpected function of the maternally produced Casp gene, previously implicated in immune response regulation and NF-kB signaling inhibition, in nuclear division and PGC formation in early fly embryos. Experiments are properly conducted and strongly support the conclusion. The rescue experiment using deletion mutant form is particularly informative as it suggests the requirement of each domain function.

      Weaknesses:

      Functional relationships among molecules shown here (and other genes known to regulate these processes) are still unclear.

    4. Reviewer #3 (Public Review):

      Summary:

      Das et al. discovered a maternal role for Caspar (Casp), the Drosophila orthologue of human Fas-associated factor-1 (FAF1), in embryonic development and germ cell formation. They find that Casp interacts with Transitional endoplasmic reticulum 94 (TER94). Loss of Casp or TER94 leads to partial embryonic lethality, correlated with aberrant centrosome behavior and cytoskeletal abnormalities. This suggests that Casp, along with TER94, promotes embryonic development through a still unidentified mechanism. They also find that Casp regulates germ cell number by controlling a key determinant of germ cell formation, Oskar, through its negative regulator, Smaug.

      Strengths:

      Overall, the experiments are well-conducted, and the conclusions of this paper are mostly well-supported by data.

      Weaknesses:

      Some additional controls could be included, and the language could be clarified for accuracy.

    1. eLife assessment

      This important study investigates the influence of the cingulate cortex on the development of the social vocalizations of marmoset monkeys by making bilateral lesions of this brain area in neonatal animals. The evidence supporting the authors' claims is solid, although including longer-term effects and different social contexts would strengthen the manuscript. The work will be of broad interest to cognitive neuroscientists, speech and language researchers, and primate neuroscientists.

    2. Reviewer #1 (Public Review):

      Summary:

      This study seeks to quantify changes in vocal behavior during development in marmosets with bilateral anterior cingulate cortex (ACC) lesions. The ACC and its role in social vocal behaviors are of great interest given previous literature on its involvement in the initiation of vocalizations, processing emotional content, and its connectivity to two other critical nodes in the vocal network, the amygdala and the PAG. The authors seek to test the hypothesis that the ACC contributes to the development of mature vocal behaviors during the first few weeks of life by disrupting this process with neonatal ACC lesions. Imaging and histological analyses confirm the extent of the lesion and suggest downstream effects in connected regions. Analysis of call rates and call type proportions shows no or slight differences between lesioned and control animals. Additional analyses on the proportion of grouped 'social' calls and certain acoustic features of a particular call, the phee, reveal more distinct differences between the groups.

      Strengths:

      The authors have identified that ACC lesions in early life have no or little influence on certain aspects of vocal behavior (e.g. call rate, call intervals) but larger impacts on other aspects (e.g. acoustic features of phee calls). This data is a valuable addition to the literature on the effects of the ACC on vocal production.

      The histological methods and resulting quantification of neural changes in the lesioned area and in downstream areas of interest are intriguing given the large time gap between the lesion and these analyses.

      Weaknesses:

      The article emphasizes vocal social behavior but none of the experiments involve a social element. Marmosets are recorded in isolation, which could be sufficient for examining the development of vocal behavior in that particular context. However, the early-life maturation of vocal behavior is strongly influenced by social interactions with conspecifics. For example, the transition from cries and subharmonic phees, which are high-entropy calls, to more low-entropy mature phees is affected by social reinforcement from the parents. This effect also extends across contexts: differences in these interaction patterns carry over to vocal behavior when the marmosets are alone. From the chord diagrams, cries still constitute a significant proportion of call types in lesioned animals. Additionally, though it is an intriguing finding that the infants' phee calls have acoustic differences, being 'blunted of variation, less diverse and more regular,' the suggestion that the social message conveyed by these infants was 'deficient, limited, and/or indiscriminate' is not tested, but could be with, for example, playback experiments.

      The manuscript would benefit from the addition of more details to be able to better determine if the conclusions are well supported by the data. Understanding that this is very difficult data to get, the number of marmosets and some variability in the collection of the data would allow for the plotting of each individual across figures. For example, in the behavioral figures, which is the marmoset that is in the behavioral data that has a sparing of the ACC lesion in one hemisphere? Certain figures, described below in the recommendations for the authors, could also do with additional description.

    3. Reviewer #2 (Public Review):

      Summary:

      Nagarajan et al. investigate the role of the anterior cingulate cortex (ACC) in vocal development of infant marmoset monkeys using lesions in this brain area. Many previous studies show that ACC plays an important role in volitional and emotion-driven vocal behavior in mammals. The experiments Nagarajan et al. performed strengthen the long-standing hypothesis that ACC influences the development of social-vocal behavior in non-human primates. Furthermore, their anatomical studies support the idea of cortical structures exerting cognitive control over subcortical networks for innate vocalization, and thus, enabling mammals to perform flexible social-vocal communication.

      Strengths:

      Many invasive behavioral studies in monkeys use 2-3 animals. The authors used a sufficiently high number of animals for their experiments. This increases the power of their conclusions.

      The study also investigates the impact of ACC lesions on downstream areas important for innate vocal production. This adds further evidence to the role of ACC in influencing these subcortical regions during vocal development and vocal behavior in general.

      Weaknesses:

      The authors state that the integrity of white matter tracts at the injection site was impacted but do not show data.

      The study only provides data up to the 6th week after birth. Given the plasticity of the cortex, it would be interesting to see if these impairments in vocal behavior persist throughout adulthood or if the lesioned marmosets will recover their social-vocal behavior compared to the control animals.

      Even though this study focuses entirely on the development of social vocalizations, data about altered social non-vocal behaviors that accompany ACC lesions are missing. Such data could provide further insights and generate new hypotheses about the exact role of ACC in social-vocal development. For example, do these marmosets behave differently towards their conspecifics or family members and vice versa, and is this an alternate cause for the observed changes in social-vocal development?

    4. Reviewer #3 (Public Review):

      Summary:

      In this manuscript, Nagarajan et al. study the impact of early damage to the anterior cingulate cortex (ACC) on the vocal development of marmoset monkeys. ACC lesions were performed on neonatal marmosets, and their vocal patterns and the spectrotemporal features of their calls were analyzed and compared to control groups during the first six weeks of life. While the vocal repertoire was not significantly affected by ACC lesions, the authors described notable differences in the social contact call, the phee call. Marmosets with ACC damage made fewer social contact calls, and when they did, these calls were shorter, louder, and monotonic. Additionally, the study revealed that ACC damage in infancy led to permanent alterations in downstream brain areas involved in social vocalizations, such as the amygdala and periaqueductal gray.

      Strengths:

      This study suggests that the ACC plays a crucial role in the normal development of social vocal behavior in infant marmosets. Studying vocal behavior in marmosets can provide insights into the neural mechanisms underlying human speech and communication disorders due to their similarity in brain structure and social behavior.

      The methods are robust and reliable with precise localization of the lesions with neuroimaging and histological examination.

      Weaknesses:

      It is striking to find that the vocal repertoire of infant marmosets was not significantly affected by ACC lesions. During development, the neural circuits are still maturing and the role of different brain regions may evolve over time. While the ACC likely contributes to vocalizations across the lifespan, its relative importance may vary depending on the developmental stage. In neonates, vocalizations may be more reflexive or driven by physiological needs. At this stage, the ACC may play a role in basic socioemotional regulation but may not be as critical for vocal production. Since the animals lived for two years, further analysis might be helpful to elucidate the precise role of ACC in the vocal behavior of marmosets.

      - Figure 3D. According to the Introduction "...infant ACC lesions abolish the characteristic cries that infants normally issue when separated from its mother". Are the present results in marmosets showing the opposite effect? Please discuss.

      - Figure 3E and Discussion. Phees are mature contact calls and cries immature contact calls (Zhang et al, 2019, Nat Commun). Therefore, I would rather say that the proportion of immature (cries) contact calls increases vs the mature (phee, trill, twitters) contact calls in the ACC group. Cries are also "isolated-induced contact calls" to attract the attention of the caregivers.

      - Figure 4D. Animal location and head direction within the recording incubator can have significant effects on the perceived amplitude of a call. Were these factors taken into account?

      - Figure 4E. When a phee call has a higher amplitude, as is the case for the ACC group (Figure 4D), the energy of the signal will be concentrated more strongly at the phee call frequency (~8 kHz). This concentration of the energy reduces the variability in the frequency distribution, leading to lower entropy. The interpretation of the results should be reconsidered. A faint call (control group) can exhibit more variability in the frequency content since the energy is distributed across a wider range of frequencies, contributing to higher entropy. It can still be "fixed, regular, and stereotyped" if the behavior is consistent or predictable with little variation. Also, to define ACC calls as "monotonic" I would rather search for the lack of frequency modulation, amplitude variation, or narrower bandwidth.

      - Apart from the changes in the vocal behavior, did the ACC lesions manifest in any other observable cognitive, emotional, or social behavior? ACC plays a role in processing pain and modulating pain perception. Could that be the reason for the observed increase in the proportion of cries in the ACC group and the increase in the phee call amplitude? Did the cries in the ACC group also display a higher amplitude than the cries in the control group?

      - Discussion. Louder calls have the potential to travel longer distances compared to fainter calls, possess higher energy levels, and can propagate through the environment more effectively. If the ACC group produced louder phee syllables, how could the message conveyed over long distances be "deficient, limited, and/or indiscriminate"?
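      The Figure 4E point above, that a louder call can score lower spectral entropy without being any more stereotyped, can be illustrated with a toy sketch. It assumes a pure tone over a fixed white-noise floor and Shannon entropy of the normalized power spectrum; the function names, signal length, and amplitudes are hypothetical choices made for illustration, and this is not the entropy measure or the recordings used in the manuscript.

```python
import cmath
import math
import random


def power_spectrum(signal):
    """Naive DFT power at non-negative frequency bins (adequate for short signals)."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(signal))) ** 2
        for k in range(n // 2 + 1)
    ]


def spectral_entropy(signal):
    """Shannon entropy (bits) of the power spectrum normalized to sum to 1."""
    power = power_spectrum(signal)
    total = sum(power)
    probs = [p / total for p in power if p > 0]
    return -sum(p * math.log2(p) for p in probs)


def tone_in_noise(amplitude, n=256, cycles=32, noise_sd=1.0, seed=0):
    """Sine tone of the given amplitude plus Gaussian noise (hypothetical toy call)."""
    rng = random.Random(seed)  # same seed gives the same noise floor for both calls
    return [
        amplitude * math.sin(2 * math.pi * cycles * i / n) + rng.gauss(0.0, noise_sd)
        for i in range(n)
    ]


# Identical tone and identical noise; only the tone amplitude differs.
faint_entropy = spectral_entropy(tone_in_noise(amplitude=0.5))
loud_entropy = spectral_entropy(tone_in_noise(amplitude=5.0))
print(f"faint: {faint_entropy:.2f} bits, loud: {loud_entropy:.2f} bits")
```

      Under these assumptions the louder tone yields the lower entropy because a larger share of the total power falls in the tone's frequency bin, even though the tone itself is unchanged apart from its amplitude.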

    1. eLife assessment

      This study provides an important re-evaluation of modality-specific information processing in the thalamus of trained mice. Using an elegant task design that probes competing tactile and visual stimuli, the authors present convincing evidence that behavioral training reshapes the sensitivity of higher-order thalamic nuclei. Despite the innovative methods and significant findings, the conclusions would be strengthened by deeper analyses of the sensory and non-sensory aspects of the modulation of the higher-order thalamic nuclei.

    2. Reviewer #1 (Public Review):

      Petty and Bruno investigate how response characteristics in the higher-order thalamic nuclei POm (typically somatosensory) and LP (typically visual) change when a stimulus (whisker air puff or visual drifting grating) of one or the other modality is conditioned to a reward. Using a two-step training procedure, they developed an elegant paradigm, where the distractor stimulus is completely uninformative about the reward, which is reflected in the licking behavior of trained mice. While the animals seem to take to the tactile stimulus more readily, they can also associate the reward with the visual stimulus, ignoring tactile stimuli. In trained mice, the authors recorded single-unit responses in both POm and LP while presenting the same stimuli. The authors first focused on POm recordings, finding that in animals with tactile conditioning POm units specifically responded to the air puff stimulus but not the visual grating. Unexpectedly, in visually conditioned animals, POm units also responded to the visual grating, suggesting that the responses are not modality-specific but more related to behavioral relevance. These effects seem not to be homogeneously distributed across POm: lateral units maintain tactile specificity, whereas medial units respond more flexibly. The authors further ask if the unexpected cross-modal responses might result from behavioral activity signatures. By regressing behavior-coupled activity out of the responses, they show that late activity indeed can be related to whisking, licking, and pupil size measures. However, cross-modal short latency responses are not clearly related to animal behavior. Finally, LP neurons also seem to change their modality-specificity dependent on conditioning, whereas tactile responses are attenuated in LP if the animal is conditioned to visual stimuli.

      The authors make a compelling case that POm neurons are less modality-specific than typically assumed. The training paradigm, employed methods, and analyses are mostly to the point, well supporting the conclusions. The findings importantly widen our understanding of higher-order thalamus processing features with the flexibility to encode multiple modalities and behavioral relevance. The results raise many important questions on the brain-wide representation of conditioned stimuli. E.g. how specific are the responses to the conditioned stimuli? Are thalamic cross-modal neurons recruited for the specific conditioned stimulus or do their responses reflect a more global shift of attention from one modality to another?

      To elaborate on higher-order thalamic activity in relationship to conditioned behavior, a trial-by-trial analysis would be very useful. Is neuronal activity predictive of licking and at which relative timing? Furthermore, I wonder why the (in my mind) major and from the data obvious take-away, "POm neurons respond more strongly to visual stimuli if visually conditioned", is not directly tested in the summary statistics in Figure 3h.

      The remaining early visual responses in POm in visually conditioned mice after removing behavior-linked activity are very convincing (Figure 5d). It would help, however, to see a representation of this on a single-neuron basis side-by-side. Are individual neurons just coupled to behavior while others are independent, or is behaviorally coupled activity a homogeneous effect on all neurons on top of sensory activity?

      The conclusions on flexible response characteristics in LP in general are less strongly supported than those in POm. First, the differentiation between POm and LP relies heavily on the histological alignment of labeled probe depth and recording channel, possibly allowing for wrong assignment. Furthermore, it seems surprising, but is not discussed, that putative LP neurons have such strong responses to the air puff stimuli, in both conditioning cases. In tactile conditioning, LP air puff responses seem to be even faster and stronger than POm. In visual conditioning, drifting grating responses paradoxically seem to be later than in tactile conditioning (Fig S2e). These differences in response changes between POm and LP should be discussed in more detail and statements of "similar phenomena" in POm and LP (abstract) should be qualified.

    3. Reviewer #2 (Public Review):

      Summary

      This manuscript by Petty and Bruno delves into the still poorly understood role of higher-order thalamic nuclei in the encoding of sensory information by examining the activity of POm and LP cells in mice performing an associative learning task. They developed an elegant paradigm in which they conditioned head-fixed mice to attend to a stimulus of one sensory modality (visual or tactile) and ignore a second stimulus of the other modality. They recorded simultaneously from POm and LP, using 64-channel electrode arrays, to reveal the context-dependency of the firing activity of cells in higher-order thalamic nuclei. They concluded that behavioral training reshapes activity in these secondary thalamic nuclei. I have no major concerns with the manuscript's conclusions, but some important methodological details are lacking and I feel the manuscript could be improved with the following revisions.

      Strengths

      The authors developed an original and elegant paradigm in which they conditioned head-fixed mice to attend to a stimulus of one sensory modality, either visual or tactile, and ignore a second stimulus of the other modality. As a tactile stimulus, they applied gentle air puffs on the distal part of the vibrissae, ensuring that the stimulus was innocuous and therefore non-aversive, which is crucial in their study.

      It is commonly viewed that the first-order thalamus performs filtering and re-encoding of the sensory flow; in contrast, the computations taking place in high-order nuclei are poorly understood. They may contribute to cognitive functions. By integrating top-down control, high-order nuclei may participate in generating updated models of the environment based on sensory activity; how this can take place is a key question that Petty and Bruno addressed in the present study.

      Weaknesses

      (1) Overall, methods, results, and discussion, involving sensory responses, especially for the POm, are confusing. I have the feeling that throughout the manuscript, the authors are dealing with the sensory and non-sensory aspects of the modulation of the firing activity in the POm and LP, without a clear definition of what they examined. Making subsections in the results, or a better naming of what is analyzed could convey the authors' message in a clearer way, e.g., baseline, stim-on, reward.

      In line #502 in Methods, the authors defined "Sensory Responses. We examined each cell's putative sensory response by comparing its firing rate during a "stimulus period" to its baseline firing rate. We first excluded overlapping stimuli, defined as any stimulus occurring within 6 seconds of a stimulus of a different type. We then counted the number of spikes that occurred within 1 second prior to the onset of each stimulus (baseline period) and within one second of the stimulus onset (stimulus period). The period within +/-50ms of the stimulus was considered ambiguous and excluded from analysis."

      Considering that the responses to whisker deflection, while weak and delayed, were shown to occur, when present, before 50 ms in the POm (Diamond et al., 1992), it is not clear what the authors mean by and consider as "Sensory Responses".

      Precise wording may help to clarify the message. For instance, in line #134: "Of cells from tactilely conditioned mice, 175 (50.4%) significantly responded to the air puff, as defined by having a firing rate significantly different from baseline within one second from air puff onset (Figure 3d, bottom)", the phrase "significantly responded to the air puff" could be written "significantly increased (or modified if some decreased) their firing rate within one second after the air puff onset (baseline: ...)". This will avoid any confusion with the sensory responses per se.
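      The "Sensory Responses" definition quoted from the Methods in point (1) can be sketched in a few lines, which makes the reviewer's concern concrete: any spike within 50 ms of stimulus onset is not counted. This is one possible reading of the quoted text, not the authors' code. The function names are hypothetical, and how the +/-50 ms exclusion is combined with the 1-second windows, as well as the strict "more than 6 seconds apart" rule, are assumptions.

```python
from bisect import bisect_left


def count_in_window(sorted_spikes, start, stop):
    """Count spikes with start <= t < stop in a sorted list of spike times (s)."""
    return bisect_left(sorted_spikes, stop) - bisect_left(sorted_spikes, start)


def isolated_stimuli(onsets, other_onsets, min_gap=6.0):
    """Keep onsets with no stimulus of the other type within min_gap seconds."""
    return [t for t in onsets if all(abs(t - o) > min_gap for o in other_onsets)]


def baseline_and_stimulus_counts(spike_times, onsets, window=1.0, ambiguous=0.05):
    """Per-trial (baseline, stimulus) spike counts around each stimulus onset.

    Assumed windows: baseline = [onset - window, onset - ambiguous),
    stimulus = [onset + ambiguous, onset + window).
    """
    spikes = sorted(spike_times)
    counts = []
    for t in onsets:
        baseline = count_in_window(spikes, t - window, t - ambiguous)
        stimulus = count_in_window(spikes, t + ambiguous, t + window)
        counts.append((baseline, stimulus))
    return counts


# Hypothetical data: one unit, two air puffs, one visual grating (times in s).
spikes = [9.5, 10.02, 10.3, 10.6, 20.2]
air_puffs = isolated_stimuli([10.0, 20.0], other_onsets=[24.0])
trial_counts = baseline_and_stimulus_counts(spikes, air_puffs)
print(air_puffs, trial_counts)
```

      In this toy example the spike at 10.02 s, 20 ms after air puff onset, falls inside the excluded window and does not contribute to the stimulus count, which is the kind of short-latency activity the comment on Diamond et al. (1992) refers to.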

      (2) To extend the previous concern, the latency of the modulation of the firing rate of the POm cells for each modality and each conditioning may be an issue. This latency, given in Figure S2, is rather long, i.e. particularly late latencies for the whisker system, which is completely in favor of non-sensory "responses" per se and the authors' hypothesis that sensory-, arousal-, and movement-evoked activity in POm are shaped by associative learning. Latency is a key point in this study.

      Therefore,

      - latencies should be given in the main text, and Figure S2 could be considered for a main figure, or at least panels c, d, and e could become part of Figure 3.

      - Figure S2b points out rather short latency responses to the air puff, at least in some cells, in addition to late ones. The manuscript would highly benefit from an analysis of both early and late latency components of the "responses" to air puffs and drifting gratings in both conditions. This analysis may definitely help to clarify the authors' message. Since the authors performed unit recordings, these data are accessible.

      - it would be highly instructive to examine the latency of the modulation of Pom cells' firing rate in parallel with the onset of each behavior, i.e. modification of pupil radius, whisking amplitude, lick rate (Figures 1e, g and 3a, b). Figure 1 does not provide the latency of the licks in conditioned mice.

      - the authors mention in the discussion low-latency responses, e.g., line #299: "In both tactilely and visually conditioned mice, movement could not explain the increased firing rate at air puff onset. These low-latency responses across conditioning groups is likely due in part to "true" sensory responses driven by S1 and SpVi."; line #306: "Like POm, LP displayed varied stimulus-evoked activity that was heavily dependent on conditioning. LP responded to the air puff robustly and with low latency, despite lacking direct somatosensory inputs."
      But which low-latency responses do the authors refer to? Again, this points out that a robust analysis of these latencies is missing in the manuscript but would be helpful to conclude.

      (3) Anatomical locations of recordings in the dorsal part of the thalamus. Line #122 "Our recordings covered most of the volume of POm but were clustered primarily in the anterior and medial portions of LP (Figure 2d-f). Cells that were within 50 µm of a region border were excluded from analysis."
      How did the authors distinguish the anterior boundary of the LP with the LD nucleus just more anterior to the LP, another higher-order nucleus, where whisker-responsive cells have been isolated (Bezdudnaya and Keller, 2008)?

      (4) The mention in the Methods about the approval by an ethics committee is missing.
      All surgical procedures (line #381), i.e., the implant, the craniotomy, and the perfusion, are performed under isoflurane. However, isoflurane induces narcosis only and not proper anesthesia. The mention of the use of analgesia is missing.

    4. Reviewer #3 (Public Review):

      Petty and Bruno ask whether activity in secondary thalamic nuclei depends on the behavioral relevance of stimulus modality. They recorded from POm and LP, but the weight of the paper is skewed toward POm. They use two cohorts of mice (N=11 and 12), recorded in both nuclei using multi-electrode arrays, while being trained to lick to either a tactile stimulus (air puff against whiskers, first cohort) or a visual stimulus (drifting grating, second cohort), and ignore the respective other. They find that both nuclei, while primarily responsive to their 'home' modality, are more responsive to the relevant modality (i.e. the modality predicting reward).

      Strengths:

      The paper asks an important and timely question and is very well executed. The behavioral method using a delayed lick index (excluding impulsive responses) is well worked out. Electrophysiology methods are state-of-the-art with information about spike quality in Figure S1. The main result is novel and important, convincingly conveying the point that encoding of secondary thalamic nuclei is flexible and clearly includes aspects of the behavioral relevance of a stimulus. The paper explores the mapping of responses within POm, pointing to a complex functional structure, something that has been reported/suggested in earlier studies.

      Weaknesses:

      Coding: It does not become clear to which aspect of the task POm/LP is responding. There is a motor-related response (whisking, licking, pupil), which, however, after regressing it out leaves a remaining response that the authors speculate could be sensory.

      Learning: The paper talks a lot about 'learning', although it is only indirectly addressed. The authors use two differently (over-)trained mice cohorts rather than studying e.g. a rule switch in one and the same mouse, which would allow us to directly assess whether it is the same neurons that undergo rule-dependent encoding.

      Mapping: The authors treat and interpret the two nuclei very much in the same vein, although there are clear differences. I would think these differences are mentioned in passing but could be discussed in more depth. Mapping using responses on electrode tracks is done in POm but not LP.

    1. eLife assessment

      This study presents convincing evidence of the role of an intestine-released neuropeptide, FLP-2, in the oxidative stress response of C. elegans, as well as for the neural circuit pathway that regulates its release in response to sensing reactive oxygen species (i.e., H2O2). These valuable results advance the understanding of gut-brain signaling and the neural circuit basis of behavioral responses to stress.

    2. Reviewer #1 (Public Review):

      Summary:

      The main goal of the paper was to identify signals that activate FLP-1 release from AIY neurons in response to H2O2, previously shown by the authors to be an important oxidative stress response in the worm.

      Strengths:

      This study builds upon the authors' previous work (Jia and Sieburth 2021) by further elucidating the gut-derived signaling mechanisms that coordinate the organism-wide antioxidant stress response in C. elegans.

      By detailing how environmental cues like oxidative stress are transduced into gut-derived peptidergic signals, this study represents a valuable advancement in understanding the integrated physiological responses governed by the gut-brain axis.

      This work provides valuable mechanistic insights into the gut-specific regulation of the FLP-2 peptide signal.

      Weaknesses:

      Although the authors identify intestinal FLP-2 as the endocrine signal important for regulating the secretion of the neuronal antioxidant neuropeptide, FLP-1, there is no effort made to identify how FLP-2 levels regulate FLP-1 secretion or identify whether this regulation is occurring directly through the AIY neuron or indirectly. This is brought up in the discussion, but identifying a target for FLP-2 in this pathway seems like a crucial missing piece of information in characterizing this pathway.

    3. Reviewer #2 (Public Review):

      Summary:

      The core findings demonstrate that the neuropeptide-like protein FLP-2, released from the intestine of C. elegans, is essential for activating the intestinal oxidative stress response. This process is mediated by endogenous hydrogen peroxide (H2O2), which is produced in the mitochondrial matrix by superoxide dismutases SOD-1 and SOD-3. H2O2 facilitates FLP-2 secretion through the activation of protein kinase C family member pkc-2 and the SNAP25 family member aex-4. The study further elucidates that FLP-2 signaling potentiates the release of the antioxidant FLP-1 neuropeptide from neurons, highlighting a bidirectional signaling mechanism between the intestine and the nervous system.

      Strengths:

      This study significantly advances our understanding of the intricate mechanisms underlying the role of the gut-brain axis in the oxidative stress response. By elucidating the role of FLP-2 and its regulation by H2O2, the study provides insights into the molecular basis of inter-tissue communication and antioxidant defense in C. elegans. These findings could have broader implications for understanding similar pathways in more complex organisms, potentially offering new targets for therapeutic intervention in diseases related to oxidative stress and aging.

      Weaknesses:

      (1) The experimental techniques employed in the study were somewhat simple and could benefit from the incorporation of more advanced methodologies.

      (2) The key receptors mediating the interaction between FLP-2 and AIY neurons, as well as the receptors in the gut that respond to FLP-1, are only weakly identified.

      (3) The study could be improved by incorporating a sensor for the direct measurement of hydrogen peroxide levels.

    1. eLife assessment

      This important study by Franziska Auer and colleagues examines cerebellar Purkinje cells' role in controlling posture in larval zebrafish using the innovative chemogenetic tool TRPV1/capsaicin. This work will interest neuroscientists studying motor control and cerebellar function. Overall, solid evidence is presented showing that disrupting Purkinje cell function impairs balance in the pitch axis and that this cell population encodes tilt direction. At the same time, some conclusions require more data or better statistical analysis.

    2. Reviewer #1 (Public Review):

      This study uses a variety of approaches to explore the role of the cerebellum, and in particular Purkinje cells (PCs), in the development of postural control in larval zebrafish. A chemogenetic approach is used to either ablate PCs or disrupt their normal activity and a powerful, high-throughput behavioural tracking system then enables quantitative assessment of swim kinematics. Using this strategy, convincing evidence is presented that PCs are required for normal postural control in the pitch axis. Calcium imaging further shows that PCs encode tilt direction. Evidence is also presented that suggests the role of the cerebellum changes over the course of early development, although this claim is rather less robust in the current version of the paper. Finally, the authors build on their prior work showing that both axial muscles and pectoral fins contribute to "climbs" and show evidence that suggests PCs are required for correct engagement of the fins during this behaviour. Overall, establishing a role for the cerebellum in postural control is not very surprising. However, a clear motivation of this study was to establish a robust experimental platform to investigate the changing role of cerebellar circuits in the development of postural control in the highly experimentally accessible zebrafish larvae, and in this regard, the authors have certainly succeeded.

      Overall, I consider this an excellent paper, with some room for improvement in aspects of presentation, discussion, and some aspects of the data analysis.

    3. Reviewer #2 (Public Review):

      Summary:

      Franziska Auer et al. investigate the role of cerebellar Purkinje cells in controlling posture in larval zebrafish using the chemogenetic tool TRPV1/capsaicin to bidirectionally manipulate (i.e., activate or ablate) these cells. This tool has been developed for zebrafish previously but has not been applied to Purkinje cells.

      High-throughput behavioral experiments are presented to monitor how body posture is affected by these perturbations. The analysis of postural control focuses on a specific subaspect of posture: the body tilt-angle relative to horizontal just before a swim bout is executed, quantified separately for pre-ascent and pre-dive bouts. They report a broad bimodal distribution of pre-ascent bout posture ranging from -20 to +40 degrees, while the pre-dive bout posture was more Gaussian, ranging between -40 and 0 degrees. The treatment effect is quantified as the change in the median of these distributions.

      Purkinje cell activation and ablation in 7 days post-fertilization (dpf) fish shifted the median of the ascending bout posture distributions to positive values. The authors hypothesize that the stochastic nature of the activation process might desynchronize Purkinje cell activity, thus abolishing Purkinje cells' role in postural control, similar to ablation. However, this does not explain why dive bout posture decreased upon activation but was unaffected by ablation.

      To test whether the role of Purkinje cells in postural control matures over development, the authors repeated the ablation experiments at 14 dpf. They state that "at 14 dpf, the effects of Purkinje cell lesions on posture were more widespread than at 7 dpf." However, this effect size is comparable to that observed at 7 dpf, suggesting no further maturation of the role of Purkinje cells in pre-ascending bout postural control. The median pre-dive bout posture decreased at 14 dpf, contrasting with no effect at 7 dpf, yet this change was comparable in effect size to the activation effect on Purkinje cells at 7 dpf. The current data breadth may not be sufficient to conclude that signatures of emerging cerebellar control of posture across early development were uncovered.

      The study's exploration of activating Purkinje cells in freely swimming fish using TRPV1/capsaicin is of special interest, but the practicability of this method is unclear from the current presentation. It would be beneficial to present the distribution of the percentage of activatable Purkinje cells across animals and time points to provide insight into the method's efficiency. Discussing this limitation and potential improvements would aid in evaluating the method, especially since the authors report that the activation experiments were labor-intensive, limiting repeat experiments. This may explain why the activation experiment at 7 dpf is the only data presented with cell activation, with other analyses performed using the cell ablation capabilities of the TRPV1/capsaicin method. Another data point at 14 dpf would significantly strengthen the conclusions.

      The authors analyze Purkinje cell-controlled fin-trunk coordination by examining ascending bout posture across different swim bout speeds. They make the important finding that pectoral fin movements contribute significant lift for median and fast swim bouts but not for slow ones, and that Purkinje cell ablation disrupts lift generation at all speeds.

      Finally, the authors examined whether Purkinje cell activity encodes postural tilt-angle by performing calcium imaging on 31 cells from 8 fish using their Tilt In Place Microscope (TIPM). They report that they could decode the tilt-angle from individual neurons with a highly tuned response, and also from neurons that were not obviously tuned when pooling them and analyzing the population response. However, due to the non-simultaneous recordings across animals, definitive conclusions about population-level encoding should be made cautiously; it might be better to suggest potential population encoding that needs confirmation with more targeted experiments involving simultaneous recordings.

      Strengths:

      - The study introduces a novel application of the chemogenetic tool TRPV1/capsaicin to study cerebellar function in zebrafish.

      - High-throughput behavioral experiments provide detailed analysis of postural control.

      - The further investigation of Purkinje cell-controlled fin-trunk coordination offers new insights into motor control mechanisms.

      - The use of calcium imaging to decode postural tilt-angle from Purkinje cell activity presents interesting preliminary results on neuronal population encoding.

      Weaknesses:

      - The term "disruption" for postural control effects may lead to misleading expectations.

      - The supporting data show only subtle median shifts in postural angle, raising questions about the significance of observed effects. Statistical methods that account for the hierarchical structure of the data might be required to support the conclusions.

      - The study's data breadth may not be sufficient to conclude emerging cerebellar postural control across early development.

      - The current presentation does not adequately detail the practicability and efficiency of the TRPV1/capsaicin method for activating Purkinje cells, and the labor-intensive nature of these experiments constrains the ability to replicate and validate the findings.

      - Non-simultaneous recordings in calcium imaging necessitate cautious interpretation of population-level encoding results.
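
      One family of methods that respects nested data, as raised in the weaknesses above, is a hierarchical bootstrap. The sketch below is only an illustration of that general idea and is not taken from the manuscript or the review: the function name `bootstrap_median_ci`, the assumption that bouts are nested within experimental units (for example fish or clutches), and the simple percentile interval are all hypothetical choices.

```python
import random
import statistics


def bootstrap_median_ci(groups, n_boot=1000, seed=0):
    """Approximate 95% percentile interval for the pooled median.

    `groups` is a list of lists: one list of bout postures (degrees)
    per experimental unit. Units are resampled first, then bouts
    within each resampled unit, so both levels contribute variance.
    """
    rng = random.Random(seed)
    medians = []
    for _ in range(n_boot):
        # Level 1: resample experimental units with replacement.
        sampled_units = rng.choices(groups, k=len(groups))
        # Level 2: resample bouts within each sampled unit.
        pooled = []
        for unit in sampled_units:
            pooled.extend(rng.choices(unit, k=len(unit)))
        medians.append(statistics.median(pooled))
    medians.sort()
    lo = medians[int(0.025 * n_boot)]
    hi = medians[min(n_boot - 1, int(0.975 * n_boot))]
    return lo, hi
```

      Comparing such intervals between control and treated groups would show whether a shift in the median remains distinguishable once variability between units is taken into account.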

    4. Reviewer #3 (Public Review):

      Summary:

      This paper uses a new chemogenetic tool to investigate the role of cerebellar Purkinje cells in postural control. Using a high-throughput behavioral assay, they show that activation or ablation of Purkinje cells affects various aspects of postural control in zebrafish larvae during spontaneous swimming and that the effects are more pronounced at later developmental time points, where the Purkinje cell number is much greater. Using a sophisticated imaging assay, they record Purkinje cell activity in response to the tilt of the fish and show that some Purkinje cells are tuned to tilt direction and that the direction can even be decoded from untuned neurons.

      Strengths:

      Overall the study is nice, using a range of tools to address a fundamental question about the role of the cerebellum in postural control in fish.

      Weaknesses:

      (1) The data in Figure 1 that establish the method seem to be based on a very small number of experiments and lack some statistical analysis.

      (2) The choice and presentation of the statistical and analysis methods used in Figures 2-5 could be improved.

    1. eLife assessment

      In this valuable study, Li et al. set out to understand the mechanisms of audiovisual temporal recalibration - the brain's ability to adjust to the latency differences that emerge due to different (distance-dependent) transduction latencies of auditory and visual signals - through psychophysical measurements and modelling. The analysis supports a role for causal inference in recalibration, though the evidence is incomplete.

    2. Reviewer #1 (Public Review):

      This study asks whether the phenomenon of crossmodal temporal recalibration, i.e. the adjustment of time perception by consistent temporal mismatches across the senses, can be explained by the concept of multisensory causal inference. In particular, the authors ask whether causal inference explains temporal recalibration better than a model assuming that crossmodal stimuli are always integrated, regardless of how discrepant they are.

      The study is motivated by previous work in the spatial domain, where it has been shown consistently across studies that the use of crossmodal spatial information is explained by the concept of multisensory causal inference. It is also motivated by the observation that the behavioral data showcasing temporal recalibration feature nonlinearities that, by their nature, cannot be explained by a fixed integration model (sometimes also called mandatory fusion).

      To probe this the authors implemented a sophisticated experiment that probed temporal recalibration in several sessions. They then fit the data using the two classes of candidate models and rely on model criteria to provide evidence for their conclusion. The study is sophisticated, conceptually and technically state-of-the-art, and theoretically grounded. The data clearly support the authors' conclusions.

      I find the conceptual advance somewhat limited. First, by design, the fixed integration model cannot explain data with a nonlinear dependency on multisensory discrepancy, as already explained in many studies on spatial multisensory perception. Hence, it is not surprising that the causal inference model better fits the data. Second, and again similar to studies on spatial paradigms, the causal inference model fails to predict the behavioral data for large discrepancies. The model predictions in Figure 5 show the (expected) vanishing recalibration for large delta, while the behavioral data do not decay to zero. Either the range of tested SOAs is too small to show that both the model and data converge to the same vanishing effect at large SOAs, or the model's formula is not the best for explaining the data. Again, the studies using spatial paradigms have the same problem, but in my view, this poses the most interesting question here.

      In my view there is nothing generally wrong with the study, it does extend the 'known' to another type of paradigm. However, it covers little new ground on the conceptual side.

      On that note, the small sample size of n=10 is likely not an issue, but still, it is on the very low end for this type of study.

    3. Reviewer #2 (Public Review):

      Summary:

      Li et al.'s goal is to understand the mechanisms of audiovisual temporal recalibration. This is an interesting challenge that the brain readily solves in order to compensate for real-world latency differences in the time of arrival of audio/visual signals. To do this they perform a 3-phase recalibration experiment on 9 observers that involves a temporal order judgment (TOJ) pretest and posttest (in which observers are required to judge whether an auditory and visual stimulus were coincident, auditory leading or visual leading) and a conditioning phase in which participants are exposed to a sequence of AV stimuli with a particular temporal disparity. Participants are required to monitor both streams of information for infrequent oddballs, before being tested again in the TOJ, although this time there are 3 conditioning trials for every 1 TOJ trial. Like many previous studies, they demonstrate that conditioning stimuli shift the point of subjective simultaneity (pss) in the direction of the exposure sequence.

      These shifts are modest - maxing out at around -50 ms for auditory leading sequences and slightly less than that for visual leading sequences. Similar effects are observed even for the longest offsets where it seems unlikely listeners would perceive the stimuli as synchronous (and therefore under a causal inference model you might intuitively expect no recalibration, and indeed simulations in Figure 5 seem to predict exactly that which isn't what most of their human observers did). Overall I think their data contribute evidence that a causal inference step is likely included within the process of recalibration.

      Strengths:

      The manuscript performs comprehensive testing over 9 days and 100s of trials and accompanies this with mathematical models to explain the data. The paper is reasonably clearly written and the data appear to support the conclusions.

      Weaknesses:

      While I believe the data contribute evidence that a causal inference step is likely included within the process of recalibration, this to my mind is not a mechanism but might be seen more as a logical checkpoint to determine whether whatever underlying neuronal mechanism actually instantiates the recalibration should be triggered.

      The authors' causal inference model strongly predicts that there should be no recalibration for stimuli at 0.7 s offset, yet only 3/9 participants appear to show this effect. They note that a significant difference in their design and that of others is the inclusion of longer lags, which are unlikely to originate from the same source, but don't offer any explanation for this key difference between their data and the predictions of a causal inference model.

      I'm also not completely convinced that the causal inference model isn't 'best' simply because it has sufficient free parameters to capture the noise in the data. The tested models do not (I think) have equivalent complexity - the causal inference model fits best, but has more parameters with which to fit the data. Moreover, while it fits 'best', is it a good model? Figure S6 is useful in this regard but is not completely clear - are the red dots the actual data or the causal inference prediction? This suggests that it does fit the data very well, but is this based on predicting held-out data, or is it just that by having more parameters it can better capture the noise? Similarly, S7 is a potentially useful figure but it's not clear what is data and what are model predictions (what are the differences between each row for each participant; are they two different models or pre-test post-test or data and model prediction?!).

      I'm not an expert on the implementation of such models but my reading of the supplemental methods is that the model is fit using all the data rather than fit and tested on held-out data. This seems problematic.
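
      For reference, the standard information criteria penalise free parameters as sketched below. The function names and the example numbers in the note that follows are hypothetical and are not values from the manuscript; such criteria are computed on the fitted data and do not replace a check on held-out data.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln(n) - 2 ln L (lower is better)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood
```

      With a hypothetical log-likelihood of -100 and 5 free parameters, `aic(-100.0, 5)` gives 210.0. A model with more parameters is preferred only if its log-likelihood improves by more than the added penalty.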

      I would have liked to have seen more individual participant data (which is currently in the supplemental materials, albeit in a not very clear manner as discussed above).

      The way that S3 is described in the text (line 141) makes it sound like everyone was in the same direction, however, it is clear that 2/9 listeners show the opposite pattern, and 2 have confidence intervals close to zero (albeit on the negative side).

    4. Reviewer #3 (Public Review):

      Summary:

      Li et al. describe an audiovisual temporal recalibration experiment in which participants perform baseline sessions of ternary order judgments about audiovisual stimulus pairs with various stimulus-onset asynchronies (SOAs). These are followed by adaptation at several adapting SOAs (each on a different day), followed by post-adaptation sessions to assess changes in psychometric functions. The key novelty is the formal specification and application/fit of a causal-inference model for the perception of relative timing, providing simulated predictions for the complete set of psychometric functions both pre and post-adaptation.

      Strengths:

      (1) Formal models are preferable to vague theoretical statements about a process, and prior to this work, certain accounts of temporal recalibration (specifically those that do not rely on a population code) had only qualitative theoretical statements to explain how/why the magnitude of recalibration changes non-linearly with the stimulus-onset asynchrony of the adaptor.

      (2) The experiment is appropriate, the methods are well described, and the average model prediction is a fairly good match to the average data (Figure 4). Conclusions may be overstated slightly, but seem to be essentially supported by the data and modelling.

      (3) The work should be impactful. There seems a good chance that this will become the go-to modelling framework for those exploring non-population-code accounts of temporal recalibration (or comparing them with population-code accounts).

      (4) A key issue for the generality of the model, specifically in terms of recalibration asymmetries reported by other authors that are inconsistent with those reported here, is properly acknowledged in the discussion.

      Weaknesses:

      (1) The evidence for the model comes in two forms. First, two trends in the data (non-linearity and asymmetry) are illustrated, and the model is shown to be capable of delivering patterns like these. Second, the model is compared, via AIC, to three other models. However, the main comparison models are clearly not going to fit the data very well, so the fact that the new model fits better does not seem all that compelling. I would suggest that the authors consider a comparison with the atheoretical model they use to first illustrate the data (in Figure 2). This model fits all sessions but with complete freedom to move the bias around (whereas the new model constrains the way bias changes via a principled account). The atheoretical model will obviously fit better, but will have many more free parameters, so a comparison via AIC/BIC or similar should be informative.

      (2) It does not appear that some key comparisons have been subjected to appropriate inferential statistical tests. Specifically, lines 196-207 - presumably this is the mean (and SD or SE) change in AIC between models across the group of 9 observers. So are these differences actually significant, for example via t-test?

      (3) The manuscript tends to gloss over the population-code account of temporal recalibration, which can already provide a quantitative account of how the magnitude of recalibration varies with adaptor SOA. This could be better acknowledged, and the features a population code may struggle with (asymmetry?) could be considered.

      (4) The engagement with relevant past literature seems a little thin. Firstly, papers that have applied causal inference modelling to judgments of relative timing are overlooked (see references below). There should be greater clarity regarding how the modelling here builds on or differs from these previous papers (most obviously in terms of additionally modelling the recalibration process, but other details may vary too). Secondly, there is no discussion of previous findings like that in Fujisaki et al.'s seminal work on recalibration, where the spatial overlap of the audio and visual events didn't seem to matter (although admittedly this was an N = 2 control experiment). This kind of finding would seem relevant to a causal inference account.

      References:

      Magnotti JF, Ma WJ and Beauchamp MS (2013) Causal inference of asynchronous audiovisual speech. Front. Psychol. 4:798. doi: 10.3389/fpsyg.2013.00798

      Sato, Y. (2021). Comparing Bayesian models for simultaneity judgement with different causal assumptions. J. Math. Psychol., 102, 102521.

      (5) As a minor point, the model relies on simulation, which may limit its take-up/application by others in the field.

      (6) There is little in the way of reassurance regarding the model's identifiability and recoverability. The authors might for example consider some parameter recovery simulations or similar.

      (7) I don't recall any statements about open science and the availability of code and data.
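
      The inferential test suggested in weakness (2) could, for example, be a paired t-test on per-observer AIC values. The sketch below is a minimal illustration under that assumption: the function name `paired_t` is hypothetical, and the inputs would be one AIC value per observer for each of the two models being compared.

```python
import math
import statistics


def paired_t(a, b):
    """Paired t statistic and degrees of freedom for two matched samples,
    e.g. per-observer AIC values for two competing models."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two matched samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    t_stat = mean_diff / (sd_diff / math.sqrt(n))
    return t_stat, n - 1
```

      The returned statistic would then be compared against a t distribution with n - 1 degrees of freedom (8 for a group of 9 observers) to obtain a p-value.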

    1. eLife assessment

      This valuable study provides evidence that during learning of a simple detection task, the change in the rate of spike bursts is a signal that is distinct from the change in firing rate, and suggests that the change in bursting is more correlated with learning than other measures of change in activity. However, the evidence for the claim that bursting contributes to learning and attention is currently incomplete, because the authors did not take into account the potentially differential effects of learning-related changes in movement on bursting compared to non-burst spike events, and there is no meaningful way to measure attention in their task. Also, the study used an artificial microstimulation as the stimulus, which limits the generalization of these results to normal sensory-motor learning.

    2. Reviewer #1 (Public Review):

      Summary:

      In this paper, the authors aimed to test existing theories on the role of bursting in learning and attention. They find evidence for both. It is not clear how these two can be reconciled, but this is one of the first studies to explicitly test recent theories of spike multiplexing in the brain. This will pave the way for future investigations, both experimental and theoretical.

      Strengths:

      (1) A key strength of this study is the fact that it aims to test existing theories of spike multiplexing, finding support for both attention-like and learning-like signals.

      (2) The task setup is of particular interest to brain-machine interfaces, and how such setups trigger learning and attention mechanisms.

      Weaknesses:

      (1) The fact that the teaching signal is an (artificial) stimulation of the primary sensory cortex makes it unclear how applicable these results are to a more general understanding of learning and attention in the brain.

      (2) It would have been useful to more directly compare the results obtained here with existing burst-dependent computational models of learning and attention. This is particularly important since there appears to be an interaction between learning and sharpening signals.

      (3) There are inherent limitations in our current ability to read out bursting and non-bursting signals. This is a brave first attempt, but at this point it is unclear how one can robustly read out this information from noisy data.

    3. Reviewer #2 (Public Review):

      Naud et al. investigate whether single spikes and bursts encode different information in behavior. To do this, they reanalyze juxtasomal recordings of deep-layer cortical neurons from behaving rats collected in two previous studies by Doron et al. Rats were trained (in a Go-NoGo design) to lick a spout for a water reward in response to electrical microstimulation of the primary somatosensory cortex, which rats quickly learn to do in a single day. Juxtasomal recordings near the site of microstimulation are then divided up into single spikes ("events") versus high-frequency bursts ("bursts"). Training results in the appearance of bursts, which do not seem to correlate with the rate of events, suggesting that bursts and events carry different information. While the fraction of bursts is elevated during Hit trials, errors appear to uniquely trigger additional bursts. The distribution of burst times appears to shift from long after the stimulus (early in training) to shortly after the stimulus (later in training). Bursts of layer 5 pyramidal neurons in particular are associated with apical tuft activity that could enhance plasticity. The observed increased bursting is therefore suggestive of a potential mechanism by which errors engage plasticity.
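
      The separation of a spike train into events and bursts described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes that an "event" is either an isolated spike or the first spike of a burst, that the burst fraction is the number of bursts divided by the number of events, and that a 16 ms inter-spike-interval threshold defines a burst. The threshold and the function names `count_events_and_bursts` and `spike_codes` are hypothetical choices made for this example.

```python
from typing import Dict, List, Tuple


def count_events_and_bursts(
    spike_times: List[float], isi_threshold: float = 0.016
) -> Tuple[int, int]:
    """Count events and bursts in a sorted spike train (times in seconds).

    Assumption for this sketch: a spike that follows the previous spike
    by at most `isi_threshold` continues the current event and makes it
    a burst; any other spike starts a new event.
    """
    n_events = 0
    n_bursts = 0
    in_burst = False
    prev = None
    for t in spike_times:
        if prev is not None and (t - prev) <= isi_threshold:
            # Short interval: the current event is (or becomes) a burst.
            if not in_burst:
                n_bursts += 1
                in_burst = True
        else:
            # Long interval (or first spike): a new event begins.
            n_events += 1
            in_burst = False
        prev = t
    return n_events, n_bursts


def spike_codes(
    spike_times: List[float], duration: float, isi_threshold: float = 0.016
) -> Dict[str, float]:
    """Return firing rate, event rate, and burst fraction for one window."""
    n_events, n_bursts = count_events_and_bursts(spike_times, isi_threshold)
    return {
        "firing_rate": len(spike_times) / duration,  # all spikes per second
        "event_rate": n_events / duration,           # events per second
        # The burst fraction is undefined when there are no events.
        "burst_fraction": n_bursts / n_events if n_events else float("nan"),
    }


# Illustrative spike train: a triplet, an isolated spike, and a doublet.
example = [0.100, 0.105, 0.110, 0.500, 0.900, 0.908]
codes = spike_codes(example, duration=1.0)
```

      In this toy train, six spikes form three events, two of which are bursts. Because the burst fraction is a ratio of two counts, it can change while the firing rate stays fixed, which is the sense in which the two quantities may carry different information.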

      This paper has substantial strengths: the experiments appear to be well performed, the dataset is substantial, and the questions and phenomena are interesting.

      The exclusion of fast-spike (inhibitory) data, which the experiments seem to have generated, is a weakness as these data could have provided an important control. If the bursts here reflect apical dendrite activity, the same phenomena might be absent in inhibitory cells as they lack apical tufts.

      Another weakness is the need to better control movement, which could be an alternative explanation to the top-down modulation of apicals that the authors suspect. For example, the bursts on error trials could be due to the animals moving more when an error occurs. Layer 5 of the somatosensory cortex has increased activity during whisking or body movements. If the animal fidgets out of frustration that the reward has not occurred, or whisks more, bursts are highly likely due to less exotic, purely bottom-up inputs.

    4. Reviewer #3 (Public Review):

      Summary:

      The burst fraction neural code has conceptual interest but has been little examined in vivo. This study examines and compares the burst fraction, the standard firing rate (firing rate) code, and the related event fraction (event rate) code using published data from an experiment where rats learned to lick after detecting electrical microstimulation in the somatosensory (barrel) cortex. Analyzing single-neuron spiking responses, the study reports that the burst fraction identifies more and different neurons showing the effects of training than the firing rate. The study further claims that the burst fraction (1) most promptly responded to false-negative detection errors, (2) during further training of trained animals (from 80% to 90% accuracy, over five days), correlates with behavioral accuracy, and (3) by shifting earlier to align with the (relatively constant) event rate modulation, leads to the observed sharpened firing rate response during this further training. The study concludes that 'a fine-grained separation of spike timing patterns [into burst fraction, firing rate, and event rate] reveals two signals,' an error signal and a sharpening signal.

      Strengths:

      The burst fraction is shown to discern more (and somewhat different) cells showing significant responses in trained animals and also to reveal a larger absolute difference in the fraction of responsive cells between naïve and trained animals. The Poisson model analysis particularly convincingly shows that the firing rate alone cannot explain either the spiking pattern or the prevalence of burst fraction-ON cells, thereby furnishing strong evidence that the burst fraction conveys independent information from the firing rate. The demonstration of error signals on miss trials in all three neural codes (burst fraction, firing rate, event rate) is interesting. It is also interesting to see that neural responses broadly shift earlier for animals even during further training in an already 'expert' stage and that the burst fraction correlates with further accuracy increases.

      Weaknesses:

      The evidence is inadequate for the burst fraction as responding more promptly to missed trials.

      This key claim seems to rest solely on the timing of the first bins in Figure 3B showing statistically significant differences. This reasoning implicitly draws inferences from the lack of statistical differences, which cannot support a positive claim in general. Specifically, here, the burst fraction is calculated with a division, which can magnify small differences and impact the power of statistical tests. If I trace back from the first bin showing significant differences to the first bin the signal starts rising, the timing seems to be comparable for all three neural codes (~1.6 s).

      Pertinently, what is the statistical test used in Figure 3B? A parametric test may be inappropriate for the burst fraction, a ratio that likely does not fulfill the normality assumption. An inappropriate test would compound the problem of concluding from the lack of (early) significant differences.

      The evidence that burst fraction is responsible for sharpening is opaque due to insufficient statistical reporting. Specifically, it seems there is a correlation between firing rate and accuracy that is reported as non-significant.

      Changes in the reaction times (or other movement parameters) over training may confound the correlation of the burst fraction to the accuracy and firing rate sharpening during further training. Lack of control for changes in movement over training weakens the results.

      The claim of independence of burst fraction and event rate/firing rate information is too strong. The authors show a significant negative correlation between burst fraction and firing rate (2D).

      The claim that there is no 'functional reorganization' beyond day two is too strong. Although this claim is not a core one to the study, it derives from an absence of statistical significance, especially problematic here as the effect sizes are large. For example, the Spearman correlation is 0.67/0.87 for the analyses with burst fraction. With only five data points, even strong effects may not achieve statistical significance, making negative conclusions problematic. Further, how are the p-values calculated (if using a parametric test, are the assumptions met), and why should these analyses use Spearman's correlation when analogous analyses in Figure 4E, F use Pearson's r?
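
      The point about five data points can be made concrete with an exact permutation calculation. The sketch below assumes untied ranks and a two-sided exact null distribution obtained by enumerating all orderings; it does not reproduce the authors' analysis, and the function names `spearman_rho` and `exact_two_sided_p` are hypothetical.

```python
from itertools import permutations
from typing import Sequence


def spearman_rho(x_ranks: Sequence[int], y_ranks: Sequence[int]) -> float:
    """Spearman's rho for untied ranks via the sum of squared rank differences."""
    n = len(x_ranks)
    d2 = sum((a - b) ** 2 for a, b in zip(x_ranks, y_ranks))
    return 1 - 6 * d2 / (n * (n * n - 1))


def exact_two_sided_p(rho_obs: float, n: int) -> float:
    """Exact two-sided p-value: share of all n! orderings with |rho| >= |rho_obs|."""
    base = tuple(range(1, n + 1))
    rhos = [spearman_rho(base, perm) for perm in permutations(base)]
    # A small tolerance guards against floating-point rounding at the boundary.
    hits = sum(1 for r in rhos if abs(r) >= abs(rho_obs) - 1e-12)
    return hits / len(rhos)


p_perfect = exact_two_sided_p(1.0, 5)  # 2 of 120 orderings, about 0.017
p_strong = exact_two_sided_p(0.9, 5)   # 10 of 120 orderings, about 0.083
```

      Under these assumptions, only a perfectly monotonic ordering of five untied points reaches p < 0.05 (two-sided); a correlation of 0.9 does not. A non-significant result with n = 5 is therefore weak evidence for the absence of an effect.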

      Does the burst fraction correlate with accuracy in cross-training?

      If the burst fraction correlates with accuracy, it should be expected to do so also when the animals progress from the naïve to the trained stage. Moreover, the correlation in Figure 4E can benefit from strengthening as it is now supported by only five points, is driven by only three 'clusters,' and only represents a narrow range of accuracies. If the data is available for this analysis, it should be done to test and potentially strengthen the main claim of the study.

      The text and figures contain numerous ambiguities that need to be clarified. These do not include obvious typos, only elements that affect conceptual understanding.

      - Some key terms in the main claims are never defined. For example, in the title, it is unclear what 'fast' and 'transients' mean. The abstract uses, but the main text never defines, 'demultiplexing,' 'a *conjunctive* burst code,' 'sparse and succinct [sic],' and 'correlated more *globally*.'

      - Some paper components are un(der)explained and, sometimes, apparently internally inconsistent. For example, in Figure 1I, the fraction of firing rate-ON cells does not look like the 6% shown in Figure 1J, left. In Figure 2E-G, what is the total cell number, 279, in Figure 2G legend, why is it different from the 153 total cells in Figure 2E legend, and what is the 'n = 5' within Figure 2G? All n numbers should be explained in general; more examples include the 245 in Figure 3C and the 49 in Figure 3B. In Figure 3C, what is the top horizontal bar (I assume significant differences)? About catch trials, the Figure 3D legend says rewards are given on licks, but the text says licking was not rewarded; which is the case? Figure 4B legend says 'firing rate (left), burst fraction (middle) and event rate (right),' but the plot colors imply a different order.

      - The abstract states, 'The alignment of bursting and event rate modulation [...] was strongly associated [sic] behavioral accuracy.' It seems to me it is not the alignment of burst fraction and event rate but rather burst fraction per se that correlates with behavioral accuracy (Figure 4E right). At least, the latter correlation is the only one tested.

    1. eLife assessment

      This important study advances our understanding of how FGF13 variants confer seizure susceptibility. By acting in a set of inhibitory interneurons, FGF13 regulates synaptic transmission and excitability. The data presented here are convincing and combine cell type-specific knockouts and electrophysiology, complemented by histology/RNA studies. Collectively, this research will be of interest to a wide audience, particularly those involved in the study of epilepsy, inhibitory neurons, and ion channels.

    2. Reviewer #1 (Public Review):

      Summary:

      A subset of fibroblast growth factor (FGF) proteins (FGF11-FGF14; often referred to as fibroblast growth factor homologous factors because they are not thought to be secreted and do not seem to act as growth factors) have been implicated in modulating neuronal excitability; however, the exact mechanisms are unclear. In part, this is because it is unclear how different FGF isoforms alter ion channel activity in different neuronal populations. In this study, the authors explore the role of FGF13 in epilepsy using a variety of FGF13 knock-out mouse models, including several targeted cell-type specific conditional knockout mouse lines. The study is intriguing as it indicates that FGF13 plays an especially important role in inhibitory neurons. Furthermore, although FGF13 has been studied as a regulator of neuronal voltage-gated sodium channels, the authors present data indicating that FGF13 knockout in inhibitory neurons induces seizures not by altering sodium current properties but by reducing voltage-gated potassium currents in inhibitory neurons. While intriguing, the data are incomplete in several aspects and thus the mechanisms by which various FGF13 variants induce Developmental and Epileptic Encephalopathies are not resolved by the data presented.

      Strengths:

      A major strength is the array of techniques used to assess the mice and the electrical activity of the neurons.

      The multiple mouse knock-out models utilized are a strength, clearly demonstrating that FGF13 expression in inhibitory neurons, and possibly specific sub-populations of inhibitory neurons, is critically important.

      The data on the increased sensitivity to febrile seizures in KO mice are very nice and provide clear evidence for regulation of excitability in inhibitory neurons by FGF13.

      The Gad2Fgf13-KO mice indicate that several Fgf13 splice variants may be expressed in inhibitory neurons and suggest that the Fgf13-VY splice variants may have previously unrecognized specific roles in regulating neuronal excitability.

      The data on males and females from the various KO mouse lines indicate a clear gene dosage effect for this X-linked gene.

      The unbiased metabolomic analysis supports the assertion that Fgf13 expression in inhibitory neurons is important in regulating seizure susceptibility.

      Weaknesses:

      The knockout approach can be powerful but also has distinct limitations. Multiple missense mutations in FGF13-S have been identified. The knockout models employed here are not appropriate for understanding how these missense variants lead to altered neuronal excitability. While the data show that complete loss of Fgf13 from excitatory forebrain neurons is not sufficient to induce seizure susceptibility, it does not rule out that specific variants (e.g., R11C) might alter the excitability of forebrain neurons. The missense variants may alter excitatory and/or inhibitory neuron excitability in distinct ways from a full FGF13 knockout.

      The electrophysiological experiments are intriguing but not comprehensive enough to support all of the conclusions regarding how FGF13 modulates neuronal excitability.

      Another concern is the use of different ages of neurons for different experiments. For example, sodium currents in Figures 2 and 5 (and Supplemental Figures 2 and 7) are recorded from cultured neurons, which may have very different properties (including changes in sodium channel complexes) from neurons in vivo that drive the development of seizure activity.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors address three primary questions:

      (1) how FGF13 variants confer seizure susceptibility,
      (2) the specific cell types involved, and
      (3) the underlying mechanisms, particularly regarding Nav dysfunction.

      They use different Cre drivers to generate cell type-specific knockouts (KOs). First, using Nestin-Cre to create a whole-brain Fgf13 KO, they observed spontaneous seizures and premature death. While KO of Fgf13 in excitatory neurons does not lead to spontaneous seizures, KO in inhibitory neurons recapitulates the seizures and premature death observed in the Nestin-Cre KO. They further narrow down the critical cell type to MGE-derived interneurons (INs), demonstrating that MGE-neuron-specific KO partially reproduces the observed phenotypes. "All interneuron" KOs exhibit deficits in synaptic transmission and interneuron excitability, not seen in excitatory neuron-specific KOs. Finally, they rescue the defects in the interneuron-specific KO by expressing specific Fgf13 isoforms. This is an elegant and important study adding to our knowledge of mechanisms that contribute to seizures.

      Strengths

      • The study provides much-needed cell type-specific KO models.
      • The authors use appropriate Cre lines and characterize the phenotypes of the different KOs.
      • The metabolomic analysis complements the rest of the data effectively.
      • The study confirms and extends previous research using improved approaches (KO lines vs. in vitro KD or antibody infusion).
      • The methods and analyses are robust and well-executed.

      Weaknesses

      • One weakness lies in the use of the Nkx2.1 line (instead of Nkx2.1CreER) in the paper. As a result, some answers to key questions are incomplete. For instance, it remains unclear whether the observed effects are due to Chandelier cells or NGFCs, potentially both MGE and CGE derived, explaining why Nkx2.1 alone does not fully replicate the overall inhibitory KO. Using Nkx2.1CreER could have helped address the cell specificity. With the Nkx2.1 line used in the paper, the answer is partial.

      • While the mechanism behind the reduced inhibitory drive in the IN-specific KO is suggested to be presynaptic, the chosen method does not allow them to exactly identify the mechanisms (spontaneous vs mEPSC/mIPSC), and whether it is a loss of inhibitory synapses (potentially axo-axonic) or release probability.

      • Some supporting data (e.g. Supplemental Figure 7 and 8) appear to come from only one (or two) WT and one (or two) KO mice. Supplementary data, like main data, should come from at least three mice in total to be considered complete/solid (even if the statistical analysis is done with cells).

      General Assessment

      The general conclusions of this paper are supported by data. As it is, the claim that "these results enhance our understanding of the molecular mechanisms that drive the pathogenesis of Fgf13-related seizures" is partially supported. A more cautious term may be more appropriate, as the study shows the mechanism is not Nav-mediated and suggests alternative mechanisms without unambiguously identifying them. The conclusion that the findings "expand our understanding of FGF13 functions in different neuron subsets" is supported, although somewhat overstated, as the work is not conclusive about the exact neuron subtypes. However, it does indeed show differential functions for specific neuronal classes, which is a significant result.

      Impact and Utility

      This paper is undoubtedly valuable. Understanding that excitatory neurons are not the primary contributors to the observed phenotypes is crucial. The finding that the effects are not MGE-unique is also important. This work provides a solid foundation for further research and will be a useful resource for future studies.

    4. Reviewer #3 (Public Review):

      Summary:

      The authors aimed to determine the mechanism by which seizures emerge in Developmental and Epileptic Encephalopathies caused by variants in the gene FGF13. Loss of FGF13 in excitatory neurons had no effect on seizure phenotype as compared to the loss of FGF13 in GABAergic interneurons, which in contrast caused a dramatic proseizure phenotype and early death in these animals. They were able to show that Fgf13 ablation and consequent loss of FGF13-S and FGF13-VY reduced overall inhibitory input from Fgf13-expressing interneurons onto hippocampal pyramidal neurons. This was shown to occur not via disruption to voltage-gated sodium channels but rather by reducing potassium currents and action potential repolarisation in these interneurons.

      Strengths:

      The authors employed multiple well-validated, novel mouse lines with FGF13 knocked out in specific cell types including all neurons, all excitatory cells, all GABAergic interneurons, or a subset of MGE-derived interneurons, including axo-axonic chandelier cells. The phenotypes of each of these four mouse lines were carefully characterised to reveal clear differences, the most fundamental being that interneuron-targeted deletion of FGF13 led to perinatal mortality associated with extensive seizures and impaired the hippocampal inhibitory/excitatory balance, while deletion of FGF13 in excitatory neurons caused no detectable seizures and no survival deficits.

      The authors made excellent use of western blotting and in situ hybridisation of the different FGF13 isoforms to determine which isoforms are expressed in which cell types, with FGF13-S predominantly in excitatory neurons and FGF13-VY and FGF13-V predominantly in GABAergic neurons.

      The authors performed a highly detailed electrophysiological analysis of excitatory neurons and GABAergic interneurons with FGF13 deficits using whole-cell patch clamp. This enabled them to show that FGF13 removal did not affect voltage-gated sodium channels in interneurons, but rather reduced the action of potassium channels, with the resultant effect of making it more likely that interneurons enter depolarisation block. These findings were strengthened by the demonstration that viral re-expression of different Fgf13 splice isoforms could partially rescue deficits in interneuron action potential output and restore K+ channel current size.

      Additionally, the discussion was nuanced and demonstrated how the current findings resolved previous apparent contradictions in the field involving the function of FGF13.

      These findings will have a significant impact on our understanding of how FGF13 causes seizures and death in DEEs, and the action of different FGF13 isoforms within different neuronal cell types, particularly GABAergic interneurons.

    1. eLife assessment

      This study presented a valuable inventory in scoring a neuropsychological test, ROCFT. The level of evidence is compelling. The authors constructed large samples from multi-center international researchers and tested the model using internet data with excellent performance. Their deep learning method could potentially apply to neuropsychological tests as well as other related fields.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors aimed to develop and validate an automated, deep learning-based system for scoring the Rey-Osterrieth Complex Figure Test (ROCF), a widely used tool in neuropsychology for assessing memory deficits. Their goal was to overcome the limitations of manual scoring, such as subjectivity and time consumption, by creating a model that provides automatic, accurate, objective, and efficient assessments of memory deterioration in individuals with various neurological and psychiatric conditions.

      Strengths:

      Comprehensive Data Collection:
      The authors collected over 20,000 hand-drawn ROCF images from a wide demographic and geographic range, ensuring a robust and diverse dataset. This extensive data collection is critical for training a generalizable and effective deep learning model.

      Advanced Deep Learning Approach:
      Utilizing a multi-head convolutional neural network to automate ROCF scoring represents a sophisticated application of current AI technologies. This approach allows for detailed analysis of individual figure elements, potentially increasing the accuracy and reliability of assessments.

      Validation and Performance Assessment:
      The model's performance was rigorously evaluated against crowdsourced human intelligence and professional clinician scores, demonstrating its ability to outperform both groups. The inclusion of an independent prospective validation study further strengthens the credibility of the results.

      Robustness Analysis Efficacy:
      The model underwent a thorough robustness analysis, testing its adaptability to variations in rotation, perspective, brightness, and contrast. Such meticulous examination ensures the model's consistent performance across different clinical imaging scenarios, significantly bolstering its utility for real-world applications.

      Weaknesses:

      Insufficient Network Analysis for Explainability:
      The paper does not sufficiently delve into network analysis to determine whether the model's predictions are based on accurately identifying and matching the 18 items of the ROCF or if they rely on global, item-irrelevant features. This gap in analysis limits our understanding of the model's decision-making process and its clinical relevance.

      Generative Model Consideration:
      The critique suggests exploring generative models to model the joint distribution of images and scores, which could offer deeper insights into the relationship between scores and specific visual-spatial disabilities. The absence of this consideration in the study is seen as a missed opportunity to enhance the model's explainability and clinical utility.

      Appraisal and discussion:
      By leveraging a comprehensive dataset and employing advanced deep learning techniques, the authors demonstrated the model's ability to outperform both crowdsourced raters and professional clinicians in scoring the ROCF. This achievement represents a significant step forward in automating neuropsychological assessments, potentially revolutionizing how memory deficits are evaluated in clinical settings. Furthermore, the application of deep learning to clinical neuropsychology opens avenues for future research, including the potential automation of other neuropsychological tests and the integration of AI tools into clinical practice. The success of this project may encourage further exploration into how AI can be leveraged to improve diagnostic accuracy and efficiency in healthcare.

      However, the critique regarding the lack of detailed analysis across different patient demographics, the inadequacy of network explainability, and concerns about the selection of median crowdsourced scores as ground truth raises questions about the completeness of their objectives. These aspects suggest that while the aims were achieved to a considerable extent, there are areas of improvement that could make the results more robust and the conclusions stronger.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors aimed to develop and validate a machine-learning-driven neural network capable of automatic scoring of the Rey-Osterrieth Complex Figure. They aimed to further assess the robustness of the model to various parameters such as tilt and perspective shift in real drawings. The authors leveraged the use of a huge sample of lay workers in scoring figures and also a large sample of trained clinicians to score a subsample of figures. Overall, the authors found their model to have exceptional accuracy and perform similarly to crowdsourced workers and clinicians with, in some cases, less degree of error/score dispersion than clinicians.

      Strengths:

      The authors used a very large dataset, including a large number of Rey-Osterrieth Complex Figures, a huge crowdsourced human worker sample, and a large clinician sample.

      The authors deeply describe their model in relatively accessible terms.

      The writing style of the paper is accessible, scientific, and thorough.

      Pre-registration of the prospectively collected new data was acceptable.

      Weaknesses:

      There is no detail on how the final scoring app can be accessed and whether it is medical device-regulated.

      There is no discussion of the difference in sample sizes between the pre-registration of the prospective study and the results (e.g., aimed for 500 neurological patients but reported data from 288).

      Details in pre-registration and paper regarding samples obtained in the prospective study were lacking.

      Demographics for the assessment of the representation of healthy and non-healthy participants were not present.

      The authors achieved their aims and their results and conclusions are supported by strong methods and analyses. The resulting app produced in this work, if suitable for clinical practice, will have impact in automated scoring, which many clinicians will be exceptionally happy with.

    4. Reviewer #3 (Public Review):

      Summary:

      This study presented a valuable inventory for scoring a neuropsychological test, ROCFT, by constructing an artificial intelligence model.

      Strengths:

      They constructed huge samples collected among multi-center international researchers and tested the model precisely using internet data.
      The model scored the test with excellent ability, surpassing even experts. The product can run as an application on a tablet, which helps clinicians and patients.
      Their method of building the model of deep learning and testing will apply to tests in all fields, not just the psychological field.

      Weaknesses:

      Considerable effort and cost were required to build a model for only a single existing neuropsychological test.

    1. eLife assessment

      This work identifies the molecular function of an orphan human transporter, SLC35G1, providing convincing but somewhat incomplete evidence that this protein is involved in intestinal citrate absorption. This work provides important insight into transporter function and human physiology.

    1. eLife assessment

      This is a valuable study on the diffusion rates of drug molecules in human-derived cells, highlighting that their diffusion behavior depends on their charged state. It proposes that blocking drug protonation enhances diffusion and fractional recovery, suggesting improved intracellular availability of weakly basic drugs. The correlation between pKa and intracellular diffusion is solid and well-supported, but the study would benefit from a more rigorous statistical treatment and a balanced comparison across different types of compounds. Despite these limitations, the findings are significant for drug design and understanding the biophysical behavior of small molecules in cells.

    1. eLife assessment

      This useful study draws on published single-cell and spatial transcriptomic data of colon cancer liver metastasis to clarify the pro- and anti-tumorigenic properties of NK cells. The authors discover increased GZMK+ resting NK cells in the tumor tissue and reduced abundance of KIR2DL4+ activated NK cells. However, the evidence is currently incomplete, as the models used to validate the hypothesis and claims are inadequate and lack necessary controls.