26,869 Matching Annotations
  1. Mar 2024
    1. Reviewer #2 (Public Review):

      Summary:

      This paper examines the recruitment of the inflammasome-seeding pattern recognition receptor NLRP3 to the Golgi. Previously, electrostatic interactions between the polybasic region of NLRP3 and negatively charged lipids were implicated in membrane association. The current study reports that reversible S-acylation of the conserved Cys-130 residue, upstream hydrophobic residues, and the polybasic region act together to promote Golgi localization of NLRP3, although additional parts of the protein are needed for full Golgi localization. Treatment with the bacterial ionophore nigericin inhibits membrane traffic and prevents Golgi-associated thioesterases from removing the acyl chain, causing NLRP3 to become immobilized at the Golgi. This mechanism is put forth as an explanation for how NLRP3 is activated in response to nigericin.

      Strengths:

      The experiments are generally well presented. It seems likely that Cys-130 does indeed play a previously unappreciated role in the membrane association of NLRP3.

      Weaknesses:

      The interpretations about the effects of nigericin are less convincing. Specific comments follow.

      (1) The experiments of Figure 4 bring into question whether Cys-130 is S-acylated. For Cys-130, S-acylation was seen only upon expression of a severely truncated piece of the protein in conjunction with overexpression of ZDHHC3. How do the authors reconcile this result with the rest of the story?

      (2) Nigericin seems to cause fragmentation and vesiculation of the Golgi. That effect complicates the interpretations. For example, the FRAP experiment of Figure 5 is problematic because the authors neglected to show that the FRAP recovery kinetics of non-acylated resident Golgi proteins are unaffected by nigericin. Similarly, the colocalization analysis in Figure 6 is less than persuasive when considering that nigericin significantly alters Golgi structure and could indirectly affect colocalization.

    1. Reviewer #1 (Public Review):

      Summary

      In this work, Mouelhi et al. investigated how the nucleus responds to long-term confinement. They find that short-term confinement does not affect nuclear volume, whereas long-term confinement leads to a decrease in volume. The authors propose that this decrease occurs after mitosis and relies on cPLA2 and myosin contractility.

      Strengths

      The ability to accurately control cell confinement allows authors to determine its effects on cellular function with high resolution. This provides a good addition to the existing collection of tools used for cellular micromanipulation. The results provided are relevant and timely and could help understand how cancer cells adapt to conditions of confinement.

      Weaknesses

      I have a few concerns which I believe should be addressed:

      (1) It is unclear whether the authors took into consideration the contribution of nuclear blebs for nuclear volume measurements. This would be particularly relevant in situations of very strong confinement. Blebs were previously shown to affect volume (Mistriotis et al., JCB 2019). One could argue that the decreased nuclear volume was due to the increased blebbing observed in very strong confinements.

      (2) From their experimental setup, it is unclear whether the reduced nuclear volume observed after confined cell division arises from a geometrical constraint or is due to an intrinsic nuclear feature. One could argue that cells exiting mitosis under confinement have clustered chromosomes and, therefore, will have decreased volume. This would imply that the nucleus is not "reset" but rather that a geometrical constraint is forcing nuclei to be smaller. One way to test this would be to follow individual cells under confinement, let them enter mitosis, and then release the confinement. If, under these conditions, the daughter nuclei are smaller, then it supports their model. If daughter nuclei recover to their initial value, then it's simply due to a geometrical constraint that forces the clustering of chromosomes and the reassembly of the NE in a confined space.

      (3) The authors claim that the nucleus adapts to confinement based on evidence that the nucleus no longer shrinks in the second division following the first division. I would argue no further decrease is possible because the DNA is already compacted in the smallest possible volume. If indeed nuclei are in a new homeostatic state as the authors claim, then one would expect nuclei to remain smaller even after confinement is removed. This analysis is missing.

      (4) Also, if the authors want to claim that this is a mechanism used for cancer cells to adapt to confined situations as the title says, they need to show that normal, near-diploid cells do not behave in the same way. This analysis is missing.

      (5) Authors state that "Loss of nuclear blebs is clearly linked to mitosis, suggesting that nuclear volume and nuclear envelope tension are tightly coupled, and supports the hypothesis that mitosis is a key regulator of nuclear envelope tension". I have a few issues with the way this sentence is written. Firstly, one could say that all nuclear structures (and not only blebs) are lost during mitosis because the nucleus disassembles. Hence, the new homeostatic state could be determined by envelope reassembly after mitosis and not mitosis itself. Secondly, I don't understand why the loss of nuclear blebs suggests that volume and tension are tightly coupled. Thirdly, how can mitosis be a key regulator of nuclear envelope tension when the nucleus is disassembled during the process? These require clarification.

      (6) The authors claim that, unlike previous studies (Lomakin et al), this work shows a "gradual nuclear adaptation". From their results, this is difficult to conclude simply because they do not analyse cPLA2 levels. This is solely based on indirect evidence obtained from cPLA2 inhibition. A gradual adaptation would mean that based on the level of confinement we would expect to have increasingly higher levels of cPLA2 (and therefore nuclear tension).

      (7) The authors should refrain from saying that the mechanism behind DNA repair is coupled to the nuclear adaptation they show. There are several points regarding this statement. Firstly, increased DNA damage could be due to nuclear ruptures imposed by confinement at 2h. In fact, the authors show leakage of NLS from the nucleus after confinement (Figure S3A). Secondly, the decrease in DNA damage at 24h could be because these nuclei did not rupture. How can they ensure that cells with low DNA damage at 24h had increased DNA damage at 2h? Finally, one needs to confirm if the nuclei they are analysing at 24h did undergo a round of cell division previously. From the evidence provided, the authors cannot conclude that DNA damage regulation is occurring in confined cells. Moreover, cell cycle arrest is a known effect of DNA damage. Cells with high damage at 2h most likely are arrested or will present with increased mitotic errors (which the authors exclude from their analyses).

    2. Reviewer #2 (Public Review):

      Summary:

      Extensive previous research has shown that cell confinement, e.g., vertical compression of cells to a height smaller than the height of the unconfined cells, results in the unfolding of nuclear membrane invaginations, calcium- and membrane tension-mediated recruitment of cPLA2 to the nuclear membrane (which triggers increased cortical myosin accumulation and activity, among other effects), nuclear blebbing, and DNA damage. However, the long-term effects of confinement, and how cells adapt to such confined conditions, have remained largely unexplored.

      In this work, the authors use custom-built cell confinement devices that enable precise control of confinement for prolonged periods of time (up to several days), along with live cell and fixed cell imaging to compare short-term (2 hours) and long-term (24+ hours) effects of confinement on nuclear structure. The authors report that while vertical confinement results in a short-term increase in nuclear cross-sectional area, associated with an increase in nuclear surface area due to unfolding of nuclear envelope invaginations while maintaining nuclear volume, long-term confinement results in a decrease in nuclear volume, reduced cross-sectional area, and re-appearance of nuclear envelope invaginations. Using time-lapse imaging, the authors demonstrate that these effects are associated with a reduction in nuclear volume upon completion of the first mitosis under confinement. Pharmacological inhibition experiments indicate a requirement of cPLA2, calcium signaling, and actomyosin contractility in this process. Although it is not surprising that nuclear blebs disappear following mitosis, as the nuclear envelope breaks down at the onset of mitosis and subsequently reforms as the chromatin decondenses, the observed change in nuclear volume upon prolonged confinement is intriguing. Notably, the nuclear adaptation following prolonged confinement was also associated with a reduction in DNA damage when comparing cells at 2h and 24h of confinements, measured by the presence of gamma-H2AX foci in the nucleus. By fitting their experimental data of nuclear surface area measurements, the authors arrive at the conclusion that cells have an intrinsic nuclear envelope tension set-point and that completing mitosis enables cells to reset nuclear envelope tension to this set-point.

      Strengths:

      The use of an agarose confinement system with precise control over vertical confinement enables the authors to apply long-term confinement without depriving cells of nutrients while performing live cell imaging or immunofluorescence analysis following fixation. The live cell imaging is a powerful tool to assess the effect of confinement not only on nuclear morphology, but also on cell cycle progression (using the FUCCI fluorescent reporter) and to compare nuclear volume between mother and daughter cells. The data presented by the authors to demonstrate changes in nuclear volume and surface area are convincing and supported by several independent measurements. The model comparing total and apparent nuclear surface area nicely complements the experimental measurements and helps to make the point that cells have a nuclear envelope tension set-point, even though the authors were unable to directly measure nuclear envelope tension. The inhibitor experiments targeting cPLA2 (using AACOCF3), intracellular calcium (using BAPTA-AM and 2APB), and myosin contractility (using blebbistatin) identify key players in the underlying cellular mechanism.

      Weaknesses:

      Although the findings by the authors will be of interest to a broad community, several weaknesses limit the mechanistic insights gained from this study. One major limitation is that all experiments are performed in a single cell line, HT-29 human colorectal cancer cells, which has an unusual nuclear envelope composition as it has no lamin B2, low lamin B1 levels, and contains a p53 mutation. Because lamins B1 and B2 play important functions in protecting the nuclear envelope from blebs and confinement-induced rupture, and p53 is crucial in the cellular DNA damage response, it remains unclear whether other cell lines exhibit similar adaptation behavior.

      Furthermore, although the time-lapse experiments suggest that reduction in nuclear volume occurs primarily during mitosis, the authors do not address whether prolonged confinement, even in the absence of apoptosis, could also result in cells adjusting their nuclear volume, or alternatively normalizing nuclear envelope tension by recruiting additional membrane from the endoplasmic reticulum, which is continuous with the nuclear membranes.

      Additionally, the molecular mechanisms underlying the observed loss in nuclear volume and the regulation of this process remain to be identified. The pharmacological studies implicate cPLA2, intracellular calcium, and actomyosin contractility in this process, but do not include validation to confirm the efficiency of the drug treatment or to rule out off-target effects. Regarding the proposed role of cPLA2, previous studies have shown that cPLA2 recruitment to the nuclear membrane, which is essential to mediate its nuclear mechanotransduction function, requires both an increase in nuclear membrane tension and intracellular calcium. However, the current study does not include any data showing the recruitment of cPLA2 to the nuclear membrane upon confinement, or the disappearance of nuclear membrane-associated cPLA2 during prolonged confinement, leaving unclear the precise function and dynamics of cPLA2 in the process.

      Lastly, it remains unclear (1) whether the reduction in nuclear volume is caused by a reduction in nuclear water content, by chromatin compaction, e.g. associated with an increase in heterochromatin, or through other mechanisms, (2) whether the change in nuclear volume is reversible, and if so, how quickly, and (3) what functional consequences the substantial reduction in nuclear volume has on nuclear function, as one would expect that this reduction would be associated with a substantial increase in nuclear crowding, affecting numerous nuclear processes.

    1. eLife assessment

      The authors introduce a valuable machine-learning model for predicting binding sites of diverse ligands, including DNA, RNA, peptides, proteins, ATP, HEM, and metal ions, on proteins. The method is freely accessible and user-friendly. The authors have conducted thorough benchmarking and ablation studies, providing convincing evidence of the model's overall performance, despite some imperfections of the comparisons to other methods that arise from intrinsic differences between training methods and data.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors aim to address a critical challenge in the field of bioinformatics: the accurate and efficient identification of protein binding sites from sequences. Their work seeks to overcome the limitations of current methods, which largely depend on multiple sequence alignments or experimental protein structures, by introducing GPSite, a multi-task network designed to predict binding residues of various molecules on proteins using ESMFold.

      Strengths:

      (1) Benchmarking. The authors provide a comprehensive benchmark against multiple methods, showcasing the performances of a large number of methods in various scenarios.

      (2) Accessibility and Ease of Use. GPSite is highlighted as a freely accessible tool with user-friendly features on their website, enhancing its potential for widespread adoption in the research community.

      Weaknesses:

      (1) Lack of significant insights. The paper reproduces results and analyses already presented in previous literature, without providing significant novel analysis or interpretation. However, they show a novel method with an original approach.

      The work is useful for the field, especially in disease mechanism elucidation and novel drug design. The availability of genome-scale binding residue annotations GPSite offers is a significant advancement.

    3. Reviewer #2 (Public Review):

      Summary:

      This work provides a new framework, "GPSite", to predict DNA, RNA, peptide, protein, ATP, HEM, and metal ion binding sites on proteins. This framework comes with a webserver and a database of annotations. The core of the model is a Geometric featurizer neural network that predicts the binding sites of a protein. One major contribution of the authors is the fact that they feed this neural network with predicted structures from ESMFold for training and prediction (instead of native structures, as in similar works) and a high-quality protein Language Model representation. The other major contribution is that it provides the public with a new lightweight framework to predict protein-ligand interactions for a broad range of ligands. It is a convincing outcome of previous efforts to apply Geometric Deep Learning approaches to model protein-ligand interactions. The authors have demonstrated the interest of their framework with comprehensive ablation studies and benchmarks.

      Strengths:

      - The performance of this framework as well as the provided dataset and web server make it useful to conduct studies.
      - The ablations of some core elements of the method, such as the protein Language Model part, the use of multiple ligands in the same model, the input structure, or the use of predicted structure to complement native structure are very insightful. They can help convince the reader that every part of the framework is necessary. This could also guide further developments in the field. As such, the presentation of this part of the work holds a critical place in this work.

      Weaknesses:

      - The authors made an important effort to compare their work to other similar frameworks. Yet, the lack of homogeneity of training methods and data from one work to the other makes the comparison slightly unconvincing, as the authors pointed out. Ablations performed by the authors were able to compensate for this general weakness, as well as the focus on several example structures.

    4. Reviewer #3 (Public Review):

      Summary

      The authors of this work aim to address the challenge of accurately and efficiently identifying protein binding sites from sequences. They recognize the limitations of current methods, including reliance on multiple sequence alignments or experimental protein structures and the under-explored geometry of the structure, which limit performance and genome-scale applications. The authors have developed a multi-task network, GPSite, that predicts binding residues for a range of biologically relevant molecules, including DNA, RNA, peptides, proteins, ATP, HEM, and metal ions, using sequence embeddings from protein language models and ESMFold-predicted structures. The reported results were shown to be superior to those of current sequence-based and structure-based methods in terms of accuracy and efficiency.

      Strengths

      (1) The GPSite model's ability to predict binding sites for a wide variety of molecules, including DNA, RNA, peptides, and various metal ions.

      (2) Based on the presented results, GPSite outperforms state-of-the-art methods in several benchmark datasets in terms of accuracy and efficiency.

      (3) GPSite adopts predicted structure instead of native structures as input, enabling the model to be applied to a wider range of scenarios where native structures are rare.

      (4) The low computational cost of GPSite is beneficial, which enables rapid genome-scale binding residue annotations, indicating the model's potential for large-scale downstream applications and discoveries.

      Weaknesses

      There are no major weaknesses after the revision.

    1. eLife assessment

      This study presents a rather valuable finding that vilazodone can restore normal platelet levels by regulating the 5-HT1A receptor. The evidence supporting the claims of the authors is solid, although inclusion of more cell lines and more detailed analysis of the results would have strengthened the study. The work will be of interest to scientists working in the field of thrombocytopenia.

    2. Reviewer #2 (Public Review):

      Summary:

      The authors sought to understand the mechanism by which a drug candidate, VLZ, acts on a receptor, 5-HTR1A, activating the SRC/MAPK pathway to promote the formation of platelets.

      Strengths:

      The authors used both computational and experimental methods. This approach saves time and funds in finding a useful drug candidate and its therapeutic marker in the subfield of platelet reduction in cancer patients. The authors achieved their aim of explaining the mechanism by which VLZ improves thrombocytopenia, using two cell lines and two animal models.

      Weaknesses:

      Only two cell lines, HEL and Meg-01 cells, were evaluated in this study. However, whether more cell lines can be used depends on the workflow and funding situation of the research team.

    1. eLife assessment

      This important study unveils the significant impact of prenatal alcohol exposure on epigenetic patterns, offering new insights into its adverse health outcomes through solid evidence from both mouse models and human data. The findings, which reveal how a high-methyl diet can mitigate these epigenetic alterations, present a promising prenatal care strategy. Despite its solid data overall, the study's small sample size and unaccounted confounders suggest the need for further research to confirm these findings and explore their practical implications.

    2. Reviewer #1 (Public Review):

      Summary:

      This manuscript examined the impact of prenatal alcohol exposure on genome-wide DNA methylation in the brain and liver, comparing ethanol-exposed mice to unexposed controls. The authors also investigated whether a high-methyl diet (HMD) could prevent the DNA methylation alterations caused by alcohol. Using bisulfite sequencing (n=4 per group), they identified 78 alcohol-associated differentially methylated regions (DMRs) in the brain and 759 DMRs in the liver, of which 85% and 84% were mitigated in the HMD group, respectively. The authors further validated 7 DMRs in humans using previously published data from a Canadian cohort of children with FASD.

      Overall, the findings from this study provide new insight into the impact of prenatal alcohol exposure, while also showing evidence for methyl-rich diets as an intervention to prevent the effects of alcohol on the epigenome. Some methodological concerns and confounders limit the robustness of these results, and should be addressed in future studies to further strengthen the conclusions of this study and its applicability to broader settings.

      Strengths:

      - The use of whole genome bisulfite sequencing allowed for the interrogation of the entire DNA methylome and DMR analysis, rather than a subset of CpGs.
      - The combination of data from animal models and humans allowed the authors to make stronger inferences regarding their findings.
      - The authors investigated a potential mechanism (high methyl diet) to buffer against the effects of prenatal alcohol exposure, which increases the relevance and applicability of this research.

      Weaknesses:

      - The sample size was small for the epigenetic analyses, which limits the robustness of the findings.
      - The authors could not account for potential confounders in their analyses, including birthweight, alcohol levels, and sex. This is a particular problem for the high-methyl diet analyses, in which the alcohol-exposed mice consumed less alcohol than their non-diet counterparts.

    1. eLife assessment

      The study presented in this manuscript makes important contributions to our understanding of cell fate decisions and the role of noise in gene regulatory networks. Through computational and theoretical analysis, the authors provide solid support for distinguishing distinct driving forces behind fate decisions based on noise profiles and reprogramming trajectories. While acknowledging the potential limitations of small gene regulatory networks in capturing the richness of whole-transcriptome sequencing datasets, this study offers a creative approach for formulating hypotheses about gene regulation during stem cell differentiation using single-cell sequencing data.

    2. Joint Public Review:

      In this manuscript, Xue and colleagues investigate the fundamental aspects of cellular fate decisions and differentiation, focusing on the dynamic behaviour of gene regulatory networks. They explore the debate between static (noise-driven) and dynamic (signal-driven) perspectives within Waddington's epigenetic landscape, highlighting the essential role of gene regulatory networks in this process. The authors propose an integrated analysis of fate-decision modes and gene regulatory networks, using the Cross-Inhibition with Self-activation (CIS) network as a model. Through mathematical modelling, they differentiate two logic modes and their effect on cell fate decisions: one in which transcription requires both the presence of an activator and the absence of a repressor (AA configuration), and one in which transcription occurs as long as the repressor is not the only species on the promoter (OO configuration).
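
      As a reading aid, the two logic modes named in this review can be written as Boolean rules on promoter occupancy. This is only a minimal sketch of the verbal description given here (AA: activator present and repressor absent; OO: the repressor is not the only species on the promoter). It is not the authors' model, which is described as a mathematical model of network dynamics, and the function names `aa_logic`, `oo_logic`, and `truth_table` are hypothetical.

```python
from itertools import product


def aa_logic(activator_bound: bool, repressor_bound: bool) -> bool:
    """AA reading: transcription needs the activator AND no repressor."""
    return activator_bound and not repressor_bound


def oo_logic(activator_bound: bool, repressor_bound: bool) -> bool:
    """OO reading: transcription unless the repressor is bound alone,
    i.e. the activator is bound OR the repressor is absent."""
    return activator_bound or not repressor_bound


def truth_table():
    """Rows of (activator_bound, repressor_bound, AA output, OO output)."""
    return [
        (a, r, aa_logic(a, r), oo_logic(a, r))
        for a, r in product((False, True), repeat=2)
    ]


if __name__ == "__main__":
    for row in truth_table():
        print(row)
```

      Under this Boolean reading, the two modes differ only when the promoter is empty or when both species are bound: OO permits transcription in those two states and AA does not.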

      The authors establish a relationship between noise profiles, logic-motifs, and fate-decision modes, showing that defining any two of these properties allows the inference of the third. They also identify, under the signal-driven mode, two fundamental patterns of cell fate decisions: either prioritising progression or accuracy in the differentiation process. The authors apply this analysis to available high-throughput datasets of cell fate decisions in hematopoiesis and embryogenesis, proposing the underlying driving force in each case and utilising the observed noise patterns to nominate key regulators.

      The paper significantly advances our understanding of gene regulatory networks through a well-described computational study, where the authors rigorously evaluate assumptions in modelling. Particularly commendable is their introduction of the concept of combinatorial logic, exemplified by the double 'and' and double 'or' (AA/OO) logic motifs, which they successfully map to previously described cell fate decision processes. This theoretical and computational exploration sheds light on the dynamic landscape of epigenetic cell fate decisions, emphasising the role of combinatorial logic in coordinating noise and signal-driven processes. The thorough comparison of two model configurations underscores the importance of integration logic, contributing to a clearer understanding of gene regulatory network dynamics. Importantly, the results of the simulations are presented clearly, enhancing accessibility and intuitive understanding. The paper's strength also lies in its predictive power, as the authors use simulations to make insightful predictions about the regulatory organisation of stem cell differentiation systems. While the exploration is restricted to specific scenarios, these limitations serve to highlight areas for future research rather than detract from the paper's strengths.

      While the paper presents an intriguing framework for understanding gene regulatory networks and cell fate decisions, there are some weaknesses that warrant attention. Firstly, the framework would benefit from validation with more experimental data and application to diverse systems beyond those explored in the study, such as de-differentiation in adult tissues and regeneration processes. Additionally, while the authors successfully make predictions about the regulatory organisation of stem cell differentiation systems, there is a lack of discussion regarding how perturbations in the regulatory network could affect cell fate decisions. Furthermore, the paper could be strengthened by addressing the effects of mutations and other perturbations that may significantly influence cell fate decision-making processes, thus enhancing the robustness of the findings. Finally, there are instances where the clarity of the writing could be improved to enhance understanding and accessibility for readers.

    1. eLife assessment

      This work presents some valuable information regarding the molecular mechanisms controlling the regeneration of pancreatic beta cells following induced cell ablation. However, the study lacks the critical lineage tracing result to support the conclusion about the origin of the regenerated beta cells. The results of the pharmacological manipulation of CaN signaling are also incomplete. In particular, these manipulations are not cell-specific, making the results difficult to interpret, and thus a genetic approach is recommended.

    2. Reviewer #1 (Public Review):

      Induction of beta cell regeneration is a promising approach for the treatment of diabetes. In this study, Massoz et al. identified calcineurin (CaN) as a new potential modulator of beta cell regeneration, using zebrafish as a model. They also showed that CaN works together with Notch signaling to promote beta cell regeneration. Overall, the paper is well organized and technically sound. However, some of the evidence seems too weak to support the conclusions.

    1. eLife assessment

      This potentially useful work characterizes the changes in microbial composition of the nasal and fecal microbiomes of COVID-19 patients according to the severity of disease. However, the description of methods and statistics used for several figures is incomplete.

    2. Reviewer #1 (Public Review):

      Summary:

      The study under review investigated the gut and nasopharyngeal microbiota and identified potential microbiota-based biomarkers from both sites that could aid in predicting COVID-19 severity. The study reported significant changes in the richness and Shannon diversity index of the nasopharyngeal microbiome associated with severe symptoms. The study showed a high abundance of Bacillota and Pseudomonadota in patients exhibiting severe symptomatology. Positive correlations were also found between the abundance of Corynebacterium, Acinetobacter, Staphylococcus, and Veillonella and the severity of SARS-CoV-2 infection.

      Strengths:

      The study successfully identified differences in the microbiome diversity that could indicate or predict disease severity. Furthermore, the authors demonstrated a link between individual nasopharyngeal organisms and the severity of SARS-CoV-2 infection. The density of the nasopharyngeal organism was shown to be a potential predictor of the severity of COVID-19.

      Weaknesses:

      The authors claimed an association between nasopharyngeal organisms and severity of SARS-CoV-2 infection but omitted essential data on the statistical significance of these associations between groups. The authors frequently referred to a p-value < 0.05 without presenting the actual p-values and percentages to show the significance of their results. The discussion was hard to understand, as it contained an extensive literature review without discussing the study findings. A more focused discussion and results section on the main findings could have improved the overall readability of the paper. The role of potential confounders, such as HIV infection and ethnicity, which impact the nasopharyngeal microbiome composition, was not addressed in the paper. Addressing the potential confounders would contribute to a more comprehensive understanding of the study's implications, specifically the role of the nasopharyngeal microbiome as a predictor of COVID-19 severity.

    3. Reviewer #2 (Public Review):

      The study by Benita et al. examined the gut and nasopharyngeal microbiome in relation to COVID-19 severity. There are many studies on this topic, and this study does not stand out among them. Beyond that, I have a number of major concerns:

      (1) The sample size is limited. There were 3 cohorts, but only ~100 subjects in total. This indicates that there were only a small number of subjects in each cohort (the authors did not list this information), and beyond that, there was a lack of healthy individuals as controls. A cohort-specific effect usually exists, and I believe that with such a small number of patients (who were further divided into 3 groups), the authors cannot find reproducible data between cohorts.

      (2) The study did not meet its goal. The authors say "Many factors have been described to be correlated with its severity but no specific determinants of infection outcome have been identified yet". However, numerous studies have shown the relationship between the microbiome and COVID-19. The present study again showed only a correlation between the microbiome and COVID-19 severity; it neither provided further insights nor found specific determinants.

      (3) This study used only 16S sequencing for microbiome profiling, which limits its depth and resolution. Many comparable papers have used metagenomic sequencing for in-depth interrogation.

      (4) Since there are fecal and nasopharyngeal microbiome data, the authors only listed their respective associations with covid severity yet did not provide further insights into whether and how these two microbiome types are linked to covid, or into whether there is a microbiome priority, resistance or transmission.

      (5) The abstract is unclear: each sentence lacks a key message, and I do not understand the sentences or their underlying meanings. One example of an unclear expression is "this ratio" - what ratio?

      (6) The figures are all unclear and need significant improvement.

    4. Reviewer #3 (Public Review):

      Summary:

      How the microbial composition of the human body is influenced by and influences disease progression is an important topic. For people with COVID-19, symptomatic progression and deterioration can be difficult to predict. This manuscript attempts to associate the nasal and fecal microbiomes of COVID-19 patients with the severity of disease symptoms, with the goal of identifying microbial markers that can predict disease outcomes. However, the value of this work is held back by unclear methods and data presentation.

      Strengths:

      Analysis of microbiomes from two distinct anatomical locations and across three distinct patient groups is a substantial undertaking. How these microbiomes influence and are influenced by COVID-19 disease progression is an important question. In particular, the putative biomarker identified here could be of clinical value with additional research.

      Weaknesses:

      The methods and statistics used for several figures and comparisons are unclear or used in non-standard ways. For instance: the description of the Bray-Curtis test for Figure 1 is inaccurate and conflicts between the text and figure legend; the method used to compare the relative abundance of genera in Figure 2 is not clear; and it is not stated how the "total amount" of detected bacteria is inferred from the data presented in Figures 2C and 2D.
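
      For context on the first point, Bray-Curtis is a pairwise dissimilarity between two samples rather than a statistical test in itself; differences between groups are usually tested separately on the resulting distance matrix (for example with PERMANOVA). A minimal sketch of the dissimilarity, using hypothetical genus counts that are not taken from the study, might look like this:

```python
def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity between two abundance vectors.

    Returns 0 when the samples are identical and 1 when they share no taxa.
    """
    if len(sample_a) != len(sample_b):
        raise ValueError("both samples must list the same taxa in the same order")
    total = sum(sample_a) + sum(sample_b)
    if total == 0:
        raise ValueError("at least one sample must contain counts")
    # Sum of absolute per-taxon differences, normalised by total abundance.
    return sum(abs(a - b) for a, b in zip(sample_a, sample_b)) / total


# Hypothetical genus counts for two samples (illustrative only).
nasal_1 = [6, 7, 4]
nasal_2 = [10, 0, 6]
print(bray_curtis(nasal_1, nasal_2))
```

      The function name and the example counts are assumptions made for illustration; the manuscript's actual pipeline is not described in the review.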

      The description of results for Figure 1 is overstated or unclear for both the alpha diversity among disease groups and the overlap for nasal samples.

      The most abundant phyla from nasal samples cumulatively account for less than 1% of abundance and it is unclear why this would be expected or how it compares to other work. Relatedly, the potential biological relevance of the very small proportional changes among phyla in the nasal samples is also not clear.

      There is no real discussion of how the identified biomarkers might work in practice. While some microbes are detected in one condition but not others, it is unclear whether these organisms are expected to already exist below the detection threshold and then increase in abundance along with disease severity, or if they are picked up from the environment. For instance, would the presence of these 'severe' - associated microbes in patients with mild or moderate disease justify additional treatment to prevent disease progression?

      The authors use the term "nasopharyngeal-faecal axis", but there is no substantial discussion of how these two microbiomes interact to influence disease progression, or how they are jointly affected to yield useful biomarkers. With one exception, correlation values between nasal and fecal microbes range from negligible to modest. It is unclear, then, how much parallel influence disease has on these microbiomes.

    1. eLife assessment

      This comprehensive study provides valuable information on the cooperation of Ikaros with Foxp3 to establish and regulate a major portion of the epigenome and transcriptome of T-regulatory cells. While the data are compelling, the evidence that these features are solely intrinsic, independent of the micro-environment, could be strengthened.

    2. Joint Public Review:

      This study investigates the role of Ikaros, a zinc finger family transcription factor related to Helios and Eos, in T-regulatory (Treg) cell functionality in mice. Through genome-wide association studies and chromatin accessibility studies, the authors find that Ikaros shares similar binding sites to Foxp3. Ikaros cooperates with Foxp3 to establish a major portion of the Treg epigenome and transcriptome. Ikaros-deficient Treg cells exhibit Th1-like gene expression with abnormal expression of IL-2, IFNg, TNFa, and factors involved in Wnt and Notch signalling. Further, two models of inflammatory/autoimmune diseases - Inflammatory Bowel Disease (IBD) and organ transplantation - are employed to examine the functional role of Ikaros in Treg-mediated immune suppression. The authors provide a detailed analysis of the epigenome and transcriptome of Ikaros-deficient Treg cells.

      These studies establish Ikaros as a factor required in Treg for tolerance and the control of inflammatory immune responses. The data are of high quality. Overall, the study is well organized, and reports new data consolidating mechanistic aspects of Foxp3 mediated gene expression program in Treg cells.

      Strengths:

      The authors have performed biochemical studies focusing on mechanistic aspects of molecular functions of the Foxp3-mediated gene expression program and complemented these with functional experiments using two models of autoimmune diseases, thereby strengthening the study. The studies are comprehensive at both the cellular and molecular levels. The manuscript is well organized and presents a plethora of data regarding the transcriptomic landscape of these cells.

      Weakness:

      The findings of markedly increased percentages of activated conventional T cells (CD44hi), major increases in TFH cells, and elevated serum Ig levels indicate disrupted immune homeostasis even in the absence of overt autoimmune manifestations seen in histopathology. Thus, some of the genetic changes observed by the authors are likely Treg cell extrinsic. Further, clear conclusions from the genome-wide studies are lacking.

    1. eLife assessment

      The development of this mouse model is an important step to establish the role of the FSH Receptor in tissues beyond the reproductive system, and the data provided in this paper are convincing for a role for the FSH receptor in cell systems well beyond the classic reproductive tissues. Such model(s) have long been needed in this field and will provide expanded opportunities to better define FSH biology in vivo in these important target tissues. Ultimately, this model could shed light on FSH biology in women after menopause, when endogenous FSH levels rise dramatically, or in men with hypogonadism when FSH levels are high.

    2. Reviewer #1 (Public Review):

      The manuscript describes the development of a mouse model that co-expresses a fluorescent protein (ZsGreen) marker in gene fusion with the FSHR gene.

      The authors are correct in that there is a lack of reliable antibodies against many of the GPCR family members. The approach is novel and interesting, with the potential to help understand the expression pattern of gonadotropin receptors. There has been a very long debate about the expression of gonadotropin receptors in tissues other than the gonads. While expression of the FSHR in some of those tissues has been detected by a variety of methods, its physiological, or pathophysiological, function(s) remain elusive.

      The authors in this manuscript assume that the expression of ZsGreen and the FSHR are equal. While this is correct genetically (transcription->translation), it does not necessarily hold for other posttranslational processes.

      (1) One of the shocking observations in this manuscript is the expression of FSHR in Leydig cells. Other observations are in the osteoblasts and endothelial cells as well as epithelial cells in different organs. The expression of ZsGreen in these tissues seems high, and one should start questioning whether there are other mechanisms at play here.

      First, the turnover of fluorescent proteins is slow, with half-lives longer than 48h, which means that they accumulate at a different rate than the endogenous FSHR. ZsGreen will therefore accumulate over time while the FSHR might be degraded almost immediately. The ZsGreen expression correlates with the authors' mRNA expression data but not with the results of other single-cell sequencing studies (see below).

      The expression of ZsGreen in Leydig cells seems much higher than in Sertoli cells, which is "disturbing", to put it mildly. This is visible in both the ZsGreen expression and the FISH assay (Figure 2 B-D).

      (2) The expression in WAT and BAT is also questionable as the expression of ZsGreen is high everywhere. That makes it difficult to believe that the images are truly informative. For example, the stainings of aorta show the ZsGreen expression where elastin and collagen fibres are - these are not "cells" and therefore are not expressing ZsGreen.

      (3) FISH expression (for FSHR) in WT mice is missing.

      Also, the tissue sections were stained with the IgG only (neg control) but in practice both the KI and the WT tissues should be stained with the primary and secondary antibodies. The only control that I could think of to truly get a sense of this would be a tagged receptor (N-terminal) that could then be analysed by immunohistochemistry.

      (4) The authors also claim: "To functionally prove the presence of FSHR in osteoblasts/osteocytes, we also deleted FSHR in osteocytes using an inducible model. The conditional knockout of FSHR triggered a much more profound increase in bone mass and decrease in fat mass than blockade by FSHR antibodies (unpublished data)."

      This would be a good control for all their images. I think it is necessary to make the large claim of extragonadal expression, as well as intragonadal such as Leydig cells.

      (5) Claiming that the under-developed Leydig cells in FSHR KO animals are due to a direct effect of the FSHR, and not via a cross-talk between Sertoli and Leydig cells, is too much of a claim. It might be speculated to some degree but as written at the moment it suggests this is "proven".

      (6) We also do not know if this FSHR expressed is a spliced form that would also result in the expression of ZsGreen but in a non-functional FSHR, or whether the FSHR is immediately degraded after expression. The insertion of the ZsGreen might have disturbed the epigenetics, transcription, or biosynthesis of the mRNA regulation.

      (7) The authors should go through single-cell data of WT mice to show the existence of the FSHR transcript(s). For example, here: https://www.nature.com/articles/sdata2018192
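
      The reporter turnover concern in point (1) can be made concrete with a simple model. Assuming constant synthesis and first-order degradation (an idealisation, not something reported in the manuscript), the steady-state abundance of a protein is proportional to its half-life, so a long-lived reporter can far exceed a short-lived receptor made from the same transcript. In the sketch below, the 48 h reporter half-life follows the figure quoted in the review, while the 2 h receptor half-life is purely hypothetical:

```python
import math


def steady_state_level(synthesis_rate, half_life_h):
    """Steady-state abundance under constant synthesis and first-order decay.

    At steady state, synthesis equals degradation, so
    level = synthesis_rate / (ln 2 / half_life).
    """
    if half_life_h <= 0:
        raise ValueError("half-life must be positive")
    decay_rate = math.log(2) / half_life_h
    return synthesis_rate / decay_rate


# Same synthesis rate for both proteins (arbitrary units per hour).
reporter = steady_state_level(1.0, 48.0)  # half-life of 48 h, as quoted in the review
receptor = steady_state_level(1.0, 2.0)   # hypothetical 2 h half-life, for illustration
print(reporter / receptor)
```

      Under these assumptions the fold difference equals the ratio of the two half-lives, which is why a measured ZsGreen half-life would help in interpreting the reporter signal.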

    3. Reviewer #2 (Public Review):

      The authors developed an original knock-in reporter mouse line expressing ZsGreen under the control of the endogenous FSHR promoter. The existence of FSHR in various extra-gonadal tissues and its physio-pathological consequences indeed remain debated and could potentially have an important impact on many high-incidence diseases occurring in menopausal women. Unfortunately, the provided data set lacks crucial controls and therefore does not provide a robust/convincing answer to the above-mentioned question.

      Summary:

      The authors investigated the expression pattern of the FSHR in the gonads, where its expression has been demonstrated for decades, but also in many extra-gonadal tissues. The question is important since the expression of FSHR outside of the gonads has been increasingly reported and associated with the dramatic increase of circulating FSH after menopause, and has been suggested to play an important role in the advent of multiple diseases occurring with high incidence in post-menopausal women. However, the reality of such extra-gonadal expression of FSHR remains debated, mainly because this receptor is expressed at a low level and because the specificity/affinity of the available anti-FSHR antibodies is questionable.

      Strengths:

      The development of reporter mice expressing ZsGreen fluorescent protein under the control of endogenous FSHR promoter is an original and potentially powerful approach to tackle the problem.

      Weaknesses:

      The data provided are provocative since the FSHR seems to be expressed in all tested tissues. In the testis, for instance, the authors report very high levels of FSHR in interstitial cells and germ cells. In the ovary, there seems to be no difference in FSHR expression between granulosa cells and the other cell types. These findings alone contradict all the knowledge on FSHR expression patterns in the gonads that has been accumulated over decades by many independent labs. In view of such results, the validity of the reporter mouse line should be questioned thoroughly:

      (1) Is the FSHR expression pattern affected by the knock-in? No side-by-side comparison between wt and ZsGreen mice, using in situ hybridization and ddRTPCR, at least in the gonads, is provided.

      (2) Is the splicing pattern of the FSHR affected in the knockin compared to wt mice, at least in the gonads?

      (3) Are there any additional off-target insertions of ZsGreen in these mice?

      (4) Are similar results observed in separate founder mice?

      (5) How long is the ZsGreen half-life? Could a very long half-life be a major reason for the extremely broad expression pattern observed?

      In the absence of answers to these questions, the data produced in extra-gonadal tissues using the same reporter mice are not convincing and do not support the authors' claims.

    1. eLife assessment

      This study on scRNA-seq of allergic contact dermatitis (ACD) is important in that it presents new data on fibroblasts in ACD and links to recent studies on other cell types and their signatures. The evidence presented is solid in that the data support claims of unique roles for subtypes of fibroblasts in ACD. Overall, this paper will be used as a resource by many in the skin inflammation field.

    2. Reviewer #1 (Public Review):

      Summary:

      In this manuscript, Liu et al. used scRNA-seq to characterize cell type-specific responses during allergic contact dermatitis (ACD) in a mouse model, specifically the hapten-induced DNFB model. Using the scRNA-seq data, they deconvolved the cell types responsible for the expression of major inflammatory cytokines such as IFNG (from CD4 and CD8 T cells), IL4/13 (from basophils), IL17A (from gd T cells), and IL1B from neutrophils and macrophages. They found the highest upregulation of a type 1 inflammatory response, centering around IFNG produced by CD4 and CD8 T cells. They further identified a subpopulation of dermal fibroblasts that upregulate CXCL9/10 during ACD and provided functional genetic evidence in their mouse model that disrupting IFNG signaling to fibroblasts decreases CD8 T cell infiltration and overall inflammation. They identify an increase in IFNG-expressing CD8 T cells in human patient samples of ACD vs. healthy control skin and co-localization of CD8 T cells with PDGFRA+ fibroblasts, which suggests this mechanism is relevant to human ACD. This mechanism is reminiscent of recent work (Xu et al., Nature 2022) showing that IFNG signaling in dermal fibroblasts upregulates CXCL9/10 to recruit CD8 T cells in a mouse model of vitiligo. Overall, this is a very well-presented, clear, and comprehensive manuscript. The conclusions of the study are mostly well supported by data, but some aspects of the work could be improved by additional clarification of the identity of the cell types shown to be involved, including the exact subpopulation discovered by scRNA-seq and the subtype of CD8 T cell involved. The study was limited by its use of one ACD model (DNFB), which prevents an assessment of how broadly relevant this axis is. The human sample validation is slightly circumstantial and limited by the multiplexing capacity of immunofluorescence markers.

      Strengths:

      Through deep characterization of the in vivo ACD model, the authors were able to determine which cell types were expressing the major cytokines involved in ACD inflammation, such as IFNG, IL4/13, IL17A, and IL1B. These analyses are well-presented and thoughtful, showing first that the response is IFNG-dominant, then focusing on deeper characterization of lymphocytes, myeloid cells, and fibroblasts, which are also validated and complemented by FACS experiments using canonical markers of these cell types as well as IF staining. Crosstalk analyses from the scRNA-seq data led the authors to focus on IFNG signaling in fibroblasts, and in vitro experiments demonstrate that CXCL9 and CXCL10 are expressed by fibroblasts stimulated by IFNG. In vivo functional genetic evidence demonstrates an important role for IFNG signaling in fibroblasts, as KO of Ifngr1 using Pdgfra-Cre Ifngr1 fl/fl mice showed a reduction in inflammation and CD8 T cell recruitment.

      Weaknesses:

      The use of one model limits an understanding of how broad this fibroblast-T cell axis is during ACD. However, the authors chose the most commonly employed model and cited additional work in a vitiligo model (another type 1 immune response). The identity of the involved fibroblasts and T cells in the mouse model is difficult to assess as scRNA-seq identified subpopulations of these cell types, but most work in the Pdgfra-Cre Ifngr1 fl/fl mice used broad markers for these cell types as opposed to matched subpopulation markers from their scRNA-seq data. Human patient samples of ACD were co-stained with two markers at a time, demonstrating the presence of CD8+IFNG+ T cells, PDGFRA+CXCL10+ fibroblasts, and co-localization of PDGFRA+ fibroblasts and CD8+ T cells. However, no IF staining demonstrates co-expression of all 4 markers at once; thus, the human validation of co-localization of CD8+IFNG+ T cells and PDGFRA+CXCL10+ fibroblasts is ultimately indirect, although not a huge leap of faith. Although n=3 samples of healthy control and ACD samples are used, there is no quantification of any results to demonstrate the robustness of differences.

    3. Reviewer #2 (Public Review):

      Summary:

      The investigators apply scRNA-seq and bioinformatics to identify biomarkers associated with DNFB-induced contact dermatitis in mice. The bioinformatics component of the study appears reasonable and may provide new insights regarding TH1-driven immune reactions in ACD in mice. However, the IF data and images of tissue sections are not clear and should be improved to validate the model.

      Strengths:

      The bioinformatics analysis.

      Weaknesses:

      The IF data presented in 4H, 6H, 7E and 7F are not convincing and need to be correlated with routine staining on histology and different IF markers for PDGFR. Some of the IF staining data demonstrates a pattern inconsistent with its target.

    1. eLife assessment

      This valuable work provides a near-complete description of the mechanosensory bristles on the Drosophila melanogaster head and the anatomy and projection patterns of the bristle mechanosensory neurons that innervate them. The data presented are solid. The study has generated numerous resources for the community that will be of interest to neuroscientists in the field of circuits and behaviour, particularly those interested in mechanosensation and behavioural sequence generation.

    2. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      This valuable work provides a near-complete description of the mechanosensory bristles on the Drosophila melanogaster head and the anatomy and projection patterns of the bristle mechanosensory neurons that innervate them. The data presented are solid. The study has generated numerous invaluable resources for the community that will be of interest to neuroscientists in the field of circuits and behaviour, particularly those interested in mechanosensation and behavioural sequence generation.

      We express our gratitude to the Reviewers for their valuable suggestions, which significantly enhanced the manuscript. The revisions were undertaken, not with the expectation of acceptance, but rather driven by our sincere belief that these revisions would enhance the manuscript's impact for future readers.

      Public Reviews:

      Reviewer #1 (Public Review):

      Sensory neurons of the mechanosensory bristles on the head of the fly project to the subesophageal zone (SEZ). In this manuscript, the authors have built on a large body of previous work to comprehensively classify and quantify the head bristles. They broadly identify the nerves that various bristles use to project to the SEZ and describe their region-specific innervation in the SEZ. They use dye-fills, clonal labelling, and electron microscopic reconstructions to describe in detail the phenomenon of somatotopy - conserved peripheral representations within the central brain - within the innervation of these neurons. In the process they develop novel tools to access subsets of these neurons. They use these to demonstrate that groups of bristles in different parts of the head control different aspects of the grooming sequence.

      Reviewer #2 (Public Review):

      The authors combine genetic tools, dye fills and connectome analysis techniques to generate a "first-of-its-kind", near complete, synaptic resolution map of the head bristle neurons of Drosophila. While some of the BMN anatomy was already known based on previous work by the authors and other researchers, this is the first time a near complete map has been created for the head BMNs at electron microscopy resolution.

      Strengths:

      (1) The authors cleverly use techniques that allow moving back and forth between periphery (head bristle location) and brain, as well as moving between light microscopy and electron microscopy data. This allows them to first characterize the pathways taken by different head BMNs to project to the brain and also characterize anatomical differences among individual neurons at the level of morphology and connectivity.

      (2) The work is very comprehensive and results in a near complete map of all head BMNs.

      (3) Authors also complement this anatomical characterization with a first-level functional analysis using optogenetic activation of BMNs that results in expected directed grooming behavior.

      Weaknesses:

      (1) The clustering analysis is compelling but cluster numbers seem to be arbitrarily chosen instead of by using some informed metrics.

      We made revisions to the manuscript that address this concern. Please see our response to “recommendations for authors” for a description of these revisions.

      (2) It could help provide context if authors revealed some of the important downstream pathways that could explain optogenetics behavioral phenotypes and previously shown hierarchical organization of grooming sequences.

      We made revisions to the manuscript that address this recommendation. Please see our response to “recommendations for authors” for a description of these revisions.

      (3) In contrast to the rigorous quantitative analysis of the anatomical data, the behavioral data is analyzed using much more subjective methods. While I do not think it is necessary to perform a rigorous analysis of behaviors in this anatomy focused manuscript, the conclusions based on behavioral analysis should be treated as speculative in the current form e.g. calling "nodding + backward walking" as an avoidance response is not justified as it currently stands. Strong optogenetic activation could lead to sudden postural changes that due to purely biomechanical constraints could lead to a couple of backward steps as seen in the example videos. Moreover since the quantification is manual, it is not clear what the analyst interprets as backward walking or nodding. Interpretation is also concerning because controls show backward walking (although in fewer instances based on subjective quantification).

      While unbiased machine vision-based methods would nicely complement the present work, this type of analysis is not yet working to distinguish between different head grooming movements. Therefore, we are currently limited to manual annotation for our behavioral analysis. That said, we do not believe that our manual annotation is subjective. The grooming movements that we examine in this work are distinguishable from each other through frame-by-frame manual annotation of video at 30 fps. Our annotation of the grooming and backward motions performed by flies is based on previous publications that established a controlled vocabulary defining each movement (Hampel et al., 2020a, 2017, 2015; Seeds et al., 2014). In this work, we added head nodding to this controlled vocabulary, which is described in the Materials and methods. We have added additional text to the third paragraph of the Materials and methods section entitled “Behavioral analysis procedures” that we hope better describes our behavioral analysis. This description now reads:

      Head nodding was annotated when the fly tilted its head downward by any amount until it returned its head back in its original position. This movement often occurred in repeated cycles. Therefore, the “start” was scored at the onset of the first forward movement and the “stop” when the head returned to its original position on the last nod.

      We do not make any firm conclusions about the head movements (nodding) and backwards motions. We refer to nodding as a descriptive term that would allow the reader to better understand what the behavior looks like. We make no firm conclusions about any behavioral functional role that either the nodding or the backward motions might have, with the exception of nodding in the context of grooming. We only suggest that the behaviors appear to be avoidance responses. Furthermore, backward walking was not mentioned. Instead we refer to backward motions. We are only reporting our annotations of these movements that do occur, and are significantly different from controls. We speculate that these could be avoidance responses based on support from the literature. Future studies will be required to understand whether these movements serve real behavioral roles.

      Summary:

      The authors end up generating a near-complete map of head BMNs that will serve as a long-standing resource to the Drosophila research community. This will directly shape future experiments aimed at modeling or functionally analyzing the head grooming circuit to understand how somatotopy guides behaviors.

      Reviewer #3 (Public Review):

      Eichler et al. set out to map the locations of the mechanosensory bristles on the fly head, examine the axonal morphology of the bristle mechanosensory neurons (BMNs) that innervate them, and match these to electron microscopy reconstructions of the same BMNs in a previously published EM volume of the female adult fly brain. They used BMN synaptic connectivity information to create clusters of BMNs that they show occupy different regions of the subesophageal zone brain region and use optogenetic activation of subsets of BMNs to support the claim that the morphological projections and connectivity of defined groups of BMNs are consistent with the parallel model for behavioral sequence generation.

      The authors have beautifully cataloged the mechanosensory bristles and the projection paths and patterns of the corresponding BMN axons in the brain using detailed and painstaking methods. The result is a neuroanatomy resource that will be important to the community. To match BMNs reconstructed in an electron microscopy volume of the adult fly brain, the authors matched clustered reconstructed BMNs with light-level BMN classes using a variety of methods, but evidence for matching is only summarized and not demonstrated in a way that allows the reader to evaluate the strength of the evidence. The authors then switch from morphology-based categorization to non-BMN connectivity as a clustering method, which they claim demonstrates that BMNs form a somatotopic map in the brain. This map is not easily appreciated, and although contralateral projections in some populations are clear, the distinct projection zones that are mentioned by the authors are not readily apparent. Because of the extensive morphological overlap between connectivity-based clusters, it is not clear that small projection differences at the projection level are what determines the post-synaptic connectivity of a given BMN cluster or their functional role during behavior. The claim that the somatotopic organization of BMN projections is preserved among their postsynaptic partners to form parallel sensory pathways is not supported by the result that different connectivity clusters still have high cosine similarity in a number of cases (i.e. Clusters 1 and 3, or Clusters 1 and 2). Finally, the authors use tools that were generated during the light-level characterization of BMN projections to show that specifically activating BMNs that innervate different areas of the head triggers different grooming behaviors. In one case, activation of a single population of sensory bristles (lnOm) triggers two different behaviors, both eye and dorsal head grooming. This result does not seem consistent with the parallel model, which suggests that these behaviors should be mutually exclusive and rely on parallel downstream circuitry.

      We made revisions to the manuscript that address this recommendation. Please see our response to “recommendations for authors” for a description of these revisions.

      This work will have a positive impact on the field by contributing a complete accounting of the mechanosensory bristles of the fruit fly head, describing the brain projection patterns of the BMNs that innervate them, and linking them to BMN sensory projections in an electron microscopy volume of the adult fly brain. It will also have a positive impact on the field by providing genetic tools to help functionally subdivide the contributions of different BMN populations to circuit computations and behavior. This contribution will pave the way for further mechanistic study of central circuits that subserve grooming.

      Recommendations for the authors:

      All three reviewers appreciated the work presented in this manuscript. There were also a few overlapping concerns that were raised that are summarised below, should the authors wish to address them:

      Somatotopy: We recommend that the authors describe the extent of prior knowledge in more detail to highlight their contribution better.

      We made revisions that better highlight the extent of prior knowledge about somatotopy. We describe how previous studies showed bristle mechanosensory neurons in insects are somatotopically organized, but these studies were not comprehensive descriptions of complete somatotopic maps for the head or body. To our knowledge, our study provides the first comprehensive and synaptic resolution somatotopic map of a head for any animal. This sets the stage for the complete definition of the interface between somatotopically-organized mechanosensory neurons and postsynaptic circuits, which has broad implications for future studies on aimed grooming, and mechanosensation in general. Below we itemize revisions to the Introduction, Discussion, and Figures to provide a clearer statement of the significance of our study as it relates to somatotopy.

      (1) Newly added Figure 1 – figure supplement 1 more explicitly grounds the study in somatotopy, providing a working model of the organization of the circuit pathways that produce the grooming sequence. This model features somatotopy as shown in Figure 1 – figure supplement 1C.

      (2) Figure 1 – figure supplement 1 is incorporated into the Introduction in the second, third, and fourth paragraphs, the first paragraph of the Results section titled “Somatotopically-organized parallel BMN pathways”, and the second and third paragraphs of the last Discussion section titled “Parallel circuit architecture underlying the grooming sequence”.

      (3) We added text to the end of the fourth paragraph of the Introduction that now reads: “In this model, parallel-projecting mechanosensory neurons that respond to stimuli at specific locations on the head or body could connect with somatotopically-organized parallel circuits that elicit grooming of those locations (Figure 1 – figure supplement 1A-C). The previous discovery of a mechanosensory-connected circuit that elicits aimed grooming of the antennae provides evidence of this organization (Hampel 2015). However, the extent to which distinct circuits elicit grooming of other locations is unknown, in part, because the somatotopic projections of the mechanosensory neurons have not been comprehensively defined for the head or body.”

      (4) There is a Discussion section that further explains the extent of prior knowledge and our contributions on somatotopy that is titled “A synaptic resolution somatotopic map of the head BMNs”. Additionally, the previous version of this section had a paragraph on the broader implications of our work as it relates to somatotopy across species. In light of the reviewer comments, we decided to make this paragraph into its own Discussion section to better highlight the broader significance of our work. This section is titled “First synaptic resolution somatotopic map of the head”.

      The somatotopy isn't overtly obvious - perhaps they could try mapping presynaptic sites and provide landmarks to improve visualisation.

      We made the following revisions to better highlight the head BMN somatotopy. One point of confusion in the previous manuscript version stemmed from our not explicitly defining the somatotopic organization that we observed. This gave the impression that we were defining the head somatotopy based only on the small projection differences among BMNs from neighboring head locations. While we believe that these small differences indeed correspond to somatotopy, we failed to highlight that there are overt differences in the brain projections of BMNs from distant locations on the head. For example, Figure 5B (right panel) shows the distinct projections between the LabNv (brown) and AntNv (blue) BMNs that innervate bristles on the ventral and dorsal head, respectively. Thus, BMN types innervating neighboring bristles show overlapping projections with small projection differences, whereas those innervating distant bristles show non-overlapping projections into distinct zones.

      Our analysis of postsynaptic connectivity similarity also shows somatotopic organization among the BMN postsynaptic partners, as BMN types innervating the same or neighboring bristle populations show high connectivity similarity (Figure 8, old Figure 7). Below we highlight major revisions to the text and Figures that hopefully better reveal the head somatotopy.

      (1) In the last paragraph of the Introduction we added text that explicitly frames the experiments in terms of somatotopic organization: “This reveals somatotopic organization, where BMNs innervating neighboring bristles project to the same zones in the CNS while those innervating distant bristles project to distinct zones. Analysis of the BMN postsynaptic connectome reveals that neighboring BMNs show higher connectivity similarity than distant BMNs, providing evidence of somatotopically organized postsynaptic circuit pathways.”

      (2) We mention an example of overt somatotopy from Figure 5 in the Results section titled “EM-based reconstruction of the head BMN projections in a full adult brain”. The text reads “For example, BMNs from the Eye- and LabNv have distinct ventral and anterior projections, respectively. This shows how the BMNs are somatotopically organized, as their distinct projections correspond to different bristle locations on the head (Figure 5B,C).”

      (3) In new Figure 8 (part of old Figure 7), we modified panels that correspond to the cosine similarity analysis of postsynaptic connectivity. The major revision was to plot the cosine similarity clusters onto the head bristles so that the bristles are now colored based on their clusters (C). This shows how neighboring BMNs cluster together, and therefore show similar postsynaptic connectivity. We believe that this provides a nice visualization of somatotopic organization in BMN postsynaptic connectivity. We also added the clustering dendrogram as recommended by Reviewer #2 (Figure 8A).

      (4) In new Figure 8, we added new panels (D-F) that summarize our anatomical and connectomic analysis showing different somatotopic features of the head BMNs. Different BMN types innervate bristles at neighboring and distant proximities (D). BMNs that innervate neighboring bristles project into overlapping zones (E, example of reconstructed BM-Fr and -Ant neurons with non-overlapping BM-MaPa neurons) and show postsynaptic connectivity similarity (F, example connectivity map of three BM types on cosine similarity data).

      (5) To accompany the new Figure 8D-F panels, we added a paragraph to summarize the different somatotopic features of the head BMNs that were identified based on our anatomical and connectomic analysis. This is the last paragraph in the Results section titled “Somatotopically-organized parallel BMN pathways”:

      Our results reveal head bristle proximity-based organization among the BMN projections and their postsynaptic partners to form parallel mechanosensory pathways. BMNs innervating neighboring bristles project into overlapping zones in the SEZ, whereas those innervating distant bristles project to distinct zones (example of BM-Fr, -Ant, and -MaPa neurons shown in Figure 8D,E). Cosine similarity analysis of BMN postsynaptic connectivity revealed that BMNs innervating the same bristle populations (same types) have the highest connectivity similarity. Figure 8F shows example parallel connections for BM-Fr, -Ant, and -MaPa neurons (vertical arrows), where the edge width indicates the number of synapses from each BMN type to their major postsynaptic partners. Additionally, BMNs innervating neighboring bristle populations showed postsynaptic connectivity similarity, while BMNs innervating distant bristles show little or none. For example, BM-Fr and -Ant neurons have connections to common postsynaptic partners, whereas BM-MaPa neurons show only weak connections with the main postsynaptic partners of BM-Fr or -Ant neurons (Figure 8F, connections under 5% of total BMN output omitted). These results suggest that BMN somatotopy could have different possible levels of head spatial resolution, from specific bristle populations (e.g. Ant bristles), to general head areas (e.g. dorsal head bristles).

      We also refer to Figure 8D-F to illustrate the different somatotopic features in the Discussion. These references can be found in the fourth paragraph of the Discussion section titled “A synaptic resolution somatotopic map of the head BMNs” and in the second paragraph of the section titled “Parallel circuit architecture underlying the grooming sequence”.

      (6) In addition to improving the Figures, we provide additional tools that enable readers to explore the BMN somatotopy in a more interactive way. That is, we provide 5 different FlyWire.ai links in the manuscript Results section that enable 3D visualization of the different reconstructed BMNs (e.g. FlyWire.ai link 1).

      Note: In working on old Figure 7 to address this Reviewer suggestion, we also reordered panels A-E. We believe that this was a more logical ordering than in the previous draft. These panels are now the only data shown in Figure 7, as the cosine similarity analysis is now in Figure 8. We hope that splitting these panels into two Figures will improve manuscript readability.
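
      The thresholding described for the example connectivity map in Figure 8F (connections under 5% of total BMN output omitted) can be sketched in Python as follows. This is only an illustration under stated assumptions: the synapse counts, the partner names, and the helper `major_connections` are hypothetical, and the 5% fraction is assumed here to be computed per BMN type over that type's total output synapses.

```python
def major_connections(synapse_counts, min_fraction=0.05):
    """Keep only edges carrying at least min_fraction of a BMN type's output.

    synapse_counts maps each BMN type to a dict of
    {postsynaptic partner: number of synapses}.
    """
    kept = {}
    for bmn_type, partners in synapse_counts.items():
        total = sum(partners.values())
        kept[bmn_type] = {
            partner: n
            for partner, n in partners.items()
            if total > 0 and n / total >= min_fraction
        }
    return kept


# Hypothetical counts, chosen only to illustrate the filter.
counts = {
    "BM-Fr": {"partner_A": 120, "partner_B": 60, "partner_C": 4},
    "BM-Ant": {"partner_A": 80, "partner_B": 10, "partner_C": 2},
    "BM-MaPa": {"partner_A": 3, "partner_D": 150},
}

edges = major_connections(counts)
for bmn_type, partners in edges.items():
    print(bmn_type, partners)
```

      In this toy input, BM-Fr and BM-Ant retain a shared partner while the weak BM-MaPa edge onto that partner is dropped, which mirrors the kind of pattern the response describes for neighboring versus distant BMN types.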

      Light EM Mapping: A better description of the methods by which this mapping was done would be helpful. Providing a few example parallel representations of the EM and light images in the main figure would help the reader better appreciate the strength of their approach.

      We have done as the Reviewers suggested and added panels to Figure 6 that show examples of the LM and EM image matching (Figure 6A,B). We added two examples that used different methods for labeling the LM imaged BMNs, including MCFO labeling of an individual BM-InOc neuron and driver line labeling of a major portion of BM-InOm neurons using InOmBMN-LexA. These panels are referred to in the first paragraph of the Results section titled “Matching the reconstructed head BMNs with their bristles”. Note that examples for all LM/EM matched BMN types are shown in Figure 6 – figure supplement 2.

      We had provided Figure 6 – figure supplement 2 in the reviewed manuscript that shows all the above requested “parallel representations of the EM and light images”. However, the Reviewer critiques made us realize that the purpose of this figure supplement was not clearly indicated. Therefore, we have revised Figure 6 – figure supplement 2 and its legend to make its purpose clearer. First, we changed the legend title to better highlight its purpose. The legend is now titled: “Matching EM reconstructed BMN projections with light microscopy (LM) imaged BMNs that innervate specific bristles”. Second, we added label designations to the figure panel rows that highlight the LM and EM comparisons. That is, the rows for light microscopy images of BMNs are indicated with LM and the rows for EM reconstructed BMN images are labeled with EM. Reviewer #3 had indicated that it was not clear what labeling methods were used to visualize the LM imaged BM-InOm neurons in Figure 6 – figure supplement 2N. Therefore, we added text to the figure and the legend to better highlight the different methods used. Panels A and B were also cropped to accommodate the above mentioned revisions.

      The manuscript also provides an extensive Materials and methods section that describes the different lines of evidence that were used to assign the reconstructed BMNs as specific types. We changed the title to better highlight the purpose of this methods section to “Matching EM reconstructed BMN projections with light microscopy imaged BMNs that innervate specific bristles”. The evidence used to support the assignment of the different BMN types is also summarized in Figure 6 – figure supplement 3.

      Parallel circuit model: The authors motivate their study with this. We're recommending that they define expectations of such circuitry, its alternatives (including implications for downstream pathways), and behavior before they present their results. We're also recommending that they interpret their behavioural results in the context of these circuits.

      Our primary motivation for doing the experiments described in this manuscript was to help define the neural circuit architecture underlying the parallel model that drives the Drosophila grooming sequence. This manuscript provides a comprehensive assessment of the first layer of this circuit architecture. A byproduct of this work is a contribution that offers immediate utility and significance to the Drosophila connectomics community, namely, the description of the majority of mechanosensory neurons on the head, with their annotation in the recently released whole brain connectome dataset (FlyWire.ai). In writing this manuscript, we tried to balance both of these aims, which proved difficult. We very much appreciate the Reviewers' comments that have highlighted points of confusion in our original draft. We hope that the revised draft is now clearer and more logically presented. We have made revisions to the text and provided a new figure supplement (Figure 1 – figure supplement 1) and new panels in Figure 8. Below we highlight the major revisions.

      (1) The Introduction was revised to more explicitly ground the study in the parallel model, while also removing details that were not pertinent to the experiments presented in the manuscript.

      The first paragraph introduces different features of the parallel model. To better focus the reader on the parts of the model that were being assessed in the manuscript, we removed the following sentences: “Performance order is established by an activity gradient among parallel circuits where earlier actions have the highest activity and later actions have the lowest. A winner-take-all network selects the action with the highest activity and suppresses the others. The selected action is performed and then terminated to allow a new round of competition and selection of the next action.” Note that these sentences are included in the third and fourth paragraphs of the last Discussion section titled “Parallel circuit architecture underlying the grooming sequence”.

      The first paragraph of the Introduction now introduces a bigger picture view of the model that emphasizes the two main features: 1) a parallel circuit architecture that ensures all mutually exclusive actions to be performed in sequence are simultaneously readied and competing for output, and 2) hierarchical suppression among the parallel circuits, where earlier actions suppress later actions.

      (2) Newly added Figure 1 – figure supplement 1 provides a working model of grooming (Reviewer #1 suggestion). We now more strongly emphasize that the study aimed to define the parallel neural circuit architecture underlying the grooming sequence, focusing on the mechanosensory layer of this architecture. In particular, we refer to the new Figure 1 – figure supplement 1 that has been added to better convey the hypothesized grooming neural circuit architecture. Figure 1 – figure supplement 1 is incorporated into the Introduction (paragraphs two, three, and four), the first paragraph of the Results section titled “Somatotopically-organized parallel BMN pathways”, and the second and third paragraphs of the last Discussion section titled “Parallel circuit architecture underlying the grooming sequence”.

      (3) New panels in Figure 8 update the model of parallel circuit organization as it relates to somatotopy (D-F). These panels show the parallel circuits hypothesized by the model, but also indicate convergence, with different possible levels of head resolution for these circuits. We describe above where these panels are referenced in the text.

      (4) We added a new paragraph in the last Discussion section titled “Parallel circuit architecture underlying the grooming sequence” that better incorporates the results from this manuscript into the working model of grooming. This paragraph is shown below.

      Here we define the parallel architecture of BMN types that elicit the head grooming sequence that starts with the eyes and proceeds to other locations, such as the antennae and ventral head. The different BMN types are hypothesized to connect with parallel circuits that elicit grooming of specific locations (described above and shown in Figure 1 – figure supplement 1A,C). Indeed, we identify distinct projections and connectivity among BMNs innervating distant bristles on the head, providing evidence supporting this parallel architecture (Figure 8D-F). However, we also find partially overlapping projections and connectivity among BMNs innervating neighboring bristles. Further, optogenetic activation of BMNs at specific head locations elicits grooming of both those locations and neighboring locations (Figure 9). These findings raise questions about the resolution of the parallel architecture underlying grooming. Are BMN types connected with distinct postsynaptic circuits that elicit aimed grooming of their corresponding bristle populations (e.g. Ant bristles)? Or are neighboring BMN types that innervate bristles in particular head areas connected with circuits that elicit grooming of those areas (e.g. dorsal or ventral head)? Future studies of the BMN postsynaptic circuits will be required to define the resolution of the parallel pathways that elicit aimed grooming.

      Aside from this summary of major concerns, the detailed recommendations are attached below.

      Reviewer #1 (Recommendations For The Authors):

      I appreciate the quality and exhaustive body of work presented in this manuscript. I have a few comments that the authors may want to consider:

      (1) The authors motivate this study by posing that it would allow them to uncover whether the complex grooming behaviour of flies followed a parallel model of circuit function. It would have been nice to have been introduced to what the alternative model might be and what each would mean for organisation of the circuit architecture. Some guiding schematics would go a long way in illustrating this point. Modifying the discussion along these lines would also be helpful.

      We made several revisions to the manuscript that address this recommendation. Among these revisions, we added Figure 1 – figure supplement 1 that includes a working model for grooming. Please see above for a description of these revisions.

      (2) The authors mention the body of work that has mapped head bristles and described somatotopy. It would be useful to discuss in more detail what these studies have shown and highlight where the gaps are that their study fills.

      We made several revisions to the manuscript that address this recommendation. Please see above for a description of these revisions.

      (3) The dye-fills and reconstructions that are single colour could use a boundary to demarcate the SEZ. This would help in orienting the reader.

      We agree with Reviewer #1 that Figure 4 and its supplements could use some indicator that would orient the reader with respect to the dye filled or stochastically labeled neurons. The images are of the entire SEZ in the ventral brain, and in the case of some panels, the background staining enables visualization of the brain (e.g. Figure 4H,M,N). To help orient the reader in this region, we added a dotted line to indicate the approximate SEZ midline. This also enables the reader to more clearly see which of the BMN types cross the midline.

      Midline visual guides were added for Figure 4, Figure 4 – figure supplement 2, Figure 4 – figure supplement 3, Figure 4 – figure supplement 4, Figure 4 – figure supplement 5, Figure 4 – figure supplement 6, Figure 4 – figure supplement 7, Figure 4 – figure supplement 8, Figure 6 – figure supplement 2.

      (4) The comparison between the EM and the fills/clones are not obvious. And particularly because they are not directly determined, it would be nice to have the EM reconstruction alongside the dye-fills. This would work very nicely in the supplementary figure with the multiple fills of the same bristles. I think this would really drive home the point.

      We made several revisions to the manuscript that address this recommendation. Please see above for a description of these revisions.

      (5) Are there unnoticed black error-bars floating around in many of the gray-scale images?

      The black bars were masking white scale bars in the images. We have removed the black bars and remade the images without scale bars. This was done for the following Figures: Figure 4, Figure 4 – figure supplement 2, Figure 4 – figure supplement 3, Figure 4 – figure supplement 4, Figure 4 – figure supplement 5, Figure 4 – figure supplement 6, Figure 4 – figure supplement 7, Figure 4 – figure supplement 8, Figure 6 – figure supplement 2.

      Reviewer #2 (Recommendations For The Authors):

      (1) The only point in the paper where I found myself going back and forth between methods/supp and text was when the authors discuss the clustering. I think it would help the reader if a few sentences about the cosine clustering used for connectivity-based clustering were included in the main text. Also, for NBLAST hierarchical clustering, it would help if some informed metrics could be used for defining cluster numbers (e.g. Braun et al, 2010 PLOS ONE shows how Ward linkage cost could be used for hierarchical clustering).

      Depending on where the cut height is placed on the dendrogram for cosine similarity of BMNs, different features of the BMN type postsynaptic connectivity are captured. As the number of clusters is increased (lower cut height), clustering is mainly among BMNs of the same type, showing that these BMNs have the highest connectivity similarity. As the number of clusters is reduced (higher cut height), BMNs innervating neighboring bristles on the head are clustered, revealing three general clusters corresponding to the dorsal, ventral, and posterior head. This reveals somatotopy-based clustering among same and neighboring BMN types. The cut height shown in Figure 8 and Figure 8 – figure supplement 2 was chosen because it highlighted both of these features.

      The NBLAST clustering shows similar results to the connectivity-based clustering with respect to neighboring and distant BMN types. As the number of clusters increases, BMNs of the same type are clustered, and these types can be further subdivided into morphologically distinct subtypes. As the number of clusters is reduced, the clustering captures neighboring BMNs. Thus, neighboring BMN types showed high morphology similarity (and proximity) with each other, and low similarity with distant BMN types.

      Please see our responses to a Reviewer #3 critique below for further description of the clustering results.
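
      The cut-height behavior described in this response can be illustrated with a minimal Python sketch. It is not the analysis pipeline used for the manuscript: it assumes average linkage on cosine distance (1 minus cosine similarity), so its cut heights are not comparable to the values reported for Figure 8 or Supplementary file 3. The neuron names, the synapse counts, and the helper `cluster_at_height` are hypothetical and chosen only to show how a lower cut groups BMNs of the same type while a higher cut groups neighboring types.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def cluster_at_height(vectors, cut_height):
    """Average-linkage agglomerative clustering on cosine distance.

    Merging stops once the closest pair of clusters is farther apart
    than cut_height, which is equivalent to cutting the dendrogram there.
    """
    clusters = [[name] for name in vectors]

    def distance(c1, c2):
        total = sum(
            1.0 - cosine_similarity(vectors[a], vectors[b])
            for a in c1
            for b in c2
        )
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > cut_height:
            break
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Hypothetical synapse counts onto four postsynaptic partners.
synapse_counts = {
    "BM-Fr_1": [10, 8, 1, 0],
    "BM-Fr_2": [9, 9, 0, 0],
    "BM-Ant_1": [6, 2, 9, 0],
    "BM-Ant_2": [5, 3, 10, 0],
    "BM-MaPa_1": [0, 0, 1, 12],
    "BM-MaPa_2": [0, 1, 0, 11],
}

low_cut = cluster_at_height(synapse_counts, 0.1)
high_cut = cluster_at_height(synapse_counts, 0.6)
print(len(low_cut), "clusters at the low cut:", low_cut)
print(len(high_cut), "clusters at the high cut:", high_cut)
```

      With these toy vectors, the low cut separates the three hypothetical types, and the high cut merges the two types that share partners while leaving the third on its own. A data-driven criterion for choosing the number of clusters, as Reviewer #2 suggests, would replace the hand-picked cut heights used here.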

      On the same lines it would help if the clustering dendrograms were included in the main figure.

      We thank Reviewer #2 for this comment. We have added the dendrogram to Figure 8A, a change that we feel makes this Figure much easier to understand.

      (2) It could help provide intuition if the authors revealed some of the downstream targets and their implication in explaining the behavioral phenotypes.

      While this will be the subject of at least two forthcoming manuscripts, we have added text to the present manuscript that provides insight into BMN postsynaptic targets. Our previous work (Hampel et al. 2015) described a mechanosensory connected neural circuit that elicits grooming of the antennae. While this previous study demonstrated that the Johnston’s organ mechanosensory neurons are synaptically and functionally connected with this circuit, our preliminary analysis indicates that it is also connected with BM-Ant neurons. We hypothesize that there are additional such circuits that are responsible for eliciting grooming of other head locations.

      To better highlight potential downstream targets in the manuscript, we now mention the antennal circuit in the Introduction. This text reads: “In this model, parallel-projecting mechanosensory neurons that respond to stimuli at specific locations on the head or body could connect with somatotopically-organized parallel circuits that elicit grooming of those locations (Figure 1 – figure supplement 1A-C). The previous discovery of a mechanosensory-connected circuit that elicits aimed grooming of the antennae provides evidence of this organization (Hampel 2015). However, the extent to which distinct circuits elicit grooming of other locations is unknown, in part, because the somatotopic projections of the mechanosensory neurons have not been comprehensively defined for the head or body.”

      There is also text in the Discussion that addresses this Reviewer comment. It describes the antennal circuit and mentions the possibility that other similar circuits may exist. This can be found in the third paragraph of the section titled “Circuits that elicit aimed grooming of specific head locations”.

      (3) Authors find that opto activation of BMNs leads to grooming of targeted as well as neighboring areas. Is there any sequence observed here? i.e. first clean targeted area and then clean neighboring area? I wonder if the answer to this is something as simple as common post-synaptic targets which is essentially reducing the resolution of the BMN sensory map. Some more speculation on this interesting result could be helpful.

      We appreciate and agree with this point from Reviewer #2, and have tried to better emphasize the possible implications for grooming that the overlapping projections and connectivity among BMNs innervating neighboring bristles may have. This is now better addressed in the Results and Discussion sections. Below we highlight where this is addressed:

      (1) In the second paragraph of the Results section titled “Activation of subsets of head BMNs elicits aimed grooming of specific locations” we added text that suggests the possibility that grooming of the stimulated and neighboring locations could be due to the overlapping projections and connectivity. This text reads: “This suggested that head BMNs elicit aimed grooming of their corresponding bristle locations, but also neighboring locations. This result is consistent with our anatomical and connectomic data indicating that BMNs innervating neighboring bristles show overlapping projections and postsynaptic connectivity similarity (see Discussion).”

      (2) At the end of the fourth paragraph of the Discussion section titled “A synaptic resolution somatotopic map of the head BMNs”, we added text that alludes to further discussion of this topic. This text reads: “This overlap may have implications for aimed grooming behavior. For example, neighboring BMNs could connect with common neural circuits to elicit grooming of overlapping locations (discussed more below).”

      (3) The fourth paragraph of the Discussion section titled “Circuits that elicit aimed grooming of specific head locations” mentions the possibility of mechanosensory convergence onto common postsynaptic circuits to promote grooming of the stimulated area, along with neighboring areas. This paragraph is below.

      We find that activation of specific BMN types elicits both aimed grooming of their corresponding bristle locations and neighboring locations. This suggests overlap in the locations that are groomed with the activation of different BMN types. Such overlap provides a means of cleaning the area surrounding the stimulus location. Interestingly, our NBLAST and cosine similarity analysis indicates that neighboring BMNs project into overlapping zones in the SEZ and show common postsynaptic connectivity. Thus, we hypothesize that neighboring BMNs connect with common neural circuits (e.g. antennal grooming circuit) to elicit overlapping aimed grooming of common head locations.

      (4) In the new second paragraph of the Discussion section titled “Parallel circuit architecture underlying the grooming sequence” we further discuss the issue of the BMN “sensory map”. This paragraph is below.

      Here we define the parallel architecture of BMN types that elicit the head grooming sequence that starts with the eyes and proceeds to other locations, such as the antennae and ventral head. The different BMN types are hypothesized to connect with parallel circuits that elicit grooming of specific locations (described above and shown in Figure 1 – figure supplement 1A,C). Indeed, we identify distinct projections and connectivity among BMNs innervating distant bristles on the head, providing evidence supporting this parallel architecture (Figure 8D-F). However, we also find partially overlapping projections and connectivity among BMNs innervating neighboring bristles. Further, optogenetic activation of BMNs at specific head locations elicits grooming of both those locations and neighboring locations (Figure 9). These findings raise questions about the resolution of the parallel architecture underlying grooming. Are BMN types connected with distinct postsynaptic circuits that elicit aimed grooming of their corresponding bristle populations (e.g. Ant bristles)? Or are neighboring BMN types that innervate bristles in particular head areas connected with circuits that elicit grooming of those areas (e.g. dorsal or ventral head)? Future studies of the BMN postsynaptic circuits will be required to define the resolution of the parallel pathways that elicit aimed grooming.

      (4) If the authors were to include a summary table that shows all known attributes of each BMN type as columns, that could be very useful as a resource to the community. Table columns could include attributes like "bristle name", "nerve tract", "FlyWire IDs of all segments corresponding to the bristle class", "split-Gal4 line or known enhancer", etc.

      We provided a table that includes much of this information after the manuscript had already gone out for review. We regret that this was not available. This is now provided as Supplementary file 3. This table provides the following information for each reconstructed BMN: BMN name, bristle type, nerve, FlyWire ID, FlyWire coordinates, NBLAST cluster (cut height 1), NBLAST cluster (cut height 5), and cosine cluster (cut height 4.5). Note that the driver line enhancers for targeting specific BMN types are shown in Figure 3I.

      Specific Points:

      Figure 4C-V:

      • I find it a bit difficult to distinguish ipsi- from contra-lateral projections. Maybe indicate the midline as a thin, stippled line?

      We thank Reviewer #2 for this suggestion. We have now added lines in the panels in Figure 4C-V to indicate the approximate location of the midline. We also added lines to the Figure 4 – figure supplements as described above.

      I think this Fig reference is wrong "the red-light stimulus also elicited backward motions with control flies (Figure 6B,C, control, black trace, Video 5)." should be Fig 8B,C

      We have fixed this error.

      Reviewer #3 (Recommendations For The Authors):

      Introduction:

      Motivating this study in terms of understanding the neural mechanisms that execute the parallel model seems to overstate what you will achieve with the current study. If you want to motivate it this way, I suggest focusing on the grooming sequence of the head alone (eyes, antennae, proboscis).

      We made several revisions to the manuscript that address this recommendation. Please see above for a description of these revisions. Please note that many of the revisions focus on the head grooming sequence. We also made minor revisions to the Introduction that further emphasize the focus on head grooming.

      Results:

      Figure 1. Please indicate that this is a male fly in either the figure title or in the figure itself.

      We added a male symbol to Figure 1A.

      Figure 3. Panel J is referenced in the main body text and in the figure caption, but there is no Fig 3J.

      Panel J is shown in the upper right corner of Figure 3. We realize that the placement of this panel is not ideal, but this was the only place that we could fit it. Additionally, the panel works nicely at that location to better enable comparison with panel C. We have revised the text in the Figure 3 legend to better highlight the location of this Figure panel: “Shown in the upper right corner of the figure are the aligned expression patterns of InOmBMN-LexA (red), dBMN-spGAL4 (green), and TasteBMN-spGAL4 (brown).”

      We also added text to a sentence in the Results section titled “Head BMNs project into discrete zones in the ventral brain” that indicates the panel location. This text reads: “To further visualize the spatial relationships between these projections, we computationally aligned the expression patterns of the different driver lines into the same brain space (Figure 3J, upper right corner).”

      Matching the BMNs to EM reconstructions: why cut the dendrogram at H=5? Would be better to determine cluster number using an unbiased method.

      To match the morphologically distinct EM reconstructed BMNs to their specific bristles, we relied on different lines of evidence, including NBLAST results (discussed more below), dye fill/stochastic labeling/driver line labeling matches, published morphology, nerve projection, bristle number, proximity to other BMNs, and postsynaptic connectivity (summarized in Figure 6 – figure supplement 3). The following Materials and methods section provides a detailed description of the evidence used to assign each BMN type in “Matching EM reconstructed BMN projections with light microscopy imaged BMNs that innervate specific bristles”. In many cases, BMN type could be assigned with confidence solely based on morphological comparisons with our light level data (e.g. dye fills), in conjunction with bristle counts to indicate an expected number of BMNs showing similar morphology. Thus, the LM/EM matches and NBLAST clustering were largely complementary.

      The EM reconstructed BMNs were matched as particular BMN types, in part based on examination of the NBLAST data at different cut heights. NBLAST clustering of the BMNs revealed general trends at higher and lower cut heights (Figure 6 – figure supplement 1A, Supplementary file 3). The lowest cut heights included mostly BMNs of the same type innervating the same bristle populations, and smaller clusters that subdivided into morphologically distinct subtypes (see Supplementary file 3 for clusters produced at cut height 1). This revealed that BMNs of the same type tended to show the highest morphological similarity with each other, but they also showed intratype morphological diversity. Higher cut heights produced clusters of BMNs innervating neighboring bristle populations (e.g. ventral head BMNs), showing high morphological similarity among neighboring BMN types.

      We selected the cut height 5 shown in Figure 6 – figure supplement 1A,B because it captures examples of both same and neighboring type clustering. For example, it captures a cluster of mostly BM-Taste neurons (Cluster 16), and neighboring BMN types, including those from the dorsal head (Cluster 14) or ventral head (Cluster 15).
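      The effect of cut height on cluster count can be illustrated with a minimal sketch. It assumes synthetic 2D points standing in for neuron morphologies and Ward linkage from SciPy; it does not use NBLAST scores or the authors' pipeline, and the function name `clusters_at_heights` is hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def clusters_at_heights(points, heights, method="ward"):
    """Cut one dendrogram at several heights and return the labels for each cut."""
    tree = linkage(points, method=method)
    return {h: fcluster(tree, t=h, criterion="distance") for h in heights}


# Synthetic stand-in data (an assumption): three well-separated groups of 20 points.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
points = np.vstack([c + rng.normal(scale=0.3, size=(20, 2)) for c in centers])

labels = clusters_at_heights(points, heights=[0.5, 5.0, 100.0])
n_clusters = {h: len(set(lab)) for h, lab in labels.items()}

# Lower cuts split groups into finer subclusters; higher cuts merge neighboring groups.
for h, n in n_clusters.items():
    print(f"cut height {h}: {n} clusters")
```

      In this toy setting, an intermediate cut recovers the three groups, a low cut subdivides them, and a high cut merges everything, which mirrors the same-type versus neighboring-type trend described above.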

      Based on reviewer comments, we realized that the way we wrote the BMN matching section in the Results indicated more reliance on the NBLAST clustering than what was actually necessary, distorting the way we actually matched the BMNs. Therefore, we softened the first couple of sentences to place less emphasis on the importance of the NBLAST. We also indicated that the readers can find the resulting clusters at different cut heights, referring to Figure 6 – figure supplement 1A and Supplementary file 3. The first two sentences of the first paragraph in the Results section titled “Matching the reconstructed head BMNs with their bristles” now read:

      The reconstructed BMN projections were next matched with their specific bristle populations. The projections were clustered based on morphological similarity using the NBLAST algorithm (example clustering at cut height 5 shown in Figure 6 – figure supplement 1A,B, Supplementary file 3, FlyWire.ai link 2) (Costa et al., 2016). Clusters could be assigned as BMN types based on their similarity to light microscopy images of BMNs known to innervate specific bristles.

      The number of reconstructed BMNs is remarkably similar to what is expected based on bristle counts for each group except for InOm. Why do you think there is such a large discrepancy there?

      We believe that there is a discrepancy between the number of reconstructed BM-InOm neurons and the number expected based on InOm bristle counts because these bristle counts were based on few flies and these numbers appear to be variable. We did not further investigate the numbers of InOm bristles in this manuscript because we only needed an estimate of their numbers, given that there is over an order of magnitude difference in the eye bristles versus any other head bristle population. Therefore, we could relatively easily conclude that the head BMNs were related to the InOm bristles, based on their sheer numbers and their morphology.

      Figure 6 - figure supplement 2N, please describe these panels better. Main text says the upper image is from InOmBMN-LexA, but the figure legend doesn't agree.

      We have added text to the figure legend that now makes the contents of panel 2N clear to the reader. Further, we now indicate in the figure legend for each panel, the method used to obtain the labeled neurons (i.e. fill, MCFO, driver), to avoid similar confusion for the other panels.

      Figure 6 - figure supplement 4D. How frequently is there a mismatch between the number of BMNs for a given type across hemispheres?

      Although the full reconstruction of the BMNs on both sides of the brain was beyond the scope of this work, the BMNs on both sides have since been reconstructed and annotated (Schlegel et al. 2023). We plan to provide more analysis of BMNs on both sides of the brain in a forthcoming manuscript. However, the BMN numbers tend to show agreement on both sides of the brain. The table below shows a comparison between the two sides:

      Author response table 1.

      Figures 6 and 7. It would be helpful to include a reference brain in all panels that show cluster morphology. Without landmarks there is nothing to anchor the eye to allow the reader to see the described differences in BMN projection zones and patterns.

      While we apologize for not making this specific change, we have made revisions to other parts of the manuscript to better highlight the somatotopic organization among the BMNs (revisions described above). Please note that we now provide FlyWire.ai publicly available links that enable readers to view the BMN projections in 3D. They can also toggle a brain mesh on and off to provide spatial reference.

      "BMN somatotopic map": It would be helpful to show or describe in more detail what the unique branch morphology for each zone is. It is quite difficult to appreciate, as the groups also have a lot of overlap. Would the unique regions that the BMN groups innervate be easier to see if you plotted presynaptic sites by group? I am left unsure about whether there is a somatotopic map here.

      We made several revisions to the manuscript that address this recommendation. Please see above for a description of these revisions. Please note that we did not examine the fine branch morphological differences between BMN types having overlapping projections. Showing these differences would require more extensive anatomical analysis that is beyond the scope of this work. For showing definitive somatotopy, we focused on the overt differences between BMNs innervating bristles at distant locations on the head.

      Overall the strict adherence to the parallel model impacts the interpretation of the data. It would be helpful for the authors to discuss which aspects of the current study are consistent with the parallel model and which results are not consistent.

      We made several revisions to the manuscript that address this recommendation. Please see above for a description of these revisions.

      Discussion:

      "Circuits that elicit aimed grooming of specific head locations": In the previous paragraph you mention "BMN types innervating neighboring bristle populations have overlapping projections into zones that correspond roughly to the dorsal, ventral, and posterior head. The overlap is likely functionally significant, as cosine similarity analysis revealed that neighboring head BMN types have common postsynaptic partners. However, overlap between neighboring BMN types is only partial, as they show differing projections and postsynaptic connectivity." Then in this paragraph, you say, "How do the parallel-projecting head BMNs interface with postsynaptic neural circuits to elicit aimed grooming of specific head locations? Different evidence supports the hypothesis that the BMNs connect with parallel circuits that each elicit a different aimed grooming movement (Seeds et al., 2014)." The overlapping postsynaptic BMN connectivity seems in conflict with the claim that the circuits are parallel.

      We apologize for this confusion. We now better describe this apparent discrepancy between our results and the parallel model of grooming behavior. We made several revisions to the manuscript that address this recommendation. Please see above for a description of these revisions.
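      The cosine similarity comparison of postsynaptic partners quoted above can be sketched as follows. The synapse counts are invented for illustration and are not taken from the connectome; each vector lists hypothetical synapse counts from one BMN type onto the same four downstream partners.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two connectivity vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical synapse counts onto four postsynaptic partners (illustrative only).
neighbor_a = [10, 5, 0, 0]
neighbor_b = [8, 6, 1, 0]
distant_c = [0, 0, 7, 9]

sim_neighbors = cosine_similarity(neighbor_a, neighbor_b)
sim_distant = cosine_similarity(neighbor_a, distant_c)
print(f"neighboring types: {sim_neighbors:.2f}, distant types: {sim_distant:.2f}")
```

      Under these assumed counts, neighboring types score close to 1 and distant types score 0, which is the pattern that would indicate shared partners locally but separate circuits at a broader scale.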

      We have made additional changes to the manuscript:

      (1) We added Supplementary file 2 that includes links for downloading the image stacks used to generate panels in Figure 1, Figure 2, Figure 3, Figure 4, and figure supplements for these figures. These image stacks are stored in the Brain Image Library (BIL). Rows in the spreadsheet correspond to each image stack. Columns provide information about each stack including: figure panels that each image stack contributed to, image stack title, DOI for each stack (link provides metadata for each stack and file download link), image stack file name, genotype of imaged fly, and information about image stack. References to this file have been made at different locations throughout the text and Figure legends. We also added a section on the BIL data in the Materials and methods entitled “Light microscopy image stack storage and availability”. Old Supplementary file 2 has been renamed Supplementary file 3.

      (2) We added a new reference for FlyWire.ai (Dorkenwald et al. 2023) that was posted as a preprint during the revision of this manuscript.

    3. Reviewer #2 (Public Review):

      The authors combine genetic tools, dye fills and connectome analysis techniques to generate a "first-of-its-kind", near complete, synaptic resolution map of the head bristle neurons of Drosophila. While some of the BMN anatomy was already known based on previous work by the authors and other researchers, this is the first time a near complete map has been created for the head BMNs at electron microscopy resolution.

      Strengths:

      (1) The authors cleverly use techniques that allow moving back and forth between periphery (head bristle location) and brain, as well as moving between light microscopy and electron microscopy data. This allows them to first characterize the pathways taken by different head BMNs to project to the brain and also characterize anatomical differences among individual neurons at the level of morphology and connectivity.

      (2) The work is very comprehensive and results in a near complete map of all head BMNs.

      (3) Authors also complement this anatomical characterization with a first-level functional analysis using optogenetic activation of BMNs that results in expected directed grooming behavior.

      Weaknesses:

      (1) While not strictly needed here, it could help provide context if authors revealed some of the important downstream pathways that could explain optogenetics behavioral phenotypes: This point was addressed by authors in the revisions and I agree a detailed description of downstream circuits is not needed at this point.

      (2) In contrast to the rigorous quantitative analysis of the anatomical data, the behavioral data is analyzed using much more subjective methods. While I do not think it is necessary to perform a rigorous analysis of behaviors in this anatomy focused manuscript, the conclusions based on behavioral analysis should be treated as speculative in the current form e.g. calling "nodding + backward motions" as an avoidance response is not justified as it currently stands. Strong optogenetic activation could lead to sudden postural changes that due to purely biomechanical constraints could lead to a couple of backward steps as seen in the example videos. Moreover since the quantification is manual, it is not clear what the analyst interprets as backward walking or nodding. Interpretation is also concerning because controls show backward walking (although in fewer instances based on subjective quantification): This point was addressed by the authors during revisions and I'm mostly satisfied with their response, where authors agree that the behavioral results are currently used to speculate about the role of BMNs in aversive behaviors. Still, the fact that controls show some "backward motions" is a bit concerning when talking about "significant differences" between control and test groups based on manual annotations and I would recommend future studies focusing on these behaviors to use more unbiased quantitative analysis wherever possible.

      Summary:

      The authors end up generating a near-complete map of head BMNs that will serve as a long-standing resource to the Drosophila research community. This will directly shape future experiments aimed at modeling or functionally analyzing the head grooming circuit to understand how somatotopy guides behaviors. I appreciate the authors taking the time to revise the manuscript and address reviewer concerns.

    4. Reviewer #3 (Public Review):

      Eichler et al. set out to catalog the mechanosensory bristles of the fly head in an effort to understand the extent to which their organization is consistent with the parallel model of hierarchical suppression in the context of grooming behavior. They map the locations of the mechanosensory bristles on the fly head, examine the axonal morphology of the bristle mechanosensory neurons (BMNs) that innervate them, and match these to electron microscopy reconstructions of the same BMNs in a previously published EM volume of the female adult fly brain. They use BMN synaptic connectivity information to create clusters of BMNs that they show occupy different regions of the subesophageal zone brain region and use optogenetic activation of subsets of BMNs to evaluate the behaviors evoked by specific activation of BMN subpopulations innervating the head.

      The authors have beautifully cataloged the mechanosensory bristles and the projection paths and patterns of the corresponding BMN axons in the brain using detailed and painstaking methods. The result is a neuroanatomy dataset that will be an important community resource. To match BMNs reconstructed in an electron microscopy volume of the adult fly brain, the authors matched clustered reconstructed BMNs with light-level BMN classes observed using precise dye-fills and stochastic labeling techniques. The authors then employ a variety of clustering methods to demonstrate that BMN populations that innervate different regions of the head project into the subesophageal zone and terminate in distinctive yet, in some cases, partially overlapping zones. By clustering BMNs on the basis of their synaptic partners, the authors find that BMNs from distant areas of the head have non-overlapping synaptic partners while those from neighboring areas have overlapping synaptic partners. This result calls into question the scale at which the parallel model of hierarchical suppression may be operating. Finally, the authors use tools that were generated during the light-level characterization of BMN projections to show that activating BMNs that innervate specific areas of the head leads to grooming of the innervated regions and neighboring regions, consistent with the observed overlap in downstream circuits between BMNs innervating neighboring regions of the head. This result suggests that while the parallel model could be operating on a broad scale, additional circuit mechanisms may be operating on a finer scale to produce grooming of the area surrounding the source of mechanosensory input.

      This work will have a positive impact on the field by contributing a complete accounting of the mechanosensory bristles of the fruit fly head, describing the brain projection patterns of the BMNs that innervate them, and linking them to BMN sensory projections in an electron microscopy volume of the adult fly brain. It will also have a positive impact on the field by providing genetic tools to help functionally subdivide the contributions of different BMN populations to circuit computations and behavior. This contribution will pave the way for further mechanistic study of central circuits that subserve grooming.

    1. Author Response

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (recommendations for the authors):

      Additional suggestions for improvement are noted below:

      (1) Additional 1. Lns 261-262, as well as abstract: The term 'aerobic fermentation' is not accurate in the context of this manuscript. This terminology should be reserved for conditions where lactate production is observed under optimal aerobic conditions. This is not the case in this study. More lactate was observed in the agr mutant only when cells were grown under microaerobic conditions, where some level of fermentation would be expected to be active (esp. if nitrate is not provided in media).

      We modified the text by deleting reference to the “aerobic” fermentation as suggested by the reviewer:

      Line 93 (abstract): “Deletion of agr increased both respiration and [deleted: aerobic] fermentation but decreased ATP levels and growth, suggesting that Δagr cells assume a hyperactive metabolic state in response to reduced metabolic efficiency.”

      Line 184: “Collectively, these data suggest that Δagr increases respiration and [deleted: aerobic] fermentation to compensate for low metabolic efficiency.”

      (2) Additionally, the authors' statement, 'The tendency of Δagr cells to forgo the additional ATP yield from acetate production in favor of NAD+-generating lactate (23, 24) underscores the importance of redox balance in Δagr cells,' appears contradictory to the data presented in Fig 5, where the Δagr mutant demonstrates an approximately threefold increase in acetate production during exponential growth compared to the wild-type strain. A clarification or adjustment in the manuscript may be necessary to ensure consistency and accurate interpretation.

      In glucose-fermenting S. aureus, pyruvate can serve as an electron acceptor, generating lactate from lactate dehydrogenases. Acetyl-CoA production proceeds via the pyruvate formate-lyase reaction, which converts pyruvate to formate rather than CO2 and thus does not consume oxidized NAD+. Thus, at a general level, the tendency of fermenting cells to forgo the additional ATP yield from acetate production in favor of NAD+-generating ethanol synthesis underscores the importance of redox balance when respiration is suboptimal. This is especially true for fermenting Δagr strains, as evidenced by increased lactate production compared to their relatively ATP replete wild-type parental strains. However, in the interest of clarity, we removed the sentence in question, because it is not necessary and potentially confusing, and because the additional context it requires would detract from the manuscript by disrupting its sense of narrative and brevity.

      (3) Ln 277-285: There still are errors in how this paragraph is worded. What the authors stated in the 'response to the reviewers' (question 13) and the changes they made in the text are different. Here again, the response to question 13 suggested the following, "Collectively, these observations suggest that a surge in NADH production and reductive stress in the Δagr strain induces a burst in respiration, but levels of NADH are saturating, thereby driving fermentation in the presence of oxygen." That bit of it where the authors suggest that fermentation was activated because NADH was saturating is only true under microaerobic conditions and not under oxygen rich conditions.

      Reviewer #1 (comment under Review): Data presented in Figure 5 suggest the opposite - a surge in NADH accumulation leading to a decrease in the NAD/NADH ratio, rather than a surge in the 'consumption' of NADH. Clarifying this point in the manuscript would ensure accurate representation of the findings.

      Responses to Comments 3 and a comment in the Review have been combined.

      Line 280: We thank the Reviewer for their attention to detail in picking up our error in response to question 13 related to the difference in the revised text and “response to reviewers”. We modified the text accordingly.

      “Microaerobic conditions” and “consumption”: We have modified the wording and fixed the error with respect to “consumption” as pointed out by the reviewer (deleted text is marked in brackets):

      Line 285: “Collectively, these observations suggest that a surge in NADH [deleted: consumption] accumulation and reductive stress in the Δagr strain induces a burst in respiration, but levels of NADH are saturating, thereby driving fermentation under microaerobic conditions [deleted: in the presence of oxygen].”

      Reviewer #2 (recommendations for the authors):

      (1) The authors are requested to revise 'we expected a lower NAD+/NADH' in line 280 to 'we expected a higher NAD+/NADH.' Additionally, what was the glucose concentration in TSB media?

      NAD+/NADH: We thank the Reviewer for their attention to detail in picking up our error. Our responses to Reviewer 1, Comment 3 above addresses this issue.

      Glucose: We modified the Methods as suggested.

    2. eLife assessment

      This important study outlines how the agr quorum sensing system in Staphylococcus aureus confers long-lived protection against oxidative stress, thereby linking bacterial metabolism to virulence in this pathogen. While the findings, which are supported by solid data, seem at first glance to contradict earlier findings that show increased fitness of agr mutants under oxidative stress, the core conclusions of the study are well-substantiated. The topic of the paper holds broad relevance to microbiologists, especially those focusing on host-pathogen interactions and bacterial responses to ROS.

    3. Reviewer #1 (Public Review):

      As a pathogen, S. aureus has evolved strategies to evade the host's immune system. It effectively remains 'under the radar' in the host until it reaches high population densities, at which point it triggers virulence mechanisms, enabling it to spread within the host. The agr quorum sensing system is central to this process, as it coordinates the pathogen's virulence in response to its cell density.

      In this study, Podkowik and colleagues suggest that cells activating agr signaling also benefit from protection against H2O2 stress, whereas inactivation of agr increases cell death. The underlying cause of this lack of protection is tied to an ATP deficit in the agr mutant, leading to increased glucose consumption and NADH production, ultimately resulting in a redox imbalance. In response to this imbalance, the agr mutant increases respiration, resulting in the endogenous production of ROS which synergizes with H2O2 to mediate killing of the agr mutant. Suppressing respiration in the agr mutant restored protection against H2O2 stress.

      Additionally, the authors establish that agr-dependent protection against oxidative stress is also linked to RNAIII activation, and the subsequent block of Rot translation. However, the specific protective genes regulated by Rot remain unidentified. Thus, according to the evidence provided, agr triggers intrinsic mechanisms that not only decrease harmful ROS production within the cell but also alleviate its detrimental effects.

      Interestingly, these protective mechanisms are long-lived, and guard the cells against external oxidative stressors such as H2O2, even after the agr system has been 'turned off' in the population.

    4. Reviewer #2 (Public Review):

      In their study, Podkowik et al. elucidate the protective role of the accessory gene regulator (agr) system in Staphylococcus aureus against hydrogen peroxide (H2O2) stress. Their findings demonstrate that agr safeguards the bacterium by controlling the accumulation of reactive oxygen species (ROS), independent of agr activation kinetics. This protection is facilitated through a regulatory interaction between RNAIII and Rot, impacting virulence factor production and metabolism, thereby influencing ROS levels. Notably, the study highlights the remarkable adaptive capabilities of S. aureus conferred by agr. The protective effects of agr extend beyond the peak of agr transcription at high cell density, persisting even during the early log-phase. This indicates the significance of agr-mediated protection throughout the infection process. The absence of agr has profound consequences, as observed by the upregulation of respiration and fermentation genes, leading to increased ROS generation and subsequent cellular demise. Interestingly, the study also reveals divergent effects of agr deficiency on susceptibility to hydrogen peroxide compared to ciprofloxacin. While agr deficiency heightens vulnerability to H2O2, it also upregulates the expression of bsaA, countering the endogenous ROS induced by ciprofloxacin. These findings underscore the complex and context-dependent nature of agr-mediated protection. Furthermore, in vivo investigations using murine models provide valuable insights into the importance of agr in promoting S. aureus fitness, particularly in the context of neutrophil-mediated clearance, with notable emphasis on the pulmonary milieu. Overall, this study significantly advances our understanding of agr-mediated protection in S. aureus and sheds light on the sophisticated adaptive mechanisms employed by the bacterium to fortify itself against oxidative stress encountered during infection.

      The conclusions drawn in this paper are generally well-supported by the data.

    1. eLife assessment

      This fundamental study substantially advances our understanding of sibling chimerism in marmosets by demonstrating that chimerism is limited to hematopoietic cells. The evidence supporting these findings is compelling, demonstrated through comprehensive analyses, including single-cell RNA-seq data from multiple individuals and tissues. The work will be of broad interest to many fields of biology.

    2. Reviewer #1 (Public Review):

      Summary:

      Del Rosario et al characterized the extent and cell types of sibling chimerism in marmosets. To do so, they took advantage of the thousands of SNPs that are transcribed in single-nucleus RNA-seq (snRNA-seq) data to identify the sibling genotype of origin for all sequenced cells across 4 tissues (blood, liver, kidney, and brain) from many marmosets. They found that chimerism is prevalent and widespread across tissues in marmosets, which has previously been shown. However, their snRNA-seq approach allowed them to identify precisely which cells were of sibling origin, and which were not. In doing so they definitively show that sibling chimerism across tissues is limited to cells of myeloid and lymphoid lineages. The authors then focus on a large sample of microglia sequenced across many brain regions to quantify: (1) variation in chimerism across brain regions in the same individual, and (2) the relative importance of genetic vs. environmental context on microglia function/identity.

      (1) Much like across different tissues in the same individual, they found that the proportion of chimeric microglia varies across brain regions collected from the same individuals (as well as differing from the proportion of sibling cells found in the blood of the same animals), suggesting that cells from different genetic backgrounds may differ in their recruitment and/or proliferation across regions and local tissue contexts, or that this may be linked to stochastic bottleneck effects during brain development.

      (2) Their (admittedly smaller sample size) analyses of host-sibling gene expression showed that the local environment dominates genotype.

      All told, this thoughtful and thorough manuscript accomplishes two important goals. First, it all but closes a previously open question on the extent and cell origins of sibling chimerism. Second, it sets the stage for using this unique model system to examine, in a natural context, how genetic variation in microglia may impact brain development, function, and disease.

      The conclusions of this paper are well supported by the data, and the authors exert appropriate care when extrapolating their results that come from smaller samples. However, there are a few concerns that should be addressed.

      The "modest correlation" mentioned in lines 170-172 does not take into account the uncertainty in estimates of each chimeric cell proportion (although the plot shows those estimates nicely). This is particularly important for the macrophages, which are far less abundant. Perhaps a more appropriate way to model this would be in a binomial framework (with a random effect for individuals of origin). Here, you could model the sibling identity of each macrophage as a function of the proportion of sibling-origin microglia and then directly estimate the percent variance explained.
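      The point about uncertainty in each chimeric cell proportion can be illustrated with a minimal sketch. It computes a Wilson score interval for a binomial proportion, which is a simpler stand-in for the mixed-effects binomial model suggested above, and the cell counts are hypothetical rather than taken from the manuscript.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives roughly 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half, center + half


# Hypothetical counts: the same 30% sibling fraction observed in few vs. many cells.
macrophage_ci = wilson_interval(successes=3, total=10)      # rare cell type
microglia_ci = wilson_interval(successes=300, total=1000)   # abundant cell type

print("macrophages:", macrophage_ci)
print("microglia:  ", microglia_ci)
```

      With identical point estimates, the interval for the rarer cell type is far wider, which is why a correlation computed on point estimates alone can understate the noise in the macrophage proportions.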

      A similar (albeit more complicated because of the number of regions being compared) approach could be applied to more rigorously quantify the variation in chimerism across brain regions (L198-215; Figure 4). This would also help to answer the question of whether specific brain regions are more "amenable" to microglia chimerism than others.

      While the sample size is small, it would be exciting to see if any microglia eQTL are driven by sibling chimerism across the marmosets.

      L290-292: The authors should propose ways in which they could test the two different explanations proposed in this paragraph. For instance, a simulation-based modeling approach could potentially differentiate more stochastic bottleneck effects from recruitment-like effects.
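      One way to frame the bottleneck side of such a simulation is sketched below. It assumes each brain region is seeded by a fixed number of founder cells drawn independently from a pool with a given sibling fraction, and that the region's final fraction equals its founder fraction; these are simplifying assumptions for illustration, the function name is hypothetical, and recruitment-like effects are not modeled.

```python
import random
import statistics


def simulate_bottleneck(sibling_fraction, n_founders, n_regions, rng):
    """Sibling-cell fraction per region when each region is seeded by n_founders cells."""
    fractions = []
    for _ in range(n_regions):
        sibling = sum(1 for _ in range(n_founders) if rng.random() < sibling_fraction)
        fractions.append(sibling / n_founders)
    return fractions


rng = random.Random(0)  # fixed seed so the sketch is reproducible
tight = simulate_bottleneck(0.3, n_founders=5, n_regions=200, rng=rng)
loose = simulate_bottleneck(0.3, n_founders=500, n_regions=200, rng=rng)

var_tight = statistics.pvariance(tight)
var_loose = statistics.pvariance(loose)
print(f"variance across regions, 5 founders:   {var_tight:.4f}")
print(f"variance across regions, 500 founders: {var_loose:.5f}")
```

      Under these assumptions, a tight bottleneck produces large region-to-region variation with no consistent regional bias, so observed variance that exceeds or is structured differently from this expectation would point toward region-specific recruitment or proliferation.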

      While intriguing, the gene expression comparison (Figure 5) is extremely underpowered. It would be helpful to clarify this and note the statistical thresholds used for identifying DEGs (the black points in the figure).

    3. Reviewer #2 (Public Review):

      Summary:

      This manuscript reports a novel and quite important study of chimerism among common marmosets. As the authors discuss, it has been known for years that marmosets display chimerism across a number of tissues. However, as the authors also recognize, the scope and details of this chimerism have been controversial. Some prior publications have suggested that the chimerism only involves cells derived from hematopoietic stem cells, while other publications have suggested more cell types can also be chimeric, including a wide range of cell types present in multiple organs. The present authors address this question and several other important issues by using snRNA-seq to track the expression of host and sibling-derived mRNAs across multiple tissues and cell types. The results are clear and provide strong evidence that all chimeric cells are derived from hematopoietic cell lineages.

      This work will have an impact on studies using marmosets to investigate various biological questions but will have the biggest impact on neuroscience and studies of cellular function within the brain. The demonstration that microglia and macrophages from different siblings from a single pregnancy, with different genomes expressing different transcriptomes, are commonly present within specific brain structures of a single individual opens a number of new opportunities to study microglia and macrophage function as well as interactions between microglia, macrophages, and other cell types.

      Strengths:

      The paper has a number of important strengths. This analysis employs the first unambiguous approach providing a clear answer to the question of whether sibling-derived chimeric cells arise only from hematopoietic lineages or from a wider array of embryonic sources. That is a long-standing open question and these snRNA-seq data seem to provide a clear answer, at least for the brain, liver, and kidney. In addition, the present authors investigate quantitative variation in chimeric cell proportions across several dimensions, comparing the proportion of chimeric cells across individual marmosets, across organs within an individual, and across brain regions within an individual. All these are significant questions, and the answers have important implications for multiple research areas. Marmosets are increasingly being used for a range of neuroscience studies, and a better understanding of the process that leads to the chimerism of microglia and macrophages in the marmoset brain is a valuable and timely contribution. But this work also has implications for other lines of study. Third, the snRNA-seq data will be made available through the Brain Initiative NeMO portal and the software used to quantify host vs. sibling cell proportions in different biosamples will be available through GitHub.

      Weaknesses:

      I find no major weaknesses, but several minor ones. First, the main text of the manuscript provides no information about the specific animals used in this study, other than sex. Some basic information about the sources of animals and their ages at the time of study would be useful within the main paper, even though more information will be available in the supplementary material. Second, it is not clear why only 14 pairs of animals were used for estimating the correlation of chimerism levels in microglia and macrophages. Is this lower than the total number of pairwise comparisons possible in order to avoid using non-independent samples? Some explanation would be helpful. Finally, I think more analysis of the consistency and variability of gene expression in microglia across different regions of the brain would be valuable. Are there genetic pathways expressed similarly in host and sibling microglia, regardless of region of the brain? Are there pathways that are consistently expressed differently in host vs sibling microglia regardless of brain region?

    1. eLife assessment

      This important study uses citizen science-generated diversity records and quantitative methodologies to improve species distribution estimates. This combination of fields, technologies, and methodologies is solid and improves species distribution maps formerly based solely on limited data gathered by scientists in traditional ways/surveys. This paper will be of interest to researchers interested in citizen science and new sources of big data in biodiversity, and to biogeographers exploring the distributions of species on the planet.

    2. Reviewer #1 (Public Review):

      Summary:

      The study presented by Atsumi et al. uses smartphone-driven, community-sourced data to enhance biodiversity monitoring. The idea is to leverage the widespread use of smartphones to gather data from the community quickly, contributing to a more comprehensive understanding of biodiversity. The authors discuss the importance of ecosystem services linked to biodiversity and the threats posed by human activities, and they emphasize the need for comprehensive biodiversity data to implement the Kunming-Montreal Global Biodiversity Framework. The 'Biome' mobile app, launched in Japan, uses species identification algorithms and gamification and has gathered over 6 million observations since 2019. While community-sourced data may have biases, incorporating it into Species Distribution Models (SDMs) improves accuracy, especially for endangered species. The app covers urban-natural gradients uniformly, enhancing traditional survey data biased towards natural areas. Combining these sources provides valuable insights into species distributions for conservation, protected area designation, and ecosystem service assessment.

      Strengths:

      The use of a smartphone app ('Biome') for community-driven species occurrence data collection represents an innovative and inclusive approach to biodiversity monitoring, leveraging the widespread use of smartphones. The app has successfully accumulated a large volume of species occurrence data since its launch in 2019, showcasing its effectiveness in rapidly gathering information from diverse locations. Despite challenges with certain taxa, the study highlights high species identification accuracy, especially for birds, reptiles, mammals, and amphibians, making the 'Biome' app a reliable tool for species observation. The integration of community-sourced data into Species Distribution Models (SDMs) improves the accuracy of predicting species distributions. This has implications for conservation planning, including the designation of protected areas and assessment of ecosystem services. The rapid accumulation of data and advancements in machine learning methods open up opportunities for conducting time-series analyses, contributing to the understanding of ecosystem stability and interaction strength over time. The study emphasizes the collaborative nature of the platform, fostering collaboration among diverse stakeholders, including local communities, private companies, and government agencies. This inclusive approach is essential for effective biodiversity assessment and decision-making. The platform's engagement with various stakeholders, including local communities, supports biodiversity assessment, management planning, and informed decision-making. Additionally, the app's role in fostering nature-positive awareness in society is highlighted as a significant contribution to creating a sustainable society.

      Weaknesses:

      While the study makes significant contributions to biodiversity monitoring, it also has some weaknesses. Firstly, relying on smartphone-driven, community-sourced data may introduce spatial and taxonomic biases. The 'Biome' app, for example, showed lower accuracy for certain taxa like seed plants, molluscs, and fishes, potentially impacting the reliability of the gathered data. Furthermore, the effectiveness of Species Distribution Models (SDMs) relies on the assumption that biases in community-sourced data can be adequately accounted for. The unique distribution patterns of the 'Biome' data, covering urban-natural gradients uniformly, might not fully represent the diversity of certain ecosystems, potentially leading to inaccuracies in the models. Moreover, the divergence in data distribution patterns along environmental gradients between 'Biome' data and traditional survey data raises concerns. The app data shows a more uniform distribution across natural-urban gradients, while traditional data is biased towards natural areas. This discrepancy may impact the representation of certain ecosystems and influence the accuracy of the SDMs. While the integration of 'Biome' data into SDMs improves accuracy, the study notes that controlling the sampling efforts is crucial. Spatially biased sampling efforts in community-sourced data need careful consideration, and efforts to control biases are essential for reliable predictions.

    1. Reviewer #3 (Public Review):

      This study explores sensory prediction errors in the sensory cortex. It focuses on the question of how these signals are shaped by non-hierarchical interactions, specifically multimodal signals arising from same-level cortical areas. The authors used 2-photon imaging of mouse auditory cortex in head-fixed mice that were presented with sounds and/or visual stimuli while moving on a ball. First, responses to pure tones, visual stimuli, and movement onset were characterized. Then, the authors made the running speed of the mouse predictive of sound intensity and/or visual flow. Mismatches were created through the interruption of sound and/or visual flow for 1 second while the animal moved, disrupting the expected sensory signal given the speed of movement. As a control, the same sensory stimuli triggered by the animal's movement were presented to the animal decoupled from its movement. The authors suggest that auditory responses to the unpredicted silence reflect mismatch responses. The finding that these mismatch responses were enhanced when the visual flow was congruently interrupted indicates a cross-modal influence on prediction error signals.

      This study's strengths are the relevance of the question and the design of the experiment. The authors are experts in the techniques used. However, the analysis exploits neither the full power of the experimental design nor the population activity recorded with 2-photon imaging, leaving open the question of to what extent the responses that the authors call mismatch responses are simply sensory responses to the sound interruption. The auditory system is sensitive to transitions, and indeed responses to the interruption of the sound are similar in quality, if not quantity, in the predictive and the control situation.

    2. Reviewer #2 (Public Review):

      In this study, Solyga and Keller use multimodal closed-loop paradigms in conjunction with multiphoton imaging of cortical responses to assess whether and how sensorimotor prediction errors in one modality influence the computation of prediction errors in another modality. Their work addresses an important open question pertaining to the relevance of non-hierarchical (lateral cortico-cortical) interactions in predictive processing within the neocortex.

      Specifically, they monitor GCaMP6f responses of layer 2/3 neurons in the auditory cortex of head-fixed mice engaged in VR paradigms where running is coupled to auditory, visual, or audio-visual sensory feedback. The authors find strong auditory and motor responses in the auditory cortex, as well as weak responses to visual stimuli. Further, in agreement with previous work, they find that the auditory cortex responds to audiomotor mismatches in a manner similar to that observed in visual cortex for visuomotor mismatches. Most importantly, while visuomotor mismatches by themselves do not trigger significant responses in the auditory cortex, simultaneous coupling of audio-visual inputs to movement non-linearly enhances mismatch responses in the auditory cortex.

      Their results thus suggest that prediction errors within a given sensory modality are non-trivially influenced by prediction errors from another modality. These findings are novel, interesting, and important, especially in the context of understanding the role of lateral cortico-cortical interactions and in outlining predictive processing as a general theory of cortical function.

      In its current form, the manuscript lacks sufficient description of methodological details pertaining to the closed-loop training and the overall experimental design. In several scenarios, while the results per se are convincing and interesting, their exact interpretation is challenging given the uncertainty about the actual experimental protocols (more on this below). Second, the authors are laser-focused on sensorimotor errors (mismatch responses) and focus almost exclusively on what happens when stimuli deviate from the animal's expectations.

      While the authors consistently report strong running-onset responses (during open-loop) in the auditory cortex in both auditory and visual versions of the task, they do not discuss their interpretation in the different task settings (see below), nor do they analyze how these responses change during closed-loop i.e. when predictions align with sensory evidence.

      However, I believe all my concerns can be easily addressed by additional analyses and incorporation of methodological details in the text.

      Major concerns:

      (1) Insufficient analysis of audiomotor mismatches in the auditory cortex:

      Lack of analysis of the dependence of audiomotor mismatches on the running speed: it would be helpful if the authors could clarify whether the observed audiomotor mismatch responses are just binary or scale with the degree of mismatch (i.e. running speed). Along the same lines, how should one interpret the lack of dependence of the playback halt responses on the running speed? Shouldn't we expect that during playback, the responses of mismatch neurons scale with the running speed?
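
      The binary-versus-graded question raised above could be checked with a per-trial correlation between running speed at mismatch onset and response amplitude. The following is only a minimal sketch under stated assumptions: the variable names and the numbers are hypothetical illustrations, not data or code from the manuscript.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical per-trial values (assumptions for illustration only).
speed_cm_s = [2.0, 5.0, 8.0, 12.0, 15.0]        # running speed at mismatch onset
graded_response = [0.1 * s for s in speed_cm_s]  # response scales with speed
flat_response = [0.5, 0.6, 0.4, 0.6, 0.5]        # response independent of speed

r_graded = pearson_r(speed_cm_s, graded_response)
r_flat = pearson_r(speed_cm_s, flat_response)
```

      A correlation near 1 would point to a graded mismatch signal, whereas a correlation near 0 would be more consistent with a binary response; with real data one would also want a significance test and a control for trial-to-trial variability.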

      Slow temporal dynamics of audiomotor mismatches: despite the transient nature of the mismatches (1s), auditory mismatch responses last for several seconds. They appear significantly slower than previous reports for analogous visuomotor mismatches in V1 (by the same group, using the same methods) and even in comparison to the multimodal mismatches within this study (Figure 4C). What might explain this sustained activity? Is it due to a sustained change in the animal's running in response to the auditory mismatch?

      (2) Insufficient analysis and discussion of running onset responses during audiomotor sessions: The authors report strong running-onset responses during open-loop in identified mismatch neurons. They also highlight that these responses are in agreement with their model of subtractive prediction error, which relies on subtracting the bottom-up sensory evidence from top-down motor-related predictions. I agree, and, thus, assume that running-onset responses during the open loop in identified 'mismatch' neurons reflect the motor-related predictions of sensory input that the animal has learned to expect. If this is true, one would expect that such running-onset responses should dampen during closed-loop, when sensory evidence matches expectations and therefore cancels out this prediction. It would be nice if the authors test this explicitly by analyzing the running-related activity of the same neurons during closed-loop sessions.

      (3) Ambiguity in the interpretation of responses in visuomotor sessions.

      Unlike for auditory stimuli, the authors show that there are no obvious responses to visuomotor mismatches or playback halts in the auditory cortex. However, the interpretation of these results is somewhat complicated by the uncertainty related to the training history of these mice. Were these mice exclusively trained on the visuomotor version of the task or also on the auditory version? I could not find this info in the Methods. From the legend for Figure 4D, it appears that the same mice were trained on all versions of the task. Is this the case? If yes, what was the training sequence? Were the mice first trained on the auditory and then the visual version?

      The training history of the animals is important to outline the nature of the predictions and mismatch responses that one should expect to observe in the auditory cortex during visuomotor sessions. Depending on whether the mice in Figure 3 were trained on visual only or both visual and auditory tasks, the open-loop running onset responses may have different interpretations.

      a) If the mice were trained only on the visual task, how should one interpret the strong running onset responses in the auditory cortex? Are these sensorimotor predictions (presumably of visual stimuli) that are conveyed to the auditory cortex? If so, what may be their role?

      b) If the mice were also trained on the auditory version, then a potential explanation of the running-onset responses is that they are audiomotor predictions lingering from the previously learned sensorimotor coupling. In this case, one should expect that in the visual version of the task, these audiomotor predictions (within the auditory cortex) would not get canceled out even during the closed-loop periods. In other words, mismatch neurons should constantly be in an error state (more active) in the closed-loop visuomotor task. Is this the case?

      If so, how should one then interpret the lack of a 'visuomotor mismatch' aligned to the visual halts, over and above this background of continuous errors?

      As such, the manuscript would benefit from clearly stating in the main text the experimental conditions such as training history, and from discussing the relevant possible interpretations of the responses.

      (4) Ambiguity in the interpretation of responses in multimodal versus unimodal sessions.

      The authors show that multimodal (auditory + visual) mismatches trigger stronger responses than unimodal mismatches presented in isolation (auditory only or visual only). Further, they find that even though visual mismatches by themselves do not evoke a significant response, co-presentation of visual and auditory stimuli non-linearly augments the mismatch responses suggesting the presence of non-hierarchical interactions between various predictive processing streams.

      In my opinion, this is an important result, but its interpretation is nuanced given insufficient details about the experimental design. It appears that responses to unimodal mismatches are obtained from sessions in which only one stimulus is presented (unimodal closed-loop sessions). Is this actually the case? An alternative and perhaps cleaner experimental design would be to create unimodal mismatches within a multimodal closed-loop session while keeping the other stimulus still coupled to the movement.

      Given the current experiment design (if my assumption is correct), it is unclear if the multimodal potentiation of mismatch responses is a consequence of nonlinear interactions between prediction/error signals exchanged across visual and auditory modalities. Alternatively, could this result from providing visual stimuli (coupled or uncoupled to movement) on top of the auditory stimuli? If it is the latter, would the observed results still be evidence of non-hierarchical interactions between various predictive processing streams?
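
      To make the term "nonlinear" concrete, one simple measure is the difference between the multimodal mismatch response and the linear sum of the two unimodal responses. The sketch below is an assumption-laden illustration: the function name and the example values are hypothetical and do not come from the manuscript.

```python
def supralinearity_index(multimodal, auditory_only, visual_only):
    """Multimodal response minus the linear sum of the unimodal responses.

    A positive value means the multimodal response exceeds what a purely
    additive combination of the unimodal responses would predict.
    """
    return multimodal - (auditory_only + visual_only)


# Hypothetical mean mismatch responses (arbitrary dF/F units, illustration only).
index = supralinearity_index(multimodal=3.0, auditory_only=1.5, visual_only=0.25)
```

      Note that a positive index computed across separate unimodal and multimodal sessions does not by itself separate the two alternatives discussed above, since the mere presence of visual stimulation could also raise the multimodal response.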

      Along the same lines, it would be interesting to analyze how the coupling of visual as well as auditory stimuli to movement influences responses in the auditory cortex in closed-loop in comparison to auditory-only sessions. Also, do running onset responses change in open-loop in multimodal vs. unimodal playback sessions?

      Minor concerns and comments:

      (1) Rapid learning of audiomotor mismatches: It is interesting that auditory mismatches are present even on day 1 and do not appear to get stronger with learning (same on day 2). The authors comment that this could be because the coupling is learned rapidly (line 110). How does this compare to the rate at which visuomotor coupling is learned? Is this rapid learning also observable in the animal's behavior i.e. is there a change in running speed in response to the mismatch?

      (2) The authors should clarify whether the sound and running onset responses of the auditory mismatch neurons in Figure 2E were acquired during open-loop. This is most likely the case, but explicitly stating it would be helpful.

      (3) In lines 87-88, the authors state 'Visual responses also appeared overall similar but with a small increase in strength during running ...'. This statement would benefit from clarification. From Figure S1 it appears that when the animal is sitting there are no visual responses in the auditory cortex. But when the animal is moving, small positive responses are present. Are these actually 'visual' responses - perhaps a visual prediction sent from the visual cortex to the auditory cortex that is gated by movement? If so, are they modulated by features of visual stimuli, e.g. contrast or intensity? Or do these responses simply reflect motor-related activity (running)? Would they be present to the same extent in the same neurons even in the dark?

      (4) The authors comment in the text (lines 106-107) about cessation of sound amplitude during audiomotor mismatches as being analogous to halting of visual flow in visuomotor mismatches. However, sound amplitude and visual flow are quite different in nature. In the visuomotor paradigm, the amount of visual stimulation (photons per unit time) does not necessarily change systematically with running speed, whereas in the audiomotor paradigm the SNR of the stimulus itself changes with running speed, which may impact the accuracy of predictions. On a broader note, under natural settings, while the visual flow is coupled to movement, sound amplitude may vary more idiosyncratically with movement.

      Perhaps such differences might explain why unlike in the case of visual cortex experiments, running speed does not affect the strength of playback responses in the auditory cortex.

    3. Reviewer #1 (Public Review):

      Summary:

      The manuscript presents a short report investigating mismatch responses in the auditory cortex, following previous studies focused on the visual cortex. By correlating the mouse locomotion speed with acoustic feedback levels, the authors demonstrate excitatory responses in a subset of neurons to halts in expected acoustic feedback. They show a lack of responses to mismatch in the visual modality. A subset of neurons show enhanced mismatch responses when both auditory and visual modalities are coupled to the animal's locomotion.

      While the study is well-designed and addresses a timely question, several concerns exist regarding the quantification of animal behavior, potential alternative explanations for recorded signals, correlation between excitatory responses and animal velocity, discrepancies in reported values, and clarity regarding the identity of certain neurons.

      Strengths:

      (1) Well-designed study addressing a timely question in the field.

      (2) Successful transition from previous work focused on the visual cortex to the auditory cortex, demonstrating generic principles in mismatch responses.

      (3) The correlation between mouse locomotion speed and acoustic feedback levels provides evidence for a prediction signal in the auditory cortex.

      (4) Coupling of visual and auditory feedback shows putative multimodal integration in the auditory cortex.

      Weaknesses:

      (1) Lack of quantification of animal behavior upon mismatches, potentially leading to alternative interpretations of recorded signals.

      (2) Unclear correlation between excitatory responses and animal velocity during halts, particularly in closed-loop versus playback conditions.

      (3) Discrepancies in reported values in a few figure panels raise questions about data consistency and interpretation.

      (4) Ambiguity regarding the identity of the [AM+VM] MM neurons.

    4. eLife assessment

      This study provides important findings on the modulation of cortical neuronal responses to sensory stimuli by motor-driven predictive signals. The study is methodologically sound and well-designed. The data, as analysed, provide incomplete support for the conclusion that audiomotor mismatch responses are observed in the auditory cortex and that these are strongly modulated by cross-modal signals.

    1. Author Response

      eLife assessment

      This study demonstrates mRNA-specific regulation of translation by subunits of the eukaryotic initiation factor complex 3 (eIF3) using convincing methods, data, and analyses. The investigations have generated important information that will be of interest to biologists studying translation regulation. However, the physiological significance of the gene expression changes that were observed is not clear.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this manuscript, Herrmannova et al. explore changes in translation upon individual depletion of three subunits of the eIF3 complex (d, e, and f) in mammalian cells. The authors provide a detailed analysis of regulated transcripts, followed by validation by RT-qPCR and/or Western blot of targets of interest, as well as GO and KEGG pathway analysis. The authors confirm prior observations that eIF3, despite being a general translation initiation factor, functions in mRNA-specific regulation, and that eIF3 is important for translation re-initiation. They show that the global effects of eIF3e and eIF3d depletion on translation and cell growth are concordant. Their results support and extend previous reports suggesting that both factors control the translation of 5'TOP mRNAs. Interestingly, they identify MAPK pathway components as a group of targets coordinately regulated by eIF3 d/e. The authors also discuss discrepancies with other reports analyzing eIF3e function.

      We would like to note that the first sentence contains a typo; the correct expression is: “…of three subunits of the eIF3 complex (d, e, and h) in mammalian cells”.

      Strengths:

      Altogether, a solid analysis of eIF3 d/e/h-mediated translation regulation of specific transcripts. The data will be useful for scientists working in the Translation field.

      Weaknesses:

      The authors could have explored in more detail some of their novel observations, as well as their impact on cell behavior.

      Many experiments are on-going in this direction. The original plan was to map all the effects in general and in as much detail as possible to select a few of them for future long-term projects.

      Reviewer #2 (Public Review):

      Summary:

      mRNA translation regulation permits cells to rapidly adapt to diverse stimuli by fine-tuning gene expression. Specifically, the 13-subunit eukaryotic initiation factor 3 (eIF3) complex is critical for translation initiation as it aids in 48S PIC assembly to allow for ribosome scanning. In addition, eIF3 has been shown to drive transcript-specific translation by binding mRNA 5' cap structures through the eIF3d subunit. Dysregulation of eIF3 has been implicated in oncogenesis, however the precise eIF3 subunit contributions are unclear. Here, Herrmannová et al. aim to investigate how eIF3 subcomplexes, generated by knockdown (KD) of either eIF3e, eIF3d, or eIF3h, affect the global translatome. Using Ribo-seq and RNA-seq, the authors identified a large number of genes that exhibit altered translation efficiency upon eIF3d/e KD, while translation defects upon eIF3h KD were mild. eIF3d/e KD share multiple dysregulated transcripts, perhaps due to both subcomplexes lacking eIF3d. Both eIF3d/e KD increase the translation efficiency (TE) of transcripts encoding lysosomal, ER, and ribosomal proteins. This suggests a role of eIF3 in ribosome biogenesis and protein quality control. Many transcripts encoding ribosomal proteins harbor a TOP motif, and eIF3d KD and eIF3e KD cells exhibit a striking induction of these TOP-modified transcripts. On the other hand, eIF3d KD and eIF3e KD lead to a reduction of MAPK/ERK pathway proteins. Despite this downregulation, eIF3d KD and eIF3e KD activate MAPK/ERK signaling as ERK1/2 and c-Jun phosphorylation were induced. Finally, in all three knockdowns, MDM2 and ATF4 protein levels are reduced. This is notable because MDM2 and ATF4 both contain short uORFs upstream of the start codon, and further support a role of eIF3 in reinitiation. Altogether, Herrmannová et al. have gained key insights into precise eIF3-mediated translational control as it relates to key signaling pathways implicated in cancer.

      Strengths:

      The authors have provided a comprehensive set of data to analyze RNA and ribosome footprinting upon perturbation of eIF3d, eIF3e, and eIF3h. As described above in the summary, these data present many interesting starting points for understanding additional roles of the eIF3 complex and specific subunits in translational control.

      Weaknesses:

      • The differences between eIF3e and eIF3d knockdown are difficult to reconcile, especially since eIF3e knockdown leads to a reduction in eIF3d levels.

      We agree and discuss this problem thoroughly in the corresponding section of our study.

      • The paper would be strengthened by experiments directly testing what RNA determinants allow for transcript-specific translation regulation by the eIF3 complex. This would allow the paper to be less descriptive.

      We carried out a bioinformatic analysis of specific RNA determinants, which is presented in the last chapter of our study. A detailed, transcript-specific analysis of these determinants is underway; however, we consider it beyond the scope of this article.

      • The paper would have more biological relevance if eIF3 subunits were perturbed to mimic naturally occurring situations where eIF3 is dysregulated. For example, eIF3e is aberrantly upregulated in certain cancers, and therefore an overexpression and profiling experiment would have been more relevant than a knockdown experiment.

      This is indeed true and so far we have generated several stable cell lines individually overexpressing selected eIF3 subunits implicated in the observed cancer phenotypes. However, this is a completely different project of one of our PhD students, which will be published as a comprehensive study when completed.

      Reviewer #3 (Public Review):

      Summary:

      In this article, Herrmannová et al. catalog the changes in ribosome association with mRNAs when the eukaryotic translation initiation factor 3 is disrupted by knocking down subunits of the multisubunit protein. They find that RNAs relying on TOP motifs for translation, such as ribosomal protein RNAs, and RNAs encoding proteins that modify other proteins in the ER or components of the lysosome are upregulated. In contrast, RNAs encoding components of MAP kinase cascades are downregulated when subunits of eIF3 are knocked down.

      Strengths:

      The authors use ribosome profiling of well-characterized mutants lacking subunits of eIF3 and assess the changes in translation that take place. They supplement the ribosome association studies with western blotting to determine protein level changes of affected transcripts. They analyze what is being encoded by the transcripts undergoing translation changes, which is important for understanding more broadly how translation initiation factor levels affect cancer cell translatomes.

      Weaknesses:

      (1) The data are presented as a catalog of effects, and the paper would be strengthened if there were a clear model tying the various effects together or linking individual subunit knockdown to cancerous phenotypes. It is unclear what the hypothesis is for cells having more MAPK activity with less of the MAPK proteins being translated, so the main findings of the paper become observational without context.

      As the signaling pathways are very complex and there is a frequent crosstalk among them (c-Jun can be activated by the ERK pathway as well as the JNK pathway, activated ERKs can phosphorylate many different transcription factors, etc.), we opted not to investigate the reported results any further in this study. As mentioned above, we have several ongoing, long-term projects aiming to elucidate the consequences of the observed changes in protein levels as well as in the phosphorylation status of the MAPK pathway constituents. The take home message of the present study is that eIF3 subunits (d and e) have control over the expression of many proteins involved in the MAPK/ERK pathway and that there is an independent effect (already present in the downregulation of eIF3h, which does not affect the MAPK protein expression) that leads to activation of the ERK pathway, which may be a direct consequence of compromised eIF3 function in general.

      (2) The conclusions drawn are presented as very generalized other than in the last paragraph, but the experiments were only done in HeLa cells. Since conclusions are being made about how translation changes affect MAP kinase signaling and there is mention in the abstract that dysregulation of these subunits is observed in cancer, at least one other cell line would need to be analyzed to provide evidence that the effects of subunit knockdown aren't cell-line specific.

      There are several notes emphasizing that the data presented in this study were obtained only in HeLa cells. We agree that further research in other cell lines will be needed to confirm that what we observed is a general phenomenon. Nonetheless, as noted in the discussion, other reports have already been published strongly indicating that this phenomenon is not unique to HeLa cells (Li et al., 2021, PMID:34520790, HTR-8/SVneo cells). We will review our conclusions and further clarify that our results so far only apply to HeLa cells.

      (3) It is also unclear how replicates were performed and how many replicates were performed for several experiments. Biological replicates are mentioned, but what the authors did for biological replicates isn't defined and the description of the collection of cells for polysome/ribosome footprint/RNA seq samples makes it unclear whether the "biological replicates" are samples from separate transfections (true biological replicates) or different aliquots or wells from a single transfection (technical replicates) being run over a separate gradient. If using technical replicates, the data comparing the effects of knocking down D vs E vs H subunits are substantially weakened because subunit-specific differences could be the result of non-specific events that occurred in a transfection. It's also notable that while the pooled siRNAs will increase the potency of knockdown, it is possible that one or more of the siRNAs could have off-target effects, and analyzing individual siRNAs would be better for ensuring effects are specific.

      We can reassure this reviewer that our Ribo-seq and RNA-Seq libraries were prepared from true biological replicates, grown, and transfected at different times. In fact, for each biological replicate, we used a new aliquot of cells from cryostock from the same batch and transfected the cells with the same passage number only. Multiple biological replicates were grown and all underwent a series of control experiments (polysomes, qPCR, western blot) as described in the article. Based on the results, 3 samples were selected for Ribo-Seq library preparation and 4 for RNA-Seq. We decided to add a fourth replicate for RNA-Seq to increase the data robustness, because RNA-Seq is used to normalize FPs to calculate TE, which was our main metric analyzed in this article.
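      As an illustration of the TE metric mentioned above (ribosome footprints normalized to RNA-Seq), the calculation can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' actual pipeline: the function names, the counts-per-million normalization, and the pseudocount are all hypothetical choices made for this example.

```python
import math


def cpm(counts):
    """Scale raw read counts to counts per million (assumed normalization)."""
    total = sum(counts.values())
    return {gene: c * 1e6 / total for gene, c in counts.items()}


def translation_efficiency(fp_counts, rna_counts, pseudocount=0.5):
    """TE per gene = normalized footprint signal / normalized mRNA signal.

    The pseudocount is a hypothetical guard against division by zero.
    """
    fp = cpm(fp_counts)
    rna = cpm(rna_counts)
    return {
        gene: (fp[gene] + pseudocount) / (rna[gene] + pseudocount)
        for gene in fp
        if gene in rna
    }


def log2_te_change(te_knockdown, te_control):
    """log2 fold change in TE between a knockdown and a control sample."""
    return {
        gene: math.log2(te_knockdown[gene] / te_control[gene])
        for gene in te_knockdown
        if gene in te_control
    }
```

      Because TE is a ratio, noise in the RNA-Seq denominator propagates directly into it, which is consistent with the stated motivation for adding a fourth RNA-Seq replicate.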

      As for the usage of the siRNA pool from Dharmacon/Horizon – our current article builds on our previous studies (Wagner et al. 2014 PMID: 24912683; Wagner et al. 2016 PMID: 27924037 and Herrmannová et al. 2020 PMID: 31863585), where we thoroughly characterized the effects of downregulation of individual eIF3 subunits on the growth, translation, composition and stability of eIF3 complex and on the 43S preinitiation complex assembly and subsequent mRNA recruitment. In all of these studies, we used the same siRNA pools, the same cells and the same transfection protocol; therefore, we are convinced that our results are as coherent and reproducible as can possibly be. We have never noticed any off-target effects. Moreover, the ON-TARGETplus siRNA technology we employed uses a patented modification pattern that reduces the incidence of off-targets by up to 90% compared to unmodified siRNA (see the supplier's website for more information).

      (4) Many of the changes in protein levels reported by Western are subtle. Data from all western blots making claims of quantitative differences should really be quantified relative to nontreated over-loading control or total protein quantified from the gel, and presented with a degree of error from biological replicates to make conclusions about differences in protein levels between samples.

      Generally speaking, we agree with the reviewer’s opinion. In the original version of our study, we felt that it was not necessary to perform a quantification analysis to support our conclusions as it was not important whether a given protein was downregulated to, for example, 60% or 70%, as long as its amount was visibly reduced. The main message resided in the general trend, i.e. that the whole pathway is affected in a similar way. Nevertheless, in order to properly address this criticism, we will provide quantifications in the revised paper.
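      The kind of quantification requested in point (4) can be sketched as follows: each band is normalized to its loading control, expressed relative to the control sample of the same replicate, and summarized as mean and standard deviation across biological replicates. This is only an illustrative sketch; the function names and the input layout (one intensity per biological replicate) are assumptions, not a description of the authors' analysis.

```python
from statistics import mean, stdev


def relative_levels(target_kd, loading_kd, target_ctrl, loading_ctrl):
    """Per-replicate protein level in knockdown relative to control.

    Each argument is a list of band intensities, one per biological
    replicate (hypothetical layout for this example).
    """
    levels = []
    for t_kd, l_kd, t_c, l_c in zip(target_kd, loading_kd, target_ctrl, loading_ctrl):
        kd_norm = t_kd / l_kd      # knockdown band normalized to loading control
        ctrl_norm = t_c / l_c      # control band normalized to loading control
        levels.append(kd_norm / ctrl_norm)
    return levels


def summarize(levels):
    """Mean and sample standard deviation across replicates."""
    return mean(levels), stdev(levels)
```

      Reporting the spread across replicates makes it possible to judge whether a reduction to, say, 60% versus 70% is distinguishable from replicate-to-replicate variation.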

    2. Reviewer #3 (Public Review):

      Summary:

      In this article, Herrmannová et al. catalog the changes in ribosome association with mRNAs when the eukaryotic translation initiation factor 3 is disrupted by knocking down subunits of the multisubunit protein. They find that RNAs relying on TOP motifs for translation, such as ribosomal protein RNAs, and RNAs encoding proteins that modify other proteins in the ER or components of the lysosome are upregulated. In contrast, RNAs encoding components of MAP kinase cascades are downregulated when subunits of eIF3 are knocked down.

      Strengths:

      The authors use ribosome profiling of well-characterized mutants lacking subunits of eIF3 and assess the changes in translation that take place. They supplement the ribosome association studies with western blotting to determine protein level changes of affected transcripts. They analyze what is being encoded by the transcripts undergoing translation changes, which is important for understanding more broadly how translation initiation factor levels affect cancer cell translatomes.

      Weaknesses:

      (1) The data are presented as a catalog of effects, and the paper would be strengthened if there were a clear model tying the various effects together or linking individual subunit knockdown to cancerous phenotypes. It is unclear what the hypothesis is for cells having more MAPK activity with less of the MAPK proteins being translated, so the main findings of the paper become observational without context.

      (2) The conclusions drawn are presented as very generalized other than in the last paragraph, but the experiments were only done in HeLa cells. Since conclusions are being made about how translation changes affect MAP kinase signaling and there is mention in the abstract that dysregulation of these subunits is observed in cancer, at least one other cell line would need to be analyzed to provide evidence that the effects of subunit knockdown aren't cell-line specific.

      (3) It is also unclear how replicates were performed and how many replicates were performed for several experiments. Biological replicates are mentioned, but what the authors did for biological replicates isn't defined and the description of the collection of cells for polysome/ribosome footprint/RNA seq samples makes it unclear whether the "biological replicates" are samples from separate transfections (true biological replicates) or different aliquots or wells from a single transfection (technical replicates) being run over a separate gradient. If using technical replicates, the data comparing the effects of knocking down D vs E vs H subunits are substantially weakened because subunit-specific differences could be the result of non-specific events that occurred in a transfection. It's also notable that while the pooled siRNAs will increase the potency of knockdown, it is possible that one or more of the siRNAs could have off-target effects, and analyzing individual siRNAs would be better for ensuring effects are specific.

      (4) Many of the changes in protein levels reported by Western are subtle. Data from all western blots making claims of quantitative differences should really be quantified relative to nontreated over-loading control or total protein quantified from the gel, and presented with a degree of error from biological replicates to make conclusions about differences in protein levels between samples.

    3. eLife assessment

      This study demonstrates mRNA-specific regulation of translation by subunits of the eukaryotic initiation factor complex 3 (eIF3) using convincing methods, data, and analyses. The investigations have generated important information that will be of interest to biologists studying translation regulation. However, the physiological significance of the gene expression changes that were observed is not clear.

    4. Reviewer #1 (Public Review):

      Summary:

      In this manuscript, Herrmannová et al. explore changes in translation upon individual depletion of three subunits of the eIF3 complex (d, e, and h) in mammalian cells. The authors provide a detailed analysis of regulated transcripts, followed by validation by RT-qPCR and/or Western blot of targets of interest, as well as GO and KEGG pathway analysis. The authors confirm prior observations that eIF3, despite being a general translation initiation factor, functions in mRNA-specific regulation, and that eIF3 is important for translation re-initiation. They show that the global effects of eIF3e and eIF3d depletion on translation and cell growth are concordant. Their results support and extend previous reports suggesting that both factors control the translation of 5'TOP mRNAs. Interestingly, they identify MAPK pathway components as a group of targets coordinately regulated by eIF3 d/e. The authors also discuss discrepancies with other reports analyzing eIF3e function.

      Strengths:

      Altogether, a solid analysis of eIF3 d/e/h-mediated translation regulation of specific transcripts. The data will be useful for scientists working in the Translation field.

      Weaknesses:

      The authors could have explored in more detail some of their novel observations, as well as their impact on cell behavior.

    5. Reviewer #2 (Public Review):

      Summary:

      mRNA translation regulation permits cells to rapidly adapt to diverse stimuli by fine-tuning gene expression. Specifically, the 13-subunit eukaryotic initiation factor 3 (eIF3) complex is critical for translation initiation as it aids in 48S PIC assembly to allow for ribosome scanning. In addition, eIF3 has been shown to drive transcript-specific translation by binding mRNA 5' cap structures through the eIF3d subunit. Dysregulation of eIF3 has been implicated in oncogenesis; however, the precise eIF3 subunit contributions are unclear. Here, Herrmannová et al. aim to investigate how eIF3 subcomplexes, generated by knockdown (KD) of either eIF3e, eIF3d, or eIF3h, affect the global translatome. Using Ribo-seq and RNA-seq, the authors identified a large number of genes that exhibit altered translation efficiency upon eIF3d/e KD, while translation defects upon eIF3h KD were mild. eIF3d/e KD share multiple dysregulated transcripts, perhaps due to both subcomplexes lacking eIF3d. Both eIF3d/e KD increase the translation efficiency (TE) of transcripts encoding lysosomal, ER, and ribosomal proteins. This suggests a role of eIF3 in ribosome biogenesis and protein quality control. Many transcripts encoding ribosomal proteins harbor a TOP motif, and eIF3d KD and eIF3e KD cells exhibit a striking induction of these TOP-modified transcripts. On the other hand, eIF3d KD and eIF3e KD lead to a reduction of MAPK/ERK pathway proteins. Despite this downregulation, eIF3d KD and eIF3e KD activate MAPK/ERK signaling as ERK1/2 and c-Jun phosphorylation were induced. Finally, in all three knockdowns, MDM2 and ATF4 protein levels are reduced. This is notable because MDM2 and ATF4 both contain short uORFs upstream of the start codon, which further supports a role of eIF3 in reinitiation. Altogether, Herrmannová et al. have gained key insights into precise eIF3-mediated translational control as it relates to key signaling pathways implicated in cancer.

      Strengths:

      The authors have provided a comprehensive set of data to analyze RNA and ribosome footprinting upon perturbation of eIF3d, eIF3e, and eIF3h. As described above in the summary, these data present many interesting starting points for understanding additional roles of the eIF3 complex and specific subunits in translational control.

      Weaknesses:

      - The differences between eIF3e and eIF3d knockdown are difficult to reconcile, especially since eIF3e knockdown leads to a reduction in eIF3d levels.

      - The paper would be strengthened by experiments directly testing what RNA determinants allow for transcript-specific translation regulation by the eIF3 complex. This would allow the paper to be less descriptive.

      - The paper would have more biological relevance if eIF3 subunits were perturbed to mimic naturally occurring situations where eIF3 is dysregulated. For example, eIF3e is aberrantly upregulated in certain cancers, and therefore an overexpression and profiling experiment would have been more relevant than a knockdown experiment.

    1. Author Response

      Public Reviews:

      Reviewer #1 (Public Review):

      The authors observed a decline in autophagy and proteasome activity in the context of Milton knockdown. Through proteomic analysis, they identified an increase in the protein levels of eIF2β, subsequently pinpointing a novel interaction within eIF subunits where eIF2β contributes to the reduction of eIF2α phosphorylation levels. Furthermore, they demonstrated that overexpression of eIF2β suppresses autophagy and leads to diminished motor function. It was also shown that in a heterozygous mutant background of eIF2β, Milton knockdown could be rescued. This work represents a novel and significant contribution to the field, revealing for the first time that the loss of mitochondria from axons can lead to impaired autophagy function via eIF2β, potentially influencing the acceleration of aging. To further support the authors' claims, several improvements are necessary, particularly in the methods of quantification and the points that should be demonstrated quantitatively. It is crucial to investigate the correlation between aging and the proteins eIF2β and eIF2α.

      Thank you so much for your comments. We will further investigate the correlation between aging and the proteins eIF2β and eIF2α and include the results in the revised version.

      Reviewer #2 (Public Review):

      In the manuscript, the authors aimed to elucidate the molecular mechanism that explains neurodegeneration caused by the depletion of axonal mitochondria. In Drosophila, starting with siRNA depletion of Milton and Miro, the authors attempted to demonstrate that the depletion of axonal mitochondria induces the defect in autophagy. From proteome analyses, the authors hypothesized that autophagy is impacted by the abundance of eIF2β and the phosphorylation of eIF2α. The authors followed up the proteome analyses by testing the effects of eIF2β overexpression and depletion on autophagy. With the results from those experiments, the authors proposed a novel role of eIF2β in proteostasis that underlies neurodegeneration derived from the depletion of axonal mitochondria.

      The manuscript has several weaknesses. The reader should take extra care while reading this manuscript and when acknowledging the findings and the model in this manuscript.

      The defect in autophagy by the depletion of axonal mitochondria is one of the main claims in the paper. The authors should work more on describing their results of LC3-II/LC3-I ratio, as there are multiple ways to interpret the LC3 blotting for the autophagy assessment. Lysosomal defects result in the accumulation of LC3-II thus the LC3-II/LC3-I ratio gets higher. On the other hand, the defect in the early steps of autophagosome formation could result in a lower LC3-II/LC3-I ratio. From the results of the actual blotting, the LC3-I abundance is the source of the major difference for all conditions (Milton RNAi and eIF2β overexpression and depletion). In the text, the authors simply state the observation of their LC3 blotting. The manuscript lacks an explanation of how to evaluate the LC3-II/LC3-I ratio. Also, the manuscript lacks an elaboration on what the results of the LC3 blotting indicate about the state of autophagy by the depletion of axonal mitochondria.

      We agree with the reviewer that multiple ways exist to interpret the LC3 blotting for the autophagy assessment. Thus, we analyzed the levels of p62, an autophagy substrate, and found that milton knockdown caused elevated levels of p62 (Figure 2B). Together, these results suggest that autophagic degradation is lowered.

      Another main point of the paper is that the up-regulation of eIF2β by depleting the axonal mitochondria leads to the proteostasis crisis. This claim is formed by the findings from the proteome analyses. The authors should have presented their proteomic data with a much more thorough presentation and explanation. As in the experiment scheme shown in Figure 4A, the authors did two proteome analyses: one from the 7-day-old sample and the other from the 21-day-old sample. The manuscript only shows a plot of the result from the 7-day-old sample, but not that of the result from the 21-day-old sample. For the 21-day-old sample, the authors only provided data in the supplemental table, in which the abundance ratio of eIF2β from the 21-day-old sample is 0.753, meaning eIF2β is depleted in the 21-day-old sample. The authors should have explained the impact of the eIF2β depletion in the 21-day-old sample, so the reader could fully understand the authors' interpretation of the role of eIF2β on proteostasis.

      Thank you for your comments. We will include more analyses of the proteomic data in the next version of our manuscript. In this study, we aimed to elucidate the mechanisms by which depletion of axonal mitochondria induces proteostasis disruption prematurely. Thus, we did not investigate the roles of differentially expressed proteins in proteostasis at 21-day-old in milton knockdown. Aging disrupts proteostasis via multiple pathways: eIF2β levels may be lowered by feedback of earlier changes or via interaction with other age-related changes at 21-day-old. We will include more discussion in the next version of our manuscript.

      The manuscript has several weaknesses in its data and explanation regarding translation.

      (1) The authors are likely misunderstanding the effect of phosphorylation of eIF2α on translation. The P-eIF2α is inhibitory for translation initiation. However, the authors seem to be mistaken that the down-regulation of P-eIF2α inhibits translation.

      Thank you for your comment. We understand that the phosphorylation of eIF2α is inhibitory for translation initiation, as we described on page 9, lines 312-314. We propose a model in which autophagic defects caused by milton knockdown are mediated by upregulation of eIF2β; however, we are not arguing that the translational suppression in milton knockdown is caused by a reduction in p-eIF2α. We found that milton knockdown causes an increase in eIF2β, and overexpression of eIF2β copied phenotypes of milton knockdown such as autophagic defects (Figure 5 and 6). We also found that the increase in eIF2β reduces the level of p-eIF2α (Supplemental Figure 2); thus, the reduced eIF2α phosphorylation in milton knockdown may be caused by an increase in eIF2β. However, the effects of upregulation of eIF2β on the function of the eIF2 complex are not fully understood. The translational suppression in milton knockdown may be caused by disruption of the eIF2 complex, while it is also possible that it is mediated by a function of eIF2β that is yet to be determined, or mediated by pathways other than eIF2. We will include more details in the revised version.

      (2) The result of polysome profiling in Figure 4H is implausible. By 10%-25% sucrose density gradient, polysomes are not expected to be observed. The authors should have used a gradient with much denser sucrose, such as 10-50%.

      Thank you for pointing it out. We are sorry, it was a mistake. The gradient was actually 10-50%, and we described it incorrectly. We will correct it in the revised version.

      (3) Also on the polysome profiling, as in the method section, the authors seemed to fractionate ultra-centrifuged samples from top to bottom and then measured A260 by a plate reader. In that case, the authors should have provided a line plot with individual data points, not the smoothly connected ones in the manuscript.

      Thank you for pointing it out. We will replace the graph.

      (4) For both the results from polysome profiling and puromycin incorporation (Figure 4H and I), the differences between control siRNA and Milton siRNA are subtle, if not nonexistent. This might arise from the lack of spatial resolution in their experiment as the authors used head lysate for these data but the ratio of Phospho-eIF2α/eIF2α only changes in the axons, based on their results in Figure 4E-G. The authors could have attempted to capture the spatial resolution for the axonal translation to see the difference between control siRNA and Milton siRNA.

      Thank you for your comment. A new set of experiments with technical challenges will be required to capture the spatial resolution for the axonal translation. We will work on it and hope to achieve it in the future.

    2. eLife assessment:

      In flies defective for axonal transport of mitochondria, the authors report the upregulation of one subunit, the beta subunit, of the heterotrimeric eIF2 complex via mass spectroscopy proteome analysis. Neuronal overexpression of eIF2β phenocopied aspects of neuronal dysfunction observed when axonal transport of mitochondria was compromised. Conversely, lowering eIF2β expression suppressed aspects of neuronal dysfunction. While these are intriguing observations that are potentially useful, several technical weaknesses limit the interpretation and mean the evidence supporting the current claims is incomplete.

    3. Reviewer #1 (Public Review):

      The authors observed a decline in autophagy and proteasome activity in the context of Milton knockdown. Through proteomic analysis, they identified an increase in the protein levels of eIF2β, subsequently pinpointing a novel interaction within eIF subunits where eIF2β contributes to the reduction of eIF2α phosphorylation levels. Furthermore, they demonstrated that overexpression of eIF2β suppresses autophagy and leads to diminished motor function. It was also shown that in a heterozygous mutant background of eIF2β, Milton knockdown could be rescued. This work represents a novel and significant contribution to the field, revealing for the first time that the loss of mitochondria from axons can lead to impaired autophagy function via eIF2β, potentially influencing the acceleration of aging. To further support the authors' claims, several improvements are necessary, particularly in the methods of quantification and the points that should be demonstrated quantitatively. It is crucial to investigate the correlation between aging and the proteins eIF2β and eIF2α.

    4. Reviewer #2 (Public Review):

      In the manuscript, the authors aimed to elucidate the molecular mechanism that explains neurodegeneration caused by the depletion of axonal mitochondria. In Drosophila, starting with siRNA depletion of Milton and Miro, the authors attempted to demonstrate that the depletion of axonal mitochondria induces the defect in autophagy. From proteome analyses, the authors hypothesized that autophagy is impacted by the abundance of eIF2β and the phosphorylation of eIF2α. The authors followed up the proteome analyses by testing the effects of eIF2β overexpression and depletion on autophagy. With the results from those experiments, the authors proposed a novel role of eIF2β in proteostasis that underlies neurodegeneration derived from the depletion of axonal mitochondria.

      The manuscript has several weaknesses. The reader should take extra care while reading this manuscript and when acknowledging the findings and the model in this manuscript.

      The defect in autophagy by the depletion of axonal mitochondria is one of the main claims in the paper. The authors should work more on describing their results of LC3-II/LC3-I ratio, as there are multiple ways to interpret the LC3 blotting for the autophagy assessment. Lysosomal defects result in the accumulation of LC3-II thus the LC3-II/LC3-I ratio gets higher. On the other hand, the defect in the early steps of autophagosome formation could result in a lower LC3-II/LC3-I ratio. From the results of the actual blotting, the LC3-I abundance is the source of the major difference for all conditions (Milton RNAi and eIF2β overexpression and depletion). In the text, the authors simply state the observation of their LC3 blotting. The manuscript lacks an explanation of how to evaluate the LC3-II/LC3-I ratio. Also, the manuscript lacks an elaboration on what the results of the LC3 blotting indicate about the state of autophagy by the depletion of axonal mitochondria.

      Another main point of the paper is that the up-regulation of eIF2β by depleting the axonal mitochondria leads to the proteostasis crisis. This claim is formed by the findings from the proteome analyses. The authors should have presented their proteomic data with a much more thorough presentation and explanation. As in the experiment scheme shown in Figure 4A, the authors did two proteome analyses: one from the 7-day-old sample and the other from the 21-day-old sample. The manuscript only shows a plot of the result from the 7-day-old sample, but not that of the result from the 21-day-old sample. For the 21-day-old sample, the authors only provided data in the supplemental table, in which the abundance ratio of eIF2β from the 21-day-old sample is 0.753, meaning eIF2β is depleted in the 21-day-old sample. The authors should have explained the impact of the eIF2β depletion in the 21-day-old sample, so the reader could fully understand the authors' interpretation of the role of eIF2β on proteostasis.

      The manuscript has several weaknesses in its data and explanation regarding translation.

      (1) The authors are likely misunderstanding the effect of phosphorylation of eIF2α on translation. The P-eIF2α is inhibitory for translation initiation. However, the authors seem to be mistaken that the down-regulation of P-eIF2α inhibits translation.

      (2) The result of polysome profiling in Figure 4H is implausible. By 10%-25% sucrose density gradient, polysomes are not expected to be observed. The authors should have used a gradient with much denser sucrose, such as 10-50%.

      (3) Also on the polysome profiling, as in the method section, the authors seemed to fractionate ultra-centrifuged samples from top to bottom and then measured A260 by a plate reader. In that case, the authors should have provided a line plot with individual data points, not the smoothly connected ones in the manuscript.

      (4) For both the results from polysome profiling and puromycin incorporation (Figure 4H and I), the differences between control siRNA and Milton siRNA are subtle, if not nonexistent. This might arise from the lack of spatial resolution in their experiment as the authors used head lysate for these data but the ratio of Phospho-eIF2α/eIF2α only changes in the axons, based on their results in Figure 4E-G. The authors could have attempted to capture the spatial resolution for the axonal translation to see the difference between control siRNA and Milton siRNA.

    1. Author Response

      The following is the authors’ response to the previous reviews.

      Reviewer #2 (Recommendations For The Authors):

      I would like to thank the authors for their comments. However, my request for additional experiments to consolidate this manuscript and text changes have not been addressed (point 1 and point 2), which I believe are essential for completion of this manuscript.

      The reviewer raised the question about the relevant substrates of PARG in S-phase cells (point 1). As we explained in our previous response, the most important substrate of PARG is PARP1, since we observed increased chromatin-associated PARP1 and PARylated PARP1 in cells with PARG depletion. Moreover, PARP1 or PARP1/2 depletion rescued cell lethality caused by PARG depletion. These data strongly suggest that PARP1 is the major substrate of PARG in S phase cells. Of course, PARG may have additional substrates. In the future, we will perform proteomics experiments as suggested by this reviewer to identify additional PARG substrates, which may reveal new roles of PARG in S phase progression.

      The reviewer also suggested that we reorganize our manuscript (point 2). However, we prefer to keep the manuscript as it is, since this is how the project evolved. The other reason is that we would like to share with readers the challenge of validating KO cells. This is an important lesson we learned from this study. We hope that this will raise awareness of the hypomorphic mutant cells we often use to draw conclusions about gene functions and/or genetic interactions. We understand that the current flow of our manuscript may cause some confusion. To avoid it, we included additional explanations at the beginning of this manuscript to draw readers' attention to the fact that our initial KO cells may not be complete PARG KO cells, i.e. they may have residual PARG activity. We also included additional discussion of this important point in the Discussion section.

      Moreover, WB analysis of PARG KO clones is inconclusive, as the additional prominent band at 50 kDa could be a degradation product. The authors should check PARG levels and localization by IF, which allows detection of intact proteins and their cellular localizations, since the shorter isoform should be localized in the cytosol. WB with PARG isoforms is missing important information regarding Mw of the PARG constructs and Mw labels of western blots, which makes it difficult to evaluate this data and compare to KO. Ideally, KO and PARG isoform samples should be all on one gel for proper comparison with different antibodies.

      We appreciate the concerns raised by this reviewer. We agree that the additional prominent band at 50kDa could be a degradation product. As we explained in our previous response, despite using several PARG antibodies, we could not draw a clear conclusion which functional isoforms or truncated forms were expressed in our PARG KO cells.

      Immunostaining experiments may not be more conclusive, since IF experiments rely on the same antibodies for recognizing endogenous PARG. Additionally, even if a protein mainly localizes in the cytosol, we cannot exclude the possibility that a small fraction of this protein may localize in nuclei and have nuclear functions.

      Instead, as we presented in our manuscript, we used a biochemical assay to measure PARG activity in cell lysate and showed that our initial PARG KO cells still have residual PARG activity. However, we could not detect any PARG activity in our complete/conditional PARG KO cells (cKO cells; these cells can only survive in the presence of PARP inhibitor). These data strongly suggest that PARG is essential for cell survival.

    2. Reviewer #3 (Public Review):

      These studies reveal an S-phase requirement for the PARG dePARylation enzyme in removing ADP-ribosylation from PAR-modified proteins whose PARylation is promoted by the presence of unligated Okazaki fragments. The excessive protein ADP-ribosylation observed in S-phase of PARG-depleted human cells leads to trapping of the PARP1 ADP-ribosylation enzyme on chromatin. The findings would be strengthened by identification of the relevant ADP-ribosylation substrates of PARG whose dePARylation is needed for progression through S-phase.

      Comments on revised version:

      In the revised version the authors have addressed some of the reviewers' concerns, but, despite the new explanatory paragraph on page 16, the paper remains confusing because as shown in Figure 7 at the end of the Results the PARG KO 293A cells that were analyzed at the beginning of the Results are not true PARG knockouts. The authors stated that they did not rewrite the Results because they wanted to describe the experiments in the order in which they were carried out, but there is no imperative for the experiments to be described in the order in which they were done, and it would be much easier for the uninitiated reader to appreciate the significance of these studies if the true PARG KO cell data were presented at the beginning, as all three of the original reviewers proposed.

      While the authors have to some extent clarified the nature of the PARG KO alleles, they have not been able to identify the source of the residual PARG activity in the PARG KO cells, in part because different commercial PARG antibodies give different and conflicting immunoblotting results. Additional sequence characterization of PARG mRNAs expressed in the PARG cKO cells, and also in-depth proteomic analysis of the different PARG bands could provide further insight into the origins and molecular identities of the various PARG proteins expressed from the different KO PARG alleles, and determine which of them might retain catalytic activity.

      The authors have made no progress in identifying which are the key PARG substrates required for S phase progression, although they suggest that PARP1 itself may be an important target.

    3. eLife assessment

      The demonstration that the PARG dePARylation enzyme is required in S phase to remove polyADP-ribose (PAR) protein adducts that are generated in response to the presence of unligated Okazaki fragments is potentially valuable, but the evidence is incomplete, and identification of relevant PARylated PARG substrates in S-phase is needed to understand the role of PARP1-mediated PARylation and PARG-catalyzed dePARylation in S-phase progression.

    4. Reviewer #2 (Public Review):

      Summary:

      In this manuscript Nie et al. investigate the effect of PARG KO and PARG inhibition (PARGi) on pADPR, DNA damage, cell viability and synthetic lethal interactions in HEK293A and HeLa cells. Surprisingly, the authors report that PARG KO cells are sensitive to PARGi and show higher pADPR levels than PARG KO cells, which is abrogated upon deletion or inhibition of PARP1/PARP2. The authors explain the sensitivity of PARG KO to PARGi through incomplete PARG depletion and demonstrate complete loss of PARG activity when incomplete PARG KO cells are transfected with additional gRNAs in the presence of PARPi. Furthermore, the authors show that the sensitivity of PARG KO cells to PARGi is not caused by NAD depletion but by S-phase accumulation of pADPR on chromatin coming from unligated Okazaki fragments, which are recognized and bound by PARP1. Consistently, PARG KO or PARG inhibition show synthetic lethality with Pol beta, which is required for Okazaki fragment maturation. PARG expression levels in ovarian cancer cell lines correlate negatively with their sensitivity to PARGi.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) The author should evaluate the possibility of naturally occurring arrhythmia due to the geometry of the tissues, by using voltage or calcium dye.

      Answer: We thank the reviewer for this suggestion. We have performed new experiments using a voltage-sensitive fluorescent dye (i.e. FluoVolt) with data reported in the new Figure 4 + new results section “arrhythmia analysis”. Briefly, we found that our ring-shaped tissues are compatible with live fluorescence imaging. We were then able to show that our cardiac tissues beat regularly, without naturally occurring arrhythmias or extra beats. We could not detect any re-entrant waves in our tissues in the conditions offered by the speed of our camera. A specific paragraph has also been added to the discussion.

      (2) There is only 50% survival after 20 days of culture in the optimized seeding group. Is there any way to improve it? The tissues had two compartments, cardiac and fibroblast-rich regions, where fibroblasts are responsible for maintaining the attachment to the glass slides. Do the cardiac rings detach from the glass slides and roll up? The SD of the force measurement is a quarter of the value, which is not ideal with such a high replicate number.

      Answer: This paper reports seminal data that will serve as a foundation for further use of the platform. We are currently expanding to other cell lines with improvement in survival (see https://insight.jci.org/articles/view/161356). We confirm that the rings do not detach. The pillar was specifically designed to avoid this (see Figure 1B).

      As the platform utilizes imaging analysis to derive contractile dynamics, calibration should be done based on the angle and the distance of the camera lens to the individual tissues to reduce the error. On the other hand, how reproducible are the pillars? It is highly recommended to mechanically evaluate the consistency of the hydrogel-based pillars across different wells and within the wells to understand the variance.

      Answer: We propose a system and a measurement method that do not need calibration. Contraction amplitude is expressed as a ratio between the contracted / relaxed areas (See figure 3 A). There is thus no influence of the distance of the camera lens.
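      The scale-independence argument above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a ring is approximated by a circle and that a change in camera distance acts as a uniform magnification; the function names and the radii are hypothetical and are not taken from the manuscript.

```python
import math


def ring_area(radius_px: float) -> float:
    """Area of a circular outline, in squared pixels (illustrative geometry only)."""
    return math.pi * radius_px ** 2


def area_ratio(contracted_area: float, relaxed_area: float) -> float:
    """Unitless contraction amplitude: contracted area divided by relaxed area."""
    if relaxed_area <= 0:
        raise ValueError("relaxed_area must be positive")
    return contracted_area / relaxed_area


def ratio_at_magnification(r_relaxed: float, r_contracted: float, magnification: float) -> float:
    """Same tissue imaged at a different uniform magnification.

    Every length is multiplied by `magnification`, so both areas are
    multiplied by magnification**2 and the factor cancels in the ratio.
    """
    relaxed = ring_area(r_relaxed * magnification)
    contracted = ring_area(r_contracted * magnification)
    return area_ratio(contracted, relaxed)


# Hypothetical radii: 100 px relaxed, 90 px contracted.
near = ratio_at_magnification(100.0, 90.0, magnification=1.0)
far = ratio_at_magnification(100.0, 90.0, magnification=2.5)
```

      Under these assumptions the two ratios agree, which is why a uniform change in lens distance does not require calibration. The sketch does not cover lens distortion or oblique viewing angles.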

      In order to evaluate the consistency of the mechanical properties of the hydrogel, we reproduced the experiment pictured in Figure 1-Supplement 1, and measured the Young’s Modulus of three different gel solutions on different days. In the three experiments performed, we found values of 10.0-12.2 kPa, resulting in a final average value of 11.2 (+/- 0.6) kPa, consistent with the value reported in the article. We are therefore confident that the mechanical properties are consistent across and within wells. More extensive mechanical characterization of the molded gels would require access to an Atomic Force Microscope (AFM), and is being considered for future work.

      The author should address the longevity and reproducibility issues, by working on the calibration of camera lens position/distance to tissues and further optimizing the seeding conditions with hydrogels such as collagen or fibrin, and/or making sure the PEG gels have high reproducibility and consistency.

      Answer: This paper reports seminal data that will serve as a foundation for further use of the platform. This platform (including the design, approach and choice of polymers) allows a fast and reproducible formation of a large number of cardiac tissues (up to 21 per well in a 96-well format, meaning a potential total of about 2,000 tissues) with a limited number of cells.

      (3) The evaluation of the arrhythmia should be more extensively explained and demonstrated.

      Answer : See answer to comment 1

      (4) The results of isoproterenol should be checked as non-paced tissues should have increased beating frequency with increasing dosages. Dofetilide does not typically have a negative inotropic effect on the tissues. Please check on the cell viability before and after dosing.

      Answer : We agree with this reviewer on the principle. However, we have repeated the experiments and we confirm our results, i.e. increasing concentrations of isoproterenol induced a trend towards increase in the contraction force and significantly increased contraction and relaxation speeds without change in the beat rate (Figure 5C). We do not have a definitive explanation for this observation. Our hypothesis is that this increase in contraction and relaxation speeds induced by isoproterenol is translated, on average in our study, into an increase in contractile force rather than in an increase in contraction frequency. This may depend on the cell line used, and is very well illustrated in a recent paper from Mannhardt and colleagues (Stem cell reports. 2020; 15(4):983–998). Of the 10 different cell lines tested in engineered heart tissues, all show an increase in contraction and relaxation speeds after isoproterenol administration, but this is translated either into an increase in contractile force (4 cell lines) or into a shortening of the beat (3 cell lines), and only 2 cell lines show an increase in both parameters. Indeed, since iPSC-CMs are immature cardiac cells, it is rare to obtain a positive force-frequency relationship without any maturation medium or mechanical or electrical training. We agree that above a concentration of 10nM, dofetilide shows cardiotoxicity in our tissues as tissues completely stop beating.

      Reviewer #2 (Recommendations For The Authors):

      In addition to the general comments in the public review, I have the following specific suggestions to the authors, that would help improve the manuscript.

      (1) Please describe the protocol for preparation of cardiac rings (shown in Figure 1C) in more detail. In particular, please describe how the tissues were transferred from the mold into the 96-well plate and how are they positioned and characterized during the study.

      Answer: There is no transfer of the tissues as they directly form in the well, which is pre-equipped with the molded PEG gel (see Figure 1B and methods section). The in situ analysis is a strong asset of this platform.

      (2) Please clarify the timepoints in this study. The overall schematic in Figure 1C shows that the rings were formed on day 22 and then studied for 14 days, while Figure 2B shows data over 20 days following seeding, and Figure 3 shows data 14 days after seeding. It appears that these were separate studies (optimization of myocyte/fibroblast ratio followed by the main study).

      Answer: Figure 1C shows the timeline including the cardiomyocyte differentiation. hiPSC-CMs are indeed seeded in the wells 22 days after starting the differentiation, which represents Day 0 for tissue formation. We apologize for the confusion.

      (3) Please explain if the number of rings per well (Figure 2) was used as the only criterion for selecting the myocyte/fibroblast ratio, and if so, why. Were these rings also characterized for their structural and contractile properties?

      Answer: Figure 2 supplement 1 reports the contractility data according to the different tested ratios, and shows no differences. The number of generated ring-shaped tissues was indeed the only criterion retained.

      (4) Please provide rationale for using the dermal rather than cardiac fibroblasts.

      Answer: We had previous experience generating EHTs using dermal fibroblasts which are easier to obtain commercially. Our approach could in theory also work using cardiac fibroblasts, which we have not tested in the present study.

      (5) Figure 2 panels C-E show an interesting segregation of cardiomyocytes into a thin cylindrical layer that does not appear to contain fibroblasts and a shorter and thicker cylinder containing fibroblasts mixed with occasional myocytes. Please specify at which time point this structure forms, and how does it change over time in culture? At which time point were the images taken? It would be helpful to include serial images taken over 1-14 days of study.

      Answer: We thank the reviewer for this interesting comment. We have performed additional immunostainings (reported in Figure 2 supplement 3) on tissues at Day 1 and Day 7 after seeding. The segregation appears within the first 7 days. It appears that 1 day after seeding the fibroblasts are not yet attached, although the cardiac fiber has already started to form. Seven days after seeding, fibroblasts are fully spread and attached, and the contractile ring is formed and well-aligned. Brightfield images are reported in Figure 1E.

      (6) In the cardiomyocyte region (Figure 2D) the cells staining for troponin seem to be only at the surfaces. The thickness of the layer is only about 30-40 µm, so one would assume that cell viability was not an issue. Please specify and discuss the composition of this region.

      Answer: We agree but we think this is a technical issue: at the center of the tissue, tissue thickness limits laser penetration, whereas at the surface (inner or outer), the laser infiltrates easily between the tissue and the PEG. Moreover, we see on the zoomed view of the tissue in Figure 2 Supplement 2 that we have a staining inside the cardiac fiber, which just appears less strong due to tissue thickness.

      (7) Please also discuss segregation in terms of possible causes and the implications of apparently very limited contact between the two cell types, i.e., how representative is this two-region morphology of native heart tissue. Also, it would be interesting to know how the segregation has changed with the change in myocyte/fibroblast ratio.

      Answer: We are not sure there is a very limited contact as the use of fibroblasts is critical to ensure the formation of tissues (i.e. no tissues can be formed if we avoid the use of fibroblasts). We agree that these ring-shaped cardiac tissues are not especially representative of native heart tissue in terms of interactions between several cell types. They were developed as a surrogate for physiopathological and pharmacological experiments (see a recent application in https://insight.jci.org/articles/view/161356).

      (8) There is interest and demonstrated ability to culture engineered cardiac tissues over longer periods of time. Please comment what was the rationale for selecting 14-day culture and if the system allows longer culture durations.

      Answer: In line with this comment, we have studied the contractile parameters of our rings 28 days after seeding and compared to their contractile parameters at D14. We found a slight increase for all the parameters, which is significant for the maximum contraction speed. Nevertheless, the data is much more variable and the number of tissues is lower (29 for D14 against 17 for D28). Therefore, we demonstrated that long-term culture of our tissues is possible, however not yet optimized. Hence, the following physiological and pharmacological tests have been done at D14.

      (9) Figure 3 documents the development of contractile parameters over 14 days of culture. Would it be possible to replace the arbitrary units with the actual values? Also, would it be possible to include the corresponding images of the rings taken at the same time points, to show the associated changes in ring morphologies.

      Answer: Contraction amplitude is expressed as a ratio between the contracted / relaxed areas (See figure 3 A): it is a ratio, thus without unit. Corresponding images can be seen in Figure 1 E.

      (10) The measured contraction stress, strain, and the speeds of contraction and relaxation improve from day 1 to day 7 and then plateau (Figure 3, Supplemental Figure 3). Please discuss this result.

      Answer: The new immunostainings performed on tissues at Day 1 and Day 7 show the progressive alignment of the cardiomyocytes and the muscular fibers, with an almost complete organization at Day 7.

      (11) The beating frequency does not appear to markedly change over time, while Figure 3B shows strong statistical significance (***) throughout the 14-day period. Please check/confirm.

      Answer: We confirm this result.

      (12) Please comment on the lack of effect of isoproterenol on beating frequency.

      Answer: We agree with this reviewer on the principle. However, we have repeated the experiments and we confirm our results, i.e. increasing concentrations of isoproterenol induced a trend towards increase in the contraction force and significantly increased contraction and relaxation speeds without change in the beat rate (Figure 5C). We do not have a definitive explanation for this observation. Our hypothesis is that this increase in contraction and relaxation speeds induced by isoproterenol is translated, on average in our study, into an increase in contractile force rather than in an increase in contraction frequency. This may depend on the cell line used, and is very well illustrated in a recent paper from Mannhardt and colleagues (Stem cell reports. 2020; 15(4):983–998). Of the 10 different cell lines tested in engineered heart tissues, all show an increase in contraction and relaxation speeds after isoproterenol administration, but this is translated either into an increase in contractile force (4 cell lines) or into a shortening of the beat (3 cell lines), and only 2 cell lines show an increase in both parameters. Indeed, since iPSC-CMs are immature cardiac cells, it is rare to obtain a positive force-frequency relationship without any maturation medium or mechanical or electrical training.

      (13) Please compare the contractile function of cardiac tissues measured in this study with data reported for other iPSC-derived tissue models.

      Answer : A specific paragraph tackles this aspect in the discussion

    2. eLife assessment

      This paper reports a valuable platform for cardiac tissue cultivation. The throughput, consistency of the tissue, and the potential integration of high-throughput automation are an advantage over other approaches. The tissues and the platform are validated using appropriate methodology to provide convincing evidence of the tissue cultivation capability.

    3. Reviewer #1 (Public Review):

      The manuscript, "A versatile high-throughput assay based on 3D ring-shaped cardiac tissues generated from human induced pluripotent stem cell-derived cardiomyocytes," developed a unique culture platform with PEG hydrogel that facilitates the in-situ measurement of contractile dynamics of the engineered cardiac rings. The authors optimized the tissue seeding conditions, demonstrated tissue morphology with expressions of cardiac and fibroblast markers, mathematically modeled the equation to derive contractile forces and other parameters based on imaging analysis, and concluded by testing several compounds with known cardiac responses.

      The authors answered my questions with appropriate experiments and explanation.

      (1) This paper presents an intriguing platform that creates miniature cardiac rings with merely thousands of cardiomyocytes per tissue in a 96-well plate format. The shape of the ring and the squeezing motion can recapitulate the contraction of the cardiac chamber to a certain degree. However, Thavandiran et al. (PNAS 2013) created a larger version of the cardiac ring and found that electrical propagation revealed spontaneous infinite loop-like cycles of activation propagation traversing the ring. This model was used to mimic a reentrant wave during arrhythmia. Therefore, there are concerns about whether a large number of cardiac tissues experience arrhythmia due to geometry-induced re-entry current and cannot be used as a healthy tissue model.

      In the new experiment, the authors demonstrated with voltage-sensitive dye that these miniaturized tissues do not experience any arrhythmia, potentially due to their small size.

      (2) The platform can produce 21 cardiac rings per well in 96-well plates, with the throughput being the highest among competing platforms. The resulting tissues exhibit good sarcomere striation due to the strain from the pillars. However, emerging questions pertain to culture longevity and reproducibility among tissues. According to Figure 1E, uneven ring formation around the pillar leads to tissue thinning and breakage. Only 50% survival is observed after 20 days of culture in the optimized seeding group. Are there any strategies to improve this survival rate? Additionally, do the cardiac rings detach from the glass slides and roll up, given the two compartments with cardiac and fibroblast-rich regions where fibroblasts maintain attachment to the glass slides? Moreover, the standard deviation of force measurement is a quarter of the value, which is suboptimal given the high replicate number. As the platform utilizes imaging analysis to derive contractile dynamics, calibration based on the angle and distance of the camera lens to individual tissues should be conducted to reduce error. On the other hand, how reproducible are the pillars? It is highly recommended to mechanically evaluate the consistency of the hydrogel-based pillars across different wells and within wells to understand the variance.

      The authors stated that the platform has been tested and improved with multiple cell lines to enhance tissue survival rates. The methodology of image capture and calculation of contractile dynamics were explained in detail to address concerns. Moreover, the reproducibility of the pillars was demonstrated by consistent results of Young's Modulus (AFM) from each pillar, showing low standard deviations.

      (3) Does the platform allow the observation of non-synchronized beating when testing with compounds? This can be extremely important as the intended applications of this platform are drug testing and cardiac disease modeling. The author should elaborate on the method in the manuscript and explain the obtained results in detail.

      Referring to Question #1, the platform does not present arrhythmia, potentially due to the small size of the tissue.

      (4) The results of drug testing are interesting. Isoproterenol typically causes positive chronotropic and positive inotropic responses, where inotropic responses are difficult to obtain due to low tissue maturity. It is inconsistent with other reported results that cardiac rings do not exhibit increased beating frequency, but slightly increased forces only.

      The authors repeated the experiment with the same results and hypothesized that the results would be line-dependent, since the maturation of iPSC-CM is not consistent. The additional dose curves provided more information on the tissue behaviors against well-known compounds.

      Overall, the manuscript is well-written, and the designed platform presents unique advantages for high-throughput cardiac tissue culture. The paper has adequate data to demonstrate the proof-of-concept study of the platform. The throughput, consistency of the tissue, and the potential integration of high-throughput automation would be the highlights of this platform.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Public Reviews

      We thank the reviewers for their insightful comments and helpful suggestions that allowed us to improve the manuscript.

      Reviewer #1:

      Thermogenic adipocyte activity associates with cardiometabolic health in humans but declines with age. Identifying the underlying mechanisms of this decline is therefore highly important.

      To address this task, Holman and co-authors investigated the effects of two major determinants of thermogenic activity: cold, which induces thermogenic de novo differentiation as well as conversion of dormant thermogenic inguinal adipocytes; and aging, which strongly reduces thermogenic activity. The authors study young and middle-aged mice at thermoneutrality and following cold exposure.

      Using lineage tracing, the authors conclude that the older group produces fewer thermogenic adipocytes from progenitor differentiation. However, they found no difference in thermogenic differentiation capacity between the age groups when progenitors are isolated and differentiated in vitro. This finding is consistent with previous findings in humans, demonstrating that progenitor cells derived from dormant perirenal brown fat of humans differentiate into thermogenic adipocytes in vitro. Taken together, this underscores that age-related changes in the microenvironment rather than autonomous alterations in the ASPCs explain the age-related decline in thermogenic capacity. This is an important finding in terms of identifying new approaches to switch dormant adipocytes into an active thermogenic phenotype.

      To gain insight into the age-related changes, the authors use single cell and single nuclei RNA sequencing mapping of their two age groups, comparing thermoneutral and cold conditions between the two groups. Interestingly, where the literature previously demonstrated that de novo lipogenesis (DNL) occurs in relation to thermogenic activation, the authors show that DNL in fact is activated in a white adipocyte cell type, whereas the beige thermogenic adipocytes form a separate cluster.

      Considering recent findings, that adipose tissue contains several subtypes of ASPCs and adipocytes, mapping the changes at single cell resolution following cold intervention provides an important contribution to the field, in particular as an older group with limited thermogenic adaptation is analyzed in parallel with a younger, more responsive group. This model also allowed for detection of microenvironment as a determining factor of thermogenic response.

      The use of only two time points (young and middle-aged) along the aging continuum limits the conclusions that can be made on aging as the only driver of the observed differences between the groups. It should for example be noted that the older mice had higher weights and larger fat depots, thus the phenotype is complex and this should be taken into consideration when interpreting the data.

      In conclusion, this study provides an important resource for further studies on how to reactivate dormant thermogenic fat and potentially improve metabolic health.

      (1) The authors claim "Aging impairs cold-induced beige adipogenesis and adipocyte metabolic reprogramming". It is previously established in humans that aging strongly associates with a decline in thermogenic capacity. With this in mind, it is easy to accept that the reduced browning observed in the older group is due to age. However, the older group also has larger adipose depots, which can also be a confounding factor. I, therefore, recommend bringing this into the discussion and putting more focus on the complexity of the phenotype. For example, it could be discussed whether de novo lipogenesis is lower because the adipocytes of older mice are already filled with more lipids. Additional time points along the aging continuum would be needed to make a strong conclusion about age as the determinant, but even so, aging is complex and further definitions and discussion would be needed.

      We agree with the reviewer regarding the confounding effect of body weight changes. We have added a paragraph to the discussion (pasted below) to comment on the complexity of the phenotype and the contributing role of linked changes in body weight/composition.

      “Aging is a complex process, and unsurprisingly, many pathways have been linked to the aging-related decline in beiging capacity. For example, increased adipose cell senescence, impaired mitochondrial function, elevated PDGF signaling and dysregulated immune cell activity during aging diminish beige fat formation (Benvie et al., 2023; Berry et al., 2017; Goldberg et al., 2021; Nguyen et al., 2021). Of note, older mice exhibit higher body and fat mass, which is associated with metabolic dysfunction and reduced beige fat development. While the effects of aging and altered body composition are difficult to separate, previous studies suggest that the beiging deficit in aged mice is not solely attributable to changes in body weight (Rogers et al., 2012). Further studies, including additional time points across the aging continuum may help clarify the role of aging and ascertain when beiging capacity decreases.”

      (2) The study would gain from more comparisons to existing human studies and discussion on the translation potential of the findings. For example, how do the adipocyte subtypes identified in the current study translate to subtypes identified in human adipose tissue (e.g. Emont et al.)?

      We analyzed the human adipose tissue atlas from Emont et al. 2022 (PMID: 35296864). We did not find any obvious homologous human adipocyte subtypes. However, this and other available human single cell studies have not investigated the effects of cold exposure on white adipose tissue depots, which may be necessary to reveal DNL-high and especially beige adipocytes.

      (3) The group has contributed multiple studies demonstrating that Prdm16 is a major inducer of a thermogenic phenotype, and the literature shows that Prdm16 promotes a thermogenic phenotype in favour of a fibrogenic aging phenotype. It would therefore be interesting to see how Prdm16 is regulated in the current data set, across adipocyte subtypes, age groups and temperature conditions.

      We thank the reviewer for this comment. Previous studies showed that PRDM16 protein and not mRNA levels are downregulated during aging (Wang et al., 2019, Cell Metab, PMID: 31155495; Wang et al., 2022, Nature, PMID: 35978186). Consistent with this, we did not observe an aging-associated reduction in Prdm16 mRNA levels in adipocytes in our dataset. We did observe enrichment of Prdm16 mRNA levels in beige adipocytes relative to other adipocyte clusters. We included these data in Fig. 5F.

      (4) In Figure 1, it is difficult to understand why the 6 weeks cold exposure is not shown in relation to the thermoneutrality, 3 days and 2-week cold exposure? It would be useful to have this in the same graph relating the levels and showing all four marker genes for all time points.

      These experiments were done at different times using separate groups of mice. We have now clarified this in the figure legend.

      (5) The older mice had larger inguinal fat depots, suggesting more lipids stored. The morphology of adipose tissue has previously been shown to be modulated by cold acclimation and is also the main similarity between brown adipose tissue in adult humans and young mice beige adipose tissue. Fig S2b suggests smaller adipocytes in the young group. It would also be useful, for comparison to published data, if authors show tissue sections with H&E of their model.

      Good point. We added panels showing H&E staining of serial iWAT sections, showing changes in tissue morphology across age and temperature conditions (Figure S1F).

      (6) The authors use t-tests to compare the differences induced by e.g. cold or min vs max cell culture media etc., within each age group. However, in my opinion, a two-way ANOVA with post-tests would be more informative as this would allow for testing the effects of the two age categories on any quantitative variable and allow for addressing whether there is an interaction between the categories.

      Following the reviewer’s recommendation, we applied two-way ANOVA with a Tukey correction for multiple comparisons for categorical comparisons with different age groups and conditions. P values from all significant multiple comparison tests are now included within the methods section.
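      As an illustration of what a two-way ANOVA tests (main effects of each factor plus their interaction), here is a minimal standard-library sketch for a balanced design. It is not the authors' analysis pipeline; the function name and the example values are hypothetical, and it returns only F statistics, without p-values or the Tukey correction.

```python
from statistics import mean


def two_way_anova_f(cells):
    """F statistics for a balanced two-way ANOVA.

    cells[i][j] holds the replicate measurements for level i of factor A
    (e.g. age group) and level j of factor B (e.g. temperature).
    """
    a, b, n = len(cells), len(cells[0]), len(cells[0][0])
    if a < 2 or b < 2 or n < 2:
        raise ValueError("need at least 2 levels per factor and 2 replicates")
    if any(len(row) != b or any(len(c) != n for c in row) for row in cells):
        raise ValueError("design must be balanced")

    grand = mean(x for row in cells for c in row for x in c)
    cell_mean = [[mean(c) for c in row] for row in cells]
    # In a balanced design, marginal means are means of the cell means.
    a_mean = [mean(row) for row in cell_mean]
    b_mean = [mean(cell_mean[i][j] for i in range(a)) for j in range(b)]

    ss_a = n * b * sum((m - grand) ** 2 for m in a_mean)
    ss_b = n * a * sum((m - grand) ** 2 for m in b_mean)
    ss_ab = n * sum(
        (cell_mean[i][j] - a_mean[i] - b_mean[j] + grand) ** 2
        for i in range(a) for j in range(b)
    )
    ss_err = sum(
        (x - cell_mean[i][j]) ** 2
        for i in range(a) for j in range(b) for x in cells[i][j]
    )
    ms_err = ss_err / (a * b * (n - 1))  # residual mean square

    return {
        "A": (ss_a / (a - 1)) / ms_err,
        "B": (ss_b / (b - 1)) / ms_err,
        "AxB": (ss_ab / ((a - 1) * (b - 1))) / ms_err,
    }


# Made-up values: rows = age group, columns = temperature condition.
example = [
    [[1, 2, 3], [5, 6, 7]],
    [[1, 2, 3], [2, 3, 4]],
]
f_stats = two_way_anova_f(example)
```

      The "AxB" term is what a set of within-group t-tests cannot provide: it asks whether the effect of one factor differs between the levels of the other.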

      (7) In Figure 5F, please include Adipoq expression between clusters and please add a reference to why Nnat is considered a canonical white adipocyte marker.

      We added Adipoq to the violin plot in Figure 5F, showing differential expression across adipocyte clusters. We included a line in the results section to highlight this observation:

      “Interestingly, Adiponectin (Adipoq) was differentially expressed across adipocyte clusters, with higher levels in Npr3-high and DNL-high cells.”

      We removed “canonical” and added references for Nnat and Lep as white marker genes.

      (8) After 14 days of cold exposure, it looks like the DNL high population divides into two populations. Did the authors explore whether there were any differences between these clusters?

      We also noticed this apparent division and explored this question. However, upon increasing the resolution for clustering and splitting the DNL high population, there were no obvious differentially expressed genes that defined the two subclusters. Thus, we opted to keep them together.

      (9) As cold treatment transforms a subset of cells, can the authors perform a data-driven analysis to visualize the directions in their single nuclei data sets by using monocle pseudotime and/or velocity analyses?

      This is a good question. We spent a long time trying to address this question using several trajectory and pseudotime analysis methods, including Velocity (scVelo), Slingshot and Dynoverse. Unfortunately, we were unable to obtain concordant results using at least two different methods and felt that the analyses were unreliable.

      Reviewer #2:

      This manuscript focused on why aging leads to decreased beiging of white adipose tissue. The authors used an inducible lineage tracing system and provided in vivo evidence that de novo beige adipogenesis from Pdgfra+ adipocyte progenitor cells is blocked during early aging in subcutaneous fat. Single-cell RNA sequencing of adipocyte progenitor cells and in vitro assays showed that these cells have similar beige adipogenic capacities in vitro. Single-nucleus RNA sequencing of mature adipocytes indicated that aged mice have more Npr3 high-expressing adipocytes in the subcutaneous fat.

      Meanwhile, adipocytes from aged mice have significantly lower expression of genes involved in de novo lipogenesis, which may contribute to the declined beige adipogenesis.

      The mechanism that leads to age-related impairment of white adipose tissue beiging is not very clear. The finding that Pdgfra+ adipocyte progenitor cells contribute to beige adipogenesis is novel and interesting. It is more intriguing that the aging process represses Pdgfra+ adipocyte progenitor cells from differentiating into beige adipocytes during cold stimulation. The finding that mature adipocytes with high de novo lipogenesis activity may support beige adipogenesis is also novel and worth further pursuing. The study was carried out with a nice experimental design, and the authors provided sufficient data to support the major conclusions. I only have a few comments that could potentially improve the manuscript.

      (1) It is interesting that after three days of cold exposure, aged mice also have far fewer beige adipocytes. Is de novo adipogenesis involved at this early stage? Or do former beige adipocytes that acquired a white morphology "reactivate" better in young mice? It would be nice if the authors could discuss the possibilities.

      This is a good question. We did not evaluate beige adipogenesis at the 3d timepoint. However, a previous study demonstrates that 3d of cold exposure is sufficient to promote de novo beige adipogenesis (Wang et al., Nat Med. 2013, PMID: 23995282). We observed that beige adipogenesis from Pdgfra+ cells is a relatively minor contributor to beige adipocyte development, even after long-term cold exposure in young mice. Based on these data, we presume that beige adipocyte activation (or re-activation) is the dominant mechanism for beige adipocyte development.

      To clarify this point, we have included the following lines in the manuscript:

      “Previous studies in mice using an adipocyte fate tracking system show that a high proportion of beige adipocytes arise via the de novo differentiation of ASPCs as early as 3 days of cold (Wang et al., 2013).”

      “Based on these findings, we presume that mature (dormant beige) adipocytes serve as the major source of beige adipocytes in our cold-exposure paradigm. However, long-term cold exposure also recruits smooth muscle cells to differentiate into beige adipocytes; a process that we did not investigate here (Berry et al., 2016; Long et al., 2014; McDonald et al., 2015; Shamsi et al., 2021).”

      (2) Is the absolute number of Pdgfra+ cells decreased in aged mice? It would be nice to include quantifications of the percentage of tomato+ beige adipocytes in total tomato+ cells to reflect the adipogenic rate.

      We presented FACS quantification of tdTomato+/Pdgfra+ cells in Fig. 2B. We added a graph showing the percentage of Pdgfra+ cells among total live, lin- cells in adipose tissue; this showed no difference between young and aged mice. We did not perform FACS quantification of tdTomato+ beige adipocytes because of the technical challenges of sorting adipocytes. Quantification of total tdTomato+ cells was also unreliable and inconsistent because of the widespread labeling of fibroblasts and blood vessels, along with traced adipocytes. Thus, we did not include this analysis.

      (3) Line 112, the sentence seems to be unfinished.

      This has been corrected.

    2. eLife assessment

      This fundamental study provides evidence that de novo beige adipogenesis from Pdgfra+ adipocyte progenitor cells is blocked during early aging in subcutaneous fat. The depth of the data at early ages is compelling, with rigorous cell tracing methodology employed. The study will aid in identifying new approaches to switch dormant adipocytes into an active thermogenic phenotype, and should be of interest to cell biologists at large.

    3. Reviewer #1 (Public Review):

      Thermogenic adipocyte activity is associated with cardiometabolic health in humans, but declines with age. Identifying the underlying mechanisms of this decline is therefore highly important.

      To address this task, Holman and co-authors present compelling data from their investigations of the effects of two major determinants of thermogenic activity: cold, which induces thermogenic de novo differentiation as well as conversion of dormant thermogenic inguinal adipocytes; and aging, which strongly reduces thermogenic activity. The authors study young and middle-aged mice at thermoneutrality and following cold exposure.

      Using lineage tracing, the authors conclude that the older group produces fewer thermogenic adipocytes from progenitor differentiation. However, they found no differences in thermogenic differentiation capacity between the age groups when progenitors were isolated and differentiated in vitro. This finding is consistent with previous findings in humans, demonstrating that progenitor cells derived from dormant perirenal brown fat of humans differentiate into thermogenic adipocytes in vitro. Taken together, this underscores that age-related changes in the microenvironment, rather than autonomous alterations in the ASPCs, explain the age-related decline in thermogenic capacity. This is an important finding in terms of identifying new approaches to switch dormant adipocytes into an active thermogenic phenotype.

      To gain insight into the age-related changes, the authors use single cell and single nuclei RNA sequencing mapping of their two age groups, comparing thermoneutral and cold conditions between the two groups. Interestingly, where the literature previously demonstrated that de novo lipogenesis (DNL) occurs in relation to thermogenic activation, the authors show that DNL in fact is activated in a white adipocyte cell type, whereas the beige thermogenic adipocytes form a separate cluster.

      Considering recent findings, that adipose tissue contains several subtypes of ASPCs and adipocytes, mapping the changes at single cell resolution following cold intervention provides an important contribution to the field, in particular as an older group with limited thermogenic adaptation is analyzed in parallel with a younger, more responsive group. This model also allowed for detection of microenvironment as a determining factor of thermogenic response.

      The use of only two time points (young and middle-aged) along the aging continuum limits the conclusions that can be made on aging as the only driver of the observed differences between the groups. Furthermore, as the authors also discuss, aging is a complex phenotype, and in this case the older mice were heavier and had larger fat depots, which should be taken into consideration when interpreting the data.

      In conclusion, this study provides an important resource for further studies, which should investigate how the findings can be translated into humans for reactivation of dormant thermogenic fat and a potential improvement of metabolic health.

    4. Reviewer #2 (Public Review):

      This manuscript focused on why aging leads to decreased beiging of white adipose tissue. The authors used an inducible lineage tracing system and provided in vivo evidence that de novo beige adipogenesis from Pdgfra+ adipocyte progenitor cells is blocked during early aging in subcutaneous fat. Single-cell RNA sequencing of adipocyte progenitor cells and in vitro assays showed that these cells have similar beige adipogenic capacities in vitro. Single-nucleus RNA sequencing of mature adipocytes indicated that aged mice have more Npr3 high-expressing adipocytes in subcutaneous fat. Meanwhile, adipocytes from aged mice have significantly lower expression of genes involved in de novo lipogenesis, which may contribute to the declined beige adipogenesis.

      The mechanism that leads to age-related impairment of white adipose tissue beiging is not very clear. The finding that Pdgfra+ adipocyte progenitor cells contribute to beige adipogenesis is novel and interesting. It is more intriguing that the aging process represses Pdgfra+ adipocyte progenitor cells from differentiating into beige adipocytes during cold stimulation. The finding that mature adipocytes with high de novo lipogenesis activity may support beige adipogenesis is also novel and worth pursuing further. The study was carried out with a nice experimental design, and the authors provided sufficient data to support the major conclusions. I only have a few comments that could potentially improve the manuscript.

      (1) It is interesting that after three days of cold exposure, aged mice also have far fewer beige adipocytes. Is de novo adipogenesis involved at this early stage? Or do former beige adipocytes that acquired a white morphology "reactivate" better in young mice? It would be nice if the authors could discuss the possibilities.

      (2) Is the absolute number of Pdgfra+ cells decreased in aged mice? It would be nice to include quantifications of the percentage of tomato+ beige adipocytes in total tomato+ cells to reflect the adipogenic rate.

    1. eLife assessment

      This important study combines fMRI and electrophysiology in sedated and awake rats to show that LFPs strongly explain spatial correlations in resting-state fMRI but only weakly explain temporal variability. The authors propose that other, electrophysiology-invisible mechanisms contribute to the fMRI signal. The evidence supporting the separation of spatial and temporal correlations is convincing; however, the support for electrophysiology-invisible mechanisms is incomplete, given the alternative factors that could account for the observed differences in spatial and temporal correlation. This work will be of interest to researchers who study the mechanisms behind resting-state fMRI.

    2. Reviewer #1 (Public Review):

      Tu et al investigated how LFPs recorded simultaneously with rsfMRI explain the spatiotemporal patterns of functional connectivity in sedated and awake rats. They find that connectivity maps generated from gamma band LFPs (from either area) explain very well the spatial correlations observed in rsfMRI signals, but that the temporal variance in rsfMRI data is more poorly explained by the same LFP signals. The authors excluded the effects of sedation in this effect by investigating rats in the awake state (a remarkable feat in the MRI scanner), where the findings generally replicate. The authors also performed a series of tests to assess multiple factors (including noise, outliers, and nonlinearity of the data) in their analysis.

      This apparent paradox is then explained by a hypothetical model in which LFPs and neurovascular coupling are generated in some sense "in parallel" by different neuron types, some of which drive LFPs and are measured by ephys, while others (nNOS, etc.) have an important role in neurovascular coupling but are less visible in ephys data. Hence, the discrepancy is explained by the spatial similarity of neural activity, while the more "selective" LFPs picked up by ephys account for the different temporal aspects observed.

      This is a deep, outstanding study that harnesses multidisciplinary approaches (fMRI and ephys) for observing brain activity. The results are strongly supported by the comprehensive analyses done by the authors, which ruled out many potential sources for the observed findings. The study's impact is expected to be very large.

      There are very few weaknesses in the work, but I'd point out that the 1-second temporal resolution may have masked significant temporal correlations between LFPs and spontaneous activity, for instance, as shown by Cabral et al Nature Communications 2023, and even in earlier QPP work from the Keilholz Lab. The synchronization of the LFPs may correlate more with one of these modes than the total signal. Perhaps a kind of "dynamic connectivity" analysis on the authors' data could test whether LFPs correlate better with the activity at specific intervals. However, this could purely be discussed and left for future work, in my opinion.

    3. Reviewer #2 (Public Review):

      The authors address a question that is interesting and important to the sub-field of rsfMRI that examines electrophysiological correlates of rsfMRI. That is, while electrophysiology-produced correlation maps often appear similar to correlation maps produced from BOLD alone (as has been shown in many papers) is this actually coming from the same source of variance, or independent but spatially-correlated sources of variance? To address this, the authors recorded LFP signals in 2 areas (M1 and ACC) and compared the maps produced by correlating BOLD with them to maps produced by BOLD-BOLD correlations. They then attempt to remove various sources of variance and see the results.

      The basic concept of the research is sound, though primarily of interest to the subset of rsfMRI researchers who use simultaneous electrophysiology. However, there are major problems in the writing, and also a major methodological problem.

      Major problems with writing:

      (1) There is substantial literature on rats on site-specific LFP recording compared to rsfMRI, and much of it already examined removing part of the LFP and examining rsfMRI, or vice versa. The authors do not cover it and consider their work on signal removal more novel than it is.

      (2) The conclusion of the existence of an "electrophysiology-invisible signal" is far too broad considering the limited scope of this study. There are many factors that can be extracted from LFP that are not used in this study (envelope, phase, infraslow frequencies under 0.1 Hz, estimated MUA, etc.) and there are many ways of comparing it to the rsfMRI data that are not done in this study (rank correlation, transformation prior to comparison, clustering prior to comparison, etc.). The one non-linear method used, mutual information, has low sensitivity and does not cover every possible nonlinear interaction. Mutual information is also dependent upon the number of bins selected in the data. Previous studies (see 1) have seen similar results where fMRI and LFP were not fully commensurate but did not need to draw such broad conclusions.

      (3) The writing refers to the spatial extent of correlation with the LFP signal as "spatial variance." However, LFP was recorded from a very limited point and the variance in the correlation map does not necessarily reflect underlying electrophysiological spatial distributions (e.g. Yu et al. Nat Commun. 2023 Mar 24;14(1):1651.)
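
      As an illustration of the bin-count dependence raised in point (2), the following minimal Python sketch computes a histogram-based plug-in estimate of mutual information for two independent Gaussian signals. It is not taken from the manuscript: the function name, sample size, and bin counts are assumptions chosen for illustration. The true mutual information here is zero, yet the estimate grows as the number of bins increases.

```python
import math
import random


def histogram_mutual_information(x, y, bins):
    """Plug-in mutual information estimate (in nats) from equal-width bins."""
    def bin_index(value, lo, hi):
        if hi == lo:
            return 0
        # Map the value to one of `bins` equal-width bins; the maximum goes in the last bin.
        return min(int((value - lo) / (hi - lo) * bins), bins - 1)

    x_lo, x_hi, y_lo, y_hi = min(x), max(x), min(y), max(y)
    n = len(x)
    joint, px, py = {}, [0] * bins, [0] * bins
    for xv, yv in zip(x, y):
        i, j = bin_index(xv, x_lo, x_hi), bin_index(yv, y_lo, y_hi)
        joint[(i, j)] = joint.get((i, j), 0) + 1
        px[i] += 1
        py[j] += 1
    # Sum p(i,j) * log(p(i,j) / (p(i) * p(j))) over occupied cells only.
    return sum((c / n) * math.log(c * n / (px[i] * py[j]))
               for (i, j), c in joint.items())


# Hypothetical data: two independent signals, so the true mutual information is 0.
random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(2000)]
y = [random.gauss(0.0, 1.0) for _ in range(2000)]

mi_by_bins = {b: histogram_mutual_information(x, y, b) for b in (4, 16, 64)}
for b, mi in mi_by_bins.items():
    print(f"bins={b:3d}  estimated MI={mi:.4f} nats")
```

      Because finer binning inflates the estimate even for independent data, a comparison of this kind would typically need a bias correction or a permutation baseline computed at the same bin count.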

      Major method problem:

      (4) Correlating LFP to fMRI is correlating two biological signals, with unknown but presumably not uniform distributions. However, correlating CC results from correlation maps is comparing uniform distributions. This is not a fair comparison, especially considering that the noise added is also uniform as it was created with the rand() function in MATLAB.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Response to Reviewers’ Public Comments

      We are grateful for the reviewers’ comments. We have modified the manuscript accordingly and detail our responses to their major comments below.

      (1) Reviewer 2 was concerned that transformation of continuous functional data into categorical form could reduce precision in estimating the genetic architecture.

      We agree that transforming continuous data into categories may reduce resolution, but it also improves accuracy when the continuous data are affected by measurement noise. In our dataset, many genotypes are at the lower bound of measurement, and the variation in measured fluorescence among these genotypes is largely or entirely caused by measurement noise. By transforming to categorical data, we dramatically reduced the effect of this noise on the estimation of genetic effects. We modified the results and discussion sections to address this point.

      (2) Reviewer 2 asked about generalizability of our findings.

      Because our paper is the first use of reference-free analysis of a 20-state combinatorial dataset, generalizability is at this point unknown. However, a recent manuscript from our group confirms the generality of the simplicity of genetic architecture: using reference-free methods to analyze 20 published combinatorial deep mutational scans, several of which involve 20-state libraries, we found that main and pairwise effects account for virtually all of the genetic variance across a wide variety of protein families and types of biochemical functions (Park Y, Metzger BPH, Thornton JW. 2023. The simplicity of protein sequence-function relationships. BioRxiv, 2023.09.02.556057). Concerning the facilitating effect of epistasis on the evolution of new functions, we speculate that this result is likely to be general: we have no reason to think that the underlying cause of this observation – epistasis brings genotypes with different functions closer in sequence space to each other and expands the total number of functional sequences – arises from some peculiarity of the mechanisms of steroid receptor DBD folding or DNA binding. However, we acknowledge that our data involve sequence variation at those sites in the protein that directly mediate specific protein-DNA contact; it is plausible that sites far from the “active site” may have weaker epistatic interactions and therefore have weaker effects on navigability of the landscape. We have addressed these issues in the discussion.

      (3) Reviewer 3 asked “in which situation would the authors expect that pairwise epistasis does not play a crucial role for mutational steps, trajectories, or space connectedness, if it is dominant in the genotype-phenotype landscape?”

      The question addressed in our paper is not whether epistasis shapes steps, trajectories or connectedness in sequence space but how it does so and what its particular effects are on the evolution of new functions. The dominant view in the field has been that the primary role of epistasis is to block evolutionary paths. We show, however, that in multi-state sequence space, epistasis facilitates rather than impedes the evolution of new functions. It does this by increasing the number of functional genotypes and bringing genotypes with different functions closer together in sequence space. This finding was possible because of the difference in approach between our paper and prior work: most prior work considered only direct paths in a binary sequence space between two particular starting points – and typically only considering optimization of a single function – whereas we studied the evolution of new functions in a multi-state amino acid space, under empirically relevant epistasis informed by complete combinatorial experiments. The result is a clear demonstration that the net effect of real-world levels of epistasis on navigability of the multidimensional sequence landscape is to make the evolution of new functions easier, not harder.

      (4) Reviewer 3 asked for “an explanation of how much new biological results this paper delivers as compared with the paper in which the data were originally published.”

      Starr 2017 did not use their data to characterize the underlying genetic architecture of function by estimating main and epistatic effects of amino acid states and combinations; it also did not evaluate the importance of epistasis in generating functional variants, determining the transcription factor’s specificity, or shaping evolutionary navigability on the landscape.

      (5) Reviewer 3 requested an explanation of how the results would have been (potentially) different if a reference-based approach were used, and how reference-based analysis compares with other reference-free approaches to estimating epistasis.

      This topic has been covered in detail in a recent manuscript from our group (Park et al. Biorxiv 2023.09.02.556057). Briefly, reference-free approaches provide the most efficient explanation of an entire genotype-phenotype map, explaining the maximum amount of genetic variance and reducing sensitivity to experimental noise and missing genotypes compared to reference-based approaches. Reference-based approaches tend to infer much more epistasis, especially higher-order epistasis, because measurement error and local idiosyncrasy near the wild-type sequence propagate into spurious high-order terms. Reference-based analyses are appropriate for characterizing only the immediate sequence neighborhood of a particular “wild-type” protein of interest. Reference-free approaches are therefore best suited to understanding genotype-phenotype landscapes as a whole. We have clarified these issues in the revised discussion.

      (6) Reviewer 3 suggested that the comparison between the full and main-effects-only model should involve a re-estimation of main effects in the latter case.

      This is indeed what we did in our analysis. We have clarified the description in the results and methods sections to make this clear.

      (7) Reviewer 3 asked about the applicability of the approach to data beyond those analyzed in the present study and requirements to use it.

      Our approach could be used for any combinatorial DMS dataset in which the phenotypic data are categorical (or can be converted to categorical form). Complete sampling is not required: because reference-free analysis averages the estimated effects of states and combinations over all variants that contain them, it is highly robust to missing data (except at the highest possible order of epistasis, where only a single variant represents a high-order effect) as long as variant sampling is unbiased with respect to phenotype. All the required code is publicly available at the GitHub link provided in this manuscript. We have also described a general form of reference-free analysis for continuous data and applied it to 20 protein datasets in a recent publication (Park et al. Biorxiv 2023.09.02.556057).

      (8) Reviewer 3 suggested that the text could be shortened and made less dense.

      We agree and have done a careful edit to streamline the narrative.

      Response to Reviewers’ Non-Public Recommendations

      (1) Reviewer 1 noted that specific epistatic effects might in some cases produce global nonlinearities in the genotype-phenotype relationship. They then asked how our results might change if we did not impose a nonlinear transformation as part of the genotype-phenotype model. The reviewer’s underlying concern was that the non-specific transformation might capture high-order specific epistatic effects and thus reduce their apparent importance.

      Because our data are categorical, we required a model that characterizes the effect of particular amino acid states and combinations on the probability that a variant is in a null, weak, or strong activation class. A logistic model is the classic approach to this kind of analysis. The model structure assumes that amino acid states and combinations have additive effects on the log-odds of being in one functional class versus the lower functional class(es); the only nonlinear transformation is that which arises mathematically when log-odds are transformed into probability through the logistic link function. Thinking through the reviewer’s comment, we have concluded that our model does not make any explicit transformation to account for nonlinearity in the relationship between the effects of specific sequence states/combinations and the measured phenotype (activation class). If additional global nonlinearities are present in the genotype-phenotype relationship – such as could be imposed by limited dynamic range in the production of the fluorescence phenotype or the assay used to measure it – it is possible that the sigmoid shape of the logistic link function may also accommodate these nonlinearities. We have noted this point in the revised manuscript.
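
      The model structure described in this response can be sketched as a proportional-odds (ordinal logistic) calculation. The Python below is a minimal illustration, not the authors' implementation: the effect sizes, site and state labels, cut-point values, and function names are all hypothetical, and the exact parameterization used in the manuscript may differ. It shows additive main and pairwise effects on the log-odds scale being converted to class probabilities through the logistic link.

```python
import math


def sigmoid(z):
    """Logistic link: convert log-odds to probability."""
    return 1.0 / (1.0 + math.exp(-z))


def class_probabilities(score, cut_weak, cut_strong):
    """Return (P(null), P(weak), P(strong)) for a given genetic score.

    Assumes cut_weak < cut_strong so that all three probabilities are non-negative.
    """
    p_at_least_weak = sigmoid(score - cut_weak)
    p_at_least_strong = sigmoid(score - cut_strong)
    return (1.0 - p_at_least_weak,
            p_at_least_weak - p_at_least_strong,
            p_at_least_strong)


# Hypothetical effects in log-odds units (illustrative values only).
main_effects = {("site1", "A"): 1.2, ("site2", "K"): 0.8}
pairwise_effects = {(("site1", "A"), ("site2", "K")): 1.5}


def genetic_score(states):
    """Sum the main and pairwise effects of the states a variant carries."""
    score = sum(main_effects.get(s, 0.0) for s in states)
    for (a, b), effect in pairwise_effects.items():
        if a in states and b in states:
            score += effect
    return score


variant = [("site1", "A"), ("site2", "K")]
probs = class_probabilities(genetic_score(variant), cut_weak=2.0, cut_strong=4.0)
print("P(null), P(weak), P(strong) =", tuple(round(p, 3) for p in probs))
```

      In this sketch the only nonlinearity is the sigmoid applied to the summed score, which mirrors the point made in the response: the effects themselves are additive on the log-odds scale.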

      (2) Reviewer 1 observed that our model seems to prefer sets of several pairwise interactions among states across sites rather than fewer high-order interactions among those same states.

      This finding arises because the pattern of phenotypic variation across genotypes in our dataset is consistent with that which would be produced by pairwise interactions rather than by high-order interactions. In a reference-free framework, these patterns are distinct from each other: a group of second-order terms cannot fit the patterns produced by high-order epistasis, and high-order terms cannot fit the pattern produced by pairwise interactions. Similarly, main-effect terms cannot fit the pattern of phenotypes produced by a pairwise interaction, and a pairwise epistatic term cannot fit the pattern produced by main effects of states at two sites. For example, third-order terms are required when the genotypes possessing a particular triplet of states deviate from that expected given all the main and second-order effects of those states; this deviation cannot be explained by any combination of first- and second-order effects.

      We explain this point in detail in our recent manuscript (Park Y, Metzger BPH, Thornton JW. 2023. The simplicity of protein sequence-function relationships. BioRxiv, 2023.09.02.556057) and we summarize it here. Consider the simple example of two sites with two possible states (genotypes 00, 01, 10, and 11). If there are no main effects and no pairwise effects, this architecture will generate the same phenotype for all four variants – the global average (or zero-order effect). If there are pairwise effects but no main effects, this architecture will generate a set of phenotypes on which the average phenotype of genotypes with a 0 at the first site (00 and 01) equals the global average – as does the average of those with 0 at the second site (00 and 10). The epistatic effect causes the individual genotypes to deviate from the global average. This pattern can be fit only by a pairwise epistatic term, not by first-order terms. Conversely, if there are main effects but no pairwise effects, then the average phenotype of genotypes 00 and 01 will deviate from the global average (by an amount equal to the first-order effect), as will the average of 00 and 10: the phenotype of each genotype will be equal to the sum of the relevant first-order effects for the states it contains. This pattern cannot be fit by second-order model terms. The same logic extends to higher orders: a cluster of second-order terms cannot explain variation generated by third-order epistasis, because third-order variation is by definition the deviation from the best second-order model.
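
      The two-site, two-state example above can be worked through numerically. The Python below is a minimal sketch with made-up phenotype values and a hypothetical function name; it is not the authors' code. It computes the zero-order effect as the global average, each first-order effect as the average over genotypes carrying a state minus the global average, and each pairwise effect as the remaining deviation.

```python
from itertools import product


def reference_free_effects(phenotypes):
    """Decompose a complete two-site, two-state landscape into reference-free terms."""
    genotypes = list(phenotypes)
    zero_order = sum(phenotypes.values()) / len(phenotypes)

    # First-order effect: mean phenotype of genotypes carrying the state, minus the global mean.
    first_order = {}
    for site, state in product(range(2), "01"):
        carriers = [phenotypes[g] for g in genotypes if g[site] == state]
        first_order[(site, state)] = sum(carriers) / len(carriers) - zero_order

    # Pairwise effect: what remains after the global mean and both first-order effects.
    pairwise = {
        g: phenotypes[g] - zero_order - first_order[(0, g[0])] - first_order[(1, g[1])]
        for g in genotypes
    }
    return zero_order, first_order, pairwise


# Hypothetical landscapes (illustrative values only).
pairwise_only = {"00": 1.0, "01": -1.0, "10": -1.0, "11": 1.0}
main_only = {"00": -3.0, "01": -1.0, "10": 1.0, "11": 3.0}

for name, landscape in (("pairwise only", pairwise_only), ("main only", main_only)):
    zero, first, pair = reference_free_effects(landscape)
    print(name, "| zero-order:", zero, "| first-order:", first, "| pairwise:", pair)
```

      In the first landscape every first-order effect is zero and all of the variation sits in the pairwise terms; in the second the pairwise terms are all zero and the first-order effects account for every phenotype, which is the distinction the paragraph draws.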

      (3) Reviewer 1 suggested several places in the text where citations to prior work would be appropriate.

      We appreciate these suggestions and have modified the manuscript to refer to most of these works.

      (4) Reviewer 1 pointed to the paper of Gong et al eLife 2013 and asked whether it is known how robust the proteins in our study are to changes in conformation/stability compared to other proteins, and whether this might impact the likelihood of observing higher-order epistasis in this system.

      The DBDs that we study here are very stable, and previous work shows that mutations affect DNA specificity primarily by modifying the DBD’s affinity rather than its stability (McKeown et al., Cell 2014). Additionally, Gong et al.’s findings pertain to a globally nonlinear relationship between stability and function, which arises from the Boltzmann relationship between the energy of folding and occupancy of the folded state. Because our data are categorical – based on rank-order of measured phenotype rather than fluorescence as a continuous phenotype – the kind of global nonlinearity observed in Gong’s study is not expected to produce spurious estimates of epistasis in our work. We have modified the discussion to address this point.

      (5) Reviewer 1 asked a) why the epistatic models produce landscapes on which variants have fewer neighbors on average than main-effects only models and b) why the average distance from all ERE-specific nodes to all SRE-specific nodes is greater with epistasis (but the average distance from ERE to nearest SRE is lower with epistasis).

      In the main effects-only landscape, the functional genotypes are relatively similar to each other, because each must contain several of the states that contribute the most to a positive genetic score. Moreover, ERE-specific nodes are similar to each other, and SRE-specific nodes are similar to each other, because each must contain one or more of a relatively small number of specificity-determining states. When epistasis is added to the genetic architecture, two things happen: 1) more genotypes become functional because there are more combinations that can exceed the threshold score to produce a functional activator and 2) these additional functional variants are more different from each other – in general, and within the classes of ERE- or SRE-specific variants – because there are now more diverse combinations of states that can yield either phenotype. As a result, a broader span of sequence space is occupied, but ERE- and SRE-specific variants are more interspersed with each other. This means that the average distance between all pairs of nodes is greater, and this applies to all ERE-SRE pairs, as well. However, the interspersing means that the closest single SRE to any particular ERE is closer than it was without epistasis. We have added this explanation to the main text.
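
      The difference between the two distance measures discussed in this response can be illustrated with a toy calculation. The Python below is a sketch under stated assumptions: the four-residue sequences are invented for illustration and are not genotypes from the dataset, and the function names are hypothetical. It compares the mean Hamming distance over all ERE-SRE pairs with the mean distance from each ERE-specific genotype to its nearest SRE-specific genotype.

```python
def hamming(a, b):
    """Number of sites at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def mean_all_pairs(ere, sre):
    """Mean distance over every ERE-specific / SRE-specific pair."""
    distances = [hamming(e, s) for e in ere for s in sre]
    return sum(distances) / len(distances)


def mean_nearest(ere, sre):
    """Mean distance from each ERE-specific genotype to its closest SRE-specific one."""
    nearest = [min(hamming(e, s) for s in sre) for e in ere]
    return sum(nearest) / len(nearest)


# Invented toy genotypes: two compact clusters versus a broader, interspersed set.
compact_ere, compact_sre = ["KKKK", "KKKR"], ["EEKK", "EEKR"]
spread_ere, spread_sre = ["KKKK", "LLLL", "DDDD"], ["KKKE", "LLLE", "DDDE"]

for label, ere, sre in (("compact", compact_ere, compact_sre),
                        ("interspersed", spread_ere, spread_sre)):
    print(label, "| all pairs:", mean_all_pairs(ere, sre),
          "| nearest:", mean_nearest(ere, sre))
```

      In the interspersed toy set the all-pairs average is larger while the nearest-neighbour average is smaller, which is the same qualitative pattern the response describes for the landscape with epistasis.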

      (6) Reviewer 2 asked us to explain why average path length increases with pairwise epistasis as the strength of selection for specificity increases.

      This behavior occurs because of the existence of a local peak in the pairwise model. Genotypes on this peak had few connections to other genotypes, all of which were less SRE-specific. Thus, with strong selection, i.e. high population size, the simulations became stuck on the local peak, cycling among the genotypes many times before leaving, resulting in a large increase in the mean step number. As shown in the rest of the figure, when the longest set of paths is removed, there are still differences in the average number of steps with and without epistasis. This issue is described in the methods section.

      (7) Reviewers made several suggestions for clarity in the text and figures.

      We have modified the paper to address all of these comments.

      (8) Reviewer 3 stated that the code should be available.

      The code is available at https://github.com/JoeThorntonLab/DBD.GeneticArchitecture.

    2. Reviewer #2 (Public Review):

      The authors aimed to understand how epistasis influences the genetic architecture of the DNA-binding domain (DBD) of a steroid hormone receptor. An ordinal regression model was developed in this study to analyze a published deep mutational scanning dataset that consists of all combinatorial amino acid variants across four positions (i.e. 160,000 variants). This published dataset measured the binding of each variant to the estrogen receptor response element (ERE, sequence: AGGTCA) as well as the steroid receptor response element (SRE, sequence: AGAACA). This model has the major strengths of being reference-free and able to account for global nonlinearity in the genotype-phenotype relationship. Thorough analyses of the modelling results have been performed, which provided convincing results to support the importance of epistasis in promoting evolution of protein functions. This conclusion is impactful because many previous studies have shown that epistasis constrains evolution. The novelty of this study will likely stimulate new ideas in the field. The model will also likely be utilized by other groups in the community.

    3. eLife assessment

      This study includes fundamental findings on protein evolution, namely that changes in function are largely attributable to pairwise rather than higher-order interactions, and that epistasis potentiates rather than constrains evolutionary paths. Compelling evidence supporting the conclusions is provided by applying a new model to a previously generated experimental dataset on deep mutational scanning of the DNA-binding domain (DBD) of a steroid hormone receptor. The implications of this work are of considerable interest to protein biochemistry, evolutionary biology, and numerous other fields.

    4. Reviewer #1 (Public Review):

      Metzger et al develop a rigorous method filling an important unmet need in protein evolution - analysis of protein genetic architecture and evolution using data from combinatorially complete 20^N variant libraries. Addressing this need has become increasingly valuable, as experimental methods for generating these datasets expand in scope and scale. Their method integrates two key features - (1) it reports the effects of mutations relative to the average across all variants, rather than a particular genotype, making it useful for examining global genetic architecture, and (2) it does this for all possible 20 states at each site, in contrast to the binary analyses in prior work. These features are not individually novel but integrating them into a single analysis framework is novel and will be valuable to the protein evolution community. Using a previously published dataset generated by two of the authors, they conclude that (1) changes in function are largely attributable to pairwise but not higher-order interactions, and (2) epistasis potentiates, rather than constrains, evolutionary paths. These findings are well-supported by the data. Overall, this work has important implications for predicting the relationship between genotype and phenotype, which is of considerable interest to protein biochemistry, evolutionary biology, and numerous other fields.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      The authors were trying to understand the relationship between the development of large trunks and longirostrine mandibles in bunodont proboscideans of the Miocene, and how it reflects the variation in diet patterns.

      Strengths:

      The study is very well supported, written, and illustrated, with plenty of supplementary material. The findings are highly significant for the understanding of the diversification of bunodont proboscideans in Asia during the Miocene, as well as explaining the cranial/jaw disparity of fossil lineages. This work elucidates the diversification of paleobiological aspects of fossil proboscideans and their evolutionary response to open environments in the Neogene using several methods. The authors included all Asian bunodont proboscideans with long mandibles and I suggest that they should use the expression "bunodont proboscideans" instead of gomphotheres.

      Weaknesses:

      I believe that the only weakness is the lack of discussion comparing their results with the development of gigantism and long limbs in proboscideans from the same epoch.

      Thank you for your comprehensive review and positive feedback on our study regarding the co-evolution of feeding organs in bunodont proboscideans during the Miocene. We appreciate your suggestion, and have decided to use the term "bunodont elephantiforms" (for more explicit clarification, we use elephantiforms to exclude some early proboscideans, like Moeritherium, etc.) instead of "gomphotheres," and we will make this change in our revised manuscript. We also appreciate the potential weakness you mentioned regarding the lack of discussion comparing our results with the development of gigantism and long limbs in proboscideans from the same epoch. We agree with the reviewer’s suggestion, and we are aware that gigantism and long limbs are potential factors for trunk development. Gigantism resulted in the loss of flexibility in elephantiforms, and long limbs made it more challenging for them to reach the ground. A long trunk serves as compensation for these limitations. However, limb bones were rarely found in our material, especially those preserved in association with the skull.

      Reviewer #2 (Public Review):

      This study focuses on the eco-morphology, the feeding behaviors, and the co-evolution of feeding organs of longirostrine gomphotheres (Amebelodontidae, Choerolophodontidae, and Gomphotheriidae), which are characterised by their distinctive mandible and mandibular tusk morphologies. They also have different evolutionary stages of food acquisition organs, which may have co-evolved with the extremely elongated mandibular symphysis and tusks. Although these three longirostrine gomphothere families were widely distributed in Northern China in the Early-Middle Miocene, the relative abundances and the distribution of these groups differed through time as a result of changes in climate and ecosystems.

      These three groups have different feeding behaviors indicated by different mandibular symphysis and tusk morphologies. Additionally, they have different evolutionary stages of trunks which are reflected by the narial region morphology. To be able to construct the feeding behavior and the relation between the mandible and the trunk of early elephantiformes, the authors examined the crania and mandibles of these three groups from the Early and Middle Miocene of northern China from three different museums and also made different analyses.

      The analyses made in the study are:

      (1) Finite Element (FE) analysis: They conducted two kinds of tests: the distal forces test, and the twig-cutting test. With the distal forces test, advantageous and disadvantageous mechanical performances under distal vertical and horizontal external forces of each group are established. With the twig-cutting test, a cylindrical twig model of orthotropic elastoplasticity was posed in three directions to the distal end of the mandibular tusk to calculate the sum of the equivalent plastic strain (SEPS). It is indicated that all three groups have different mandible specializations for cutting plants.

      (2) Phylogenetic reconstruction: These groups have different narial region morphology, and in connection with this, have different stages of trunk evolution. The phylogenetic tree shows the degree of specialization of the narial morphology. The evolutionary level of the narial region is correlated with that of the character combination related to horizontal cutting. In the trilophodont longirostrine gomphotheres, co-evolution between the narial region and horizontal cutting behaviour is strongly suggested.

      (3) Enamel isotopes analysis: The results of stable isotope analysis indicate an open environment with a diverse range of habitats and that the niches of these groups overlapped without obvious differentiation.

      The analysis shows that different eco-adaptations have led to the diverse mandibular morphology and that open-land grazing has driven the development of trunk-specific functions and the loss of the long mandible. This conclusion is supported by evidence from palaeoecological reconstruction, the reconstruction of feeding behaviors, and the detailed examination of mandibular and narial region morphology carried out during the study.

      All of the analyses are explained in detail in the supplementary files. The 3D models and movies in the supplementary files are detailed and understandable and explain the conclusion. The conclusions of the study are well supported by data.

      We appreciate your detailed and insightful review of our study. Your summary accurately captures the essence of our research, and we are pleased to note that multiple research methods were used to demonstrate our conclusions. Your recognition of the evidence-based conclusions from paleoecological, feeding behavior reconstruction, and morphological analyses reinforces the validity of our findings. Once again, we appreciate your time and thoughtful reviews.

      Reviewer #1 (Recommendations For The Authors):

      Thank you very much for the invitation to review this amazing manuscript. It is very well written and supported, and I have only minor suggestions to improve the text:

      (1) Some references are not in chronological sequence in the text, and this should be reviewed.

      We greatly appreciate the positive comments of the reviewer. We have revised the references in the manuscript as the reviewer suggested.

      (2) I suggest the use of the expression "bunodont proboscideans" instead of Gomphotheres because there is no agreement on whether Amebelodontidae and Choerolophodontidae are within Gomphotheriidae, as well as some brevirostrine bunodont proboscideans from South America. So I think it is ok to use "Gomphotheriidae", but not gomphotheres to refer to all bunodont proboscideans included in the study.

      The reviewer is correct. Using “gomphotheres” to refer to these three groups is inappropriate. We have replaced “gomphotheres” with "bunodont elephantiforms" throughout the entire manuscript. Here, we use “elephantiforms”, not “proboscideans”, to avoid confusion with some early proboscidean members like Moeritherium, etc.

      (3) I was expecting some discussion on the development of large trunks related to the gigantism in these bunodont proboscideans, regarding the huge skulls and the columnar limbs.

      We appreciate this suggestion, and we are aware that gigantism is a potential factor for trunk development. It is difficult to compare the three groups (Amebelodontidae, Choerolophodontidae, and Gomphotheriidae) in terms of their weight and limb bone length, because in our material, limb bones were rarely found, especially those associated with cranial material. Nevertheless, at this stage, all elephantiforms had significantly enlarged cranial sizes and limb bone lengths compared to early members like Phiomia. Gigantism caused the loss of flexibility in elephantiforms, and even the long limbs made it more difficult for an elephantiform to reach the ground. A long trunk compensates for this evolutionary change. Exploring these aspects further is a part of our future work.

      (4) The reference to Alejandro et al should be replaced by Kramarz et al (and the correct surname of the authors). The name and surname of this reference need to be corrected. The correct names are Kramarz, A., Garrido, A., Bond, M. 2019. Please correct this in the text too.

      We thank the reviewer for catching this error. This reference has been corrected.

      Reviewer #2 (Recommendations For The Authors):

      I believe your paper will lead to other studies on other Proboscidean groups on the evolution of the mandible and trunk. There are some corrections in the text:

      • In line 199 in the text in pdf, "Tassy, 1994" should be "Tassy, 1996".

      • In line 241, "studied" should be "studies"

      • In line 313, "," after the word "tool" should be "."

      We thank the reviewer for pointing out these errors and have revised the text as suggested.

      • In the References, you write "et al." in some references. You should write the names of all of the authors.

      • In the References: "Lister AM. 2013" and "Shoshani&Tassy" are not referenced in the text.

      • In the References: "Tassy P. Gaps, parsimony, and early Miocene elephantoids (Mammalia), with a re-evaluation of Gomphotherium annectens (Matsumoto, 1925). Zool. J. Linn." should be "Tassy P. 1994. Gaps, parsimony, and early Miocene elephantoids (Mammalia), with a re-evaluation of Gomphotherium annectens (Matsumoto, 1925). Zool. J. Linn. 112, 1-2, 101-117" and replaced before "Tassy P. 1996".

      We appreciate the reviewer’s suggestions and have revised these references.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors were trying to understand the relation between the development of large trunks and longirostrine mandibles in bunodont proboscideans of the Miocene, and how it reflects the variation in diet patterns.

      Strengths:

      The study is very well supported, written, and illustrated, with plenty of supplementary material. The authors included all Asian bunodont proboscideans with long mandibles and I suggest that they should use the expression "bunodont proboscideans" instead of gomphotheres.

      Weaknesses:

      I believe that the only weakness is the lack of discussion comparing their results with the development of gigantism and long limbs in proboscideans from the same epoch.

      The authors reviewed the manuscript according to my suggestions and responded well to all my comments.

    3. eLife assessment

      This study presents fundamental findings on the evolution of extremely elongated mandibular symphysis and tusks in longirostrine gomphotheres from the Early and Middle Miocene of northern China. The integration of multiple methods provides compelling results in the eco-morphology, behavioral ecology, and co-evolutionary biology of these taxa. In doing so, the authors elucidate the diversification of fossil proboscideans and their likely evolutionary responses to late Cenozoic global climatic changes.

    1. Author Response

      The following is the authors’ response to the previous reviews.

      Reviewer #1

      The authors provided experimental data in response to my comments/suggestions in the revision. Overall, most points were appropriate and satisfactory, but some issues remain.

      (1) It is not fully addressed how atypical survivors are generated independently of Rad52-mediated homologous recombination.

      The newly provided data indicate that the formation of atypical telomeres is independent of the Rad52 homologous recombination pathway.

      "The atypical telomeres clones exhibit non-uniform telomere pattern", but the TG-hybridized signals after XhoI digestion are clear and uniform.

      "Atypical telomere" clones may carry circular chromosomes embedded with short TG repeats, rather than linear chromosomes. In other words, atypical telomeres may differ from telomeres, the ends of chromosomes. Is atypical telomere formation dependent on NHEJ? Given that "two chromosomes underwent intra-chromosomal fusions" (Line 248), are atypical telomere clones detected frequently in SY13 cells containing two chromosomes?

      We thank the reviewer for these questions. Frankly, we have not been able to determine the chromosome structures in these so-called "atypical survivors". As we mentioned in the manuscript, there could be mixed telomere structures, e.g. TG tract amplification, intra-chromosome telomere fusion and inter-chromosome telomere fusion. Worse still, these "atypical survivors" may not have maintained a stable genome, and their karyotype may have undergone stochastic changes during passages. To avoid misunderstanding, we have changed the term "atypical" to "uncharacterized" in the revised manuscript.

      We have previously shown that deletion of YKU70 does not affect MMEJ-mediated intra-chromosome fusion in single-chromosome SY14 cdc13Δ cells (Wu et al., 2020). In SY12 cells, double knockout of TLC1 and YKU70 resulted in synthetic lethality, and we were unable to continue our investigation. The synthetic lethality of the TLC1 and YKU70 double deletion was shown in Figure 7B of the reviewed preprint version 1; this result was not included in the reviewed preprint version 2, in accordance with the reviewer's instructions.

      "Atypical” survivors could be detected in SY13 cells (Figure 1D), but the frequency of their formation in the SY13 strain appeared to be lower than in SY12. As one can imagine, SY13 contains two chromosomes and its survivors should have a higher frequency of intra-chromosome fusions.

      (2) From their data, it is possible that X and Y elements influence homologous recombination, type 1 and type 2 (type X), at telomeres. In particular, the presence of X and Y elements appears to be important for promoting type 1 recombination. In other words, although not essential, subtelomeres have some function in maintaining telomeres. I suggest that the authors include author response image 4 in the text. They could revise their conclusion and the paper title accordingly.

      Following this suggestion, we have included author response image 4 in the revised manuscript as Figure 2E, Figure 5D, Figure 6C and Figure 6E. Accordingly, we have changed the title to “Elimination of subtelomeric repeat sequences exerts little effect on telomere essential functions in Saccharomyces cerevisiae”.

      (3) Minor points: The newly added data indicate that X survivors are generated in a type 2-dependent manner. The authors could discuss how Y elements were eroded while retaining X elements (line 225, Figure 2A).

      We thank the reviewer for this suggestion. We have discussed it in the revised manuscript (p.13 line 244-245). When the telomere was deprotected, chromosome end resection took place. Since SY12 has only one Y’-element, it is hard to find homologous sequences with which to repair the Y’-element in XVI-L. Once the X-element in XVI-L was exposed by further resection, homologous sequences for repair were easier to find. Thus, in Type X survivors the Y’-element was eroded while the X-element was retained.

      Reviewer #2

      I would like to congratulate the authors for their work and the efforts they put in improving the manuscript. The major criticism I had previously, ie testing the genetic requirements for the survivor subtypes, has been met. Below are a few minor comments that don't necessarily require a response.

      (1) I think the Author response image 6 could have been included in the manuscript. I understand that the authors don't want to overinterpret survivor subtype frequencies, but this figure would have suggested some implication of Rad51 in the emergence of survivors even in the absence of Y' elements. At this stage, however, it is up to the authors, and leaving this figure out is also fine in my opinion.

      According to the suggestion, the author response image 6 has been presented as Figure 6—figure supplement 7.

      (2) Chromosome circularization seems to rely on microhomologies. Previously, the authors proposed that SY14 circularization depended on SSA (Wu et al. 2020), but here, since circularization appears to be Rad52-independent, it is likely to be based on MMEJ rather than SSA (although there are contradictory results on Rad52's role in MMEJ in the literature).

      Yes, we mentioned it in the revised manuscript.

      (3) p. 28 lines 511-513: "The erosion sites and fusion sequences differed from those observed in SY12 tlc1Δ-C1 cells (Figure 2D), suggesting the stochastic nature of chromosomal circularization": I don't think they are necessarily stochastic, because the sequences beyond the telomeres are now modified, the available microhomologies have changed as well.

      We agree with your opinion. Different chromosomes tend to have some hotspots for chromosome fusion. For example, in Figure 6C and 6F the resection sites in Chr1 and Chr2 were the same in SY12XYΔ+Y tlc1Δ-C1 and SY12XYΔ tlc1Δ-C1. We therefore speculate that there are some hotspots for chromosome fusion, but which site the cell chooses in any single chromosome fusion event is stochastic.

      (4) Typos and other errors:

      • p. 3 line 52: "subtelomerice" and "varies" are misspelled.

      • p. 5 line 78: "processes" should be "process".

      • Supp files are mislabelled (the numbers do not correspond to file name).

      • Supp file 2: how come SY12 has only one Y' element and SY13 has two?

      • p. 10 line 175: "emerging" should be "emergence".

      • p.15 line 276: "counter-selected" should be "being counter-selected" or "counterselection".

      • p. 29 line 523: "the formation of them" should be "their formation".

      • p. 37 line 653: "could have been an ideal tool": the sentence is grammatically incorrect. Writing "AND could have been an ideal tool" is enough to make it structurally correct.

      Thanks for pointing these errors out. We have corrected them in the revised manuscript. As for the question “how come SY12 has only one Y' element and SY13 has two?”, we are not sure at this moment. We speculate that one of the Y’ elements might have been lost during genetic engineering of the chromosomes by the CRISPR–Cas9 system.

      Reviewer #3

      The authors included statistical analyses of the qPCR data (Fig 4B) as requested, but did not comment on the striking difference in expression of MPH3 and HSP32 in the SY12 strain compared to BY4742. An improvement of the manuscript is the inclusion of rad52 tlc1 strains in their analyses, demonstrating that the "atypical and circular survivors" arose independently of homologous recombination. In addition, by analyzing rad51 and rad50 mutant strains they could demonstrate that the "type X" survivors had similar molecular requirements to type II survivors. Overall, the revised submission improves the article.

      We thank the reviewer for these comments and suggestions. The SY12 strain (with three chromosomes) exhibited lower expression levels of both MPH3 and HSP32 compared to the parental strain BY4742 (with 16 chromosomes). We speculate that, with the reduced chromosome number, the silencing proteins are no longer titrated by the telomeres that have been deleted. We have added these comments in the revised manuscript.

      Wu, Z.J., Liu, J.C., Man, X., Gu, X., Li, T.Y., Cai, C., He, M.H., Shao, Y., Lu, N., Xue, X., et al. (2020). Cdc13 is predominant over Stn1 and Ten1 in preventing chromosome end fusions. Elife 9.

    2. Reviewer #3 (Public Review):

      This study investigates subtelomeric repetitive sequences in the budding yeast Saccharomyces cerevisiae, known as Y' and X-elements. Taking advantage of yeast strain SY12 that contains only 3 chromosomes and six telomeres (normal yeast strains contain 32 telomeres) the authors are able to generate a strain completely devoid of Y'- and X-elements.

      Strengths:

      They demonstrate that the SY12 delta XY strain displays normal growth, with stable telomeres of normal length that were transcriptionally silenced, a key finding with wide implications for telomere biology. Inactivation of telomerase in the SY12 and SY12 delta XY strains frequently resulted in survivors that had circularized all three chromosomes, hence bypassing the need for telomeres altogether. They show that survivors with fused chromosomes and so-called atypical survivors arise independently of the central recombination protein Rad52. The SY12 and SY12 delta XY yeast strains can become a useful tool for future studies of telomere biology. The conclusions of this manuscript are well supported by the data and are valuable for researchers studying telomeres.

      Weaknesses:

      A weakness of the manuscript is the analysis of telomere transcriptional silencing. They state: "The results demonstrated a significant increase in the expression of the MPH3 and HSP32 upon Sir2 deletion, indicating that telomere silencing remains effective in the absence of X and Y'-elements". However, for the SY12 strain, their analyses indicate that the difference between the WT and sir2 strains is nonsignificant. In addition, a striking observation is that the SY12 strain (with only three chromosomes) expresses much less of both MPH3 and HSP32 than the parental strain BY4742 (16 chromosomes), both in the presence and absence of Sir2.

    3. eLife assessment

      This important study advances our understanding of the biological significance of the DNA sequence adjacent to telomeres. The data presented convincingly demonstrate that subtelomeric repeats are non-essential and have a minimal, if any, role in maintaining telomere integrity of budding yeast. The work will be of interest to the telomere community specifically and the genome integrity community more broadly.

    4. Reviewer #1 (Public Review):

      The authors have generated a set of yeast S. cerevisiae strains containing different numbers of chromosomes. Elimination of telomerase activates homologous recombination (HR) to maintain telomeres in cells containing the original 16 chromosomes. However, elimination of telomerase leads to chromosome circularization in cells containing one or two chromosomes. The authors examined whether the subtelomeric sequences X and Y' promote HR-mediated telomere maintenance using the strain SY12 carrying three chromosomes. They found that the subtelomeric sequences X and Y' are dispensable for cell proliferation and HR-mediated telomere maintenance in telomerase-minus SY12 cells. They conclude that subtelomeric X and Y' sequences do not play essential roles in either telomerase-proficient or telomerase-null cells and propose that these sequences represent remnants of genome evolution.

      Interestingly, telomerase-minus SY12 generates survivors that are different from well-established Type I or Type II survivors. The authors uncover atypical telomere formation which does not depend on the Rad52 homologous recombination pathway.

      Strengths:

      The authors examined whether the subtelomeric sequences X and Y' promote HR-mediated telomere maintenance using the strain SY12 carrying three chromosomes. They show that subtelomeres do not have essential roles in telomere maintenance and cell proliferation.

      Weaknesses:

      It is not fully addressed how atypical survivors are generated independently of Rad52-mediated homologous recombination. It remains possible that X and Y elements influence homologous recombination, type 1 and type 2 (type X), at telomeres. In particular, the presence of X and Y elements appears to be important for promoting type 1 recombination, although the authors conclude "Elimination of subtelomeric repeat sequences exerts little effect on telomere functions".

    5. Reviewer #2 (Public Review):

      Summary:

      In this work, Hu and colleagues investigate telomerase-independent survival in Saccharomyces cerevisiae strains engineered to have different chromosome numbers. The authors describe the molecular patterns of survival that change with fewer chromosomes and that differ from the well-described canonical Type I and Type II, including chromosome circularization and other atypical outcomes. They then take advantage of the strain with 3 chromosomes to examine the effect of deleting all the subtelomeric elements, called X and Y'. For most of the tested phenotypes, they find no significant effect of the absence of X- and Y'-elements, and show that they are not essential for survivor formation. They speculate that X- and Y'-elements are remnants of ancient telomere maintenance mechanisms.

      Strengths:

      This work advances our understanding of the telomerase-independent strategies available to the cell by altering the structure of the genome and of the subtelomeres, a feat that was enabled by the set of strains they engineered previously. By using strains with non-standard genome structures, several alternative survival mechanisms are uncovered, revealing the diversity and plasticity of telomere maintenance mechanisms. Overall, the conclusions are well supported by the data, with adequate sample sizes for investigating survivors. The assessment of the genetic requirements for survivors in strains with different chromosome numbers greatly improved the quality of this work. The molecular analyses based on Southern blots are also very well-conducted.

      Weaknesses:

      The authors discovered alternative telomerase-independent survival strategies beyond the well-described type I and II (including circularization, type X and atypical, as they called them) at play in the context of a reduced number of chromosomes. Their work provides a molecular and a partial genetic characterization of these survival pathways. A more thorough analysis of the frequency of each type of survivor and their genetic requirements would have advanced our understanding of the diversity of survival strategies in the absence of telomerase. However, as noted by the authors, the quantification of the rate of emergence of survivors (and their subtypes) is very difficult to achieve. This comment is therefore not meant as a criticism but rather as a perspective on exciting future research avenues.

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment:

      This valuable study describes a new role of epithelial intercellular adhesion molecule 1 (ICAM-1) protein in controlling bile duct size. The effect is mediated via EBP-50 and subapical actomyosin to regulate size of bile canaliculi. These solid findings have theoretical and practical implications in hepatology and human disorders of bile ducts.

      Public Reviews:

      In this study, Cacho-Navas et al. describe the role of ICAM-1 expressed on the apical membrane of bile canaliculi and its function to control the bile canaliculi (BCs) homeostasis. This is a previously unrecognized function of this protein in hepatocytes. The same authors have previously shown that basolateral ICAM-1 plays a role in controlling lymphocyte adhesion to hepatocytes during inflammation and that this interaction is responsible for the loss of polarity of hepatocytes during disease states.

      This new study shows that ICAM-1 is mainly localized in the apical domain of the BC and, in association with EBP-50, communicates with the subapical acto-myosin ring to regulate the size and morphology of the BC. They used the well-known immortal liver cell line HepG2, in which they deleted the ICAM-1 gene by CRISPR-Cas9 editing, and hepatic organoids derived from WT and ICAM-1-KO mice, alternating KO as well as rescue experiments. They show that in the absence of apical ICAM-1, the BC become dilated.

      The data sufficiently support the conclusions of the study.

      Recommendations for the authors:

      We would like to thank the editor and reviewer for recognizing the manuscript's value and the solid nature of the data. We are also thankful to them for acknowledging that the manuscript supports the conclusions. Below, we have addressed their commentaries and questions in a point-by-point rebuttal document:

      We have a few suggestions to improve the manuscript:

      (1) HepG2 cells form canaliculi-like structures but are not the ideal system to study the apical basal polarity. On the other hand, hepatic organoids can assume a hepatocyte-like phenotype, when cultured under specific conditions but are not functionally comparable to hepatocytes organized in a 3D structure with a hollow lumen that does not recapitulate the BC physiological structure. Therefore, primary hepatocyte in collagen sandwich would be the best model to study the polarization of BCs and could be isolated from WT and ICAM-1-KO mice, that are available. Some of the major findings should be confirmed in this system.

      We adopted the culture of hepatic organoids as an experimental strategy because of the difficulties in culturing primary hepatocytes that we experienced in previous analyses (Reglero-Real, Cell Rep, 2014). The generation of organoids or mature hepatocytes from various sources of stem cells is a commonly employed strategy in hepatocyte cell biology (Meyer et al. EMBO Rep, 2023), due to the difficulties in maintaining mature hepatic epithelial cell cultures for longer than a few hours.

      The hepatic organoids we have used in the manuscript are being accepted as advanced cellular strategies for a broad range of fields (Belenguer, Nat Commun, 2022; de Crignis, eLife, 2021; Huch, Cell, 2015). Although they have some morphological differences from real hepatocytes, we conducted a thorough characterization of their organization, identifying canalicular-like structures with functional (CFDA) and molecular (HA-4) markers, which we believe adds value to the manuscript. In addition, the organoid technology has allowed us to import the bipotent precursors to obtain a permanent source of hepatic cells without the need to import and use the ICAM-1_KO mice, in line with current guidelines to reduce animal experimentation.

      Taking this into account and to further validate data obtained with our cellular systems, we carried out a quantification of the canalicular diameter in livers from WT and ICAM1_KO cells (New Figure 8B), which validates our data on human cell lines and organoids. We acknowledge that the data obtained from hepatic tissues cannot rule out the contribution of immune cell adhesion to changes in the hepatocyte architecture. However, these experiments, together with the aforementioned organoids and human cell lines, strongly suggest a role for hepatic ICAM-1 in regulating canalicular size.

      (2) Overexpression of proteins was used in the study. While this approach is an easier means to visualize, without the use of specific antibodies, it is known to alter the distribution of the protein compared to the endogenous one.

      Most of our characterization has been done with antibodies or other fluorescent tools against endogenous proteins localized at BCs: CD59, F-actin, EBP50, MHC, MLC…. In addition, we have included MDR1-GFP and GFP-Rab11, the latter to analyze the subapical compartment (SAC) surrounding BCs. As requested by the reviewer, we now include in a new Supplementary Figure 1C the confocal analyses of endogenous canalicular markers, radixin and MRP2, as well as a new Supplementary Figure 1D containing the staining of an endogenous marker of the SAC, plasmolipin/PLLP (Fraticelli et al, Nat Cell Biol, 2015; Cacho-Navas, Cell Mol Life Sci, 2022), which is consistent with the previous analyses performed with GFP-Rab11.

      (3) In the absence of ICAM-1, BCs change shape and dimension but still show the presence of microvilli. What happens to the distribution of polarized transporters like Mrp2, or the transport of bile acids (CFDA clearance) in vivo in the KO animal?

      Thank you for this comment. We have analyzed this transporter in murine livers and human hepatic cells. MRP2 distribution does not significantly change and remains concentrated in BCs in ICAM-1_KO livers as well (New Figure 8C). Likewise, ICAM-1 gene editing does not affect MRP2 localization in the polarized human hepatic epithelial cell line in vitro (Supplementary Figure 1C). We cannot rule out changes for this transporter in other murine liver cell types in vivo, such as sinusoidal endothelial cells, which we believe should be further addressed in a different piece of work.

      (4) Does the lack of ICAM-1 affect the cell viability, proliferation or cell size?

      ICAM-1_KO cells proliferate slightly more slowly than their WT counterparts, with no detected changes in cell size and death. We present these data in Supplementary Figure 1, A and B.

      (5) Are the findings recapitulated in the livers of ICAM-1 KO animals?

      ICAM-1 KO animals present enlarged BCs, which is consistent with the main findings of the manuscript (Figure 8B).

      The text needs to be more concise. Some of the concepts, in particular those already published, should be condensed. There is a large amount of experiments that are difficult to connect logically. Possibly, cartoons summarizing the approach of the figure could help the reader.

      The text of the Results and Discussion sections has been shortened by almost 100 words, even though additional panels and experiments are now described and discussed. New cartoons have been added in Figure 5G and Figure 8F, in addition to those previously included in Figure 1 and Supplementary Figure 6, the latter containing a graphical description of the main conclusions.

      Also, more detailed information about statistical analysis (what post-test was used?), concentration of cytokines, and description of the mouse model should be included in the methods.

      Cytokine concentrations have been included in the legend of Figure 3 and in the Cell and Culture section of Methods. A brief description of the ICAM-1_KO mouse and the corresponding reference for further information is also provided in the Organoid Culture section of Methods. A statistical analysis section describing the post-test used is also included at the end of Methods. The references for the anti-plasmolipin, anti-radixin and anti-MRP2 antibodies, as well as the new fixation methods used for immunofluorescence, are also included in the corresponding Antibody List and in the Confocal Microscopy section of Methods, respectively.

      Figure 3D. Sample names should be added as in the rest of the figures.

      The arrangement of sample names in Figure 3D has been revised and is now similar to that of Figure 3A.

    2. eLife assessment

      The authors report useful findings on a novel function of apical ICAM1 in regulating bile duct homeostasis in the liver. The strength of evidence is solid, using appropriate methodology with only minor weaknesses. The findings will be of interest to researchers in hepatology and membrane traffic biology.

    3. Reviewer #1 (Public Review):

      In this study, Cacho-Navas et al. describe the role of ICAM-1 expressed on the apical membrane of bile canaliculi and its function in controlling the homeostasis of the bile canaliculi (BCs). This is a previously unrecognized function of this protein in hepatocytes. The same authors have previously shown that basolateral ICAM-1 plays a role in controlling lymphocyte adhesion to hepatocytes during inflammation and that this interaction is responsible for the loss of polarity of hepatocytes during the disease.

      In this new study they show that ICAM-1 is mainly localized in the apical domain of the BC and, in association with EBP-50, communicates with the subapical acto-myosin ring to regulate the size and morphology of the BC.

      They used the well-known immortal liver cell line HepG2, in which they knocked out ICAM-1 using CRISPR-Cas9 editing, and hepatic organoids derived from WT and ICAM-1-KO mice. By alternating knock-out and rescue experiments, they show that in the absence of apical ICAM-1 the BC dimension and shape are altered.

      The conclusions of the study are sufficiently supported by the data.

      Comments on revision:

      The authors have addressed most of the reviewer's comments in the re-submission; however, the use of the organoids as a model to study bile canaliculi is still not convincing.

      The HA-4 staining and the space where CFDA is secreted do not overlap, considering the position of the nuclei in the middle z-stack section. Also, the interdigitations between cells identified by EM do not form an enclosed space, as we would expect for a bile canaliculus.

      I understand that other studies have used these organoids to show some hepatocytic functions, but none has previously characterized the formation of bile canaliculi as suggested in this study. Therefore, a characterization showing the expression of specific markers (e.g., mrp2, bsep) should be provided to support this claim.

      I suggest that the authors carefully read the helpful review by Marsee et al., Cell Stem Cell 2021, in which experts in the field clearly and carefully address the classification and validation of liver organoids.

    1. eLife assessment

      The paper reports rare compound heterozygous deletion variants that affect the kinase domains of non-receptor tyrosine kinases TNK and ACK1 in families with human systemic lupus erythematosus (SLE). Using a mouse experimental model and human induced pluripotent stem cell (hiPSC)-derived macrophages, the study provides solid evidence that clarifies cause-effect relationships and that suggests a potential cellular mechanism underlying the resultant nephritis. With the identification of novel SLE-related genes, this manuscript provides an important basis for understanding the molecular and cellular basis of SLE pathogenesis.

    2. Reviewer #2 (Public Review):

      Summary:

      In this manuscript, the authors revealed that genetic deficiencies of ACK1 and BRK are associated with human SLE. First, the authors found compound heterozygous deleterious variants in the kinase domains of the non-receptor tyrosine kinases (NRTKs) TNK2/ACK1 in one multiplex family and PTK6/BRK in another family. Then, by an experimental blockade of ACK1 or BRK in a mouse SLE model, they found an increase in glomerular IgG deposits and circulating autoantibodies. Furthermore, they reported that ACK and BRK variants from the SLE patients impaired the MERTK-mediated anti-inflammatory response to apoptotic cells in human induced pluripotent stem cell (hiPSC)-derived macrophages. This work identified new SLE-associated ACK and BRK variants and a role for the NRTKs TNK2/ACK1 and PTK6/BRK in efferocytosis, providing a new molecular and cellular mechanism of SLE pathogenesis.

      Strengths:

      This work identified new SLE-associated ACK and BRK variants and a role for the NRTK TNK2/ACK1 and PTK6/BRK in efferocytosis, providing a new molecular and cellular mechanism of SLE pathogenesis.

      Weaknesses:

      Although the manuscript is well-organized and clearly stated, there are some points below that should be considered:

      * In this study, the authors used forward genetic analyses to identify novel gene mutations that may cause SLE, combined with GWAS studies of SLE. To further explore the importance of these variants, haplotype analysis of two candidate genes could be performed, to observe the evolution and selection relationship of candidate genes in the population (UK 1000 biobank, for example).

      * Although the authors focused on SLE and macrophage efferocytosis in their studies, direct evidence of how macrophage efferocytosis significantly affects SLE is lacking. This point should at least be explicitly introduced and discussed by citing appropriate literature.

      * It is still not clear how the target molecules identified in this paper may influence macrophage efferocytosis. More direct evidence should be established.

      * For some transcriptional repressors mentioned in their studies, the authors should check whether there is clear experimental evidence. If not, it is recommended to supplement the experimental verifications for clarity.

      * In Figures 4C and 4D, it is seen that the use of inhibitors causes cytoskeletal changes; however, this reviewer would not have expected such a large change. Did the authors check whether the cells die after heavy treatment with the inhibitors?

    3. Reviewer #1 (Public Review):

      Summary:

      The authors report compound heterozygous deleterious variants in the kinase domains of the non-receptor tyrosine kinases (NRTK) TNK2/ACK1 in familial SLE. They suggest that ACK1 and BRK deficiencies are associated with human SLE and impair efferocytosis.

      Strengths:

      The identification of similar mutations in non-receptor tyrosine kinases (NRTKs) in two different families with familial SLE is a significant finding in human disease. Furthermore, the paper provides a detailed analysis of the molecular mechanisms behind the impairment of efferocytosis caused by mutations in ACK1 and BRK.

      Weaknesses:

      A critical point in this paper is whether the loss of function of ACK1 or BRK contributes to the onset of familial SLE. The authors emphasize that inhibitors of ACK1/BRK worsened IgG deposition in the kidneys in a pristane-induced SLE model, which contributes not to the onset but to the exacerbation of SLE, thus only partially supporting their claim.

    1. Reviewer #1 (Public Review):

      Summary:

      This manuscript investigates the regulation of chlorophyll biosynthesis in rice embryos, focusing on the role of OsNF-YB7. The rigorous experimental approach, combining genetic, biochemical, and molecular analyses, provides a robust foundation for these findings. The research achieves its objectives, offering new insights into chlorophyll biosynthesis regulation, with the results convincingly supporting the authors' conclusions.

      Strengths:

      The major strengths include the detailed experimental design and the findings regarding OsNF-YB7's inhibitory role.

      Weaknesses:

      However, the manuscript's discussion on the practical implications for agriculture and the evolutionary analysis of regulatory mechanisms could be expanded.

    2. eLife assessment

      This is an important study on the regulation of chlorophyll biosynthesis in rice embryos. It provides insights into the genetic and molecular interactions that underlie chlorophyll accumulation, highlighting the inhibition of OsGLK1 by OsNF-YB7 and the broader implications for understanding chloroplast development and seed maturation in angiosperms. The results presented, including mutation analysis, gene expression profiles, and protein interaction studies, provide convincing evidence for the function of OsNF-YB7 as a repressor in the chlorophyll biosynthesis pathway.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors set out to establish the role of the rice LEC1 homolog OsNF-YB7 in embryo development, especially as it pertains to the development of photosynthetic capacity, with chlorophyll production as a primary focus.

      Strengths:

      The results are well-supported and each approach used complements each other. There are no major questions left unanswered and the central hypothesis is addressed in every figure.

      Weaknesses:

      There are a handful of sections that could use clarifying for readers, but overall this is a solidly composed manuscript.

      The authors clearly achieved their aims; the results compellingly establish a disparity between how this system operates in rice and Arabidopsis. Conclusions are thoroughly supported by the provided data and interpretations. This work will force a reconsideration of the value of Arabidopsis as a model organism for embryo chlorophyll biosynthesis and possibly photosynthesis during embryo maturation more broadly, as rice is a major crop organism and it very clearly does not follow the Arabidopsis model. It will thus be useful to carry out similar tests in other organisms rather than relying on Arabidopsis and attempting to more fully establish the regulatory mechanism in rice.

    4. Reviewer #3 (Public Review):

      Summary:

      In this study, the authors set out to understand the mechanisms behind chlorophyll biosynthesis in rice, focusing in particular on the role of OsNF-YB7, an ortholog of Arabidopsis LEC1, which is a positive regulator of chlorophyll (Chl) biosynthesis in Arabidopsis. They showed that OsNF-YB7 loss-of-function mutants in rice have chlorophyll-rich embryos, in contrast to Arabidopsis LEC1 loss-of-function mutants. This contrasting phenotype led the authors to carry out extensive molecular studies on OsNF-YB7, including in vitro and in vivo protein interaction studies, gene expression profiling, and protein-DNA interaction assays. The evidence provided well supported the core arguments of the authors, emphasising that OsNF-YB7 is a negative regulator of Chl biosynthesis in rice embryos by mediating the expression of OsGLK1, a transcription factor that regulates downstream Chl biosynthesis genes. In addition, they showed that OsNF-YB7 interacts with OsGLK1 to negatively regulate the expression of OsGLK1, demonstrating the broad involvement of OsNF-YB7 in rice Chl biosynthetic pathways.

      Strengths:

      This study clearly demonstrated how OsNF-YB7 regulates its downstream pathways using several in vitro and in vivo approaches. For example, gene expression analysis of OsNF-YB7 loss-of-function and gain-of-function mutants revealed the expression of selected downstream chl biosynthetic genes. This was further validated by EMSA on the gel. The authors also confirmed this using luciferase assays in rice protoplasts. These approaches were used again to show how the interaction of OsNF-YB7 and OsGLK1 regulates downstream genes. The main idea of this study is very well supported by the results and data.

      Weaknesses:

      From an evolutionary perspective, it is interesting to see how two similar genes have come to play opposite roles in Arabidopsis and rice. It would have been more interesting if the authors had carried out a cross-species analysis of AtLEC1 and OsNF-YB7. For example, overexpressing AtLEC1 in an osnf-yb7 mutant to see if the phenotype is restored or enhanced. Such an approach would help us understand how two similar proteins can play opposite roles in the same mechanism within their respective plant species.

    1. eLife assessment

      This important study combines a range of biophysical techniques to carry out a series of compelling experiments to explore whether glutamine binding protein binds glutamine via an induced fit or a conformational selection process. The evidence supporting the major conclusion of the work is convincing, although it may not be generalized to other protein-ligand or protein-protein systems. The work will be of broad interest to biochemists and biophysicists.

    2. Reviewer #1 (Public Review):

      Here the authors discuss mechanisms of ligand binding and conformational changes in GlnBP (a small E. coli periplasmic binding protein, which binds and carries L-glutamine to the inner membrane ATP-binding cassette (ABC) transporter). The authors have distinguished records in this area and have published seminal works. They include experimentalists and computational scientists. Accordingly, they provide comprehensive, high-quality, experimental and computational work.

      They observe that apo- and holo-GlnBP do not generate detectable exchange between open and (semi-)closed conformations on timescales between 100 ns and 10 ms. In particular, the ligand binding and conformational changes in GlnBP that they observe are highly correlated. Their analysis of the results indicates a dominant induced-fit mechanism, where the ligand binds GlnBP prior to conformational rearrangements. They then suggest that an approach resembling the one they undertook can be applied to other protein systems where the coupling mechanism of conformational changes and ligand binding remains unclear.

      They argue that the intuitive model, in which ligand binding triggers a functionally relevant conformational change, was challenged by structural experiments and MD simulations revealing the existence of unliganded closed or semi-closed states and their dynamic exchange with open unbound conformations. They discuss the alternative mechanisms that were proposed, along with their merits and difficulties, and conclude that the findings were controversial, which, they suggest, is due to insufficient experimental evidence to distinguish between them. As to further specific conclusions they draw from their results, they determine that a conformational selection mechanism is incompatible with their results, but induced fit is. They thus propose induced fit as the dominant pathway for GlnBP, further supported by the notion that the open conformation is much more likely to bind substrate than the closed one based on steric arguments.

      Considering the landscape of substrate-free states, in my view, the closed state is likely to be the most stable and thus the most highly populated. As the authors note, and I agree, that state can be sterically infeasible for a deep-pocketed substrate. As they also underscore, there is likely to be a range of open states. If the populations of certain states are extremely low, they may not be detected by the experimental (or computational) methods. The free energy landscape of the protein can populate all possible states, with the populations determined by their relative energies. In principle, the protein can visit all states. Whether a particular state is observed depends on the time the protein spends in that state. The frequencies, or propensities, of the visits can determine the protein function. As to a specific order of events, in my view, there isn't any. It is a matter of probabilities, which depend on the populations (energies) of the states. The open conformation that is likely to bind is the most favorable, permitting substrate access, followed by minor, induced-fit conformational changes. However, a key factor is the ligand concentration. Ligand binding requires overcoming barriers to sustain the equilibrium of the unliganded ensemble, and thus takes time. If the population of the state is low and the ligand concentration is high (often the case in in vitro experiments and in high drug dosage scenarios), binding is likely to take place across a range of available states.
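
      The statement that populations are determined by relative energies can be illustrated with a minimal two-state Boltzmann sketch. The free energy differences below are hypothetical values chosen for illustration, not measurements from the manuscript, and the function name `two_state_populations` is introduced here only for this example.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def two_state_populations(delta_g_kcal, temperature_k=298.15):
    """Return (p_major, p_minor) for a two-state equilibrium.

    delta_g_kcal is the free energy of the minor state relative to the
    major state (a hypothetical input, not a value from the paper).
    """
    rt = R_KCAL * temperature_k
    ratio = math.exp(-delta_g_kcal / rt)  # minor/major population ratio
    p_minor = ratio / (1.0 + ratio)
    return 1.0 - p_minor, p_minor


# Hypothetical free energy gaps: a few kcal/mol is enough to push a state
# down to a population of a few percent or less.
for dg in (0.0, 1.0, 2.0, 3.0):
    _, minor = two_state_populations(dg)
    print(f"dG = {dg:.1f} kcal/mol -> minor-state population = {minor:.2%}")
```

      Under these assumptions, a gap of roughly 2 to 3 kcal/mol already places the minor state at a few percent or below, which is the regime where whether a state is "observed" depends on the sensitivity of the method.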

      This is, however, a personal interpretation of the data. The paper here, which clearly embodies massive, careful, and high-quality work, is extensive, making use of a range of experimental approaches, including isothermal titration calorimetry, single-molecule Förster resonance energy transfer, and surface-plasmon resonance spectroscopy. The problem the authors undertake is of fundamental importance.

    3. Reviewer #2 (Public Review):

      Summary:

      The manuscript by Han et al and Cordes is a tour-de-force effort to distinguish between induced fit and conformational selection in glutamine binding protein (GlnBP). It is important to say that I don't agree that a decision needs to be made between these two limiting possibilities in the sense that whether a minor population can be observed depends on the experiment and the energy difference between the states. That said, the authors make an important distinction which is that it is not sufficient to observe both states in the ligand-free solution because it is likely that the ligand will not bind to the already closed state. The ligand binds to the open state and the question then is whether the ligand sufficiently changes the energy of the open state to effectively cause it to close. The authors point out that this question requires both a kinetic and a thermodynamic answer. Their "method" combines isothermal titration calorimetry, single-molecule FRET including key results from multi-parameter photon-by-photon hidden Markov modelling (mpH2MM), and SPR. The authors present this "method" of combination of experiments as an approach to definitively differentiate between induced fit and conformational selection. I applaud the rigor with which they perform all of the experiments and agree that others who want to understand the exact mechanism of protein conformational changes connected to ligand binding need to do such a multitude of different experiments to fully characterize the process. However, the situation of GlnBP is somewhat unique in the high affinity of the Gln (slow off-rate) as compared to many small molecule binding situations such as enzyme-substrate complexes. It is therefore not surprising that the kinetics result in an induced fit situation. 
      In the case of the E-S complexes I am familiar with, the dissociation is much more rapid because the substrate binding affinity is in the micromolar range and therefore the re-equilibration of the apo state is much faster. In this case, the rate of closing and opening doesn't change much whether ligand is present or not. Here, of course, once the ligand is bound the re-equilibration is slow. Therefore, I am not sure if the conclusions based on this single protein are transferable to most other protein-small molecule systems. I am also not sure if they are transferable to protein-protein systems, where both molecules, the ligand and the receptor, are expected to have multiscale dynamics that change upon binding.

      Strengths:

      The authors provide beautiful ITC data and smFRET data to explore the conformational changes that occur upon Gln binding. Figure 3D and Figure 4 provide the really critical data, namely the multi-parameter photon-by-photon hidden Markov modelling (mpH2MM) analysis. In the presence of glutamine concentrations near the Kd, two FRET-active sub-populations are identified that appear to interconvert on timescales slower than 10 ms. They then do a whole bunch of control experiments to look for faster dynamics (Figure 5). They also do TIRF smFRET to try to compare their results to those of previous publications. Here, they find that several artifacts are occurring, including inactivation of ~50% of the proteins. They also perform SPR experiments to measure the association rate of Gln and obtain expectedly rapid association rates on the order of 10^8 M^-1 s^-1.

      Weaknesses:

      Looking at the traces presented in the supplementary figures, one can see that several of the traces have more than one molecule present. The authors should make sure that they use only traces with a single photobleaching event for each fluorophore. One can see steps in some of the green traces that indicate two green fluorophores (likely from 2 different molecules) in the traces. This is one of the frequent problems with TIRF smFRET with proteins: only some of the spots represent single molecules, and the rest need to be filtered out of the analysis.

      The NMR experiments that the authors cite are not in disagreement with the work presented here. NMR is capable of detecting "invisible states" that occur in 1-5% of the population. SmFRET is not capable of detecting these very minor states. I am quite sure that if NMR spectroscopists could add very high concentrations of Gln they would also see a conversion to the closed population.

    1. eLife assessment

      This paper provides a useful analysis of the variation of the burden of strokes across geographic regions, finding differences in the relationship between strokes and their comorbidities. This dataset and the correlations found within will be a resource for directing the focus of future investigations. The statistical analyses are incomplete.

    2. Reviewer #1 (Public Review):

      Summary:

      The paper measures the prevalence and mortality of stroke and its comorbidities across geographic regions in order to find differences in risks that may lead to more effective guidance for these subpopulations. It also does a genetic analysis to look for variants that may drive these phenotypic variations.

      Strengths:

      The data provided here will provide a foundation for a lot of future research into the causes of the observed correlations as well as whether the observed differences in comorbidities across regions have clinically relevant effects on risk management.

      Weaknesses:

      As with any cross-national analysis of rates, the data is vulnerable to differences in classification and reporting across jurisdictions. Furthermore, given the increased death rate from COVID-19 associated with many of these comorbid conditions and the long-term effects of COVID-19 infection on vascular health, it is expected that many of the correlations observed in this dataset will shift along with the shifting health of the underlying populations.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors have analyzed ethnogeographic differences in the comorbidity factors, such as diabetes and heart disease, for the incidences of stroke and whether it leads to mortality.

      Strengths:

      The idea is interesting and the data are compelling. The results are technically solid.

      The authors identify specific genetic loci that increase the risk of a stroke and how they differ by region.

      Weaknesses:

      The presentation is not focused. It would be better to include p-values and focus presentation on the main effects of the dataset analysis.

    1. eLife assessment

      This study provides valuable information on how Arg-II participates in cardiac aging. Although the phenotypic data appear robust, the study is incomplete in elucidating the mechanisms, particularly in explaining how Arg II influences IL-1b and affects cardiac aging. It would be beneficial to investigate the possibility of NO involvement in this mouse model. A co-culture system may be required to understand the non-cell-autonomous functions of macrophages. Lastly, the MI mouse model may not be directly linked to cardiac aging.

    2. Reviewer #1 (Public Review):

      Summary:

      The manuscript by Duilio M. Potenza et al. explores the role of Arginase II in cardiac aging, mainly using whole-body arg-ii knock-out mice. In this work, the authors have found that Arg-II exerts non-cell-autonomous effects on aging cardiomyocytes, fibroblasts, and endothelial cells mediated by IL-1b from aging macrophages. The authors have used arg II KO mice and an in vitro culture system to study the role of Arg II. The authors have also reported the cell-autonomous effect of Arg-II through mitochondrial ROS in fibroblasts that contribute to cardiac aging. These findings are sufficiently novel in cardiac aging and provide interesting insights. While the phenotypic data seem strong, the mechanistic details are unclear. How Arg II regulates IL-1b and modulates cardiac aging is still to be determined. The authors still need to determine whether Arg II in fibroblasts and endothelial cells contributes to cardiac fibrosis and cell death. This study also lacks a comprehensive understanding of the pathways modulated by Arg II to regulate cardiac aging.

      Strengths:

      This study provides interesting information on the role of Arg II in cardiac aging.

      The phenotypic data in the arg II KO mice is convincing, and the authors have assessed most of the aging-related changes.

      The data is supported by an in vitro cell culture system.

      Weaknesses:

      The manuscript needs more mechanistic details on how Arg II regulates IL-1b and modulates cardiac aging.

      The authors used whole-body KO mice, and the role of macrophages in cardiac aging is not studied in this model. A macrophage-specific arg II KO would be a better model.

      Experiments need to validate the deficiency of Arg II in cardiomyocytes.

      The authors have not investigated the possibility of NO involvement in this mouse model.

      A co-culture system would be appropriate to understand the non-cell-autonomous functions of macrophages.

      The Myocardial infarction data shown in the mice model may not be directly linked to cardiac aging.

    3. Reviewer #2 (Public Review):

      Summary:

      The results from this study demonstrated a cell-specific role of the mitochondrial enzyme arginase-II (Arg-II) in heart aging and revealed a non-cell-autonomous effect of Arg-II on cardiomyocytes, fibroblasts, and endothelial cells through crosstalk with macrophages via inflammatory factors, such as IL-1, as well as a cell-autonomous effect of Arg-II through mtROS in fibroblasts contributing to the cardiac aging phenotype. These findings highlight the significance of non-cardiomyocytes in the heart and bring new insights into the understanding of pathologies of cardiac aging. They also provide new evidence for the development of therapeutic strategies, such as targeting Arg-II activation in macrophages.

      Strengths:

      This study targets an important clinical challenge, and the results are interesting and innovative. The experimental design is rigorous, the results are solid, and the representation is clear. The conclusion is logical and justified.

      Weaknesses:

      The discussion could be extended a little to place the findings in the broader context of existing knowledge related to this study.

    1. eLife assessment

      This useful paper looks for correlations between immunophenotypic markers and several measures of HIV reservoir volume in cross-sectional cohorts of people living with HIV on ART using several bioinformatic and machine-learning tools. The level of evidence linking these variables is incomplete given possible confounding variables, lack of directionality & effect size, and mechanistic basis.

    2. Reviewer #1 (Public Review):

      Summary:

      Semenova et al. have studied a large cross-sectional cohort of people living with HIV on suppressive ART, N=115, and performed high dimensional flow cytometry to then search for associations between immunological and clinical parameters and intact/total HIV DNA levels.

      A number of interesting data science/ML approaches were explored on the data and the project seems a serious undertaking. However, like many other studies that have looked for these kinds of associations, there was not a very strong signal. Of course, the goal of unsupervised learning is to find new hypotheses that aren't obvious to human eyes, but I felt in that context, there were (1) results slightly oversold, (2) some questions about methodology in terms mostly of reservoir levels, and (3) results were not sufficiently translated back into meaning in terms of clinical outcomes.

      Strengths:

      The study is evidently a large and impressive undertaking and combines many cutting-edge statistical techniques with a comprehensive experimental cohort of people living with HIV, notably inclusive of populations underrepresented in HIV science. A number of intriguing hypotheses are put forward that could be explored further. Sharing the data could create a useful repository for more specific analyses.

      Weaknesses:

      Despite the detailed experiments and methods, there was not a very strong signal for the variable(s) predicting HIV reservoir size. The Spearman coefficients are ~0.3, (somewhat weak, and acknowledged as such) and predictive models reach 70-80% prediction levels, though sometimes categorical variables are challenging to interpret.

      There are some questions about methodology, as well as some conclusions that are not completely supported by results, or at minimum not sufficiently contextualized in terms of clinical significance.

      On associations: the false discovery rate correction was set at 5%, but the data appear underdetermined, with fewer observations than variables (144 variables > 115 participants), and it isn't always clear if/when variables are related (e.g., inverses of one another, for instance, %CD4 and %CD8).
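
      For readers less familiar with false discovery rate control, the following is a minimal sketch of the Benjamini-Hochberg step-up procedure at a 5% level. It is an assumption that this is the procedure meant by "false discovery rate correction"; the manuscript's actual method and code are not available here, the p-values are invented for illustration, and `benjamini_hochberg` is a name introduced only for this example.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the test is rejected at FDR alpha.

    Step-up rule: find the largest rank k such that p_(k) <= (k / m) * alpha,
    then reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected


# Invented p-values, standing in for marker-vs-reservoir association tests.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(example_p))
```

      Note that the procedure treats each test as a separate hypothesis, so strongly related variables (such as %CD4 and %CD8) enter the correction as if they carried independent information, which is part of the concern raised above.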

      The modeling of reservoir size was unusual: typically, intact and defective HIV DNA are analyzed on a log10 scale (both for decays and for predicting rebound). Also, levels are sometimes normalized in this analysis (presumably to max/min?, e.g. S5), and given the large within-host variation in levels seen in other works, it is not trivial to predict any downstream impact of normalization across the population vs within-person.
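
      The difference between the two scalings can be illustrated with a short sketch. The values below are hypothetical reservoir measurements spanning three orders of magnitude, and min-max scaling is only my assumption about what "normalized" means in the manuscript.

```python
import math


def log10_transform(values):
    """Put each measurement on a log10 scale."""
    return [math.log10(v) for v in values]


def min_max_normalize(values):
    """Rescale measurements linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical intact HIV DNA copies per million CD4 T cells.
intact = [10.0, 100.0, 1000.0, 10000.0]
print(log10_transform(intact))    # evenly spaced on the log scale
print(min_max_normalize(intact))  # lower three values compressed near 0
```

      On the log10 scale the four values are evenly spaced, whereas linear min-max scaling places the three lowest values within the bottom 10% of the range, so a single high-reservoir participant dominates the scale.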

      Also, the qualitative characterization of low/high reservoir is not standard and naturally will split by early/later ART if done as above/below median. Given the continuous nature of these data, it seems throughout that predicting above/below median is a little hard to translate into clinical meaning.
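
      A small sketch of why an above/below-median label is hard to interpret: the data and the helper `median_split` are hypothetical, chosen only to show how much quantitative information the binary label discards.

```python
import statistics


def median_split(values):
    """Label each value 'high' if it is above the sample median, else 'low'."""
    med = statistics.median(values)
    return ["high" if v > med else "low" for v in values]


# Hypothetical log10 total HIV DNA levels for six participants.
log10_total_dna = [1.2, 2.4, 2.5, 2.7, 2.8, 4.1]
labels = median_split(log10_total_dna)
print(list(zip(log10_total_dna, labels)))
```

      Here the participants at 2.5 and 2.7 differ by only 0.2 log10 yet receive different labels, while those at 2.7 and 4.1 differ by 1.4 log10 and receive the same label.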

      Lastly, the work is comprehensive and appears solid, but the code was not shared to see how calculations were performed.

    3. Reviewer #2 (Public Review):

      Summary:

      Semenova et al. performed a cross-sectional analysis of host immunophenotypes (using flow cytometry) and the peripheral CD4+ T cell HIV reservoir size (using the Intact Proviral DNA Assay, IPDA) from 115 people with HIV (PWH) on ART. The study mostly highlights the machine learning methods applied to these host and viral reservoir datasets but fails to translate these complex analyses into (clinically, biologically) interpretable findings. For these reasons, the direct translational take-home message from this work is lost amidst a large list of findings (shown as clusters of associated markers), and sentences such as "this study highlights the utility of machine learning approaches to identify otherwise imperceptible global patterns" lead to overinterpretation of their data.

      Strengths:

      Measurement of host immunophenotyping measures (multiparameter flow cytometry) and peripheral HIV reservoir size (IPDA) from 115 PWH on ART.

      Major Weaknesses:

      (1) Overall, there is little to no interpretability of their machine learning analyses; findings appear as a "laundry list" of parameters with no interpretation of the estimated effect size and directionality of the observed associations. For example, Figure 2 could be interpreted in the form: for each X increase in an immunophenotyping parameter, we saw a Y increase/decrease in the HIV reservoir measure.

      (2) The correlations all appear to be relatively weak, with most Spearman R in the 0.30 range or so.

      (3) The Discussion needs further work to help guide the reader. The sentence: "The correlative results from this present study corroborate many of these studies, and provide additional insights" is broad. The authors should spend some time here to clearly describe the prior literature (e.g., describe the strength and direction of the association observed in prior work linking PD-1 and HIV reservoir size, as well as specify which type of HIV reservoir measures were analyzed in these earlier studies, etc.) and how the current findings add to or are in contrast to those prior findings.

      (4) The most interesting finding is buried on page 12 in the Discussion: "Uniquely, however, CD127 expression on CD4 T cells was significantly inversely associated with intact reservoir frequency." The authors should highlight this in the abstract and title, and move it up in the Discussion. The paper describes a very high dimensional analysis and the key takeaways are not clear; the more the authors can point the reader to the take-home points, the better their findings can translate to future follow-up mechanistic and/or validation studies.

      (5) The authors should avoid overinterpretation of these results. For example in the Discussion on page 13 "The existence of two distinct clusters of PWH with different immune features and reservoir characteristics could have important implications for HIV cure strategies - these two groups may respond differently to a given approach, and cluster membership may need to be considered to optimize a given strategy." It is highly unlikely that future studies will measure the breadth of parameters reported here and then use these directly for optimizing therapy.

      (6) There are only TWO limitations listed here: cross-sectional study design and the use of peripheral blood samples. (The subsequent paragraph notes an additional weakness which is misclassification of intact sequences by IPDA). This is a very limited discussion and highlights the need to more critically evaluate their study for potential weaknesses.

      (7) A major clinical predictor of HIV reservoir size and decay is the timing of ART initiation. The authors should include these (as well as other clinical covariate data - see #12 below) in their analyses and/or describe as limitations of their study.

    4. Reviewer #3 (Public Review):

      Summary:

      This valuable study by Semenova and colleagues describes a large cross-sectional cohort of 115 individuals on ART. Participants contributed a single blood sample which underwent IPDA, and 25-color flow with various markers (pre and post-stimulation). The authors then used clustering, decision tree analyses, and machine learning to look for correlations between these immunophenotypic markers and several measures of HIV reservoir volume. They identified two distinct clusters that can be somewhat differentiated based on total HIV DNA level, intact HIV DNA level, and multiple T cell cellular markers of activation and exhaustion.

      The conclusions of the paper are supported by the data but the relationships between independent and dependent variables in the models are correlative with no mechanistic work to determine causality. It is unclear in most cases whether confounding variables could explain these correlations. If there is causality, then the data are not sufficient to infer directionality (i.e., does the immune environment impact the HIV reservoir or vice versa or both?). In addition, even with sophisticated and appropriate machine learning approaches, the models are not terribly predictive or highly correlated. For these reasons, the study is very much hypothesis-generating and will not impact cure strategies or HIV reservoir measurement strategies in the short term.

      Strengths:

      The study cohort is large and diverse in terms of key input variables such as age, gender, and duration of ART. Selection of immune assays is appropriate. The authors used a wide array of bioinformatic approaches to examine correlations in the data. The paper was generally well-written and appropriately referenced.

      Weaknesses:

      (1) The major limitation of this work is that it is highly exploratory and not hypothesis-driven. While some interesting correlations are identified, these are clearly hypothesis-generating based on the observational study design.

      (2) The study's cross-sectional nature limits the ability to make mechanistic inferences about reservoir persistence. For instance, it would be very interesting to know whether the reservoir cluster is a feature of an individual throughout ART, or whether this outcome is dynamic over time.

      (3) A fundamental issue is that I am concerned that binarizing the 3 reservoir metrics in a 50/50 fashion is for statistical convenience. First, by converting a continuous outcome into a simple binary outcome, the authors lose significant amounts of quantitative information. Second, the low and high reservoir outcomes are not actually demonstrated to be clinically meaningful: I presume that both contain many (?all) data points above levels where rebound would be expected soon after interruption of ART. Reservoir levels would also have no apparent outcome on the selection of cure approaches. Overall, dividing at the median seems biologically arbitrary to me.

      (4) The two reservoir clusters are of potential interest as high total and intact with low % intact are discriminated somewhat by immune activation and exhaustion. This was the most interesting finding to me, but it is difficult to know whether this clustering is due to age, time on ART, other co-morbidity, ART adherence, or other possible unmeasured confounding variables.

      (5) At the individual level, there is substantial overlap between clusters according to total, intact, and % intact between the clusters. Therefore, the claim in the discussion that these 2 cluster phenotypes may require different therapeutic approaches seems rather speculative. That said, the discussion is very thoughtful about how these 2 clusters may develop with consideration of the initial insult of untreated infection and / or differences in immune recovery.

      (6) The authors state that the machine learning algorithms allow for reasonable prediction of reservoir volume. It is subjective, but to me, 70% accuracy is very low. This is not a disappointing finding per se. The authors did their best with the available data. It is informative that the machine learning algorithms cannot reliably discriminate reservoir volume despite substantial amounts of input data. This implies that either key explanatory variables were not included in the models (such as viral genotype, host immune phenotype, and comorbidities) or that the outcome for testing the models is not meaningful (which may be possible with an arbitrary 50/50 split in the data relative to median HIV DNA volumes: see above).

      (7) The decision tree is innovative and a useful addition, but does not provide enough discriminatory information to imply causality, mechanism, or directionality in terms of whether the immune phenotype is impacting the reservoir or vice versa or both. Tree accuracy of 80% is marginal for a decision tool.

      (8) Figure 2: this is not a weakness of the analysis but I have a question about interpretation. If total HIV DNA is more predictive of immune phenotype than intact HIV DNA, does this potentially implicate a prior high burden of viral replication (high viral load &/or more prolonged time off ART) rather than ongoing reservoir stimulation as a contributor to immune phenotype? A similar thought could be applied to the fact that clustering could only be detected when applied to total HIV DNA-associated features. Many investigators do not consider defective HIV DNA to be "part of the reservoir" so it is interesting to speculate why these defective viruses appear to have more correlation with immunophenotype than intact viruses.

      (9) Overall, the authors need to do an even more careful job of emphasizing that these are all just correlations. For instance, HIV DNA cannot be proven to have a causal effect on the immunophenotype of the host with this study design. Similarly, immunophenotype may be affecting HIV DNA or the correlations between the two variables could be entirely due to a separate confounding variable.

      (10) In general, in the intro, when the authors refer to the immune system, they do not consistently differentiate whether they are referring to the anti-HIV immune response, the reservoir itself, or both. More specifically, the sentence in the introduction listing various causes of immune activation should have citations. (To my knowledge, there is no study to date that definitively links proviral expression from reservoir cells in vivo to immune activation as it is next to impossible to remove the confounding possible imprint of previous HIV replication.) Similarly, it is worth mentioning that the depletion of intact proviruses is quite slow such that proviral expression can only be stimulating the immune system at a low level. Similarly, the statement "Viral protein expression during therapy likely maintains antigen-specific cells of the adaptive immune system" seems hard to dissociate from the persistence of immune cells that were reactive to viremia.

      (11) Given the many limitations of the study design and the inability of the models to discriminate reservoir volume and phenotype, the limitations section of the discussion seems rather brief.

    1. Reviewer #1 (Public Review):

      Summary:

      The current study aims to quantify associations between the regular use of proton-pump inhibitors (PPI) - defined as using PPI most days of the week during the last 4 weeks at one cross-section in time - and several respiratory outcomes up to several years later. There are 6 respiratory outcomes included: risk of influenza, pneumonia, COVID-19, and other respiratory tract infections, as well as COVID-19 severity and mortality.

      Strengths:

      Several sensitivity analyses were performed, including i) estimation of the e-value to assess how strong unmeasured confounders should be to explain observed effects, ii) comparison with another drug with a similar indication to potentially reduce (but not eliminate) confounding by indication.

      Weaknesses:

      (1) The main exposure of interest seems to be measured at only one time point (at study enrollment), while patients are considered at risk for many years afterwards without their exposure status being known at the time of experiencing the outcome. As indicated by the authors, PPI are sometimes used for only short amounts of time. It seems biologically implausible that an infection was caused by using PPI for a few weeks many years ago.

      (2) Previous studies have shown that by focusing on prevalent users of drugs, one often induces several biases such as collider stratification bias, selection bias through depletion of susceptibles, etc.

      (3) It seems Kaplan Meier curves are not adjusted for confounding through e.g. inverse probability weighting. As such the KM curves are currently not informative (or the authors need to make clearer that curves are actually adjusted for measured confounding).

      (4) Throughout the manuscript the authors seem to misuse the term multivariate (using one model with e.g. correlated error terms to assess multiple outcomes at once) when they seem to mean multivariable.

      (5) Given that multiple outcomes are assessed, there is a clear argument for accounting for multiple testing, which, following the authors' own logic of claiming there is no association when results are not significant, may change their conclusions. At a higher level, the authors should avoid the pitfall of treating absence of evidence as evidence of absence (no statistically significant association doesn't mean no relationship exists).

      (6) While the authors claim that the quantitative bias analysis does show results are robust to unmeasured confounding, I would disagree with this. The e-values are around 2 and it is clearly not implausible that there are one or more unmeasured risk factors that together or alone would have such an effect size. Furthermore, if one would use the same (significance) criteria as used by the authors for determining whether an association exists, the required effect size for an unmeasured confounder to render effects 'statistically non-significant' would be even smaller.

      (7) Some patients are excluded due to the absence of follow-up, but it is unclear how that is determined. Is there potentially some selection bias underlying this where those who are less healthy stop participating in the UK biobank?

      (8) Given that the exposure is based on self-report how certain can we be that patients e.g. do know that their branded over-the-counter drugs are PPI (e.g. guardium tablets)? Some discussion around this potential issue is lacking.

      (9) Details about the deprivation index are needed in the main text as this is a UK-specific variable that will be unfamiliar to most readers.

      (10) It is unclear how variables were coded/incorporated from the main text. More details are required, e.g. was age included as a continuous variable and if so was non-linearity considered and how?

      (11) The authors state that Schoenfeld residuals were tested, but don't report the test statistics. Could they please provide these, e.g. it would already be informative if they report that all p-values are above a certain value.

      (12) The authors would ideally extend their discussion around unmeasured confounding, e.g. using the DAGs provided in https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7832226/, in particular (but not limited to) around severity and not just presence/absence of comorbidities.

      (13) The UK biobank is known to be highly selected for a range of genetic, behavioural, cardiovascular, demographic, and anthropometric traits. The potential problems this might create in terms of collider stratification bias - as highlighted here for example: https://www.nature.com/articles/s41467-020-19478-2 - should be discussed in greater detail and also appreciated more when providing conclusions.
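
      To illustrate point (6) above, the following sketch computes an E-value with the standard formula for a risk ratio, E = RR + sqrt(RR x (RR - 1)), applied to 1/RR when RR < 1. The function name and the example estimate of 1.33 are hypothetical and are not taken from the manuscript; for hazard ratios with common outcomes an additional approximation would be needed.

```python
import math


def e_value(rr):
    """E-value for an observed risk ratio (protective estimates are inverted first)."""
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    if rr < 1:
        rr = 1 / rr
    return rr + math.sqrt(rr * (rr - 1))


# Hypothetical observed risk ratio.
print(round(e_value(1.33), 2))
```

      An E-value of about 2 therefore corresponds to an observed estimate of only about 1.33, meaning that an unmeasured confounder associated with both exposure and outcome by a risk ratio of roughly 2 could fully explain the association.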

    2. Reviewer #2 (Public Review):

      Summary:

      Zeng et al investigate in an observational population-based cohort study whether the use of proton pump inhibitors (PPIs) is associated with an increased risk of several respiratory infections among which are influenza, pneumonia, and COVID-19. They conclude that compared to non-users, people regularly taking PPIs have increased susceptibility to influenza, pneumonia, as well as COVID-19 severity and mortality. By performing several different statistical analyses, they try to reduce bias as much as possible, to end up with robust estimates of the association.

      Strengths:

      The study comprehensively adjusts for a variety of critical covariates and by using different statistical analyses, including propensity-score-matched analyses and quantitative bias analysis, the estimates of the associations can be considered robust.

      Weaknesses:

      As it is an observational cohort study there still might be bias. Information on the dose or duration of acid suppressant use was not available, but might be of influence on the results. The outcome of interest was obtained from primary care data, suggesting that only infections as diagnosed by a physician are taken into account. Due to the self-limiting nature of the outcome, differences in health-seeking behavior might affect the results.

    1. eLife assessment

      In this study, the authors offer a theoretical explanation for the emergence of nematic bundles in the actin cortex, carrying implications for the assembly of actomyosin stress fibers. As such, the study is a valuable contribution to the field of actomyosin organization in the actin cortex. While the theoretical work is solid, experimental evidence in support of the model assumptions remains incomplete. The presentation could be improved to enhance accessibility for readers without a strong background in hydrodynamic and nematic theories.

    2. Reviewer #1 (Public Review):

      Summary: In this article, Mirza et al developed a continuum active gel model of the actomyosin cytoskeleton that accounts for nematic order and density variations in actomyosin. Using this model, they identify the requirements for the formation of dense nematic structures. In particular, they show that self-organization into nematic bundles requires both flow-induced alignment and active tension anisotropy in the system. By varying model parameters that control active tension and nematic alignment, the authors show that their model reproduces a rich variety of actomyosin structures, including tactoids, fibres, asters as well as crystalline networks. Additionally, discrete simulations are employed to calculate the activity parameters in the continuum model, providing a microscopic perspective on the conditions driving the formation of fibrillar patterns.

      Strengths: The strength of the work lies in its delineation of the parameter ranges that generate distinct types of nematic organization within actomyosin networks. The authors pinpoint the physical mechanisms behind the formation of fibrillar patterns, which may offer valuable insights into stress fiber assembly. Another strength of the work is connecting activity parameters in the continuum theory with microscopic simulations.

      Weaknesses: This paper is a very difficult read for nonspecialists, especially if you are not well-versed in continuum hydrodynamic theories. Efforts should be made to connect various elements of theory with biological mechanisms, which is mostly lacking in this paper. The comparison with experiments is predominantly qualitative. It is unclear if the theory is suited for in vitro or in vivo actomyosin systems. The justification for various model assumptions, especially concerning their applicability to actomyosin networks, requires a more thorough examination. The classification of different structures demands further justification. For example, the rationale behind categorizing structures as sarcomeric remains unclear when nematic order is perpendicular to the axis of the bands. Sarcomeres traditionally exhibit a specific ordering of actin filaments with alternating polarity patterns. Similarly, the criteria for distinguishing between contractile and extensile structures need clarification, as one would expect extensile structures to be under tension contrary to the authors' claim. Additionally, it is unclear if the model's predictions for fiber dynamics align with observations in cells, as stress fibers exhibit a high degree of dynamism and tend to coalesce with neighboring fibers during their assembly phase. Finally, it seems that the microscopic model is unable to recapitulate the density patterns predicted by the continuum theory, raising questions about the suitability of the simulation model.

    3. Reviewer #2 (Public Review):

      Summary:

      The article by Mirza et al discusses the self-organization of the actin cytoskeleton using the theory of active nematics. Linear stability analysis of the governing equations and computer simulations show that the system is unstable to density fluctuations and that self-organized structures can emerge. While the context is interesting, I am not sure whether the physics is new. Hence I have reservations about recommending this article.

      Strengths:

      (i) Analytical calculations complemented with simulations (ii) Theory for cytoskeletal network

      Weaknesses:

      Not placed in the context of the literature on active nematics.

    4. Reviewer #3 (Public Review):

      The manuscript "Theory of active self-organization of dense nematic structures in the actin cytoskeleton" analyzes self-organized pattern formation within a two-dimensional nematic liquid crystal theory and uses microscopic simulations to test the plausibility of some of the conclusions drawn from that analysis. After performing an analytic linear stability analysis that indicates the possibility of patterning instabilities, the authors perform fully non-linear numerical simulations and identify the emergence of stripe-like patterning when anisotropic active stresses are present. Following a range of qualitative numerical observations on how parameter changes affect these patterns, the authors identify, besides isotropic and nematic stress, also active self-alignment as an important ingredient to form the observed patterns. Finally, microscopic simulations are used to test the plausibility of some of the conclusions drawn from continuum simulations.

      The paper is well written, figures are mostly clear and the theoretical analysis presented in both main text and supplement is rigorous. Mechano-chemical coupling has emerged in recent years as a crucial element of cell cortex and tissue organization and it is plausible to think that both isotropic and anisotropic active stresses are present within such effectively compressible structures. Even though not yet stated this way by the authors, I would argue that combining these two is one of the key ingredients that distinguish this theoretical paper from similar ones. The diversity of patterning processes experimentally observed is nicely elaborated on in the introduction of the paper, though other closely related previous work could also have been included in these references (see below for examples).

      To introduce the continuum model, the authors exclusively cite their own, unpublished pre-print, even though the final equations take the same form as previously derived and used by other groups working in the field of active hydrodynamics (a certainly incomplete list: Marenduzzo et al (PRL, 2007), Salbreux et al (PRL, 2009, cited elsewhere in the paper), Jülicher et al (Rep Prog Phys, 2018), Giomi (PRX, 2015),...). To make better contact with the broad active liquid crystal community and to delineate the present work more compellingly from existing results, it would be helpful to include a more comprehensive discussion of the background of the existing theoretical understanding on active nematics. In fact, I found it often agrees nicely with the observations made in the present work, an opportunity to consolidate the results that is sometimes currently missed out on. For example, it is known that self-organised active isotropic fluids form in 2D hexagonal and pulsatory patterns (Kumar et al, PRL, 2014), as well as contractile patches (Mietke et al, PRL 2019), just as shown and discussed in Fig. 2. It is also known that extensile nematics, \kappa<0 here, draw in material laterally of the nematic axis and expel it along the nematic axis (the other way around for \kappa>0, see e.g. Doostmohammadi et al, Nat Comm, 2018 "Active Nematics" for a review that makes this point), consistent with all relative nematic director/flow orientations shown in Figs. 2 and 3 of the present work.

      The results of numerical simulations are well-presented. Large parts of the discussion of numerical observations - specifically around Fig. 3 - are qualitative and it is not clear why the analysis is restricted to \kappa<0. Some of the observations resonate with recent discussions in the field, for example the observation of effectively extensile dynamics in a contractile system is interesting and reminiscent of ambiguities about extensile/contractile properties discussed in recent preprints (https://arxiv.org/abs/2309.04224). It is convincingly concluded that, besides nematic stress on top of isotropic one, active self-alignment is a key ingredient to produce the observed patterns.

      I compliment the authors for trying to gain further mechanistic insights into this conclusion with microscopic filament simulations that are diligently performed. It is rightfully stated that these simulations only provide plausibility tests and, within this scope, I would say the authors are successful. At the same time, it leaves open questions that could have been discussed more carefully. For example, I wonder what can be said microscopically about the regime \kappa>0 (which is dropped ad hoc from Fig. 3 onward), in which the continuum theory also predicts the formation of stripe patterns, beyond the short comment at the very end? How does the spatially inhomogeneous organization that the continuum theory predicts fit into the presented microscopic picture, and vice versa?

      Overall, the paper represents a valuable contribution to the field of active matter and, if strengthened further, might provide a fruitful basis to develop new hypotheses about the dynamic self-organisation of dense filamentous bundles in biological systems.

    1. eLife assessment

      This valuable study describes mRNA shortening during cellular stress and interestingly observes that this shortening is dependent on localization in stress granules. Surprisingly, this mRNA shortening does not appear to require the shortening of polyA tails. These are in principle novel findings, but the evidence for them is currently incomplete. Additional experiments would help bolster confidence in how the authors interpret their data.

    2. Reviewer #1 (Public Review):

      Summary:

      In this manuscript, the authors employed direct RNA sequencing with nanopores, enhanced by 5' end adaptor ligation, to comprehensively interrogate the human transcriptome at single-molecule and nucleotide resolution. They conclude that cellular stress induces prevalent 5' end RNA decay that is coupled to translation and ribosome occupancy. Contrary to the literature, they found that, unlike typical RNA decay models in normal conditions, stress-induced RNA decay is dependent on XRN1 but does not depend on the removal of the poly(A) tail. The findings presented are interesting but a substantial amount of work is needed to fully establish these paradigm-shifting findings.

      Strengths:

      These are paradigm-shifting observations using cutting-edge technologies.

      Weaknesses:

      The conclusions do not appear to be fully supported by the data presented.

    3. Reviewer #2 (Public Review):

      In the manuscript "Full-length direct RNA sequencing uncovers stress-granule dependent RNA decay upon cellular stress", Dar, Malla, and colleagues use direct RNA sequencing on nanopores to characterize the transcriptome after arsenite and oxidative stress. They observe a population of transcripts that are shortened during stress. The authors hypothesize that this shortening is mediated by the 5'-3' exonuclease XRN1, as XRN1 knockdown results in longer transcripts. Interestingly, the authors do not observe a polyA-tail shortening, which is typically thought to precede decapping and XRN1-mediated transcript decay. Finally, the authors use G3BP1 knockout cells to demonstrate that stress granule formation is required for the observed transcript shortening.

      The manuscript contains intriguing findings of interest to the mRNA decay community. That said, it appears that the authors at times overinterpret the data they get from a handful of direct RNA sequencing experiments. To bolster some of the statements additional experiments might be desirable.

      A selection of comments:

      (1) Considering that the authors compare the effects of stress, stress granule formation, and XRN1 loss on transcriptome profiles, it would be desirable to use a single cell system (validated in a few more). Most of the direct RNAseq is performed in HeLa cells, but the experiments showing that stress granule formation is required come from U2OS cells, while short RNAseq data showing loss of coverage on mRNA 5'ends is reanalyzed from HEK293 cells. It may be plausible that the same pathways operate in all those cells, but it is not rigorously demonstrated.

      (2) An interesting finding of the manuscript is that polyA tail shortening is not observed prior to transcript shortening. The authors would need to demonstrate that their approach is capable of detecting shortened polyA tails. Using polyA purified RNA to look at the status of polyA tail length may not be ideal (as avidity to oligodT beads may increase with polyA tail length and therefore the authors bias themselves to longer tails anyway). At the very least, the use of positive controls would be desirable; e.g. knockdown of CCR4/NOT.

      (3) The authors use a strategy of ligating an adapter to 5' phosphorylated RNA (presumably the breakdown fragments) to be able to distinguish true mRNA fragments from artifacts of abortive nanopore sequencing. This is a fantastic approach to curating a clean dataset. Unfortunately, the authors don't appear to go through with discarding fragments that are not adapter-ligated (presumably to increase the depth of analysis; they do offer Figure 1e that shows similar changes in transcript length for fragments with adapter, compared to Figure 1d). It would be good to know how many reads in total had the adapter. Furthermore, it would be good to know what percentage of reads without adapters are products of abortive sequencing. What percentage of reads had 5'OH ends (could be answered by ligating a different adapter to kinase-treated transcripts). More read curation would also be desirable when building the metagene analysis - why do the authors include every 3'end of sequenced reads (their RNA purification scheme requires a polyA tail, so non-polyadenylated fragments are recovered in a non-quantitative manner and should be discarded).

      (4) The authors should come to a clear conclusion about what "transcript shortening" means. Is it exonucleolytic shortening from the 5'end? They cannot say much about the 3'ends anyway (see above). Or are we talking about endonucleolytic cuts leaving 5'P that then can be attacked by XRN1 (again, what is the ratio of 5'P and 5'OH fragments; also, what is the ratio of shortened to full-length RNA)?

      (5) The authors should clearly explain how they think the transcript shortening comes about. They claim it does not need polyA shortening, but then do not explain where the XRN1 substrate comes from. Does their effect require decapping? Or endonucleolytic attacks?

      (6) XRN1 KD results in lengthened transcripts. That is not surprising as XRN1 is an exonuclease - and XRN1 does not merely rescue arsenite stress-mediated transcript shortening, but results in a dramatic transcript lengthening.

    4. Reviewer #3 (Public Review):

      The work by Dar et al. examines RNA metabolism under cellular stress, focusing on stress-granule-dependent RNA decay. It employs direct RNA sequencing with a Nanopore-based method, revealing that cellular stress induces prevalent 5' end RNA decay that is coupled to translation and ribosome occupancy but is independent of the shortening of the poly(A) tail. This decay, however, is dependent on XRN1 and enriched in the stress granule transcriptome. Notably, inhibiting stress granule formation in G3BP1/2-null cells restores RNA length to wild-type levels and suppresses stress-induced decay, identifying RNA decay as a critical determinant of RNA metabolism during cellular stress and highlighting its dependence on stress-granule formation.

      This is an exciting and novel discovery. I am not an expert in sequencing technologies or sequencing data analysis, so I will limit my comments purely to biology and not technical points. The PI is a leader in applying innovative sequencing methods to studying mRNA decay.

      One aspect that appears to have been overlooked is that poly(A) tail shortening per se does not lead to decapping. It is shortening below a certain threshold of 8-10 As that triggers decapping. Therefore, I found the conclusion that poly(A) tail shortening is not required for stress-induced decay to be somewhat premature. For a robust test of this hypothesis, the authors should consider performing their analysis in conditions where CNOT7/8 is knocked down with siRNA.

      Similarly, as XRN1 requires decapping to take place, it necessitates the experiment where a dominant-negative DCP2 mutant is over-expressed.

      Are G3BP1/2 stress granules required for stress-induced decay or simply sites for storage? This part seems unclear. A very worthwhile test here would be to assess this in an XRN1-null background.

      Finally, the authors speculate that the mechanism of stress-induced decay may have evolved to relieve translational load during stress. But why degrade the 5' end when removing the cap may be sufficient? This returns to the question of assessing the role of decapping in this mechanism.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this paper, Song, Shi, and Lin use an existing deep learning-based sequence model to derive a score for each haplotype within a genomic region, and then perform association tests between these scores and phenotypes of interest. The authors then perform some downstream analyses (fine-mapping, various enrichment analyses, and building polygenic scores) to ensure that these associations are meaningful. The authors find that their approach allows them to find additional associations, the associations have biologically interpretable enrichments in terms of tissues and pathways, and can slightly improve polygenic scores when combined with standard SNP-based PRS.

      Strengths:

      • I found the central idea of the paper to be conceptually straightforward and an appealing way to use the power of sequence models in an association testing framework.

      • The findings are largely biologically interpretable, and it seems like this could be a promising approach to boost power for some downstream applications.

      Weaknesses:

      • The methods used to generate polygenic scores were difficult to follow. In particular, a fully connected neural network with linear activations predicting a single output should be equivalent to linear regression (all intermediate layers of the network can be collapsed using matrix-multiplication, so the output is just the inner product of the input with some vector). Using the last hidden layer of such a network for downstream tasks should also be equivalent to projecting the input down to a lower dimensional space with some essentially randomly chosen projection. As such, I am surprised that the neural network approach performs so well, and it would be nice if the authors could compare it to other linear approaches (e.g., LASSO or ridge regression for prediction; PCA or an auto-encoder for converting the input to a lower dimensional representation).
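      The collapsing argument in this comment can be checked numerically. The following is a minimal sketch, not the authors' code: the layer sizes and random weights are arbitrary assumptions chosen for illustration, and bias terms are omitted (they would likewise collapse into a single intercept).

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def random_matrix(rows, cols, rng):
    """Matrix with entries drawn uniformly from [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)

# Hypothetical network: 6 inputs -> 4 hidden -> 3 hidden -> 1 output,
# with linear (identity) activations throughout.
W1 = random_matrix(6, 4, rng)
W2 = random_matrix(4, 3, rng)
W3 = random_matrix(3, 1, rng)

X = random_matrix(5, 6, rng)  # 5 toy samples

# Forward pass layer by layer.
network_output = matmul(matmul(matmul(X, W1), W2), W3)

# Collapse all layers into one weight vector, then apply it once.
collapsed_weights = matmul(matmul(W1, W2), W3)
linear_output = matmul(X, collapsed_weights)

max_abs_diff = max(
    abs(a[0] - b[0]) for a, b in zip(network_output, linear_output)
)
print(max_abs_diff)  # differs from zero only by floating-point rounding
```

      Because the two outputs agree up to rounding, any such network can only represent functions that a single linear model can also represent.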

      Response: We thank the reviewer for the recognition and the valuable suggestions. As the reviewer noted, our polygenic prediction procedure is equivalent to a linear transformation, and in this revision we found that it was unnecessary to use a neural network framework in place of a linear model. Indeed, both our results and previous work indicate that linear models fit polygenic traits better than non-linear ones, which was also why we chose linear activations for the neural network in the original manuscript.

      In this revision, we followed the reviewer’s suggestion and applied a more straightforward linear framework for polygenic prediction. We first calculated the weighted sum of HFS for each block (1,361 independent blocks in total); then, in each target ancestry, we used LASSO regression to integrate these block scores with the SNP PRS into one final score. We also conducted a comparative analysis in the British European test set and found that LASSO, ridge and elastic net gave similar results, with LASSO performing slightly better. By applying this straightforward framework together with the sliding window strategy, we moderately improved the prediction performance.

      Line 349: “Using height as a representative trait, we first estimated the proportion of variance captured by top loci, and found that HFS of loci with PIP>0.4 (n=5,101) captured roughly 80% of variance explained by all genome-wide loci (n=1,200,024, corresponding to the sliding-window strategy; Figure 5A). We then calculated HFS+LDAK in non-British European (NBE), South Asian (SAS), East Asian (EAS) and African (AFR) populations in UK Biobank, and observed 17.5%, 16.1%, 17.2% and 39.8% improvement over LDAK alone (p=3.21×10^-16, 0.0001, 0.002 and 0.001, respectively; Figure 5C).”

      Author response image 1.

      • A very interesting point of the paper was the low R^2 between the HFS scores in adjacent windows, but the explanation of this was unclear to me. Since the HFS scores are just deterministic functions of the SNPs, it feels like if the SNPs are in LD then the HFS scores should be and vice versa. It would be nice to compare the LD between adjacent windows to the average LD of pairs of SNPs from the two windows to see if this is driven by the fact that SNPs are being separated into windows, or if sei is somehow upweighting the importance of SNPs that are less linked to other SNPs (e.g., rare variants).

      Response: We thank the reviewer for the suggestion on understanding the LD mechanism. In this revision, we used chromosome 1 as an example and calculated the pairwise LD among all SNPs within two adjacent loci. As shown in Figure S1 (below), although HFS-based LD was still significantly lower than the median SNP-based LD (paired Wilcoxon test p=1.76e-5), the median SNP LD between loci was itself lower than what is typically observed between adjacent SNPs in GWAS (histogram on the x axis; median=0.06). We reasoned that dividing SNPs into blocks is one of the reasons that HFS shows less LD than standard GWAS, but not the whole story.

      Author response image 2.

      We agree with the reviewer that rare variants could also play an important role. In fact, the sei authors also found that rare variants tended to have larger sei-predicted effects. We conducted an approximate analysis that removed all rare variants and repeated the HFS calculation. Indeed, the HFS LD then rose markedly to a median of 0.14, indicating that including rare variants was vital for the low LD.

      Author response image 3.

      Line 123: “Further evaluation indicated that this low LD was driven by two factors: integration of rare variant impacts and segmentation. Firstly, excluding rare variants from HFS caused the LD to rise to median=0.14 (Method; Figure S2C). Secondly, median LD of SNPs from adjacent loci was 0.06, which was significantly higher than HFS LD (paired Wilcoxon p=1.76×10^-5) but significantly lower than HFS LD without rare variants (paired Wilcoxon p<2.2×10^-16).”

      • There were also a number of robustness checks that would have been good to include in the paper. For instance, do the findings change if the windows are shifted? Do the findings change if the sequence is reverse-complemented?

      Response: Following the reviewer’s suggestion, we conducted a sliding window analysis where all loci were shifted 2048 bp, thereby doubling the total number of loci. In fine-mapping analysis, more than 90% of the causal loci were reproduced in sliding window analysis, either by themselves or by an overlapping locus:

      Line 207: “29.4% of causal loci (PIP>0.95) in the original analysis were still causal in sliding window analysis. For 31.1% and 29.3% of causal loci, the 5’ and 3’ overlapping locus, respectively, had PIP>0.95 in sliding window analysis, while the original locus itself was no longer causal.”

      In polygenic prediction analysis, sliding window strategy significantly improved prediction accuracy, as we discussed in question 1.

      As for the issue of the reverse complement, the sei input layer is designed to encode both strands in a symmetric manner, such that the output for both strands is the same. We also ran sei on the reverse complement (generated by seqkit seq -r -p) to verify that the original sequence and its reverse complement give the same output.
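      The kind of strand-symmetry check described here can be sketched as follows. This is an illustration only and makes no claim about sei's internals: `toy_score` is a hypothetical, deliberately strand-sensitive scorer, and averaging over both strands is just one generic way to obtain a strand-symmetric output.

```python
# Complement table for DNA bases (upper and lower case, N kept as N).
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def strand_symmetric(score_fn):
    """Wrap a scorer so that a sequence and its reverse complement score equally."""
    def wrapped(seq):
        return 0.5 * (score_fn(seq) + score_fn(reverse_complement(seq)))
    return wrapped


def toy_score(seq):
    """Hypothetical strand-sensitive score: sum of the positions holding a G."""
    return sum(i for i, base in enumerate(seq) if base == "G")


symmetric_score = strand_symmetric(toy_score)

seq = "ACGTTGCAAG"
rc = reverse_complement(seq)

# The raw toy scorer is strand-sensitive; the wrapped scorer is not.
print(toy_score(seq), toy_score(rc))
print(symmetric_score(seq) == symmetric_score(rc))
```

      A check on a real model would replace `toy_score` with the model's prediction call and compare the two outputs within a numerical tolerance.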

      Response: Following the reviewer’s suggestion, we added a new discussion paragraph on the issue of sequence model performance on interindividual variation. In brief, we suggest that although the lack of cross-individual training sets is a drawback and future improvement is necessary, chromatin changes could be better predicted than gene expression. This is because the latter task requires information on long-range interactions, which vary among genes and are difficult to capture when the reference genome is used as the training set. We made a schematic to clarify this:

      Author response image 4.

      We also noticed a few recent studies that directly validated sei predictions by experiments and showed significant accuracy, such as https://doi.org/10.1016/j.neuron.2022.12.026. Taken together, while we agree that it is necessary to improve sequence models by adding more cross-individual training samples, the current SOTA model sei could still provide unique value to our study.

      Line 423: “The challenge of using sequence-based deep learning (DL) models in HFS applications is further compounded by their difficulty in predicting variations between individuals. Recent studies (Huang et al., 2023; Sasse et al., 2023) indicate that DL models, trained on the reference human genome, demonstrate limited accuracy in predicting gene expression levels across different individuals. This limitation is likely due to the models' inability to account for long-range regulatory patterns, which are crucial for understanding the impact of variants on gene expression and vary across genes. In contrast, our study leveraged sequence-determined functional genomic profiles in association studies, which mitigates this issue to an extent. For instance, although sei cannot identify the specific gene regulated by a given input sequence, it can predict changes in the sequence's functional activity. Future improvements in DL models' ability to predict interindividual differences could be achieved by incorporating cross-individual data in the training process. An example of such data is the EN-TEX (Rozowsky et al., 2023) dataset, which aligns functional genomic peaks with the specific individuals and haplotypes they correspond to.”

      Reviewer #2 (Public Review):

      Summary:

      In this work, Song et al. propose a locus-based framework for performing GWAS and related downstream analyses including finemapping and polygenic risk score (PRS) estimation. GWAS are not sufficiently powered to detect phenotype associations with low-frequency variants. To overcome this limitation, the manuscript proposes a method to aggregate variant impacts on chromatin and transcription across a 4096 base pair (bp) locus in the form of a haplotype function score (HFS). At each locus, an association is computed between the HFS and trait. Computing associations at the level of imputed functional genomic scores should enable the integration of information across variants spanning the allele frequency spectrum and bolster the power of GWAS.

      The HFS for each locus is derived from a sequence-based predictive model, Sei. Sei predicts 21,907 chromatin and TF binding tracks, which can be projected onto 40 pre-defined sequence classes (representing promoters, enhancers, etc.). For each 4096 bp haplotype in their UKB cohort, the proposed method uses the Sei sequence class scores to derive the haplotype function score (HFS). The authors apply their method to 14 polygenic traits, identifying ~16,500 HFS-trait associations. They finemap these trait-associated loci with SuSie, as well as perform target gene/pathway discovery and PRS estimation.

      Strengths:

      Sequence-based deep learning predictors of chromatin status and TF binding have become increasingly accurate over the past few years. Imputing aggregated variant impact using Sei, and then performing an HFS-trait association is, therefore, an interesting approach to bolster power in GWAS discovery. The manuscript demonstrates that associations can be identified at the level of an aggregated functional score. The finemapping and pathway identification analyses suggest that HFS-based associations identify relevant causal pathways and genes from an association study. Identifying associations at the level of functional genomics increases the portability of PRSs across populations. Imputing functional genomic predictions using a sequence-based deep learning model does not suffer from the limitation of TWAS where gene expression is imputed from a limited-size reference panel such as GTEx.

      However, there are several major limitations that need to be addressed.

      Major concerns/weaknesses:

      (1) There is limited characterization of the locus-level associations to SNP-level associations. How does the set of HFS-based associations differ from SNP-level associations?

      Response: We thank the reviewer for the recognition and the valuable suggestions on our manuscript. Following the reviewer’s suggestion, in this revision we added a paragraph comparing the basic characteristics of HFS-based and SNP-based association studies. These comparisons suggested that HFS had no advantage in testing marginal association, but performed better in detecting causal associations.

      Line 144: “When comparing HFS association with the standard SNP-based GWAS on the same data, we found that 98% of significant HFS loci also harbored a significant SNP. There were a few cases (n=0~5) where significant HFS loci did not harbor even a marginal SNP association (GWAS p>0.01), which was due to the lack of common SNPs in these loci. The HFS association p value was higher than the GWAS p value in 95% of significant loci, suggesting that HFS did not improve power to detect marginal effects. The genomic control inflation factor (λGC) for the HFS association test varied between 0.99 for asthma and 1.50 for height, closely resembling the SNP GWAS (Pearson Correlation Coefficient [PCC]=0.91, paired t-test p=0.16; Method and Figure S3). We concluded that HFS-based association tests had adequate power and did not introduce additional p-value inflation.”
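      For readers unfamiliar with λGC, the quantity quoted above is conventionally computed as the median of the observed 1-df chi-square statistics divided by the median expected under the null (about 0.455). The sketch below follows that standard definition; it is not taken from the manuscript, and the p-values are synthetic placeholders.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()

# Median of a chi-square distribution with 1 degree of freedom (~0.4549),
# obtained as the square of the 75th percentile of the standard normal.
EXPECTED_MEDIAN_CHI2 = _STD_NORMAL.inv_cdf(0.75) ** 2


def p_to_chi2(p):
    """Convert a two-sided p-value to a 1-df chi-square statistic."""
    z = _STD_NORMAL.inv_cdf(1.0 - p / 2.0)
    return z * z


def genomic_inflation(p_values):
    """Genomic control inflation factor: observed over expected median chi-square."""
    return median(p_to_chi2(p) for p in p_values) / EXPECTED_MEDIAN_CHI2


# Synthetic example: evenly spaced p-values mimic a perfectly calibrated null.
n = 9999
null_p = [(i + 0.5) / n for i in range(n)]
lambda_null = genomic_inflation(null_p)

# Squaring the p-values makes them systematically too small (inflated test).
lambda_inflated = genomic_inflation([p * p for p in null_p])

print(round(lambda_null, 3), lambda_inflated > 1.0)
```

      Values near 1 indicate well-calibrated test statistics; values well above 1 can reflect either confounding or, for highly polygenic traits such as height, genuine widespread signal.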

      (2) A clear advantage of performing HFS-trait associations is that the HFS score is imputed by considering variants across the allele frequency spectrum. However, no evidence is provided demonstrating that rare variants contribute to associations derived by the model. Similarly, do the authors find evidence that allelic heterogeneity is leveraged by the HFS-based association model? It would be useful to do simulations here to characterize the model behavior in the presence of trait-associated rare variants.

      Response: Following the reviewer’s suggestion, we conducted a sensitivity analysis that removed all rare (MAF<0.01) variants and repeated the HFS analysis (HFScommon) on chromosome 1. In linear association analysis, we found that 10.6% of HFS signals (p<5×10^-8) were missed by HFScommon. In fine-mapping, 55.3% of HFS causal signals (PIP>0.95) were missed by HFScommon. We concluded that rare variants played an important role in the performance of HFS, especially its advantages in fine-mapping.

      Line 175: “We also found that rare variants played an important role in the good fine-mapping performance of HFS: when variants with MAF<0.01 were removed, 55.3% of the causal signals would be missed in HFS+SUSIE analysis.”

      We then attempted a simulation analysis in which rare variants were causal for the phenotype and the association statistics matched those of the real GWAS of height. However, such a simulation did not seem to reflect the real scenario: no matter how we changed the association between rare variants and the phenotype, the HFS association p-value could hardly reach the significance level of the SNP association. We propose that this is because simulation cannot properly reflect how variants impact functional genomics: when a rare variant is randomly selected as the causal variant, there is a high probability that it has no impact on functional genomics, so its HFS would be close to zero. When such a variant is set as causal (which is unlikely in a real scenario), HFS would not properly capture the association. We reasoned that it might be difficult to evaluate HFS by simulation, since the nonlinear relations between SNPs and HFS, as well as among SNPs, are difficult to simulate properly.

      Author response image 5.

      (3) Sei predicts chromatin status / ChIP-seq peaks in the center of a 4kb region. It would therefore be more relevant to predict HFS using overlapping sequence windows that tile the genome as opposed to using non-overlapping windows for computing HFS scores. Specifically, in line 482, the authors state that "the HFS score represents overall activity of the entire sequence, not only the few bp at the center", but this would not hold given that Sei is predicting activity at the center for any sequence.

      Response: We thank the reviewer for the suggestion on sliding window design. In this revision, we shifted all loci 2,048 bp to double the number of loci and repeated the fine-mapping and polygenic prediction analysis. For fine-mapping, we found that the result was generally robust with regard to sliding window procedure, and the majority of the causal associations were retained:

      Line 207: “29.4% of causal loci (PIP>0.95) in the original analysis were still causal in sliding window analysis. For 31.1% and 29.3% of causal loci, the 5’ and 3’ overlapping locus, respectively, had PIP>0.95 in sliding window analysis, while the original locus itself was no longer causal.”

      In polygenic prediction, sliding window analysis provided a significantly improved performance compared with previous analysis on non-overlapping loci:

      However, since this revision includes several updates to the polygenic prediction procedure, it was difficult to quantify how much of the improvement was due to the sliding window design. Thus, we directly showed the new result in Figure 5 but did not compare it with the original result.

      We also modified the previously imprecise statement to:

      Line 490: “…it integrated information of the entire sequence, not only the few bp at the center.”
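      The sliding window design discussed in this response (4096 bp loci plus a second set shifted by 2048 bp) can be sketched as below. This is an assumption-laden illustration rather than the authors' pipeline: coordinates are treated as half-open intervals, the region bounds are arbitrary, and partial windows at the region end are simply dropped.

```python
WINDOW = 4096          # locus length in bp, as stated in the text
SHIFT = WINDOW // 2    # 2048 bp offset for the second tiling


def tile(start, end, window=WINDOW, offset=0):
    """Non-overlapping half-open windows [pos, pos + window) inside [start, end)."""
    windows = []
    pos = start + offset
    while pos + window <= end:
        windows.append((pos, pos + window))
        pos += window
    return windows


def sliding_windows(start, end):
    """Union of the original tiling and the tiling shifted by half a window."""
    return sorted(tile(start, end) + tile(start, end, offset=SHIFT))


def overlapping_neighbours(locus, windows):
    """Windows other than `locus` that share at least one base with it."""
    s, e = locus
    return [w for w in windows if w != locus and w[0] < e and s < w[1]]


# Hypothetical 16,384 bp region: 4 original windows plus 3 shifted ones.
windows = sliding_windows(0, 4 * WINDOW)
print(len(windows))
print(overlapping_neighbours((4096, 8192), windows))
```

      Each interior original locus gains one 5’ and one 3’ overlapping locus, which is the sense in which a causal signal can be reproduced "by an overlapping locus" in the sliding window analysis.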

      (4) Is the HFS-based association going to miss coding variation and several regulatory variants such as splicing variants? There are also going to be cases where there's an association driven by a variant that is correlated with a Sei prediction in a neighboring window. These would represent false positives for the method, it would be useful to identify or characterize these cases.

      Response: As the reviewer suggested, sei captures only functional genomic features and is by nature unlikely to perform well when the causal variants affect protein sequences. In this revision, we characterized this by focusing on causal exonic variants (SNP PIP>0.95):

      Line 322: “On the other hand, HFS performed worse than SNP-based fine-mapping in exonic regions. Taking height as an example, PolyFun detected 125 causal SNPs (PIP>0.95) in the exonic regions, but only 16% (20) of loci that harbored them also reached PIP>0.5 (11 reached PIP>0.95) in HFS+SUSIE analysis. Among the 105 loci that missed such signals (HFS PIP<0.5), 12 had a nearby locus (within 10kb) showing HFS PIP>0.95, which likely reflected false positives caused by LD. Thus, SNP-based analysis should be prioritized over HFS in coding regions.”

      Additional minor concerns:

      (1) It's not clear whether SuSie-based finemapping is appropriate at the locus level, when there is limited LD between neighboring HFS bins. How does the choice of the number of causal loci and the size of the segment being finemapped affect the results and is SuSie a good fit in this scenario?

      Response: Following the reviewer’s suggestion, we reran SUSIE with different predefined numbers of causal loci (from 2 to 10), and found that the identified causal loci were consistent.

      Author response image 6.

      Line 211: “Besides, HFS+SUSIE was also robust when the predefined number of causal loci (L=2 to 10) was changed, and the number of detected loci was unchanged.”

      As for the size of segmentation, we divided the predefined segments (independent blocks detected by LDetect) into two halves and reran SUSIE, and found that three additional causal loci emerged in one half. This suggested that using segments that are too small might increase the false positive rate. However, since there is no LD between independent blocks (which is guaranteed by LDetect), it is not necessary to use even longer blocks.

      Author response image 7.

      Line 133: “Simulation analysis revealed that when a non-reference sequence class score was associated with the trait, the reference class score could still capture a median of 70% of the HFS-trait association R2.”

      (2) It is not clear how a single score is chosen from the 117 values predicted by Sei for each locus. SuSie is run assuming a single causal signal per locus, an assumption which may not hold at ~4kb resolution (several classes could be associated with the trait of interest). It's not clear whether SuSie, run in this parameter setting, is a good choice for variable selection here.

      Response: As we discuss below (question 3), in this revision we no longer apply SUSIE to select one sequence class score for each locus, owing to the impact of overfitting, and instead use the reference sequence class uniformly for all loci. As the reviewer suggested, we used simulation to evaluate how this procedure influences HFS performance, especially when multiple sequence classes of the same locus are causal for the phenotype. We found that the reference sequence class score could capture a median of 69.1% of phenotypic R2 when the causal sequence class was not the reference, and a median of 59.2% of R2 when there were 2~5 non-reference causal classes. We concluded that the loss caused by skipping sequence class selection is mild, and that skipping it is necessary given the risk of overfitting.

      Author response image 8.

      (3) A single HFS score is being chosen from amongst multiple tracks at each locus independently. Does this require additional multiple-hypothesis correction?

      Response: We agree with the reviewer that choosing the sequence class for each locus represents multiple testing, and with additional experiments we indeed observed some evidence of overfitting from this procedure. Thus, in this revision, we no longer apply the per-locus feature selection procedure, but instead use the sequence class corresponding to the reference (hg38) sequence. Consequently, additional multiple-testing correction is avoided. We admit that this simplification misses certain information, but as mentioned above, the loss is moderate, and the simplification is necessary to ensure statistical robustness and reduce false positives. In fact, with this simplification we better controlled the inflation factor of the HFS GWAS and obtained better portability in polygenic prediction.

      (4) The results show that a larger number of loci are identified with HFS-based finemapping & that causal loci are enriched for causal SNPs. However, it is not clear how the number of causal loci should relate to the number of SNPs. It would be really nice to see examples of cases where a previously unresolved association is resolved when using HFS-based GWAS + finemapping.

      Response: In this revision, we did not observe a clear relation between causal loci number and causal gene number. The only trend is that SNP-based fine-mapping seemed to perform better in coding regions, in accordance with the fact that HFS captures functional genomic signals. We also added new interpretations to highlight some examples where HFS resolved previously unresolved association signals. For example,

      Line 287: “Specifically, in 1q32.1 region, HFS+SUSIE identified two loci with PIP>0.9 (Figure 4B). SNP-based association also found significant association in this region, but SNP fine-mapping(Weissbrod et al., 2020) could not resolve this signal and only found seven signals between PIP=0.1 to 0.5.”

      (5) Sequence-based deep learning model predictions can be miscalibrated for insertions and deletions (INDELs) as compared to SNPs. Scaling INDEL predictions would likely improve the downstream modeling.

      Response: Following the reviewer’s suggestion, we conducted a sensitivity analysis that removed all indels on chromosome 1 and repeated the HFS analysis. Removing indels indeed increased the number of significant (p<5e-8) associations by 9%, but also slightly increased the inflation factor (paired Wilcoxon test p=0.0001). In the fine-mapping analysis, removing indels caused a 4.7% decrease in the number of detected causal associations (PIP>0.95). We reasoned that potential miscalibration on indels does affect the statistical power of HFS, but the proper way to control this effect may not be straightforward and still awaits optimization. In this revision, we kept all indels in the analysis, since we consider the power of fine-mapping more important than the power of marginal association.

      Line 213: “Lastly, removing insertions and deletions would reveal 9% more significant associations (p<5×10^-8) but 4.7% fewer causal associations (PIP>0.95), and slightly increased the inflation factor (Wilcoxon p=0.0001, Figure S4).”

      Author response image 9.

      Reviewer #1 (Recommendations For The Authors):

      It was unclear to me why the sei output was rounded to two decimal places to "avoid influence of sei prediction noise". Wouldn't rounding introduce additional noise?

      Response: We thank the reviewer for pointing out our inadequate description. The rounding procedure is used to mask low values that likely do not reflect any real change. The idea is that, even if a variant does not actually bring about any functional change, sei would still output a very low HFS value that is close to, but not equal to, zero. With the rounding procedure, such low values are set to zero, which avoids noise. We have added this rationale to the Methods section:

      Line 529: “This is due to the fact that even if a variant actually makes no impact on functional genomics, sei would still output a value that is close to but not equal to the reference sequence class score. The rounding procedure would set such HFS to zero and remove the random value from sei.”

      Minor comments / typos:

      • There are many typos in the abstract.

      Response: We have corrected the typos and grammar issues in the abstract in this revision.

      • I believe "Arachnoid acid-intelligence" should be "Arachidonic acid-intelligence".

      • Consistently there is no space between text and parenthetical citations. For example, "sei(Chen et al., 2022)" should be "sei (Chen et al., 2022)".

      • Line 110: "at least one non-reference haplotypes" --> "at least one non-reference haplotype".

      • Line 155: "data-based method" --> "data-based methods".

      • Lines 165-166: "functionally importance" --> "functional importance".

      Response: We have made these revisions accordingly.

      • Line 210: the sentence containing "this annotation on conditioned of a set of baseline annotations" is unclear.

      Response: We have revised this sentence as “…regressed the PIP against this annotation, with a set of baseline annotations included as covariates, similar to the LDSC framework.”

      • Line 213: "association" --> "associations".

      • Line 219: "association" --> "associations".

      • Line 251: "result" --> "results".

      • Line 269: "result" --> "results".

      • Line 289: "known to involved" --> "known to be involved".

      • Line 356: "LDAK along" --> "LDAK alone".

      • Line 362: "BOLT-LMM along" --> "BOLT-LMM alone".

      • Supplement: "Hihglighted" --> "Highlighted".

      Response: We have made these revisions accordingly.

      • Line 444: Were "British ancestry Caucasians" defined as individuals that self-identified as "white British"? If so, then they should be described as "self-identified "white British"".

      Response: As the reviewer pointed out, we have changed the description to self-identified British ancestry Caucasians.

      Reviewer #2 (Recommendations For The Authors):

      (1) A 2022 cistrome-wide association study (CWAS) computed associations between genetically-predicted chromatin activity and phenotypes. Adding a reference to this paper would be helpful. https://pubmed.ncbi.nlm.nih.gov/36071171/

      Response: Following the reviewer’s suggestion, we discussed the similarity between CWAS and our study:

      Line 89: “In line with this notion, a recent similar strategy called cistrome-wide association study (CWAS) integrated variant-chromatin activity and variant-phenotype associations to boost the power of genetic studies of cancer (Baca et al., 2022).”

      (2) Line 487 states: "We applied sei to predict 21,906 functional genomic tracks for each sequence, without normalizing for histone mark." It's not clear what normalization is being referred to here.

      Response: We have revised the sentence to:

      Line 495: “We applied sei to predict 21,906 functional genomic tracks for each sequence, without normalizing for histone mark (dividing each track score by the sum of histone mark scores) as suggested by the sei authors.”

      (3) The figures are extremely low resolution, they need to be updated.

      Response: In this revision, we uploaded a separate PDF file for each figure to provide high-resolution graphs.

      (4). The results section was difficult to follow and would benefit from being written more clearly.

      Response: In this revision, we rearranged parts of the Results section to better convey the main idea. We moved all statistical results into parentheses and focused the main text on interpretation. For example,

      Line 123: “Further evaluation indicated that this low LD was led by two factors: integration of rare variant impacts and segmentation. Firstly, excluding rare variants from HFS caused the LD raised to median=0.14 (Method; Figure S2C). Secondly, median LD of SNPs from adjacent loci was 0.06, which was significantly higher than HFS LD (paired Wilcoxon p=1.76×10⁻⁵) but significantly lower than HFS LD without rare variants (paired Wilcoxon p<2.2×10⁻¹⁶).”

      (5) "Along" is used several times in the final results section (PRS estimation), this should be "alone".

      Response: We have replaced all misused instances of “along” with “alone” in this revision.

      (6) Instead of using notation identifying genomic location, it might be clearer to provide gene names when illustrating examples of trait-associated promoters.

      Response: In this revision, we added the gene names of the corresponding promoters to the main text to better clarify the findings.

    2. Reviewer #2 (Public Review):

      Summary:

      In this work, Song et al. propose a locus-based framework for performing GWAS and related downstream analyses including finemapping and polygenic risk score (PRS) estimation. GWAS are not sufficiently powered to detect phenotype associations with low-frequency variants. To overcome this limitation, the manuscript proposes a method to aggregate variant impacts on chromatin and transcription across a 4096 base pair (bp) locus in the form of a haplotype function score (HFS). At each locus, an association is computed between the HFS and trait. Computing associations at the level of imputed functional genomic scores enables integration of information across variants spanning the allele frequency spectrum and bolsters the power of GWAS.

      The HFS for each locus is derived from a sequence-based predictive model - Sei. Sei predicts 21,907 chromatin and TF binding tracks, which can be projected onto 40 pre-defined sequence classes (representing promoters, enhancers etc.). For each 4096 bp haplotype in their UKB cohort, the proposed method uses the Sei sequence class scores to derive the haplotype function score (HFS). The authors apply their method to 14 polygenic traits, identifying ~16,500 HFS-trait associations. They finemap these trait-associated loci with SuSiE, as well as perform target gene/pathway discovery and PRS estimation.
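
      As a rough illustration of the per-locus association step summarized above, the sketch below regresses a trait on a single locus-level score and reports the slope and its t-statistic. This is only a minimal sketch under stated assumptions: one HFS value per individual, a simple linear model without covariates, and simulated data. The function name `hfs_association` is hypothetical, and this is not the authors' actual pipeline.

```python
import math
import random


def hfs_association(hfs, trait):
    """Simple linear regression of trait on HFS.

    Returns (slope, t_statistic). Hypothetical helper for illustration;
    covariates and relatedness are ignored in this sketch.
    """
    n = len(hfs)
    mean_x = sum(hfs) / n
    mean_y = sum(trait) / n
    sxx = sum((x - mean_x) ** 2 for x in hfs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hfs, trait))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares and standard error of the slope
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(hfs, trait))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope, slope / se


# Simulated data (assumption): a true effect of 0.3 per unit of HFS plus noise
random.seed(0)
hfs = [random.gauss(0, 1) for _ in range(500)]
trait = [0.3 * x + random.gauss(0, 1) for x in hfs]
slope, t_stat = hfs_association(hfs, trait)
```

      In a genome-wide setting, a test of this kind would be repeated for every locus, so the resulting statistics would need correction for multiple testing.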

      Strengths:

      Sequence-based deep learning predictors of chromatin status and TF binding have become increasingly accurate over the past few years. Imputing aggregated variant impact using Sei, and then performing an HFS-trait association is therefore an interesting approach to bolster power in GWAS discovery. The manuscript demonstrates that region-level associations can be identified at the level of an aggregated functional score using sequence-based deep learning models. The finemapping and pathway identification analyses suggest that HFS-based associations identify relevant causal pathways and genes from an association study. Identifying associations at the level of functional genomics increases portability of PRSs across populations. Imputing functional genomic predictions using a sequence-based deep learning model does not suffer from the limitation of TWAS where gene expression is imputed from a limited size reference panel such as GTEx and is an interesting direction to bolster discovery power.

      However, a few limitations to this method in its current form are:

      (1) HFS-based association is going to miss coding variation as well as noncoding regulatory variants such as splicing variants/polyadenylation variants which are not modeled by Sei. This will lead to false negatives in the HFS-based association and additionally false negatives + associated false positives in the finemapping. Going forward, it'll therefore be important to characterize how this influences the genome-wide finemapping.

      (2) Sei predicts chromatin status / ChIP-seq peaks in the center of a 4kb region. It is therefore not clear whether the functional effects of variants not in the center of the 4kb region would be captured in a single Sei score. It also remains unclear how much the choice of window affects the association tests / finemapping.

      (3) There are going to be cases where there's an association driven by a variant that is correlated with a Sei prediction in a neighboring window. These would represent false positives for the method, it would be useful to identify or characterize these cases.

      Minor Concerns:

      (1) Sequence-based deep learning model predictions can be miscalibrated for insertions and deletions (INDELs) as compared to SNPs. It'll be important to note that model INDEL scores may not be calibrated, which might also lead to false positives / false negatives in the finemapping.

    3. eLife assessment

      This valuable paper presents a new approach for association testing, using the output of neural networks that have been trained to predict functional changes from DNA sequences. As such, the approach is an interesting addition to statistical genetics, and the evidence for the presented method being able to identify trait-associations in regions where GWASs are typically underpowered is solid. A limitation is, however, that it is unclear how the quality of these associations compares to those detected using conventional methods. Additional work assessing this method's power and characterizing false positives / false negative regions would be critical to ensure that the method is broadly adopted by the field.

    4. Reviewer #1 (Public Review):

      Summary:

      In this paper, Song, Shi, and Lin use an existing deep learning-based sequence model to derive a score for each haplotype within a genomic region, and then perform association tests between these scores and phenotypes of interest. The authors then perform some downstream analyses (fine-mapping, various enrichment analyses, building polygenic scores) to ensure that these associations are meaningful. The authors find that their approach allows them to find additional associations, the associations have biologically interpretable enrichments in terms of tissues and pathways, and can slightly improve polygenic scores when combined with standard SNP-based PRS.

      Strengths:

      - I found the central idea of the paper to be conceptually straightforward and an appealing way to use the power of sequence models in an association testing framework.

      - The findings are largely biologically interpretable, and it seems like this could be a promising approach to boost power for some downstream applications.

      Weaknesses:

      - While not a weakness of the manuscript, the proposed method is computationally intensive.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Public Comments

      Reviewer 1

      (1) Despite the well-established role of Netrin-1 and UNC5C axon guidance during embryonic commissural axons, it remains unclear which cell type(s) express Netrin-1 or UNC5C in the dopaminergic axons and their targets. For instance, the data in Figure 1F-G and Figure 2 are quite confusing. Does Netrin-1 or UNC5C express in all cell types or only dopamine-positive neurons in these two mouse models? It will also be important to provide quantitative assessments of UNC5C expression in dopaminergic axons at different ages.

      Netrin-1 is a secreted protein and in this manuscript we did not examine what cell types express Netrin-1. This question is not the focus of the study and we consider it irrelevant to the main issue we are addressing, which is where in the forebrain regions we examined Netrin-1+ cells are present. As per the reviewer’s request we include below images showing Netrin-1 protein and Netrin-1 mRNA expression in the forebrain. In Figure 1 below, we show a high magnification immunofluorescent image of a coronal forebrain section showing Netrin-1 protein expression.

      Author response image 1.

      This confocal microscope image shows immunofluorescent staining for Netrin-1 (green) localized around cell nuclei (stained by DAPI in blue). This image was taken from a coronal section of the lateral septum of an adult male mouse. Scale bar = 20µm

      In Figures 2 and 3 below we show low and high magnification images from an RNAscope experiment confirming that cells in the forebrain regions examined express Netrin-1 mRNA.

      Author response image 2.

      This confocal microscope image of a coronal brain section of the medial prefrontal cortex of an adult male mouse shows Netrin-1 mRNA expression (green) and cell nuclei (DAPI, blue). Brain regions are as follows: Cg1: Anterior cingulate cortex 1, DP: dorsopeduncular cortex, fmi: forceps minor of the corpus callosum, IL: Infralimbic Cortex, PrL: Prelimbic Cortex

      Author response image 3.

      A higher resolution image from the same sample as in Figure 2 shows Netrin-1 mRNA (green) and cell nuclei (DAPI; blue). DP = dorsopeduncular cortex

      Regarding UNC5c, this receptor homologue is expressed by dopamine neurons in the rodent ventral tegmental area (Daubaras et al., 2014; Manitt et al., 2010; Phillips et al., 2022). This does not preclude UNC5c expression in other cell types. UNC5c receptors are ubiquitously expressed in the brain throughout development, performing many different developmental functions (Kim and Ackerman, 2011; Murcia-Belmonte et al., 2019; Srivatsa et al., 2014). In this study we are interested in UNC5c expression by dopamine neurons, and particularly by their axons projecting to the nucleus accumbens. We therefore used immunofluorescent staining in the nucleus accumbens, showing UNC5 expression in TH+ axons. This work adds to the study by Manitt et al., 2010, which examined UNC5 expression in the VTA. Manitt et al. used Western blotting to demonstrate that UNC5 expression in VTA dopamine neurons increases during adolescence, as can be seen in the following figure:

      References:

      Daubaras M, Bo GD, Flores C. 2014. Target-dependent expression of the netrin-1 receptor, UNC5C, in projection neurons of the ventral tegmental area. Neuroscience 260:36–46. doi:10.1016/j.neuroscience.2013.12.007

      Kim D, Ackerman SL. 2011. The UNC5C Netrin Receptor Regulates Dorsal Guidance of Mouse Hindbrain Axons. J Neurosci 31:2167–2179. doi:10.1523/jneurosci.5254-10.2011

      Manitt C, Labelle-Dumais C, Eng C, Grant A, Mimee A, Stroh T, Flores C. 2010. Peri-Pubertal Emergence of UNC-5 Homologue Expression by Dopamine Neurons in Rodents. PLoS ONE 5:e11463-14. doi:10.1371/journal.pone.0011463

      Murcia-Belmonte V, Coca Y, Vegar C, Negueruela S, Romero C de J, Valiño AJ, Sala S, DaSilva R, Kania A, Borrell V, Martinez LM, Erskine L, Herrera E. 2019. A Retino-retinal Projection Guided by Unc5c Emerged in Species with Retinal Waves. Current Biology 29:1149-1160.e4. doi:10.1016/j.cub.2019.02.052

      Phillips RA, Tuscher JJ, Black SL, Andraka E, Fitzgerald ND, Ianov L, Day JJ. 2022. An atlas of transcriptionally defined cell populations in the rat ventral tegmental area. Cell Reports 39:110616. doi:10.1016/j.celrep.2022.110616

      Srivatsa S, Parthasarathy S, Britanova O, Bormuth I, Donahoo A-L, Ackerman SL, Richards LJ, Tarabykin V. 2014. Unc5C and DCC act downstream of Ctip2 and Satb2 and contribute to corpus callosum formation. Nat Commun 5:3708. doi:10.1038/ncomms4708

      (2) Figure 1 used shRNA to knockdown Netrin-1 in the Septum and these mice were subjected to behavioral testing. These results, again, are not supported by any valid data that the knockdown approach actually worked in dopaminergic axons. It is also unclear whether knocking down Netrin-1 in the septum will re-route dopaminergic axons or lead to cell death in the dopaminergic neurons in the substantia nigra pars compacta?

      First, we want to clarify and emphasize that our knockdown approach was not designed to knock down Netrin-1 in dopamine neurons or their axons. Our goal was to knock down Netrin-1 expression in cells expressing this guidance cue gene in the dorsal peduncular cortex.

      We have previously established the efficacy of the shRNA Netrin-1 knockdown virus used in this experiment for reducing the expression of Netrin-1 (Cuesta et al., 2020). The shRNA reduces Netrin-1 levels in vitro and in vivo.

      We agree that our experiments do not address the fate of the dopamine axons that are misrouted away from the medial prefrontal cortex. This research is ongoing, and we have now added a note regarding this to our manuscript.

      Our current hypothesis, based on experiments being conducted as part of another line of research in the lab, is that these axons are rerouted to a different brain region which they then ectopically innervate. In these experiments we are finding that male mice exposed to tetrahydrocannabinol in adolescence show reduced dopamine innervation in the medial prefrontal cortex in adulthood but increased dopamine input in the orbitofrontal cortex. In addition, these mice show increased action impulsivity in the Go/No-Go task in adulthood (Capolicchio et al., Society for Neuroscience 2023 Abstracts)

      References:

      Capolicchio T., Hernandez, G., Dube, E., Estrada, K., Giroux, M., Flores, C. (2023) Divergent outcomes of delta 9 - tetrahydrocannabinol in adolescence on dopamine and cognitive development in male and female mice. Society for Neuroscience, Washington, DC, United States [abstract].

      Cuesta S, Nouel D, Reynolds LM, Morgunova A, Torres-Berrío A, White A, Hernandez G, Cooper HM, Flores C. 2020. Dopamine Axon Targeting in the Nucleus Accumbens in Adolescence Requires Netrin-1. Frontiers Cell Dev Biology 8:487. doi:10.3389/fcell.2020.00487

      (3) Another issue with Figure1J. It is unclear whether the viruses were injected into a WT mouse model or into a Cre-mouse model driven by a promoter specifically expresses in dorsal peduncular cortex? The authors should provide evidence that Netrin-1 mRNA and proteins are indeed significantly reduced. The authors should address the anatomic results of the area of virus diffusion to confirm the virus specifically infected the cells in dorsal peduncular cortex.

      All the virus knockdown experiments were conducted in wild-type mice; we have added this information to Figure 1K.

      The efficacy of the shRNA in knocking down Netrin-1 was demonstrated by Cuesta et al. (2020) both in vitro and in vivo, as we show in our response to the reviewer’s previous comment above.

      We also now provide anatomical images demonstrating the localization of the injection and area of virus diffusion in the mouse forebrain. In Author response image 4 below the area of virus diffusion is visible as green fluorescent signal.

      Author response image 4.

      Fluorescent microscopy image of a mouse forebrain demonstrating the localization of the injection of a virus to knock down Netrin-1. The location of the virus is in green, while cell nuclei are in blue (DAPI). Abbreviations: DP: dorsopeduncular cortex IL: infralimbic cortex

      References:

      Cuesta S, Nouel D, Reynolds LM, Morgunova A, Torres-Berrío A, White A, Hernandez G, Cooper HM, Flores C. 2020. Dopamine Axon Targeting in the Nucleus Accumbens in Adolescence Requires Netrin-1. Frontiers Cell Dev Biology 8:487. doi:10.3389/fcell.2020.00487

      (4) The authors need to provide information regarding the efficiency and duration of knocking down. For instance, in Figure 1K, the mice were tested after 53 days post injection, can the virus activity in the brain last for such a long time?

      In our study we are interested in the role of Netrin-1 expression in the guidance of dopamine axons from the nucleus accumbens to the medial prefrontal cortex. The critical window for these axons leaving the nucleus accumbens and growing to the cortex is early adolescence (Reynolds et al., 2018b). This is why we injected the virus at the onset of adolescence, at postnatal day 21. As dopamine axons grow from the nucleus accumbens to the prefrontal cortex, they pass through the dorsal peduncular cortex. We disrupted Netrin-1 expression at this point along their route to determine whether it is the Netrin-1 present along their route that guides these axons to the prefrontal cortex. We hypothesized that the shRNA Netrin-1 virus would disrupt the growth of the dopamine axons, reducing the number of axons that reach the prefrontal cortex and therefore the number of axons that innervate this region in adulthood.

      We conducted our behavioural tests during adulthood, after the critical window during which dopamine axon growth occurs, so as to observe the enduring behavioural consequences of this misrouting. This experimental approach is designed for the shRNA Netrin-1 virus to be expressed in cells in the dorsopeduncular cortex when the dopamine axons are growing, during adolescence.

      References:

      Capolicchio T., Hernandez, G., Dube, E., Estrada, K., Giroux, M., Flores, C. (2023) Divergent outcomes of delta 9 - tetrahydrocannabinol in adolescence on dopamine and cognitive development in male and female mice. Society for Neuroscience, Washington, DC, United States [abstract].

      Reynolds LM, Yetnikoff L, Pokinko M, Wodzinski M, Epelbaum JG, Lambert LC, Cossette M-P, Arvanitogiannis A, Flores C. 2018b. Early Adolescence is a Critical Period for the Maturation of Inhibitory Behavior. Cerebral cortex 29:3676–3686. doi:10.1093/cercor/bhy247

      (5) In Figure 1N-Q, silencing Netrin-1 results in less DA axons targeting to infralimbic cortex, but why the Netrin-1 knocking down mice revealed the improved behavior?

      This is indeed an intriguing finding, and we have now added a mention of it to our manuscript. We have demonstrated that misrouting dopamine axons away from the medial prefrontal cortex during adolescence alters behaviour, but why this improves action impulsivity is currently unknown to us. One potential answer is that the dopamine axons are misrouted to a different brain region that is also involved in controlling impulsive behaviour, perhaps the dorsal striatum (Kim and Im, 2019) or the orbital prefrontal cortex (Jonker et al., 2015).

      We would also like to note that we are finding that other manipulations that appear to reroute dopamine axons to unintended targets can lead to reduced action impulsivity as measured using the Go No Go task. As we mentioned above, current experiments in the lab, which are part of a different line of research, are showing that male mice exposed to tetrahydrocannabinol in adolescence show reduced dopamine innervation in the medial prefrontal cortex in adulthood, but increased dopamine input in the orbitofrontal cortex. In addition, these mice show increased action impulsivity in the Go/No-Go task in adulthood (Capolicchio et al., Society for Neuroscience 2023 Abstracts)

      References

      Capolicchio T., Hernandez, G., Dube, E., Estrada, K., Giroux, M., Flores, C. (2023) Divergent outcomes of delta 9 - tetrahydrocannabinol in adolescence on dopamine and cognitive development in male and female mice. Society for Neuroscience, Washington, DC, United States [abstract].

      Jonker FA, Jonker C, Scheltens P, Scherder EJA. 2015. The role of the orbitofrontal cortex in cognition and behavior. Rev Neurosci 26:1–11. doi:10.1515/revneuro-2014-0043

      Kim B, Im H. 2019. The role of the dorsal striatum in choice impulsivity. Ann N York Acad Sci 1451:92–111. doi:10.1111/nyas.13961

      (6) What is the effect of knocking down UNC5C on dopamine axons guidance to the cortex?

      We have found that mice that are heterozygous for a nonsense Unc5c mutation, and as a result have reduced levels of UNC5c protein, show reduced amphetamine-induced locomotion and stereotypy (Auger et al., 2013). In the same manuscript we show that this effect only emerges during adolescence, in concert with the growth of dopamine axons to the prefrontal cortex. This is indirect but strong evidence that UNC5c receptors are necessary for correct adolescent dopamine axon development.

      References

      Auger ML, Schmidt ERE, Manitt C, Dal-Bo G, Pasterkamp RJ, Flores C. 2013. unc5c haploinsufficient phenotype: striking similarities with the dcc haploinsufficiency model. European Journal of Neuroscience 38:2853–2863. doi:10.1111/ejn.12270

      (7) In Figures 2-4, the authors only showed the amount of DA axons and UNC5C in NAcc. However, it remains unclear whether these experiments also impact the projections of dopaminergic axons to other brain regions, critical for the behavioral phenotypes. What about other brain regions such as prefrontal cortex? Do the projection of DA axons and UNC5c level in cortex have similar pattern to those in NAcc?

      UNC5c receptors are expressed throughout development and are involved in many developmental processes (Kim and Ackerman, 2011; Murcia-Belmonte et al., 2019; Srivatsa et al., 2014). We cannot say whether the pattern we observe here is unique to the nucleus accumbens, but it is certainly not universal throughout the brain.

      The brain region we focus on in our manuscript, in addition to the nucleus accumbens, is the medial prefrontal cortex. Close and thorough examination of the prefrontal cortices of adult mice revealed practically no UNC5c expression by dopamine axons. However, we did observe very rare cases of dopamine axons expressing UNC5c. It is not clear whether these rare cases are present before or during adolescence.

      Below is a representative set of images of this observation, which is now also included as Supplementary Figure 4:

      Author response image 5.

      Expression of UNC5c protein in the medial prefrontal cortex of an adult male mouse. Low (A) and high (B) magnification images demonstrate that there is little UNC5c expression in dopamine axons in the medial prefrontal cortex. Here we identify dopamine axons by immunofluorescent staining for tyrosine hydroxylase (TH, see our response to comment #9 regarding the specificity of the TH antibody for dopamine axons in the prefrontal cortex). This figure is also included as Supplementary Figure 4 in the manuscript. Abbreviations: fmi: forceps minor of the corpus callosum, mPFC: medial prefrontal cortex.

      References:

      Kim D, Ackerman SL. 2011. The UNC5C Netrin Receptor Regulates Dorsal Guidance of Mouse Hindbrain Axons. J Neurosci 31:2167–2179. doi:10.1523/jneurosci.5254-10.2011

      Murcia-Belmonte V, Coca Y, Vegar C, Negueruela S, Romero C de J, Valiño AJ, Sala S, DaSilva R, Kania A, Borrell V, Martinez LM, Erskine L, Herrera E. 2019. A Retino-retinal Projection Guided by Unc5c Emerged in Species with Retinal Waves. Current Biology 29:1149-1160.e4. doi:10.1016/j.cub.2019.02.052

      Srivatsa S, Parthasarathy S, Britanova O, Bormuth I, Donahoo A-L, Ackerman SL, Richards LJ, Tarabykin V. 2014. Unc5C and DCC act downstream of Ctip2 and Satb2 and contribute to corpus callosum formation. Nat Commun 5:3708. doi:10.1038/ncomms4708

      (8) Can overexpression of UNC5c or Netrin-1 in male winter hamsters mimic the observations in summer hamsters? Or overexpression of UNC5c in female summer hamsters to mimic the winter hamster? This would be helpful to confirm the causal role of UNC5C in guiding DA axons during adolescence.

      This is an excellent question. We are very interested in both increasing and decreasing UNC5c expression in hamster dopamine axons to see if we can directly manipulate summer hamsters into winter hamsters and vice versa. We are currently exploring virus-based approaches to design these experiments and are excited for results in this area.

      (9) The entire study relied on using tyrosine hydroxylase (TH) as a marker for dopaminergic axons. However, the expression of TH (either by IHC or IF) can be influenced by other environmental factors, that could alter the expression of TH at the cellular level.

      This is an excellent point that we now carefully address in our methods by adding the following:

      In this study we pay great attention to the morphology and localization of the fibres from which we quantify varicosities to avoid counting any fibres stained with TH antibodies that are not dopamine fibres. The fibres that we examine and that are labelled by the TH antibody show features indistinguishable from the classic features of cortical dopamine axons in rodents (Berger et al., 1974; 1983; Van Eden et al., 1987; Manitt et al., 2011), namely they are thin fibres with irregularly-spaced varicosities, are densely packed in the nucleus accumbens, sparsely present only in the deep layers of the prefrontal cortex, and are not regularly oriented in relation to the pial surface. This is in contrast to rodent norepinephrine fibres, which are smooth or beaded in appearance, relatively thick with regularly spaced varicosities, increase in density towards the shallow cortical layers, and are in large part oriented either parallel or perpendicular to the pial surface (Berger et al., 1974; Levitt and Moore, 1979; Berger et al., 1983; Miner et al., 2003). Furthermore, previous studies in rodents have noted that only norepinephrine cell bodies are detectable using immunofluorescence for TH, not norepinephrine processes (Pickel et al., 1975; Verney et al., 1982; Miner et al., 2003), and we did not observe any norepinephrine-like fibres.

      Furthermore, we are not aware of any other processes in the forebrain that are known to be immunopositive for TH under any environmental conditions.

      To reduce confusion, we have replaced the abbreviation for dopamine – DA – with TH in the relevant panels in Figures 1, 2, 3, and 4 to clarify exactly what is represented in these images. As can be seen in these images, fluorescent green labelling is present only in axons, which is to be expected of dopamine labelling in these forebrain regions.

      References:

      Berger B, Tassin JP, Blanc G, Moyne MA, Thierry AM (1974) Histochemical confirmation for dopaminergic innervation of the rat cerebral cortex after destruction of the noradrenergic ascending pathways. Brain Res 81:332–337.

      Berger B, Verney C, Gay M, Vigny A (1983) Immunocytochemical Characterization of the Dopaminergic and Noradrenergic Innervation of the Rat Neocortex During Early Ontogeny. In: Proceedings of the 9th Meeting of the International Neurobiology Society, pp 263–267 Progress in Brain Research. Elsevier.

      Levitt P, Moore RY (1979) Development of the noradrenergic innervation of neocortex. Brain Res 162:243–259.

      Manitt C, Mimee A, Eng C, Pokinko M, Stroh T, Cooper HM, Kolb B, Flores C (2011) The Netrin Receptor DCC Is Required in the Pubertal Organization of Mesocortical Dopamine Circuitry. J Neurosci 31:8381–8394.

      Miner LH, Schroeter S, Blakely RD, Sesack SR (2003) Ultrastructural localization of the norepinephrine transporter in superficial and deep layers of the rat prelimbic prefrontal cortex and its spatial relationship to probable dopamine terminals. J Comp Neurol 466:478–494.

      Pickel VM, Joh TH, Field PM, Becker CG, Reis DJ (1975) Cellular localization of tyrosine hydroxylase by immunohistochemistry. J Histochem Cytochem 23:1–12.

      Van Eden CG, Hoorneman EM, Buijs RM, Matthijssen MA, Geffard M, Uylings HBM (1987) Immunocytochemical localization of dopamine in the prefrontal cortex of the rat at the light and electron microscopical level. Neurosci 22:849–862.

      Verney C, Berger B, Adrien J, Vigny A, Gay M (1982) Development of the dopaminergic innervation of the rat cerebral cortex. A light microscopic immunocytochemical study using anti-tyrosine hydroxylase antibodies. Dev Brain Res 5:41–52.

      (10) Are Netrin-1/UNC5C the only signal guiding dopamine axon during adolescence? Are there other neuronal circuits involved in this process?

      Our intention for this study was to examine the role of Netrin-1 and its receptor UNC5C specifically, but we do not suggest that they are the only molecules to play a role. The process of guiding growing dopamine axons during adolescence is likely complex and we expect other guidance mechanisms to also be involved. From our previous work we know that the Netrin-1 receptor DCC is critical in this process (Hoops and Flores, 2017; Reynolds et al., 2023). Several other molecules have been identified in Netrin-1/DCC signaling processes that control corpus callosum development and there is every possibility that the same or similar molecules may be important in guiding dopamine axons (Schlienger et al., 2023).

      References:

      Hoops D, Flores C. 2017. Making Dopamine Connections in Adolescence. Trends in Neurosciences 1–11. doi:10.1016/j.tins.2017.09.004

      Reynolds LM, Hernandez G, MacGowan D, Popescu C, Nouel D, Cuesta S, Burke S, Savell KE, Zhao J, Restrepo-Lozano JM, Giroux M, Israel S, Orsini T, He S, Wodzinski M, Avramescu RG, Pokinko M, Epelbaum JG, Niu Z, Pantoja-Urbán AH, Trudeau L-É, Kolb B, Day JJ, Flores C. 2023. Amphetamine disrupts dopamine axon growth in adolescence by a sex-specific mechanism in mice. Nat Commun 14:4035. doi:10.1038/s41467-023-39665-1

      Schlienger S, Yam PT, Balekoglu N, Ducuing H, Michaud J-F, Makihara S, Kramer DK, Chen B, Fasano A, Berardelli A, Hamdan FF, Rouleau GA, Srour M, Charron F. 2023. Genetics of mirror movements identifies a multifunctional complex required for Netrin-1 guidance and lateralization of motor control. Sci Adv 9:eadd5501. doi:10.1126/sciadv.add5501

      (11) Finally, despite the authors' claim that the dopaminergic axon project is sensitive to the duration of daylight in the hamster, they never provided definitive evidence to support this hypothesis.

      By “definitive evidence” we think that the reviewer is requesting a single statistical model including measures from both the summer and winter groups. Such a model would provide a probability estimate of whether dopamine axon growth is sensitive to daylight duration. Therefore, we ran these models, one for male hamsters and one for female hamsters.

      In both sexes we find a significant effect of daylength on dopamine innervation, interacting with age. Male age by daylength interaction: F = 6.383, p = 0.00242. Female age by daylength interaction: F = 21.872, p = 1.97 × 10⁻⁹. The full statistical analysis is available as a supplement to this letter (Response_Letter_Stats_Details.docx).
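
      For readers unfamiliar with interaction tests, the sketch below shows how an F statistic for an age by daylength interaction can be computed in a balanced two-way design. It is an illustration only: the exact model fitted for the statistics above is not restated here, the data are invented, and the function name `interaction_f` is hypothetical.

```python
def interaction_f(cells):
    """F statistic for the interaction term of a balanced two-way ANOVA.

    cells[i][j] is a list of replicate measurements for age level i and
    daylength level j. Returns (F, df_interaction, df_error).
    """
    a, b = len(cells), len(cells[0])
    n = len(cells[0][0])  # replicates per cell (balanced design assumed)
    cell_mean = [[sum(c) / n for c in row] for row in cells]
    grand = sum(sum(row) for row in cell_mean) / (a * b)
    row_mean = [sum(row) / b for row in cell_mean]
    col_mean = [sum(cell_mean[i][j] for i in range(a)) / a for j in range(b)]
    # Interaction: what remains of each cell mean after removing main effects
    ss_inter = n * sum(
        (cell_mean[i][j] - row_mean[i] - col_mean[j] + grand) ** 2
        for i in range(a) for j in range(b)
    )
    # Error: variation of replicates around their own cell mean
    ss_error = sum(
        (y - cell_mean[i][j]) ** 2
        for i in range(a) for j in range(b) for y in cells[i][j]
    )
    df_inter = (a - 1) * (b - 1)
    df_error = a * b * (n - 1)
    f_stat = (ss_inter / df_inter) / (ss_error / df_error)
    return f_stat, df_inter, df_error


# Invented numbers for illustration only (not the study's data)
cells = [
    [[10.0, 12.0], [10.0, 12.0]],  # younger age: summer, winter
    [[14.0, 16.0], [20.0, 22.0]],  # older age: summer, winter
]
f_stat, df_inter, df_error = interaction_f(cells)
```

      A large F here indicates that the change with age differs between daylength conditions, which is the kind of evidence the interaction terms reported above are meant to provide.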

      Reviewer 3

      (1) Fig 1 A and B don't appear to be the same section level.

      The reviewer is correct that Fig 1B is anterior to Fig 1A. We have changed Figure 1A to match the section level of Figure 1B.

      (2) Fig 1C. It is not clear that these axons are crossing from the shell of the NAC.

      We have added a dashed line to Figure 1C to highlight the boundary of the nucleus accumbens, which hopefully emphasizes that there are fibres crossing the boundary. We also include here an enlarged image of this panel:

      Author response image 6.

      An enlarged image of Figure 1C in the manuscript. The nucleus accumbens (left of the dotted line) is densely packed with TH+ axons (in green). Some of these TH+ axons can be observed extending from the nucleus accumbens medially towards a region containing dorsally oriented TH+ fibres (white arrows).

      (3) Fig 1. Measuring width of the bundle is an odd way to measure DA axon numbers. First, the width could be changing during adulthood for various reasons including change in brain size. Second, I wouldn't consider these axons in a traditional bundle. Third, could DA axon counts be provided, rather than these proxy measures?

      With regards to potential changes in brain size, we agree that this could have potentially explained the increased width of the dopamine axon pathway. That is why it was important for us to use stereology to measure the density of dopamine axons within the pathway. If the width increased but no new axons grew along the pathway, we would have seen a decrease in axon density from adolescence to adulthood. Instead, our results show that the density of axons remained constant.
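
The arithmetic behind this argument can be sketched in a few lines. This is an illustration only: the function `axon_density`, the fixed sampling height, and every number are hypothetical and are not measurements from the study.

```python
def axon_density(axon_count, width_um, height_um=100.0):
    """Axons per square micrometre within a rectangular sampling region.

    The rectangular region and its default height are simplifying
    assumptions made for this illustration.
    """
    if width_um <= 0 or height_um <= 0:
        raise ValueError("region dimensions must be positive")
    return axon_count / (width_um * height_um)


# Hypothetical adolescent pathway: 1000 axons across a 50 um wide region.
adolescent = axon_density(1000, 50.0)

# The width doubles by adulthood. If no new axons grew, density would halve.
adult_no_growth = axon_density(1000, 100.0)

# If axon number grows in proportion to width, density stays constant,
# which is the pattern the response describes.
adult_with_growth = axon_density(2000, 100.0)
```

Under these assumptions, a wider pathway with unchanged density implies more axons, whereas widening through brain growth alone would show up as a drop in density.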

      We agree with the reviewer that the dopamine axons do not form a traditional “bundle”. Therefore, throughout the manuscript we now avoid using the term bundle.

      Although we cannot count every single axon, an accurate estimate of this number can be obtained using stereology, an unbiased method for efficiently quantifying large, irregularly distributed objects. We used stereology to count TH+ axons in an unbiased subset of the total area occupied by these axons. Unbiased stereology is the gold-standard technique for estimating populations of anatomical objects, such as axons, that are so numerous that it would be impractical or impossible to measure every single one. Here and elsewhere we generally provide results as densities and areas of occupancy (Reynolds et al., 2022). To avoid confusion, we now clarify that we are measuring the width of the area that dopamine axons occupy (rather than the dopamine axon “bundle”).

      References:

      Reynolds LM, Pantoja-Urbán AH, MacGowan D, Manitt C, Nouel D, Flores C. 2022. Dopaminergic System Function and Dysfunction: Experimental Approaches. Neuromethods 31–63. doi:10.1007/978-1-0716-2799-0_2

      (4) TH in the cortex could also be of noradrenergic origin. This needs to be ruled out to score DA axons

      This is the same comment as Reviewer 1 #9. Please see our response below, which we have also added to our methods:

      In this study we pay great attention to the morphology and localization of the fibres from which we quantify varicosities to avoid counting any fibres stained with TH antibodies that are not dopamine fibres. The fibres that we examine and that are labelled by the TH antibody show features indistinguishable from the classic features of cortical dopamine axons in rodents (Berger et al., 1974; 1983; Van Eden et al., 1987; Manitt et al., 2011), namely they are thin fibres with irregularly-spaced varicosities, are densely packed in the nucleus accumbens, sparsely present only in the deep layers of the prefrontal cortex, and are not regularly oriented in relation to the pial surface. This is in contrast to rodent norepinephrine fibres, which are smooth or beaded in appearance, relatively thick with regularly spaced varicosities, increase in density towards the shallow cortical layers, and are in large part oriented either parallel or perpendicular to the pial surface (Berger et al., 1974; Levitt and Moore, 1979; Berger et al., 1983; Miner et al., 2003). Furthermore, previous studies in rodents have noted that only norepinephrine cell bodies are detectable using immunofluorescence for TH, not norepinephrine processes (Pickel et al., 1975; Verney et al., 1982; Miner et al., 2003), and we did not observe any norepinephrine-like fibres.

      References:

      Berger B, Tassin JP, Blanc G, Moyne MA, Thierry AM (1974) Histochemical confirmation for dopaminergic innervation of the rat cerebral cortex after destruction of the noradrenergic ascending pathways. Brain Res 81:332–337.

      Berger B, Verney C, Gay M, Vigny A (1983) Immunocytochemical Characterization of the Dopaminergic and Noradrenergic Innervation of the Rat Neocortex During Early Ontogeny. In: Proceedings of the 9th Meeting of the International Neurobiology Society, pp 263–267 Progress in Brain Research. Elsevier.

      Levitt P, Moore RY (1979) Development of the noradrenergic innervation of neocortex. Brain Res 162:243–259.

      Manitt C, Mimee A, Eng C, Pokinko M, Stroh T, Cooper HM, Kolb B, Flores C (2011) The Netrin Receptor DCC Is Required in the Pubertal Organization of Mesocortical Dopamine Circuitry. J Neurosci 31:8381–8394.

      Miner LH, Schroeter S, Blakely RD, Sesack SR (2003) Ultrastructural localization of the norepinephrine transporter in superficial and deep layers of the rat prelimbic prefrontal cortex and its spatial relationship to probable dopamine terminals. J Comp Neurol 466:478–494.

      Pickel VM, Joh TH, Field PM, Becker CG, Reis DJ (1975) Cellular localization of tyrosine hydroxylase by immunohistochemistry. J Histochem Cytochem 23:1–12.

      Van Eden CG, Hoorneman EM, Buijs RM, Matthijssen MA, Geffard M, Uylings HBM (1987) Immunocytochemical localization of dopamine in the prefrontal cortex of the rat at the light and electron microscopical level. Neurosci 22:849–862.

      Verney C, Berger B, Adrien J, Vigny A, Gay M (1982) Development of the dopaminergic innervation of the rat cerebral cortex. A light microscopic immunocytochemical study using anti-tyrosine hydroxylase antibodies. Dev Brain Res 5:41–52.

      (5) Netrin staining should be provided with NeuN + DAPI; its not clear these are all cell bodies. An in situ of Netrin would help as well.

      A similar comment was raised by Reviewer 1 in point #1. Please see below the immunofluorescent and RNA scope images showing expression of Netrin-1 protein and mRNA in the forebrain.

      Author response image 7.

      This confocal microscope image shows immunofluorescent staining for Netrin-1 (green) localized around cell nuclei (stained by DAPI in blue). This image was taken from a coronal section of the lateral septum of an adult male mouse. Scale bar = 20µm

      Author response image 8.

      This confocal microscope image of a coronal brain section of the medial prefrontal cortex of an adult male mouse shows Netrin-1 mRNA expression (green) and cell nuclei (DAPI, blue). RNAscope was used to generate this image. Brain regions are as follows: Cg1: Anterior cingulate cortex 1, DP: dorsopeduncular cortex, IL: Infralimbic Cortex, PrL: Prelimbic Cortex, fmi: forceps minor of the corpus callosum

      Author response image 9.

      A higher resolution image from the same sample as in Figure 2 shows Netrin-1 mRNA (green) and cell nuclei (DAPI; blue). DP = dorsopeduncular cortex

      (6) The Netrin knockdown needs validation. How strong was the knockdown etc?

      This comment was also raised by Reviewer 1 #1.

      We have previously established the efficacy of the shRNA Netrin-1 knockdown virus used in this experiment for reducing the expression of Netrin-1 (Cuesta et al., 2020). The shRNA reduces Netrin-1 levels in vitro and in vivo.

      References:

      Cuesta S, Nouel D, Reynolds LM, Morgunova A, Torres-Berrío A, White A, Hernandez G, Cooper HM, Flores C. 2020. Dopamine Axon Targeting in the Nucleus Accumbens in Adolescence Requires Netrin-1. Frontiers Cell Dev Biology 8:487. doi:10.3389/fcell.2020.00487

      (7) If the conclusion that knocking down Netrin in cortex decreases DA innervation of the IL, how can that be reconciled with Netrin-Unc repulsion.

      This is an intriguing question and one that we are in the planning stages of addressing with new experiments.

      Although we do not have a mechanistic answer for how a repulsive receptor helps guide these axons, we would like to note that previous indirect evidence from a study by our group also suggests that reducing UNC5c signaling in dopamine axons in adolescence increases dopamine innervation to the prefrontal cortex (Auger et al., 2013).

      References:

      Auger ML, Schmidt ERE, Manitt C, Dal-Bo G, Pasterkamp RJ, Flores C. 2013. unc5c haploinsufficient phenotype: striking similarities with the dcc haploinsufficiency model. European Journal of Neuroscience 38:2853–2863. doi:10.1111/ejn.12270

      (8) The behavioral phenotype in Fig 1 is interesting, but its not clear if its related to DA axons/signaling. IN general, no evidence in this paper is provided for the role of DA in the adolescent behaviors described.

      We agree with the reviewer that the behaviours we describe in adult mice are complex and are likely to involve several neurotransmitter systems. However, there is ample evidence for the role of dopamine signaling in cognitive control behaviours (Bari and Robbins, 2013; Eagle et al., 2008; Ott et al., 2023) and our published work has shown that alterations in the growth of dopamine axons to the prefrontal cortex leads to changes in impulse control as measured via the Go/No-Go task in adulthood (Reynolds et al., 2023, 2018a; Vassilev et al., 2021).

      The other adolescent behaviour we examined was risk-taking-like behaviour in male and female hamsters (Figures 4 and 5), as a means of characterizing the maturation of this behaviour over time. We decided not to use the Go/No-Go task because, as far as we know, it has never been employed in Siberian hamsters and it would be difficult to implement. Instead, we chose the light/dark box paradigm, which requires no training and is ideal for charting behavioural changes over short time periods. Indeed, risk-taking-like behaviour in rodents and in humans changes from adolescence to adulthood, paralleling changes in prefrontal cortex development, including the gradual input of dopamine axons to this region.

      References:

      Bari A, Robbins TW. 2013. Inhibition and impulsivity: Behavioral and neural basis of response control. Progress in neurobiology 108:44–79. doi:10.1016/j.pneurobio.2013.06.005

      Eagle DM, Bari A, Robbins TW. 2008. The neuropsychopharmacology of action inhibition: cross-species translation of the stop-signal and go/no-go tasks. Psychopharmacology 199:439–456. doi:10.1007/s00213-008-1127-6

      Ott T, Stein AM, Nieder A. 2023. Dopamine receptor activation regulates reward expectancy signals during cognitive control in primate prefrontal neurons. Nat Commun 14:7537. doi:10.1038/s41467-023-43271-6

      Reynolds LM, Hernandez G, MacGowan D, Popescu C, Nouel D, Cuesta S, Burke S, Savell KE, Zhao J, Restrepo-Lozano JM, Giroux M, Israel S, Orsini T, He S, Wodzinski M, Avramescu RG, Pokinko M, Epelbaum JG, Niu Z, Pantoja-Urbán AH, Trudeau L-É, Kolb B, Day JJ, Flores C. 2023. Amphetamine disrupts dopamine axon growth in adolescence by a sex-specific mechanism in mice. Nat Commun 14:4035. doi:10.1038/s41467-023-39665-1

      Reynolds LM, Pokinko M, Torres-Berrío A, Cuesta S, Lambert LC, Pellitero EDC, Wodzinski M, Manitt C, Krimpenfort P, Kolb B, Flores C. 2018a. DCC Receptors Drive Prefrontal Cortex Maturation by Determining Dopamine Axon Targeting in Adolescence. Biological psychiatry 83:181–192. doi:10.1016/j.biopsych.2017.06.009

      Vassilev P, Pantoja-Urban AH, Giroux M, Nouel D, Hernandez G, Orsini T, Flores C. 2021. Unique effects of social defeat stress in adolescent male mice on the Netrin-1/DCC pathway, prefrontal cortex dopamine and cognition (Social stress in adolescent vs. adult male mice). Eneuro ENEURO.0045-21.2021. doi:10.1523/eneuro.0045-21.2021

      (9) Fig2 - boxes should be drawn on the NAc diagram to indicate sampled regions. Some quantification of Unc5c would be useful. Also, some validation of the Unc5c antibody would be nice.

      The images presented were taken medial to the anterior commissure and we have edited Figure 2 to show this. However, we did not notice any intra-accumbens variation, including between the core and the shell. Therefore, the images are representative of what was observed throughout the entire nucleus accumbens.

      To quantify UNC5c in the accumbens we conducted a Western blot experiment in male mice at different ages. A one-way ANOVA with band intensity (relative to the 15-day-old average band intensity) as the response variable and age as the predictor variable showed a significant effect of age (F = 5.615, p = 0.01). Post hoc analysis revealed that 15-day-old mice have less UNC5c in the nucleus accumbens than 21- and 35-day-old mice.

      Author response image 10.

      The graph depicts the results of a Western blot experiment of UNC5c protein levels in the nucleus accumbens of male mice at postnatal days 15, 21 or 35 and reveals a significant increase in protein levels at the onset of adolescence.

      Our methods for this Western blot were as follows: Samples were prepared as previously described (Torres-Berrío et al., 2017). Briefly, mice were sacrificed by live decapitation and brains were flash frozen in heptane on dry ice for 10 seconds. Frozen brains were mounted in a cryomicrotome and two 500 µm sections were collected for the nucleus accumbens, corresponding to plates 14 and 18 of the Paxinos mouse brain atlas. Two tissue core samples were collected per section, one for each side of the brain, using a 15-gauge tissue corer (Fine surgical tools Cat no. NC9128328) and ejected into a microtube on dry ice. The tissue samples were homogenized in 100 µl of standard radioimmunoprecipitation assay buffer using a handheld electric tissue homogenizer. The samples were clarified by centrifugation at 4 °C at 15,000 g for 30 minutes. Protein concentration was quantified using a bicinchoninic acid assay kit (Pierce BCA protein assay kit, Cat no. PI23225) and samples were denatured with standard Laemmli buffer for 5 minutes at 70 °C. 10 µg of protein per sample was loaded and run by SDS-PAGE in a Mini-PROTEAN system (Bio-Rad) on an 8% acrylamide gel by stacking for 30 minutes at 60 V and resolving for 1.5 hours at 130 V. The proteins were transferred to a nitrocellulose membrane for 1 hour at 100 V in standard transfer buffer on ice. The membranes were blocked using 5% bovine serum albumin dissolved in tris-buffered saline with Tween 20 and probed with primary (UNC5c, Abcam Cat. no ab302924) and HRP-conjugated secondary antibodies for 1 hour. α-tubulin was probed and used as a loading control. The probed membranes were resolved using SuperSignal West Pico PLUS chemiluminescent substrate (ThermoFisher Cat no. 34579) in a ChemiDoc MP Imaging system (Bio-Rad). Band intensity was quantified using the ChemiDoc software and all ages were normalized to the P15 age group average.
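
The normalization and one-way ANOVA described above can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the helper names `normalize_to_reference` and `one_way_f` and all band intensities are hypothetical toy values, and the sketch computes only the F statistic, not the p-value or the post hoc comparisons.

```python
from statistics import mean


def normalize_to_reference(groups, reference):
    """Divide every value by the mean of the reference group (here P15)."""
    ref_mean = mean(groups[reference])
    return {age: [x / ref_mean for x in values] for age, values in groups.items()}


def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA across groups."""
    values = [x for v in groups.values() for x in v]
    grand = mean(values)
    # Between-group variation: how far each group mean sits from the grand mean.
    ss_between = sum(len(v) * (mean(v) - grand) ** 2 for v in groups.values())
    # Within-group variation: scatter of observations around their group mean.
    ss_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within), df_between, df_within


# Hypothetical raw band intensities (arbitrary units) for three age groups.
raw = {
    "P15": [8.0, 10.0, 12.0],
    "P21": [18.0, 20.0, 22.0],
    "P35": [18.0, 20.0, 22.0],
}

normalized = normalize_to_reference(raw, "P15")
f_raw, df_b, df_w = one_way_f(raw)
f_norm, _, _ = one_way_f(normalized)
```

Because normalizing to the P15 average rescales every value by the same constant, it sets the P15 mean to 1 for plotting without changing the F statistic.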

      Validation of the UNC5c antibody was performed in the lab of Dr. Liu, from whom it was kindly provided. Briefly, in the validation study the authors showed that the anti-UNC5C antibody can detect endogenous UNC5C expression and the level of UNC5C is dramatically reduced after UNC5C knockdown. The antibody can also detect the tagged-UNC5C protein in several cell lines, which was confirmed by a tag antibody (Purohit et al., 2012; Shao et al., 2017).

      References:

      Purohit AA, Li W, Qu C, Dwyer T, Shao Q, Guan K-L, Liu G. 2012. Down Syndrome Cell Adhesion Molecule (DSCAM) Associates with Uncoordinated-5C (UNC5C) in Netrin-1-mediated Growth Cone Collapse. The Journal of biological chemistry 287:27126–27138. doi:10.1074/jbc.m112.340174

      Shao Q, Yang T, Huang H, Alarmanazi F, Liu G. 2017. Uncoupling of UNC5C with Polymerized TUBB3 in Microtubules Mediates Netrin-1 Repulsion. J Neurosci 37:5620–5633. doi:10.1523/jneurosci.2617-16.2017

      (10) "In adolescence, dopamine neurons begin to express the repulsive Netrin-1 receptor UNC5C, and reduction in UNC5C expression appears to cause growth of mesolimbic dopamine axons to the prefrontal cortex".....This is confusing. Figure 2 shows a developmental increase in UNc5c not a decrease. So when is the "reduction in Unc5c expression" occurring?

      We apologize for the mistake in this sentence. We have corrected the relevant passage in our manuscript as follows:

      In adolescence, dopamine neurons begin to express the repulsive Netrin-1 receptor UNC5C, particularly when mesolimbic and mesocortical dopamine projections segregate in the nucleus accumbens (Manitt et al., 2010; Reynolds et al., 2018a). In contrast, dopamine axons in the prefrontal cortex do not express UNC5c except in very rare cases (Supplementary Figure 4). In adult male mice with Unc5c haploinsufficiency, there appears to be ectopic growth of mesolimbic dopamine axons to the prefrontal cortex (Auger et al., 2013). This miswiring is associated with alterations in prefrontal cortex-dependent behaviours (Auger et al., 2013).

      References:

      Auger ML, Schmidt ERE, Manitt C, Dal-Bo G, Pasterkamp RJ, Flores C. 2013. unc5c haploinsufficient phenotype: striking similarities with the dcc haploinsufficiency model. European Journal of Neuroscience 38:2853–2863. doi:10.1111/ejn.12270

      Manitt C, Labelle-Dumais C, Eng C, Grant A, Mimee A, Stroh T, Flores C. 2010. Peri-Pubertal Emergence of UNC-5 Homologue Expression by Dopamine Neurons in Rodents. PLoS ONE 5:e11463-14. doi:10.1371/journal.pone.0011463

      Reynolds LM, Pokinko M, Torres-Berrío A, Cuesta S, Lambert LC, Pellitero EDC, Wodzinski M, Manitt C, Krimpenfort P, Kolb B, Flores C. 2018a. DCC Receptors Drive Prefrontal Cortex Maturation by Determining Dopamine Axon Targeting in Adolescence. Biological psychiatry 83:181–192. doi:10.1016/j.biopsych.2017.06.009

      (11) In Fig 3, a statistical comparison should be made between summer male and winter male, to justify the conclusions that the winter males have delayed DA innervation.

      This analysis was also suggested by Reviewer 1, #11. Here is our response:

      We analyzed the summer and winter data together in ANOVAs separately for males and females. In both sexes we find a significant effect of daylength on dopamine innervation, interacting with age. Male age by daylength interaction: F = 6.383, p = 0.00242. Female age by daylength interaction: F = 21.872, p = 1.97 × 10⁻⁹. The full statistical analysis is available as a supplement to this letter (Response_Letter_Stats_Details.docx).

      (12) Should axon length also be measured here (Fig 3)? It is not clear why the authors have switched to varicosity density. Also, a box should be drawn in the NAC cartoon to indicate the region that was sampled.

      It is untenable to quantify axon length in the prefrontal cortex as we cannot distinguish independent axons. Rather, they are “tangled”; they twist and turn in a multitude of directions as they make contact with various dendrites. Furthermore, they branch extensively. It would therefore be impossible to accurately quantify the number of axons. Using unbiased stereology to quantify varicosities is a valid, well-characterized and straightforward alternative (Reynolds et al., 2022).

      References:

      Reynolds LM, Pantoja-Urbán AH, MacGowan D, Manitt C, Nouel D, Flores C. 2022. Dopaminergic System Function and Dysfunction: Experimental Approaches. Neuromethods 31–63. doi:10.1007/978-1-0716-2799-0_2

      (13) In Fig 3, Unc5c should be quantified to bolster the interesting finding that Unc5c expression dynamics are different between summer and winter hamsters. Unc5c mRNA experiments would also be important to see if similar changes are observed at the transcript level.

      We agree that it would be very interesting to see how UNC5c mRNA and protein levels change over time in summer and winter hamsters, both in males, as the reviewer suggests here, and in females. We are working on conducting these experiments in hamsters as part of a broader expansion of our research in this area. These experiments will require a lengthy amount of time and at this point we feel that they are beyond the scope of this manuscript.

      (14) Fig 4. The peak in exploratory behavior in winter females is counterintuitive and needs to be better discussed. IN general, the light dark behavior seems quite variable.

      This is indeed a very interesting finding, which we have expanded upon in our manuscript as follows:

      When raised under a winter-mimicking daylength, hamsters of either sex show a protracted peak in risk taking. In males, it is delayed beyond 80 days old, but the delay is substantially less in females. This is a counterintuitive finding considering that dopamine development in winter females appears to be accelerated. Our interpretation of this finding is that the timing of the risk-taking peak in females may reflect a balance between different adolescent developmental processes. The fact that dopamine axon growth is accelerated does not imply that all adolescent maturational processes are accelerated. Some may be delayed, for example those that induce axon pruning in the cortex. The timing of the risk-taking peak in winter female hamsters may therefore reflect the amalgamation of developmental processes that are advanced with those that are delayed – producing a behavioural effect that is timed somewhere in the middle. Disentangling the effects of different developmental processes on behaviour will require further experiments in hamsters, including the direct manipulation of dopamine activity in the nucleus accumbens and prefrontal cortex.

      Full Reference List

      Auger ML, Schmidt ERE, Manitt C, Dal-Bo G, Pasterkamp RJ, Flores C. 2013. unc5c haploinsufficient phenotype: striking similarities with the dcc haploinsufficiency model. European Journal of Neuroscience 38:2853–2863. doi:10.1111/ejn.12270

      Bari A, Robbins TW. 2013. Inhibition and impulsivity: Behavioral and neural basis of response control. Progress in neurobiology 108:44–79. doi:10.1016/j.pneurobio.2013.06.005

      Cuesta S, Nouel D, Reynolds LM, Morgunova A, Torres-Berrío A, White A, Hernandez G, Cooper HM, Flores C. 2020. Dopamine Axon Targeting in the Nucleus Accumbens in Adolescence Requires Netrin-1. Frontiers Cell Dev Biology 8:487. doi:10.3389/fcell.2020.00487

      Daubaras M, Bo GD, Flores C. 2014. Target-dependent expression of the netrin-1 receptor, UNC5C, in projection neurons of the ventral tegmental area. Neuroscience 260:36–46. doi:10.1016/j.neuroscience.2013.12.007

      Eagle DM, Bari A, Robbins TW. 2008. The neuropsychopharmacology of action inhibition: cross-species translation of the stop-signal and go/no-go tasks. Psychopharmacology 199:439–456. doi:10.1007/s00213-008-1127-6

      Hoops D, Flores C. 2017. Making Dopamine Connections in Adolescence. Trends in Neurosciences 1–11. doi:10.1016/j.tins.2017.09.004

      Jonker FA, Jonker C, Scheltens P, Scherder EJA. 2015. The role of the orbitofrontal cortex in cognition and behavior. Rev Neurosci 26:1–11. doi:10.1515/revneuro-2014-0043

      Kim B, Im H. 2019. The role of the dorsal striatum in choice impulsivity. Ann N York Acad Sci 1451:92–111. doi:10.1111/nyas.13961

      Kim D, Ackerman SL. 2011. The UNC5C Netrin Receptor Regulates Dorsal Guidance of Mouse Hindbrain Axons. J Neurosci 31:2167–2179. doi:10.1523/jneurosci.5254-10.2011

      Manitt C, Labelle-Dumais C, Eng C, Grant A, Mimee A, Stroh T, Flores C. 2010. Peri-Pubertal Emergence of UNC-5 Homologue Expression by Dopamine Neurons in Rodents. PLoS ONE 5:e11463-14. doi:10.1371/journal.pone.0011463

      Murcia-Belmonte V, Coca Y, Vegar C, Negueruela S, Romero C de J, Valiño AJ, Sala S, DaSilva R, Kania A, Borrell V, Martinez LM, Erskine L, Herrera E. 2019. A Retino-retinal Projection Guided by Unc5c Emerged in Species with Retinal Waves. Current Biology 29:1149-1160.e4. doi:10.1016/j.cub.2019.02.052

      Ott T, Stein AM, Nieder A. 2023. Dopamine receptor activation regulates reward expectancy signals during cognitive control in primate prefrontal neurons. Nat Commun 14:7537. doi:10.1038/s41467-023-43271-6

      Phillips RA, Tuscher JJ, Black SL, Andraka E, Fitzgerald ND, Ianov L, Day JJ. 2022. An atlas of transcriptionally defined cell populations in the rat ventral tegmental area. Cell Reports 39:110616. doi:10.1016/j.celrep.2022.110616

      Purohit AA, Li W, Qu C, Dwyer T, Shao Q, Guan K-L, Liu G. 2012. Down Syndrome Cell Adhesion Molecule (DSCAM) Associates with Uncoordinated-5C (UNC5C) in Netrin-1-mediated Growth Cone Collapse. The Journal of biological chemistry 287:27126–27138. doi:10.1074/jbc.m112.340174

      Reynolds LM, Hernandez G, MacGowan D, Popescu C, Nouel D, Cuesta S, Burke S, Savell KE, Zhao J, Restrepo-Lozano JM, Giroux M, Israel S, Orsini T, He S, Wodzinski M, Avramescu RG, Pokinko M, Epelbaum JG, Niu Z, Pantoja-Urbán AH, Trudeau L-É, Kolb B, Day JJ, Flores C. 2023. Amphetamine disrupts dopamine axon growth in adolescence by a sex-specific mechanism in mice. Nat Commun 14:4035. doi:10.1038/s41467-023-39665-1

      Reynolds LM, Pantoja-Urbán AH, MacGowan D, Manitt C, Nouel D, Flores C. 2022. Dopaminergic System Function and Dysfunction: Experimental Approaches. Neuromethods 31–63. doi:10.1007/978-1-0716-2799-0_2

      Reynolds LM, Pokinko M, Torres-Berrío A, Cuesta S, Lambert LC, Pellitero EDC, Wodzinski M, Manitt C, Krimpenfort P, Kolb B, Flores C. 2018a. DCC Receptors Drive Prefrontal Cortex Maturation by Determining Dopamine Axon Targeting in Adolescence. Biological psychiatry 83:181–192. doi:10.1016/j.biopsych.2017.06.009

      Reynolds LM, Yetnikoff L, Pokinko M, Wodzinski M, Epelbaum JG, Lambert LC, Cossette M-P, Arvanitogiannis A, Flores C. 2018b. Early Adolescence is a Critical Period for the Maturation of Inhibitory Behavior. Cerebral cortex 29:3676–3686. doi:10.1093/cercor/bhy247

      Schlienger S, Yam PT, Balekoglu N, Ducuing H, Michaud J-F, Makihara S, Kramer DK, Chen B, Fasano A, Berardelli A, Hamdan FF, Rouleau GA, Srour M, Charron F. 2023. Genetics of mirror movements identifies a multifunctional complex required for Netrin-1 guidance and lateralization of motor control. Sci Adv 9:eadd5501. doi:10.1126/sciadv.add5501

      Shao Q, Yang T, Huang H, Alarmanazi F, Liu G. 2017. Uncoupling of UNC5C with Polymerized TUBB3 in Microtubules Mediates Netrin-1 Repulsion. J Neurosci 37:5620–5633. doi:10.1523/jneurosci.2617-16.2017

      Srivatsa S, Parthasarathy S, Britanova O, Bormuth I, Donahoo A-L, Ackerman SL, Richards LJ, Tarabykin V. 2014. Unc5C and DCC act downstream of Ctip2 and Satb2 and contribute to corpus callosum formation. Nat Commun 5:3708. doi:10.1038/ncomms4708

      Torres-Berrío A, Lopez JP, Bagot RC, Nouel D, Dal-Bo G, Cuesta S, Zhu L, Manitt C, Eng C, Cooper HM, Storch K-F, Turecki G, Nestler EJ, Flores C. 2017. DCC Confers Susceptibility to Depression-like Behaviors in Humans and Mice and Is Regulated by miR-218. Biological psychiatry 81:306–315. doi:10.1016/j.biopsych.2016.08.017

      Vassilev P, Pantoja-Urban AH, Giroux M, Nouel D, Hernandez G, Orsini T, Flores C. 2021. Unique effects of social defeat stress in adolescent male mice on the Netrin-1/DCC pathway, prefrontal cortex dopamine and cognition (Social stress in adolescent vs. adult male mice). Eneuro ENEURO.0045-21.2021. doi:10.1523/eneuro.0045-21.2021

      Private Comments

      Reviewer #1

      (12) The language should be improved. Some expression is confusing (line178-179). Also some spelling errors (eg. Figure 1M).

      We have removed the word “Already” to make the sentence in lines 178-179 clearer; however, we cannot find a spelling error in Figure 1M or its caption. We have further edited the manuscript for clarity and flow.

      Reviewer #2

      (1) The authors claim to have revealed how the 'timing of adolescence is programmed in the brain'. While their findings certainly shed light on molecular, circuit and behavioral processes that are unique to adolescence, their claim may be an overstatement. I suggest they refine this statement to discuss more specifically the processes they observed in the brain and animal behavior, rather than adolescence itself.

      We agree with the reviewer and have revised the manuscript to specify that we are referring to the timing of specific developmental processes that occur in the adolescent brain, not adolescence overall.

      (2) Along the same lines, the authors should also include a more substantiative discussion of how they selected their ages for investigation (for both mice and hamsters), For mice, their definition of adolescence (P21) is earlier than some (e.g. Spear L.P., Neurosci. and Beh. Reviews, 2000).

      There are certainly differences of opinion between researchers as to the precise definition of adolescence and the period it encompasses. Spear, 2000, provides one excellent discussion of the challenges related to identifying adolescence across species. This work gives specific ages only for rats, not mice (as we use here), and characterizes post-natal days 28-42 as being the conservative age range of “peak” adolescence (page 419, paragraph 1). Immediately thereafter the review states that the full adolescent period is longer than this, and it could encompass post-natal days 20-55 (page 419, paragraph 2).

      We have added the following statement to our methods:

      There is no universally accepted way to define the precise onset of adolescence. Therefore, there is no clear-cut boundary to define adolescent onset in rodents (Spear, 2000). Puberty can be more sharply defined, and puberty and adolescence overlap in time, but the terms are not interchangeable. Puberty is the onset of sexual maturation, while adolescence is a more diffuse period marked by the gradual transition from a juvenile state to independence. We, and others, suggest that adolescence in rodents spans from weaning (postnatal day 21) until adulthood, which we take to start on postnatal day 60 (Reynolds and Flores, 2021). We refer to “early adolescence” as the first two weeks postweaning (postnatal days 21-34). These ranges encompass discrete DA developmental periods (Kalsbeek et al., 1988; Manitt et al., 2011; Reynolds et al., 2018a), vulnerability to drug effects on DA circuitry (Hammerslag and Gulley, 2014; Reynolds et al., 2018a), and distinct behavioral characteristics (Adriani and Laviola, 2004; Makinodan et al., 2012; Schneider, 2013; Wheeler et al., 2013).

      References:

      Adriani W, Laviola G. 2004. Windows of vulnerability to psychopathology and therapeutic strategy in the adolescent rodent model. Behav Pharmacol 15:341–352. doi:10.1097/00008877-200409000-00005

      Hammerslag LR, Gulley JM. 2014. Age and sex differences in reward behavior in adolescent and adult rats. Dev Psychobiol 56:611–621. doi:10.1002/dev.21127

      Hoops D, Flores C. 2017. Making Dopamine Connections in Adolescence. Trends in Neurosciences 1–11. doi:10.1016/j.tins.2017.09.004

      Kalsbeek A, Voorn P, Buijs RM, Pool CW, Uylings HBM. 1988. Development of the Dopaminergic Innervation in the Prefrontal Cortex of the Rat. The Journal of Comparative Neurology 269:58–72. doi:10.1002/cne.902690105

      Makinodan M, Rosen KM, Ito S, Corfas G. 2012. A critical period for social experience-dependent oligodendrocyte maturation and myelination. Science 337:1357–1360. doi:10.1126/science.1220845

      Manitt C, Mimee A, Eng C, Pokinko M, Stroh T, Cooper HM, Kolb B, Flores C. 2011. The Netrin Receptor DCC Is Required in the Pubertal Organization of Mesocortical Dopamine Circuitry. J Neurosci 31:8381–8394. doi:10.1523/jneurosci.0606-11.2011

      Reynolds LM, Flores C. 2021. Mesocorticolimbic Dopamine Pathways Across Adolescence: Diversity in Development. Front Neural Circuit 15:735625. doi:10.3389/fncir.2021.735625

      Reynolds LM, Yetnikoff L, Pokinko M, Wodzinski M, Epelbaum JG, Lambert LC, Cossette MP, Arvanitogiannis A, Flores C. 2018. Early Adolescence is a Critical Period for the Maturation of Inhibitory Behavior. Cerebral cortex 29:3676–3686. doi:10.1093/cercor/bhy247

      Schneider M. 2013. Adolescence as a vulnerable period to alter rodent behavior. Cell and tissue research 354:99–106. doi:10.1007/s00441-013-1581-2

      Spear LP. 2000. Neurobehavioral Changes in Adolescence. Current directions in psychological science 9:111–114. doi:10.1111/1467-8721.00072

      Wheeler AL, Lerch JP, Chakravarty MM, Friedel M, Sled JG, Fletcher PJ, Josselyn SA, Frankland PW. 2013. Adolescent Cocaine Exposure Causes Enduring Macroscale Changes in Mouse Brain Structure. J Neurosci 33:1797–1803. doi:10.1523/jneurosci.3830-12.2013

      (3) Figure 1 - the conclusions hinge on the Netrin-1 staining, as shown in panel G, but the cells are difficult to see. It would be helpful to provide clearer, more zoomed images so readers can better assess the staining. Since Netrin-1 expression reduces dramatically after P4 and they had to use antigen retrieval to see signal, it would be helpful to show some images from additional brain regions and ages to see if expression levels follow predicted patterns. For instance, based on the allen brain atlas, it seems that around P21, there should be high levels of Netrin-1 in the cerebellum, but low levels in the cortex. These would be nice controls to demonstrate the specificity and sensitivity of the antibody in older tissue.

      We do not study the cerebellum and have never stained this region; doing so now would require generating additional tissue, and we’re not sure it would add enough to the information provided to be worthwhile. Note that we have stained the forebrain for Netrin-1 previously, providing broad staining of many brain regions (Manitt et al., 2011).

      References:

      Manitt C, Mimee A, Eng C, Pokinko M, Stroh T, Cooper HM, Kolb B, Flores C. 2011. The Netrin Receptor DCC Is Required in the Pubertal Organization of Mesocortical Dopamine Circuitry. J Neurosci 31:8381–8394. doi:10.1523/jneurosci.0606-11.2011

      (4) Figure 3 - Because mice tend to avoid brightly-lit spaces, the light/dark box is more commonly used as a measure of anxiety-like behavior than purely exploratory behavior (including in the paper they cited). It is important to address this possibility in their discussion of their findings. To bolster their conclusions about the coincidence of circuit and behavioral changes in adolescent hamsters, it would be useful to add an additional measure of exploratory behaviors (e.g. hole board).

      Regarding the light/dark box test, this is an excellent point. We prefer the term “risk taking” to “anxiety-like” and now use the former term in our manuscript. Furthermore, our interest in the behaviour is purely to chart the development of adolescent behaviour across our treatment groups, not to study a particular emotional state. Regardless of the specific emotion or emotions governing the light/dark box behaviour, it is an ideal test for charting adolescent shifts in behaviour as it is well-characterized in this respect, as we discuss in our manuscript.

      (5) Supplementary Figures 4, 5 - The authors defined puberty onset using uterine and testes weights in hamsters. While the weights appear to be different for summer and winter hamsters, there was no statistical comparison. Please add statistical analyses to bolster claims about puberty start times. Also, as many studies use vaginal opening to define puberty onset, it would be helpful to discuss how these measurements typically align and cite relevant literature that described use of uterine weights. Also, Supplementary Figures 4 and 5 were mis-cited as Supp. Fig. 2 in the text (e.g. line 317 and others).

      These are great suggestions. We have added statistical analyses to Supplementary Figures 5 and 6 and provided Vaginal Opening data as Supplementary Figure 7. The statistical analyses confirm that all three characters are delayed in winter hamsters compared to summer hamsters.

      We have also added the following references to the manuscript:

      Darrow JM, Davis FC, Elliott JA, Stetson MH, Turek FW, Menaker M. 1980. Influence of Photoperiod on Reproductive Development in the Golden Hamster. Biol Reprod 22:443–450. doi:10.1095/biolreprod22.3.443

      Ebling FJP. 1994. Photoperiodic Differences during Development in the Dwarf Hamsters Phodopus sungorus and Phodopus campbelli. Gen Comp Endocrinol 95:475–482. doi:10.1006/gcen.1994.1147

      Timonin ME, Place NJ, Wanderi E, Wynne-Edwards KE. 2006. Phodopus campbelli detect reduced photoperiod during development but, unlike Phodopus sungorus, retain functional reproductive physiology. Reproduction 132:661–670. doi:10.1530/rep.1.00019

      (6) The font in many figure panels is small and hard to read (e.g. 1A,D,E,H,I,L...). Please increase the size for legibility.

      We have increased the font size of our figure text throughout the manuscript.

      Reviewer #3

      (15) Fig 1 C,D. Clarify the units of the y axis

      We have now fixed this.

      Full Reference List

      Adriani W, Laviola G. 2004. Windows of vulnerability to psychopathology and therapeutic strategy in the adolescent rodent model. Behav Pharmacol 15:341–352. doi:10.1097/00008877-200409000-00005

      Hammerslag LR, Gulley JM. 2014. Age and sex differences in reward behavior in adolescent and adult rats. Dev Psychobiol 56:611–621. doi:10.1002/dev.21127

      Hoops D, Flores C. 2017. Making Dopamine Connections in Adolescence. Trends in Neurosciences 1–11. doi:10.1016/j.tins.2017.09.004

      Kalsbeek A, Voorn P, Buijs RM, Pool CW, Uylings HBM. 1988. Development of the Dopaminergic Innervation in the Prefrontal Cortex of the Rat. The Journal of Comparative Neurology 269:58–72. doi:10.1002/cne.902690105

      Makinodan M, Rosen KM, Ito S, Corfas G. 2012. A critical period for social experience-dependent oligodendrocyte maturation and myelination. Science 337:1357–1360. doi:10.1126/science.1220845

      Manitt C, Mimee A, Eng C, Pokinko M, Stroh T, Cooper HM, Kolb B, Flores C. 2011. The Netrin Receptor DCC Is Required in the Pubertal Organization of Mesocortical Dopamine Circuitry. J Neurosci 31:8381–8394. doi:10.1523/jneurosci.0606-11.2011

      Reynolds LM, Flores C. 2021. Mesocorticolimbic Dopamine Pathways Across Adolescence: Diversity in Development. Front Neural Circuit 15:735625. doi:10.3389/fncir.2021.735625

      Reynolds LM, Yetnikoff L, Pokinko M, Wodzinski M, Epelbaum JG, Lambert LC, Cossette M-P, Arvanitogiannis A, Flores C. 2018. Early Adolescence is a Critical Period for the Maturation of Inhibitory Behavior. Cerebral cortex 29:3676–3686. doi:10.1093/cercor/bhy247

      Schneider M. 2013. Adolescence as a vulnerable period to alter rodent behavior. Cell and tissue research 354:99–106. doi:10.1007/s00441-013-1581-2

      Spear LP. 2000. Neurobehavioral Changes in Adolescence. Current directions in psychological science 9:111–114. doi:10.1111/1467-8721.00072

      Wheeler AL, Lerch JP, Chakravarty MM, Friedel M, Sled JG, Fletcher PJ, Josselyn SA, Frankland PW. 2013. Adolescent Cocaine Exposure Causes Enduring Macroscale Changes in Mouse Brain Structure. J Neurosci 33:1797–1803. doi:10.1523/jneurosci.3830-12.2013

    2. Reviewer #3 (Public Review):

      This study from the Flores group aims at understanding neuronal circuit changes during adolescence, which is an ill-defined, transitional period involving dramatic changes in behavior and anatomy. They focus on DA innervation of the prefrontal cortex, and its interaction with the guidance cue Netrin-1. They propose DA axons in the PFC increase in the postnatal period, and their density is reduced in a Netrin-1 knockdown, suggesting that Netrin abets the development of this mesocortical pathway. In such mice, impulsivity gauged by a go/no-go task is reduced. They then provide some evidence that Unc5c is developmentally regulated in DA axons. Finally, they use an interesting hamster model to study the effect of light hours on mesocortical innervation, and make some interesting observations about the timing of innervation and Unc5c expression, and the fact that females housed in winter day length conditions display an accelerated innervation of the prefrontal cortex.

      Comments on the revision. Several points were addressed; some remain to be addressed.

      4. It's not clear to me that TH doesn't stain noradrenergic axons in the PFC. See Islam and Blaess, 2021, and references therein.

      6. The Netrin knockdown data provided is from a previous study/samples.

      8. While the authors make the argument that the behavior is linked to DA, they still haven't formally tested it, in my opinion.

      13. Fig 3, Unc5c levels are not yet quantified. Furthermore, I agree with the previous reviewer that Unc5c knockdown would corroborate key aspects of the model.

      New - Developmental trajectory of prefrontal TH-positive axons from early adolescence to adulthood is similar in male and female rats, (Willing Juraska et al., 2017). This needs discussion.

    3. Reviewer #1 (Public Review):

      In this study, Hoops et al. showed that Netrin-1 and UNC5c can guide dopaminergic innervation from nucleus accumbens to cortex during adolescence in rodent models. They found that these dopamine axons project to the prefrontal cortex in a Netrin-1-dependent manner and that knocking down Netrin-1 disrupted motor and learning behaviors in mice. Furthermore, the authors used hamsters, a seasonal model that is affected by the length of daylight, to demonstrate that the guidance of dopamine axons is mediated by environmental factors such as daytime length and in a sex-dependent manner.

      Regarding the cell type specificity of Netrin-1 expression, the authors began by stating "this question is not the focus of the study and we consider it irrelevant to the main issue we are addressing, which is where in the forebrain regions we examined Netrin-1+ cells are present." This statement contradicts the exact specificity issue I raised. They then went on to show the RNAscope data for Netrin-1 in Figure 2, which showed Netrin-1 mRNA was actually expressed quite ubiquitously in anterior cingulate cortex, dorsopeduncular cortex, infralimbic cortex, prelimbic cortex, etc. In addition, contrary to the authors' statement that Netrin-1 is a "secreted protein", the confocal images in Figure 1 in the rebuttal letter actually show Netrin-1 present in "granule-like" organelles inside the cytoplasm of neurons. Finally, the authors presented Figure 7 to indicate the location where virus expressing Netrin-1 shRNA might be located. Again, the brain region targeted was quite focal and most likely did not cover all the Netrin-1+ brain regions in Figure 2. Collectively, these results raised more questions regarding the specificity of Netrin-1 expression in brain regions that are behaviorally relevant to this study.

      With respect to the effectiveness of Netrin-1 knockdown in the animals in this study, the authors cited data in HEK293 cells (Figure 5), which did not include any statistics, and previously published in vivo data in a separate, independent study (Figure 6). They do not provide any data regarding the effectiveness of Netrin-1 knockdown in THIS study.

      Similar concerns regarding UNC5C knockdown (points #6, #7, and #8) were not adequately addressed.

      In brief, while this study provides a potential role of Netrin-1-UNC5C in target innervation of dopaminergic neurons and its behavioral output in risk-taking, the data lack sufficient evidence to firmly establish the cause-effect relationship.

    4. Reviewer #2 (Public Review):

      In this manuscript, Hoops et al., using two different model systems, identified key developmental changes in Netrin-1 and UNC5C signaling that correspond to behavioral changes and are sensitive to environmental factors that affect the timing of development. They found that Netrin-1 expression is highest in regions of the striatum and cortex where TH+ axons are travelling, and that knocking down Netrin-1 reduces TH+ varicosities in mPFC and reduces impulsive behaviors in a Go-No-Go test. Further, they show that the onset of Unc5 expression is sexually dimorphic in mice, and that in Siberian hamsters, environmental effects on development are also sexually dimorphic. This study addresses an important question using approaches that link molecular, circuit and behavioral changes. Understanding developmental trajectories of adolescence, and how they can be impacted by environmental factors, is an understudied area of neuroscience that is highly relevant to understanding the onset of mental health disorders. I appreciated the inclusion of replication cohorts within the study.

    1. Author Response

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      In this paper, the authors developed an image analysis pipeline to automatically identify individual neurons within a population of fluorescently tagged neurons. This application is optimized to deal with multi-cell analysis and builds on a previous software version, developed by the same team, to resolve individual neurons from whole-brain imaging stacks. Using advanced statistical approaches and several heuristics tailored for C. elegans anatomy, the method successfully identifies individual neurons with a fairly high accuracy. Thus, while specific to C. elegans, this method can become instrumental for a variety of research directions such as in-vivo single-cell gene expression analysis and calcium-based neural activity studies.

      Thank you.

      Reviewer #2 (Public Review):

      The authors succeed in generalizing the pre-alignment procedure for their cell identification method to allow it to work effectively on data with only small subsets of cells labeled. They convincingly show that their extension accurately identifies head angle, based on finding autofluorescent tissue and looking for a symmetric l/r axis. They demonstrate that the method works to allow the identification of a particular subset of neurons. Their approach should be a useful one for researchers wishing to identify subsets of head neurons in C. elegans, and the ideas might be useful elsewhere.

      The authors also assess the relative usefulness of several atlases for making identity predictions. They attempt to give some additional general insights on what makes a good atlas, but here insights seem less clear as available data do not allow for experiments that cleanly decouple: 1) the number of examples in the atlas; 2) the completeness of the atlas; and 3) the match in strain and imaging modality discussed. In the presented experiments the custom atlas, besides the strain and imaging modality mismatches discussed, is also the only complete atlas with more than one example. The neuroPAL atlas is an imperfect stand-in, since a significant fraction of cells could not be identified in these data sets, making it a 60/40 mix of Openworm and a hypothetical perfect neuroPAL comparison. This waters down general insights since it is unclear if the performance is driven by strain/imaging modality or these difficulties creating a complete neuroPal atlas. The experiments do usefully explore the volume of data needed. Though generalization remains to be shown, the insight is useful for future atlas building that for the specific (small) set of cells labeled in the experiments 5-10 examples is sufficient to build an accurate atlas.

      The reviewer brings up an interesting point. As the reviewer noted, given the imperfection of the datasets (ours and others’), it is possible that artifacts from incomplete atlases can interfere with the assessment of the performances of different atlases. To address this, as the reviewer suggested, we have searched the literature and found two sets of data that give specific coordinates of identified neurons (both using NeuroPAL). We compared the performance of the atlases derived from these datasets to the strain-specific atlases, and the original conclusion stands. Details are now included in the revised manuscript (Figure 3- figure supplement 2).

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      I appreciate the new mosaic analysis (Fig. 3 -figure suppl 2). Please fix the y-axis tick label that I believe should be 0.8 (instead of 0.9).

      We thank the reviewer for spotting the typo. We have fixed the error.

      Reviewer #2 (Recommendations For The Authors):

      Though I'm not familiar with the exact quality of GT labels in available neuroPAL data I know increasing volumes of published data is available. Comparison with a complete neuroPAL atlas, and a similar assessment on atlas size as made with the custom atlas would to my mind qualitatively increase the general insights on atlas construction.

      We thank the reviewer for the insightful suggestion. We have newly constructed several other NeuroPAL atlases by incorporating neuron positional data from two other published datasets: [Yemini E. et al. NeuroPAL: A Multicolor Atlas for Whole-Brain Neuronal Identification in C. elegans. Cell. 2021 Jan 7;184(1):272-288.e11] and [Skuhersky, M. et al. Toward a more accurate 3D atlas of C. elegans neurons. BMC Bioinformatics 23, 195 (2022)].

      Interestingly, we found that the two new atlases (NP-Yemini and NP-Skuhersky) have significantly different values of PA, LR, DV, and angle relationships for certain cells compared to the OpenWorm and glr-1 atlases. For example, in both the NP atlases, SMDD is labeled as being anterior to AIB, which is the opposite of the SMDD-AIB relationship in the glr-1 atlas.

      Because this relationship (and other similar cases) was missing in our original NeuroPAL atlas (NP-Chaudhary), the addition of these two NeuroPAL datasets to our NeuroPAL atlas dramatically changed the atlas. As a result, incorporating the published data sets into the NeuroPAL atlas (NP-all) actually decreased the average prediction accuracy to 44%, while the average accuracy of the original NeuroPAL atlas (NP-Chaudhary) was 57%. The atlas based on the Yemini et al. data alone (NP-Yemini) had 43% accuracy, and the atlas based on the Skuhersky et al. data alone (NP-Skuhersky) had 38% accuracy.

      For the rest of our analysis, we focused on comparing the NeuroPAL atlas that resulted in the highest accuracy against other atlases in figure 3 (NP-Chaudhary). Therefore, we have added Figure 3- figure supplement 2 and the following sentence in the discussion. “Several other NeuroPAL atlases from different data sources were considered, and the atlas that resulted in the highest neuron ID correspondence was selected (Figure 3- figure supplement 2).”

      Author response image 1.

      Figure 3- figure supplement 2. Comparison of neuron ID correspondences resulting from additional atlases- atlases driven from NeuroPAL neuron positional data from multiple sources (Chaudhary et al., Yemini et al., and Skuhersky et al.) in red compared to other atlases in Figure 3. Two sample t-tests were performed for statistical analysis. The asterisk symbol denotes a significance level of p<0.05, and n.s. denotes no significance. OW: atlas driven by data from OpenWorm project, NP-source: NeuroPAL atlas driven by data from the source. NP-Chaudhary atlas corresponds to NeuroPAL atlas in Figure 3.

      80% agreement among manual identifications seems low to me for a relatively small, (mostly) known set of cells, which seems to cast into doubt ground truth identities based on a best 2 out of 3 vote. The authors mention 3% of cell identities had total disagreement and were excluded, what were the fraction unanimous and 2/3? Are there any further insights about what limited human performance in the context of this particular identification task?

      We closely looked into the manual annotation data. The fraction of cells in unanimous, two thirds, and no agreement are approximately 74%, 20%, and 6%, respectively. We made the corresponding change in the manuscript from 3% to 6%. Indeed, we identified certain patterns in labels that were more likely to be disagreed upon. First, cells in close proximity to each other, such as AVE and RMD, were often switched from annotator to annotator. Second, cells in the posterior part of the cluster, such as RIM, AVD, AVB, were more variable in positions, so their identities were not clear at times. Third, annotators were more likely to disagree on cells whose expressions are rare and low, and these include AIB, AVJ, and M1. These observations agree with our results in figure 4c.
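
      The best-2-out-of-3 vote discussed above, and the three agreement categories reported (unanimous, two thirds, no agreement), can be sketched as follows. This is only an illustration under stated assumptions, not the authors' code: the function names `consensus_label` and `agreement_fractions` and the example annotations are hypothetical.

```python
from collections import Counter

def consensus_label(votes):
    """Return (label, category) for one cell given the annotators' votes.

    category is 'unanimous', 'majority', or 'none'; label is None when no
    single label is chosen by more than half of the annotators.
    """
    label, count = Counter(votes).most_common(1)[0]
    if count == len(votes):
        return label, "unanimous"
    if count * 2 > len(votes):
        return label, "majority"
    return None, "none"

def agreement_fractions(all_votes):
    """Fraction of cells falling in each agreement category."""
    counts = Counter(consensus_label(v)[1] for v in all_votes)
    total = len(all_votes)
    return {k: counts.get(k, 0) / total for k in ("unanimous", "majority", "none")}

# Hypothetical annotations: four cells, three annotators each.
example = [
    ("AVE", "AVE", "AVE"),
    ("AVE", "RMD", "AVE"),
    ("RIM", "AVD", "AVB"),
    ("AIB", "AIB", "AIB"),
]
print(agreement_fractions(example))
```

      Cells in the 'none' category have no ground-truth label under this scheme, which corresponds to the excluded fraction mentioned in the response.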

    2. eLife assessment

      This research advance article describes a valuable image analysis method to identify individual neurons within a population of fluorescently labeled cells in the nematode C. elegans. The findings are solid and the method succeeds in identifying cells with high precision. The method will be valuable to the C. elegans research community.

    3. Reviewer #1 (Public Review):

      In this paper, the authors developed an image analysis pipeline to automatically identify individual neurons within a population of fluorescently tagged neurons. This application is optimized to deal with multi-cell analysis and builds on a previous software version, developed by the same team, to resolve individual neurons from whole-brain imaging stacks. Using advanced statistical approaches and several heuristics tailored for C. elegans anatomy, the method successfully identifies individual neurons with a fairly high accuracy. Thus, while specific to C. elegans, this method can become instrumental for a variety of research directions such as in-vivo single-cell gene expression analysis and calcium-based neural activity studies.

    4. Reviewer #2 (Public Review):

      The authors succeed in generalizing the pre-alignment procedure for their cell identification method to allow it to work effectively on data with only small subsets of cells labeled. They convincingly show that their extension accurately identifies head angle, based on finding autofluorescent tissue and looking for a symmetric l/r axis. Their demonstrated method works to allow the identification of a particular subset of neurons. Their approach should be a useful one for researchers wishing to identify subsets of head neurons in C. elegans, and the ideas might be useful elsewhere.

      The authors also assess the relative usefulness of several atlases for making identity predictions. They attempt to give some additional general insights on what makes a good atlas, and clearly demonstrate the value of more data. Some insights seem less clear as available data do not allow for experiments that cleanly decouple: 1) the number of examples in the atlas; 2) the completeness of the atlas; and 3) the match in strain and imaging modality discussed. In the presented experiments the custom atlas, besides the strain and imaging modality congruence discussed is also the only complete atlas with more than one example. The main neuroPAL atlas is an imperfect stand-in since a significant fraction of cells could not be identified in these data sets, making it a 60/40 mix of Openworm and a hypothetical perfect neuroPAL comparison. The alternate neuroPal atlases shown in supplemental figure 4 are complete but provide only one point cloud.

      It is striking that in the best available apples-to-apples match, the single data set glr-1 atlas produces qualitatively better results than the single (complete) neuroPAL atlas. This is a clear performance advantage given the ground truth. This is as good an evaluation as is possible given current data; however, given the inexact nature of assigning ground truth identities, I think it is difficult from the results to tease out if this is due to strain, imaging conditions or systematically different identifications of cells from different sources.

      The experiments do usefully explore the volume of data needed. Though generalization to other arbitrary cell subsets remains to be shown the insight is useful for future atlas building that for the specific (small) set of cells labeled in the experiments 5-10 examples is sufficient to build an accurate atlas.

    1. Author Response

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Public Review):

      In this manuscript, Yao et al. explored the transcriptomic characteristics of neural stem cells (NSCs) in the human hippocampus and their changes under different conditions using single-nucleus RNA sequencing (snRNA-seq). They generated single-nucleus transcriptomic profiles of human hippocampal cells from neonatal, adult, and aging individuals, as well as from stroke patients. They focused on the cell groups related to neurogenesis, such as neural stem cells and their progeny. They revealed genes enriched in different NSC states and performed trajectory analysis to trace the transitions among NSC states and towards astroglial and neuronal lineages in silico. They also examined how NSCs are affected by aging and injury using their datasets and found differences in NSC numbers and gene expression patterns across age groups and injury conditions. One major issue of the manuscript is questionable cell type identification. For example, more than 50% of the cells in the astroglial lineage clusters are NSCs, which is extremely high and inconsistent with classic histology studies.

      While the authors have made efforts to address previous critiques, major concerns have not been adequately addressed, including a very limited sample size and poor patient information. In addition, some analytical approaches are still questionable, and the authors acknowledged that they cannot address some of them. Therefore, while the topic is interesting, some results are preliminary and some conclusions are not fully supported by the data presented.

      We thank the reviewer for reevaluating our revised manuscript. We respect the reviewer’s comments and discuss the technical and conceptual limitations of this work. Here we provide the response to Reviewer #1 (Public Review) on these below.

      Firstly, we appreciate the concerns raised by Reviewer 1 regarding the high proportion of NSCs within the astroglia lineage clusters. It is worth mentioning that distinguishing hippocampal qNSCs from astrocytes by transcription profiling poses a significant challenge in the field due to their high transcriptional similarity. From previous global UMAP analysis, AS1 (adult specific) can be separated from qNSCs, but AS2 (NSC-like astrocytes) cannot. Therefore, the data presented in Figure 2C to G aimed to further distinguish the qNSCs from AS2 by using gene set score analysis. Based on different scores, we categorized qNSC/AS lineages into qNSC1, qNSC2 and AS2. Figure 2C presented the UMAP plot of the qNSC/AS2 population from only the neonatal sample. We apologize for not clarifying this in the figure legend. We have now clarified this information in the figure legend of Figure 2C. More importantly, we have added UMAP plots and quantifications for other groups in Figure 2-Supplement 2A and B, including adult, aging, and injury samples. This supplementary figure provides more complete information on the cell type composition and dynamic variations during aging and injury. Although the ratio of NSCs in the astroglia lineage clusters remains higher compared to classic histology studies, the trends indicate a reduction in qNSCs and an increase in astrocytes during aging and injury, which supports that cell type identification by using gene set score analysis is effective, although still not optimal. Combined methods to accurately distinguish between qNSCs and astrocytes are required in the future, and we also discuss this in the corresponding texts.

      Secondly, we cannot adequately address the major concern regarding sample size raised by the reviewer due to the scarcity of stroke and neonatal human brain samples. We have collected additional details about the donors. Please refer to Figure 1-source data 1 for the updated information. Other information regarding the lifestyle parameters of these donors has not been sufficiently recorded by the hospital. Therefore, we cannot improve the patient information further.

      Thirdly, regarding the questionable subpopulations of granule cells (GCs) that derive from neuroblasts in Figure 4A-4D, which are inconsistent with previous single-cell transcriptomic studies, we tried various strategies to confirm the identity of the two subpopulations of granule cells (GCs) derived from neuroblasts but didn’t get a clear answer. As a result, we can only provide an objective description of the differences in gene expression and developmental trajectory and speculate that these differences may be related to their degree of maturity but are not aligned on the same trajectory.

      In the end, we have discussed the technical and conceptual limitations of this work and added a brief discussion about these limitations in the last paragraph of the main text. We hope the readers can interpret our data critically and objectively.

      Reviewer #2 (Public Review):

      In this manuscript, Yao et al. present a series of experiments aiming at generating a cellular atlas of the human hippocampus across aging, and how it may be affected by injury, in particular, stroke. Although the aim of the study is interesting and relevant for a larger audience, due to the ongoing controversy around the existence of adult hippocampal neurogenesis in humans, a number of technical weaknesses result in poor support for many of the conclusions made from the results of these experiments.

      In particular, a recent meta analysis of five previous studies applying similar techniques to human samples has identified different aspects of sample size as main determinants of the statistical power needed to make significant conclusions. Some of these aspects are the number of nuclei sequenced and subject stratification. These two aspects are of concern in Yao's study. First, the number of sequenced nuclei is lower than the calculated numbers of nuclei required for detecting rare cell types. However, Yao et al. report succeeding in detecting rare populations, including several types of neural stem cells in different proliferation states, which have been demonstrated to be extremely scarce by previous studies. It would be very interesting to read how the authors interpret these differences. Secondly, the number of donors included in some of the groups is extremely low (n=1) and the miscellaneous information provided about the donors is practically inexistent. As individual factors such as chronic conditions, medication, lifestyle parameters, etc... are considered determinant for the variability of adult hippocampal neurogenesis levels across individuals, this represents a serious limitation of the current study. Overall, several technical weaknesses severely limit the relevance of this study and the ability of the authors to achieve their experimental aims.

      After a first review round, the manuscript is still lacking a clear discussion of its several technical limitations, which will help the audience to grasp the relevance of the findings. In particular, detailed information about individual patients' health status and relevant lifestyle parameters that may have affected it is lacking. The authors make the point themselves that "the discrepancies among studies might be caused by health state differences across hippocampi, which subsequently lead to different degrees of hippocampal neurogenesis." So, even in the authors' own interpretation this is a serious limitation of the manuscript that, however out of the authors' control, impacts the quality of their findings.

      Reviewer #2 (Recommendations For The Authors):

      Please see public review. I do understand the authors' point about incomplete patient data collection and low patient numbers and how the former is out of their control. Nevertheless, these are crucial parameters that impact negatively on the quality and relevance of several of their bold claims in the manuscript, especially given the low number of patients included. The current version still lacks a clear and honest discussion of the several technical and conceptual limitations of the authors' work, as in some cases they are presented to the reviewers in the rebuttal letter, for the readership, so that they could critically evaluate the relevance of the authors' finding in a bigger perspective.

      We thank the reviewer for reevaluating our revised manuscript. We respect the reviewer’s comments and discuss the technical and conceptual limitations of this work. Here we provide the response to Reviewer #2 (Public Review) on these below.

      We understand the reviewer’s concern and have also noticed that according to the computational modeling conducted by Tosoni et al. (Neuron, 2023), at least 21 neuroblast cells (NBs) can be identified out of 30,000 granule cells (GCs) from a total of 180,000 dentate gyrus (DG) cells. In our dataset, we sequenced 24,671 GC nuclei and 92,966 total DG cell nuclei, which also includes neonatal samples. The number of nuclei we sequenced is 4.5 times higher than that of Wang et al. (Cell Research, 2022), who also detected NBs. Therefore, it is possible that we are able to detect NBs. Importantly, we have implemented strict quality control measures to support the reliability of our sequencing data. These measures include: 1. Immediate collection of tissue samples after postmortem (3-4 hrs) to ensure the quality of isolated nuclei. 2. Only nuclei expressing more than 200 genes but fewer than 5000-8600 genes (depending on the peak of enrichment genes) were considered. On average, around 3000 genes were detected per cell. 3. The average proportion of mitochondrial genes in each sample was approximately 1.8%, with no sample exceeding 5%. We have shown that the number of cells captured from individual samples and the average number of genes detected per cell are sufficient, indicating overall good sequencing quality (Figure 1-supplement 1A, B and F, and Figure 1-source data 1). Additionally, we have further confirmed the presence of these cell types with low abundance by integrating immunofluorescence staining (Figure 4E, 5D and 6B), cell type-specific gene expression (Figure 1C and D), overall transcriptomic characteristics (Figure 1-supplement 1E), and developmental potential (Figure 4A-D, Figure 6E and F). We hope this evidence together can explain why we can identify the rare neurogenic populations.
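
      The gene-count thresholds listed above can be expressed as a simple per-nucleus filter followed by a per-sample summary. The following is a minimal sketch under stated assumptions, not the authors' pipeline: the field names `n_genes` and `pct_mito`, the function names, and the example values are hypothetical, and the upper cutoff is left as a parameter because the response gives a per-sample range (5000-8600) rather than a single value.

```python
def passes_gene_filter(n_genes, min_genes=200, max_genes=5000):
    """Keep nuclei expressing more than min_genes but fewer than max_genes."""
    return min_genes < n_genes < max_genes

def summarize_sample(nuclei, max_genes=5000):
    """Filter one sample's nuclei and report the retained count and averages.

    Each nucleus is a dict with hypothetical keys 'n_genes' (genes detected)
    and 'pct_mito' (percentage of mitochondrial transcripts).
    """
    kept = [n for n in nuclei if passes_gene_filter(n["n_genes"], max_genes=max_genes)]
    if not kept:
        return {"n_kept": 0, "mean_genes": 0.0, "mean_pct_mito": 0.0}
    return {
        "n_kept": len(kept),
        "mean_genes": sum(n["n_genes"] for n in kept) / len(kept),
        "mean_pct_mito": sum(n["pct_mito"] for n in kept) / len(kept),
    }

# Hypothetical nuclei from one sample.
sample = [
    {"n_genes": 150, "pct_mito": 1.0},   # too few genes, dropped
    {"n_genes": 3000, "pct_mito": 2.0},
    {"n_genes": 9000, "pct_mito": 0.5},  # too many genes, dropped
    {"n_genes": 2000, "pct_mito": 1.0},
]
print(summarize_sample(sample))
```

      The per-sample mean of `pct_mito` is the quantity that would be compared against the 5% ceiling quoted in the response.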

      Regarding the limited sample size and poor patient information, we cannot adequately address these two major concerns. Due to the scarcity of stroke or neonatal human samples, it was not feasible to collect a larger sample size within the expected timeframe. We have collected additional details about the donors. Please refer to Figure 1-source data 1 for the updated information. Other information regarding the lifestyle parameters of these donors has not been sufficiently recorded by the hospital. Therefore, we cannot improve the patient information further.

      As per the reviewer’s recommendation, in the latest version we have added a brief discussion of the technical and conceptual limitations of this work to the last paragraph of the main text. We hope that readers will interpret our data critically and objectively.

    2. Reviewer #2 (Public Review):

      In this manuscript, Yao et al. present a series of experiments aiming at generating a cellular atlas of the human hippocampus across aging, and how it may be affected by injury, in particular, stroke. Although the aim of the study is interesting and relevant for a larger audience, due to the ongoing controversy around the existence of adult hippocampal neurogenesis in humans, a number of technical weaknesses result in poor support for many of the conclusions drawn from the results of these experiments. In particular, a recent meta-analysis of five previous studies applying similar techniques to human samples has identified different aspects of sample size as main determinants of the statistical power needed to draw significant conclusions. Some of these aspects are the number of nuclei sequenced and subject stratification. These two aspects are of concern in Yao's study. First, the number of sequenced nuclei is lower than the calculated number of nuclei required for detecting rare cell types. However, Yao et al. report succeeding in detecting rare populations, including several types of neural stem cells in different proliferation states, which have been demonstrated to be extremely scarce by previous studies. It would be very interesting to read how the authors interpret these differences. Secondly, the number of donors included in some of the groups is extremely low (n=1) and the miscellaneous information provided about the donors is practically nonexistent. As individual factors such as chronic conditions, medication, lifestyle parameters, etc. are considered determinants of the variability of adult hippocampal neurogenesis levels across individuals, this represents a serious limitation of the current study. Overall, several technical weaknesses severely limit the relevance of this study and the ability of the authors to achieve their experimental aims.

      After a first review round, the manuscript still lacks a clear discussion of its several technical limitations, which would help the audience to grasp the relevance of the findings. In particular, detailed information about individual patients' health status and relevant lifestyle parameters that may have affected it is lacking. The authors make the point themselves that "the discrepancies among studies might be caused by health state differences across hippocampi, which subsequently lead to different degrees of hippocampal neurogenesis." So, even in the authors' own interpretation, this is a serious limitation of the manuscript that, although out of the authors' control, affects the quality of their findings.

    3. eLife assessment

      Using state-of-the-art single-nucleus RNA sequencing, Yao et al. investigate the transcriptomic features of neural stem cells (NSCs) in the human hippocampus to address how they vary across different age groups and stroke conditions. The authors report alterations in NSC subtype proportions and gene expression profiles after stroke. Although the study is valuable and the analysis is comprehensive, the significance is restricted by well-acknowledged technical limitations leading to incomplete evidence supporting some main conclusions.

    4. Reviewer #1 (Public Review):

      In this manuscript, Yao et al. explored the transcriptomic characteristics of neural stem cells (NSCs) in the human hippocampus and their changes under different conditions using single-nucleus RNA sequencing (snRNA-seq). They generated single-nucleus transcriptomic profiles of human hippocampal cells from neonatal, adult, and aging individuals, as well as from stroke patients. They focused on the cell groups related to neurogenesis, such as neural stem cells and their progeny. They revealed genes enriched in different NSC states and performed trajectory analysis to trace the transitions among NSC states and towards astroglial and neuronal lineages in silico. They also examined how NSCs are affected by aging and injury using their datasets and found differences in NSC numbers and gene expression patterns across age groups and injury conditions. One major issue of the manuscript is questionable cell type identification. For example, more than 50% of the cells in the astroglial lineage clusters are NSCs, which is extremely high and inconsistent with classic histology studies.

      While the authors have made efforts to address previous critiques, major concerns have not been adequately addressed, including a very limited sample size and limited patient information. In addition, some analytical approaches are still questionable and the authors acknowledge some issues they cannot address. Therefore, while the topic is interesting, some results are preliminary and some conclusions are not fully supported by the data presented.

    1. Author Response

      The following is the authors’ response to the original reviews.

      We thank the reviewers and the editors for their careful reading of our manuscript and for the detailed and constructive feedback on our work. Please find attached the revised version of the manuscript. We performed an extensive revision of the manuscript to address the issues raised by the referees. We provide new analyses (regarding the response consistency and the neural complexity), added supplementary figures, and edited figures and text. Based on the reviewers’ comments, we introduced several major changes to the manuscript.

      Most notably, we

      • added a limitation statement to emphasize the speculative nature of our interpretation of the timing of word processing/associative binding

      • emphasized the limitations of the control condition

      • added analyses on the interaction between memory retrieval after 12h versus 36h

      • clarified our definition of episodic memory

      • added detailed analyses of the “Feeling of having heard” responses and the confidence ratings

      We hope that the revised manuscript addresses the reviewers' comments to their satisfaction. We believe that the revised manuscript has been significantly improved owing to the feedback provided. Below you can find a point-by-point response to each reviewer comment in blue. We look forward to the publication of the revised manuscript in eLife.

      Reviewer #1 (Public Review):

      The authors show that concurrently presenting foreign words and their translations during sleep leads to the ability to semantically categorize the foreign words above chance. Specifically, this procedure was successful when stimuli were delivered during slow oscillation troughs as opposed to peaks, which has been the focus of many recent investigations into the learning & memory functions of sleep. Finally, further analyses showed that larger and more prototypical slow oscillation troughs led to better categorization performance, which offers hints to others on how to improve or predict the efficacy of this intervention. The strength here is the novel behavioral finding and supporting physiological analyses, whereas the biggest weakness is the interpretation of the peak vs. trough effect.

      R1.1. Major importance:

      I believe the authors could attempt to address this question: What do the authors believe is the largest implication of this study? How far can this technique be pushed, and how can it practically augment real-world learning?

      We revised the discussion to put more emphasis on possible practical applications of this study (lines 645-656).

      In our opinion, the strength of this paper is its contribution to the basic understanding of information processing during deep sleep, rather than its insights on how to augment real-world learning. Given the currently limited data on learning during sleep, we believe it would be premature to make strong claims about potential practical applications of sleep-learning. In addition, as pointed out in the discussion section, we do not know what adverse effects sleep-learning has on other sleep-related mechanisms such as memory consolidation.

      R1.2. Lines 155-7: How do the authors argue that the words fit well within the half-waves when the sounds lasted 540 ms and didn't necessarily start right at the beginning of each half-wave? This is a major point that should be discussed, as part of the down-state sound continues into the up-state. Looking at Figure 3A, it is clear that stimulus presented in the slow oscillation trough ends at a time that is solidly into the upstate, and would not neurolinguists argue that a lot of sound processing occurs after the end of the sound? It's not a problem for their findings, which is about when is the best time to start such a stimulus, but it's a problem for the interpretation. Additionally, the authors could include some discussion on whether possibly presenting shorter sounds would help to resolve the ambiguities here.

      The word pairs’ presentations lasted on average ~540 ms. Importantly, the word pairs’ onset was timed to occur 100 ms before the maximal amplitude of the targeted peaks/troughs.

      Therefore, most of a word’s sound pattern appeared during the negative-going half-wave (about 350 ms of 540 ms). Importantly, Brodbeck and colleagues (2022) have shown that phonemes are continuously analyzed and interpreted with delays of about 50-200 ms, peaking at a delay of 100 ms. These results suggest that word processing started just after the negative maximum of a trough and finished during the next peak. Our interpretation (e.g. line 520+) suggests that low-level auditory processing reaches the auditory cortex before the positive-going half-wave. During the positive-going half-wave, the higher-level semantic networks appear to extract the presented word's meaning and associate the two simultaneously presented words. We clarified the time course regarding slow-wave phases and sound presentation in the manuscript (lines 158-164). Moreover, we added the limitation that we cannot know for sure when and in which slow-wave phase words were processed (lines 645-656). Future studies might want to look at shorter-lasting stimuli to narrow down the timing of the word processing steps in relation to the sleep slow waves.
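
      The timing argument above reduces to simple arithmetic, sketched here in Python. This is an illustration under a strong simplification: every phoneme is assumed to be processed with one fixed delay of 100 ms (the reported peak of the 50-200 ms range). The function name `window_relative_to_trough` is hypothetical.

```python
def window_relative_to_trough(onset_lead_ms=100, duration_ms=540, delay_ms=0):
    """Return (start, end) in ms relative to the trough maximum (t = 0).

    onset_lead_ms: how long before the trough maximum the sound starts.
    duration_ms:   average duration of a word pair.
    delay_ms:      assumed fixed processing delay (0 = the sound itself).
    """
    start = -onset_lead_ms + delay_ms
    return (start, start + duration_ms)


# Acoustic window of the stimulus itself.
acoustic = window_relative_to_trough()
# Window of phoneme processing, shifted by the assumed 100 ms delay.
processing = window_relative_to_trough(delay_ms=100)
```

      With these values the sound spans -100 to 440 ms and the assumed processing window spans 0 to 540 ms around the trough maximum, which matches the statement that processing begins at about the negative maximum and extends into the following positive half-wave.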

      R1.3. Medium importance:

      Throughout the paper, another concern relates to the term 'closed-loop'. It appears this term has been largely misused in the literature, and I believe the more appropriate term here is 'real-time' (Bergmann, 2018, Frontiers in Psychology; Antony et al., 2022, Journal of Sleep Research). For instance, if there were some sort of algorithm that assessed whether each individual word was successfully processed by the brain during sleep and then the delivery of words was subsequently changed, that could be more accurately labelled as 'closed-loop'.

      We acknowledge that the meaning of “closed-loop” in its narrowest sense is not fulfilled here. We believe that “slow oscillation phase-targeted, brain-state-dependent stimulation” is the most appropriate term to describe the applied procedure (BSDBS, Bergmann, 2018). We changed the wording in the manuscript to brain-state-dependent stimulation algorithm. Nevertheless, we would like to point out that the algorithm we developed and used (TOPOSO) is very similar to the algorithms often termed closed-loop algorithm in memory and sleep (e.g. Esfahani et al., 2023; Garcia-Molina et al., 2018; Ngo et al., 2013, for a comparison of TOPOSO to these techniques see Wunderlin et al., 2022 and for more information about TOPOSO see Ruch et al., 2022).

      R1.4. Figure 5 and corresponding analyses: Note that the two conditions end up with different sounds with likely different auditory complexities. That is, one word vs. two words simultaneously likely differ on some low-level acoustic characteristics, which could explain the physiological differences. Either the authors should address this via auditory analyses or it should be added as a limitation.

      This is correct: the two conditions differ in auditory complexity. Accordingly, we added this issue as another limitation of the study (line 651-653). We had opted for a single-word control condition to ensure that no associative learning (between pseudowords) could take place in the control condition, because this was the critical learning process in the experimental condition. We would like to point out that we observed significant differences in brain responses to the presentation of word pairs (experimental condition) vs single pseudowords (control condition) in the Trough condition, but not the Peak condition. If low-level acoustic characteristics indeed explained the EEG differences between the two conditions, then one would expect these differences to occur in both the Trough and the Peak condition, because earlier studies showed that low-level acoustic processing proceeds in both phases of slow waves (Andrillon et al., 2016; Batterink et al., 2016; Daltrozzo et al., 2012).

      R1.5. Line 562-7 (and elsewhere in the paper): "episodic" learning is referenced here and many times throughout the paper. But episodic learning is not what was enhanced here. Please be mindful of this wording, as it can be confusing otherwise.

      The reported unconscious learning of novel verbal associations during sleep may not match textbook definitions of episodic memory. However, the traditional definitions of episodic memory have long been criticised (e.g., Dew & Cabeza, 2011; Hannula et al., 2023; Henke, 2010; Reder et al., 2009; Shohamy & Turk-Browne, 2013).

      We stand by our claim that sleep-learning was of episodic nature. Here we use a computational definition of episodic memory (Cohen & Eichenbaum, 1993; Henke, 2010; O’Reilly et al., 2014; O’Reilly & Rudy, 2000) and not the traditional definition of episodic memory that ties episodic memory to wakefulness and conscious awareness (Gabrieli, 1998; Moscovitch, 2008; Schacter, 1998; Squire & Dede, 2015; Tulving, 2002). We revised the manuscript to clarify that and how our definition differs from traditional definitions. Please see reviewer comment R3.1 for a more extensive answer.

      Reviewer #2 (Public Review):

      In this project, Schmidig, Ruch and Henke examined whether word pairs that were presented during slow-wave sleep would leave a detectable memory trace 12 and 36 hours later. Such an effect was found, as participants showed a bias to categorize pseudowords according to a familiar word that they were paired with during slow-wave sleep. This behavior was not accompanied by any sign of conscious understanding of why the judgment was made, and so demonstrates that long-term memory can be formed even without conscious access to the presented content. Unconscious learning occurred when pairs were presented during troughs but not during peaks of slow-wave oscillations. Differences in brain responses to the two types of presentation schemes, and between word pairs that were later correctly- vs. incorrectly-judged, suggest a potential mechanism for how such deep-sleep learning can occur.

      The results are very interesting, and they are based on solid methods and analyses. Results largely support the authors' conclusions, but I felt that there were a few points in which conclusions were not entirely convincing:

      R2.1. As a control for the critical stimuli in this study, authors used a single pseudoword simultaneously played to both ears. This control condition (CC) differs from the experimental condition (EC) in a few dimensions, among them: amount of information provided, binaural coherence and word familiarity. These differences make it hard to conclude that the higher theta and spindle power observed for EC over CC trials indicate associative binding, as claimed in the paper. Alternative explanations can be made, for instance, that they reflect word recognition, as only EC contains familiar words.

      We agree. In the revised version of the manuscript, we emphasise this as a limitation of our study (line 653-656). Moreover, we understand that the differences between stimuli of the control and the experimental condition need not rely only on the associative binding of two words. We have tempered our interpretation of the findings accordingly.

      Interestingly, EC vs CC exhibits differences following trough- but not peak-targeting (see R1.4). If indeed all the EC vs CC differences were unrelated to associative binding, we would expect the same EC vs CC differences when peaks were targeted. Hence, the selective EC vs CC differences in the trough condition suggest that the brain is more responsive to sound, information, word familiarity and word semantics during troughs, where we found successful learning, compared to peaks, where no learning occurred. Trough-targeted word pairs (EC) versus foreign words (CC) enhanced the theta power at 500 ms following word onset, and this theta enhancement correlated significantly with interindividual retrieval performance, indicating that theta probably promoted associative learning during sleep. This correlation was not significant for spindle power.

      R2.2. The entire set of EC pairs were tested both following 12 hours and following 36 hours. Exposure to the pairs during test #1 can be expected to have an effect over memory one day later, during test #2, and so differences between the tests could be at least partially driven by the additional activation and rehearsal of the material during test #1. Therefore, it is hard to draw conclusions regarding automatic memory reorganization between 12 and 36 hours after unconscious learning. Specifically, a claim is made regarding a third wave of plasticity, but we cannot be certain that the improvement found in the 36 hour test would have happened without test #1.

      We understand that the retrieval test at 12h may have had an impact on performance on the retrieval test at 36h. Practicing retrieval of newly formed memories is known to facilitate future retrieval of the same memories (e.g. Karpicke & Roediger, 2008). Hence, practicing the retrieval of sleep-formed memories during the retrieval test at 12h may have boosted performance at 36h.

      However, recent literature suggests that retrieval practice is only beneficial when corrective feedback is provided (Belardi et al., 2021; Metcalfe, 2017). In our study, we only presented the sleep-played pseudowords at test and participants received no feedback regarding the accuracy of their responses. Thus, a proper conscious re-encoding could not take place. Nevertheless, the retrieval at 12h may have altered performance at 36h in other ways. For example, it could have tagged the reactivated sleep-formed memories for enhanced consolidation during the next night (Rabinovich Orlandi et al., 2020; Wilhelm et al., 2011).

      We included a paragraph on the potential carry-over effects from retrieval at 12h on retrieval at 36h in the discussion section (line 489-496; line 657-659). Furthermore, we removed the arguments about the “third wave of plasticity”.

      R2.3. Authors claim that perceptual and conceptual processing during sleep led to increased neural complexity in troughs. However, neural complexity was not found to differ between EC and CC, nor between remembered and forgotten pairs. It is therefore not clear to me why the increased complexity that was found in troughs should be attributed to perceptual and conceptual word processing, as CC contains meaningless vowels. Moreover, from the evidence presented in this work at least, I am not sure there is room to infer causation - that the increase in HFD is driven by the stimuli - as there is no control analysis looking at HFD during troughs that did not contain stimulation.

      With the analysis of the HFD we would like to provide an additional perspective to the oscillation-based analysis. We checked whether the boundary condition of Peak and Trough targeting changes the overall complexity or information content in the EEG. Our goal was to assess the change in neural complexity (relative to a pre-stimulus baseline) following the successful vs unsuccessful encoding of word pairs during sleep.

      We acknowledge that a causal interpretation about HFD is not warranted, and we revised the manuscript accordingly. It was unexpected that we could not find the same results in the contrast of EC vs CC or correct vs incorrect word pairs. We suggest that our signal-to-noise ratio might have been too weak.

      One could argue that the phase targeting alone (without stimulation) induces peak/trough differences in complexity. We cannot completely rule out this concern. However, we tried to use EEG segments that were not influenced by the ongoing slow wave: the EEG from 2000 to 500 ms before stimulus onset and from 500 to 2000 ms after stimulus onset. We thereby excluded the 1 s of the targeted slow wave, expecting that most of the phase-inherent complexity would have faded out (see Figure 2). We could not further extend the time window of analysis due to the minimal stimulus onset interval of 2 s. Of course, we cannot exclude that the targeted Trough impacted the following HFD. We clarified this in the manuscript (line 384-425).

      Furthermore, we did find a difference in neural complexity between the pre-stimulus baseline and the post-stimulus period in the Peak condition but not in the Trough condition (we now added this contrast to the manuscript, line 416-419). Hence, the change in neural complexity is a reaction to the interaction of the specific slow-wave phase with the processing of the word pairs. Even though these results cannot provide unambiguous, causal links, we think they can serve as an important starting point for other studies to decipher neural complexity during slow-wave sleep.

      Reviewer #3 (Public Review):

      The study aims at creating novel episodic memories during slow wave sleep that can be transferred to the awake state. To do so, participants were simultaneously presented during sleep with both foreign words and their arbitrary translations in their language (one word in each ear), or, as a control condition, with only the foreign word alone, binaurally. Stimuli were presented either at the trough or the peak of the slow oscillation using a closed-loop stimulation algorithm. To test for the creation of a flexible association during sleep, participants were then presented at wake with the foreign words alone and had (1) to decide whether they had the feeling of having heard that word before, (2) to attribute this word to one out of three possible conceptual categories (to which the translation words actually belong), and (3) to rate their confidence about their decision.

      R3.1. The paper is well written, the protocol ingenious and the methods are robust. However, the results do not really add conceptually to a prior publication of this group showing the possibility to associate in slow wave sleep pairs of words denoting large or small objects and non-words, and then asking participants during ensuing wakefulness to categorise these non-words to a "large" or "small" category. In both cases, the main finding is that this type of association can be formed during slow wave sleep if presented at the trough (versus the peak) of the slow oscillation. Crucially, whether these associations truly represent episodic memory formation during sleep, as claimed by the authors, is highly disputable as there is no control condition allowing to exclude the alternative, simpler hypothesis that mere perceptual associations between two elements (foreign word and translation) have been created and stored during sleep (which is already in itself an interesting finding). In this latter case, it would be only during the awake state when the foreign word is presented that its presentation would implicitly recall the associated translation, which in turn would "ignite" the associative/semantic association process eventually leading to the observed categorisation bias (i.e., foreign words tending to be put in the same conceptual category as their associated translation). In the absence of a disconfirmation of this alternative and more economical hypothesis, and if we follow Occam's razor, the claim that there is episodic memory formation during sleep is speculative and unsupported, which is a serious limitation irrespective of the merits of the study. The title and interpretations should be toned down in this respect.

      Our study conceptually adds to and extends the findings by Züst et al. (a) by highlighting the precise time-window or brain state during which sleep-learning is possible (e.g. slow-wave trough targeting), (b) by demonstrating the feasibility of associative learning during night sleep, and (c) by uncovering the longevity of sleep-formed memories.

      We acknowledge that the reported unconscious learning of novel verbal associations during sleep may not match textbook definitions of episodic memory. However, the traditional definitions of episodic memory have long been criticised (e.g., Dew & Cabeza, 2011; Hannula et al., 2023; Henke, 2010; Reder et al., 2009; Shohamy & Turk-Browne, 2013). We stand by our claim that sleep-learning was of episodic nature. We use a computational definition of episodic memory (Cohen & Eichenbaum, 1993; Henke, 2010; O’Reilly et al., 2014; O’Reilly & Rudy, 2000), and not the traditional definition of episodic memory that ties episodic memory to wakefulness and conscious awareness (Gabrieli, 1998; Moscovitch, 2008; Schacter, 1998; Squire & Dede, 2015; Tulving, 2002). The core computational features of episodic memory are 1) rapid learning, 2) association formation, and 3) a compositional and flexible representation of the associations in long-term memory.

      Therefore, we revised the manuscript to emphasize how our definition differs from traditional definitions (line 64).

      For the current study, we designed a retrieval task that calls on the core computational features of episodic memory by assessing flexible retrieval of sleep-formed compositional word-word associations. Reviewer 3 suggests an alternative interpretation for the learning observed here: mere perceptual associations between foreign words and translation words are stored during sleep, and semantic associations are only inferred at retrieval testing during ensuing wakefulness. First, these processing steps would require rapid sound-sound associative encoding, long-term storage, and flexible sound retrieval, which would still require hippocampal processing and computations in the episodic memory system. Second, this mechanism seems highly laborious and inefficient: the sound pattern of a word at 12 hours after learning would trigger the reactivation of an associated sound pattern of another word, and this sound pattern would then elicit the activation of the translation word's semantics, leading to the selection of the correct superordinate semantic category at test.

      Overall, we believe that our pairwise-associative learning paradigm triggered a rapid conceptual-associative encoding process mediated by the hippocampus that provided for flexible representations of foreign and translation words in episodic memory. This study adds to the existing literature by examining specific boundary conditions of sleep-learning and demonstrates the longevity (at least 36 hours) of sleep-learned associations.

      Other remarks:

      R3.2. Lines 43-45: the assumption that the sleeping brain decides whether external events can be disregarded, require awakening, or should be stored for further consideration in the waking state is dubious, and the supporting references date from a time (the 1960s) during which hypnopedia was investigated in badly controlled sleep conditions (leaving open the possibility that it occurred during micro-awakenings).

      We revised the manuscript to add more recent and better controlled studies that bolster this claim from the 1960s (line 40-51). Recently, it has been shown that the sleeping brain preferentially processes relevant information, for example the information conveyed by unfamiliar voices (Ameen et al., 2022), emotional content (Holeckova et al., 2006; Moyne et al., 2022), and our own compared to others’ names (Blume et al., 2018).

      R3.3. 1st paragraph, lines 48-53: the authors should be more specific about what kind of new associations and at which level they can be stored during sleep according to recent reports, as a wide variety of associations (mostly elementary levels) are shown in the cited references. Limitations in information processing during sleep should also be acknowledged.

      In the lines to which R3 refers, we cite an article (Ruch & Henke, 2020) in which two of the three authors of the current manuscript elaborate in detail what kind of associations can be stored during sleep. We revised these lines to more clearly present the current understanding of the potential and the limitations of sleep-learning (line 40-51). Although information processing during sleep is generally reduced (Andrillon et al., 2016), a variety of different kinds of associations can be stored, ranging from tone-odour to word-word association (Arzi et al., 2012, 2014; Koroma et al., 2022; Züst et al., 2019).

      R3.4. The authors ran their main behavioural analyses on delayed retrieval at 36h rather than 12h with the argument that retrieval performance was numerically larger at 36 than 12h but the difference was non-significant (line 181-183), and that effects were essentially similar. Looking at Figure 2, is the trough effect really significant at 12h ? In any case, the fact that it is (numerically) higher at 36 than 12h might suggest that the association created at the first 12h retrieval (considering the alternative hypothesis proposed above) has been reinforced by subsequent sleep.

      The Trough effect at 12h is not significant, as stated on line 185 (“Planned contrasts against chance level revealed that retrieval performance significantly exceeded chance at 36 hours only (P36hours = 0.036, P12hours = 0.094).”). It seems that our wording was not clear. Therefore, we refined the description of the behavioural analysis in the manuscript (lines 188-193).

      In brief, we report an omnibus ANOVA with a significant main effect of targeting type (Trough vs Peak, main effect Peak versus Trough: F(1,28) = 5.237, p = 0.030, d = 0.865). Because Trough-targeting led to significantly better memory retention than Peak-targeting, we computed a second ANOVA, solely including participants with trough-targeted word-pair encoding. The memory retention in the Trough condition is above chance (MTrough = 39.11%, SD = 10.76; FIntercept (1,14) = 5.660, p = 0.032) and does not significantly differ between the 12h and 36h retrieval (FEncoding-Test Delay (1,14) = 1.308, p = 0.272). However, the retrieval performance at 36h numerically exceeds the performance at 12h and the direct comparison against chance reveals that the 36h but not the 12h retrieval was significant (P36hours = 0.036, P12hours = 0.094). Hence, we found no evidence for above-chance performance at the 12h retrieval and focused on the retrieval after 36h in the EEG analysis.

      We agree with the reviewer that the subsequent sleep seems to have improved consolidation and subsequent retrieval. We assume that the reviewer suggests that participants merely formed perceptual associations during sleep and encoded episodic-like associations during testing at 12h (as pointed out in R3.1). However, we believe that it is unlikely that the awake encoding of semantic associations during the 12h retrieval led to improved performance after 36h. We changed the discussion regarding the interaction between retrieval at 12h and 36h (line 505-512, also see R2.2).

      R3.5. In the discussion section lines 419-427, the argument is somehow circular in claiming episodic memory mechanisms based on functional neuroanatomical elements that are not tested here, and the supporting studies conducted during sleep were in a different setting (e.g. TMR)

      Indeed, the TMR and animal studies are a different setting compared to the present study. We re-wrote this part and only focused on the findings of Züst and colleagues (2019), who examined hippocampal activity during the awake retrieval of sleep-formed memories (lines 472-482). Additionally, we would like to emphasise that our main reasoning is that the task requirements called upon the episodic memory system.

      R3.6. Supplementary Material: in the EEG data the differentiation between correct and incorrect ulterior classifications when presented at the peak of the slow oscillation is only significant in association with 36h delayed retrieval but not at 12h, how do the authors explain this lack of effect at 12 hour ?

      We assume that the reviewer refers to the TROUGH condition (word-pairs targeted at a slow-wave trough) and not as written to the peak condition. We argue that the retention performance at 12h is not significantly above chance (M12hours = 37.4%, P12hours = 0.094).

      Hence, the distinction between “correctly” and “incorrectly” categorised word pairs was not informative for the EEG analysis during sleep. Whatever the reason that the 12h retrieval was not significantly above chance, the less successful memory recall, and thus the less balanced trial count, makes recall accuracy at 12h a worse criterion for separating EEG trials than recall performance after 36 hours.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Minor importance:

      Abstract: The opening framing is confusing here and in the introduction. Why frame the paper in the broadest terms about awakenings and threats from the environment when this is a paper about intersections between learning & memory and sleep? I do understand that there is an interesting point to be made about the counterintuitive behavioral findings with respect to sleep generally being perceived as a time when stimuli are blocked out, but this does not seem to me to be the broadest points or the way to start the paper. The authors should consider this but of course push back if they disagree.

      We understand the reviewer’s criticism but believe that this has more to do with personal preferences than with the scientific value or validity of our work. We believe that it is our duty as researchers to present our study in a broader context because this may help readers from various fields to understand why the work is relevant. To some readers, evidence for learning during sleep may seem trivial, to others, it may seem impossible or a weird but useless conundrum. By pointing out potential evolutionary benefits of the ability to acquire new information during sleep, we help the broad readership of eLife understand the relevance of this work.

      Lines 31-32: "Neural complexity" -> "neural measures of complexity" because it isn't clear what "neural complexity" means at this point in the abstract. Though, note my other point that I believe this analysis should be removed.

      To our understanding, “neural complexity” is a frequently used term in the field and yields more than 4000 entries on Google Scholar, whereas “neural measures of complexity” yields only 3 hits [September 2023]. In order to link our study with other studies on neural complexity, we would like to keep this terminology. As an example, two recent publications using “neural complexity” are Lee et al. (2020) and Frohlich et al. (2022).

      Lines 42-43: The line of work on 'sentinel' modes would be good to cite here (e.g., Blume et al., 2017, Brain & Language).

      We added the suggested citation to the manuscript (line 52).

      Lines 84-90: While I appreciate the authors' desire to dig deep and try to piece this all together, this is far too speculative in my opinion. Please see my other points on the same topic.

      In this paragraph, we point out why both peaks and troughs are worth exploring for their contributions to sensory processing and learning during sleep. Peaks and troughs are contributing mutually to sleep-learning. Our speculations should inspire further work aimed at pinning down the benefits of peaks and troughs for sleep-learning. We clarified the purpose and speculative nature of our arguments in the revised version of the manuscript.

      Line 109: "outlasting" -> "lasting over" or "lasting >"

      We changed the wording accordingly.

      Line 111: I believe 'nonsense' is not the correct term here, and 'foreign' (again) would be preferred. Some may be offended to hear their foreign word regarded as 'nonsense'. However, please let me know if I have misunderstood.

      We would like to use the linguistic term “pseudoword” (aligned with reviewer 2’s comment) and we revised the manuscript accordingly.

      Figure 1A: "Enconding" -> "Encoding"

      Thank you for pointing this out.

      Lines 201-2: Were there interactions between confidence and correctness on the semantic categorization task? Were correct responses given with more confidence than incorrect ones? This would not necessarily be a problem for the authors' account, as there can of course be implicit influences on confidence (i.e., fluency).

      As stated in the results section, confidence ratings did not differ significantly between correct and incorrect assignments (Trough condition: F(1,14) = 2.36, p = 0.15; Peak condition: F(1,14) = 0.48, p = 0.50).

      Line 236: "Nicknazar" -> "Niknazar"

      Thank you for pointing this out.

      Line 266: "profited" -> "benefited"

      We changed the wording accordingly.

      Lines 280-4: There seems some relevance here with Malerba et al. (2018) and her other papers to categorize slow oscillations.

      Diving into the details on how to best categorise slow oscillations is beyond the scope of this manuscript. Here, we build on work from the field of microstate analyses and use two measures to describe and quantify the targeted brain states: the topography of the electric field (i.e., the correlation of the electric field with an established template or “microstate”), and the field strength (global field power, GFP). While the topography of a quasi-stable electric field reflects activity in a specific neural network, the strength (GFP) of a field most likely mirrors the degree of activation (or inactivity) in the specific network. Here, we find that consistent targeting of a specific network state yielding a strong frontal negativity benefitted learning during sleep. For a more detailed explanation of the slow-wave phase targeting see (Ruch et al., 2022).
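
      For readers unfamiliar with the two measures named above, the following is a minimal sketch of how they are commonly computed at a single time point. It assumes the usual definitions (global field power as the standard deviation of the potential across electrodes, and topographic similarity as the Pearson correlation across electrodes between a measured map and a template map). The function names and the five-electrode maps are hypothetical, and the sketch is not the authors' pipeline.

```python
import math


def global_field_power(topography):
    """GFP at one time point: standard deviation of the potential across electrodes."""
    n = len(topography)
    mean = sum(topography) / n
    return math.sqrt(sum((v - mean) ** 2 for v in topography) / n)


def topographic_correlation(topography, template):
    """Pearson correlation across electrodes between a measured map and a template map."""
    n = len(topography)
    mean_x = sum(topography) / n
    mean_y = sum(template) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(topography, template))
    var_x = sum((x - mean_x) ** 2 for x in topography)
    var_y = sum((y - mean_y) ** 2 for y in template)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical 5-electrode maps in microvolts, ordered from frontal to posterior.
template = [-2.0, -1.0, 0.0, 1.0, 2.0]   # frontal negativity
measured = [-4.0, -2.0, 0.0, 2.0, 4.0]   # same spatial shape, twice the strength

gfp_template = global_field_power(template)
gfp_measured = global_field_power(measured)
similarity = topographic_correlation(measured, template)
```

      In this toy case the measured map has the same shape as the template, so the correlation is maximal, while its GFP is twice as large. This illustrates why the two measures carry separate information: one describes which configuration is active, the other how strongly.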

      Lines 343-6: Was it intentional to have 0.5 s (0.2-0.7 s) surrounding the analysis around 500 ms but only 0.4 s (0.8-1.2 s) surrounding the analysis around 1 s? Could the authors use the same size interval or justify having them be different?

      We apologise for the misleading phrasing and we clarified this in the revised manuscript. We applied the same procedure for the comparison of later correctly vs incorrectly classified pseudowords as we did for the comparison between EC and CC. Hence, we analysed the entire window from 0s to 2.5s with a cluster-based permutation approach. Contrary to the EC vs CC contrast, no cluster remained significant for the comparison of the subsequent memory effect. By mistake we reported the wrong time window. In the revised manuscript, the paragraph is corrected (lines 364-369).
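
      To make the logic of a cluster-based permutation test concrete, here is a simplified sketch over a single time course. It assumes paired condition differences per participant, a fixed t-threshold of 2.0 for cluster formation, and random sign flips to build the null distribution of the maximum cluster mass. The toolbox, thresholds, and electrode-by-time clustering actually used in the manuscript are not stated in this response, so all names, parameters, and data below are hypothetical.

```python
import math
import random


def paired_t(diffs):
    """One-sample t statistic of paired differences against zero."""
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    if var == 0:
        return 0.0 if mean == 0 else math.copysign(math.inf, mean)
    return mean / math.sqrt(var / n)


def max_cluster_mass(diff_matrix, threshold):
    """Largest summed |t| over a run of adjacent supra-threshold time points of equal sign."""
    n_times = len(diff_matrix[0])
    t_vals = [paired_t([row[t] for row in diff_matrix]) for t in range(n_times)]
    best, current, prev_sign = 0.0, 0.0, 0
    for t in t_vals:
        sign = (t > threshold) - (t < -threshold)
        if sign != 0 and sign == prev_sign:
            current += abs(t)      # extend the running cluster
        elif sign != 0:
            current = abs(t)       # start a new cluster
        else:
            current = 0.0          # below threshold: no cluster
        prev_sign = sign
        best = max(best, current)
    return best


def cluster_permutation_p(diff_matrix, threshold=2.0, n_perm=1000, seed=0):
    """Monte Carlo p-value for the largest observed cluster, using participant-wise sign flips."""
    rng = random.Random(seed)
    observed = max_cluster_mass(diff_matrix, threshold)
    count = 0
    for _ in range(n_perm):
        flipped = []
        for row in diff_matrix:
            s = rng.choice((-1.0, 1.0))
            flipped.append([s * v for v in row])
        if max_cluster_mass(flipped, threshold) >= observed:
            count += 1
    return observed, (count + 1) / (n_perm + 1)


# Hypothetical data: 12 participants, 20 time points, a true effect at samples 5 to 9.
data_rng = random.Random(1)
data = [
    [(1.0 if 5 <= t < 10 else 0.0) + data_rng.gauss(0.0, 0.5) for t in range(20)]
    for _ in range(12)
]
observed_mass, p_value = cluster_permutation_p(data)
```

      Because the test statistic is the maximum cluster mass over the whole window, a single p-value controls for multiple comparisons across all time points, which is why the entire 0 s to 2.5 s window can be analysed without pre-selecting intervals.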

      Line 356-entire HFD section: it is unclear what's gained by this analysis, as it could simply be another reflection of the state of the brain at the time of word presentation. In my opinion, the authors should remove this analysis and section, as it does not add clarity to other aspects of the paper.

      (If the authors keep the section) Line 361-2 - "Moreover, high HFD values have been associated with cognitive processing (Lau et al., 2021; Parbat & Chakraborty, 2021)." This statement is vague. Could the authors elaborate?

      Please see our answer to Reviewer 2 (2.3) for a more detailed explanation. In brief, we would like to keep the analysis with the broad time window of -2 to -0.5 and from 0.5 to 2 s.

      Lines 403-4: How was it determined that these neural networks mediated both conscious/unconscious processes? Perhaps the authors meant to make a different point, but the way it reads to me is that there is evidence that some neural networks are conscious and others are not and both forms engage in similar functions.

      We revised the manuscript to be more precise and clear: “The conscious and unconscious rapid encoding and flexible retrieval of novel relational memories was found to recruit the same or similar networks including the hippocampus (Henke et al., 2003; Schneider et al., 2021). This suggests that conscious and unconscious relational memories are processed by the same memory system.” (p. 22, top).

      Lines 433-41: Performance didn't actually significantly increase from 12 to 36 hours, so this is all too speculative in my opinion.

      We removed the speculative claim that performance may have increased from the retrieval at 12 hours to the retrieval at 36 hours.

      Line 534: "assisted by enhanced" -> "coincident with". It's unclear whether theta reflects successful processing as having occurred or whether it directly affects or assists with it.

      We have adjusted the wording to be more cautious, as suggested (line 588).

      Line 572-4: Rothschild et al. (2016) is relevant here.

      Unfortunately, we do not see the relevance of this article within the context of our work.

      Line 577 paragraph: The authors may consider adding a note on the importance of ethical considerations surrounding this form of 'inception'.

      We extended this part by adding ethical considerations to the discussion section (Stickgold et al., 2021, line 657).

      Line 1366: It would be better if the authors could eventually make their data publicly available. This is obviously not required, but I encourage the authors to consider it if they have not considered it already.

      In my opinion, the discussion is too long. I really appreciate the authors trying to figure out the set of precise times in which each level of neural processing might occur and how this intersects with their slow oscillation phase results. However, I found a lot of this too speculative, especially given that the sounds may bleed into parts of other phases of the slow oscillation. I do not believe this is a problem unique to these authors, as many investigators attempting to target certain phases in the target memory reactivation literature have faced the same problem, but I do believe the authors get ahead of the data here. In particular, there seems to be one paragraph in the discussion that is multiple pages long (p. 22-24). This paragraph I believe has too much detail and should be broken up regardless, as it is difficult for the reader to follow.

      Considering the recent literature, we believe this interpretation best explains the data. As argued earlier, we believe that a speculative interpretation of the reported phenomena can provide substantial added value because it inspires future experimental work. We have improved the manuscript by clearly distinguishing between data and interpretation. We do declare the speculative nature of some offered interpretations. We hope that these speculations, which are testable hypotheses (!), will eventually be confirmed or refuted experimentally.

      Reviewer #2 (Recommendations For The Authors):

      I very much enjoyed the paper and think it describes important findings. I have a few suggestions for improvement, and minor comments that caught my eye during reading:

      (1) I was missing an analysis of CC ERP, and its comparison to EC ERP.

      We added this analysis to the manuscript (line 299-301). The comparison of CC ERP with EC ERP did not yield any significant cluster for either the peak (cluster-level Monte Carlo p=0.54) or the trough (cluster-level Monte Carlo p>0.37). We assume that the noise level was too high for the identification of differences between CC and EC ERP.

      (2) Regarding my public review comment #2, some light can be shed on between-test effects, I believe, using an item-based analysis - looking at correlations between items' classifications in test #1 and test #2. The assumption seems to be that items that were correct in test #1 remained correct in test #2 while other new correct classifications were added, owing to the additional consolidation happening between the two tests. But that is an empirical question that can be easily tested. If no consistency in item classification is found, on the other hand, or if only consistency in correct classification is found, that would be interesting in itself. This item-based analysis can help tease away real memory from random correct classification. For instance, the subset of items that are consistently classified correctly could be regarded as non-fluke at higher confidence and used as the focus of subsequent-memory analysis instead of the ones that were correct only in test #2.

      Thanks, we re-analysed the data accordingly. Participants were consistent in choosing a specific object category for an item at 12 hours and 36 hours (consistency rate = 47% same category; chance level is 1/3). Moreover, the consistency rate did not differ between the Trough and the Peak condition (MTrough = 47.2%, MPeak = 47.0%, P = 0.98). The better retrieval performance in the Trough compared to the Peak condition after 36 hours is due to the following: A) If participants were correct at 12h, they again chose the correct answer at 36h (Trough: 20%, Peak: 14%). B) Following an incorrect answer at 12h, participants switched to another object category at 36h (Trough: 72%, Peak: 67%). C) If participants switched the object category following an incorrect answer at 12h, they switched more often to the correct category at 36h in the Trough than in the Peak condition (Trough: 56%, Peak: 53%). Hence, the data support the reviewer’s assumption: items that were correct after 12 hours remained correct after 36 hours, while other new correct classifications were generated at 36h owing to the additional consolidation happening between the two tests. We added this finding to the manuscript (line 191-200, Figure S6):

      Author response image 1.
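
      The item-based measures described in this response can be sketched as follows. This is a minimal illustration on hypothetical responses for six items. The denominators are assumptions (all items for the consistency and correct-twice rates, items answered incorrectly at 12h for the switch rate, and switched items for the switch-to-correct rate), since the response does not spell them out.

```python
def item_consistency(resp_12h, resp_36h, correct):
    """Item-level comparison of category choices across two retrieval tests."""
    triples = list(zip(resp_12h, resp_36h, correct))
    n = len(triples)
    same = sum(a == b for a, b, _ in triples)
    correct_twice = sum(a == c and b == c for a, b, c in triples)
    wrong_first = [(a, b, c) for a, b, c in triples if a != c]
    switched = [(a, b, c) for a, b, c in wrong_first if b != a]
    switched_to_correct = sum(b == c for _, b, c in switched)
    return {
        "consistency_rate": same / n,
        "correct_twice": correct_twice / n,
        "switch_after_error": len(switched) / len(wrong_first) if wrong_first else float("nan"),
        "switch_to_correct": switched_to_correct / len(switched) if switched else float("nan"),
    }


# Hypothetical responses of one participant for six pseudowords.
correct = ["animal", "tool", "place", "animal", "tool", "place"]
resp_12h = ["animal", "place", "place", "tool", "animal", "tool"]
resp_36h = ["animal", "tool", "animal", "tool", "place", "place"]

result = item_consistency(resp_12h, resp_36h, correct)
```

      With three response categories, a participant who abandons a wrong answer and picks at random between the two remaining categories would land on the correct one half of the time, which is a useful reference point when reading the switch-to-correct proportion.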

      As suggested, we re-analysed the ERP with respect to the subsequent memory effect. This time we computed four conditions according to the reviewer’s argument about consistently correctly classified pseudowords, presented in the figure below: ERP of trials that were correctly classified at 36h (blue), ERP of trials that were incorrectly classified at 36h (light blue), ERP of trials that were correctly classified twice (brown) and ERP of trials that were not correctly classified twice (orange, all trials that are not in brown). Please note that the two blue lines are reported in the manuscript and include all trials. The brown and the orange lines take the consistency into account and together also include all trials.

      Author response image 2.

      Excluding even more trials from the group of correct retrieval responses increases the noise level. Therefore, the difference between the twice-correct and the not-twice-correct trials is not significant (cluster-level Monte Carlo p > 0.27). Because the ERP of twice-correct trials seems very similar to the ERP of the trials correctly classified at 36h at frontal electrodes, we assume that our ERP effect is not driven by a few extreme subjects. Similarly, not-twice-correct trials (orange) have a stronger frontal trough than the trials incorrectly classified at 36h (light blue).

      (3) In a similar vein, a subject-based analysis would be highly interesting. First and foremost, readers would benefit from seeing the lines that connect individual dots across the two tests in figures 2B and 2C. It is reasonable to expect that only a subset of participants were successful learners in this experiment. Finding them and analyzing their results separately could be revealing.

      We added a Figure S1 to the supplementary material, providing the pairing between performance of the 12h and the 36h retrieval.

      It is an interesting idea to look at successful learners alone. We computed the ERP of the subsequent memory effect for those participants who had an above-chance retrieval accuracy at 36h. The result shows a similar effect as reported for all participants (frontal cluster ~0-0.3s). The p-value is only 0.08 because only 9 of 15 participants exhibited an above-chance retrieval performance at 36 hours.

      Author response image 3.

      ERP effect of correct (blue) vs incorrect (light blue) pseudoword category assignment of participants with a retrieval performance above chance at 36h (SD as shades):

      We prefer to not include this data in the manuscript, but are happy to provide it here.

      (4) I wondered why the authors informed subjects of the task in advance (that they will be presented associations when they slept)? I imagine this may boost learning as compared to completely naïve subjects. Whether this is the reason or not, I think an explanation of why this was done is warranted, and a statement whether authors believe the manipulation would work otherwise. Also, the reader is left wondering why subjects were informed only about test #1 and not about test #2 (and when were they told about test #2).

      Subjects were informed of all the tests upfront. We apologize for the inconsistency in the manuscript and revised the method part. The explanation of why participants were informed is twofold: a) Participants had to sleep with in-ear headphones. We wanted to explain to participants why these are necessary and why they should not remove them. b) We hoped that participants would unconsciously expect sounds to be played during sleep, would process these sounds efficiently, and would remain deeply asleep (no arousals).

      (5) FoHH is a binary yes/no question, and so may not have been sensitive enough to demonstrate small differences in familiarity. For comparison, the Perceptual Awareness Scale (Ramsøy & Overgaard, 2004) that is typically used in studies of unconscious processing is of a 4-point scale, and this allows to capture more nuanced effects such as partial consciousness and larger response biases. Regardless, it would be informative to have the FoHH numbers obtained in this study, and not just their comparison between conditions. Also, was familiarity of EC and CC pseudowords compared? One may wonder whether hearing the pseudowords clearly vs. in one ear alongside a familiar word would make the word slightly more familiar.

      We apologize for having simplified this part too much in the manuscript. Indeed, the FoHH is comparable to the PAS. We used a 4-point scale, where participants rated their feeling of whether they have heard the pseudoword during previous sleep. In the revised manuscript, we report the complete results (line 203-223). The FoHH did not differ between any of the suggested contrasts. Thus, for both the peak and the trough condition, the FoHH did not differ between sleep-played vs new; correct EC trials vs new; correct vs incorrect EC trials; EC vs CC trials. To illustrate the results, a figure of the FoHH has been added to the supplement (Figure S4).

      (6) Similarly, it would be good to report the numbers of the confidence ratings in the paper as well.

      In the revised manuscript, we extended the description of the confidence rating results. We added the descriptive statistics (line 224-236) and included a corresponding figure in the supplement (Figure S5).

      Minor/aesthetic comments:

      We implemented all the following suggestions.

      (1) I suggest using "pseudoword" or "nonsense word" instead of "foreign word", because "foreign word" typically means a real word from a different language. It is quite confusing when starting to read the paper.

      After reconsidering, we think that pseudoword is the appropriate linguistic term and have revised the manuscript accordingly.

      (2) Lines 1000-1001: "The required sample size of N = 30 was determined based on a previous sleep-learning study". I was missing a description of what study you are referring to.

      (3) I am not sure I understood the claim nor the rationale made in lines 414-417. Is the claim that pairs did not form one integrated engram? How do we know that? And why would having one engram not enable extracting the meaning from a visual-auditory presentation of the cue? The sentence needs some rewording and/or unpacking.

      (4) Were categories counterbalanced (i.e., did each subjects' EC contain 9 animal words, 9 tool words and 9 place words)?

      (5) Asterisks indicating significant effects are missing from Figure 4 and S2.

      (6) Fig1 legend: "Participants were played with pairs" is ungrammatical.

      (7) Line 1093: no need for a comma.

      (8) Line 1336: missing opening parenthesis

      (9) Line 430: "observe" instead of "observed".

      (10) Line 466: two dots instead of one..

      Reviewer #3 (Recommendations For The Authors):

      Methods: 2 separate ANOVAs are performed (lines 160-185), but would not it make more sense to combine both in one ? If kept separated then a correction for multiple comparisons might be needed (p/2 = 0.025)

      We computed an omnibus ANOVA. In a next step, we examined the effect in the significant targeting condition by computing another ANOVA. For further explanations, see reviewer comment 3.4.

      References

      Ameen, M. S., Heib, D. P. J., Blume, C., & Schabus, M. (2022). The Brain Selectively Tunes to Unfamiliar Voices during Sleep. Journal of Neuroscience, 42(9), 1791–1803. https://doi.org/10.1523/JNEUROSCI.2524-20.2021

      Andrillon, T., Poulsen, A. T., Hansen, L. K., Léger, D., & Kouider, S. (2016). Neural Markers of Responsiveness to the Environment in Human Sleep. The Journal of Neuroscience, 36(24), Article 24. https://doi.org/10.1523/JNEUROSCI.0902-16.2016

      Arzi, A., Holtzman, Y., Samnon, P., Eshel, N., Harel, E., & Sobel, N. (2014). Olfactory Aversive Conditioning during Sleep Reduces Cigarette-Smoking Behavior. Journal of Neuroscience, 34(46), Article 46. https://doi.org/10.1523/JNEUROSCI.2291-14.2014

      Arzi, A., Shedlesky, L., Ben-Shaul, M., Nasser, K., Oksenberg, A., Hairston, I. S., & Sobel, N. (2012). Humans can learn new information during sleep. Nature Neuroscience, 15(10), Article 10. https://doi.org/10.1038/nn.3193

      Batterink, L. J., Creery, J. D., & Paller, K. A. (2016). Phase of Spontaneous Slow Oscillations during Sleep Influences Memory-Related Processing of Auditory Cues. Journal of Neuroscience, 36(4), 1401–1409. https://doi.org/10.1523/JNEUROSCI.3175-15.2016

      Belardi, A., Pedrett, S., Rothen, N., & Reber, T. P. (2021). Spacing, Feedback, and Testing Boost Vocabulary Learning in a Web Application. Frontiers in Psychology, 12. https://www.frontiersin.org/articles/10.3389/fpsyg.2021.757262

      Bergmann, T. O. (2018). Brain State-Dependent Brain Stimulation. Frontiers in Psychology, 9, 2108. https://doi.org/10.3389/fpsyg.2018.02108

      Blume, C., del Giudice, R., Wislowska, M., Heib, D. P. J., & Schabus, M. (2018). Standing sentinel during human sleep: Continued evaluation of environmental stimuli in the absence of consciousness. NeuroImage, 178, 638–648. https://doi.org/10.1016/j.neuroimage.2018.05.056

      Brodbeck, C., & Simon, J. Z. (2022). Cortical tracking of voice pitch in the presence of multiple speakers depends on selective attention. Frontiers in Neuroscience, 16. https://www.frontiersin.org/articles/10.3389/fnins.2022.828546

      Cohen, N. J., & Eichenbaum, H. (1993). Memory, Amnesia, and the Hippocampal System. A Bradford Book.

      Daltrozzo, J., Claude, L., Tillmann, B., Bastuji, H., & Perrin, F. (2012). Working memory is partially preserved during sleep. PloS One, 7(12), Article 12.

      Dew, I. T. Z., & Cabeza, R. (2011). The porous boundaries between explicit and implicit memory: Behavioral and neural evidence. Annals of the New York Academy of Sciences, 1224(1), 174–190. https://doi.org/10.1111/j.1749-6632.2010.05946.x

      Esfahani, M. J., Farboud, S., Ngo, H.-V. V., Schneider, J., Weber, F. D., Talamini, L. M., & Dresler, M. (2023). Closed-loop auditory stimulation of sleep slow oscillations: Basic principles and best practices. Neuroscience & Biobehavioral Reviews, 153, 105379. https://doi.org/10.1016/j.neubiorev.2023.105379

      Frohlich, J., Chiang, J. N., Mediano, P. A. M., Nespeca, M., Saravanapandian, V., Toker, D., Dell’Italia, J., Hipp, J. F., Jeste, S. S., Chu, C. J., Bird, L. M., & Monti, M. M. (2022). Neural complexity is a common denominator of human consciousness across diverse regimes of cortical dynamics. Communications Biology, 5(1), Article 1. https://doi.org/10.1038/s42003-022-04331-7

      Gabrieli, J. D. E. (1998). Cognitive neuroscience of human memory. Annual Review of Psychology, 87–115.

      Garcia-Molina, G., Tsoneva, T., Jasko, J., Steele, B., Aquino, A., Baher, K., Pastoor, S., Pfundtner, S., Ostrowski, L., Miller, B., Papas, N., Riedner, B., Tononi, G., & White, D. P. (2018). Closed-loop system to enhance slow-wave activity. Journal of Neural Engineering, 15(6), 066018. https://doi.org/10.1088/1741-2552/aae18f

      Hannula, D. E., Minor, G. N., & Slabbekoorn, D. (2023). Conscious awareness and memory systems in the brain. WIREs Cognitive Science, 14(5), e1648. https://doi.org/10.1002/wcs.1648

      Henke, K. (2010). A model for memory systems based on processing modes rather than consciousness. Nature Reviews Neuroscience, 11(7), Article 7. https://doi.org/10.1038/nrn2850

      Henke, K., Mondadori, C. R. A., Treyer, V., Nitsch, R. M., Buck, A., & Hock, C. (2003). Nonconscious formation and reactivation of semantic associations by way of the medial temporal lobe. Neuropsychologia, 41(8), Article 8. https://doi.org/10.1016/S0028-3932(03)00035-6

      Holeckova, I., Fischer, C., Giard, M.-H., Delpuech, C., & Morlet, D. (2006). Brain responses to a subject’s own name uttered by a familiar voice. Brain Research, 1082(1), 142–152. https://doi.org/10.1016/j.brainres.2006.01.089

      Karpicke, J. D., & Roediger, H. L. (2008). The Critical Importance of Retrieval for Learning. Science, 319(5865), 966–968. https://doi.org/10.1126/science.1152408

      Koroma, M., Elbaz, M., Léger, D., & Kouider, S. (2022). Learning New Vocabulary Implicitly During Sleep Transfers With Cross-Modal Generalization Into Wakefulness. Frontiers in Neuroscience, 16, 801666. https://doi.org/10.3389/fnins.2022.801666

      Lee, Y., Lee, J., Hwang, S. J., Yang, E., & Choi, S. (2020). Neural Complexity Measures. Advances in Neural Information Processing Systems, 33, 9713–9724. https://proceedings.neurips.cc/paper/2020/hash/6e17a5fd135fcaf4b49f2860c2474c7c-Abstract.html

      Metcalfe, J. (2017). Learning from Errors. Annual Review of Psychology, 68(1), 465–489. https://doi.org/10.1146/annurev-psych-010416-044022

      Moscovitch, M. (2008). The hippocampus as a “stupid,” domain-specific module: Implications for theories of recent and remote memory, and of imagination. Canadian Journal of Experimental Psychology/Revue Canadienne de Psychologie Expérimentale, 62, 62–79. https://doi.org/10.1037/1196-1961.62.1.62

      Moyne, M., Legendre, G., Arnal, L., Kumar, S., Sterpenich, V., Seeck, M., Grandjean, D., Schwartz, S., Vuilleumier, P., & Domínguez-Borràs, J. (2022). Brain reactivity to emotion persists in NREM sleep and is associated with individual dream recall. Cerebral Cortex Communications, 3(1), tgac003. https://doi.org/10.1093/texcom/tgac003

      Ngo, H.-V. V., Martinetz, T., Born, J., & Mölle, M. (2013). Auditory Closed-Loop Stimulation of the Sleep Slow Oscillation Enhances Memory. Neuron, 78(3), Article 3. https://doi.org/10.1016/j.neuron.2013.03.006

      O’Reilly, R. C., Bhattacharyya, R., Howard, M. D., & Ketz, N. (2014). Complementary Learning Systems. Cognitive Science, 38(6), 1229–1248. https://doi.org/10.1111/j.1551-6709.2011.01214.x

      O’Reilly, R. C., & Rudy, J. W. (2000). Computational principles of learning in the neocortex and hippocampus. Hippocampus, 10(4), 389–397. https://doi.org/10.1002/1098-1063(2000)10:4<389::AID-HIPO5>3.0.CO;2-P

      Rabinovich Orlandi, I., Fullio, C. L., Schroeder, M. N., Giurfa, M., Ballarini, F., & Moncada, D. (2020). Behavioral tagging underlies memory reconsolidation. Proceedings of the National Academy of Sciences, 117(30), 18029–18036. https://doi.org/10.1073/pnas.2009517117

      Reder, L. M., Park, H., & Kieffaber, P. D. (2009). Memory systems do not divide on consciousness: Reinterpreting memory in terms of activation and binding. Psychological Bulletin, 135(1), Article 1. https://doi.org/10.1037/a0013974

      Ruch, S., & Henke, K. (2020). Learning During Sleep: A Dream Comes True? Trends in Cognitive Sciences, 24(3), 170–172. https://doi.org/10.1016/j.tics.2019.12.007

      Ruch, S., Schmidig, F. J., Knüsel, L., & Henke, K. (2022). Closed-loop modulation of local slow oscillations in human NREM sleep. NeuroImage, 264, 119682. https://doi.org/10.1016/j.neuroimage.2022.119682

      Schacter, D. L. (1998). Memory and Awareness. Science, 280(5360), 59–60. https://doi.org/10.1126/science.280.5360.59

      Schneider, E., Züst, M. A., Wuethrich, S., Schmidig, F., Klöppel, S., Wiest, R., Ruch, S., & Henke, K. (2021). Larger capacity for unconscious versus conscious episodic memory. Current Biology, 31(16), 3551-3563.e9. https://doi.org/10.1016/j.cub.2021.06.012

      Shohamy, D., & Turk-Browne, N. B. (2013). Mechanisms for widespread hippocampal involvement in cognition. Journal of Experimental Psychology: General, 142(4), 1159–1170. https://doi.org/10.1037/a0034461

      Squire, L. R., & Dede, A. J. O. (2015). Conscious and Unconscious Memory Systems. Cold Spring Harbor Perspectives in Biology, 7(3), a021667. https://doi.org/10.1101/cshperspect.a021667

      Stickgold, R., Zadra, A., & Haar, A. J. H. (2021). Advertising in Dreams is Coming: Now What? Dream Engineering. https://dxe.pubpub.org/pub/dreamadvertising/release/1

      Tulving, E. (2002). Episodic Memory: From Mind to Brain. Annual Review of Psychology, 53(1), 1–25. https://doi.org/10.1146/annurev.psych.53.100901.135114

      Wilhelm, I., Diekelmann, S., Molzow, I., Ayoub, A., Mölle, M., & Born, J. (2011). Sleep Selectively Enhances Memory Expected to Be of Future Relevance. Journal of Neuroscience, 31(5), 1563–1569. https://doi.org/10.1523/JNEUROSCI.3575-10.2011

      Wunderlin, M., Koenig, T., Zeller, C., Nissen, C., & Züst, M. A. (2022). Automatized online prediction of slow-wave peaks during non-rapid eye movement sleep in young and old individuals: Why we should not always rely on amplitude thresholds. Journal of Sleep Research, 31(6), e13584. https://doi.org/10.1111/jsr.13584

      Züst, M. A., Ruch, S., Wiest, R., & Henke, K. (2019). Implicit Vocabulary Learning during Sleep Is Bound to Slow-Wave Peaks. Current Biology, 29(4), 541-553.e7. https://doi.org/10.1016/j.cub.2018.12.038

    2. eLife assessment

      This manuscript supports the intriguing idea that some aspects of novel learning can occur during sleep and outside of awareness. The authors provide solid evidence that presenting participants with novel words and their translations during sleep, especially during slow oscillation troughs, leads to the ability to categorize the semantic meaning of those words during awake testing 36 hours later. These findings represent a valuable contribution to the literature on unconscious processing and learning during sleep, although the claim that the results reflect episodic memory formation, in particular, deviates from the typical use of this term in the literature.

    3. Reviewer #1 (Public Review):

      The authors show that concurrently presenting foreign words and their translations during sleep leads to the ability to semantically categorize the foreign words above chance. Specifically, this procedure was successful when stimuli were delivered during slow oscillation troughs as opposed to peaks, which has been the focus of many recent investigations into the learning & memory functions of sleep. Finally, further analyses showed that larger and more prototypical slow oscillation troughs led to better categorization performance, which offers hints to others on how to improve or predict the efficacy of this intervention.

      Comments on the revised version:

      I applaud the authors on a nice rebuttal. Many responses use solid arguments based on the existing literature, such as their response regarding the possibility that low-level acoustic characteristics explain EEG differences between conditions. Their new analyses also clarify the paper. Additionally, I appreciate their labeling their more speculative claims as such. Below are my remaining thoughts:

      Major point:

      The largest remaining issue for me regards the term 'episodic'. Before I begin, I should say that I imagine the authors have thought considerably about this definition and may disagree with what I will say. That would be fine - it's their choice at this journal. My main point in writing this is to help them clarify their case further. R3 had a similar concern on the first round of review, and I imagine others holding the "traditional" view of episodic memory would be similarly skeptical. If the authors have a great rebuttal to these points, I imagine it will address others' concerns too.

      I believe I understand the authors' argument: I read the Henke (2010, Nature Reviews Neuroscience) piece years back with great interest and again now, and I've gone back to read their other papers cited in this manuscript. Again, I applaud the authors on producing a large collection of fascinating findings expanding knowledge of what can be accomplished via unconscious learning. That includes this paper! But I still disagree with the term 'episodic' for what is measured here. The authors state in the Methods section that they prompted participants to 'guess whether the presented pseudoword designates an animal, a tool, or a place'. IMHO, the main issue of using 'episodic' is the nature of the memory representation - 'guessing' does not ask participants anything about the source (the who-what-when-why-where) of the information (anything about an episode).

      Notably, it does seem to fit their own definition from Henke (2010). Rapid? I believe so - 4-trial learning is fairly quick. Certainly, there are studies of supposed episodic memory that use a few rounds of learning the same stimuli (rather than single-trial learning) and one can still get away with calling the nature of the memories 'episodic'. Flexible? I believe the authors mean that their task is flexible because participants learn a category exemplar during sleep (e.g., 'aryl'-'bird') but then only respond based on its category membership ('animal'?). If this is the case, I agree that the representations are flexible. Reliant on the 'episodic memory system' (lines 495-9)? Reasonably likely, given their prior findings (e.g., Züst et al., 2019). However, there is considerable data suggesting the hippocampus contributes to functions beyond episodic memory, including statistical learning (e.g., Schapiro et al., 2013, Current Biology), motor learning (e.g., Schendan et al., 2003, Neuron; Dohring et al., 2017, Cortex; Jacobacci et al., 2020, PNAS), attention (e.g., Aly & Turk-Browne, 2016, Cerebral Cortex), perception (e.g., Lee et al., 2012), and semantic memory (e.g., Cutler et al., 2019, Frontiers in Human Neuroscience). Therefore, given that the hippocampus contributes to other tasks too, saying the task is episodic in part because it likely relies on the hippocampus (the 'episodic memory system') is an incorrect reverse inference. But regardless of this concern, it seems true to me that the term fits 'episodic' according to Henke (2010).

      So, it seems I'm raising an issue with this entire way of defining memory. IMHO, the biggest issue is that there is no reason to assume the participant relies upon any source-related information in making their guess. There is room in the field for a new type of rapid, unconscious, flexible, hippocampal-dependent learning that does not need to align with the term, 'episodic', for it to be important and fascinating! The term, 'episodic', is convenient for a reason - namely, for labeling the behavioral output of what it measures, not the process that underlies it. The authors have continually made an excellent case for rapid, unconscious, flexible, hippocampal-dependent learning, and it would seem even more beneficial for the field for the authors to just call this its own thing.

      A related point:

      - I see that the authors do not use 'episodic' in prior papers with similar tasks (e.g., Züst et al., 2019), and I am curious if anything changed in their thinking or why they use the term now. They can ignore this if they'd like, but it would perhaps give useful context.

      Other points:

      IMHO, the issue of repeated tests is more legitimate than the authors suggest. They state in their response letter, "However, recent literature suggests that retrieval practice is only beneficial when corrective feedback is provided (Belardi et al., 2021; Metcalfe, 2017)." This is incorrect. While retrieval practice is often less effective without feedback, it can be effective without feedback if retrieval accuracy is high and if the experimenters later employ a long enough retention interval to witness long-term effects. This is clear in various papers (e.g., Roediger & Karpicke, 2006, Psychological Science; Karpicke & Roediger, 2008, Science) and there is a nice theoretical model explaining how these complex effects could arise (Halamish & Bjork, 2011, JEP:LMC; Kornell et al., 2011, JML). The authors do not heavily rely on this in their paper, but they could consider tempering their claims that it is 'unlikely' (line 509) that delayed retrieval was affected by the first retrieval.

      The authors claim that fast spindles are part of a speculative model underlying their learning effects (lines 605-6). However, they did not find any differential spindle effects in determining later performance, so they could consider keeping just points #1&2 or mentioning that spindles differ by condition but may not directly influence the learning effects here.

    4. Reviewer #3 (Public Review):

      This is a revision in response to the reviewers' comments. The authors provided new analyses and tried to acknowledge limitations, overall doing a good job, but the interpretation still seems to me to go beyond the available evidence, especially the claim that this is episodic memory formation during sleep. I still believe the paper would be fairer if it dropped this speculative part and omitted the word "episodic" from the title (as they actually did in the abstract). The authors' argument is that they refer to a computational definition of episodic memory, which is to some extent valid, but I am afraid it is not the way it will be understood by most readers, and it will thus indirectly contribute to an erroneous (or at least, not substantiated) interpretation of the brain's sleeping capabilities.

      My main concern is that I have not seen any proposal for a control condition that would exclude the alternative, simpler hypothesis that mere perceptual associations between two elements (foreign word and translation) have been created and stored during sleep (which, I repeat, is already in itself an interesting finding). The authors argue that this does not seem to them an efficient form of processing, but this is an opinion, not a demonstration.

    1. Author Response

      We thank both reviewers for the positive evaluation of our work and suggestions on how to improve it.

      We agree with Reviewer #1 that reporting uncertainties will both clarify and strengthen our arguments. Where applicable, uncertainties will be added in a revised version.

      Regarding Reviewer #2’s suggestion of including free energy calculations to estimate the free energies of hydrogen bond and hydrophobic interactions: current free energy methods are capable of giving accurate estimates of the relative binding free energies of similar ligands; however, accurate calculations of the absolute free energies of hydrogen bond and hydrophobic interactions are not yet feasible.

      Again, we thank the reviewers for their assessment and suggestions. We will update the manuscript as we have outlined above.

    2. eLife assessment

      This important work illuminates the dynamics of BRAF in both its monomeric and dimeric forms, with or without inhibitors, combining traditional techniques and sophisticated computational analyses. The evidence presented is convincing, though a more detailed description of the analyses could enhance reproducibility and the quality of the results. This study will interest structural biologists, medicinal chemists, and pharmacologists.

    3. Reviewer #1 (Public Review):

      This manuscript from Clayton and co-authors, entitled "Mechanism of dimer selectivity and binding cooperativity of BRAF inhibitors", aims to clarify the molecular mechanism of BRAF dimer selectivity. Indeed, first-generation BRAF inhibitors, targeting monomeric BRAFV600E, are ineffective in treating resistant dimeric BRAF isoforms. Here, the authors employed molecular dynamics simulations to study the conformational dynamics of monomeric and dimeric BRAF, in the presence and absence of inhibitors. Multi-microsecond MD simulations showed an inward shift of the αC helix in the BRAFV600E mutant dimer. This helped in identifying a hydrogen bond between the inhibitors and the BRAF residue Glu501 as critical for dimer compatibility. The stability of the aforementioned interaction seems to be important to distinguish between dimer-selective and equipotent inhibitors.

      The study is overall valuable and robust. The authors used the recently developed particle mesh Ewald constant pH molecular dynamics, a state-of-the-art method, to investigate the correct histidine protonation considering the dynamics of the protein. Then, multi-microsecond simulations showed differences in the flexibility of the αC helix and DFG motif. The dimerization restricts the αC position in the inward conformation, in agreement with the result that dimer-compatible inhibitors can stabilize the αC-in state. Noteworthy, the MD simulations were used to study the interactions between the inhibitors and the protein, suggesting a critical role for a hydrogen bond with Glu501. Finally, simulations of a mixed state of BRAF (one protomer bound to the inhibitor and the other apo) indicate that the ability to stabilize the inward αC state of the apo protomer could be at the basis of the positive cooperativity of PHI1.

      One potential weakness in the manuscript is the lack of reported uncertainties related to the analyzed quantities. Providing this information would significantly enhance the clarity regarding the reliability of the analyses and the confidence in the claims presented.

    4. Reviewer #2 (Public Review):

      The authors employ molecular dynamics simulations to understand the selectivity of FDA-approved inhibitors within dimeric and monomeric BRAF species. Through these comprehensive simulations, they shed light on the selectivity of BRAF inhibitors by delineating the main structural changes occurring during dimerization and inhibitor action. Notably, they identify the two pivotal elements in this process: the movement and conformational changes involving the alpha-C helix and the formation of a hydrogen bond involving the Glu-501 residue. These findings find support in the analyses of various structures crystallized from dimers and co-crystallized monomers in the presence of inhibitors. The elucidation of this mechanism holds significant potential for advancing our understanding of kinase signaling and the development of future BRAF inhibitor drugs.

      The authors employ a diverse array of computational techniques to characterize the binding sites and interactions between inhibitors and the active site of BRAF in both dimeric and monomeric forms. They combine traditional and advanced molecular dynamics simulation techniques such as CpHMD (all-atom continuous constant pH molecular dynamics) to provide mechanistic explanations. Additionally, the paper introduces methods for identifying and characterizing the formation of the hydrogen bond involving the Glu501 residue without the need for extensive molecular dynamics simulations. This approach facilitates the rapid identification of future BRAF inhibitor candidates.

      The use of molecular dynamics yields crucial structural insights and outlines a mechanism to elucidate dimer selectivity and cooperativity in these systems. However, the authors could consider the adoption of free energy methods to estimate the values of hydrogen bond energies and hydrophobic interactions, thereby enhancing the depth of their analysis.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Public review

      Reviewer 1

      Zhang et al. tackle the important topic of primate-specific structural features of the brain and the link with functional specialization. The authors explore and compare gyral peaks of the human and macaque cortex through non-invasive neuroimagery, using convincing techniques that have been previously validated elsewhere. They show that nearly 60% of the macaque peaks are shared with humans, and use a multi-modal parcellation scheme to describe the spatial distribution of shared and unique gyral peaks in both species.

      We thank the reviewer for his/her summary and affirmation of our work.

      The claim is made that shared peaks are mainly located in lower-order cortical areas whereas unique peaks are located in higher-order regions, however, no systematic comparison is made. The authors then show that shared peaks are more consistently found across individuals than unique peaks, and show a positive but small and non-significant correlation between cross-individual counts of the shared peaks of the human and the macaque i.e. the authors show a non-significant trend for shared peaks that are more consistently found across humans to be those that are also more found across macaques.

      Answer: We thank the reviewer for raising these questions about our work. To provide a more systematic comparison for the conclusion that ‘shared peaks are mainly located in lower-order cortical areas whereas unique peaks are located in higher-order regions’, we have conducted two additional experiments. Following the reviewers’ suggestions, we conducted a statistical analysis of the ratio of shared and unique peaks within different brain networks (as depicted in Figure 2 (b)), and also presented the specific distribution quantities of the two types of peaks in both low- and high-order brain networks (as detailed in the corresponding Table 1). Together with the original analysis, these three experiments yield a more systematic and comprehensive conclusion that ‘shared peaks are more distributed in lower-order networks, while unique peaks are more in higher-order networks’.

      In order to identify if unique and shared peaks could be identified based on the structural features of the cortical regions containing them, the authors compared them with t-tests. A correction for multiple comparisons should be applied and t-values reported. Graph-theoretical measures were applied to functional connectivity datasets (resting-state fMRI) and compared between unique and shared peak regions for each species separately. Again the absence of multiple comparison correction and t-values make the results hard to interpret. The same comment applies to the analysis reporting that shared peaks are surrounded by a larger number of brain regions than unique peaks. Finally, the potentially extremely interesting results about differential human gene expression of shared and unique peaks regions are not systematically reported e.g. the 28 genes identified are not listed and the selection procedure of 7 genes is not fully reported.

      Answer: We thank the reviewer for the suggestions about the statistical analysis in our manuscript. Firstly, we applied False Discovery Rate (FDR) correction to all experiments involving multiple comparisons throughout the entire manuscript, and the corrected t-values are reported (Tables 2-5 and A5-A6). Additionally, in response to the reviewers’ guidance regarding the gene analysis section, we provided a list of 28 genes (Table A7) selected by lasso, along with the t-values obtained from Welch’s t-test for the expression of the two types of peaks. The functions corresponding to the seven genes with final t-values below 0.05 are reported in Table 6.
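      For readers unfamiliar with FDR correction, the following is a minimal sketch of how adjusted p-values can be computed. It assumes the Benjamini-Hochberg procedure, which is the most common FDR method; the response does not state which procedure the authors used. The function name and the example p-values are hypothetical and are not taken from the manuscript.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Assumption: BH is used here only as a representative FDR procedure.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values from five comparisons (illustration only).
raw_p = [0.001, 0.008, 0.039, 0.041, 0.6]
adj_p = benjamini_hochberg(raw_p)
significant = [p < 0.05 for p in adj_p]
```

      In this illustrative input, the two comparisons with raw p-values just under 0.05 no longer pass the threshold after adjustment, which is the kind of change in interpretation the reviewer asked the authors to account for.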

      The paper is well written and the methods used for data processing are very compelling i.e. the peak cluster extraction pipeline and cross-species registration. However, the analysis and especially the reporting of statistics, as they stand now, constitutes the main weakness of the paper. Some aspects of the statistical analysis need to be clarified.

      Reviewer 2

      The authors compared the cortical folding of human brains with folding in macaque monkey brains to reveal shared and unique locations of gyral peaks. The shared gyral peaks were located in cortical regions that are functionally similar and less changed in humans from those in macaques, while the locations of unique peaks in humans are in regions that have changed or expanded functions. These findings are important in that they suggest where human brains have changed more than macaque brains in their subsequent evolution from a common ancestor. The massive analysis of comparative results provides evidence of where humans and macaques are similar or different in cortical markers, as well as noting some of the variations within each of the two primates.

      Answer: We thank the reviewer for his/her summary and appreciation of our cross-species work.

      Strengths:

      The study includes massive detail.

      Weaknesses:

      The manuscript is too long and there is not enough focus on the main points.

      Answer: We thank the reviewer for pointing out the shortcomings in our manuscript. Firstly, because the manuscript was too long, we have chosen to retain only the core experiments and relevant analyses in the main text. Relatively minor conclusions have been moved to the supplementary information; for example, the original Table 1 has been moved to the Supplementary Information as Table A1 (locations of all shared clusters). Additionally, some non-essential expressions in the original manuscript have been removed.

      Our experiments primarily revealed the existence of partially shared cortical landmarks, known as gyral peaks, in both humans and macaques. We found that these shared and unique peaks are mainly distributed across low- and high-order brain networks. To emphasize this main point, we added two experiments on top of the existing ones to provide a more systematic explanation of this conclusion. We conducted a statistical analysis of the ratio of shared and unique peaks within different brain networks (as depicted in Figure 2 (b)), and also presented the specific distribution quantities of the two types of peaks in both low- and high-order brain networks (as detailed in the corresponding Table 1). By combining the results of these two experiments with the original manuscript’s statistical findings on the proportions of the two types of peaks in different brain networks, the conclusion that ‘shared and unique peaks are predominantly located in low-order and high-order brain networks’ becomes more prominent.

      A brief listing of previous views on why fissures form and what factors are important would be helpful.

      Answer: In response to this suggestion from the reviewer, we have incorporated some previous views on why fissures form and what factors are important into the ‘Introduction’ section.

      ‘Cortical folds are important features of primate brains. The primary driver of cortical folding is the differential growth between cortical and subcortical layers. During the gyrification process in the cortex, areas with high-density stiff axonal fiber bundles towards gyri. The folding patterns in the brain, formed through a series of complex processes, are found to play a crucial role in various cognitive and behavioral processes, including perception, action, and cognition (Fornito et al. 2004; Cachia et al. 2018; Yang et al. 2019; Whittle et al. 2009).’

      Reviewer 1 (Recommendations For The Authors):

      (1) Figure 3b shows a non-significant trend for shared peaks that are more consistently found across humans to be those that are also more found across macaques. In the discussion, lines 218-219, the fact that the correlation is not significant should be reported more clearly.

      Answers: We thank the reviewer for this question. We revised lines 218-219 (now lines 257-259) as follows: ‘2. Consistency: The inter-individual consistency of shared peaks within each species was greater than that of unique peaks. The consistency of shared peaks in the human and macaque brains exhibits a positive correlation (non-significant though).’

      (2) It is not fully clear how much shared peaks are mostly distributed in the higher-order cortex, especially in the macaque. It is reported in the results lines 132-133 that ‘In the macaque brain, shared peak cluster centers most distributed in the V2, DMN, and CON (Figure.2 (d)), while unique peak cluster centers most distributed in the DMN, Language (Lan), and Dorsal-attention (DAN)’ but not further discussed. Please develop this point in the discussion. Further, the results presented in Figures 2 and A1 are actually quite different and this shall be better described in the results. Given that shared and unique peaks can be found in the same region, this analysis would gain importance by applying a comparison test for the selection of regions where the most shared or unique peaks are found. The sentence lines 306-308 should be accordingly revised.

      It is hard to understand what the 0-3% corresponds to in Figures 2 and A1?

      Please also correct in both legends and in the text the labeling of panels and add in the legends a brief description of panel (c). In the legend of Figure 2, ‘shared peaks’ in the second sentence shall be replaced by ‘unique peaks’.

      Answers: We thank the reviewer for these questions and suggestions. Our responses to them are itemized as follows:

      A1: In general, to clarify the distribution of shared and unique peaks in the higher-order and lower-order networks, we divided the 12 brain networks in the Cole-Anticevic atlas into lower-order networks (visual 1 (V1), visual 2 (V2), auditory (Aud), somatomotor (SMN), posterior multimodal (PMN), ventral multimodal (VMN), and orbito-affective networks (OAN)) and higher-order networks (cingulo-opercular (CON), dorsal attention (DAN), language (Lan), frontoparietal (FPN), and default mode network (DMN)) based on previous research (Golesorkhi et al. 2022; Ito, Hearne, and Cole 2020). Based on this lower-/higher-order division, we report the number of shared and unique peaks in both species in Author response table 1. We found that, in both humans and macaques, shared peaks are more distributed in lower-order networks, while unique peaks are more in higher-order networks. This observation is particularly pronounced in humans.

      Author response table 1.

      The number of shared and unique peaks in lower- and higher-order brain networks of the two species. Lower-order networks include visual 1 (V1), visual 2 (V2), auditory (Aud), somatomotor (SMN), posterior multimodal (PMN), ventral multimodal (VMN), and orbito-affective networks (OAN), higher-order networks include cingulo-opercular (CON), dorsal attention (DAN), language (Lan), frontoparietal (FPN), default-mode network (DMN).

      In the main text, Figure 2 (referring to Author response figure 1 later in the text.) illustrates the proportions of shared and unique peaks across 12 brain networks in both species. In each pie chart, we have specifically highlighted the top three ranked brain regions. Although the pie chart also generally supports the above results, two brain networks deserve further discussion. They are DMN and CON, two higher-order networks that have higher ranks in terms of shared peak count (the second-ranked and the third-ranked on macaque shared peaks; the fourth-ranked and the fifth-ranked on human shared peaks).

      The cingulo-opercular network (CON) is a brain network associated with action, goal, arousal, and pain. However, a study found three newly discovered areas of the primary motor cortex that exhibit strong functional connectivity with the CON region, forming a novel network known as the somato-cognitive action network (SCAN) (Gordon et al. 2023). The SCAN integrates body control (motor and autonomic) and action planning, consistent with the findings that aspects of higher-level executive control might derive from movement coordination (Llinás 2002; Gordon et al. 2023). CON may be shared in the form of the SCAN network across these two species. This could explain in part the results in Author response figure 1 that shared peaks are more on CONs.

      Author response image 1.

      Pie chart shows the count of shared and unique peaks across different brain networks for both human and macaque. Right panel shows the Cole-Anticevic (CA) networks (Ji et al. 2019) on human surface as a reference.

      The default-mode network (DMN) is an ensemble of brain regions that are active in passive tasks, including the anterior and posterior cingulate cortex, medial and lateral parietal cortex, and medial prefrontal cortex (Buckner, Andrews-Hanna, and Schacter 2008). Although the DMN is considered a higher-order brain network, numerous studies have provided evidence of its homologous presence in both humans and macaques. Many existing studies have confirmed the similarity between the DMN regions in humans and macaques from various perspectives, including cytoarchitectonic (Parvizi et al. 2006; Buckner, Andrews-Hanna, and Schacter 2008; Caminiti et al. 2010) and anatomical tracing (Vincent et al. 2007). These studies all support the notion that some elements of the DMN may be conserved across primate species (Mantini et al. 2011). In general, the partial sharing of DMN between humans and macaques may be attributed to the higher occurrence of shared peaks within the DMN.

      These results have been added to Table 2 along with corresponding text and discussion section.

      A2: The difference between the results of Figure 2 and Figure A1 (now Figure A2) is whether the peak count is normalized by cortical area, which hugely varies across networks. For example, among the 12 brain networks, the three networks with the largest surface areas are the DMN, SMN and CON, and the three networks with the smallest area are OAN, PMN and VMN. The area difference between networks can be as large as 18-fold. Therefore, it is not difficult to find that, although the DMN ranks high in both shared and unique peak counts during statistical analysis (Figure 2 (a)), it is relatively small in Figure A2 after area normalization. In contrast, VMN ranks lower in peak count statistics but exhibits a substantial proportion after area normalization (For example, 38% of macaque shared peaks are distributed in the VMN region, but there are actually only four peaks). However, the two pie charts deliver the same message that there are more shared peaks in lower-order networks, while unique peaks are more in higher-order networks (except for macaques, where shared peaks are also distributed significantly in DMN and CON).
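      The effect of area normalization described above can be illustrated with a short sketch. All numbers are hypothetical and chosen only to mimic an 18-fold area difference between a large network (labelled DMN) and a small one (labelled VMN); the function name is ours, and the authors' exact normalization may differ.

```python
def area_normalized_share(peak_counts, areas):
    """Share of peaks per network after dividing each count by network area.

    Assumption: normalization is a simple peaks-per-unit-area density,
    rescaled so that the shares across networks sum to 1.
    """
    density = {net: peak_counts[net] / areas[net] for net in peak_counts}
    total = sum(density.values())
    return {net: d / total for net, d in density.items()}


# Hypothetical peak counts and surface areas (arbitrary units).
counts = {"DMN": 18, "VMN": 4}
areas = {"DMN": 180.0, "VMN": 10.0}

raw_share = {net: c / sum(counts.values()) for net, c in counts.items()}
norm_share = area_normalized_share(counts, areas)
```

      With these inputs the large network dominates the raw counts, but the small network dominates after normalization, which is why the two pie charts can rank the same network very differently.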

      Following the suggestion from the reviewer, we adopted a new approach to present the ratio between shared peak count and unique peak count for each network (see Author response figure 2), such that the networks where the most shared or unique peaks are found can be easily highlighted. To mitigate potential imbalances in proportions caused by differences in the absolute numbers of each category (shared or unique) of peak, the proportions of peaks within their respective categories were used in the calculations. In Author response figure 2, the pink and green color bins represent ratios of shared and unique peaks, respectively. The dark blue dashed line represents the 50% reference line. In general, from left to right in the figure, the ratio of shared peaks decreases gradually while the ratio of unique peaks increases, suggesting that shared peaks are more (>0.5, above the dashed line) on lower-order networks (orange font), while unique peaks are generally more on higher-order networks (blue font). Specifically, in human brains, the networks with a higher abundance of shared peaks are Aud, VMN, V1, SMN, and V2; whereas in macaques, they are CON, VMN, V1, V2, FPN, and SMN. Again, in the human brains, the disparity between shared and unique peaks tends to be more significant (further away from the reference line), for both lower-order and higher-order networks, respectively. In contrast, in the macaque brains, the disparity between shared and unique peaks is less significant (closer to the reference line). The ratio of shared and unique peaks is around 0.5 for 6 out of all 10 networks (including both lower and higher-order ones).

      Author response image 2.

      The ratio of shared and unique peaks in each brain network in the Cole-Anticevic (CA) atlas. The pink and green color bins represent ratios of shared and unique peaks, respectively. The dark blue dashed line represents the 50% reference line. For each brain region, the sum of the ratios of shared and unique peaks is equal to 1.
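      One way to read the ratio described above is sketched below. It is our interpretation, not the authors' code: each count is first converted to a proportion within its own category (shared or unique), and the two proportions for a network are then rescaled to sum to 1. The function name and all counts are hypothetical.

```python
def shared_unique_ratio(shared_counts, unique_counts):
    """Per-network (shared_ratio, unique_ratio) pairs that each sum to 1.

    Assumption: ratios are built from within-category proportions so that
    a larger overall number of unique peaks does not bias the comparison.
    """
    total_shared = sum(shared_counts.values())
    total_unique = sum(unique_counts.values())
    ratios = {}
    for net in shared_counts:
        p_shared = shared_counts[net] / total_shared
        p_unique = unique_counts[net] / total_unique
        denom = p_shared + p_unique
        if denom == 0:
            continue  # network contains no peaks of either type
        ratios[net] = (p_shared / denom, p_unique / denom)
    return ratios


# Hypothetical counts: unique peaks outnumber shared peaks overall.
shared = {"V1": 6, "DMN": 4}
unique = {"V1": 10, "DMN": 30}
ratios = shared_unique_ratio(shared, unique)
```

      In this example V1 has more unique than shared peaks in absolute terms, yet its shared ratio exceeds 0.5 once each category is normalized by its own total, which is the imbalance the within-category proportions are meant to remove.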

      Based on these analyses, the sentence at lines 306-308 (now lines 368-370) has been revised as follows: ‘In the human brain, the more shared peaks (about 65%) are located in lower-order brain regions, while unique peaks are mainly (about 74%) located in higher-order regions. However, this trend is relatively less pronounced in the macaque brain.’

      These results have been added to Figure 2 (b) along with corresponding text and discussion section.

      A3: In response to the third suggestion from the reviewer, we have clearly labeled the brain region names corresponding to 0% to 3% in Figure 2 (now Figure 2 (a)) and Figure A1 (now Figure A2).

      Author response image 3.

      Pie chart shows the count of shared and unique peaks across different brain networks for both human and macaque. Right panel shows the Cole-Anticevic (CA) networks (Ji et al. 2019) on human surface as a reference.

      A4: Finally, we would like to express our gratitude to the reviewer for pointing out our mistakes.

      We have made improvements to Figure 2 and revised the figure captions accordingly.

      (3) The conclusions regarding the spatial relationship between peaks and functional regions shall be revised (Lines 187-188, 228-229, and 329-330). In the macaque, the results are opposite in the two atlases used. Further, in the human, it is not clear how multiple comparison corrections will impact statistics and some atlases show opposite results, although conclusions hold true in the majority of human atlases.

      Answers: We thank the reviewer very much for this suggestion. We have added the results for the Cole-Anticevic atlas for macaques to the main text; this atlas also shows shared>unique (Author response table 2, corresponding to Table 5 in the main text), that is, more diverse brain regions surround shared peaks than unique peaks. Therefore, of the three commonly used macaque atlases, two (Markov91 and Cole-Anticevic) conform to this observation, while BA05 does not. We used false discovery rate (FDR) correction for multiple comparisons, and the corrected p-values are reported in the tables (in the revised main text and shown below). Results on atlases with multiple resolutions are reported in Author response table 4 (Table A6 in the Supplementary Information). The observation that more diverse brain regions surround shared peaks than unique peaks holds for the human atlases in Author response table 3 (Table 4 in the main text), where the atlas resolutions range from 7 parcels to 300 parcels, demonstrating the robustness of the conclusion. Note that the observation is not consistent for atlases with relatively low resolutions (e.g., BA05 for macaque, n=30, and Yeo2011 for human, n=7) or, in particular, high resolutions (e.g., Schaefer-500 and Vosdewael-400, n>300). This inconsistency is plausible because, if the parcellation is too coarse or too fine, its resolution largely determines the chance of a cortical region appearing in a peak’s neighborhood. For example, in the two extreme cases of n=1 (the entire cortex is a single region) and n=30k (each vertex is a region), every peak has the same number of neighboring regions (one brain region per peak for n=1; around 30 vertices per peak for n=30k).

      In conclusion, we observed that there are more diverse brain regions around shared peaks than around unique peaks for multiple brain atlases with a median parcellation resolution. These results have been added to Tables 4, 5, and A6 along with corresponding text and discussion section.
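      The resolution argument above can be illustrated with a toy example. The sketch below is not the authors' pipeline: it assumes a surface mesh given as a vertex adjacency map and a per-vertex parcel label, and it uses a hypothetical 16-vertex ring in place of a cortical mesh. The helper name `regions_in_k_ring` is introduced here for illustration only.

```python
def regions_in_k_ring(peak, adjacency, labels, k=3):
    """Count distinct parcel labels among vertices within k edges of `peak`."""
    visited = {peak}
    frontier = {peak}
    for _ in range(k):
        # Expand one ring: all unvisited neighbours of the current frontier.
        frontier = {n for v in frontier for n in adjacency[v]} - visited
        visited |= frontier
    return len({labels[v] for v in visited})


# Hypothetical toy mesh: 16 vertices arranged in a ring.
n_vertices = 16
adjacency = {i: [(i - 1) % n_vertices, (i + 1) % n_vertices]
             for i in range(n_vertices)}

coarse = [0] * n_vertices                       # one region covers everything
medium = [i // 8 for i in range(n_vertices)]    # two regions of 8 vertices
finest = list(range(n_vertices))                # every vertex is its own region

counts = {
    name: (regions_in_k_ring(4, adjacency, labels),
           regions_in_k_ring(7, adjacency, labels))
    for name, labels in [("coarse", coarse), ("medium", medium), ("finest", finest)]
}
```

      In this toy setting the two peaks receive identical counts under the coarsest and the finest parcellation and differ only at the intermediate resolution, which is the behaviour the paragraph above describes for the extreme cases.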

      Author response table 2.

      The mean values (±SD) of brain regions that appeared within a 3-ring neighborhood for shared and unique peaks in 3 common macaque atlases. For both the Markov91 and Cole-Anticevic atlases, the shared peaks have a greater variety of functional regions around them than the unique peaks. For the BA05 atlas, however, the conclusion is reversed. Bold font marks the larger of the shared-peak and unique-peak values. All p<0.001 after false discovery rate (FDR) correction.

      (4) For Tables 2-4, A4, and Figure 3a, please indicate in all the legends if values correspond to Mean plus minus Standard Deviation, report t-value, and n in the legend or in the text.

      Answers: We thank the reviewer very much for this suggestion. We added the ‘mean (±SD)’ in the notes of Tables 2-4, A4 (now A6), and Figure 3 (a). All the t and n values of t-test are reported in tables or in the main text.

      (5) Please create a statistical section in the Methods to describe more precisely the tests used e.g. for t-tests, if datasets follow a normal distribution with unknown variance. In the case of multiple comparisons like in e.g. Table 2-4, A4, please report what multiple comparisons correction was used to adjust the significance level.

      Author response table 3.

      The mean values (±SD) of brain regions that appeared within a 3-ring neighborhood for shared and unique peaks in 10 common human atlases. All the shared peaks in the table have a greater number of neighboring brain regions compared to the unique peaks. All p<0.001, false discovery rate (FDR) corrected.

      Author response table 4.

      The mean values (±SD) of brain regions where shared and unique peaks appeared within a 3-ring neighborhood in 21 common human atlases. The p-values were corrected by FDR.

      Answers: We thank the reviewer for this suggestion. We added a ‘Statistic Analysis’ section to the ‘Materials and Methods’ part:

      ‘All variables used in the two-samples t-test follow a normal distribution check and all p-values were corrected for multiple comparisons using the false discovery rate (FDR) method. Moreover, in order to identify differently expressed genes between shared and unique peaks, we employed the Welch’s t-test, given the unequal sample sizes for shared and unique peaks. For all tests, a p-value <0.05 was considered significant (FDR corrected).’

      For the analyses involving multiple comparisons, such as Tables 2-4 and A4 (now A6), we have added explanations in the main text: p-values were corrected for multiple comparisons using the false discovery rate (FDR), and a corrected p-value<0.05 was considered significant.
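      As a reading aid, the two statistical ingredients named above can be sketched with the Python standard library. Two assumptions are made that the response does not state: the FDR procedure is taken to be Benjamini-Hochberg, and only the Welch t statistic and its Welch-Satterthwaite degrees of freedom are computed (turning them into a p-value needs a t-distribution routine from a statistics library). The function names and sample values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def fdr_bh(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical samples of unequal size and hypothetical raw p-values.
t_stat, dof = welch_t([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0, 10.0, 12.0])
adjusted = fdr_bh([0.01, 0.04, 0.03, 0.005])
```

      Welch's test does not pool the two variances, which is why it is a common choice when the group sizes and spreads differ.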

      (6) It would be of great interest to provide the full list of the 28 genes that significantly contributed to the classification of shared and unique peaks. Please provide a description of the Welch’s t-test results. From the 7 genes selected, only two are discussed. Could the authors please describe briefly the function of the other genes? Although we understand that they are not associated with neuronal activity and brain function.

      Answers: We thank the reviewer for these suggestions. We have provided a complete list of the 28 genes selected by LASSO in Author response table 5. Additionally, Welch’s t-test was used to calculate p-values for the expression differences of each gene between shared and unique peak clusters, and the results are also reported in Author response table 5.

      Author response table 5.

      The 28 genes selected by LASSO and their corresponding p-values from Welch’s t-test.

      Seven genes showed significant differential expression between shared and unique peaks in Welch’s t-test. These genes were PECAM1, TLR1, SNAP29, DHRS4, BHMT2, PLBD1, KCNH5. Brief descriptions of their functions are listed in Author response table 6. All gene function descriptions were derived from the NCBI website (https://www.ncbi.nlm.nih.gov/).

      These results have been added to Tables 6 and A7 along with corresponding text.

      (6) For comparison, could the authors provide a supplementary figure of shared peak clusters like in Figure 1b but displayed on the surface of the macaque brain template?

      Answers: We thank the reviewer very much for this suggestion. We have added a display of shared peak clusters on the macaque brain template surface (Author response figure 4, corresponding to Figure A1 of the Supplementary Information).

      (7) Could the author develop or rephrase the sentence lines 69-72 which remains unclear?

      Answers: We appreciate the reviewer’s feedback and have revised this sentence to ensure clarity. The sentences from line 69 to 72 have been revised to ‘In the study of macaques, it has been observed that the peak consistently present across individuals is located on more curved gyri (S. Zhang, Chavoshnejad, et al. 2022). Similar conclusions have been drawn in human brain research (S. Zhang, T. Zhang, et al. 2023).’ Now, this sentence corresponds to lines 74-77 in the main text.

      (8) Line 99: please indicate which section.

      Author response table 6.

      Seven genes were selected using LASSO that showed significant differential expression in shared and unique peaks.

      Answers: We thank the reviewer very much for this suggestion and we revised this sentence to ‘The definition of peaks and the method for extracting peak clusters within each species are described in the Materials and Methods section’.

      (9) In Figure 3b, please report R2 and p-value. A semi-log might be more appropriate given the overdispersion of Human Peak Counts.

      Answers: We thank the reviewer very much for this suggestion. Linear regression analysis was conducted on the average counts of all corresponding shared peak clusters of human and macaque. The horizontal and vertical axes of Author response figure 5 (b) represent the average count of shared peaks in the macaque and human brains, respectively. The Pearson correlation coefficients (PCC) of the interspecies consistency for the left and right brain are 0.20 and 0.26, respectively (p>0.05 for both). The linear regression shows a positive correlation in the inter-individual consistency of shared peaks between macaque and human brains, but it is not statistically significant (R2 for the left and right brain is 0.07 and 0.01, respectively).

      Author response image 4.

      Shared peak clusters of the macaque, shown on the macaque brain template.

      The goodness of fit (R2), Pearson correlation coefficient (PCC), and their respective p-values are indicated in Author response figure 5 (b). To avoid overdispersion, the peak count of the human brain is displayed in a semi-log format.
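      A minimal sketch of a semi-log fit of this kind follows, using only the standard library. It assumes the human counts are log10-transformed before an ordinary least-squares line is fitted against the macaque counts; whether the authors fitted the raw or the transformed counts is not stated, and the values below are hypothetical.

```python
from math import log10, sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def linear_fit(x, y):
    """Ordinary least squares y = slope * x + intercept, plus R^2."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return slope, intercept, 1 - ss_res / ss_tot


# Hypothetical average peak counts (not the study's data).
macaque = [2.0, 3.0, 4.0, 5.0, 6.0]
human = [10.0, 100.0, 50.0, 400.0, 200.0]
log_human = [log10(v) for v in human]  # semi-log: transform the y axis only

slope, intercept, r2 = linear_fit(macaque, log_human)
r = pearson_r(macaque, log_human)
```

      Taking the logarithm of the human counts compresses the few very large values, so they no longer dominate the fitted line.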

      The updated Figure and results are presented in Figure 3 of the main text.

      (10) Line 177: please indicate where in the Supplementary Information.

      Answers: We thank the reviewer for this reminder. We have incorporated the results of the human brain structural connectivity matrix into Table A5 in the Supplementary Information and provided corresponding indications in the main text.

      (11) Line 226: please correct ‘(except for betweeness [and efficiency] of the’.

      Answers: We thank the reviewer very much for this suggestion and we added ‘and efficiency’ in original Line 173 and 226 (now Line 206 and 267) after ‘betweeness’.

      (12) The gene expression dataset used is from the Allen Human Brain Atlas (AHBA). Reference to Hawrylycz et al., 2012 Nature. 2012 Sep 20;489(7416):391-399. doi: 10.1038/nature11405 shall be made and abbreviation defined at first use in the text.

      Answers: We added the full name ‘Allen Human Brain Atlas’ when AHBA is first mentioned, along with the reference suggested by the reviewer.

      Author response image 5.

      (a) Mean peak count (±SD) covered by shared and unique peak clusters in two species. ***indicates p<0.001. The t-values for the t-tests in humans and macaques are 4.74 and 2.67, respectively. (b) Linear regression results of the consistency of peak clusters shared between macaque and human brains. The pink and blue colors represent the left and right hemispheres, respectively. The results of the linear regression are depicted in the figure. While there was a positive correlation observed in the consistency of gyral peaks between macaque and human, the obtained p-value for the fitted results exceeded the significance threshold of 0.05.

      (13) Line 17: remove ‘are’.

      Answers: We thank the reviewer very much for this suggestion and we removed ‘are’ in Line 17 (now Line 18).

      (14) Line 201: remove ‘is used’.

      Answers: We thank the reviewer very much for this suggestion and we removed ‘is used’ in Line 201 (now Line 237).

      References

      Buckner, Randy L, Jessica R Andrews-Hanna, and Daniel L Schacter (2008). “The brain’s default network: anatomy, function, and relevance to disease”. In: Annals of the New York Academy of Sciences 1124.1, pp. 1–38.

      Cachia, Arnaud et al. (2018). “How interindividual differences in brain anatomy shape reading accuracy”. In: Brain Structure and Function 223, pp. 701–712.

      Caminiti, Roberto et al. (2010). “Understanding the parietal lobe syndrome from a neurophysiological and evolutionary perspective”. In: European Journal of Neuroscience 31.12, pp. 2320–2340.

      Fornito, Alexander et al. (2004). “Individual differences in anterior cingulate/paracingulate morphology are related to executive functions in healthy males”. In: Cerebral cortex 14.4, pp. 424–431.

      Golesorkhi, Mehrshad et al. (2022). “From temporal to spatial topography: hierarchy of neural dynamics in higher-and lower-order networks shapes their complexity”. In: Cerebral Cortex 32.24, pp. 5637–5653.

      Gordon, Evan M et al. (2023). “A somato-cognitive action network alternates with effector regions in motor cortex”. In: Nature, pp. 1–9.

      Ito, Takuya, Luke J Hearne, and Michael W Cole (2020). “A cortical hierarchy of localized and distributed processes revealed via dissociation of task activations, connectivity changes, and intrinsic timescales”. In: NeuroImage 221, p. 117141.

      Ji, Jie Lisa et al. (2019). “Mapping the human brain’s cortical-subcortical functional network organization”. In: Neuroimage 185, pp. 35–57.

      Llinás, Rodolfo R (2002). I of the vortex: From neurons to self. MIT press.

      Mantini, Dante et al. (2011). “Default mode of brain function in monkeys”. In: Journal of Neuroscience 31.36, pp. 12954–12962.

      Parvizi, Josef et al. (2006). “Neural connections of the posteromedial cortex in the macaque”. In: Proceedings of the National Academy of Sciences 103.5, pp. 1563–1568.

      Vincent, Justin L et al. (2007). “Intrinsic functional architecture in the anaesthetized monkey brain”. In: Nature 447.7140, pp. 83–86.

      Whittle, Sarah et al. (2009). “Variations in cortical folding patterns are related to individual differences in temperament”. In: Psychiatry Research: Neuroimaging 172.1, pp. 68–74.

      Yang, Shimin et al. (2019). “Temporal variability of cortical gyral-sulcal resting state functional activity correlates with fluid intelligence”. In: Frontiers in neural circuits 13, p. 36.

      Zhang, Songyao, Poorya Chavoshnejad, et al. (2022). “Gyral peaks: Novel gyral landmarks in developing macaque brains”. In: Human Brain Mapping 43.15, pp. 4540–4555.

      Zhang, Songyao, Tuo Zhang, et al. (2023). “Gyral peaks and patterns in human brains”. In: Cerebral Cortex.

    2. eLife assessment

      This important paper compares cross-species cortical folding patterns in human and non-human primates, showing that most gyral peaks shared across species are in lower-order cortical regions. The supporting evidence is solid and multi-faceted, encompassing anatomy, connectivity and gene expression. This paper will be of interest to a broad readership within the neuroscience community, especially for those interested in cross-species correspondences in brain organisation.

    3. Reviewer #1 (Public Review):

      Zhang et al. tackle the important topic of primate-specific structural features of the brain and the link with functional specialization. The authors explore and compare gyral peaks of the human and macaque cortex through non-invasive neuroimagery, using convincing techniques that have been previously validated elsewhere. They show that nearly 60% of the macaque peaks are shared with humans, and use a multi-modal parcellation scheme to describe the spatial distribution of shared and unique gyral peaks in both species.

      The claim is made that shared peaks are mainly located in lower-order cortical areas whereas unique peaks are located in higher-order regions, however, no systematic comparison is made. The authors then show that shared peaks are more consistently found across individuals than unique peaks, and show a positive but small and non-significant correlation between cross-individual counts of the shared peaks of the human and the macaque i.e. the authors show a non-significant trend for shared peaks that are more consistently found across humans to be those that are also more found across macaques.

      In order to identify if unique and shared peaks could be identified based on the structural features of the cortical regions containing them, the authors compared them with t-tests. A correction for multiple comparisons should be applied and t-values reported. Graph-theoretical measures were applied to functional connectivity datasets (resting-state fMRI) and compared between unique and shared peak regions for each species separately. Again the absence of multiple comparison correction and t-values make the results hard to interpret. The same comment applies to the analysis reporting that shared peaks are surrounded by a larger number of brain regions than unique peaks. Finally, the potentially extremely interesting results about differential human gene expression of shared and unique peaks regions are not systematically reported e.g. the 28 genes identified are not listed and the selection procedure of 7 genes is not fully reported.

      The paper is well written and the methods used for data processing are very compelling i.e. the peak cluster extraction pipeline and cross-species registration.

      Comments on revision:

      The authors have convincingly addressed all my previous concerns such that, as the revised paper stands now, the presented results provide solid support for the conclusions of the authors. The revised paper is now of interest for a large part of the neuroscience community and specifically for those interested in primate-specific structural features of the brain and the link with functional specialization.

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      The study by Ghafari et al. addresses a question that is highly relevant for the field of attention as it connects structural differences in subcortical regions with oscillatory modulations during attention allocation. Using a combination of magnetoencephalography (MEG) and magnetic resonance imaging (MRI) data in human subjects, inter-individual differences in the lateralization of alpha oscillations are explained by asymmetry of subcortical brain regions. The results are important, and the strength of the evidence is convincing. Yet, clarifying the rationale, reporting the data in full, a more comprehensive analysis, and a more detailed discussion of the implications will strengthen the manuscript further.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors re-analysed the data of a previous study in order to investigate the relation between asymmetries of subcortical brain structures and the hemispheric lateralization of alpha oscillations during visual spatial attention. The visual spatial attention task crossed the factors of target load and distractor salience, which made it possible to also test the specificity of the relation of subcortical asymmetries to lateralized alpha oscillations for specific attentional load conditions. Asymmetry of globus pallidus, caudate nucleus, and thalamus explained inter-individual differences in attentional alpha modulation in the left versus right hemisphere. Multivariate regression analysis revealed that the explanatory potential of these regions' asymmetries varies as a function of target load and distractor salience.

      Strengths:

      The analysis pipeline is straightforward and follows in large parts what the authors have previously used in Mazzetti et al (2019). The authors use an interesting study design, which allows for testing of effects specific to different dimensions of attentional load (target load/distractor salience). The results are largely convincing and in part replicate what has previously been shown. The article is well-written and easy to follow.

      We thank the reviewer for their interest in our study.

      Weaknesses:

      While the article is interesting to read for researchers studying alpha oscillations in spatial attention, I am somewhat sceptical about whether this article is of high interest to a broader readership. Although I read the article with interest, the conceptual advance made here can be considered mostly incremental. As the authors describe, the present study's main advance is that it does not include reward associations (as in previous work) and includes different levels of attentional load. While these design features and the obtained results indeed improve our general understanding of how asymmetries of subcortical structures relate to lateralized alpha oscillations, the conceptual advance is somewhat limited.

      We thank the reviewer for their constructive comment. We would like to highlight that this is the first study to show a relationship between the asymmetry of subcortical structures and attention-modulated alpha oscillations in a task that did not involve any reward associations, reward processing being the most studied role of the basal ganglia. We also believe there is value in having a second study linking the volumetric asymmetry of subcortical structures to the modulation of alpha oscillations, as this surprising finding also has important clinical implications (see below). We edited the manuscript as below to explain the advances made in this study:

      Introduction (Line 112): “Our current findings broaden our understanding of how subcortical structures are involved in modulating alpha oscillations during top-down spatial attention, in the absence of any reward or value associations.”

      Discussion (Line 301): “It has also been shown that the spatial extent of pathological change in subcortical structures can predict cognitive changes in Parkinson’s Disease (43). […] Changes in neocortical oscillatory activity have also been observed in neurological disorders which mainly are known to affect subcortical structures. For example, individuals with Alzheimer's Disease demonstrate an increase in slow oscillatory activities and a decrease in higher frequency oscillations (45). Moreover, in patients with Parkinson’s Disease, the power of beta oscillations increases relatively to when they are dopamine-depleted compared with when they are on dopaminergic medication (46).”

      While the analysis of the relation of individual subcortical structures to alpha lateralization in different attentional load conditions is interesting, I am not convinced that the present analysis is suited to draw strong conclusions about the subcortical regions' specificity. For example, the Thalamus (Fig. 5) shows a significant negative beta estimate only in one condition (low-load target, non-salient distractor) but not in the other conditions. However, the actual specificity of the relation of thalamus asymmetry to lateralized alpha oscillations would require that the beta estimate for this one condition is significantly higher than the beta estimates for the other three conditions, which has not been tested as far as I understand.

      We thank the reviewer for this constructive comment. We agree with the reviewer that we should compare the beta values among the conditions. We therefore decided to better harness the multivariate nature of our analysis. Multivariate regression analysis allows one to test the null hypothesis that a given predictor does not contribute to all the dependent variables. A rejection of this hypothesis would suggest that lateralization of a given region of interest significantly predicts variability across all 4 of the task conditions, whereas failure to reject the null would imply that the predictive relationship holds only for that single condition. We tested this global null hypothesis using a MANOVA test and found the following, which we have added to the manuscript:

      Results (Line 250): “To ascertain whether each predictor contributes to all conditions, we conducted statistical tests on the results of our MMR using the null hypothesis that a given regressor does not impact all dependent variables. We found that while, with marginal significancy, caudate nucleus can predict variability across all four of the task conditions (F(26,4) = 2.82, p-value = 0.046), the predictive relationships of thalamus (F(26,4) = 2.43, p-value = 0.073) with condition 1, and globus pallidus (F(26,4) = 2.29, p-value = 0.087) with conditions 2 and 3 hold only for these conditions. In sum, this demonstrates that when the task is easiest (condition 1), the thalamus is related to alpha modulation. When the task is most difficult (condition 4), the caudate nucleus relates to the alpha modulation, however, its contributions are substantial enough to predict outcomes across all conditions. For the conditions with medium difficulty (conditions 2 and 3) the globus pallidus is related to the alpha band modulation.”

      Method (Line 599): “To examine the specificity of each regressor for lateralized alpha in each condition, we statistically assessed the results of the MMR against the null hypothesis that a particular predictor does not contribute to all dependent variables, employing a MANOVA test in RStudio (version 2022.02.2) (80).”

      Discussion (Line 337): “Thalamus, Globus Pallidus, and Caudate nucleus play varying roles across different load conditions.”

      Discussion (Line 361): “Although these findings highlight the varying contributions of different regions, they do not imply a lack of evidence for correlations between these subcortical structures and other load conditions.”

      Discussion (Line 379): “Additionally, we refrained from directly comparing the contributions of subcortical structures to different conditions due to low statistical power. […] In future studies it would be interesting to design an experiment directly addressing which subcortical regions contribute to distractor and target load in terms of modulating the alpha band activity. In order to ensure sufficient statistical power for doing so possibly each factor needs to be addressed in different experiments.”
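      For orientation, the kind of multivariate multiple regression and per-predictor test described in this response can be written in generic textbook form. The notation below is an assumption made for illustration (n participants, p lateralized-volume predictors, four condition-wise alpha lateralization outcomes); it is not taken from the manuscript, and the specific MANOVA test statistic is left unspecified.

```latex
% Multivariate multiple regression: one column of Y per task condition.
\mathbf{Y} = \mathbf{X}\mathbf{B} + \mathbf{E}, \qquad
\mathbf{Y} \in \mathbb{R}^{n \times 4},\;
\mathbf{X} \in \mathbb{R}^{n \times (p+1)},\;
\mathbf{B} \in \mathbb{R}^{(p+1) \times 4}

% Joint null hypothesis for predictor j: its coefficient is zero in every condition.
H_0^{(j)}:\; \beta_{j1} = \beta_{j2} = \beta_{j3} = \beta_{j4} = 0
```

      Under this formulation, each predictor has one row of coefficients in B, and the multivariate test evaluates that whole row at once instead of one condition at a time.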

      Reviewer #3 (Public Review):

      Summary:

      In this study, Ghafari et al. explored the correlation between hemispheric asymmetry in the volume of various subcortical regions and lateralization of posterior alpha-band oscillations in a spatial attention task with varying cognitive demands. To this end, they combined structural MRI and task MEG to investigate the relationship between hemispheric differences in the volume of basal ganglia, thalamus, hippocampus, and amygdala and hemisphere-specific modulation of alpha-band power. The authors report that differences in the thalamus, caudate nucleus, and globus pallidus volume are linked to the attention-related changes in alpha band oscillations with differential correlations for different regions in different conditions of the design (depending on the salience of the distractor and/or the target).

      Strengths:

      The manuscript contributes to filling an important gap in current research on attention allocation which commonly focuses exclusively on cortical structures. Because it is not possible to reliably measure subcortical activity with non-invasive electrophysiological methods, they correlate volumetric measurements of the relevant subcortical regions with cortical measurements of alpha band power. Specifically, they build on their own previous finding showing a correlation between hemispheric asymmetry of basal ganglia volumes and alpha lateralization by assessing a task without an explicit reward component. Furthermore, the authors use differences in saliency and perceptual load to disentangle the individual contributions of the subcortical regions.

      We appreciate the reviewer’s interest in our study.

      Weaknesses:

      The theoretical bases of several aspects of the design and analyses remain unclear. Specifically, we missed statements in the introduction about why it is reasonable, from a theoretical perspective, to expect:

      (i) a link between volumetric measurements and task activity;

      We thank the reviewer for this constructive feedback. We have now addressed this concern in the revised manuscript.

      Discussion (Line 293): “It has been demonstrated that extensive navigation experience enlarges the size of right hippocampus (40). Furthermore, in terms of neurological disorders, it is well established that shrinkage (atrophy) in specific regions is a predictor of a number of neurological and psychiatric conditions including Parkinson’s disease, dementia, and Huntington’s disease. […] It has also been shown that the spatial extent of pathological change in subcortical structures can predict cognitive changes in Parkinson’s Disease (43). […] Changes in neocortical oscillatory activity have also been observed in neurological disorders which mainly are known to affect subcortical structures. For example, individuals with Alzheimer's Disease demonstrate an increase in slow oscillatory activities and a decrease in higher frequency oscillations (45). Moreover, in patients with Parkinson’s Disease, the power of beta oscillations increase relatively to when they are dopamine-depleted compared with when they are on dopaminergic medication (46).”

      (ii) a specific link with hemispheric asymmetry in subcortical structures (While focusing on hemispheric lateralization might circumvent the problem of differences in head size, it would be better to justify this focus theoretically, which requires for example a short review of evidence showing ipsilateral vs contralateral connections between the relevant subcortical and cortical structures);

      We thank the reviewer for this helpful comment, which helped us clarify the manuscript. We addressed this issue in the revised manuscript, which we have also now complemented with papers directly investigating the asymmetry of subcortical regions in relation to neurological disorders:

      Introduction (Line 102): “We utilized the hemispheric laterality of subcortical structures and alpha modulation to overcome issues related to individual variations in oscillatory power and head size.”

      Discussion (Line 314): “Employing hemispheric lateralization was motivated by the organizational characteristic of structural asymmetry in healthy brain (47). Additionally, considering the effects of aging (48) and neurodegenerative disorders, such as Alzheimer's Disease (49), on brain symmetry influenced this approach. Furthermore, computing lateralization indices for individuals addresses the challenge of accommodating variations in both head size and the power of oscillatory activity.”

      Discussion (Line 374): “Furthermore, in this study, our emphasis has been on assessing the size of subcortical structures. Future investigations could explore subcortical white matter connectivities and hemispheric asymmetries. This approach has previously been conducted on superior longitudinal fasciculus (SLF) (61,62) and holds potential for examining cortico-subcortical connectivity in the context of oscillatory asymmetries.”
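      The head-size argument for lateralization indices quoted above can be made concrete with a short sketch. The response does not give the formula, so the commonly used normalized difference (right minus left, divided by their sum) is assumed here, including its sign convention; the function name and the volumes are hypothetical.

```python
def lateralization_index(right, left):
    """Normalized right-left difference in [-1, 1]; 0 means symmetric."""
    return (right - left) / (right + left)


# Hypothetical volumes (arbitrary units) for one structure in one participant.
li_small_head = lateralization_index(5.5, 5.0)

# The same anatomy scaled up by 20%, as for a larger head.
li_large_head = lateralization_index(1.2 * 5.5, 1.2 * 5.0)
```

      Because a common scale factor cancels between numerator and denominator, two participants with the same relative asymmetry obtain the same index regardless of overall volume, and the same holds for overall oscillatory power.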

      (iii) effects not only in basal ganglia and thalamus, but also hippocampus and amygdala (a justification of selection of all ROIs);

      We thank the reviewer for this comment. We assessed the hippocampus and amygdala because they are automatically segmented by the FIRST algorithm. Because our analysis showed no relation between these regions and the modulation of alpha oscillations, they also provide a useful control for our approach. Therefore, we included all subcortical structures in the model and evaluated their predictive impact. This is now addressed in the revised manuscript.

      Method (Line 477): “FIRST is an automated model-based tool that runs a two-stage affine transformation to MNI152 space, to achieve a robust pre-alignment of thalamus, caudate nucleus, putamen, globus pallidus, hippocampus, amygdala, and nucleus accumbens based on individual’s T1-weighted MR images.”

      Method (Line 576): “The absence of a relationship between modulations of alpha oscillations and the hippocampus and amygdala was expected as these regions typically are not associated with the allocation of spatial attention and thus add validity to our approach. “

      (iv) effects that depend on distractor versus target salience (a rationale for the specific two-factor design is missing);

      We thank the reviewer for this comment, which helped us clarify the manuscript. The two-factor design investigates how the allocation of attentional resources relates specifically to mechanisms of excitability and suppression. For this reason, both the salience of the distractor (associated with suppression) and the perceptual load of the target (associated with excitability) had to be manipulated. We clarified the rationale in the revised version as below:

      Introduction (Line 96): “We analyzed MEG and structural data from a previous study (27), in which spatial cues guided participants to covertly attend to one stimulus (target) and ignore the other (distractor). To investigate the relationship between the allocation of attentional resources and mechanisms of neural excitability and suppression, the target load and the visual saliency of the distractor were manipulated using a noise mask. This load/salience manipulation resulted in four conditions that affect the attentional demands of target and distractor.”

      (v) effects in the absence of reward (why it is important to show that the effect seen previously in a task with reward is seen also in a task without reward);

      We thank the reviewer for this request for clarification. We addressed this question in the Introduction and Discussion as below:

      Introduction (Line 107): “By examining their role in a task without explicit reward, we aim to elucidate the generalizability of the contributions of subcortical structures to spatial attention modulation. Such a finding would implicate a role for the basal ganglia in cognition beyond the well-studied realm of the estimation of choice values (33). Specifically, in a prior study (28), we observed that the contributions of the basal ganglia were most pronounced when the items in question were associated with a reward. Our current findings broaden our understanding of how subcortical structures are involved in modulating alpha oscillations during top-down spatial attention, in the absence of any reward or value associations. “

      Discussion (Line 333): “This convergence of results not only corroborates the validity and consistency of our findings but also extends the empirical foundation supporting the predictive role of the asymmetry of globus pallidus in modulating alpha oscillations beyond reward valence and to the context of attention.”

      (vi) effects on rapid frequency tagging.

      We thank the reviewer for this constructive comment. We have now included this analysis and added the results to the revised manuscript.

      Results (Line 224): “It is worth noting that neither the behavioural nor the rapid invisible frequency tagging (RIFT) measures showed significant relationships with LVs and HLM(α) (Supplementary material, Figure 1 and Table 3).”

      Discussion (Line 396): “We did not find any association between the power of the RIFT signal and the size asymmetry of subcortical structures. Since the Bayes factors were less than 0.1, we conclude that our RIFT null findings are robust, suggesting a dissociation between how alpha oscillations and neuronal excitability indexed by RIFT relate to subcortical structures.”

      Method (Line 548): “We computed the modulation index (MI) for rapid invisible frequency tagging (RIFT) by averaging the power of the signal in sensors on the right when attention was directed to the right compared to when it was directed to the left. This calculation was also performed for sensors on the left. Consequently, we identified the top 5 sensors on each side with the highest MI as the Region of Interest (ROI). Utilizing the sensors within the ROI, we computed hemispheric lateralization modulation (HLM) of RIFT by summing the average MI(RIFT) of the right sensors and the average MI(RIFT) of the left sensors, obtaining one HLM(RIFT) value for each participant. For a more comprehensive analysis, refer to reference (24).”
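      The computation described in this Method excerpt can be sketched as follows. This is a minimal illustration, not the authors' code: the normalized-difference form of the modulation index and the function names `modulation_index`, `top_k_mean`, and `hlm` are assumptions made for the example. Only the selection of the top 5 sensors per side and the summing of the two hemispheric averages are taken from the text.

```python
from statistics import mean


def modulation_index(power_att_right, power_att_left):
    """Attend-right versus attend-left contrast for one sensor.

    The normalized difference used here is an assumed form; the excerpt
    only states that the two attention conditions are compared.
    """
    return (power_att_right - power_att_left) / (power_att_right + power_att_left)


def top_k_mean(mi_values, k=5):
    """Average MI over the k sensors with the highest MI (the ROI)."""
    return mean(sorted(mi_values, reverse=True)[:k])


def hlm(mi_right_sensors, mi_left_sensors, k=5):
    """Sum of the average ROI MI on the right and on the left, as in the excerpt."""
    return top_k_mean(mi_right_sensors, k) + top_k_mean(mi_left_sensors, k)
```

      Each participant would contribute one list of MI values per hemisphere, so `hlm` returns a single value per participant.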

      Supplementary Materials (Line 839): “Figure 1. Lateralization volume of thalamus, caudate nucleus and globus pallidus in relation to hemispheric lateralization modulation of rapid invisible frequency tagging (HLM(RIFT)) on the right and behavioural asymmetry on the left. A and E, The beta coefficients for the best model (having three regressors) associated with a generalized linear model (GLM) where lateralization volume (LV) values were defined as explanatory variables for HLM(RIFT) (A) and behavioural asymmetry (E). Error bars indicate standard errors of mean (SEM). B and F, Partial regression plot showing the association between LVTh and HLM(RIFT) (B, p-value = 0.59) and behavioural asymmetry (F, p-value = 0.38) while controlling for LVGP and LVCN. C and G, Partial regression plot showing the association between LVGP and HLM(RIFT) (C, p-value = 0.16) and behavioural asymmetry (G, p-value = 0.80) while controlling for LVTh and LVCN. D and H, Partial regression plot showing the association between LVCN and HLM(RIFT) (D, p-value = 0.53) and behavioural asymmetry (H, p-value = 0.74) while controlling for LVTh and LVGP. Negative (or positive) LV indices denote greater left (or right) volume for a given substructure; similarly negative HLM(RIFT) values indicate stronger modulation of RIFT power in the left compared with the right hemisphere, and vice versa; positive behavioural asymmetry value shows higher accuracy when the target was on the right as compared with left, and vice versa for negative behavioural asymmetry values. The dotted curves in B, C, D, F, G, and H indicate 95% confidence bounds for the regression line fitted on the plot in red.”

      Author response image 1.

      Second, the results are not fully reported. The model space and the results from the model comparison are omitted. Behavioral data and rapid frequency tagging results are not shown. Without having access to the data or the results of the analyses, the reader cannot evaluate whether the null effect corresponds to the absence of evidence or (as claimed in the discussion) evidence of absence.

      We thank the reviewer for this constructive suggestion. In the revised manuscript, we incorporated the model space, model comparisons, BIC values from the models, behavioral and rapid frequency tagging analysis methods, and their respective results. Additionally, we computed Bayes factors for our null findings to enhance the interpretability of our results.

      Results (Line 199): “This model predicted the HLM(α) values significantly in the GLM (F3,29 = 7.4824, p = 0.0007, adjusted R2 = 0.376) as compared with an intercept-only null model (Figure 4A).”

      Although the beta estimate of LVGP showed only a positive trend, removing it from the regression resulted in worse models (AIC and BIC tables in supplementary material).
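      The model comparison referred to here can be illustrated with a short sketch. It assumes the model space is every non-empty subset of the seven structures segmented by FIRST, which the response does not state explicitly, and the names `model_space`, `aic`, and `bic` are hypothetical helpers. The criteria are the standard definitions, computed from a model's maximized log-likelihood.

```python
from itertools import combinations
from math import log

# Structures segmented by FIRST, as listed in the Method excerpt above.
STRUCTURES = [
    "thalamus", "caudate nucleus", "putamen", "globus pallidus",
    "hippocampus", "amygdala", "nucleus accumbens",
]


def model_space(regressors):
    """Every non-empty subset of regressors, smallest models first."""
    return [
        subset
        for size in range(1, len(regressors) + 1)
        for subset in combinations(regressors, size)
    ]


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: lower is better."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: penalizes parameters by log(n_obs)."""
    return n_params * log(n_obs) - 2 * log_likelihood


models = model_space(STRUCTURES)
print(len(models))  # 127 candidate models for seven regressors
```

      Fitting each candidate and ranking by `aic` or `bic` would reproduce the kind of table described in the supplementary material. With N = 33, BIC penalizes each extra regressor more heavily than AIC, which is one reason the two rankings can differ.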

      Supplementary materials (Line 827): “Table 1. Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) values for all possible combinations of regressors (Lateralized Volume of subcortical structures). The selected model, with the lowest AIC, is marked in green.”

      Author response table 1.

      Author response table 2.

      Author response table 3.

      Bayes factors for the correlation of hemispheric laterality of subcortical structures with hemispheric lateralization modulation of rapid invisible frequency tagging (HLM(RIFT)) and with behavioural asymmetry (BA). The Pearson correlation between each subcortical structure and both HLM(RIFT) and behavioural asymmetry was calculated. The likelihood of the data under the alternative hypothesis (presence of a correlation) was then compared with the likelihood under the null hypothesis (absence of a correlation). As the table shows, all Bayes factors were below or very close to 1, indicating evidence for the null hypothesis.
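      For readers who want a feel for how such Bayes factors behave at this sample size, the sketch below uses the BIC approximation to the Bayes factor for a single Pearson correlation. This is a swapped-in technique for illustration only: the response does not say how the reported Bayes factors were computed, and `bf10_from_r` is a hypothetical helper.

```python
from math import exp, log


def bf10_from_r(r, n):
    """Approximate Bayes factor (H1 over H0) for a Pearson correlation.

    Uses the BIC approximation BF10 ~ exp(-(BIC_H1 - BIC_H0) / 2), where
    adding the correlation as one extra parameter changes the BIC by
    n * log(1 - r**2) + log(n).
    """
    delta_bic = n * log(1.0 - r * r) + log(n)
    return exp(-delta_bic / 2.0)


# With N = 33 (the sample size given in the response), a weak correlation
# yields a Bayes factor below 1, i.e. the data favour the null.
print(bf10_from_r(0.1, 33))
```

      Under this approximation, values below 1 favour the null and values far below 1 (for example, under 0.1) favour it strongly; values near 1 are usually read as inconclusive.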

      For the results of frequency tagging signal, we have now included this analysis and added the results to the revised manuscript. We refer the reviewer to our response to the weakness (vi) from reviewer #3.

      Third, it remains unclear whether the MMS is the best approach to analyzing effects as a function of target and distractor salience. To address the question of whether the effects of subcortical volumes on alpha lateralization vary with task demands (which we assume is the primary research question of interest, given the factorial design), we would like to evaluate some sort of omnibus interaction effect, e.g., by having target and distractor saliency interact with the subcortical volume factors to predict alpha lateralization. Without such analyses, the results are very hard to interpret. What are the implications of finding the differential effects of the different volumes for the different task conditions without directly assessing the effect of the task manipulation? Moreover, the report would benefit from a further breakdown of the effects into simple effects on unattended and attended alpha, to evaluate whether effects as a function of distractor (vs target) salience are indeed accompanied by effects on unattended (vs attended) alpha.

      The reviewer is correct that we did not directly compare between task conditions when we assessed the predictive relationship between basal ganglia lateralization and alpha lateralization. We opted for the multivariate regression approach as this allowed us to simultaneously model the predictive relationship between our continuous predictors and HLM alpha in each condition, allowing us to be most efficient with our level of statistical power (N=33). Indeed, directly comparing between task conditions within one model would result in an extra 16 regressors (1 (intercept) + 4-1 to model the difference between conditions + 3 to model the regressors + 3 x 3 to model each region x task condition interaction). This approach would be underpowered given our sample size, and the ensuing results are likely to be unreliable.
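      The regressor count in the paragraph above can be checked with a few lines of arithmetic. The variable names are illustrative; the numbers (33 participants, 4 conditions, 3 regions) come from the response.

```python
n_participants = 33
n_conditions = 4   # target load x distractor saliency
n_regions = 3      # thalamus, caudate nucleus, globus pallidus

intercept = 1
condition_terms = n_conditions - 1                  # condition differences
region_terms = n_regions                            # main effects of the regions
interaction_terms = n_regions * (n_conditions - 1)  # region x condition

n_regressors = intercept + condition_terms + region_terms + interaction_terms
participants_per_regressor = n_participants / n_regressors

print(n_regressors)                 # 16
print(participants_per_regressor)   # 2.0625
```

      Roughly two participants per regressor illustrates why the authors regard a single interaction model as underpowered.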

      However, we statistically analysed our regression results. Multivariate regression analysis allows one to test the null hypothesis that a given predictor does not contribute to all the dependent variables. A rejection of this hypothesis would suggest that lateralization of a given region of interest significantly predicts variability across all 4 of the task conditions, whereas failure to reject the null would imply that the predictive relationship holds only for that single condition. We tested this global null hypothesis using a MANOVA test and reported the findings in response to weakness two from reviewer #1.

      Discussion (Line 384): “In future studies it would be interesting to design an experiment directly addressing which subcortical regions contribute to distractor and target load in terms of modulating the alpha band activity. In order to ensure sufficient statistical power for doing so possibly each factor needs to be addressed in different experiments. “

      The fourth concern is that the discussion section is not quite ready to help the reader appreciate the implications of key aspects of the findings. What are the implications for our understanding of the roles of different subcortical structures in the various psychological component processes of spatial attention? Why does the volumetric asymmetry of different subcortical structures have diametrically opposite effects on alpha lateralization? Instead, the discussion section highlights that the different subcortical structures are connected in circuits: "Globus pallidus also has wide projections to the thalamus and can thereby impact the dorsal attentional networks by modulating prefrontal activities." If this is true, then why does the effect of the GP dissociate from that of the thalamus? Also, what is it about the current behavioural paradigm that makes the behavioral readout insensitive to variation in subcortical volume (or alpha lateralization?)?

      We thank the reviewer for this feedback. These are indeed all good points, and we hope that our findings will inspire further research to address these issues. In the revised manuscript we now write:

      Discussion (Line 349): “The opposite effect of the globus pallidus compared to the thalamus is striking, and possibly explained by the globus pallidus containing GABAergic interneurons. Thus the inhibitory nature of the globus pallidus projections to thalamus could explain why they are related to the alpha modulation in different manners (57).”

      Discussion (Line 379): “Moreover, the current study faced methodological constraints, limiting the analysis to the entire thalamus. […] . It would be of great interest to conduct further investigations to quantify the distinct impacts of individual thalamic nuclei on the association between subcortical structures and the modulation of oscillatory activity.“

      Discussion (Line 388): “Moreover, our failure to identify a relationship between the lateralized volume of subcortical structures and behavioural measures should be addressed in studies that are better designed to capture performance asymmetries (63). Individual preferences toward one hemifield, which were not addressed in the current study design, could potentially strengthen the power to detect correlations between structural variations in the subcortical structures and behavioural measures.”

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Minor comment:

      Between-subject correlation/regression analyses always rely on the assumption that the underlying dependent measures are reliable. While the reliability of asymmetries of subcortical structures can be assumed, the reliability of lateralized alpha oscillations during spatial attention can be questioned. It would be helpful if the authors could test the reliability of alpha lateralization, for instance by calculating HLM(a) in the first and second half of the experiment and correlating the resulting HLM(a) values (split-half reliability).

      We thank the reviewer for their insightful comment. We acknowledge that the between-subject regression relies on the reliability of alpha lateralization. Nonetheless, a previous study has demonstrated consistent results regarding HLM(α). We have further elaborated on these aspects in the discussion section:

      Discussion (Line 328): “Furthermore, our regression analysis outcomes align with the findings of Mazzetti et al. (28) underscoring the significant predictive influence exerted by the lateralized volume of globus pallidus on the modulation of hemispheric lateralization in alpha oscillations during spatial attention tasks. This convergence of results not only corroborates the validity and consistency of our findings but also extends the empirical foundation supporting the predictive role of the asymmetry of globus pallidus in modulating alpha oscillations within the context of attention.”
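      The split-half check suggested in the reviewer's minor comment could be sketched as below. This is not an analysis the authors report. It assumes one HLM(α) value per participant for each half of the experiment, and the helper names are hypothetical. The Spearman-Brown step is the standard correction for estimating full-length reliability from a half-length correlation.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half):
    """Project a half-test correlation to full-test reliability."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(hlm_first_half, hlm_second_half):
    """Reliability of HLM from per-participant values in each half."""
    return spearman_brown(pearson_r(hlm_first_half, hlm_second_half))
```

      A low value would mean that between-subject regressions on HLM(α) are limited by measurement noise rather than by the absence of a structural relationship.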

      Reviewer #3 (Recommendations For The Authors):

      We recommend that a revised version of the manuscript

      • Clarifies the theoretical basis for the 6 key design & analysis choices that we have outlined above;

      We thank the reviewer for their precision. We addressed the concerns outlined above in the previous section.

      • Also clarifies the task description (perhaps referring to target and distractor salience instead of target load versus distractor salience might help);

      Thank you for this constructive comment. We used the terms ‘load’ for the target and ‘salience’ for the distractor because the noise manipulation of the faces reduces the salience of the image, which makes distractors less distracting (easier) but targets more perceptually loaded (harder). These terms are explained in the revised manuscript.

      Method (Line 447): “Over trials, the perceptual load of targets was manipulated using a noise mask; noisy targets are harder to detect than clear targets and therefore incur greater perceptual load in their detection. The saliency of distractor stimuli was also manipulated using a noise mask; noisy distractor stimuli are less salient than clear distractors and therefore less disruptive to performance on the detection task. The noise mask was created by randomly swapping 50% of the stimulus pixels (Figure 1B). This manipulation resulted in four target-load/distractor-saliency conditions: (1) target: low load, distractor: low saliency (i.e., clear target, noisy distractor), (2) target: high load, distractor: low saliency (i.e., noisy target, noisy distractor), (3) target: low load, distractor: high saliency (i.e., clear target, clear distractor), (4) target: high load, distractor: high saliency (i.e., noisy target, clear distractor) (Figure 1B and C).”
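      One way to read the noise-mask step in this excerpt is sketched below. It assumes that "swapping 50% of the stimulus pixels" means choosing half of the pixel positions at random and permuting the values among them; the authors' actual procedure may differ. `swap_pixels` and its parameters are hypothetical.

```python
import random


def swap_pixels(pixels, fraction=0.5, seed=None):
    """Return a copy of a flat pixel list with a random subset permuted.

    A `fraction` of positions is chosen at random, and the values at those
    positions are shuffled among themselves. All other pixels are untouched,
    so the overall intensity histogram is preserved.
    """
    rng = random.Random(seed)
    n_swap = int(len(pixels) * fraction)
    sources = rng.sample(range(len(pixels)), n_swap)
    targets = sources[:]
    rng.shuffle(targets)

    noisy = list(pixels)
    for src, dst in zip(sources, targets):
        noisy[dst] = pixels[src]
    return noisy


image = list(range(100))          # stand-in for a flattened face image
noisy_image = swap_pixels(image, fraction=0.5, seed=0)
```

      Applying the mask to the target, the distractor, both, or neither gives the four load/saliency conditions listed in the excerpt.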

      • Fully reports all the data, including those of the model comparisons, the behavioural results, and the rapid frequency tagging results;

      We thank the reviewer for this constructive comment. We refer the reviewer to our response to the second comment and to comment (vi) from reviewer #3.

      • Reports interaction effects to directly test the modulating role of task demands in the link between volume and alpha, and break down the alpha lateralization indices into their simple effects on the ipsilateral and contralateral hemispheres;

      The role of task demands has been addressed in our response to weakness two from reviewer #1.

      Regarding the second part of the comment, in our study, to compare the lateralized modulation of alpha oscillations between the right and left hemispheres, we computed hemispheric lateralization modulation. This involved dividing trials into attention right and attention left. Subsequently, we calculated the lateralization index separately for sensors on the right and left. Specifically, this entailed computing ipsilateral – contralateral for sensors on the right and contralateral – ipsilateral for sensors on the left side of the brain. We addressed this concern in methods section as below:

      Method (Line 537): “As MI(α) consistently represents power of alpha in attention right versus attention left conditions, it entails the comparison between ipsilateral and contralateral alpha modulation power for sensors located on the right side of the head. The same comparison applies inversely for sensors situated on the left side of the brain.”

      • Clarifies in the discussion section the specific implications of the results for our understanding of the link between distinct subcortical structures and distinct component processes of spatial attention.

      We thank the reviewer for their constructive comment. This point is addressed in response to the fourth concern of reviewer #3.

      More detailed specific recommendations are provided below:

      • Line 40ff: In this paragraph, the theoretical framework concerning the function of the subcortical regions of interest is described. Here, the authors jump back and forth between the role of the basal ganglia and the role of the thalamus. For clarity, we would advise to describe the functions of these two structures one after the other. And include a justification for assessing the hippocampus and the amygdala.

      We appreciate the reviewer’s precision in this comment. We now describe these structures one after the other in the revised manuscript, as below:

      Introduction (Line 44): “For instance, it has been shown that the pulvinar plays an important role in the modulation of neocortical alpha oscillations associated with the allocation of attention (9). Studies in rats and non-human primates have shown that both the thalamus and superior colliculus are involved in the control of spatial attention by contributing to the regulation of neocortical activity (9-11). Notably, when the largest nucleus of the thalamus, the pulvinar, was inactivated after muscimol infusion, the monkey’s ability to detect colour changes in attended stimuli was lowered. This behavioral deficit occurred when the target was in the receptive field of V4 neurons that were connected to the lesioned pulvinar (12). The basal ganglia play a role in different aspects of cognitive control, encompassing attention (13,14), behavioural output (15), and conscious perception (16). Moreover, the basal ganglia contribute to visuospatial attention by linking with cortical regions like the prefrontal cortex via the thalamus.”

      Justification for assessing the hippocampus and the amygdala has been addressed in response to weakness (iii) from reviewer #3.

      • The authors mention they defined symmetric clusters of 5 sensors in each hemisphere that showed the highest modulation, but it is not clear how this number of sensors was determined a priori.

      We thank the reviewer for their comment. We edited the revised manuscript as below:

      Method (Line 536): “Ten sensors were selected to ensure sufficient coverage of the region exhibiting alpha modulation as judged from prior work (62).”

      • In line 141, the abbreviation HLM is first mentioned but the concept of "hemispheric lateralization modulation of alpha power" is only mentioned in the following section. For the ease of the reader, the abbreviation could be mentioned together with this concept at the beginning of this paragraph.

      We thank the reviewer for their attention to this point. In the revised manuscript, HLM(α) is now introduced together with its concept.

      Results (Line 153): “Next, we computed the hemispheric lateralization modulation of alpha power (HLM(α)) in each individual.”

      • In line 188 of the results section, it is mentioned that the table including the AIC values for model comparisons is in the supplementary material, however, we could not locate this table.

      We thank the reviewer for their constructive feedback. The supplementary materials were uploaded in a separate file, and it must not have been available to the reviewers. We have now added the supplementary materials to the end of the manuscript for convenience.

      • Figure 4 is missing the panel headers A, B, C, and D.

      We thank the reviewer for their precision. This figure is now fixed.

      Author response image 2.

      • In lines 205 and 206, behavioral and rapid frequency tagging analysis are mentioned. For the behavioral analysis, the method is described, but no results are provided. For the rapid frequency tagging, neither the methods nor the results are described. To evaluate the strength of this (non)-evidence, we would advise to elaborate on these analysis steps and report the results in the supplementary material.

      We thank the reviewer for this constructive comment. A brief explanation of the analysis method of rapid frequency tagging signal is added to the revised manuscript.

      Method (Line 548): “We computed the modulation index (MI) for rapid invisible frequency tagging (RIFT) by averaging the power of the signal in sensors on the right when attention was directed to the right compared to when it was directed to the left. This calculation was also performed for sensors on the left. Consequently, we identified the top 5 sensors on each side with the highest MI as the Region of Interest (ROI). Utilizing the sensors within the ROI, we computed hemispheric lateralization modulation (HLM) of RIFT by summing the average MI(RIFT) of the right sensors and the average MI(RIFT) of the left sensors, obtaining one HLM(RIFT) value for each participant. For a more comprehensive analysis, refer to reference (24).” For a more detailed answer, we refer the reviewer to the second comment from reviewer #3.

      • For the paragraph starting at line 209, we would recommend referring to Figure 1.

      We thank the reviewer for their suggestion. This paragraph is now referring to Figure 1.

      Results (Line 229): “To relate load and salience conditions of the task to the relationship between subcortical structures and the alpha activity, we combined low-load or high-load targets with high-saliency or low-saliency distractors to manipulate the perceptual load appointed to each trial (Method section, Figure 1). “

      • Figure 5 as well as the report of the beta weights in this section shows a difference in the direction of the effect for the thalamus compared to the globus pallidus and caudate nucleus which is not discussed in this section.

      We thank the reviewer for bringing this important point to our attention. We addressed this comment in the discussion section as below:

      Discussion (Line 349): “The opposite effect of the globus pallidus compared to the thalamus is striking, and possibly explained by the globus pallidus containing GABAergic interneurons. Thus the inhibitory nature of the globus pallidus projections to thalamus could explain why they are related to the alpha modulation in different manners (54).”

      Discussion (Line 379): “Moreover, the current study faced methodological constraints, limiting the analysis to the entire thalamus. […] It would be of great interest to conduct further investigations to quantify the distinct impacts of individual thalamic nuclei on the association between subcortical structures and the modulation of oscillatory activity.“

      • Comment 2 on line 80 is addressed in the paragraph following 264 by describing volumetric changes in basal ganglia in neurodegenerative disorders such as PD or Huntington's. Still, the link of how a decrease in volume in this region could be causally linked to changes in alpha-band power could be better supported.

      We thank the reviewer for their constructive feedback. Here we highlight the significant correlation between subcortical structures and changes in attention-modulated alpha oscillations. We added a few more references to the discussion supporting the relationship between size and function in relation to neurological disorders. We also edited the manuscript to make this point clearer as below:

      Introduction (Line 113): “Our current findings broaden our understanding of how subcortical structures are involved in modulating alpha oscillations during top-down spatial attention, independent of any reward or value associations. “

      Discussion (Line 305): “Changes in neocortical oscillatory activity have also been observed in neurological disorders that are mainly known to affect subcortical structures. For example, individuals with Alzheimer's Disease demonstrate an increase in slow oscillatory activities and a decrease in higher frequency oscillations (42). Moreover, in patients with Parkinson’s Disease, the power of beta oscillations increases when they are dopamine-depleted relative to when they are on dopaminergic medication (43).”

      • Related to the previous comment on behavioral and rapid frequency tagging results, these are difficult to evaluate without mention of the methods and/or results.

      We thank the reviewer for this comment. We refer the reviewer to our response to the second comment from reviewer #3.

      • The authors show differential effects of target load and distractor saliency; however, we missed the description of how these two variables differ conceptually as they are both described as contributing to task difficulty and it is not described why we would expect differential effects for these concepts (or in other words, how the authors explain the differential effects).

      We thank the reviewer for their comment. Directly comparing between task conditions within one model would result in an extra 16 regressors (1 (intercept) + 4-1 to model the difference between conditions + 3 to model the regressors + 3 x 3 to model each region x task condition interaction). Given our sample size, this study is underpowered to directly compare alpha lateralisation in contralateral versus ipsilateral conditions. For a more detailed answer please refer to our response to weakness two from reviewer #1.

      • Line 364ff: Based on the description of the experimental design, it is not clear to us whether participants only had to report on the change in gaze for the stimulus in the cued hemifield.

      We thank the reviewer for this comment, which prompted us to clarify the experimental design as below:

      Method (Line 440): “Then followed a 1000 ms response interval where participants were asked to respond with their right or left index finger whether the gaze direction of the cued face shifted left or right.”

      • Line 47ff: As mentioned above, the AIC table is not included. Further, as it is mentioned that BIC values led to similar results (indicating that they are not identical), it would be valuable to report both AIC and BIC values.

      We thank the reviewer for their constructive feedback. The supplementary materials were uploaded in a separate file, and it must not have been available to the reviewers. We have now added the BIC values and attached the supplementary materials to the end of the manuscript for convenience.

    2. Reviewer #2 (Public Review):

      Summary:

      In this study, Ghafari et al. explored the correlation between hemispheric asymmetry in the volume of various subcortical regions and lateralization of posterior alpha band oscillations in a spatial attention task with varying cognitive demands. To this end, they combined structural MRI and task MEG to investigate the relationship between hemispheric differences in volume of basal ganglia, thalamus, hippocampus and amygdala and hemisphere-specific modulation of alpha-band power. The authors report that differences in the thalamus, caudate nucleus and globus pallidus volume are linked to the attention-related changes in alpha band oscillations with differential correlations for different regions in different conditions of the design (depending on the salience of the distractor and/or the target).

      The manuscript contributes to filling an important gap in current research on attention allocation which commonly focuses exclusively on cortical structures. Because it is not possible to reliably measure subcortical activity with non-invasive electrophysiological methods, they correlate volumetric measurements of the relevant subcortical regions with cortical measurements of alpha band power. Specifically, they build on their own previous finding showing a correlation between hemispheric asymmetry of basal ganglia volumes and alpha lateralization by assessing a task without an explicit reward component. Furthermore, the authors use differences in saliency and perceptual load to disentangle the individual contributions of the subcortical regions. These remain somewhat hard to interpret, given their post hoc nature, and the lack of statistical power to compare task demand effects directly, but the results raise interesting new hypotheses for future work.

    3. eLife assessment

      This study by Ghafari et al. tackles a question relevant for the field of attention as it connects structural differences in subcortical regions with oscillatory modulations during attention allocation. Using a combination of magnetoencephalography (MEG) and magnetic resonance imaging (MRI) data in human subjects, the valuable results show that inter-individual differences in the lateralisation of alpha oscillations are explained by asymmetry of subcortical brain regions. The strength of evidence is deemed convincing in line with current state-of-the-art.

    4. Reviewer #1 (Public Review):

      Summary:

      The authors re-analysed the data of a previous study in order to investigate the relation between asymmetries of subcortical brain structures and the hemispheric lateralization of alpha oscillations during visual spatial attention. The visual spatial attention task crossed the factors of target load and distractor salience, which made it possible to also test the specificity of the relation of subcortical asymmetries to lateralized alpha oscillations for specific attentional load conditions. Asymmetry of globus pallidus, caudate nucleus, and thalamus explained inter-individual differences in attentional alpha modulation in the left versus right hemisphere. Multivariate regression analysis revealed that the explanatory potential of these regions' asymmetries varies as a function of target load and distractor salience.

      In the revision of the article, the authors addressed my concerns.

      However, my concern regarding the statistical analysis of the specificity of certain subcortical regions in predicting HLM seems not to be fully addressed. The authors added an additional statistical analysis for "testing the null hypothesis that a given regressor does not impact all dependent variables". To my understanding, this is a somewhat unusual definition of a null hypothesis. Typically, the null hypothesis is the hypothesis of no effect, meaning here it should state that the effect is the same across predictors.

      In the new statistical analysis, the authors seem to take non-significant results (p>.05) as evidence for the specificity of subcortical regions in predicting HLM. The rationale of this statistical approach is difficult to follow and was somewhat unclear to me.

      A much simpler and more straightforward approach would be to contrast beta estimates per subcortical region between experimental conditions. For instance, if the beta estimates in the thalamus for the "low-load target, non-salient distractor" condition were significantly larger than the beta estimates for the other conditions, this would speak to specificity.
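
      The contrast suggested above could be sketched as follows. This is a minimal illustration, not the analysis used in the study: it assumes a single predictor per model (one region's asymmetry index) rather than the authors' multivariate regression, and it uses a participant-level bootstrap to put a confidence interval on the difference in slopes between two conditions. The function names and the simulated inputs are hypothetical.

```python
import random
from statistics import mean


def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x (with intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def bootstrap_slope_contrast(x, y_a, y_b, n_boot=2000, seed=0):
    """Bootstrap the slope difference (condition A minus condition B).

    x        : per-participant asymmetry index of one region (hypothetical input)
    y_a, y_b : per-participant alpha lateralization in two conditions
    Returns (observed difference, lower bound, upper bound) of a 95% percentile CI.
    """
    rng = random.Random(seed)
    n = len(x)
    observed = ols_slope(x, y_a) - ols_slope(x, y_b)
    diffs = []
    for _ in range(n_boot):
        # Resample participants with replacement, keeping each participant's
        # predictor and both condition outcomes together.
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        slope_a = ols_slope(xs, [y_a[i] for i in idx])
        slope_b = ols_slope(xs, [y_b[i] for i in idx])
        diffs.append(slope_a - slope_b)
    diffs.sort()
    lower = diffs[int(0.025 * n_boot)]
    upper = diffs[int(0.975 * n_boot) - 1]
    return observed, lower, upper
```

      Under these assumptions, a confidence interval that excludes zero would indicate that the region's beta estimate differs between the two conditions, which is the kind of evidence for specificity requested here.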

    1. eLife assessment

      This important study identifies the mitotic localization mechanism for Aurora B and INCENP (parts of the chromosomal passenger complex, CPC) in Trypanosoma brucei. The mechanism differs from that in the more commonly studied opisthokonts and is supported by compelling RNAi and imaging experiments, targeted mutations, immunoprecipitations with crosslinking/mass spec, and AlphaFold interaction predictions. The findings will be of interest to cell biologists working on cell division, parasitologists, and those interested in the evolution of mitotic mechanisms.

    2. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      This important study identifies the mitotic localization mechanism for Aurora B and INCENP (parts of the chromosomal passenger complex, CPC) in Trypanosoma brucei. The mechanism is different from that in the more commonly studied opisthokonts and there is solid support from RNAi and imaging experiments, targeted mutations, immunoprecipitations with crosslinking/mass spec, and AlphaFold interaction predictions. The results could be strengthened by biochemically testing proposed direct interactions and demonstrating that the targeting protein KIN-A is a motor. The findings will be of interest to parasitology researchers as well as cell biologists working on mitosis and cell division, and those interested in the evolution of the CPC.

      We thank the editor and the reviewers for their thorough and positive assessment of our work and the constructive feedback to further improve our manuscript. Please find below our responses to the reviewers’ comments. Please note that the conserved glycine residue in the Switch II helix in KIN-A was mistakenly labelled as G209 in the original manuscript. We now corrected it to G210 in the revised manuscript.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The CPC plays multiple essential roles in mitosis such as kinetochore-microtubule attachment regulation, kinetochore assembly, spindle assembly checkpoint activation, anaphase spindle stabilization, cytokinesis, and nuclear envelope formation, as it dynamically changes its mitotic localization: it is enriched at inner centromeres from prophase to metaphase but it is relocalized at the spindle midzone in anaphase. The business end of the CPC is Aurora B and its allosteric activation module IN-box, which is located at the C-terminal part of INCENP. In most well-studied eukaryotic species, Aurora B activity is locally controlled by the localization module of the CPC, Survivin, Borealin, and the N-terminal portion of INCENP. Survivin and Borealin, which bind the N terminus of INCENP, recognize histone residues that are specifically phosphorylated in mitosis, while anaphase spindle midzone localization is supported by the direct microtubule-binding capacity of the SAH (single alpha helix) domain of INCENP and other microtubule-binding proteins that specifically interact with INCENP during anaphase, which are under the regulation of CDK activity. One of these examples includes the kinesin-like protein MKLP2 in vertebrates.

      Trypanosoma is an evolutionarily interesting species to study mitosis since its kinetochore and centromere proteins do not show any similarity to other major branches of eukaryotes, while orthologs of Aurora B and INCENP have been identified. Combining molecular genetics, imaging, biochemistry, cross-linking IP-MS (IP-CLMS), and structural modeling, this manuscript reveals that two orphan kinesin-like proteins KIN-A and KIN-B act as localization modules of the CPC in Trypanosoma brucei. The IP-CLMS, AlphaFold2 structural predictions, and domain deletion analysis support the idea that (1) KIN-A and KIN-B form a heterodimer via their coiled-coil domain, (2) Two alpha helices of INCENP interact with the coiled-coil of the KIN-A-KIN-B heterodimer, (3) the conserved KIN-A C-terminal CD1 interacts with the heterodimeric KKT9-KKT11 complex, which is a submodule of the KKT7-KKT8 kinetochore complex unique to Trypanosoma, (4) KIN-A and KIN-B coiled-coil domains and the KKT7-KKT8 complex are required for CPC localization at the centromere, (5) CD1 and CD2 domains of KIN-A support its centromere localization. The authors further show that the ATPase activity of KIN-A is critical for spindle midzone enrichment of the CPC. The imaging data of the KIN-A rigor mutant suggest that dynamic KIN-A-microtubule interaction is required for metaphase alignment of the kinetochores and proliferation. Overall, the study reveals novel pathways of CPC localization regulation via KIN-A and KIN-B by multiple complementary approaches.

      Strengths:

      The major conclusion is collectively supported by multiple approaches, combining site-specific genome engineering, epistasis analysis of cellular localization, AlphaFold2 structure prediction of protein complexes, IP-CLMS, and biochemical reconstitution (the complex of KKT8, KKT9, KKT11, and KKT12).

      We thank the reviewer for her/his positive assessment of our manuscript.

      Weaknesses:

      • The predictions of direct interactions (e.g. INCENP with KIN-A/KIN-B, or KIN-A with KKT9-KKT11) have not yet been confirmed experimentally, e.g. by domain mutagenesis and interaction studies.

      Thank you for this point. It is true that we do not have evidence for direct interactions between KIN-A and KKT9-KKT11. However, the interaction between INCENP and KIN-A/KIN-B is strongly supported by our cross-linking IP-MS of native complexes. Furthermore, we show that deletion of the INCENPCPC1 N-terminus predicted to interact with KIN-A:KIN-B abolishes kinetochore localization.

      • The criteria used to judge a failure of localization are not clearly explained (e.g., Figure 5F, G).

      As suggested by the reviewer in recommendation #14, we have now included example images for each category (‘kinetochores’, ‘kinetochores + spindle’, ‘spindle’) along with a schematic illustration in Fig. 5F.

      • It remains to be shown that KIN-A has motor activity.

      We thank the reviewer for this important comment. Indeed, motor activity remains to be demonstrated using an in vitro system, which is beyond the scope of this study. What we show here is that the motor domain of KIN-A effectively co-sediments with microtubules and that spindle localization of KIN-A is abolished upon deletion of the motor domain. Moreover, mutation of a conserved glycine residue in the Switch II region (G210) to alanine (‘rigor mutation’; Rice et al., 1999) renders KIN-A incapable of translocating to the central spindle, suggesting that its ATPase activity is required for this process. To clarify this point in the manuscript, we have replaced all instances where we refer to ‘motor activity’ of KIN-A with ‘ATPase activity’ when referring to experiments performed using the KIN-A rigor mutant. In addition, we have included a multiple sequence alignment (MSA) of KIN-A and KIN-B from different kinetoplastids with human Kinesin-1, human Mklp2 and yeast Klp9 in Figure 6A and S6A, showing the conservation of key motifs required for ATP coordination and tubulin interaction. In the corresponding paragraph in the main text, we describe these data as follows:

      ‘We therefore speculated that anaphase translocation of the kinetoplastid CPC to the central spindle may involve the kinesin motor domain of KIN-A. KIN-B is unlikely to be a functional kinesin based on the absence of several well-conserved residues and motifs within the motor domain, which are fully present in KIN-A (Li et al., 2008). These include the P-loop, switch I and switch II motifs, which form the nucleotide binding cleft, and many conserved residues within the α4-L12 elements, which interact with tubulin (Fig. S6A) (Endow et al., 2010). Consistent with this, the motor domain of KIN-B, contrary to KIN-A, failed to localize to the mitotic spindle when expressed ectopically (Fig. S2E) and did not co-sediment with microtubules in our in vitro assay (Fig. S6B).’

      • The authors imply that KIN-A, but not KIN-B, interacts with microtubules based on a microtubule pelleting assay (Fig. S6), but the substantial insoluble fractions of 6HIS-KIN-A and 6HIS-KIN-B make it difficult to conclusively interpret the data. It is possible that these two proteins are not stable unless they form a heterodimer.

      This is indeed a possibility. We are currently aiming at purifying full-length recombinant KIN-A and KIN-B (along with the other CPC components), which will allow us to perform in vitro interaction studies and to investigate biochemical properties of this complex (including the role of the motor domains of KIN-A and KIN-B) within the framework of an in-depth follow-up study. To address the point above, we have added the following text in the legend corresponding to Fig. S6:

      ‘Microtubule co-sedimentation assay with 6HIS-KIN-A2-309 (left) and 6HIS-KIN-B2-316 (right). S and P correspond to supernatant and pellet fractions, respectively. Note that both constructs to some extent sedimented even in the absence of microtubules. Hence, lack of microtubule binding for KIN-B may be due to the unstable non-functional protein used in this study.’

      • For broader context, some prior findings should be introduced, e.g. on the importance of the microtubule-binding capacity of the INCENP SAH domain and its regulation by mitotic phosphorylation (PMID 8408220, 26175154, 26166576, 28314740, 28314741, 21727193), since KIN-A and KIN-B may substitute for the function of the SAH domain.

      We have modified the introduction to include the following text and references mentioned by the reviewer: ‘The localization module comprises Borealin, Survivin and the N-terminus of INCENP, which are connected to one another via a three-helical bundle (Jeyaprakash et al., 2007, 2011; Klein et al., 2006). The two modules are linked by the central region of INCENP, composed of an intrinsically disordered domain and a single alpha helical (SAH) domain. INCENP harbours microtubule-binding domains within the N-terminus and the central SAH domain, which play key roles for CPC localization and function (Samejima et al., 2015; Kang et al., 2001; Noujaim et al., 2014; Cormier et al., 2013; Wheatley et al., 2001; Nakajima et al., 2011; Fink et al., 2017; Wheelock et al., 2017; van der Horst et al., 2015; Mackay et al., 1993).’

      Reviewer #2 (Public Review):

      How the chromosomal passenger complex (CPC) and its subunit Aurora B kinase regulate kinetochore-microtubule attachment, and how the CPC relocates from kinetochores to the spindle midzone as a cell transitions from metaphase to anaphase are questions of great interest. In this study, Ballmer and Akiyoshi take a deep dive into the CPC in T. brucei, a kinetoplastid parasite with a kinetochore composition that varies greatly from other organisms.

      Using a combination of approaches, most importantly in silico protein predictions using AlphaFold-Multimer and light microscopy in dividing T. brucei, the authors convincingly present and analyse the composition of the T. brucei CPC. This includes the identification of KIN-A and KIN-B, proteins of the kinesin family, as targeting subunits of the CPC. This is a clear advancement over earlier work, for example by Li and colleagues in 2008. The involvement of KIN-A and KIN-B is of particular interest, as it provides a clue for the (re)localization of the CPC during the cell cycle. The evolutionary perspective makes the paper potentially interesting for a wide audience of cell biologists, a point that the authors bring across properly in the title, the abstract, and their discussion.

      The evolutionary twist of the paper would be strengthened 'experimentally' by predictions of the structure of the CPC beyond T. brucei. Depending on how far the authors can extend their in-silico analysis, it would be of interest to discuss a) available/predicted CPC structures in well-studied organisms and b) structural predictions in other euglenozoa. What are the general structural properties of the CPC (e.g. flexible linkers, overall dimensions, structural differences when subunits are missing etc.)? How common is the involvement of kinesin-like proteins? In line with this, it would be good to display the figure currently shown as S1D (or similar) as a main panel.

      We thank the reviewer for her/his encouraging assessment of our manuscript and the appreciation of the evolutionary relevance of our work. As suggested, we have moved the phylogenetic tree previously shown in Fig. S1D to the main Fig. 1F. Our AF2 analysis of CPC proteins and (sub)complexes from other kinetoplastids failed to predict reliable interactions among CPC proteins except for that between Aurora B and the IN box. It therefore remains unclear whether CPC structures are conserved among kinetoplastids. Because components of CPC remain unknown in other euglenozoa (other than Aurora B and INCENP), we cannot perform structural predictions of CPC in diplonemids or euglenids.

      It remains unclear how common the involvement of kinesin-like proteins with the CPC is in other eukaryotes, partly because we could not identify an obvious homolog of KIN-A/KIN-B outside of kinetoplastids. Addressing this question would require experimental approaches in various eukaryotes (e.g. immunoprecipitation and mass spectrometry of Aurora B) as we carried out in this manuscript using Trypanosoma brucei.

      Reviewer #3 (Public Review):

      Summary:

      The protein kinase, Aurora B, is a critical regulator of mitosis and cytokinesis in eukaryotes, exhibiting a dynamic localisation. As part of the Chromosomal Passenger Complex (CPC), along with the Aurora B activator, INCENP, and the CPC localisation module comprised of Borealin and Survivin, Aurora B travels from the kinetochores at metaphase to the spindle midzone at anaphase, which ensures its substrates are phosphorylated in a time- and space-dependent manner. In the kinetoplastid parasite, T. brucei, the Aurora B orthologue (AUK1), along with an INCENP orthologue known as CPC1, and a kinetoplastid-specific protein CPC2, also displays a dynamic localisation, moving from the kinetochores at metaphase to the spindle midzone at anaphase, to the anterior end of the newly synthesised flagellum attachment zone (FAZ) at cytokinesis. However, the trypanosome CPC lacks orthologues of Borealin and Survivin, and T. brucei kinetochores also have a unique composition, being comprised of dozens of kinetoplastid-specific proteins (KKTs). Of particular importance for this study are KKT7 and the KKT8 complex (comprising KKT8, KKT9, KKT11, and KKT12). Here, Ballmer and Akiyoshi seek to understand how the CPC assembles and is targeted to its different locations during the cell cycle in T. brucei.

      Strengths & Weaknesses:

      Using immunoprecipitation and mass-spectrometry approaches, Ballmer and Akiyoshi show that AUK1, CPC1, and CPC2 associate with two orphan kinesins, KIN-A and KIN-B, and with the use of endogenously expressed fluorescent fusion proteins, demonstrate for the first time that KIN-A and KIN-B display a dynamic localisation pattern similar to other components of the CPC. Most of these data provide convincing evidence for KIN-A and KIN-B being bona fide CPC proteins, although the evidence that KIN-A and KIN-B translocate to the anterior end of the new FAZ at cytokinesis is weak - the KIN-A/B signals are very faint and difficult to see, and cell outlines/brightfield images are not presented to allow the reader to determine the cellular location of these faint signals (Fig S1B).

      We thank the reviewer for their thorough assessment of our manuscript and the insightful feedback to further improve our study. To address the point above, we have acquired new microscopy data for Fig. S1B and S1C, which now include phase contrast images, and have chosen representative cells in late anaphase and telophase. We hope that the signal of Aurora BAUK1, KIN-A and KIN-B at the anterior end of the new FAZ can now be distinguished more clearly.

      They then demonstrate, by using RNAi to deplete individual components, that the CPC proteins have hierarchical interdependencies for their localisation to the kinetochores at metaphase. These experiments appear to have been well performed, although only images of cell nuclei were shown (Fig 2A), meaning that the reader cannot properly assess whether CPC components have localised elsewhere in the cell, or if their abundance changes in response to depletion of another CPC protein.

      We chose to show close-ups of the nucleus to highlight the different localization patterns of CPC proteins under the different RNAi conditions. In none of these conditions did we observe mis-localization of CPC subunits to the cytoplasm. To clarify this point, we added the following sentence in the legend for Figure 2A:

      ‘A) Representative fluorescence micrographs showing the localization of YFP-tagged Aurora BAUK1, INCENPCPC1, KIN-A and KIN-B in 2K1N cells upon RNAi-mediated knockdown of indicated CPC subunits. Note that nuclear close-ups are shown here. CPC proteins were not detected in the cytoplasm. RNAi was induced with 1 μg/mL doxycycline for 24 h (KIN-B RNAi) or 16 h (all others). Cell lines: BAP3092, BAP2552, BAP2557, BAP3093, BAP2906, BAP2900, BAP2904, BAP3094, BAP2899, BAP2893, BAP2897, BAP3095, BAP3096, BAP2560, BAP2564, BAP3097. Scale bars, 2 μm.’

      Ballmer and Akiyoshi then go on to determine the kinetochore localisation domains of KIN-A and KIN-B. Using ectopically expressed GFP-tagged truncations, they show that coiled-coil domains within KIN-A and KIN-B, as well as a disordered C-terminal tail present only in KIN-A, but not the N-terminal motor domains of KIN-A or KIN-B, are required for kinetochore localisation. These data are strengthened by immunoprecipitating CPC complexes and crosslinking them prior to mass spectrometry analysis (IP-CLMS), a state-of-the-art approach, to determine the contacts between the CPC components. Structural predictions of the CPC structure are also made using AlphaFold2, suggesting that coiled coils form between KIN-A and KIN-B, and that KIN-A/B interact with the N termini of CPC1 and CPC2. Experimental results show that CPC1 and CPC2 are unable to localise to kinetochores if they lack their N-terminal domains consistent with these predictions. Altogether these data provide convincing evidence of the protein domains required for CPC kinetochore localisation and CPC protein interactions. However, the authors also conclude that KIN-B plays a minor role in localising the CPC to kinetochores compared to KIN-A. This conclusion is not particularly compelling as it stems from the observation that ectopically expressed GFP-NLS-KIN-A (full length or coiled-coil domain + tail) is also present at kinetochores during anaphase unlike endogenously expressed YFP-KIN-A. Not only is this localisation probably an artifact of the ectopic expression, but the KIN-B coiled-coil domain localises to kinetochores from S to metaphase and Fig S2G appears to show a portion of the expressed KIN-B coiled-coil domain colocalising with KKT2 at anaphase. It is unclear why KIN-B has been discounted here.

      As the reviewer points out, a small fraction of GFP-NLS-KIN-B317-624 is indeed detectable at kinetochores in anaphase, although most of the protein shows diffuse nuclear staining. There are various explanations for this phenomenon: It is conceivable that the KIN-B motor domain may contribute to microtubule binding and translocation of the CPC from kinetochores onto the spindle in anaphase. In our experiments, ectopically expressed KIN-B317-624 likely outcompetes a fraction of endogenous KIN-B for binding to KIN-A, which could interfere with this translocation process, leaving a population of CPC ‘stranded’ at kinetochores in anaphase. Another possibility, hinted at by the reviewer, is that the C-terminus of KIN-B interacts with receptors at the kinetochore/centromere. Although we do not discount this possibility, we nevertheless decided to focus on KIN-A in this study, because the anaphase kinetochore retention phenotype for both full-length GFP-NLS-KIN-A and -KIN-A309-862 is much stronger than for KIN-B317-624. Two additional reasons were that (i) KIN-A is highly conserved within kinetoplastids, whereas KIN-B orthologs are missing in some kinetoplastids, and (ii) no convincing interactions between KIN-B and kinetochore proteins were predicted by AF2.

      To address the reviewer’s point, we decided to include KIN-B in the title of this manuscript, which now reads: ‘Dynamic localization of the chromosomal passenger complex is controlled by the orphan kinesins KIN-A and KIN-B in the kinetoplastid parasite Trypanosoma brucei’.

      Moreover, we modified the corresponding paragraph in the results section as follows:

      ‘Intriguingly, unlike endogenously YFP-tagged KIN-A, ectopically expressed GFP fusions of both full-length KIN-A and KIN-A310-862 clearly localized at kinetochores even in anaphase (Figs. 2, F and H). Weak anaphase kinetochore signal was also detectable for KIN-B317-624 (Fig. S2F). GFP fusions of the central coiled-coil domain or the C-terminal disordered tail of KIN-A did not localize to kinetochores (data not shown). These results show that kinetochore localization of the CPC is mediated by KIN-A and KIN-B and requires both the central coiled-coil domain as well as the C-terminal disordered tail of KIN-A.’

      Next, using a mixture of RNAi depletion and LacI-LacO recruitment experiments, the authors show that kinetochore proteins KKT7 and KKT9 are required for AUK1 to localise to kinetochores (other KKT8 complex components were not tested here) and that all components of the KKT8 complex are required for KIN-A kinetochore localisation. Further, both KKT7 and KKT8 were able to recruit AUK1 to an ectopic locus in the S phase, and KKT7 recruited KKT8 complex proteins, which the authors suggest indicates it is upstream of KKT8. However, while these experiments have been performed well, the reciprocal experiment to show that KKT8 complex proteins cannot recruit KKT7, which could have confirmed this hierarchy, does not appear to have been performed. Further, since the LacI fusion proteins used in these experiments were ectopically expressed, they were retained (artificially) at kinetochores into anaphase; KKT8 and KIN-A were both able to recruit AUK1 to LacO foci in anaphase, while KKT7 was not. The authors conclude that this suggests the KKT8 complex is the main kinetochore receptor of the CPC - while very plausible, this conclusion is based on a likely artifact of ectopic expression, and for that reason, should be interpreted with a degree of caution.

      We previously showed that RNAi-mediated depletion of KKT7 disrupts kinetochore localization of KKT8 complex members, whereas kinetochore localization of KKT7 is unaffected by disruption of the KKT8 complex (Ishii and Akiyoshi, 2020). Moreover, in contrast to the KKT8 complex, KKT7 remains at kinetochores in anaphase (Akiyoshi and Gull, 2014). These data show that KKT7 is upstream of the KKT8 complex. In this context, the LacI-LacO tethering approach can be very useful to probe whether two proteins (or domains of proteins) could interact in vivo either directly or indirectly. However, a recruitment hierarchy cannot be inferred from such experiments because the data just shows whether X can recruit Y to an ectopic locus (but not whether X is upstream of Y or vice versa). Regarding the retention of Aurora BAUK1 at kinetochores in anaphase upon ectopic expression of GFP-KKT8-LacI, we agree with the reviewer that these data need to be carefully interpreted. Nevertheless, the notion that the KKT7-KKT8 complex recruits the CPC to kinetochores is also strongly supported by IP-MS, RNAi experiments, and AF2 predictions. For clarification and to address the reviewer’s point, we re-formulated the corresponding paragraph in the main text:

      ‘We previously showed that KKT7 lies upstream of the KKT8 complex (Ishii and Akiyoshi, 2020). Indeed, GFP-KKT72-261-LacI recruited tdTomato-KKT8, -KKT9 and -KKT12 (Fig. S4E). Expression of both GFP-KKT72-261-LacI and GFP-KKT8-LacI resulted in robust recruitment of tdTomato-Aurora BAUK1 to LacO foci in S phase (Figs. 4, E and F). Intriguingly, we also noticed that, unlike endogenous KKT8 (which is not present in anaphase), ectopically expressed GFP-KKT8-LacI remained at kinetochores during anaphase (Fig. 4F). This resulted in a fraction of tdTomato-Aurora BAUK1 being trapped at kinetochores during anaphase instead of migrating to the central spindle (Fig. 4F). We observed a comparable situation upon ectopic expression of GFP-KIN-A, which is retained on anaphase kinetochores together with tdTomato-KKT8 (Fig. S4F). In contrast, Aurora BAUK1 was not recruited to LacO foci marked by GFP- KKT72-261-LacI in anaphase (Fig. 4E).’

      Further IP-CLMS experiments, in combination with recombinant protein pull-down assays and structural predictions, suggested that within the KKT8 complex, there are two subcomplexes of KKT8:KKT12 and KKT9:KKT11, and that KKT7 interacts with KKT9:KKT11 to recruit the remainder of the KKT8 complex. The authors also assess the interdependencies between KKT8 complex components for localisation and expression, showing that all four subunits are required for the assembly of a stable KKT8 complex and present AlphaFold2 structural modelling data to support the two subcomplex models. In general, these data are of high quality and convincing with a few exceptions. The recombinant pulldown assay (Fig. 4H) is not particularly convincing as the 3rd eluate gel appears to show a band at the size of KKT11 (despite the labelling indicating no KKT11 was present in the input) but no pulldown of KKT9, which was present in the input according to the figure legend (although this may be mislabeled since not consistent with the text). The text also states that 6HIS-KKT8 was insoluble in the absence of KKT12, but this is not possible to assess from the data presented.

      We thank the reviewer for pointing out an error in the text: ‘Removal of both KKT9 and KKT11 did not impact formation of the KKT8:KKT12 subcomplex’ should read ‘Removal of either KKT9 or KKT11 did not impact formation of the KKT8:KKT12 subcomplex’. Regarding the very faint band perceived to be KKT11 in the 3rd eluate: This band runs slightly lower than KKT11 and likely represents a bacterial contaminant (which we have seen also in other preps in the past). We have made a note of this in the corresponding legend (new Fig. 4I). Moreover, we provide the estimated molecular weights for each subunit, as suggested by the reviewer in recommendation #14 (see below):

      ‘(I) Indicated combinations of 6HIS-tagged KKT8 (~46 kDa), KKT9 (~39 kDa), KKT11 (~29 kDa) and KKT12 (~23 kDa) were co-expressed in E. coli, followed by metal affinity chromatography and SDS-PAGE. The asterisk indicates a common contaminant.’

      The corresponding paragraph in the results section now reads:

      ‘To validate these findings, we co-expressed combinations of 6HIS-KKT8, KKT9, KKT11 and KKT12 in E. coli and performed metal affinity chromatography (Fig. 4I). 6HIS-KKT8 efficiently pulled down KKT9, KKT11 and KKT12, as shown previously (Ishii and Akiyoshi, 2020). In the absence of KKT9, 6HIS-KKT8 still pulled down KKT11 and KKT12. Removal of either KKT9 or KKT11 did not impact formation of the KKT8:KKT12 subcomplex. In contrast, 6HIS-KKT8 could not be recovered without KKT12, indicating that KKT12 is required for formation of the full KKT8 complex. These results support the idea that the KKT8 complex consists of KKT8:KKT12 and KKT9:KKT11 subcomplexes.’

      It is also surprising that data showing the effects of KKT8, KKT9, and KKT12 depletion on KKT11 localisation and abundance are not presented alongside the reciprocal experiments in Fig S4G-J.

      YFP-KKT11 is delocalized upon depletion of KKT8 and KKT9 (see below). Unfortunately, we were unsuccessful in our attempts at deriving the corresponding KKT12 RNAi cell line, rendering this set of data incomplete. Because these data are not of critical importance for this study, we decided not to invest more time in attempting further transfections.

      Author response image 1.

      The authors also convincingly show that AlphaFold2 predictions of interactions between KKT9:KKT11 and a conserved domain (CD1) in the C-terminal tail of KIN-A are likely correct, with CD1 and a second conserved domain, CD2, identified through sequence analysis, acting synergistically to promote KIN-A kinetochore localisation at metaphase, but not being required for KIN-A to move to the central spindle at anaphase. They then hypothesise that the kinesin motor domain of KIN-A (but not KIN-B which is predicted to be inactive based on non-conservation of residues key for activity) determines its central spindle localisation at anaphase through binding to microtubules. In support of this hypothesis, the authors show that KIN-A, but not KIN-B can bind microtubules in vitro and in vivo. However, ectopically expressed GFP-NLS fusions of full-length KIN-A or KIN-A motor domain did not localise to the central spindle at anaphase. The authors suggest this is due to the GFP fusion disrupting the ATPase activity of the motor domain, but they provide no evidence that this is the case. Instead, they replace endogenous KIN-A with a predicted ATPase-defective mutant (G209A), showing that while this still localises to kinetochores, the kinetochores were frequently misaligned at metaphase, and that it no longer concentrates at the central spindle (with concomitant mis-localisation of AUK1), causing cells to accumulate at anaphase. From these data, the authors conclude that KIN-A ATPase activity is required for chromosome congression to the metaphase plate and its central spindle localisation at anaphase. While potentially very interesting, these data are incomplete in the absence of any experimental data to show that KIN-A possesses ATPase activity or that this activity is abrogated by the G209A mutation, and the conclusions of this section are rather speculative.

      Thank you for this important comment, which relates to a similar point raised by Reviewer 1 (see above). Indeed, ATPase and motor activity of KIN-A remain to be demonstrated biochemically using recombinant proteins, which is beyond the scope of this study. We generated MSAs of KIN-A and KIN-B from different kinetoplastids with human Kinesin-1, human Mklp2 and yeast Klp9, which are now presented in Figure 6A and S6A. These clearly show that key motifs required for ATP or tubulin binding in other kinesins are highly conserved in KIN-A (but not KIN-B). This includes the conserved glycine residue in the Switch II helix (G234 in human Kinesin-1, G210 in T. brucei KIN-A), which forms a hydrogen bond with the γ-phosphate of ATP, and upon mutation has been shown to impair ATPase activity and trap the motor head in a strong microtubule-binding (‘rigor’) state (Rice et al., 1999; Sablin et al., 1996). The prominent rigor phenotype of KIN-AG210A is consistent with KIN-A having ATPase activity. In addition to the data in Fig. 6A and S6A, we made the following changes to the main text:

      ‘We therefore speculated that anaphase translocation of the kinetoplastid CPC to the central spindle may involve the kinesin motor domain of KIN-A. KIN-B is unlikely to be a functional kinesin based on the absence of several well-conserved residues and motifs within the motor domain, which are fully present in KIN-A (Li et al., 2008). These include the P-loop, switch I and switch II motifs, which form the nucleotide binding cleft, and many conserved residues within the α4-L12 elements, which interact with tubulin (Fig. S6A) (Endow et al., 2010). Consistent with this, the motor domain of KIN-B, contrary to KIN-A, failed to localize to the mitotic spindle when expressed ectopically (Fig. S2E) and did not co-sediment with microtubules in our in vitro assay (Fig. S6B).

      Ectopically expressed GFP-KIN-A and -KIN-A2-309 partially localized to the mitotic spindle but failed to concentrate at the midzone during anaphase (Figs. 2, F and G), suggesting that N-terminal tagging of the KIN-A motor domain may interfere with its function. To address whether the ATPase activity of KIN-A is required for central spindle localization of the CPC, we replaced one allele of KIN-A with a C-terminally YFP-tagged G210A ATP hydrolysis-defective rigor mutant (Fig. 6A) (Rice et al., 1999) and used an RNAi construct directed against the 3’UTR of KIN-A to deplete the untagged allele. The rigor mutation did not affect recruitment of KIN-A to kinetochores (Figs. S6, C and D). However, KIN-AG210A-YFP marked kinetochores were misaligned in ~50% of cells arrested in metaphase, suggesting that ATPase activity of KIN-A promotes chromosome congression to the metaphase plate (Figs. S6, E-H).’

      Impact:

      Overall, this work uses a wide range of cutting-edge molecular and structural predictive tools to provide a significant amount of new and detailed molecular data that shed light on the composition of the unusual trypanosome CPC and how it is assembled and targeted to different cellular locations during cell division. Given the fundamental nature of this research, it will be of interest to many parasitology researchers as well as cell biologists more generally, especially those working on aspects of mitosis and cell division, and those interested in the evolution of the CPC.

      We thank the reviewer for his/her feedback and thoughtful and thorough assessment of our study.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) Why did the authors omit KIN-B from the title?

      We decided to add KIN-B in the title. Please see our response to Reviewer #3 (public review).

      (2) Abstract, line 28, "Furthermore, the kinesin motor activity of KIN-A promotes chromosome alignment in prometaphase and CPC translocation to the central spindle upon anaphase onset." This must be revised - see public review.

      We changed this section of the abstract as follows:

      ‘Furthermore, the ATPase activity of KIN-A promotes chromosome alignment in prometaphase and CPC translocation to the central spindle upon anaphase onset. Thus, KIN-A constitutes a unique ‘two-in-one’ CPC localization module in complex with KIN-B, which directs the CPC to kinetochores (from S phase until metaphase) via its C-terminal tail, and to the central spindle (in anaphase) via its N-terminal kinesin motor domain.’

      (3) Line 87-90. The findings by Li et al., 2008 (KIN-A and KIN-B interacting with Aurora B and epistasis analysis) should be introduced more comprehensively in the Introduction section.

      We added the following sentence in the introduction:

      ‘In addition, two orphan kinesins, KIN-A and KIN-B, have been proposed to transiently associate with Aurora BAUK1 during mitosis (Li et al., 2008; Li, 2012).’

      (4) Figure 1B. The way the Trypanosoma cell cycle is defined should be briefly explained in the main text, rather than just referring to the figure.

      The ‘KN’ annotation of the trypanosome cell cycle is explained in the Figure 1 legend. We have now also added a brief description in the main text:

      ‘We next assessed the localization dynamics of fluorescently tagged KIN-A and KIN-B over the course of the cell cycle (Figs. 1, B-E). T. brucei possesses two DNA-containing organelles, the nucleus (‘N’) and the kinetoplast (‘K’). The kinetoplast is an organelle found uniquely in kinetoplastids, which contains the mitochondrial DNA and replicates and segregates prior to nuclear division. The ‘KN’ configuration serves as a good cell cycle marker (Woodward and Gull, 1990; Siegel et al., 2008).’

      (5) Line 118. Throughout the paper, it is not clear why GFP-NLS fusion was used instead of GFP fusion. Please justify the fusion of NLS.

      NLS refers to a short ‘nuclear localization signal’ (TGRGHKRSREQ) (Marchetti et al., 2000), which ensures that the ectopically expressed construct is imported into the nucleus. When we previously expressed truncations of KKT2 and KKT3 kinetochore proteins, many fragments did not go into the nucleus presumably due to the lack of an NLS, which prevented us from determining which domains are responsible for their kinetochore localization. We have since consistently used this short NLS sequence in our inducible GFP fusions without any complications. We added a sentence in the Materials & Methods section under Trypanosome culture: ‘All constructs for ectopic expression of GFP fusion proteins include a short nuclear localization signal (NLS) (Marchetti et al., 2000).’ To avoid unnecessary confusion, we removed ‘NLS’ from the main text and figures.

      (6) Line 121, "Unexpectedly". It is not clear why this was unexpected.

      To clarify this point, we modified this paragraph in the results section:

      ‘To our surprise, KIN-A-YFP and GFP-KIN-B exhibited a CPC-like localization pattern identical to that of Aurora BAUK1: Both kinesins localized to kinetochores from S phase to metaphase, and then translocated to the central spindle in anaphase (Figs. 1, C-E). Moreover, like Aurora BAUK1, a population of KIN-A and KIN-B localized at the new FAZ tip from late anaphase onwards (Figs. S1, B and C). This was unexpected, because KIN-A and KIN-B were previously reported to localize to the spindle but not to kinetochores or the new FAZ tip (Li et al., 2008). These data suggest that KIN-A and KIN-B are bona fide CPC proteins in trypanosomes, associating with Aurora BAUK1, INCENPCPC1 and CPC2 throughout the cell cycle.’

      (7) Line 127-129. Defining homologs and orthologs is tricky - there are many homologs and paralogs of kinesin-like proteins. The method to define the presence or absence of KIN-A/KIN-B homologs should be described in the Materials and Methods section.

      Due to the difficulty in defining true orthologs for kinesin-like proteins, we took a conservative approach: reciprocal best BLAST hits. We first searched for KIN-A homologs using BLAST in the TriTryp database or hmmsearch with manually prepared HMM profiles. When the top hit in a given organism retrieved T. brucei KIN-A in a reciprocal BLAST search against the T. brucei proteome, we considered the hit a true ortholog. We modified the Materials and Methods section as below.

      ‘Searches for homologous proteins were done using BLAST in the TriTryp database (Aslett et al., 2010) or using hmmsearch using manually prepared hmm profiles (HMMER version 3.0; Eddy, 1998). The top hit was considered as a true ortholog only if the reciprocal BLAST search returned the query protein in T. brucei.’
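
      The reciprocal-best-hit criterion described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the BLAST results have already been parsed into per-query lists of (subject, bit score) pairs, and all identifiers and scores are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Parsed search results: query ID -> list of (subject ID, bit score).
# This in-memory layout is an assumption made for illustration.
Hits = Dict[str, List[Tuple[str, float]]]


def best_hit(hits: Hits, query: str) -> Optional[str]:
    """Return the highest-scoring subject for `query`, or None if it has no hits."""
    candidates = hits.get(query, [])
    if not candidates:
        return None
    return max(candidates, key=lambda hit: hit[1])[0]


def reciprocal_best_hit(forward: Hits, reverse: Hits, query: str) -> Optional[str]:
    """Return the top forward hit only if its own top reverse hit is `query`."""
    top = best_hit(forward, query)
    if top is not None and best_hit(reverse, top) == query:
        return top
    return None


# Hypothetical identifiers and scores, for illustration only.
forward = {
    "Tb_KIN-A": [("Lm_0001", 410.0), ("Lm_0002", 120.0)],
    "Tb_KIN-X": [("Lm_0001", 100.0)],
}
reverse = {
    "Lm_0001": [("Tb_KIN-A", 395.0), ("Tb_KIN-X", 90.0)],
    "Lm_0002": [("Tb_KIN-X", 200.0), ("Tb_KIN-A", 110.0)],
}
```

      In this toy data set, "Tb_KIN-A" and "Lm_0001" pick each other as top hits and are therefore accepted, whereas "Tb_KIN-X" is rejected because its top hit points back to a different T. brucei protein.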

      (8) Line 156. For non-experts of Trypanosoma cell biology, it is not clear how the nucleolar localization is defined.

      The nucleolus in T. brucei is discernible as a DAPI-dim region in the nucleus.

      (9) Fig.2G and Fig.S2F. These data imply that the coiled-coil and C-terminal tail domains of KIN-A/KIN-B are important for anaphase spindle midzone enrichment. However, it is odd that this was not mentioned. This reviewer recommends that the authors quantify the midzone localization data of these constructs and discuss the role of the coiled-coil domains.

      One possibility is that KIN-A and KIN-B need to form a complex (via their coiled-coil domains) to localize to the spindle midzone. Another likely possibility, which is discussed in the manuscript, is that N-terminal tagging of KIN-A impairs motor activity. This is supported by the fact that the central spindle localization is also disrupted in full-length GFP-KIN-A. We decided not to provide a quantification for these data due to low sample sizes for some of the constructs (e.g. expression not observed in all cells).

      (10) Line 288-289, "pLDDT scores improved significantly for KIN-A CD1 in complex with KKT9:KKT11 (>80) compared to KIN-A CD1 alone (~20) (Figs. S3, A and B)." I can see that pLDDT score is about 20 at KIN-A CD1 from Figs S3A, but the basis of pLDDT > 80 upon inclusion of KKT9:KKT11 is missing.

      We added the pLDDT and PAE plots for the AF2 prediction of KIN-A700-800 in complex with KKT9:KKT11 in Fig. S5B.

      (11) Fig. 5A. Since there is no supporting biochemical data for KIN-A-KKT9-KKT11 interaction, it is important to assess the stability of AlphaFold-based structural predictions of the KIN-A-KKT9-KKT11 interaction. Are there significant differences among the top 5 prediction results, and do these interactions remain stable after the "simulated annealing" process used in the AlphaFold predictions? Are predicted CD1-interacting regions/amino residues in KKT9 and KKT11 evolutionarily conserved?

      See above. The interaction was predicted in all 5 predictions as shown in Fig. S5B. Conservation of the CD1-interacting regions in KKT9 and KKT11 is shown below:

      Author response image 2.

      KKT9 (residues ~53 – 80 predicted to interact with KIN-A in T. brucei)

      Author response image 3.

      KKT11 (residues 61-85 predicted to interact with KIN-A in T. brucei)

      (12) Line 300, Fig. S5D and E, "failed to localize at kinetochores". From this resolution of the microscopy images, it is not clear if these proteins fail to localize at kinetochores as the KKT and KIN-A310-716 signals overlap. Perhaps, "failed to enrich at kinetochores" is a more appropriate statement.

      We changed this sentence according to the reviewer’s suggestion.

      (13) Line 309 and Fig 5D and F, "predominantly localized to the mitotic spindle". From this image shown in Fig 5D, it is not clear if KIN-A∆CD1-YFP and Aurora B are predominantly localized to the spindle or if they are still localized to centromeres that are misaligned on the spindle. Without microtubule staining, it is also not clear how microtubules are distributed in these cells. Please clarify how the presence or absence of kinetochore/spindle localization was defined.

      As shown in Fig. S5E and S5F, deletion of CD1 clearly impairs kinetochore localization of KIN-A (kinetochores marked by tdTomato-KKT2). Moreover, misalignment of kinetochores, as observed upon expression of the KIN-AG210A rigor mutant, would result in an increase in 2K1N cells and proliferation defects, which is not the case for the KIN-A∆CD1 mutant (Fig. 5H, Fig. S5I). KIN-A∆CD1-YFP appears to localize diffusely along the entire length of the mitotic spindle, whereas we still observe kinetochore-like foci in the rigor mutant. Unfortunately, we do not have suitable antibodies that would allow us to distinguish spindle microtubules from the vast subpellicular microtubule array present in T. brucei and hence need to rely on tagging spindle-associated proteins such as MAP103.

      (14) Fig. 5F, G, S5F. Along the same lines, it would be helpful to show example images for each category - "kinetochores", "kinetochores + spindle", and "spindle".

      As suggested by the reviewer, we have now included example images for each category (‘kinetochores’, ‘kinetochores + spindle’, ‘spindle’) along with a schematic illustration in Fig. 5F.

      (15) Line 332 and Fig. S6A. The experiment may be repeated in the presence of ATP or nonhydrolyzable ATP analogs.

      We thank the reviewer for the suggestion. We envisage such experiments for an in-depth follow-up study.

      (16) Line 342, "motor activity of KIN-A". Until KIN-A is shown to have motor activity, the result based on the rigor mutant does not show that the motor activity of KIN-A promotes chromosome congression. The result suggests that the ATPase activity of KIN-A is important.

      We changed that sentence as suggested by the reviewer.

      (17) Line 419 -. The authors base their discussion on the speculation that KIN-A is a plus-end directed motor. Please justify this speculation.

      Indeed, the notion that KIN-A is a plus-end directed motor remains a hypothesis, which is based on sequence alignments with other plus-end directed motors and the observation that the KIN-A motor domain is involved in translocation of the CPC to the central spindle in anaphase. We have modified the corresponding section in the discussion as follows:

      ‘It remains to be investigated whether KIN-A truly functions as a plus-end directed motor. The role of KIN-B in this context is equally unclear. Since KIN-B does not possess a functional kinesin motor domain, we deem it unlikely that the KIN-A:KIN-B heterodimer moves hand-over-hand along microtubules as do conventional (kinesin-1 family) kinesins. Rather, the KIN-A motor domain may function as a single-headed unit and drive processive plus-end directed motion using a mechanism similar to the kinesin-3 family kinesin KIF1A (Okada and Hirokawa, 1999).’

      (18) Line 422-423, "plus-end directed motion using a mechanism similar to kinesin-3 family kinesins (such as KIF1A)." Please cite a reference supporting this statement.

      See above. We cited a paper by (Okada and Hirokawa, 1999).

      Reviewer #2 (Recommendations For The Authors):

      Please provide a quantification of data shown in Figure 2F-H and described in lines 151-166.

      We decided not to provide a quantification for these data due to low sample sizes for some of the constructs (e.g. expression not observed in all cells).

      It appears as if the paper more or less follows a chronological order of the experiments that were performed before AF multimer enabled the insightful and compelling structural analysis. That is a matter of style, but in some cases, the writing could be updated, shortened, or re-arranged into a more logical order. Concrete examples:

      (i) Line 144: "we did not include CPC2 for further analysis in this study" Although CPC2 features at a prominent and interesting position in the predicted structures of the kinetoplastid CPC, shown in later main figures.

      We attempted RNAi-mediated depletion of CPC2 using two different shRNA constructs. However, we cannot exclude the possibility that the knockdown of CPC2 was less efficient compared with the other CPC subunits. For this reason, we decided to remove all the data on CPC2 from Fig. S2.

      (ii) The work with the KIN-A motor domain only and KIN-A ∆motor domain (Fig 2) begs the question about a more subtle mutation to interfere with the motor domain. Which is ultimately presented in Fig 6. I think that the final paragraph and Figure 6 follow naturally after Figure 2.

      We appreciate the suggestion. However, we would like to keep Figure 6 there.

      (iii) The high-confidence structural predictions in Fig 3 and Fig 4 are insightful. The XL-MS descriptions that precede them are not so helpful (Fig 3A and 4G and in the text). To emphasize their status as experimental support for the predicted structures, which is very important, it would be good to discuss the XL-MS after presenting the models.

      As suggested, we have re-arranged the text and/or figures such that the AF2 predictions are discussed first and the CLMS data are brought in afterwards.

      Figure 1A prominently features an arbitrary color code and a lot of protein IDs without a legend. That is not a very convincing start. Figure S1 is more informative, containing annotated protein names and results of the KIN-A and KIN-B IPs. Please improve Figure 1A, for example by presenting a modified version of Figure S1. In all these types of figures, please list both protein names and gene IDs.

      We agree with the reviewer that the IP-MS data in Fig. S1 is more informative and hence decided to swap the heatmaps in Fig. 1A and Fig. S1A. We further annotated the heatmap corresponding to the Aurora BAUK1 IP-MS (now presented in Fig. S1) as suggested by the reviewer.

      The visualization of the structural predictions is not consistent among figures:

      (i) The structure in Fig 4I is important and could be displayed larger. The pLDDT scores, and especially those of the non-displayed models, do not add much information and should not be a main panel. If the authors want to display the pLDDT scores, I recommend a panel (main or supplement) of the structure colored for local prediction confidences, as in Fig 5A.

      (ii) In Figure 5A itself, it is hard to follow the chains in general, and KIN-A in particular, since the structure is pLDDT-coloured. Please present an additional panel colored by chain (consistent with Fig 4I, as mentioned above).

      (iii) The summarizing diagram, currently displayed as Fig 4J, should be placed after Fig 5A and take the discovered KIN-A - KKT9-11 connection into account. Ideally, it also covers the suspected importance of the motor domain and serves as a summarising diagram.

      We thank the reviewer for the constructive comments. For each structure prediction, we now present two images side by side; one coloured by chain and one colored by pLDDT. We recently re-ran AF2 for the full CPC and also for the KKT7N-KKT8 complex, and got improved predictions. Hence some of the models in Fig. 3/S3 and Fig. 4/S4 have been updated accordingly. For the CLMS plots, we also decided to colour the cross-links according to whether the 30 angstrom distance constraints were fulfilled or not in the AF2 prediction. We also increased the size of the structures shown in Fig. 4. Furthermore, we decided to remove the summarizing diagram from Fig. 4 and instead made a new main Fig. 7, which shows a more detailed schematic, which also takes into account the proposed function of the KIN-A motor domain, as suggested by the reviewer, and other points addressed in the Discussion.

      The methods section for the structural predictions lacks essential information. Predictions can only be reproduced if the version of AF2 multimer v2.x is specified and key parameters are mentioned.

      As suggested, we have added the details in the Materials and Methods section as follows.

      ‘Structural predictions of KIN-A/KIN-B, KIN-A310-862/KIN-B317-624, CPC1/CPC2/KIN-A300-599/KIN-B317-624, and KIN-A700-800/KKT9/KKT11 were performed using ColabFold version 1.3.0 (AlphaFold-Multimer version 2), while those of AUK1/CPC1/CPC2/KIN-A1-599/KIN-B, KKT71-261/KKT9/KKT11/KKT8/KKT12, KKT9/KKT11/KKT8/KKT12, and KKT71-261/KKT9/KKT11 were performed using ColabFold version 1.5.3 (AlphaFold-Multimer version 2.3.1) using default settings, accessed via https://colab.research.google.com/github/sokrypton/ColabFold/blob/v1.3.0/AlphaFold2.ipynb and https://colab.research.google.com/github/sokrypton/ColabFold/blob/v1.5.3/AlphaFold2.ipynb.’

      Line 121, please explain the "Unexpectedly" by including a reference to the work from Li and colleagues. A statement with some details would be useful, as the difference between both studies appears to be crucial for the novelty of this paper. Alternatively, refer to this being covered in the discussion.

      To clarify this point, we modified this paragraph in the results section:

      ‘To our surprise, KIN-A-YFP and GFP-KIN-B exhibited a CPC-like localization pattern identical to that of Aurora BAUK1: Both kinesins localized to kinetochores from S phase to metaphase, and then translocated to the central spindle in anaphase (Figs. 1, C-E). Moreover, like Aurora BAUK1, a population of KIN-A and KIN-B localized at the new FAZ tip from late anaphase onwards (Figs. S1, B and C). This was unexpected, because KIN-A and KIN-B were previously reported to localize to the spindle but not to kinetochores or the new FAZ tip (Li et al., 2008). These data suggest that KIN-A and KIN-B are bona fide CPC proteins in trypanosomes, associating with Aurora BAUK1, INCENPCPC1 and CPC2 throughout the cell cycle.’

      Line 285 refers to "conserved" regions in the C-terminal part of KIN-A, referring to Figure 5. Please expand the MSA in Figure 5B to get an idea about the conservation/variation outside CD1 and CD2.

      We now present the full MSA for KIN-A proteins in kinetoplastids in Fig. S5A.

      Please specify what is meant by Line 367-369 for someone who is not familiar with the work by Komaki et al. 2022. Either clarify in the text or clarify in the text with data to support it.

      We updated the corresponding section in the discussion as follows:

      ‘Komaki et al. recently identified two functionally redundant CPC proteins in Arabidopsis, Borealin Related Interactor 1 and 2 (BORI1 and 2), which engage in a triple helix bundle with INCENP and Borealin using a conserved helical domain but employ an FHA domain instead of a BIR domain to read H3T3ph (Komaki et al., 2022).’

      Data presented in Figure S6A, the microtubule co-sedimentation assay, is not convincing since a substantial amount of KIN-A/B is pelleted in the absence of microtubules. Did the authors spin the proteins in BRB80 before the assay to continue with soluble material and reduce sedimentation in the absence of microtubules? If the authors want to keep the wording in lines 331-332, the MT-binding properties of KIN-A and KIN-B need to be investigated in more detail, for example with a titration and a quantification thereof. Otherwise, they should change the text and replace "confirms" with "is consistent with". In any case, the legend needs to be expanded to include more information.

      To address the point above, we have added the following text in the legend corresponding to Fig. S6:

      ‘Microtubule co-sedimentation assay with 6HIS-KIN-A2-309 (left) and 6HIS-KIN-B2-316 (right). S and P correspond to supernatant and pellet fractions, respectively. Note that both constructs to some extent sedimented even in the absence of microtubules. Hence, lack of microtubule binding for KIN-B may be due to the unstable non-functional protein used in this study.’

      We have also updated the main text in the results section:

      ‘We therefore speculated that anaphase translocation of the kinetoplastid CPC to the central spindle may involve the kinesin motor domain of KIN-A. KIN-B is unlikely to be a functional kinesin based on the absence of several well-conserved residues and motifs within the motor domain, which are fully present in KIN-A (Li et al., 2008). These include the P-loop, switch I and switch II motifs, which form the nucleotide binding cleft, and many conserved residues within the α4-L12 elements, which interact with tubulin (Fig. S6A) (Endow et al., 2010). Consistent with this, the motor domain of KIN-B, contrary to KIN-A, failed to localize to the mitotic spindle when expressed ectopically (Fig. S2E) and did not co-sediment with microtubules in our in vitro assay (Fig. S6B).’

      Details:

      The readability of the pAE plots could be improved by arranging sequences according to their position in the structure. For example in Fig4I, KKT8 could precede KKT12. If it is easy to update this, the authors might want to do so.

      We re-ran the AF2 predictions for the KKT7N – KKT8 complex in Fig. 4/S4 and changed the order according to the reviewer’s suggestion (KKT9:KKT11:KKT8:KKT12).

      The same paper is referred to as Je Van Hooff et al. 2017 and as Van Hooff et al. 2017

      Thank you for pointing this out. We have corrected the citation.

      Reviewer #3 (Recommendations For The Authors):

      (1) Please state at the end of the introduction/start of the results section that this work was performed in procyclic trypanosomes. Given that the cell cycles of procyclic and bloodstream forms differ, this is important.

      We added this information at the end of the introduction:

      ‘Here, by combining biochemical, structural and cell biological approaches in procyclic form T. brucei, we show that the trypanosome CPC is a pentameric complex comprising Aurora BAUK1, INCENPCPC1, CPC2 and the two orphan kinesins KIN-A and KIN-B.’

      (2) Please define NLS at first use (line 118), and for clarity, explain the rationale for using GFP with an NLS.

      NLS refers to a short ‘nuclear localization signal’ (TGRGHKRSREQ) (Marchetti et al., 2000), which ensures that the ectopically expressed construct is imported into the nucleus. When we previously expressed truncations of KKT2 and KKT3 kinetochore proteins, many fragments did not go into the nucleus presumably due to the lack of an NLS, which prevented us from determining which domains are responsible for their kinetochore localization. We have since consistently used this short NLS sequence in our inducible GFP fusions without any complications. We added a sentence in the Materials & Methods section under Trypanosome culture: ‘All constructs for ectopic expression of GFP fusion proteins include a short nuclear localization signal (NLS) (Marchetti et al., 2000).’ To avoid unnecessary confusion, we removed ‘NLS’ from the main text and figures.

      (3) Lines 148-150 - it would strengthen this claim if KIN-A/B protein levels were assessed by Western blot.

      We now present a Western blot in Fig. S2C, showing that bulk KIN-B levels are clearly reduced upon KIN-A RNAi. The same is true also to some extent for KIN-A levels upon KIN-B RNAi, although this is less obvious, possibly due to the lower efficiency of KIN-B compared to KIN-A RNAi as judged by fluorescence microscopy (quantified in Fig. 2D and 2E).

      (4) Line 253 - the text mentions the removal of both KKT9 and KKT11, which is not consistent with the figure (Fig 4H) - do you mean the removal of either KKT9 or KKT11?

      Yes, we thank the reviewer for pointing out this mistake in the text, which has now been corrected.

      (5) Line 337 - please include a reference for the G209A ATPase-defective rigor mutant - has this been shown to result in KIN-A being inactive previously?

      Please see our answer in the public review above.

      (6) It is not always obvious when fluorescent fusion proteins are being expressed endogenously or ectopically, or when they are being expressed in an RNAi background or not without tracing the cell lines in Table S1 - please ensure this is clearly stated throughout the manuscript.

      We have now made sure that this is clearly stated in the main text as well as in the figure legends.

      (7) Line 410 - 'KIN-A C-terminal tail is stuffed full of conserved CDK1CRK3 sites' - what does 'stuffed full' really mean (this is rather imprecise) and what are the consensus sites - are these CDK1 consensus sites that are assumed to be conserved for CRK3? I'm not aware of consensus sites for CRK3 having been determined, but if they have, this should be referenced.

      We have modified the corresponding section in the discussion as follows:

      ‘In support of this, the KIN-A C-terminal tail harbours many putative CRK3 sites (10 sites matching the minimal S/T-P consensus motif for CDKs) and is also heavily phosphorylated by Aurora BAUK1 in vitro (Ballmer et al. 2024). Finally, we speculate that the interaction of KIN-A motor domain with microtubules, coupled to the force generating ATP hydrolysis and possibly plus-end directed motion, eventually outcompetes the weakened interactions of the CPC with the kinetochore and facilitates the extraction of the CPC from chromosomes onto spindle microtubules during anaphase. Indeed, deletion of the KIN-A motor domain or impairment of its motor function through N-terminal GFP tagging causes the CPC to be trapped at kinetochores in anaphase. Central spindle localization is additionally dependent on the ATPase activity of the KIN-A motor domain as illustrated by the KIN-A rigor mutant.’
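
      Counting sites that match the minimal S/T-P consensus, as referred to above, can be sketched with a short regular-expression scan. The peptide below is hypothetical and is not the KIN-A tail sequence; the function name is likewise an illustrative choice, and a motif match is only a putative site, not evidence of phosphorylation.

```python
import re
from typing import List

# Minimal CDK consensus: serine or threonine immediately followed by proline.
MINIMAL_CDK_MOTIF = re.compile(r"[ST]P")


def find_sp_tp_sites(sequence: str) -> List[int]:
    """Return 1-based positions of the S/T residue in each S/T-P motif."""
    return [m.start() + 1 for m in MINIMAL_CDK_MOTIF.finditer(sequence.upper())]


# Hypothetical peptide, for illustration only.
example = "MASPKRTPLLSAPTTPG"
sites = find_sp_tp_sites(example)
```

      For the example peptide, the scan reports the serine at position 3 and the threonines at positions 7 and 15; applying the same function to a full-length protein sequence would give the total number of putative minimal-consensus sites.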

      (8) Lines 412-416: this proposal is written rather definitively - given no motor activity has been demonstrated for KIN-A, please make clear that this is still just a theory.

      See above.

      (9) Fig 1: KKT2 is not highlighted in Fig 1A - given this has been used for colocalization in Fig 1C-E, was it recovered, and if not, why not? Fig 1B-E: the S phase/1K1N terminology is somewhat misleading. Not all S phase cells will have elongated kinetoplasts - usually an asterisk is used to signify replicated DNA, not kinetoplast shape. If it is to be used here for elongation, then for consistency, N should be used for G2/mitotic cells.

      Fig. 1A (now Fig. S1A) only shows the top 30 hits. KKT2 was indeed recovered with Aurora BAUK1 (see Table S2) and is often used as a kinetochore marker in trypanosomes by our lab and others since the signal of fluorescently tagged KKT2 is relatively bright and KKT2 localizes to centromeres throughout the cell cycle.

      (10) A general comment for all image figures is that these do not have accompanying brightfield images and it is therefore difficult to know where the cell body is, or sometimes which nuclei and kinetoplasts belong to which cell where DNA from more than one cell is within the image. It would be beneficial if brightfield images could be added, or alternatively, the cell outlines were traced onto DAPI or merged images. Also, brightfield images would allow the stage of cytokinesis (pre-furrowing/furrowing/abscission) in anaphase cells to be determined.

      Since this study primarily addresses the recruitment mechanism of the CPC to kinetochores and to the central spindle from S phase to metaphase and in anaphase, respectively, and CPC proteins are not observed outside of the nucleus during these cell cycle stages, we did not present brightfield images in the figures. However, this point is particularly valid for discerning the localization of KIN-A and KIN-B to the new FAZ tip from late anaphase onwards. Hence, we acquired new microscopy data for Fig. S1B and S1C, which now includes phase contrast images, and have chosen representative cells in late anaphase and telophase. We hope that the signal of Aurora BAUK1, KIN-A and KIN-B at the anterior end of the new FAZ can now be distinguished more clearly.

      (11) Fig 2A: legend should state that the micrographs show the localisation of the proteins within the nucleus as whole cells are not shown. 2C: can INCENP not be split into 2 lines - the 'IN' looks like 1N at first glance, which is confusing.

      We have applied the suggested change in Fig. 2.

      (12) Fig 3 (and other AF2 figures): Could the lines for satisfied & not satisfied in the key be thicker so they more closely resemble the lines in the figure and are less likely to be confused with the disordered regions of the CPC components?

      We have now made those lines thicker.

      (13) Why were different E value thresholds used in Fig 3 and Fig 4?

      The CLMS data in Fig. 3 and Fig. 4 now both use the same E value threshold of E-3 (previously E-4 was used in Fig. 4). To determine a sensible significance threshold, we included some yeast protein sequences (‘false positives’) in the database used in pLink2 for identification of crosslinked peptides. Note that we recently also re-ran AF2 for the full CPC and for the KKT7N-KKT8 complex and got improved predictions. Hence some of the models in Fig. 3/S3 and Fig. 4/S4 have been updated accordingly. For the CLMS plots, we also decided to colour the cross-links according to whether the 30 angstrom distance constraints were fulfilled or not in the AF2 prediction.

      (14) Fig 4H legend - please give the expected sizes of these recombinant proteins & check the 3rd elution panel (see public review comments).

      See above response in public review.

      (15) Fig 4I - please explain what the colours of the PAE plot and the values in the key signify, as well as how the Scored Residue values are arrived at. Please also define the pLDDT in the legend.

      We have cited DeepMind’s 2021 methods paper, in which the outputs of AlphaFold are explained in detail. We also added a short description of the pLDDT and PAE scores and the corresponding colour coding in the legends of Fig. 3 and Fig. 4, respectively.

      From figure 3 legend:

      ‘(B) Cartoon representation showing two orientations of the trypanosome CPC, coloured by protein on the left (Aurora BAUK1: crimson, INCENPCPC1: green, CPC2: cyan, KIN-A: magenta, and KIN-B: yellow) or according to their pLDDT values on the right, assembled from AlphaFold2 predictions shown in Figure S3. The pLDDT score is a per-residue estimate of the confidence in the AlphaFold prediction on a scale from 0 – 100. pLDDT > 70 (blue, cyan) indicates a reasonable accuracy of the model, while pLDDT < 50 (red) indicates a low accuracy and often reflects disordered regions of the protein (Jumper et al., 2021). BS3 crosslinks in (B) were mapped onto the model using PyXlinkViewer (blue = distance constraints satisfied, red = distance constraints violated, Cα-Cα Euclidean distance threshold = 30 Å) (Schiffrin et al., 2020).’

      From Figure 4 legend:

      ‘(G) AlphaFold2 model of the KKT7 – KKT8 complex, coloured by protein (KKT71-261: green, KKT8: blue, KKT12: pink, KKT9: cyan and KKT11: orange) (left) and by pLDDT (center). BS3 crosslinks in (H) were mapped onto the model using PyXlinkViewer (Schiffrin et al., 2020) (blue = distance constraints satisfied, red = distance constraints violated, Cα-Cα Euclidean distance threshold = 30 Å). Right: Predicted Aligned Error (PAE) plot of model shown on the left (rank_2). The colour indicates AlphaFold’s expected position error (blue = low, red = high) at the residue on the x axis if the predicted and true structures were aligned on the residue on the y axis (Jumper et al., 2021).’
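      The distance check described in both legends (a crosslink counts as satisfied when the Cα-Cα Euclidean distance in the model is at most 30 Å) can be illustrated with a minimal sketch. This is not the PyXlinkViewer implementation; the coordinates, chain labels, residue numbers and the function name `classify_crosslinks` are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical C-alpha coordinates (in angstroms), keyed by (chain, residue number).
# These values are invented for illustration and do not come from any model.
CA_COORDS = {
    ("A", 10): (0.0, 0.0, 0.0),
    ("B", 25): (12.0, 16.0, 0.0),  # 20 A away from A:10
    ("C", 40): (30.0, 40.0, 0.0),  # 50 A away from A:10
}

# Hypothetical crosslinked residue pairs.
CROSSLINKS = [
    (("A", 10), ("B", 25)),
    (("A", 10), ("C", 40)),
]

# Distance threshold stated in the figure legends.
THRESHOLD_ANGSTROM = 30.0


def classify_crosslinks(coords, links, threshold=THRESHOLD_ANGSTROM):
    """Return (residue_a, residue_b, distance, satisfied) for each crosslink."""
    results = []
    for res_a, res_b in links:
        # Euclidean distance between the two C-alpha atoms.
        distance = math.dist(coords[res_a], coords[res_b])
        results.append((res_a, res_b, distance, distance <= threshold))
    return results


if __name__ == "__main__":
    for res_a, res_b, distance, ok in classify_crosslinks(CA_COORDS, CROSSLINKS):
        label = "satisfied" if ok else "violated"
        print(f"{res_a} - {res_b}: {distance:.1f} A ({label})")
```

      In the figures, satisfied links would correspond to the blue lines and violated links to the red lines.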

      (16) Fig 6 legend - Line 730 should say (F) not (C).

      Thank you for pointing out this typo.

      (17) Fig S1A - a key is missing for the colours. Fig S1B/C - cell outlines or a brightfield image are really needed here - see earlier comment. Fig S1D - there doesn't seem to be a method for how this tree was generated.

      See above response in public review regarding Fig. S1A and S1B/C. The tree in Fig. S1D is based on (Butenko et al., 2020).

      (18) Fig S2: A: how was protein knockdown validated (especially for CPC2 where there was little obvious phenotype)? Fig S2B: the y-axis should read proportion of cells, not percentage. Fig S2E - NLS should be labelled.

      Thank you for pointing out the mistake in the labelling.

      (19) Fig S3: PAE plots should be labelled with protein names, not A-E. Similarly, the pLDDT plots should be labelled as in Fig 4I.

      We have corrected the labelling in Fig. S3.

      (20) Fig S5A-D - cell cycle stage labels are missing from images.

      Thank you for pointing out the missing cell cycle stage labels.

      Addition by editor:

      In line 126 the statement that KIN-A and KIN-B "associate with Aurora-AUK1, INCENP-CPC1 and CPC2 throughout the cell cycle" seems too strong. There is no direct evidence for this. Please re-phrase as "likely associate" or "suggest... that ... may...".

      We have modified that sentence according to the editor’s suggestion.

      References:

      Akiyoshi, B., and K. Gull. 2014. Discovery of Unconventional Kinetochores in Kinetoplastids. Cell. 156. doi:10.1016/j.cell.2014.01.049.

      Butenko, A., F.R. Opperdoes, O. Flegontova, A. Horák, V. Hampl, P. Keeling, R.M.R. Gawryluk, D. Tikhonenkov, P. Flegontov, and J. Lukeš. 2020. Evolution of metabolic capabilities and molecular features of diplonemids, kinetoplastids, and euglenids. BMC Biology. 18:1–28. doi:10.1186/S12915-020-0754-1.

      Cormier, A., D.G. Drubin, and G. Barnes. 2013. Phosphorylation regulates kinase and microtubule binding activities of the budding yeast chromosomal passenger complex in vitro. J Biol Chem. 288:23203–23211. doi:10.1074/JBC.M113.491480.

      Endow, S.A., F.J. Kull, and H. Liu. 2010. Kinesins at a glance. J Cell Sci. 123:3420. doi:10.1242/JCS.064113.

      Fink, S., K. Turnbull, A. Desai, and C.S. Campbell. 2017. An engineered minimal chromosomal passenger complex reveals a role for INCENP/Sli15 spindle association in chromosome biorientation. J Cell Biol. 216:911–923. doi:10.1083/JCB.201609123.

      van der Horst, A., M.J.M. Vromans, K. Bouwman, M.S. van der Waal, M.A. Hadders, and S.M.A. Lens. 2015. Inter-domain Cooperation in INCENP Promotes Aurora B Relocation from Centromeres to Microtubules. Cell Rep. 12:380–387. doi:10.1016/J.CELREP.2015.06.038.

      Ishii, M., and B. Akiyoshi. 2020. Characterization of unconventional kinetochore kinases KKT10/19 in Trypanosoma brucei. J Cell Sci. doi:10.1242/jcs.240978.

      Jeyaprakash, A.A., C. Basquin, U. Jayachandran, and E. Conti. 2011. Structural Basis for the Recognition of Phosphorylated Histone H3 by the Survivin Subunit of the Chromosomal Passenger Complex. Structure. 19:1625–1634. doi:10.1016/J.STR.2011.09.002.

      Jeyaprakash, A.A., U.R. Klein, D. Lindner, J. Ebert, E.A. Nigg, and E. Conti. 2007. Structure of a Survivin–Borealin–INCENP Core Complex Reveals How Chromosomal Passengers Travel Together. Cell. 131. doi:10.1016/j.cell.2007.07.045.

      Jumper, J., R. Evans, A. Pritzel, T. Green, M. Figurnov, O. Ronneberger, K. Tunyasuvunakool, R. Bates, A. Žídek, A. Potapenko, A. Bridgland, C. Meyer, S.A.A. Kohl, A.J. Ballard, A. Cowie, B. Romera-Paredes, S. Nikolov, R. Jain, J. Adler, T. Back, S. Petersen, D. Reiman, E. Clancy, M. Zielinski, M. Steinegger, M. Pacholska, T. Berghammer, S. Bodenstein, D. Silver, O. Vinyals, A.W. Senior, K. Kavukcuoglu, P. Kohli, and D. Hassabis. 2021. Highly accurate protein structure prediction with AlphaFold. Nature. 596:583–589. doi:10.1038/s41586-021-03819-2.

      Kang, J.S., I.M. Cheeseman, G. Kallstrom, S. Velmurugan, G. Barnes, and C.S.M. Chan. 2001. Functional cooperation of Dam1, Ipl1, and the inner centromere protein (INCENP)-related protein Sli15 during chromosome segregation. J Cell Biol. 155:763–774. doi:10.1083/JCB.200105029.

      Klein, U.R., E.A. Nigg, and U. Gruneberg. 2006. Centromere targeting of the chromosomal passenger complex requires a ternary subcomplex of Borealin, Survivin, and the N-terminal domain of INCENP. Mol Biol Cell. 17:2547–2558. doi:10.1091/MBC.E05-12-1133.

      Komaki, S., E.C. Tromer, G. De Jaeger, N. De Winne, M. Heese, and A. Schnittger. 2022. Molecular convergence by differential domain acquisition is a hallmark of chromosomal passenger complex evolution. Proc Natl Acad Sci U S A. 119. doi:10.1073/PNAS.2200108119/-/DCSUPPLEMENTAL.

      Li, Z. 2012. Regulation of the Cell Division Cycle in Trypanosoma brucei. Eukaryot Cell. 11:1180. doi:10.1128/EC.00145-12.

      Li, Z., J.H. Lee, F. Chu, A.L. Burlingame, A. Günzl, and C.C. Wang. 2008. Identification of a Novel Chromosomal Passenger Complex and Its Unique Localization during Cytokinesis in Trypanosoma brucei. PLoS One. 3. doi:10.1371/journal.pone.0002354.

      Mackay, A.M., D.M. Eckley, C. Chue, and W.C. Earnshaw. 1993. Molecular analysis of the INCENPs (inner centromere proteins): separate domains are required for association with microtubules during interphase and with the central spindle during anaphase. J Cell Biol. 123:373–385. doi:10.1083/JCB.123.2.373.

      Marchetti, M.A., C. Tschudi, H. Kwon, S.L. Wolin, and E. Ullu. 2000. Import of proteins into the trypanosome nucleus and their distribution at karyokinesis. J Cell Sci. 113 ( Pt 5):899–906. doi:10.1242/JCS.113.5.899.

      Nakajima, Y., A. Cormier, R.G. Tyers, A. Pigula, Y. Peng, D.G. Drubin, and G. Barnes. 2011. Ipl1/Aurora-dependent phosphorylation of Sli15/INCENP regulates CPC-spindle interaction to ensure proper microtubule dynamics. J Cell Biol. 194:137–153. doi:10.1083/JCB.201009137.

      Noujaim, M., S. Bechstedt, M. Wieczorek, and G.J. Brouhard. 2014. Microtubules accelerate the kinase activity of Aurora-B by a reduction in dimensionality. PLoS One. 9. doi:10.1371/JOURNAL.PONE.0086786.

      Okada, Y., and N. Hirokawa. 1999. A processive single-headed motor: Kinesin superfamily protein KIF1A. Science. 283:1152–1157. doi:10.1126/SCIENCE.283.5405.1152.

      Rice, S., A.W. Lin, D. Safer, C.L. Hart, N. Naber, B.O. Carragher, S.M. Cain, E. Pechatnikova, E.M. Wilson-Kubalek, M. Whittaker, E. Pate, R. Cooke, E.W. Taylor, R.A. Milligan, and R.D. Vale. 1999. A structural change in the kinesin motor protein that drives motility. Nature. 402:778–784. doi:10.1038/45483.

      Sablin, E.P., F.J. Kull, R. Cooke, R.D. Vale, and R.J. Fletterick. 1996. Crystal structure of the motor domain of the kinesin-related motor ncd. Nature. 380:555–559. doi:10.1038/380555a0.

      Samejima, K., M. Platani, M. Wolny, H. Ogawa, G. Vargiu, P.J. Knight, M. Peckham, and W.C. Earnshaw. 2015. The Inner Centromere Protein (INCENP) Coil Is a Single α-Helix (SAH) Domain That Binds Directly to Microtubules and Is Important for Chromosome Passenger Complex (CPC) Localization and Function in Mitosis. J Biol Chem. 290:21460–21472. doi:10.1074/JBC.M115.645317.

      Schiffrin, B., S.E. Radford, D.J. Brockwell, and A.N. Calabrese. 2020. PyXlinkViewer: A flexible tool for visualization of protein chemical crosslinking data within the PyMOL molecular graphics system. Protein Sci. 29:1851–1857. doi:10.1002/PRO.3902.

    3. Reviewer #1 (Public Review):

      Summary:

      The CPC plays multiple essential roles in mitosis such as kinetochore-microtubule attachment regulation, kinetochore assembly, spindle assembly checkpoint activation, anaphase spindle stabilization, cytokinesis, and nuclear envelope formation, as it dynamically changes its mitotic localization: it is enriched at inner centromeres from prophase to metaphase but it is relocalized at the spindle midzone in anaphase. The business end of the CPC is Aurora B and its allosteric activation module IN-box, which is located at the C-terminus of INCENP. In most well-studied eukaryotic species, Aurora B activity is locally controlled by the localization module of the CPC, Survivin, Borealin and the N-terminal portion of INCENP. Survivin and Borealin, which bind the N-terminus of INCENP, recognize histone residues that are specifically phosphorylated in mitosis, while anaphase spindle midzone localization is supported by the direct microtubule-binding capacity of the SAH (single alpha helix) domain of INCENP and other microtubule-binding proteins that specifically interact with INCENP during anaphase, which are under the regulation of CDK activity. One of these examples includes the kinesin-like protein MKLP2 in vertebrates. Trypanosoma is an evolutionarily interesting species to study mitosis since its kinetochore and centromere proteins do not show any similarity to other major branches of eukaryotes, while orthologs of Aurora B and INCENP have been identified. Combining molecular genetics, imaging, biochemistry, cross-linking IP-MS (IP-CLMS), and structural modeling, this manuscript reveals that two orphan kinesin-like proteins KIN-A and KIN-B act as localization modules of the CPC in Trypanosoma brucei. 
      The IP-CLMS, AlphaFold2 structural predictions, and domain deletion analysis support the idea that (1) KIN-A and KIN-B form a heterodimer via their coiled-coil domains, (2) two alpha helices of INCENP interact with the coiled-coil of the KIN-A-KIN-B heterodimer, (3) conserved KIN-A C-terminal CD1 and CD2 interact with the heterodimeric KKT9-KKT11 complex, which is a submodule of the KKT7-KKT8 kinetochore complex composed of KKT7, KKT8, KKT9, KKT11, and KKT12 unique to Trypanosoma, (4) KIN-A and KIN-B coiled-coil domains and the KKT7-KKT8 complex are required for CPC localization at the centromere, (5) CD1 and CD2 domains of KIN-A support its centromere localization. The authors further introduced a KIN-A rigor mutant and knocked-down wild-type KIN-A to show that the ATPase activity of KIN-A seems dispensable for centromere targeting but critical for spindle midzone enrichment of the CPC. The imaging data of the KIN-A rigor mutant suggest that dynamic KIN-A-microtubule interaction is required for metaphase alignment of the kinetochores and proliferation. Overall, the study reveals novel pathways of CPC localization regulation via KIN-A and KIN-B by multiple complementary approaches.

      Strengths:

      The major conclusion is collectively supported by multiple approaches, combining CRISPR-mediated gene deletion and complementation/site-specific genome engineering, epistasis analysis of cellular localization, AlphaFold2 structure prediction of protein complexes, IP-CLMS and biochemical reconstitution (the complex of KKT8, KKT9, KKT11 and KKT12).

      Weaknesses:

      Minor weakness. The authors imply that KIN-A, but not KIN-B, interacts with microtubules based on a microtubule pelleting assay (Fig. S6), but the substantial insoluble fractions of 6HIS-KIN-A and 6HIS-KIN-B make it difficult to conclusively interpret the data. It is possible that these two proteins are not stable unless they form a heterodimer.

    4. Reviewer #2 (Public Review):

      How the chromosomal passenger complex (CPC) and its subunit Aurora B kinase regulate kinetochore-microtubule attachment, and how the CPC relocates from kinetochores to the spindle midzone as a cell transitions from metaphase to anaphase are questions of great interest. In this study, Ballmer and Akiyoshi take a deep dive into the CPC in T. brucei, a kinetoplastid parasite with a kinetochore composition that varies greatly from other organisms.

      Using a combination of approaches, most importantly in silico protein predictions using AlphaFold-Multimer and light microscopy in dividing T. brucei, the authors convincingly present and analyse the composition of the T. brucei CPC. This includes the identification of KIN-A and KIN-B, proteins of the kinesin family. This is a clear advancement over earlier work, for example by Li and colleagues in 2008. The involvement of KIN-A and KIN-B is of particular interest, as it provides a clue for the (re)localization of the CPC during the cell cycle. The evolutionary perspective makes the paper potentially interesting for a wide audience of cell biologists, a point that the authors bring across properly in the title, the abstract, and their discussion.

      The evolutionary twist of the paper would be strengthened 'experimentally' by predictions of the structure of the CPC beyond T. brucei. Depending on how far the authors can extend their in-silico analysis, it would be of interest to discuss a) available/predicted CPC structures in well-studied organisms and b) structural predictions in other euglenozoa. What are the general structural properties of the CPC (e.g. flexible linkers, overall dimensions, structural differences when subunits are missing etc.)? How common is the involvement of kinesin-like proteins?

    5. Reviewer #3 (Public Review):

      Summary:

      The protein kinase, Aurora B, is a critical regulator of mitosis and cytokinesis in eukaryotes, exhibiting a dynamic localisation. As part of the Chromosomal Passenger Complex (CPC), along with the Aurora B activator, INCENP, and the CPC localisation module comprised of Borealin and Survivin, Aurora B travels from the kinetochores at metaphase to the spindle midzone at anaphase, which ensures its substrates are phosphorylated in a time- and space-dependent manner. In the kinetoplastid parasite, T. brucei, the Aurora B orthologue (AUK1), along with an INCENP orthologue known as CPC1, and a kinetoplastid-specific protein CPC2, also displays a dynamic localisation, moving from the kinetochores at metaphase, to the spindle midzone at anaphase, to the anterior end of the newly synthesised flagellum attachment zone (FAZ) at cytokinesis. However, the trypanosome CPC lacks orthologues of Borealin and Survivin, and T. brucei kinetochores also have a unique composition, being comprised of dozens of kinetoplastid-specific proteins (KKTs). Of particular importance for this study are KKT7 and the KKT8 complex (comprising KKT8, KKT9, KKT11, and KKT12). Here, Ballmer and Akiyoshi seek to understand how the CPC assembles and is targeted to its different locations during the cell cycle in T. brucei.

      Strengths & Weaknesses:

      Using immunoprecipitation and mass-spectrometry approaches, Ballmer and Akiyoshi show that AUK1, CPC1, and CPC2 associate with two orphan kinesins, KIN-A and KIN-B, and with the use of endogenously expressed fluorescent fusion proteins, demonstrate for the first time that KIN-A and KIN-B display a dynamic localisation pattern similar to other components of the CPC, providing compelling evidence for KIN-A and KIN-B being bona fide CPC proteins.

      They then demonstrate, by using RNAi to deplete individual components, that the CPC proteins have hierarchical interdependencies for their localisation to the kinetochores at metaphase. These experiments appear to have been well performed.

      Ballmer and Akiyoshi then go on to determine the kinetochore localisation domains of KIN-A and KIN-B. Using ectopically expressed GFP-tagged truncations, they show that coiled coil domains within KIN-A and KIN-B, as well as a disordered C-terminal tail present only in KIN-A, but not the N-terminal motor domains of KIN-A or KIN-B, are required for kinetochore localisation. These data are strengthened by immunoprecipitating CPC complexes and crosslinking them prior to mass spectrometry analysis (IP-CLMS), a state-of-the-art approach, to determine the contacts between the CPC components. Structural predictions of the CPC structure are also made using AlphaFold2, suggesting that coiled coils form between KIN-A and KIN-B, and that KIN-A/B interact with the N termini of CPC1 and CPC2. Experimental results showing that CPC1 and CPC2 are unable to localise to kinetochores if they lack their N-terminal domains are consistent with these predictions. Altogether these data provide compelling evidence of the protein domains required for CPC kinetochore localisation and CPC protein interactions and indicate that both KIN-A and KIN-B have a role to play.

      Next, using a mixture of RNAi depletion and LacI-LacO recruitment experiments, the authors show that kinetochore proteins KKT7 and KKT9 are required for AUK1 to localise to kinetochores (other KKT8 complex components were not tested here) and that all components of the KKT8 complex are required for KIN-A kinetochore localisation. Further, both KKT7 and KKT8 were able to recruit AUK1 to an ectopic locus in S phase, and KKT7 recruited KKT8 complex proteins, indicating it is upstream of KKT8, in line with previous work showing kinetochore localization of KKT7 is unaffected by disruption of the KKT8 complex. This leads to the conclusion that the KKT8 complex is likely the main kinetochore receptor of the CPC.

      Further IP-CLMS experiments, in combination with recombinant protein pull down assays and structural predictions, suggested that within the KKT8 complex, there are two subcomplexes of KKT8:KKT12 and KKT9:KKT11, and that KKT7 interacts with KKT9:KKT11 to recruit the remainder of the KKT8 complex. The authors also assess the interdependencies between KKT8 complex components for localisation and expression, showing that all four subunits are required for the assembly of a stable KKT8 complex and present AlphaFold2 structural modelling data to support the two subcomplex model. In general, these data are of high quality and convincing, although it is a shame that data showing the effects of KKT8, KKT9 and KKT12 depletion on KKT11 localisation and abundance could not be presented alongside the reciprocal experiments in Fig S4I-L.

      The authors also convincingly show that AlphaFold2 predictions of interactions between KKT9:KKT11 and a conserved domain (CD1) in the C-terminal tail of KIN-A are correct, with CD1 and a second conserved domain, CD2, identified through sequence analysis, acting synergistically to promote KIN-A kinetochore localisation at metaphase, but not being required for KIN-A to move to the central spindle at anaphase. They then hypothesise that the kinesin motor domain of KIN-A (but not KIN-B which is predicted to be inactive based on non-conservation of residues key for activity) determines its central spindle localisation at anaphase through binding to microtubules. In support of this hypothesis, the authors show that KIN-A, but not KIN-B can bind microtubules in vitro and in vivo. However, ectopically expressed GFP-NLS fusions of full length KIN-A or KIN-A motor domain did not localise to the central spindle at anaphase. The authors suggest this is due to the GFP fusion disrupting the ATPase activity of the motor domain, although they provide no evidence that this is the case. Instead, they replace endogenous KIN-A with a predicted ATPase-defective mutant (G210A), showing that while this still localises to kinetochores, the kinetochores were frequently misaligned at metaphase, and that it no longer concentrates at the central spindle (with concomitant mis-localisation of AUK1), causing cells to accumulate at anaphase. From these data, the authors conclude that KIN-A ATPase activity is required for chromosome congression to the metaphase plate and its central spindle localisation at anaphase. While these data are highly suggestive that KIN-A possesses ATPase activity, and that this activity is essential for its function, definitive biochemical evidence of KIN-A's ATPase activity is still lacking.

      Impact:

      Overall, this work uses a wide range of cutting edge molecular and structural predictive tools to provide a significant amount of new and detailed molecular data that shed light on the composition of the unusual trypanosome CPC and how it is assembled and targeted to different cellular locations during cell division. Given the fundamental nature of this research, it will be of interest to many parasitology researchers as well as cell biologists more generally, especially those working on aspects of mitosis and cell division, and those interested in the evolution of the CPC.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Public reviews:

      Reviewer #1 (Public Review):

      Summary:

      Songbirds provide a tractable system to examine neural mechanisms of sequence generation and variability. In past work, the projection from LMAN to RA (output of the anterior forebrain pathway) was shown to be critical for driving vocal variability during babbling, learning, and adulthood. LMAN is immediately adjacent to MMAN, which projects to HVC. MMAN is less well understood but, anatomically, appears to resemble LMAN in that it is the cortical output of a BG-thalamocortical loop. Because it projects to HVC, a major sequence generator for both syllable phonology and sequence, a strong prediction would be that MMAN drives sequence variability in the same way that LMAN drives phonological variability. This hypothesis predicts that MMAN lesions in a Bengalese finch would reduce sequence variability. Here, the authors test this hypothesis. They provide a surprising and important result that is well motivated and well analyzed: MMAN lesions increase sequence variability - this is exactly the opposite result from what would be predicted based on the functions of LMAN.

      Strengths:

      (1) A very important and surprising result shows that lesions of a frontal projection from MMAN to HVC, a sequence generator for birdsong, increase syntactical variability.

      (2) The choice of Bengalese finches, which have complex transition structures, to examine the mechanisms of sequence generation, enabled this important discovery.

      (3) The idea that frontal outputs of BG-cortical loops can generate vocal variability comes from lesions/inactivations of a parallel pathway from LMAN to RA. The difference between MMAN and LMAN functions is striking and important.

      Weaknesses:

      (1) If more attention were paid to how syllable phonology was (or was not) affected by MMAN lesions, the claims about the specific effects on sequence could be stronger.

      Reviewer #2 (Public Review):

      Summary:

      This study investigates the neural substrates of syntax variation in Bengalese finch songs. Here, the authors tested the effects of bilateral lesions of mMAN, a brain area with inputs to HVC, a premotor area required for song production. Lesions in mMAN induce variability in syntactic elements of song specifically through increased transition entropy, variability within stereotyped song elements known as chunks, and increases in the repeat number of individual syllables. These results suggest that mMAN projections to HVC contribute to multiple aspects of song syntax in the Bengalese finch. Overall the experiments are well-designed, the analysis excellent, and the results are of high interest.
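      For readers unfamiliar with the measure, transition entropy can be illustrated with a minimal sketch. It assumes the common definition, the Shannon entropy (in bits) of the distribution of syllables that follow a given syllable; the exact definition used in the manuscript may differ (for example, in weighting or in handling song boundaries). The function name and the toy songs below are hypothetical.

```python
import math
from collections import Counter, defaultdict


def transition_entropy(songs):
    """Per-syllable entropy (bits) of the next-syllable distribution.

    `songs` is an iterable of syllable-label sequences, e.g. strings in which
    each character stands for one syllable. This is an illustrative
    definition, not necessarily the one used in the study.
    """
    # Count how often each syllable is followed by each other syllable.
    counts = defaultdict(Counter)
    for song in songs:
        for current, following in zip(song, song[1:]):
            counts[current][following] += 1

    entropy = {}
    for current, followers in counts.items():
        total = sum(followers.values())
        # H = -sum(p * log2(p)) over observed transition probabilities.
        entropy[current] = -sum(
            (n / total) * math.log2(n / total) for n in followers.values()
        )
    return entropy


if __name__ == "__main__":
    # Toy example: 'a' is followed by 'b' or 'c' equally often (1 bit),
    # while 'b' and 'c' are always followed by 'a' (0 bits).
    toy_songs = ["abab", "acac"]
    for syllable, h in sorted(transition_entropy(toy_songs).items()):
        print(f"{syllable}: {abs(h):.2f} bits")
```

      Under this definition, a fully stereotyped transition has zero entropy, and entropy grows as the choice of the next syllable becomes more variable.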

      Strengths:

      The study identifies a novel role for mMAN, the medial magnocellular nucleus of the anterior nidopallium, in the control of syntactic variation within adult Bengalese finch song. This is of particular interest as multiple studies previously demonstrated that mMAN lesions do not affect song structure in zebra finches. The study undertakes a thorough analysis to characterise specific aspects of variability within the song of lesioned animals. The conclusions are well supported by the data.

      Weaknesses:

      The study would benefit from additional mechanistic information. A more fine-grained or reversible manipulation, such as brain cooling, might allow additional insights into how mMAN influences specific aspects of syntax structure. Are repeat number increases and transition entropy resulting from shared mechanisms within mMAN, or perhaps arising from differential output to downstream pathways (i.e. projections to HVC)? Similarly, unilateral manipulations would allow the authors to further test the hypothesis that mMAN is involved in inter-hemispheric synchronization.

      We thank the reviewers and editor for their encouraging and helpful comments and suggestions. We have revised the previous submission with new analyses and discussion to address points raised by the reviewers.

      Following the suggestion of Reviewer 1 we have added an analysis of the effects of mMAN lesions on syllable phonology, using a variety of measures. We have included 3 new Figure Supplements that detail our analyses and elaborate on these points.

      We agree with Reviewer 2 that reversible and unilateral manipulations would be interesting and potentially enable additional insights into the mechanisms by which mMAN influences song sequencing, and we are planning to perform such experiments in future studies.

      We made additional minor changes throughout the manuscript to address other points raised by the reviewers, and we thank them again for their time and effort in providing constructive feedback to improve our study.

      A complete point by point detailing of these changes is included below, interspersed with the reviewer comments.

      Reviewer #1 (Recommendations For The Authors):

      The opposite result from what would be predicted based on the functions of LMAN.

      Shoring up the paper's claims and ruling out alternative interpretations will require attention to the following issues:

      Major comments

      (1) Acoustic structure of syllables

      Line 294 & Sup. Figure 2, in some birds the syllable acoustic structures seem to be significantly different between the pre- and post-lesion condition, e.g. 'w' in Bird 1, 'g' in Bird 2, 'blm' in Bird 6. This observation seems to contradict the claim that acoustic structures are not affected by MMAN lesions.

      Related to the previous point, a more detailed analysis is needed to quantify the extent of acoustic changes caused by MMAN lesions. For example, do these pre- and post- lesion syllables form distinct clusters if embedded in a UMAP? Do more standard measures of syllable phonology (e.g. SAP similarity scores or feature distributions) show differences in pre- and post-MMAN lesion?

      We agree with the reviewer that there were individual syllables as illustrated in the average spectrograms of Figure 2 – figure supplement 1 that qualitatively differed between pre- and post-lesion recordings. We have followed the reviewer’s suggestion to quantify changes to syllable phonology using both similarity scores by Sound Analysis Pro (SAP) and a variety of identified acoustic features.

      In brief, these measures largely corroborate the conclusion that for most birds and syllables there was little or no difference in phonology between pre- and post-lesion songs, but that in a minority of cases syllables were altered noticeably (further detail below). In those cases where syllable phonology was altered, changes were not consistent across birds, and we cannot rule out off-target effects due to damage to structures or fibers of passage neighboring mMAN, so that it is unclear whether some subtle changes to syllable phonology can be attributed to mMAN lesions versus other causes. Future studies could more specifically examine whether damage to mMAN alone is sufficient in some cases to degrade syllable structure by using viral or other approaches that enable the more specific disruption of mMAN projection neurons.

      In practice, almost all syllables were identifiable in post-lesion songs so that we could unambiguously assign identity for purposes of evaluating effects of lesions on sequencing. Moreover, in any individual cases where there was ambiguity in syllable identity, we used the sequential context to assign the most likely label. Thus, any errors in assignment in such cases would have tended to reduce rather than accentuate the magnitude of reported sequencing effects. Lastly, each of the reported effects of mMAN lesions on sequencing was observed in multiple birds for which we detected no significant changes to syllable similarity.

      Further details of the analyses of syllable structure are detailed below, and have been added as new figure supplements:

      (1) Syllable similarity scores calculated using SAP (Sound Analysis Pro) (new Figure 2 – figure supplement 2). We compared pre-post lesion similarity scores for each syllable with self-similarity measures for the same syllables taken from separate control recordings before lesions. For comparison, we also included a cross-similarity score for syllables of different types. These measures confirmed the qualitative impression from spectrograms that for most birds there were no greater changes to syllable structure following lesions than were present across control recordings. For one bird, pre-post changes were significantly larger than changes across control recordings, but pre-post similarity remained higher than cross-similarity.

      (2) Analysis of fundamental frequency and coefficient of variation (CV) of fundamental frequency of select syllables for each bird before and after mMAN lesions (new Figure 2 – figure supplement 3). This analysis is directly comparable with the same analysis performed on LMAN lesions in Sakata, Hampton, Brainard (2008). We carried out this analysis in part to address changes to syllable structure that might have inadvertently arisen due to damage to LMAN, which sits immediately lateral to mMAN. In the Bengalese finch and zebra finch, lesions of LMAN cause little change to the mean fundamental frequency of individual syllables but cause a consistent reduction in the coefficient of variation (CV) of fundamental frequency across repeated renditions of a given syllable (Sakata, Hampton, Brainard 2008, Andalman, Fee 2009, Warren et al. 2011). We therefore supposed that unintended damage to LMAN or its projections to RA might have resulted in a reduction in the CV of syllables following mMAN lesions. Instead, we saw a modest increase in the CV of fundamental frequency (mean across birds of +20%; range -19 to +43%). These data suggest that off-target effects on LMAN were largely absent in our experiments (consistent with histology, e.g. Figure 1 – figure supplement 1).

      (3) Comparison of Entropy of spectral envelope (entS), Temporal centroid for the temporal envelope (meanT), First, second and third formants (F1, F2, F3), before and after lesions (calculated using the Python SoundSig toolbox; Elie and Theunissen 2016) (new Figure 2- figure supplement 4). Acoustic features generally showed little change between pre and post lesion songs. They highlight as relative outliers the same individual examples that stand out in the average spectrograms in Figure 2 – figure supplement 1.
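
      The CV comparison described in item (2) can be illustrated with a minimal sketch. It assumes the CV is the sample standard deviation divided by the mean across renditions of one syllable; the function names and the frequency values are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the CV: sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)


def percent_change(before, after):
    """Return the percent change from `before` to `after`."""
    return 100.0 * (after - before) / before


# Hypothetical fundamental-frequency values (Hz), one per rendition of a
# single syllable. Illustrative numbers only, not measurements from the study.
ff_pre = [600.0, 610.0, 590.0, 605.0, 595.0]
ff_post = [600.0, 615.0, 585.0, 608.0, 592.0]

cv_pre = coefficient_of_variation(ff_pre)
cv_post = coefficient_of_variation(ff_post)

# Change in rendition-to-rendition variability versus change in the mean.
cv_change = percent_change(cv_pre, cv_post)
mean_ff_change = percent_change(mean(ff_pre), mean(ff_post))
```

      In this toy example the mean fundamental frequency is unchanged while the CV rises, which is the pattern of interest when asking whether variability changed independently of the mean.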

      Author response image 1.

      Syllable similarity calculated using Sound Analysis Pro (SAP). ‘Self Similarity’ = Similarity comparison of syllables before mMAN lesions to syllables of the same type, taken from two separate control recordings before the lesions, ‘Pre vs Post’ = Similarity comparison of the same syllable types before and after mMAN lesions, ‘Cross Similarity’ = Similarity comparison of each syllable type to other syllable types. For Birds 1-2 and 4-7, ‘Self Similarity’ was not significantly different from ‘Pre vs Post’ Similarity (p>0.05, Wilcoxon sign rank test), while for Bird 3, there was a significant difference (p = 0.03, Wilcoxon sign rank test). For all birds ‘Pre vs Post’ was significantly different from ‘Cross Similarity’ (p<0.05, Wilcoxon sign rank test). On average, ‘Pre vs Post’ was 4.8 % less than ‘Self Similarity’ (range 0.2%-14%) while ‘Cross Similarity’ was 40% less than ‘Self Similarity’ (range 20.2%-56.3%). These measures confirm the qualitative impression from Figure 2- figure supplement 1 that for most birds and syllables there were no greater changes to syllable structure following lesions than were present across control recordings, and that pre-post similarity remained higher than cross-similarity, i.e. syllables remained clearly identifiable.

      Author response image 2.

      (A) CV of fundamental frequency (FF) of select syllables before and after mMAN lesions. In the Bengalese finch and zebra finch, lesions of lMAN, which sits immediately lateral to mMAN, cause a consistent reduction in the coefficient of variation (CV) of fundamental frequency across repeated renditions of a given syllable (Sakata, Hampton, Brainard 2008, Andalman, Fee 2009, Warren et al. 2011). We therefore supposed that unintended damage to lMAN or its projections to RA might have resulted in a reduction in the CV of syllables following mMAN lesions. Instead we saw a modest increase in the CV of fundamental frequency (p<0.05, Wilcoxon sign rank test; mean across birds of +20%; range -19 to +43%). These data suggest that it is unlikely that changes to syllable structure might have arisen due to accidental damage to lMAN. (B) Percent change in mean fundamental frequency after mMAN lesions vs mean fundamental frequency before mMAN lesions.

      Author response image 3.

      Selected acoustic features for all syllables in all birds before and after mMAN lesions. Different colors represent different syllable types per bird. ‘entS’ = Entropy of spectral envelope, ‘meanT’ = Temporal centroid for temporal envelope, ‘F1’ = First formant, ‘F2’= Second formant, ‘F3’ = Third formant. Acoustic features generally showed little change between pre and post lesion songs. They highlight as relative outliers the same individual examples that stand out in the average spectrograms in Figure 2 – figure supplement 1.

      (2) Shoring up claims of increased transitional variability

      Line 301 & Sup. Figure 1, in several birds (1, 2, 5, 6), seems that there is a downward trend for postlesion, i.e. the transition entropy gradually decreases with time. How to exclude the possibility that the increased variability is a transient effect, e.g. caused by surgery side effects or destabilization of circuits, which may eventually recover to normal?

      Transition entropy remains elevated for as long as the birds were followed in this study. While the persistence of the effects we observed is longer than transient effects such as those following Nif lesion in zebra finches (Otchy et al., 2015 ~2 days), we cannot rule out either recovery or further deterioration following lesions on much longer time scales, such as those reported by Kubikova et al., 2007 (X lesions, 6 months). We have now added data points for 4 birds where we had songs from later timepoints following lesions; for three of these birds, transition entropy remained elevated above the baseline values for 14 and 33 days, respectively (Figure 1 - figure supplement 2).
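
      For readers unfamiliar with the measure, the following is a minimal sketch of one common way to compute transition entropy from a syllable sequence. It assumes the frequency-weighted conditional entropy of the next syllable given the current one; the exact definition used in the manuscript is not restated here, and the example sequences are hypothetical.

```python
from collections import Counter, defaultdict
from math import log2


def transition_entropy(sequence):
    """Frequency-weighted conditional entropy (bits) of syllable transitions.

    Assumed definition: for each syllable, take the entropy of the
    distribution over the syllables that follow it, then average these
    entropies weighted by how often each syllable starts a transition.
    """
    followers_by_syllable = defaultdict(Counter)
    for current, following in zip(sequence, sequence[1:]):
        followers_by_syllable[current][following] += 1

    total = sum(sum(c.values()) for c in followers_by_syllable.values())
    entropy = 0.0
    for followers in followers_by_syllable.values():
        n_current = sum(followers.values())
        # Entropy of the distribution over syllables following this one.
        h_current = -sum(
            (n / n_current) * log2(n / n_current) for n in followers.values()
        )
        entropy += (n_current / total) * h_current
    return entropy


# Hypothetical sequences: a fully stereotyped song and one with a branch point.
stereotyped = list("abcabcabcabc")
branching = list("abacabacabac")  # 'a' is followed by 'b' or 'c' equally often

h_stereotyped = transition_entropy(stereotyped)
h_branching = transition_entropy(branching)
```

      Under this assumed definition a fully stereotyped sequence scores zero, and the score grows as transitions at branch points become less predictable.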

      Line 313 & Sup. Figure 4, the claim that "transitions that had low history dependence tended to show larger changes after mMAN lesions" needs better statistical support, because in Sup. Figure 4, the correlation is not significant.

      We apologize for the phrasing. We have changed the sentence to: “Consistent with the first possibility, we observed that there was a nonsignificant trend toward larger changes after mMAN lesion for transitions with low history dependence.”

      Figure 4C-D, only data from 5 out of 7 birds was included, did the other two birds not have repeats? If so, the authors need to be explicit on data exclusion.

      The reviewer’s inference is correct that in our dataset only 5 out of 7 birds had songs which contained repeat phrases. We have added the following sentence to state that explicitly: “In our dataset of 7 birds, only 5 birds had songs which contained repeat phrases.”

      Minor comments

      Sup. Figure 3, to help readers understand, 1) add symbols and arrows to point to the structures; 2) indicate the orientation of the slide, e.g. which direction is medial/lateral; 3) a negative control without lesion needs to be shown for comparison.

      We have made the suggested changes and updated new Figure 1- figure supplement 1.

      Author response image 4.

      Image of calcitonin gene-related peptide (CGRP)-stained frontal section (left) control and (right) bird 5. CGRP labels cells in both lMAN (seen in black to the left of the lesion) and mMAN (blue, intact; red, completely destroyed).

      A statistical test is needed for Sup. Figure 5B.

      We have modified the Figure legend for Figure 3 – figure supplement 1 as follows:

      “Change in transition entropy was not significantly different for transitions within chunks and at branchpoints (p> 0.05, Wilcoxon rank sum test)”

      Line 363, these can be moved to the Introduction, so readers have a better sense of what's already known about MMAN lesion.

      We have moved the sentence to Introduction.

      Fig 1e. RA also projects to DLM.

      Our intention was to focus on the connections involving mMAN; we have now added the connection in Figure 1E.

      Reviewer #2 (Recommendations For The Authors):

      Please address this issue in the discussion (no new experiments required): It would be interesting to consider how social context modulates the variability of the song. In these experiments, Bengalese finches were singing in isolation. How might changes in syntax be modulated by the presence of a female in directed song and in other social contexts?

      Thank you for your suggestion. One study (Jarvis et al., 1998) shows that ZENK expression in mMAN after singing does not differ between female-directed singing, undirected singing and singing in the presence of a male conspecific. This suggests that activity in mMAN might not be modulated by social context. But we agree that it would be interesting to test how a change in social context (which typically leads to reduced transition entropy) interacts with the increased variability we see after mMAN lesions. We have added the following sentences to the discussion:

      “In our study, we only recorded song sequencing of male Bengalese finches singing in isolation. Social context, such as female-directed song, can also change song sequencing (Hampton, Sakata and Brainard, 2009; Chen, Matheson and Sakata, 2016). It would be interesting to test whether mMAN plays a role in the social context-modulated changes in sequencing (Jarvis et al., 1998), similar to how lMAN contributes to social context-modulated changes in syllable structure (Sakata, Hampton and Brainard, 2008).”

    2. Reviewer #2 (Public Review):

      Summary:

      This study investigates the neural substrates of syntax variation in Bengalese finch song. Here, the authors tested the effects of bilateral lesions of mMAN, a brain area with inputs to HVC, a premotor area required for song production. Lesions in mMAN induce variability in syntactic elements of song specifically through increased transition entropy, variability within stereotyped song elements known as chunks and increases in the repeat number of individual syllables. These results suggest that mMAN projections to HVC contribute to multiple aspects of song syntax in the Bengalese finch. Overall the experiments are well-designed, the analysis excellent, and the results are of high interest.

      Strengths:

      The study identifies a novel role for mMAN, medial magnocellular nucleus of the anterior nidopallium, in the control of syntactic variation within adult Bengalese finch song. This is of particular interest as multiple studies previously demonstrated that mMAN lesions do not affect song structure in zebra finches. The study undertakes a thorough analysis to characterise specific aspects of variability within the song of lesioned animals. The conclusions are well supported by the data.

    3. eLife assessment

      Songbirds provide a tractable model system to study mechanisms of vocal production and sequencing, and past work showed that the lesions to LMAN, the output of a basal ganglia thalamocortical loop, reduced vocal variability, consistent with a role in motor exploration. In this fundamental work, the authors rigorously examined how lesions to an understudied neighboring region, MMAN, part of a parallel basal ganglia loop, affect singing in Bengalese finches, whose songs exhibit complex sequential transitions. The authors provide compelling evidence that MMAN lesions resulted in increased sequential variability but do not affect syllable acoustic structure, showing that distinct frontal systems can have distinct functions for producing and sequencing song syllables.

    4. Reviewer #1 (Public Review):

      Summary:

      Songbirds provide a tractable system to examine neural mechanisms of sequence generation and variability. In past work, the projection from LMAN to RA (output of the anterior forebrain pathway) was shown to be critical for driving vocal variability during babbling, learning, and adulthood. LMAN is immediately adjacent to MMAN, which projects to HVC. MMAN is less well understood but, anatomically, appears to resemble LMAN in that it is the cortical output of a BG-thalamocortical loop. Because it projects to HVC, a major sequence generator for both syllable phonology and sequence, a strong prediction would be that MMAN drives sequence variability in the same way that LMAN drives phonological variability. This hypothesis predicts that MMAN lesions in a Bengalese finch would reduce sequence variability. Here, the authors test this hypothesis. They provide a surprising and important result that is well motivated and well analyzed: MMAN lesions increase sequence variability - this is exactly the opposite result from what would be predicted based on the functions of LMAN.

      Strengths:

      (1) A very important and surprising result shows that lesions of a frontal projection from MMAN to HVC, a sequence generator for birdsong, increase syntactical variability.

      (2) The choice of Bengalese finches, which have complex transition structures, to examine the mechanisms of sequence generation, enabled this important discovery.

      (3) The idea that frontal outputs of BG-cortical loops can generate vocal variability comes from lesions/inactivations of a parallel pathway from LMAN to RA. The difference between MMAN and LMAN functions is striking and important.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Public Review

      [...] A particular strength of the present study is the structural characterization of human PURA, which is a challenging target for structural biology approaches. The molecular dynamics simulations are state-of-the-art, allowing a statistically meaningful assessment of the differences between wild-type and mutant proteins. The functional consequences of PURA mutations at the cellular level are fascinating, particularly the differential compartmentalization of wild-type and mutant PURA variants into certain subcellular condensates.

      Weaknesses that warrant rectification relate to (i) The interpretation of statistically non-significant effects seen in the molecular dynamic simulations.

      We removed from the manuscript the sentence which indicated that we analyzed statistically non-significant effects. Therefore, the above statement has been resolved.

      (ii) The statistical analysis of the differential compartmentalization of PURA variants into processing bodies vs. stress granules, and

      We re-analyzed all cell-biological data and adjusted the statistical analysis of the P-body and stress-granule intensity measurements. The new and improved statistics have replaced the original analyses in the corresponding figures (Figs. 1C and 2B).

      (iii) Insufficient documentation of protein expression levels and knock-down efficiencies.

      Quantification of protein expression levels by Western blotting is shown in Appendix Figure S1. Quantification of knock-down efficiencies by Western blot experiments is shown in Appendix Figure S3.

      Recommendations for the authors: Reviewer #1

      Concerns and Suggested Changes

      (a) I have only one concern about the computational part and that is about statements such as "There are also large differences in the residue surrounding the mutation spot (residues 90 to 100), where the K97E mutant also shows much greater fluctuation. However, these differences are not significant due to the large standard deviations." If the differences are not statistically significant, then I would suggest either removing such a statement or increasing the statistics.

      We agree with the Reviewer’s comment. We removed this sentence from the text.

      Recommendations for the authors: Reviewer #2

      General Comments

      This is a challenging structural target and the authors have made considerable efforts to determine the effect of several mutations on the structure and function. Many of the constructs, however, could not be expressed and/or purified in bacteria. However, it is not clear to what extent other expression systems (e.g. Drosophila or human) were considered and if this would have been beneficial.

      We did not use other expression systems because the wild-type protein is well-behaved when expressed in E. coli. In case a mutant variant cannot be expressed or does not behave well in E. coli, this constitutes a clear indication that the respective mutation impairs the protein’s integrity. Thus, by using E. coli as a reference system for all the variants of PURA protein, we could assess the influence of the mutations on the structural integrity and solubility. Only for the variants that did not show impairment in E. coli expression, we continued to assess in more detail why they are nevertheless functionally impaired and cause PURA Syndrome.

      Concerns and Suggested Changes

      (a) The schematic in Figure 3A would have been helpful for interpreting the mutations discussed in Figures 1 and 2. I would suggest moving it earlier in the text.

      We changed the figure according to the Reviewer’s suggestion.

      (b) I believe the RNA used for binding studies in Figures 3C and D was (CGG)8. Are the two "free" RNA bands a monomer and a dimer (duplex?)?

      Although we do not know for certain, it is indeed likely that the two free RNA bands represent either different secondary structures of the free RNA or a duplex of two molecules. Of note, PURA binds to both “free” RNA bands, indicating that it either does not discriminate between them or melts double-stranded RNA in these EMSAs.

      There also seems to be considerable cooperativity in the binding, so I wonder if a shorter RNA oligonucleotide might facilitate the measurement of Kds.

      The length of the used RNA was selected based on the estimated elongated size of the full-length PURA and the presence of 3 PUR repeats. Assuming that one PUR repeat interacts with about 6-7 bases (data from the co-structure of Drosophila PURA with DNA; PDB-ID: 5FGP) and that full-length PURA forms a dimer consisting of three PUR repeats, the full-length protein in its extended form should cover a nucleic-acid stretch of about 24 bases.

      Also, it is not clear how the affinities were measured particularly for hsPURA III since free band is never fully bound at the highest protein concentration.

      It was not our goal to measure Kds for the interaction of PURA variants with RNA. The EMSA experiments were conducted to detect relative differences in the interaction between PURA variants and RNA. To estimate the differences, we measured the total intensity of the bound (shifted) and unbound RNA. The intensities of the bands observed on the scanned EMSA gels were quantified with FUJI ImageJ software. We calculated the percentage of the shifted RNA and normalized it. The hsPURA III fragment shows much lower affinity than full-length PURA and PURA I-II; therefore, it does not fully shift the RNA even at the highest protein concentration.
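
      The quantification described above can be sketched as follows. This is a minimal illustration, assuming the shifted fraction is the bound intensity divided by the sum of bound and unbound intensities, and assuming normalization to the largest value in the titration; the response does not specify the normalization, and the band intensities below are hypothetical.

```python
def percent_shifted(bound_intensity, unbound_intensity):
    """Percentage of total RNA signal found in the shifted (bound) band."""
    total = bound_intensity + unbound_intensity
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return 100.0 * bound_intensity / total


def normalize_to_max(values):
    """Scale values so the largest equals 1.0 (an assumed normalization)."""
    peak = max(values)
    if peak <= 0:
        raise ValueError("at least one value must be positive")
    return [value / peak for value in values]


# Hypothetical (bound, unbound) band intensities for one protein titration,
# ordered from lowest to highest protein concentration.
lanes = [(0.0, 100.0), (25.0, 75.0), (50.0, 50.0), (80.0, 20.0)]

shifted = [percent_shifted(bound, unbound) for bound, unbound in lanes]
shifted_normalized = normalize_to_max(shifted)
```

      A readout of this kind supports relative comparisons between variants run under the same conditions, but it does not yield a dissociation constant.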

      (c) Do the human PURA I+II and dmPURA I+ II crystallize in the same space group and have similar packing? Can the observed structural flexibility be due to crystal contacts?

      hsPURA I+II and dmPURA I+II crystallize in different space groups with different crystal packing. In both cases, the asymmetric unit contains 4 independent molecules with the flexible part of the structure composed of the β4 and β8 (β ridge) exposed to solvent. In the case of the Drosophila structure, we do not observe any flexibility of both β-strands. In contrast, for the human PURA structure the β ridge exhibits lots of flexibility and it adopts different conformations in all 4 molecules of the asymmetric unit. We observe similar flexibility of the β4 and β8 (β ridge) in the structure of K97E mutant which contains 2 molecules in the asymmetric unit. We would like to add that we expect crystal contacts to rather stabilize than destabilize domains.

      Similarly, can the conformations observed for the K97E mutant be partially explained by packing?

      Regarding the sequence shift observed for the β5 and β6 strands in hsPURA I+II K97E variant: although the β5 strand with shifted amino acid sequence is involved in the contact with the symmetry-related molecule with another β5 strand we don’t consider this interaction as a source of the shift. To be sure that the shift is not forced by the crystallization, we had performed NMR measurement which confirmed that in solution there is a strong change in the β-strands comparing WT and K97E mutant. This is an unambiguous indication that the structural changes observed in the crystal structure are also happening in solution. In addition, the MD simulations provide additional confirmation of our interpretation that K97E destabilizes the corresponding PUR domain. Taken together, we provide proof from three different angles that the observed differences indeed affect the integrity and hence function of the protein.

      (d) Perhaps, it is my misunderstanding, but I find the NMR data on the Arg sidechains for the K97E confusing. If they are visible for K97E and not WT, doesn't this indicate that there is an exchange between two conformations or more dynamics in the WT structure? This does not seem to be the opposite of the expectation if K97E is thought to have more conformational flexibility.

      Due to a technical issue (peak contour level), arginine side chain resonances were not clearly visible in the WT spectrum. The figure 5F has been updated. Now, they do correspond to those seen in the mutant spectrum. However, to prevent any confusion or mis/overinterpretation, we removed the sentence regarding arginine side chain: "Intriguingly, arginine side chain resonances Nε-Hε were only visible in the K97E variant, while they were broadened out in the wild-type spectrum."

      (e) The most speculative part of the paper is the interpretation of SG and PB localization of PURA in Fig 1 and 2. There is an important issue with the statistics that must be clarified because it would appear that statistical significance was determined using each SG or PB as an independent measurement. This is incorrect and significance should be measured by only using the means of three biological replicates. This is well described here. It is not clear at this time if the reported P values will be confirmed upon reanalysis, and this may require reinterpretation of the data.

      We are grateful for this clarifying comment and agree that the statistical analysis of P-body and stress granule was misleading. Of note, while the figures depicted all the values independent of the biological repeats, the statistical analyses were done on the mean value of each replicate of each cell line and not all raw data points.

      We prepared new plots, only showing the mean value of each replicate, and also re-calculated P-values. The values have changed only slightly in this new analysis because we now also included the previously labeled outliers (red points) to better demonstrate that significance still exists even when considering them.
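
      The replicate-level approach can be illustrated with a minimal sketch. It collapses per-granule measurements to one mean per biological replicate before any comparison, so that the sample size is the number of replicates and not the number of granules. Welch's t statistic is used here only as a stand-in, because the test applied in the manuscript is not named in this response; all values and names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def replicate_means(measurements):
    """Collapse {replicate_id: [per-granule values]} to one mean per replicate."""
    return [mean(values) for _, values in sorted(measurements.items())]


def welch_t(sample_a, sample_b):
    """Welch's t statistic for two independent samples of replicate means."""
    se_a = stdev(sample_a) ** 2 / len(sample_a)
    se_b = stdev(sample_b) ** 2 / len(sample_b)
    return (mean(sample_a) - mean(sample_b)) / sqrt(se_a + se_b)


# Hypothetical per-granule intensities grouped by biological replicate.
wild_type = {
    "rep1": [1.0, 1.2, 0.8],
    "rep2": [1.1, 0.9],
    "rep3": [1.3, 1.1, 1.2, 1.2],
}
mutant = {
    "rep1": [0.5, 0.7],
    "rep2": [0.6, 0.6, 0.6],
    "rep3": [0.8, 0.6],
}

wt_means = replicate_means(wild_type)   # three values, one per replicate
mut_means = replicate_means(mutant)     # three values, one per replicate
t_statistic = welch_t(wt_means, mut_means)
```

      Each condition contributes three values regardless of how many granules were imaged, which avoids treating granules from the same replicate as independent measurements.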

      In the new analysis of stress-granule association, only the value of the K97E mutant lost its significance, indicating that its association to stress granules is not lost. Therefore, we adjusted the following sentences in the manuscript.

      Results:

      Original: "While quantification showed a reduced association of hsPURA K97E mutant with G3BP1-positive granules (Fig 1B), the two other mutants, I206F and F233del, showed the same co-localization to stress granules as the wild type control."

      Corrected: "In all the patient-related mutations, no significant reduction in stress granule association was seen when compared to the wild type control (Fig 1C)."

      Original: "The observation that only one of the patient-related mutations of hsPURA, K97E, showed reduced stress granule association indicates that this feature may not constitute a major hallmark of the PURA syndrome. It should be noted however that this interpretation must be considered with some caution as the experiments were performed in a PURA wild-type background."

      Corrected: "As we did not observe significant changes in the association of patient-related mutations of hsPURA to stress granules, it is suggested that this feature may not constitute a major hallmark of the PURA syndrome. It should be noted however that this interpretation must be considered with some caution as the experiments were performed in a PURA wild-type background."

      (f) A western blot showing the level of overexpression of the PURA proteins should be shown in Figure 1 as well as the KD of endogenous PURA for Figure S2?

      As requested, a Western blot showing the level of overexpression of the different PURA proteins has been added as Appendix Figure S1.

      A Western blot of the siRNA-mediated knock-down experiments of PURA and their corresponding control has been added to Appendix Figure S3. Quantification of three biological repeats showed a significant reduction of PURA protein levels upon knock down.

      (g) While I appreciate that rewriting is time-consuming, I would recommend considering restructuring the manuscript because I think that it would aid the overall clarity. I think the foundation of the work is the structural characterization and would suggest beginning the paper with this data and the biochemical characterization. The co-localization with SGs and PBs and how this may be relevant to disease is much more speculative and is therefore better to present later. While I appreciate that the structural interpretation of why some mutants localize to PBs differently is not entirely clear, I do think that this would provide some context for the discussion.

      In the initial version of the manuscript we first presented the structural characterization of PURA and afterwards the co-localization with SGs and PBs. As this reviewer stated him-/herself in (e), we also noticed that the SG and PB interpretation is the most speculative part of this manuscript. We felt that having this at the end of the results section would weaken the manuscript. On the other hand, we consider that the structural interpretation of mutations is much stronger and has a greater impact for future research. After long discussion we decided to swap the order to leave the most important results for the end of the manuscript.

      Recommendations for the authors: Reviewer #3

      Concerns and Suggested Changes:

      (a) For the characterization of G3BP1-positive stress granules in HeLa cells upon depletion of PURA, it remains unclear what is the efficiency of siRNA? The authors should provide a western blot to indicate how much the endogenous levels were reduced.

      We completely agree with the stated concern and addressed it accordingly. We had performed this experiment prior to submission but for some unknown reason it was not included in the manuscript.

      The Western blot of siRNA-mediated knock-down experiments of PURA and their corresponding control is now shown in Appendix Figure S3. Quantification of three biological repeats, showed a significant reduction of PURA protein levels upon knock down.

      (b) How does knocking down PURA affect DCP1A-positive structures in HeLa cells? Would P bodies be formed even in the absence (or reduction) of total PURA?

      Indeed, the stated question is very interesting. In fact, we have already shown in our recent publication (Molitor et al., 2023) that a knock down of PURA in HeLa and NHDF cells leads to a significant reduction of P-bodies. We actually referred to this finding on page 6:

      "Since hsPURA was recently shown to be required for P-body formation in HeLa cells and fibroblasts (Molitor et al. 2023), PURA-dependent liquid phase separation could potentially also directly contribute to the formation of these granules."

      On the same page, we also refer to the underlying molecular mechanism:

      "However, when putting this observation in perspective with previous reports, it seems unlikely that P-body formation directly depends on phase separation by hsPURA, but rather on its recently reported function as gene regulator of the essential P-body core factors LSM14a and DDX6 (Molitor et al., 2023)."

    2. Joint Public Review:

      The present study focuses on the structure and function of human PURA, a regulator of gene transcription and mRNA transport and translation whose mutation causes the neurodevelopmental PURA syndrome, characterized by developmental delay, intellectual disability, hypotonia, and epileptic seizures, among other deficits. The authors combined structural biology, molecular dynamics simulation, and various cell biological assays to study the effects of disease-causing PURA mutations on protein structure and function. The corresponding data reveal a highly dynamic PURA structure and show that disease-related mutations in PURA cause complex defects in folding, DNA-unwinding activity, RNA binding, dimerization, and partitioning into processing bodies. These findings provide first insights into how very diverse PURA mutations can cause penetrant molecular, cellular, and clinical defects. This will be of substantial interest to cell biologists, neurogeneticists, and neurologists alike.

      A particular strength of the present study is the structural characterization of human PURA, which is a challenging target for structural biology approaches. The molecular dynamics simulations are state-of-the-art, allowing a statistically meaningful assessment of the differences between wild-type and mutant proteins. The functional consequences of PURA mutations at the cellular level are fascinating, particularly the differential compartmentalization of wild-type and mutant PURA variants into certain subcellular condensates.

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      These ingenious and thoughtful studies present important findings concerning how people represent and generalise abstract patterns of sensory data. The issue of generalisation is a core topic in neuroscience and psychology, relevant across a wide range of areas, and the findings will be of interest to researchers across areas in perception, learning, and cognitive science. The findings have the potential to provide compelling support for the outlined account, but there appear other possible explanations, too, that may affect the scope of the findings but could be considered in a revision.

      Thank you for sending the feedback from the three peer reviewers regarding our paper. Please find below our detailed responses addressing the reviewers' comments. We have incorporated these suggestions into the paper and provided explanations for the modifications made.

      We have specifically addressed the point of uncertainty highlighted in eLife's editorial assessment, which concerned alternative explanations for the reported effect. In response to Reviewer #1, we have clarified how Exp. 2c and Exp. 3c address the potential alternative explanation related to "attention to dimensions." Further, we present a supplementary analysis to account for differences in asymptotic learning, as noted by Reviewer #2. We have also clarified how our control experiments address effects associated with general cognitive engagement in the task. Lastly, we have further clarified the conceptual foundation of our paper, addressing concerns raised by Reviewers #2 and #3.

      Reviewer #1 (Public Review):

      Summary:

      This manuscript reports a series of experiments examining category learning and subsequent generalization of stimulus representations across spatial and nonspatial domains. In Experiment 1, participants were first trained to make category judgments about sequences of stimuli presented either in nonspatial auditory or visual modalities (with feature values drawn from a two-dimensional feature manifold, e.g., pitch vs timbre), or in a spatial modality (with feature values defined by positions in physical space, e.g., Cartesian x and y coordinates). A subsequent test phase assessed category judgments for 'rotated' exemplars of these stimuli: i.e., versions in which the transition vectors are rotated in the same feature space used during training (near transfer) or in a different feature space belonging to the same domain (far transfer). Findings demonstrate clearly that representations developed for the spatial domain allow for representational generalization, whereas this pattern is not observed for the nonspatial domains that are tested. Subsequent experiments demonstrate that if participants are first pre-trained to map nonspatial auditory/visual features to spatial locations, then rotational generalization is facilitated even for these nonspatial domains. It is argued that these findings are consistent with the idea that spatial representations form a generalized substrate for cognition: that space can act as a scaffold for learning abstract nonspatial concepts.
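
      The notion of a 'rotated' exemplar can be made concrete with a minimal sketch. It assumes a stimulus sequence is a list of points in a two-dimensional feature space and that rotation is applied to the transition vectors between successive points; the actual stimulus construction in the study may differ, and all names and values are hypothetical.

```python
from math import cos, radians, sin


def transition_vectors(points):
    """Differences between successive points of a sequence in feature space."""
    return [(x2 - x1, y2 - y1) for (x1, y1), (x2, y2) in zip(points, points[1:])]


def rotate(vector, degrees):
    """Rotate a 2D vector counter-clockwise by the given angle."""
    theta = radians(degrees)
    x, y = vector
    return (x * cos(theta) - y * sin(theta), x * sin(theta) + y * cos(theta))


def rotated_sequence(points, degrees):
    """Rebuild a sequence from its start point using rotated transitions."""
    result = [points[0]]
    for vector in transition_vectors(points):
        dx, dy = rotate(vector, degrees)
        last_x, last_y = result[-1]
        result.append((last_x + dx, last_y + dy))
    return result


# Hypothetical training sequence: one step along dimension 1, then one along
# dimension 2 (for example pitch, then timbre).
training = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
test_exemplar = rotated_sequence(training, 90)
```

      A learner who encodes the sequence on only one dimension has no way to recognise the rotated exemplar, whereas a two-dimensional representation of the transitions preserves its shape under rotation.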

      Strengths:

      I enjoyed reading this manuscript, which is extremely well-written and well-presented. The writing is clear and concise throughout, and the figures do a great job of highlighting the key concepts. The issue of generalization is a core topic in neuroscience and psychology, relevant across a wide range of areas, and the findings will be of interest to researchers across areas in perception and cognitive science. It's also excellent to see that the hypotheses, methods, and analyses were pre-registered.

      The experiments that have been run are ingenious and thoughtful; I particularly liked the use of stimulus structures that allow for disentangling of one-dimensional and two-dimensional response patterns. The studies are also well-powered for detecting the effects of interest. The model-based statistical analyses are thorough and appropriate throughout (and it's good to see model recovery analysis too). The findings themselves are clear-cut: I have little doubt about the robustness and replicability of these data.

      Weaknesses:

      I have only one significant concern regarding this manuscript, which relates to the interpretation of the findings. The findings are taken to suggest that "space may serve as a 'scaffold', allowing people to visualize and manipulate nonspatial concepts" (p13). However, I think the data may be amenable to an alternative possibility. I wonder if it's possible that, for the visual and auditory stimuli, participants naturally tended to attend to one feature dimension and ignore the other - i.e., there may have been a (potentially idiosyncratic) difference in salience between the feature dimensions that led to participants learning the feature sequence in a one-dimensional way (akin to the 'overshadowing' effect in associative learning: e.g., see Mackintosh, 1976, "Overshadowing and stimulus intensity", Animal Learning and Behaviour). By contrast, we are very used to thinking about space as a multidimensional domain, in particular with regard to two-dimensional vertical and horizontal displacements. As a result, one would naturally expect to see more evidence of two-dimensional representation (allowing for rotational generalization) for spatial than nonspatial domains.

      In this view, the impact of spatial pre-training and (particularly) mapping is simply to highlight to participants that the auditory/visual stimuli comprise two separable (and independent) dimensions. Once they understand this, during subsequent training, they can learn about sequences on both dimensions, which will allow for a 2D representation and hence rotational generalization - as observed in Experiments 2 and 3. This account also anticipates that mapping alone (as in Experiment 4) could be sufficient to promote a 2D strategy for auditory and visual domains.

      This "attention to dimensions" account has some similarities to the "spatial scaffolding" idea put forward in the article, in arguing that experience of how auditory/visual feature manifolds can be translated into a spatial representation helps people to see those domains in a way that allows for rotational generalization. Where it differs is that it does not propose that space provides a scaffold for the development of the nonspatial representations, i.e., that people represent/learn the nonspatial information in a spatial format, and this is what allows them to manipulate nonspatial concepts. Instead, the "attention to dimensions" account anticipates that ANY manipulation that highlights to participants the separable-dimension nature of auditory/visual stimuli could facilitate 2D representation and hence rotational generalization. For example, explicit instruction on how the stimuli are constructed may be sufficient, or pre-training of some form with each dimension separately, before they are combined to form the 2D stimuli.

      I'd be interested to hear the authors' thoughts on this account - whether they see it as an alternative to their own interpretation, and whether it can be ruled out on the basis of their existing data.

      We thank the Reviewer for their comments. We agree with the Reviewer that the “attention to dimensions” hypothesis is an interesting alternative explanation. However, we believe that the results of our control experiments Exp. 2c and Exp. 3c are incompatible with this alternative explanation.

      In Exp. 2c, participants are pre-trained in the visual modality and then tested in the auditory modality. In the multimodal association task, participants have to associate the auditory stimuli and the visual stimuli: on each trial, they hear a sound and then have to click on the corresponding visual stimulus. It is thus necessary to pay attention to both auditory dimensions and both visual dimensions to perform the task. To give an example, the task might involve mapping the fundamental frequency and the amplitude modulation of the auditory stimulus to the colour and the shape of the visual stimulus, respectively. If participants pay attention to only one dimension, this would lead to a maximum of 25% accuracy on average (because they would be at chance on the other dimension, with four possible options). We observed that 30/50 participants reached an accuracy > 50% in the multimodal association task in Exp. 2c. This means that we know for sure that at least 60% of the participants paid attention to both dimensions of the stimuli. Nevertheless, there was a clear difference between participants that received a visual pre-training (Exp. 2c) and those who received a spatial pre-training (Exp. 2a) (frequency of 1D vs 2D models between conditions, BF > 100 in near transfer and far transfer). In fact, only 3/50 participants were best fit by a 2D model when vision was the pre-training modality compared to 29/50 when space was the pre-training modality. Thus, the benefit of the spatial pre-training cannot be due solely to a shift in attention toward both dimensions.

      This effect was replicated in Exp. 3c. Similarly, 33/48 participants reached an accuracy > 50% in the multimodal association task in Exp. 3c, meaning that we know for sure that at least 68% of the participants actually paid attention to both dimensions of the stimuli. Again, there was a clear difference between participants who received a visual pre-training (frequency of 1D vs 2D models between conditions, Exp. 3c) and those who received a spatial pre-training (Exp. 3a) (BF > 100 in near transfer and far transfer).
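      The arithmetic behind these thresholds can be checked with a short sketch. It assumes four levels on each of two dimensions, which is inferred from the "four possible options" and the 6.25% chance level quoted in this response, and it assumes that a participant attending to one dimension is perfect on that dimension and guesses uniformly on the other. Variable names are illustrative.

```python
from fractions import Fraction

LEVELS_PER_DIMENSION = 4  # assumption: "four possible options" per dimension
N_DIMENSIONS = 2

# Pure guessing among all 4 x 4 = 16 candidate stimuli.
chance = Fraction(1, LEVELS_PER_DIMENSION ** N_DIMENSIONS)

# Perfect on the attended dimension, guessing on the other one.
one_dim_ceiling = Fraction(1, LEVELS_PER_DIMENSION)

# Participants above the 50% accuracy threshold, as reported in the response.
exp2c = Fraction(30, 50)
exp3c = Fraction(33, 48)
pooled = Fraction(30 + 33, 50 + 48)

print(f"chance level:           {float(chance):.2%}")           # 6.25%
print(f"single-dimension limit: {float(one_dim_ceiling):.2%}")  # 25.00%
print(f"Exp. 2c above 50%:      {float(exp2c):.2%}")            # 60.00%
print(f"Exp. 3c above 50%:      {float(exp3c):.2%}")            # 68.75%
print(f"pooled across both:     {float(pooled):.2%}")           # 64.29%
```

      Under these assumptions, any accuracy above 25% already exceeds what a single attended dimension can deliver, so the 50% threshold is a conservative criterion, and the pooled proportion matches the "~65%" figure quoted in the added discussion paragraph.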

      Thus, we believe that the alternative explanation raised by the Reviewer is not supported by our data. We have added a paragraph in the discussion:

      “One alternative explanation of this effect could be that the spatial pre-training encourages participants to attend to both dimensions of the non-spatial stimuli. By contrast, pretraining in the visual or auditory domains (where multiple dimensions of a stimulus may be relevant less often naturally) encourages them to attend to a single dimension. However, data from our control experiments Exp. 2c and Exp. 3c, are incompatible with this explanation. Around ~65% of the participants show a level of performance in the multimodal association task (>50%) which could only be achieved if they were attending to both dimensions (performance attending to a single dimension would yield 25% and chance performance is at 6.25%). This suggests that participants are attending to both dimensions even in the visual and auditory mapping case.”

      Reviewer #2 (Public Review):

      Summary:

      In this manuscript, L&S investigate the important general question of how humans achieve invariant behavior over stimuli belonging to one category given the widely varying input representation of those stimuli and, more specifically, how they do that in arbitrary abstract domains. The authors start with the hypothesis that this is achieved by invariance transformations that observers use for interpreting different entries and, furthermore, that these transformations in an arbitrary domain emerge with the help of the transformations (e.g. translation, rotation) within the spatial domain by using those as "scaffolding" during transformation learning. To provide the missing evidence for this hypothesis, L&S used behavioral category learning studies within and across the spatial, auditory, and visual domains, where participants had to learn to categorize rotated and translated 4-element token sequences and then apply the learned transformation in new feature dimensions within the given domain. Through single- and multiple-day supervised training and unsupervised tests, L&S demonstrated by standard computational analyses that in such setups, space and spatial transformations can, indeed, help with developing and using appropriate rotational mapping whereas the visual domain cannot fulfill such a scaffolding role.

      Strengths:

      The overall problem definition and the context of spatial mapping-driven solution to the problem is timely. The general design of testing the scaffolding effect across different domains is more advanced than any previous attempts clarifying the relevance of spatial coding to any other type of representational codes. Once the formulation of the general problem in a specific scientific framework is done, the following steps are clearly and logically defined and executed. The obtained results are well interpretable, and they could serve as a good stepping stone for deeper investigations. The analytical tools used for the interpretations are adequate. The paper is relatively clearly written.

      Weaknesses:

      Some additional effort to clarify the exact contribution of the paper, the link between analyses and the claims of the paper, and its link to previous proposals would be necessary to better assess the significance of the results and the true nature of the proposed mechanism of abstract generalization.

      (1) Insufficient conceptual setup: The original theoretical proposal (the Tolman-Eichenbaum-Machine, Whittington et al., Cell 2020) that L&S relate their work to proposes that just as in the case of memory for spatial navigation, humans and animals create their flexible relational memory system of any abstract representation by a conjunction code that combines on the one hand, sensory representation and on the other hand, a general structural representation or relational transformation. The TEM also suggests that the structural representation could contain any graph-interpretable spatial relations, albeit in their demonstration 2D neighbor relations were used. The goal of L&S's paper is to provide behavioral evidence for this suggestion by showing that humans use representational codes that are invariant to relational transformations of non-spatial abstract stimuli and moreover, that humans obtain these invariances by developing invariance transformers with the help of available spatial transformers. To obtain such evidence, L&S use the rotational transformation. However, the actual procedure they use actually solved an alternative task: instead of interrogating how humans develop generalizations in abstract spaces, they demonstrated that if one defines rotation in an abstract feature space embedded in a visual or auditory modality that is similar to the 2D space (i.e. has two independent dimensions that are clearly segregable and continuous), humans cannot learn to apply rotation of 4-piece temporal sequences in those spaces while they can do it in 2D space, and with co-associating a one-to-one mapping between locations in those feature spaces with locations in the 2D space an appropriate shaping mapping training will lead to the successful application of rotation in the given task (and in some other feature spaces in the given domain). 
While this is an interesting and challenging demonstration, it does not shed light on how humans learn and generalize, only that humans CAN do learning and generalization in this highly constrained scenario. This result is a demonstration of how a stepwise learning regimen can make use of one structure for mapping a complex input into a desired output. The results clarify neither how generalizations would develop in abstract spaces nor whether this generalization uses transformations developed in the abstract space. The specific training procedure ensures success in the presented experiments, but the availability and feasibility of an equivalent procedure in a natural setting is a crucial part of validating the original claim, and that has not been done in the paper.

      We thank the Reviewer for their detailed comments on our manuscript. We reply to the three main points in turn.

      First, concerning the conceptual grounding of our work, we would point out that the TEM model (Whittington et al., 2020), however interesting, is not our theoretical starting point. Rather, as we hope the text and references make clear, we ground our work in theoretical work from the 1990s and 2000s proposing that space acts as a scaffold for navigating abstract spaces (such as Gärdenfors, 2000). We acknowledge that the TEM model and other experimental work on the involvement of the hippocampus, the entorhinal cortex and the parietal cortex in relational transformations of nonspatial stimuli provide evidence for this general theory. However, our work is designed to test a more basic question: whether there is behavioural evidence that space scaffolds learning in the first place. To achieve this, we perform behavioural experiments with a causal manipulation (spatial pre-training vs no spatial pre-training), which have the potential to provide such direct evidence. This is why we claim that:

      “This theory is backed up by proof-of-concept computational simulations [13], and by findings that brain regions thought to be critical for spatial cognition in mammals (such as the hippocampal-entorhinal complex and parietal cortex) exhibit neural codes that are invariant to relational transformations of nonspatial stimuli. However, whilst promising, this theory lacks direct empirical evidence. Here, we set out to provide a strong test of the idea that learning about physical space scaffolds conceptual generalisation.“

      Second, we agree with the Reviewer that we do not provide an explicit model for how generalisation occurs, and how precisely space acts as a scaffold for building representations and/or applying the relevant transformations to non-spatial stimuli to solve our task. Rather, we investigate in our Exp. 2-4 which aspects of the training are necessary for rotational generalisation to happen (and conclude that a simple training with the multimodal association task is sufficient for ~20% of participants). We now acknowledge in the discussion the fact that we do not provide an explicit model and leave that for future work:

      “We acknowledge that our study does not provide a mechanistic model of spatial scaffolding but rather delineate which aspects of the training are necessary for generalisation to happen.”

      Finally, we also agree with the Reviewer that our task is non-naturalistic. As is common in experimental research, one must sacrifice naturalistic elements of the task in exchange for experimental control and the absence of prior knowledge in the participants. We decided to minimise participants' prior knowledge as far as possible, to ensure that our task involved learning something completely new and that the pre-training was really causing the better learning/generalisation. The effects we report are consistent across the experiments, so we feel confident about them, but we agree with the Reviewer that an external validation with more naturalistic stimuli/tasks would be a nice addition to this work. We have included a sentence in the discussion:

      “All the effects observed in our experiments were consistent across near transfer conditions (rotation of patterns within the same feature space), and far transfer conditions (rotation of patterns within a different feature space, where features are drawn from the same modality). This shows the generality of spatial training for conceptual generalisation. We did not test transfer across modalities nor transfer in a more natural setting; we leave this for future studies.”

      (2) Missing controls: The asymptotic performance in experiment 1 after training in the three tasks was quite different in the three tasks (intercepts 2.9, 1.9, 1.6 for spatial, visual, and auditory, respectively; p. 5. para. 1, Fig 2BFJ). It seems that the statement "However, our main question was how participants would generalise learning to novel, rotated exemplars of the same concept." assumes that learning and generalization are independent. Wouldn't it be possible, though, that the level of generalization depends on the level of acquiring a good representation of the "concept" and after obtaining an adequate level of this knowledge, generalization would kick in without scaffolding? If so, a missing control is to equate the levels of asymptotic learning and see whether there is a significant difference in generalization. A related issue is that we have no information on what kind of learning in the three different domains was performed, albeit we probably suspect that in space the 2D representation was dominant while in the auditory and visual domains not so much. Thus, a second missing piece of evidence is the model-fitting results of the ⦰ condition that would show which way the original sequences were encoded (similar to Fig 2 CGK and DHL). If the reason for lower performance is not individual stimulus difficulty but the natural tendency to encode the given stimulus type by a combo of random + 1D strategy that would clarify that the result of the cross-training is, indeed, transferring the 2D-mapping strategy.

      We agree with the Reviewer that a good further control is to equate performance during training. Thus, we have run a complementary analysis where we select only the participants that reach > 90% accuracy in the last block of training in order to equate asymptotic performance after training in Exp. 1. The results (see Author response image 1) replicate the results that we report in the main text: there is a large difference between groups (relative likelihood of 1D vs. 2D models, all BF > 100 in favour of a difference between the auditory and the spatial modalities, between the visual and the spatial modalities, in both near and far transfer, “decisive” evidence). We prefer not to include this figure in the paper for clarity, and because we believe this result is expected given the fact that 0/50 and 0/50 of the participants in the auditory and visual conditions used a 2D strategy – thus, selecting subgroups of these participants cannot change our conclusions.

      Author response image 1.

      Results of Exp. 1 when selecting participants that reached > 90% accuracy in the last block of training. Captions are the same as Figure 2 of the main text.

      Second, the Reviewer suggested that we run the model fitting analysis only on the ⦰ condition (training) in Exp. 1 to reveal whether participants use a 1D or a 2D strategy already during training. Unfortunately, we cannot provide the model fits only in the ⦰ condition in Exp. 1 because all models make the same predictions for this condition (see Fig S4). However, note that this is by design: participants were free to apply whatever strategy they wanted during training; we then used the generalisation phase with the rotated stimuli precisely to reveal this strategy. Further, we do believe that the strategy used by the participants during training and the strategy used during transfer are the same, partly because – starting from block #4 – participants have no idea whether the current trial is a training trial or a transfer trial, as both trial types are randomly interleaved with no cue signalling the trial type. We have made this clear in the methods:

      “They subsequently performed 105 trials (with trialwise feedback) and 105 transfer trials including rotated and far transfer quadruplets (without trialwise feedback) which were presented in mixed blocks of 30 trials. Training and transfer trials were randomly interleaved, and no clue indicated whether participants were currently on a training trial or a transfer trial before feedback (or absence of feedback in case of a transfer trial).”

      Reviewer #3 (Public Review):

      Summary:

      Pesnot Lerousseau and Summerfield aimed to explore how humans generalize abstract patterns of sensory data (concepts), focusing on whether and how spatial representations may facilitate the generalization of abstract concepts (rotational invariance). Specifically, the authors investigated whether people can recognize rotated sequences of stimuli in both spatial and nonspatial domains and whether spatial pre-training and multi-modal mapping aid in this process.

      Strengths:

      The study innovatively examines a relatively underexplored but interesting area of cognitive science, the potential role of spatial scaffolding in generalizing sequences. The experimental design is clever and covers different modalities (auditory, visual, spatial), utilizing a two-dimensional feature manifold. The findings are backed by strong empirical data, good data analysis, and excellent transparency (including preregistration) adding weight to the proposition that spatial cognition can aid abstract concept generalization.

      Weaknesses:

      The examples used to motivate the study (such as "tree" = oak tree, family tree, taxonomic tree) may not effectively represent the phenomena being studied, possibly confusing linguistic labels with abstract concepts. This potential confusion may also extend to doubts about the real-life applicability of the generalizations observed in the study and raises questions about the nature of the underlying mechanism being proposed.

      We thank the Reviewer for their comments. We agree that we could have explained more clearly how these examples motivate our study. The similarity between “oak tree” and “family tree” is not just the verbal label. Rather, it is the arrangement of the parts (nodes and branches) in a nested hierarchy. Oak trees and family trees share the same relational structure. The reason that invariance is relevant here is that the similarity in relational structure is retained under rigid body transformations such as rotation or translation. For example, an upside-down tree can still be recognised as a tree, just as a family tree can be plotted with the oldest ancestors at either top or bottom. Similarly, in our study, the quadruplets are defined by the relations between stimuli: all quadruplets use the same basic stimuli, but the categories are defined by the relations between successive stimuli. In our task, generalising means recognising that relations between stimuli are the same despite changes in the surface properties (for example in far transfer). We have clarified this in the introduction:

      “For example, the concept of a “tree” implies an entity whose structure is defined by a nested hierarchy, whether this is a physical object whose parts are arranged in space (such as an oak tree in a forest) or a more abstract data structure (such as a family tree or taxonomic tree). [...] Despite great changes in the surface properties of oak trees, family trees and taxonomic trees, humans perceive them as different instances of a more abstract concept defined by the same relational structure.”

      Next, the study does not explore whether scaffolding effects could be observed with other well-learned domains, leaving open the question of whether spatial representations are uniquely effective or simply one instance of a familiar 2D space, again questioning the underlying mechanism.

      We would like to mention that Reviewer #2 had a similar comment. We agree with both Reviewers that our task is non-naturalistic. As is common in experimental research, one must sacrifice naturalistic elements of the task in exchange for experimental control and the absence of prior knowledge in the participants. We decided to minimise participants' prior knowledge as far as possible, to ensure that our task involved learning something completely new and that the pre-training was really causing the better learning/generalisation. The effects we report are consistent across the experiments, so we feel confident about them, but we agree with the Reviewer that an external validation with more naturalistic stimuli/tasks would be a nice addition to this work. We have included a sentence in the discussion:

      “All the effects observed in our experiments were consistent across near transfer conditions (rotation of patterns within the same feature space), and far transfer conditions (rotation of patterns within a different feature space, where features are drawn from the same modality). This shows the generality of spatial training for conceptual generalisation. We did not test transfer across modalities nor transfer in a more natural setting; we leave this for future studies.”

      Further doubt on the underlying mechanism is cast by the possibility that the observed correlation between mapping task performance and the adoption of a 2D strategy may reflect general cognitive engagement rather than the spatial nature of the task. Similarly, the surprising finding that a significant number of participants benefited from spatial scaffolding without seeing spatial modalities may further raise questions about the interpretation of the scaffolding effect, pointing towards potential alternative interpretations, such as shifts in attention during learning induced by pre-training without changing underlying abstract conceptual representations.

      The Reviewer is concerned that the spatial pre-training could benefit the participants by increasing global cognitive engagement rather than by providing a scaffold for learning invariances. It is correct that the participants in the control group in Exp. 2c perform more poorly on average than participants who benefited from the spatial pre-training in Exp. 2a and 2b. The better performance of the participants in Exp. 2a and 2b could be due to either the spatial nature of the pre-training (as we claim) or a difference in general cognitive engagement.

      However, if we look closely at the results of Exp. 3, we can see that the general cognitive engagement hypothesis is not well supported by the data. Indeed, the participants in the control condition (Exp. 3c) perform similarly to the other groups during training. Rather, the difference is in the strategy they use, as revealed by the transfer condition. The majority of them use a 1D strategy, unlike the participants who benefited from a spatial pre-training (Exp. 3a and 3b). We have included a sentence in the results:

      “Further, the results show that participants who did not experience spatial pre-training were still engaged in the task, but were not using the same strategy as the participants who experienced spatial pre-training (1D rather than 2D). Thus, the benefit of the spatial pre-training is not simply to increase the cognitive engagement of the participants. Rather, spatial pre-training provides a scaffold to learn rotation-invariant representation of auditory and visual concepts even when rotation is never explicitly shown during pre-training.”

      Finally, Reviewer #1 had a related concern about a potential alternative explanation that involved a shift in attention. We reproduce our response here: we agree with the Reviewer that the “attention to dimensions” hypothesis is an interesting (and potentially concerning) alternative explanation. However, we believe that the results of our control experiments Exp. 2c and Exp. 3c are not compatible with this alternative explanation.

      Indeed, in Exp. 2c, participants are pre-trained in the visual modality and then tested in the auditory modality. In the multimodal association task, participants have to associate the auditory stimuli and the visual stimuli: on each trial, they hear a sound and then have to click on the corresponding visual stimulus. It is necessary to pay attention to both auditory dimensions and both visual dimensions to perform well in the task. To give an example, the task might involve mapping the fundamental frequency and the amplitude modulation of the auditory stimulus to the colour and the shape of the visual stimulus, respectively. If participants pay attention to only one dimension, this would lead to a maximum of 25% accuracy on average (because they would be at chance on the other dimension, with four possible options). We observed that 30/50 participants reached an accuracy > 50% in the multimodal association task in Exp. 2c. This means that we know for sure that at least 60% of the participants actually paid attention to both dimensions of the stimuli. Nevertheless, there was a clear difference between participants that received a visual pre-training (Exp. 2c) and those who received a spatial pre-training (Exp. 2a) (frequency of 1D vs 2D models between conditions, BF > 100 in near transfer and far transfer). In fact, only 3/50 participants were best fit by a 2D model when vision was the pre-training modality compared to 29/50 when space was the pre-training modality. Thus, the benefit of the spatial pre-training cannot be due solely to a shift in attention toward both dimensions.

      This effect was replicated in Exp. 3c. Similarly, 33/48 participants reached an accuracy > 50% in the multimodal association task in Exp. 3c, meaning that we know for sure that at least 68% of the participants actually paid attention to both dimensions of the stimuli. Again, there was a clear difference between participants who received a visual pre-training (frequency of 1D vs 2D models between conditions, Exp. 3c) and those who received a spatial pre-training (Exp. 3a) (BF > 100 in near transfer and far transfer).

      Thus, we believe that the alternative explanation raised by the Reviewer is not supported by our data. We have added a paragraph in the discussion:

      “One alternative explanation of this effect could be that the spatial pre-training encourages participants to attend to both dimensions of the non-spatial stimuli. By contrast, pretraining in the visual or auditory domains (where multiple dimensions of a stimulus may be relevant less often naturally) encourages them to attend to a single dimension. However, data from our control experiments Exp. 2c and Exp. 3c, are incompatible with this explanation. Around ~65% of the participants show a level of performance in the multimodal association task (>50%) which could only be achieved if they were attending to both dimensions (performance attending to a single dimension would yield 25% and chance performance is at 6.25%). This suggests that participants are attending to both dimensions even in the visual and auditory mapping case.”

      Conclusions:

      The authors successfully demonstrate that spatial training can enhance the ability to generalize in nonspatial domains, particularly in recognizing rotated sequences. The results for the most part support their conclusions, showing that spatial representations can act as a scaffold for learning more abstract conceptual invariances. However, the study leaves room for further investigation into whether the observed effects are unique to spatial cognition or could be replicated with other forms of well-established knowledge, as well as further clarifications of the underlying mechanisms.

      Impact:

      The study's findings are likely to have a valuable impact on cognitive science, particularly in understanding how abstract concepts are learned and generalized. The methods and data can be useful for further research, especially in exploring the relationship between spatial cognition and abstract conceptualization. The insights could also be valuable for AI research, particularly in improving models that involve abstract pattern recognition and conceptual generalization.

      In summary, the paper contributes valuable insights into the role of spatial cognition in learning abstract concepts, though it invites further research to explore the boundaries and specifics of this scaffolding effect.

      Reviewer #1 (Recommendations For The Authors):

      Minor issues / typos:

      P6: I think the example of the "signed" mapping here should be "e.g., ABAB maps to one category and BABA maps to another", rather than "ABBA maps to another" (since ABBA would always map to another category, whether the mapping is signed or unsigned).

      Done.

      P11: "Next, we asked whether pre-training and mapping were systematically associated with 2Dness...". I'd recommend changing to: "Next, we asked whether accuracy during pre-training and mapping were systematically associated with 2Dness...", just to clarify what the analyzed variables are.

      Done.

      P13, paragraph 1: "only if the features were themselves are physical spatial locations" either "were" or "are" should be removed.

      Done.

      P13, paragraph 1: should be "neural representations of space form a critical substrate" (not "for").

      Done.

      Reviewer #2 (Recommendations For The Authors):

      The authors use in multiple places in the manuscript the phrases "learn invariances" (Abstract), "formation of invariances" (p. 2, para. 1), etc. It might be just me, but this feels a bit like 'sloppy' wording: we do not learn or form invariances, rather we learn or form representations or transformations by which we can perform tasks that require invariance over particular features or transformations of the input, such as the case of object recognition and size-, translation-, or lighting-invariance. We do not form size invariance, we have representations of objects and/or size transformations allowing the recognition of objects of different sizes. The authors might change this way of referring to the phenomenon.

      We respectfully disagree with this comment. An invariance occurs when neurons make the same response under different stimulation patterns. The objects or features to which a neuron responds are shaped by its inputs. Those inputs are in turn determined by experience-dependent plasticity. This process is often called “representation learning”. We think that our language here is consistent with this status quo view in the field.

      Reviewer #3 (Recommendations For The Authors):

      • I understand that the objective of the present experiment is to study our ability to generalize abstract patterns of sensory data (concepts). In the introduction, the authors present examples like the concept of a "tree" (encompassing a family tree, an oak tree, and a taxonomic tree) and "ring" to illustrate the idea. However, I am sceptical as to whether these examples effectively represent the phenomena being studied. From my perspective, these different instances of "tree" do not seem to relate to the same abstract concept that is translated or rotated but rather appear to share only a linguistic label. For instance, the conceptual substance of a family tree is markedly different from that of an oak tree, lacking significant overlap in meaning or structure. Thus, to me, these examples do not demonstrate invariance to transformations such as rotations.

      To elaborate further, typically, generalization involves recognizing the same object or concept through transformations. In the case of abstract concepts, this would imply a shared abstract representation rather than a mere linguistic category. While I understand the objective of the experiments and acknowledge their potential significance, I find myself wondering about the real-world applicability and relevance of such generalizations in everyday cognitive functioning. This, in turn, casts some doubt on the broader relevance of the study's results. A more fitting example, or an explanation that addresses my concerns about the suitability of the current examples, would be beneficial to further clarify the study's intent and scope.

      Response in the public review.

      • Relatedly, the manuscript could benefit from greater clarity in defining key concepts and elucidating the proposed mechanism behind the observed effects. Is it plausible that the changes observed are primarily due to shifts in attention induced by the spatial pre-training, rather than a change in the process of learning abstract conceptual invariances (i.e., modifications to the abstract representations themselves)? While the authors conclude that spatial pre-training acts as a scaffold for enhancing the learning of conceptual invariances, it raises the question: does this imply participants simply became more focused on spatial relationships during learning, or might this shift in attention represent a distinct strategy, and an alternative explanation? A more precise definition of these concepts and a clearer explanation of the authors' perspective on the mechanism underlying these effects would reduce any ambiguity in this regard.

      Response in the public review.

      • I am wondering whether the effectiveness of spatial representations in generalizing abstract concepts stems from their special nature or simply because they are a familiar 2D space for participants. It is well-established that memory benefits from linking items to familiar locations, a technique used in memory training (method of loci). This raises the question: Are we observing a similar effect here, where spatial dimensions are the only tested familiar 2D spaces, while the other 2 spaces are simply unfamiliar, as also suggested by the lower performance during training (Fig.2)? Would the results be replicable with another well-learned, robustly encoded domain, such as auditory dimensions for professional musicians, or is there something inherently unique about spatial representations that aids in bootstrapping abstract representations?

      On the other side of the same coin, are spatial representations qualitatively different, or simply more efficient because they are learned more quickly and readily? This leads to the consideration that if visual pre-training and visual-to-auditory mapping were continued until a similar proficiency level as in spatial training is achieved, we might observe comparable performance in aiding generalization. Thus, the conclusion that spatial representations are a special scaffold for abstract concepts may not be exclusively due to their inherent spatial nature, but rather to the general characteristic of well-established representations. This hypothesis could be further explored by either identifying alternative 2D representations that are equally well-learned or by extending training in visual or auditory representations before proceeding with the mapping task. At the very least I believe this potential explanation should be explored in the discussion section.

      Response in the public review.

      I had some difficulty in following an important section of the introduction: "... whether participants can learn rotationally invariant concepts in nonspatial domains, i.e., those that are defined by sequences of visual and auditory features (rather than by locations in physical space, defined in Cartesian or polar coordinates) is not known." This was initially puzzling to me as the paragraph preceding it mentions: "There is already good evidence that nonspatial concepts are represented in a translation invariant format." While I now understand that the essential distinction here is between translation and rotation, this was not immediately apparent upon first reading. This crucial distinction, especially in the context of conceptual spaces, was not clearly established before this point in the manuscript. For better clarity, it would be beneficial to explicitly contrast and define translation versus rotation in this particular section and stress that the present study concerns rotations in abstract spaces.

      Done.

      • The multi-modal association is crucial for the study, however to my knowledge, it is not depicted or well explained in the main text or figures (Results section). In my opinion, the details of this task should be explained and illustrated before the details of the associated results are discussed.

      We have included an illustration of a multimodal association trial in Fig. S3B.

      Author response image 2.

      • The observed correlation between the mapping task performance and the adoption of a 2D strategy is logical. However, this correlation might not exclusively indicate the proposed underlying mechanism of spatial scaffolding. Could it also be reflective of more general factors like overall performance, attention levels, or the effort exerted by participants? This alternative explanation suggests that the correlation might arise from broader cognitive engagement rather than specifically from the spatial nature of the task. Addressing this possibility could strengthen the argument for the unique role of spatial representations in learning abstract concepts, or at least this alternative interpretation should be mentioned.

      Response in the public review.

      • To me, the finding that ~30% of participants benefited from the spatial scaffolding effect for example in the auditory condition merely through exposure to the mapping (Fig 4D), without needing to see the quadruplets in the spatial modality, was somewhat surprising. This is particularly noteworthy considering that only ~60% of participants adopted the 2D strategy with exposure to rotated contingencies in Experiment 3 (Fig 3D). How do the authors interpret this outcome? It would be interesting to understand their perspective on why such a significant effect emerged from mere exposure to the mapping task.

      • I appreciate the clarity Fig. 1 provides in explaining a challenging experimental setup. Is it possible to provide example trials, including an illustration that shows which rotations produce the trial and an intuitive explanation of which response maps onto the 1D vs 2D strategies respectively, to aid the reader in better understanding this core manipulation?

      • I like that the authors provide transparency by depicting individual subject's data points in their results figures (e.g. Figs. 2 B, F, J). However, with an n=~50 per condition, it becomes difficult to intuit the distribution, especially for conditions with higher variance (e.g., Auditory). The figures might be more easily interpretable with alternative methods of displaying variances, such as violin plots per data point, conventional error shading using 95%CIs, etc.

      • Why are the authors not reporting exact BFs in the results sections at least for the most important contrasts?

      • While I understand why the authors report the frequencies for the best model fits, this may become difficult to interpret in some sections, given the large number of reported values. Alternatives or additional summary statistics supporting inference could be beneficial.

      As the Reviewer states, there are a large number of figures that we can report in this study. We have chosen to keep this number at a minimum to be as clear as possible. To illustrate the distribution of individual data points, we have opted to display only the group's mean and standard error (the standard errors are included, but the substantial number of participants per condition provides precise estimates, resulting in error bars that can be smaller than the mean point). This decision stems from our concern that including additional details could lead to a cluttered representation with unnecessary complexity. Finally, we report what we believe to be the critical BFs for the comprehension of the reader in the main text, and choose a cutoff of 100 when BFs are high (corresponding to the label “decisive” evidence; some BFs are larger than 10^12). All the exact BFs are in the supplementary for the interested readers.

    2. Reviewer #2 (Public Review):

      Summary:

      In this manuscript, L&S investigate the important general question of how humans achieve invariant behavior over stimuli belonging to one category given the widely varying input representation of those stimuli and, more specifically, how they do that in arbitrary abstract domains. The authors start with the hypothesis that this is achieved by invariance transformations that observers use for interpreting different entries and, furthermore, that these transformations in an arbitrary domain emerge with the help of the transformations (e.g. translation, rotation) within the spatial domain by using those as "scaffolding" during transformation learning. To provide the missing evidence for this hypothesis, L&S used behavioral category learning studies within and across the spatial, auditory and visual domains, where participants had to learn to categorize rotated and translated 4-element token sequences and then apply the learned transformation in new feature dimensions within the given domain. Through single- and multiple-day supervised training and unsupervised tests, L&S demonstrated by standard computational analyses that in such setups, space and spatial transformations can, indeed, help with developing and using appropriate rotational mapping whereas the visual domain cannot fulfill such a scaffolding role.

      Strengths:

      The overall problem definition and the context of spatial mapping-driven solution to the problem is timely. The general design of testing the scaffolding effect across different domains is more advanced than any previous attempts clarifying the relevance of spatial coding to any other type of representational codes. Once the formulation of the general problem in a specific scientific framework is done, the following steps are clearly and logically defined and executed. The obtained results are well interpretable, and they could serve as a good steppingstone for deeper investigations. The analytical tools used for the interpretations are adequate. The paper is relatively clearly written.

      Weaknesses:

      Some additional effort to clarify the exact contribution of the paper, the link between analyses and the claims of the paper and its link to previous proposals would be necessary to better assess the significance of the results and the true nature of the proposed mechanism of abstract generalization.

      (1) Insufficient conceptual setup: The original theoretical proposal (the Tolman-Eichenbaum-Machine, Whittington et al., Cell 2020) to which L&S relate their work proposes that, just as in the case of memory for spatial navigation, humans and animals create their flexible relational memory system of any abstract representation by a conjunction code that combines, on the one hand, a sensory representation and, on the other hand, a general structural representation or relational transformation. The TEM also suggests that the structural representation could contain any graph-interpretable spatial relations, albeit in their demonstration 2D neighbor relations were used. The goal of L&S's paper is to provide behavioral evidence for this suggestion by showing that humans use representational codes that are invariant to relational transformations of non-spatial abstract stimuli and, moreover, that humans obtain these invariances by developing invariance transformers with the help of available spatial transformers. To obtain such evidence, L&S use the rotational transformation. However, the procedure they used actually solved an alternative task: instead of interrogating how humans develop generalizations in abstract spaces, they demonstrated that if one defines rotation in an abstract feature space embedded in visual or auditory modality that is similar to the 2D space (i.e. has two independent dimensions that are clearly segregable and continuous), humans cannot learn to apply rotation of 4-piece temporal sequences in those spaces while they can do it in 2D space, and that, by co-associating a one-to-one mapping between locations in those feature spaces and locations in the 2D space, an appropriate shaping mapping training will lead to successful application of rotation in the given task (and in some other feature spaces in the given domain). While this is an interesting and challenging demonstration, it does not shed light on how humans learn and generalize, only that humans CAN do learning and generalization in this highly constrained scenario. This result is a demonstration of how a stepwise learning regimen can make use of one structure for mapping a complex input into a desired output. The results clarify neither how generalizations would develop in abstract spaces nor whether this generalization uses transformations developed in the abstract space. The specific training procedure ensures success in the presented experiments, but the availability and feasibility of an equivalent procedure in a natural setting is a crucial part of validating the original claim, and that has not been done in the paper.

      (2) Missing controls: The asymptotic performance in Exp 1 after training was quite different in the three tasks (intercepts 2.9, 1.9, 1.6 for spatial, visual and auditory, respectively; p. 5. para. 1, Fig 2BFJ). It seems that the statement "However, our main question was how participants would generalise learning to novel, rotated exemplars of the same concept." assumes that learning and generalization are independent. Wouldn't it be possible, though, that the level of generalization depends on the level of acquiring a good representation of the "concept" and that, after obtaining an adequate level of this knowledge, generalization would kick in without scaffolding? If so, a missing control is to equate the levels of asymptotic learning and see whether there is a significant difference in generalization. A related issue is that we have no information about what kind of learning was performed in the three different domains, albeit we probably suspect that in space the 2D representation was dominant while in the auditory and visual domains not so much. Thus, a second missing piece of evidence is the model fitting results of the ⦰ condition that would show which way the original sequences were encoded (similar to Fig 2 CGK and DHL). If the reason for lower performance is not individual stimulus difficulty but the natural tendency to encode the given stimulus type by a combo of random + 1D strategy, that would clarify that the result of the cross-training is, indeed, transferring the 2D-mapping strategy.

    3. eLife assessment

      These ingenious and thoughtful studies present important findings concerning how people can represent and generalise abstract patterns of sensory data. The issue of generalization is a core topic in neuroscience and psychology, relevant across a wide range of areas, and the findings will be of interest to researchers across areas in perception, learning and cognitive science. The findings are convincing in this setting, but future research must establish their generality and interrogate the precise nature of the underlying mechanism.

    4. Reviewer #1 (Public Review):

      Summary:

      This manuscript reports a series of experiments examining category learning and subsequent generalization of stimulus representations across spatial and nonspatial domains. In Experiment 1, participants were first trained to make category judgments about sequences of stimuli presented either in nonspatial auditory or visual modalities (with feature values drawn from a two-dimensional feature manifold, e.g., pitch vs timbre), or in a spatial modality (with feature values defined by positions in physical space, e.g., Cartesian x and y coordinates). A subsequent test phase assessed category judgments for 'rotated' exemplars of these stimuli: i.e., versions in which the transition vectors are rotated in the same feature space used during training (near transfer) or in a different feature space belonging to the same domain (far transfer). Findings demonstrate clearly that representations developed for the spatial domain allow for representational generalization, whereas this pattern is not observed for the nonspatial domains that are tested. Subsequent experiments demonstrate that if participants are first pre-trained to map nonspatial auditory/visual features to spatial locations, then rotational generalization is facilitated even for these nonspatial domains. It is argued that these findings are consistent with the idea that spatial representations form a generalized substrate for cognition: that space can act as a scaffold for learning abstract nonspatial concepts.

      Strengths:

      I enjoyed reading this manuscript, which is extremely well written and well presented. The writing is clear and concise throughout, and the figures do a great job of highlighting the key concepts. The issue of generalization is a core topic in neuroscience and psychology, relevant across a wide range of areas, and the findings will be of interest to researchers across areas in perception and cognitive science. It's also excellent to see that the hypotheses, methods and analyses were pre-registered.

      The experiments that have been run are ingenious and thoughtful; I particularly liked the use of stimulus structures that allow for disentangling of one-dimensional and two-dimensional response patterns. The studies are also well powered for detecting effects of interest. The model-based statistical analyses are thorough and appropriate throughout (and it's good to see model recovery analysis too). The findings themselves are clear-cut: I have little doubt about the robustness and replicability of these data.

      Weaknesses:

      In my original review I raised a concern related to a potential alternative interpretation of the findings: the idea that participants have substantial experience of representing space in terms of multiple, independent, and separable dimensions, whereas this may not be the case for the visual and auditory stimuli used here. As I noted in that prior review, on this view "the impact of spatial pre-training and (particularly) mapping is simply to highlight to participants that the auditory / visual stimuli comprise two separable (and independent) dimensions."

      In addressing this point, the authors note that performance in the visual/auditory "mapping" task in Experiments 2c and 3c suggests that most participants were paying attention to both dimensions of auditory and visual stimuli. I agree that seems to have been the case. But there is a difference between making use of information from both dimensions, and realizing that ***the two dimensions are separable and independent*** (which is what is required for rotational generalization in this task).

      As an analogy, suppose I have a task where participants have to map a pillow and a shuttlecock to category A, and a surfboard and a bicycle to category B. A participant could learn to do this just by memorizing the correct response for each item considered as a "whole thing". Or they could realize that the items contain component information, learning that "things with feathers" belong in category A, and "things that can carry people" go in category B. Performance may be the same in both cases, but the underlying process is quite different.

      The "attention to dimensions" account that I advanced in my previous review was referring to something more like the latter (feathers/vehicle) case: that spatial pre-training helps people to understand that items can be decomposed into separable pieces of information. Above-chance performance in the visual-auditory mapping task does not (necessarily) demonstrate this ability because it could reflect memorization of "whole" stimuli rather than reflecting decomposition into separable component parts. I agree that it does at least show that participants were paying attention to and making use of information from both dimensions when making their mapping decisions; it's just that they may not have *realized* that they were using information from two separable dimensions.

    1. Author Response

      The following is the authors’ response to the previous reviews.

      • Is the coronal slice in Figure 2 the corresponding mid-coronal plane to compute Dice scores? If so, the authors could mention it so that readers have an idea where the selected slice is.

      This is indeed a good point. The coronal slice in Figure 2 is not part of the set of slices that we used to compute Dice scores. Showing such a slice is important, so we have added a small figure to the appendix with one of these slices, along with the corresponding automated segmentations.

      • SIFT descriptors were adopted to detect fiducials only. Maybe it could also be applied to align stacked photographs of brain slices.

      While SIFT is robust against changes in pose (e.g., object rotation), perspective, and lighting, it is not robust against changes in the object itself – such as changes from one slice to the next, as is the case in our work. We have added a sentence to the methods section clarifying this issue.

    2. eLife assessment

      The authors of this study implemented an important toolset for 3D reconstruction and segmentation of dissection photographs, which could serve as an alternative for cadaveric and ex vivo MRIs. The tools were tested on synthetic and real data with compelling performance. This toolset could further contribute to the study of neuroimaging-neuropathological correlations.

    3. Reviewer #1 (Public Review):

      Gazula and co-workers presented in this paper a software tool for 3D structural analysis of human brains, using slabs of fixed or fresh brains. This tool will be included in Freesurfer, a well-known neuroimaging processing software. It is possible to reconstruct a 3D surface from photographs of coronal sliced brains, optionally using a surface scan as model. A high-resolution segmentation of 11 brain regions is produced, independent of the thickness of the slices, interpolating information when needed. Using this method, the researcher can use the sliced brain to segment all regions, without the need of ex vivo MRI scanning.

      The software suite is freely available and includes 3 modules. The first accomplishes preprocessing steps, for correction of pixel sizes and perspective. The second module is a registration algorithm that registers a 3D surface scan obtained prior to sectioning (reference) to the multiple 2D slices. It is not mandatory to scan the surface (a probabilistic atlas can also be used as reference); however, the accuracy is then lower. The third module uses machine learning to perform the segmentation of 11 brain structures in the 3D reconstructed volume. This module is robust, dealing with different illumination conditions, cameras, lenses and camera settings. This algorithm ("Photo-SynthSeg") produces isotropic smooth reconstructions, even in highly anisotropic datasets (when the in-plane resolution of the photograph is much higher than the thickness), interpolating the information between slices.

      To verify the accuracy and reliability of the toolbox, the authors reconstructed 3 datasets, using real and synthetic data. Real data of 21 postmortem confirmed Alzheimer's disease cases from the Massachusetts Alzheimer's Disease Research Center (MADRC) and 24 cases from the AD Research at the University of Washington (who were MRI scanned prior to processing) were employed for testing. These cases represent a challenging real-world scenario. Additionally, 500 subjects of the Human Connectome Project were used for testing error as a continuous function of slice thickness. The segmentations were performed with the proposed new deep-learning algorithm ("Photo-SynthSeg") and compared against MRI segmentations performed with "SAMSEG" (an MRI segmentation algorithm), computing Dice scores for the segmentations. The methods are sound and statistically showed correlations above 0.8, which is good enough to allow volumetric analysis. The main strengths of the methods are the datasets used (real-world challenging and synthetic) and the statistical treatment, which showed that the pipeline is robust and can facilitate volumetric analysis derived from brain sections, and identified which factors can influence the accuracy of the method (such as using or not using a 3D scan and using constant thickness).
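      For readers unfamiliar with the Dice score used in this comparison, the overlap between two label maps A and B for a given structure is 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name, the toy label arrays, and the convention of returning 1.0 when a label is absent from both segmentations are assumptions made for illustration.

```python
from typing import Sequence


def dice_score(seg_a: Sequence[int], seg_b: Sequence[int], label: int) -> float:
    """Dice overlap for one label between two flattened segmentations."""
    if len(seg_a) != len(seg_b):
        raise ValueError("segmentations must have the same number of voxels")
    in_a = [value == label for value in seg_a]
    in_b = [value == label for value in seg_b]
    intersection = sum(1 for a, b in zip(in_a, in_b) if a and b)
    total = sum(in_a) + sum(in_b)
    if total == 0:
        # Assumed convention: a label missing from both maps counts as
        # perfect agreement.
        return 1.0
    return 2.0 * intersection / total


# Hypothetical flattened label maps (0 = background, 1 and 2 = structures).
photo_seg = [0, 1, 1, 2, 2, 2, 0, 0]
mri_seg = [0, 1, 2, 2, 2, 2, 0, 1]

for structure in (1, 2):
    print(structure, round(dice_score(photo_seg, mri_seg, structure), 3))
```

      A score of 1 means the two segmentations of a structure coincide exactly, and 0 means they do not overlap at all.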

      Although very robust and capable of handling several situations, the researcher has to keep in mind that processing has to follow some basic rules in order for this pipeline to work properly. For instance, fiducials and scales need to be included in the photograph, and the slabs should be photographed against a contrasting background. Also, only coronal slices can be used, which can be limiting for certain situations.

      The authors achieved their aims, and the statistical analysis confirms that the machine learning algorithm performs segmentations comparable to the state of the art in automated MRI segmentation. These methods will be particularly interesting to researchers who deal with post-mortem tissue analysis and do not have access to ex vivo MRI. Quantitative measurements of specific brain areas can be performed in different pathologies and even in the normal aging process. The method is highly reproducible, and cost-effective since it allows the pipeline to be applied by any researcher with small pre-processing steps.

    4. Reviewer #2 (Public Review):

      Summary

      The authors proposed a toolset, Photo-SynthSeg, for the software FreeSurfer, which performs 3D reconstruction and high-resolution 3D segmentation on a stack of coronal dissection photographs of brain tissues. To prove the performance of the toolset, three experiments were conducted, including volumetric comparison of brain tissues on AD and HC groups from MADRC, quantitative evaluation of segmentation on UW-ADRC and quantitative evaluation of 3D reconstruction on HCP digitally sliced MRI data.

      Strengths

      To guarantee the successful workflow of the toolset, the authors clearly mentioned the prerequisites of dissection photograph acquisition, such as fiducials or rulers in the photos and tissue placement of brain slices with more than one connected component. The quantitative evaluation of segmentation and reconstruction on synthetic and real data demonstrates the accuracy of the methodology. Also, the successful application of this toolset on two brain banks with different slice thicknesses, tissue processing and photograph settings demonstrates its robustness. By working with tools of the SynthSeg pipeline, Photo-SynthSeg could further support volumetric cortex parcellation. The toolset also benefits from its adaptability to different 3D references, such as surface scan, ex vivo MRI and even probabilistic atlas, suiting the needs of different brain banks.

      Weaknesses

      Certain weaknesses are already covered in the manuscript. Cortical tissue segmentation could be further improved. The quantitative evaluation of 3D reconstruction is quite optimistic due to random affine transformations. Manual edits of slice segmentation task are still required and take a couple of minutes per photograph. Finally, the current toolset only accepts coronal brain slices and should adapt to axial or sagittal slices in future work.

    1. eLife assessment

      Parkkinen et al. describe distinct biophysical profiles of brain-derived alpha-synuclein vs. in vitro-seeded synuclein fibrils. The findings are important and relevant to the emerging potential role of SAA as a biomarker of PD. The evidence is solid, using appropriate methodology and providing some support for the main claims with some limitations.

    2. Reviewer #1 (Public Review):

      SUMMARY:

      Parkinson's disease (PD) and other synucleinopathies, including Parkinson's Disease Dementia (PDD), Dementia with Lewy Bodies (DLB), and Multiple System Atrophy (MSA), pose significant challenges for early diagnosis, as their clinical manifestations often emerge after substantial neurodegeneration has occurred. In this context, the Alpha-Synuclein Seeding Amplification Assay (SAA) has garnered considerable attention for its potential as a diagnostic tool, capable of detecting pathological forms of alpha-synuclein (αSyn) even before the onset of classical clinical symptoms and signs. The assay exploits αSyn's intrinsic property to convert healthy forms into pathological ones, subsequently amplifying these pathological forms for visualization. This study aims to investigate the efficacy of SAA in accurately identifying subtypes of synucleinopathies, including PD, PDD, DLB, and MSA. To achieve this, the results from the patient brain-derived αSyn SAA are compared with those obtained through conformational stability assays, immunolabeling, and electron microscopy. The study shows that brain-derived αSyn fibrils exhibit significant differences across various synucleinopathies in their conformation, biochemical profile and phosphorylation patterns. Importantly, the SAA method appears to fall short in capturing these distinctions.

      The study's findings are highly relevant given the rapidly advancing landscape of utilizing the SAA for the diagnosis and differentiation of various forms of PD and synucleinopathies using patient biofluids. It is somewhat surprising that the authors primarily characterize SAA as a research tool without delving into its potential as a biomarker detection assay, especially in the context of the field's excitement about its diagnostic applications. Additionally, a missed opportunity lies in not referencing a recent study that employed SAA successfully to diagnose PD and subtype the condition using a vast sample size. To further strengthen the results, the inclusion of healthy control brains in the biochemical and immunostaining/immunoblot experiments would provide more robust comparisons. Overall, the authors have conducted their experiments diligently, and their study offers valuable insights that align with the ongoing efforts to enhance early diagnosis and subtype differentiation in the domain of synucleinopathies.

      STRENGTH:

      The strengths of this research article are indeed notable and contribute to the credibility and significance of the study:

      Important Research Question: The study addresses a crucial question in the field of neurodegenerative diseases by evaluating the effectiveness of the αSyn SAA in diagnosing and differentiating synucleinopathies. This question is of significant clinical and scientific interest.
      Comprehensive Introduction: The article provides a thorough and well-structured introduction to the topic with an illustration, setting the stage for the research. It ensures that readers, including those unfamiliar with the subject matter, can grasp the context and significance of the study.
      Use of Patient Brain Tissue: The use of patient-derived brain tissue samples from various synucleinopathies, including PD, PDD, DLB, and MSA, enhances the clinical relevance and applicability of the findings.
      Replication and Statistical Significance: Conducting the experiments six times for each sample demonstrates the rigor of the study and the robustness of the results, and increases the confidence in the conclusions drawn.
      Clarity in Experimental Results and Discussion: The authors have presented the experimental results in a clear and understandable manner. I was personally impressed by images showing twisted and straight conformations of αSyn, as well as immunogold labeling for phosphorylation of αSyn, which aids in conveying the findings effectively to the readers. The results clearly show distinct differences in the characteristics of αSyn fibrils across different synucleinopathies. It also highlights the more aggressive seeding capacities and higher biochemical stability of αSyn in PDD and DLB patients, offering valuable insights into the pathophysiology of these conditions. The authors also clearly show that SAA fails in differentiating the disease types within the synucleinopathies.
      Clinical relevance: The study underscores the importance of considering complementary diagnostic methods alongside SAA for a more comprehensive understanding of synucleinopathy subtypes. The study might also play an important role in potential FDA approval of SAA as a diagnostic tool for synucleinopathies, especially for PD.
      These strengths collectively make the study a valuable contribution to the field of neurodegenerative diseases, shedding light on the limitations and potential applications of SAA in the diagnosis and differentiation of various synucleinopathies.

      WEAKNESS:

      While this study is overall robust, there are several aspects that could further enhance the quality and interpretation of the findings.

      Clinical Data on Patient Brain Samples: The inclusion of specific details such as post-mortem intervals and the age at disease onset for patient brain samples would be valuable. These factors could significantly affect the quality of the tissues and their relevance to the study. Moreover, given the large variation in disease duration between PD and PDD, it's important to consider disease duration as a potential confounding factor, especially when concluding that PDD patients have a more severe form of synucleinopathy compared to PD.
      Inclusion of Healthy Controls in Multiple Tests: Given the importance of healthy controls in scientific studies, especially those involving human brain samples, the authors could consider using healthy controls in more tests to strengthen the robustness of the findings. Expanding the use of healthy controls in biochemical profiling and phosphorylation profiles would provide a better basis for comparison and clarify the significance of results in a disease context.
      This will help the authors to elaborate on the interpretation of results, for example, in Figure 3, where the authors claim that PD brains show mostly monomeric αSyn forms (lines 119-120 and 222-223). Does this imply the absence of alpha-syn pathology in PD brains? Are there differences from healthy controls? What are the low molecular weight bands (<15kD) (lines 125-126), and are they also present in healthy controls? Also, we do not have a perfect pS129-specific (anti-pαSyn) antibody; such antibodies are known for non-specific labeling. Investigating the phosphorylation levels in healthy controls and comparing them to PD brains, especially considering the predominance of monomeric (healthy αSyn?) forms in PD brains, would help clarify the observed changes.
      Age of Healthy Controls: Providing information about the age at death for healthy controls is crucial, as age can impact the accumulation of αSyn. Also state whether the brain samples were age-matched or the analyses were age-adjusted.
      Braak Staging Discrepancy: The study reports the same Braak staging for both PD and PDD, despite the significant difference in disease duration. Maybe other reviewers with clinical experience might have a better take on this. This observation merits discussion in the paper, allowing readers to better understand the implications of this finding.
      Citation of Relevant Studies: The paper should consider citing and discussing a recent celebrated study on PD biomarkers that used thousands of cerebrospinal fluid (CSF) samples from different PD patient cohorts to demonstrate the effectiveness of SAA as a biochemical assay for diagnosing PD and its subtypes (https://doi.org/10.1016/S1474-4422(23)00109-6).
      In summary, these suggestions aim to enhance the study's quality and the clarity of its findings, ultimately contributing to a more comprehensive understanding of synucleinopathies and the diagnostic potential of SAA.

    3. Reviewer #2 (Public Review):

      Most neurodegenerative diseases are characterized by the self-templated misfolding of a particular protein in a manner that enables progressive spread throughout the central nervous system. In diseases including Parkinson's disease (PD) and multiple system atrophy (MSA), the protein alpha-synuclein misfolds into unique shapes, or strains, which use this self-replicating mechanism to encode disease-specific information. Previous research suggests that a major contributor to the lack of successful clinical trials across neurodegenerative diseases is the lack of disease-relevant strains used in preclinical testing. While MSA patient samples are known to replicate efficiently in cell and mouse models of disease, Lewy body disease (LBD) patient samples do not. To overcome this obstacle, the seeding amplification assay (SAA) uses recombinant alpha-synuclein to amplify the misfolded protein structure present in a human patient sample. The resulting fibrils are then widely used by many laboratories as a model of PD. In this manuscript, Lee et al. set out to compare the strain properties of alpha-synuclein fibrils isolated from LBD and MSA patient samples with the resulting amplified fibrils following SAA. Using orthogonal biochemical and structural approaches to strengthen their analyses, the authors report that the SAA-amplified fibrils do not recapitulate the disease-relevant strains present in the patient samples. Moreover, their data suggest that regardless of which strain is used to seed the SAA reaction, the same strain is generated. These results clearly demonstrate that the SAA-amplified material is not disease-relevant. SAA fibrils are broadly used in academic and pharmaceutical laboratories. They are used in ongoing drug discovery efforts and recombinant fibrils broadly inform much of what is known about alpha-synuclein strain biology in LBD patients. The implications of the reported work are, therefore, expansive. These findings add to the growing ledger of reasons that the use of SAA fibrils in research should be halted until improved methods for amplification with high fidelity are developed.

    4. Reviewer #3 (Public Review):

      Summary:

      This interesting manuscript presents a comparison of biophysical properties, TEM appearances, and phosphorylation patterns of brain-derived synuclein fibrils from 3 subjects each with Parkinson Disease (PD), Parkinson Disease with Dementia (PDD), Dementia with Lewy bodies (DLB) and Multiple System Atrophy (MSA), the effects of studying these brain-derived fibrils in a Seeding Amplification Assay (SAA), and a comparison of the seeded and resultant fibers. The results are not unexpected.

      Strengths:

      The work explores an important question: what is the fidelity of synuclein fibrils produced during an SAA reaction to the starting material when that material has been extracted from the brains of deceased patients with synucleinopathies?

      Weaknesses:

      The work suffers from several methodological flaws.

      The experiments are missing two important controls: 1) what do fibrils generated by different in vitro fibril preparations made from recombinant synuclein protein look like; and 2) the use of CSF from the same patients whose brain tissue was used, to assess whether CSF and brain seeds look and behave identically. The latter is perhaps the most important question of all - namely, how representative are CSF seeds of what is going on in patients' brains?

      In their discussion, the authors do not comment on the obvious differences between the conditions leading to the formation of seeds in the brain and the artificial conditions of the seeding assay. Why should the two sets of conditions be expected to yield similar morphologies, especially since the extracted fibrils are subjected to harsh conditions for solubilization and re-suspension?

      Finally, the key experiment was not performed - would the resultant seeds from SAA preparations from the different nosological entities produce different pathologies when injected into animal brains? But perhaps this is the subject of a future manuscript.

      Furthermore, the authors comment on phosphorylation patterns, stating that the resultant seeds are less heavily phosphorylated than the original material. Again, this should not be surprising, since the SAA assay conditions are not known to contain the enzymes necessary to phosphorylate synuclein. The discussion of PTMs is limited to pS-129 phosphorylation. What about other PTMs? How does the pattern of PTMs affect the seeding pattern?

      Lastly, the manuscript contains no data on how the diagnostic categories were assigned at autopsy. This information should be included in the supplementary material.

    1. eLife assessment

      The authors provide convincing experimental evidence of extended motivational signals encoded in the mouse anterior cingulate cortex (ACC) that are implemented by the orbitofrontal cortex (OFC)-to-ACC signaling during learning. The results are valuable to the field of motivation and cognition. The experimental methods used were state-of-the-art. The manuscript would further benefit from theory-driven analyses to inform a mechanistic understanding, particularly for the single-cell calcium imaging results. These results will be of interest to those interested in cortical function, learning, and/or motivation.

    2. Reviewer #1 (Public Review):

      Summary:

      This is an interesting report examining activity patterns in mouse ACC and in the OFC neurons projecting to ACC. In addition, the effects of inactivation are examined. In aggregate, the results provide new and interesting information about these two brain areas and how they translate motivation into action - a function that it seems intuitively plausible that ACC might perform but, despite this intuition, there have been comparatively few direct tests of the idea and little is known of the specific mechanisms. The study is performed carefully and is written up clearly. There were just a few points where I wondered if a little more clarification might be helpful.

      Strengths:

      The combination of recording and inactivation/inhibition experiments and the combination of investigation of ACC neurons and of OFC regions projecting to ACC are very impressive.

      Weaknesses:

      These are all minor points of clarification.

      (1) An important conclusion (Figure 4) is that when mice are trained to run through no reward (N) cues in order to reach reward (R) cues, the OFC neurons projecting to ACC each respond to different specific events in a manner that ensures that collectively they tile the extended behavioural sequence. What I was less sure of was whether the ACC neurons do the same or not. Figure 3 suggests that on average ACC neurons maintain activity across N cues in order to get to R cues but I was not sure whether this was because all individual neurons did this or whether some had activity patterns like the OFC neurons projecting to ACC.

      (2) Figure 1 versus Figure 2: There does not seem to be a particular motivation for whether chemogenetic inactivation or optogenetic inhibition were used in different experiments. I think that this is not problematic but, if I am wrong and there were specific reasons for performing each experiment in a certain way, then further clarification as to why these decisions were made would be useful. If there is no particular reason, then simply explaining that this is the case might stop readers from seeking explanations.

      (3) P5, paragraph 2. The authors argue that OFC and anteromedial (AM) thalamic inputs into ACC are especially important for mediating motivation through N cues in order to reach R cues. Is this based on a statistical comparison between the activity in OFC or AM inputs as opposed to the other inputs?

      (4) P3, paragraph 2. Some papers by Khalighinejad and colleagues (e.g., Neuron 2020; Current Biology 2022) might be helpful here inasmuch as they assess ACC roles in determining action frequency, initiation, and speed and in mediating the relationship between reward availability and action frequency and speed.

      (5) Paragraph 1 "This learning is of a more deliberate, informed nature than habitual learning, as they are sensitive to the current value of outcomes and can lead to a novel sequence of actions for a desired outcome1-3." Should "they" be "it"?

    3. Reviewer #2 (Public Review):

      Summary:

      Regalado et al. studied how an extended motivational state, necessary for maintaining behavioural drive despite unrewarding experiences, could be encoded in the ACC and its potential causal implications for learning discriminatory behaviour and avoiding unrewarding stimuli. They designed a self-initiated learning task and identified bulk neural responses tuned specifically to reward delivery as well as trial initiation. Interestingly, in both cases, neural activity precedes behavioural onset, indicating the encoding of a motivational signal. To investigate the neural encoding of motivational signals during unrewarded, distracting stimuli presentation, they created a discrimination task by introducing 'no reward' cues, during which animals need to learn not to reduce running speed and not engage in licking. Interestingly, with mice learning to increase running speed and reduce licking rates after 'no reward' cues, the preceding ACC activity also gradually increased. Importantly, only the increase in running speed after 'no reward' cues was impaired upon optogenetic inhibition of ACC activity during early training, linking the extended motivational signal in ACC and learning to maximise rewards by actively avoiding distracting and unrewarded stimuli. Such motivational signals could also be observed in OFC-ACC projecting neurons. Especially the continuous ramping of activity upon repeated 'non-reward' cues, which could be exclusively observed in the 'fast learner' subgroup, provides an interesting concept of how an extended motivational signal necessary for learning avoidance of unrewarded stimuli could be implemented in ACC. The shift in the temporal activity of initially reward-responsive neurons towards the preceding 'no reward' cue, provides a potential mechanism linking extended motivation to reward maximisation. This mechanism seems to be particularly important in periods of persistent 'non-reward' cues, as demonstrated in the impairment of running speed increase after two consecutive 'non-reward' cues.

      Appraisal:

      The authors provide convincing experimental evidence to support their claims of an extended motivational signal encoded in the ACC that is implemented by OFC-ACC signalling and critically involved in learning avoidance of unrewarded stimuli. The newly designed task seems appropriate to identify correlates of relevant cognitive and behavioural variables (e.g. sustained motivation). The combination of recording Ca2+ transients (bulk as well as longitudinal single neuron recordings) to identify potential neural responses and subsequent evaluation of their causal role in establishing and maintaining this persistent motivational state using opto- and pharmacogenetic manipulations is generally accepted.

      Impact:

      The findings will be valuable for further research on the impact of motivational states on behaviour and cognition. The authors provided a promising concept of how persistent motivational states could be maintained, as well as established a novel, reproducible task assay. While experimental methods used are currently state-of-the-art, theoretical analysis seems to be incomplete/not extensive.

    1. eLife assessment

      This study presents important methodologies for repeated brain ultrasound localization microscopy (ULM) in awake mice and a set of results indicating that wakefulness reduces vascularity and blood flow velocity. The efficiency of the technique is however incompletely demonstrated, in particular regarding the reliability of longitudinal imaging. This study is relevant for scientists investigating vascular physiology in the brain.

    2. Reviewer #1 (Public Review):

      Summary:

      Wang and colleagues present a study aimed at demonstrating the feasibility of repeated ultrasound localization microscopy (ULM) recording sessions on mice chronically implanted with a cranial window transparent to ultrasound. They provided quantitative information on their protocol, such as the required number of contrast-enhancing microbubbles (MBs) to get a clear image of the vasculature of a brain coronal section. Also, they quantified the co-registration quality over time-distant sessions and the vasodilator effect of isoflurane.

      Strengths:

      The study showed a remarkable performance in recording precisely the same brain coronal section over repeated imaging sessions. In addition, it sheds light on the vasodilator effect of isoflurane (an anesthetic whose effects are not fully understood) on the different brain vasculature compartments, although, as the authors stated, some insights in this aspect have already been published with other imaging techniques. The experimental setting and protocol are very well described.

      Weaknesses:

      While the title is fair with respect to the data shown, in the summary and the rest of the paper, the comparison between anesthetized and awake conditions is systematically stated, while more caution should be used.

      First, isoflurane is one of the (many) anesthetics commonly used in pre-clinical research, and its effect on the brain vasculature cannot be generalized to all anesthetics. Indeed, other anesthesia approaches do not produce evident vasodilation; see ketamine + medetomidine mixtures. Second, the imaged awake state is head-fixed and body-constrained in mice, a condition that can generate substantial stress in the animals. In this study, there is no evaluation of the stress level of the mice. In addition, the awake imaging sessions were performed a few minutes after the mouse woke up from isoflurane induction, which is necessary to inject the MB bolus. It is known that the vasodilator effects of isoflurane last a long time after its withdrawal. This aspect would have influenced the results, eventually underestimating the difference with respect to the awake state.

      These limitations should be clearly described in the Discussion.

      Looking at Figure 2e, it takes more than 5 minutes to reach the 5 million MB count useful for good imaging. However, the MB count per pixel drops to a few % at that time. This information tells me that (i) repeated measurements are feasible but with limited brain coverage, since a single 'wake up' is needed to acquire a single brain section, and (ii) this approach cannot fit the requirements of functional ULM, which requires merging the responses to multiple stimuli to get a complete functional image. Of course, a chronic i.v. catheter would fix the issue, but this configuration is not trivial to test in the experimental setup proposed by the authors, hindering the extension of the approach to fULM.

      Statistics are often poor or not properly described. The legend and the text referring to Figure 2 do not report any indication of the number of animals analyzed. I assume it is only one, which makes the findings strongly dependent on the imaging quality of THAT mouse in THAT experiment. Three mice have been displayed in Figure 3, as reported in the text, but it is not clear whether there is one mouse for each brain section shown. Figure 5 reports quantitative data on blood vessels in awake versus isoflurane states, but gives no indication of the number of mice tested, the number of blood vessels measured per type, or whether the statistics were computed across mice or with a multivariate method. Also, a t-test is inappropriate when the goal is to compare different brain regions and blood vessel types. Similar issues partially apply to Figure 6, too.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors present a very interesting collection of methods and results using brain ultrasound localization microscopy (ULM) in awake mice. They emphasize the effect of the level of anesthesia on the quantifiable elements assessable with this technique (i.e. vessel diameter, flow speed, in veins and arteries, area perfused, in capillaries) and demonstrate the possibility of achieving longitudinal cerebrovascular assessment in one animal during several weeks with their protocol.

      Strengths:

      Even if the methods elements considered separately are not new (brain ULM in rodents, setup for longitudinal awake imaging similar to those used in fUS imaging, quantification of vessel diameters/bubble flow/vessel area), when masterfully combined as it is done in this paper, they answer two questions that have been long-running in the community: what is the impact of anesthesia on the parameters measured by ULM (and indirectly in fUS and other techniques)? Is it possible to achieve ULM in awake rodents for longitudinal imaging? The authors answer quite exhaustively the first question. The manuscript is well-constructed and well-written, and the graphics are appealing.

      Weaknesses:

      The only major comment (calling for further work) I would like to make is the relative weakness of the manuscript regarding longitudinal imaging (mostly Figure 6), compared to the exhaustive review of the effect of isoflurane on the vasculature (3 rats, 3 imaging planes, quantification on a large number of vessels, in 9 different brain regions). The 6 cortical vessels evaluated in Figure 6 feel really disappointing. As longitudinal imaging is supposed to be the salient element of this manuscript (first word appearing in the title), it should be as good and trustworthy as the first part of the paper. Figure 6c is of major importance and should be supported by a more extensive vessel analysis, including various brain areas, and validated on several animals to establish the robustness of longitudinal positioning across several instances of the surgical procedure. Figure 6d estimates the reliability of flow measurements on 3 vessels only. Therefore, I recommend showing something similar to what is done in Figures 4 and 5: 3 animals, and more extensive quantification in different brain regions.

    4. Reviewer #3 (Public Review):

      Summary:

      In this manuscript, Wang et al. performed a study looking at vascular changes in response to anesthesia in awake mice using ultrasound localization microscopy (ULM). The authors report a reduction of vascularity and blood flow velocity in the awake state. In addition, they demonstrate the reproducibility of ULM measurements in time.

      Strengths:

      Demonstration that high-quality, state-of-the-art ULM images can be performed using cranial windows in awake animals.
      Demonstration that repeated imaging in time produces comparable images.

      Weaknesses:

      It is unclear whether multiple animals were used in the statistical analysis.
      Generalizations are sometimes drawn from what seems to be the analysis of a single vessel.
      The description of the statistical analysis is mostly qualitative.
      Some terms used are insufficiently defined.
      Additional limitations should be included in the discussion.
      Some technical details are lacking.

      Without information about whether the results obtained come from multiple animals, it is difficult to conclude that the authors generally achieved their aim. They do achieve it in a single animal.

      The results that are shown are interesting and could have an impact on the ULM community and beyond. In particular, the experimental setup they used along with the high reproducibility they report could become very important for the use of ULM in larger animal cohorts.

    1. eLife assessment

      This valuable manuscript shows that the optogenetic stimulation of direct and indirect pathway spiny projection neurons (SPNs) in the dorsomedial versus the dorsolateral striatum has different consequences for locomotor activity, real-time place preference, and action selection, in a contextually mediated manner. The evidence in support of this conclusion is solid but would be further strengthened through deeper analysis of the effect and specificity of optogenetic manipulations on SPN activity. These findings will be of interest to neuroscientists, particularly behavioral neuroscientists.

    2. Reviewer #1 (Public Review):

      Summary:

      The motivating questions are an accurate reflection of the current state of knowledge surrounding striatal pathway function. The comparisons of pathway function across striatal subregion, activation & inhibition, and task context are laudable and extremely important for advancing the subfield. Had these manipulations, to the largest extent possible, been performed in single animals (e.g. activate dSPNs of DMS or DLS in the same mouse across the 3 tasks), this would have significantly strengthened the impact and the conclusions that could be drawn, by making this set of studies even more internally consistent and directly comparable. While this is no longer possible, a conceptually related and fantastic contribution to the subfield (and likely beyond, in terms of optogenetic manipulations of brain areas) would be to directly demonstrate that, within their studies, the DMS pathway manipulations do not impact nearby DLS activity (and vice versa). This is a significant and non-essential request. More feasibly and reasonably, it would be fantastic and would strengthen the conclusions here to more fully detail the opsin expression patterns in DMS vs DLS groups and perhaps attempt to relate individual opsin profiles and fiberoptic targeting to behavioral outcomes across tests.

      Strengths:

      A comprehensive and paired comparison of inhibition and activation of striatal pathways across subregions and tasks is a very important and meaningful step towards reconciling contradictory results on striatal pathway function that are observed across labs (who typically focus on one subregion, one task setting, and often do not directly report comparisons of activation and inhibition).

      Weaknesses:

      Figure 1A - the example DMS vs DLS opsin expression and fiber targeting are not terribly convincing that the manipulations will be specific to each subregion (the example in Figure 2A is a little better but I have a similar concern still). The specificity of these manipulations is key to interpretation and conclusions and I strongly feel they should be strengthened here. The best evidence would be direct neural recordings (light in DMS, no effect in DLS, and vice versa), but this is a tall ask and not expected. The next best option, which is readily feasible, is to show not only fiberoptic targeting summaries (as in Figure 1A, Figure 2A) but also a summary of opsin spread for all animals (especially given the two examples appear to have significant spread across DMS and DLS). It would be of great benefit to the field to have these in the Allen Common Coordinate Framework. It would also be fine and useful to utilize the authors' current classical histological atlas alignment methods (e.g. Paxinos pdf). These histological summary figures would also benefit from being larger and more visible (perhaps as separate supplemental figures associated with the main figures).

      Related to the above, it is a concern that the classic view is supported or not because of individual variations in virus/fiber targeting to striatal subregions, which likely have greater granularity than the traditional dorsal medial vs lateral (e.g. Hunnicutt et al 2016, Foster et al 2021, Hintiryan et al 2016). Although there may not be enough animals or variation in targeting in the present study to find meaningful relationships, it would strengthen the paper and be a great benefit to the field to know whether, for key findings, the strength of behavioral effects correlated with anterior/posterior, medial/lateral, or dorsal/ventral fiberoptic coordinates (or the volume of opsin expression profiles).

      Conceptually, no clear new idea or integrative interpretation of prior work (or even of the large body of results within this work) comes to the fore, save for the already appreciated fact that the classic view of opposing pathways is sometimes supported and sometimes not. Two tangible suggestions that I believe would facilitate the influence of this study: (1) can the authors more thoughtfully bridge the logical steps in their results sections and the prior context around them (some topic sentences jump right into results, e.g. line 195: "The inhibition experiment showed"), and (2) in the discussion, rather than emphasizing when/where the classic view is supported and not, more content on precisely why would be helpful. Some more specific questions: if DMS/DLS pathway activation/inhibition is *mostly* oppositely appetitive/aversive, what does that mean in the context of spontaneous or reward-guided locomotion? Self-initiated pathway activation/inhibition is in part learned (with very intriguing differences across pathways in the expression across learning) - how should we think about striatal pathway function with regard to learning, spontaneous/innate behaviors, vs over-trained behaviors? When the classic view fails in the dorsal striatum - why? And is a complementary "model" an actual alternative concept, a distinct mode of circuit function, or just a negative result on the classic view?

    3. Reviewer #2 (Public Review):

      Summary:

      Cuevas et al. investigate the involvement of DMS and DLS direct and indirect pathways in locomotion and action selection using optogenetic manipulation techniques. They show that optical excitation of dSPNs in both DMS and DLS induces place preference, with optical inhibition resulting in the opposite effect. Interestingly, though not surprisingly given the many previous data on this, optical excitation of iSPNs in both regions resulted in place aversion - in line with the classical view of functional opposition.

      Then, the authors performed a two-choice task in which animals would have to choose between pressing a lever alone or a lever+stim to obtain a food reward. Again, and not surprisingly, they show that optical activation of dSPNs results in a preference for pressing the lever+stim, with the opposite being observed for iSPNs, in both DMS and DLS. What was disconcerting was the increase in lever pressing when inhibiting dSPNs in the DMS, since the authors' earlier results indicate that it should cause aversion. When looking at locomotor effects, the authors report an increase in spontaneous displacement when exciting dSPNs in DMS, and the opposite in DLS. In contrast, the excitation of iSPNs in either DMS or DLS increased spontaneous displacement. In reward-seeking displacement, excitation of either dSPNs or iSPNs in both regions resulted in decreased locomotion.

      Strengths:

      Overall, this manuscript sheds new light on the involvement of DLS SPNs in both locomotion and behavioral preference.

      Weaknesses:

      Some of the main claims would benefit from further discussion or new data on the effect of optogenetic manipulation on the activity of SPNs. This could allow for the creation of a clearer picture of the involvement of iSPNs and dSPNs of DMS and DLS for behavior.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1

      Weaknesses:

      Start site fidelity in purified reconstituted systems can be dramatically altered in different buffer conditions. Interpretation of the observed changes to start site selection in mRNAs in the absence or presence of Ded1 using only the one buffer condition used is therefore limited.

      This is an excellent point and is something we could explore in future studies using the Rec-Seq system. We have added this caveat to the Discussion on lines 797-809. We have previously studied the fidelity of start codon recognition in the reconstituted system (Kolitz et al., [2009] RNA, 15:138-152) and found that under our standard buffer conditions the codon specificity generally reflects what we observed in vivo using a dual-luciferase reporter assay, with the most stable 48S complexes forming on AUG codons, followed by first position mismatches (GUG, UUG, CUG), with second and third position mismatches leading to significantly less stable complexes. However, as the reviewer notes, there are some deviations: ACG and AUA are poor codons in the in vitro system under the buffer conditions used but allowed relatively strong expression in our in vivo reporter assay. It should also be noted that the hierarchy of near-cognate start codon usage in vivo in yeast differs according to the study and the reporter used, making it difficult to establish a “ground truth” for start codon fidelity.

      I have some specific comments to strengthen the manuscript and address some minor issues.

      It is not clear to me whether the authors refold the purified mRNA after phenol/chloroform extraction? Have the authors observed different results if the mRNA is refolded or not? This is appropriate since the authors compare their Rec-Seq data to PARS scores that were generated from refolded mRNAs. One assumes that the total mRNA used is refolded in the same way as the PARS score study, but this is not clearly stated. The authors should make this point clear in the text and methods.

      This is an excellent point. We did not use the final refolding protocol that Kertesz et al. used when they developed their PARS scores and now clarify this in the Methods section (lines 962-967). It is possible that we would have seen stronger correlations in the analyses using PARS scores had we followed the renaturation protocol, although the fact that we observed significant correlations (e.g., Fig. 3E-H) suggests the structures in the Kertesz et al. mRNAs were similar to those in our mRNAs.

      It is not clear how the authors determine the concentration of total mRNA that is used in the assay - reported as 60 nM? Are the authors assuming a molecular weight of an average mRNA to determine the concentration? The authors should provide more detail for how they quantify their mRNA concentration and its stoichiometry compared to 43S PICs.

      We thank the reviewer for pointing out this oversight and have now included this information on lines 849-855 of the Methods section.

      Comments regarding start site fidelity in the reconstituted system:

      The authors use in vitro transcribed tRNAi-Met. Since tRNA modifications may play a role in start site fidelity, the authors should perhaps mention that this will need to be investigated in a future study in the discussion.

      This is a good point and we now note it as a caveat in the Discussion on lines 806-809.

      The authors state that Ded1 promotes leaky scanning regardless of the mAUG start site context (page 24; lines 533-534). The authors then state on page 25 that the level of iAUG initiation relative to mAUG initiation does depend on the mAUG context (lines 545-546). This seems contradictory unless I am not understanding this correctly? It would certainly be surprising that mAUG context didn't regulate leaky scanning in the reconstituted system given the fact that initiation codon context regulates selection in cells (when Ded1 is present).

      These statements are correct as written. As shown in Figure 5O, the frequency of leaky scanning (as measured by relative ribosome occupancy of the internal region of the ORF, not including the main start codon, to the whole ORF, including the main start codon; RRO) decreases as the context score around the start codon gets stronger (green and purple lines). The RRO is increased to the same extent when 500 nM Ded1 is added, regardless of the strength of the start codon context, indicating that Ded1 enhances leaky scanning equally (compare slopes of the green line without Ded1 to the purple line with Ded1). Because of this, the effect of Ded1 on RRO (ΔRRO) is constant across context score bins (orange line). There is no discrepancy between our two conclusions that leaky scanning of the mAUG increases as context score decreases and that Ded1 increases leaky scanning equally for good and bad mAUG contexts, indicating that Ded1 does not inspect the mAUG context and simply decreases the dwell time equally at all contexts.
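      The logic of this response can be illustrated with a minimal sketch. It assumes that the whole-ORF occupancy is simply the sum of internal-region reads and main-start-codon reads; the function names and all read counts are hypothetical and are not taken from the manuscript. The sketch shows how RRO can fall as the start codon context strengthens while the effect of Ded1 on RRO stays the same in every context bin.

```python
# Hypothetical illustration only: names and counts are invented, not the authors' data.

def rro(internal_reads, main_start_reads):
    """Relative ribosome occupancy: internal-region reads over whole-ORF reads.

    Assumption: whole ORF = internal region + main start codon region.
    """
    whole_orf = internal_reads + main_start_reads
    if whole_orf == 0:
        raise ValueError("ORF has no reads")
    return internal_reads / whole_orf


def ded1_effect_by_bin(without_ded1, with_ded1):
    """Per-bin change in RRO on adding Ded1 (with minus without)."""
    return [rro(*w) - rro(*n) for n, w in zip(without_ded1, with_ded1)]


# (internal_reads, main_start_reads) per context-score bin, weakest to strongest.
without_ded1 = [(30, 70), (20, 80), (10, 90)]
with_ded1 = [(50, 50), (40, 60), (30, 70)]

rro_without = [rro(*b) for b in without_ded1]   # falls as context strengthens
rro_with = [rro(*b) for b in with_ded1]         # also falls, but shifted upward
effect = ded1_effect_by_bin(without_ded1, with_ded1)  # roughly equal in every bin
```

      In this toy case both RRO series decline with context strength, yet the per-bin difference is the same, which is the pattern the two statements jointly describe.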

      Further to the start site context question. It is possible that the fidelity of the reconstituted system (i.e. buffer conditions) is not fully reflecting in vivo-like start site selection. A rigorous characterization of commercially available reticulocyte lysate systems identified buffer conditions that provided similar start site fidelity to that observed in live cells (Kozak. Nucleic Acids Res. 1990 May 11;18(9):2828). While I feel that it is beyond the context of the current work to undertake a similar rigorous buffer characterization, one must be careful about interpreting the results about leaky scanning and upstream initiation sites in the current work. Perhaps one would observe similar results to Guenther et al. if the fidelity (buffer conditions) of the reconstituted system were different? I appreciate that the authors state that their results only apply to their reconstituted system and do not necessarily suggest that previous data are incorrect, but with only one buffer condition being tested in the current study it may be appropriate to further soften the interpretation of the current results when compared to published data in live cells.

      This point is well-taken. As noted above, we have added a caveat about possible effects of buffer conditions on start codon fidelity to the Discussion (lines 797-809). In terms of the possibility that upstream initiation is more frequent in vivo than we observe in the in vitro Rec-Seq system, we previously studied 5’UTR translation in vivo using ribosome profiling (Kulkarni et al. [2019] BMC Biol., 17:101). The ratio of RPFs in 5’UTRs to coding sequences in this study was 0.0027, very similar to the value measured in the in vitro Rec-Seq system in the presence of Ded1 (0.0016-0.0017). Thus, it does not seem that the frequency of upstream initiation is dramatically higher in vivo than in our in vitro system. We have now made note of this point in the Results (lines 594-598). Guenther et al. employed a ribosome profiling protocol in which they added cycloheximide to their cells prior to lysis, which has been shown to create significant artifacts, particularly in 5’UTR translation (e.g., Gerashchenko and Gladyshev [2014] Nucleic Acids Res., 42:e134). Nevertheless, as suggested by the reviewer, we have modified the text in the Results and Discussion to soften the interpretation somewhat (lines 582-583; 616-618; 761-763).

      Reviewer #2

      Weaknesses:

      Several findings in this report are quite surprising and may require additional work to fully interpret. Primary among these is the finding that Ded1p stimulates accumulation of PICs at internal sites in mRNA coding sequences at an incidence of up to ~50%. The physiological relevance of this is unclear.

      We agree with the reviewer that understanding the physiological significance, if any, of the apparent leaky scanning of main AUG start codons induced by Ded1 is an unanswered question that will require additional studies. It is possible that rapid 60S subunit joining and formation of the 80S initiation complex after start codon recognition on most mRNAs reduces the leaky scanning effect in vivo. We now bring up this possibility in the Discussion section (lines 804-809). However, as noted in lines 568-580, mRNAs that display significantly decreased mRPFs at 500 nM Ded1 in the Rec-Seq system also tend to have TEs that are increased in the ded1-cs mutant relative to WT yeast in in vivo ribosome profiling experiments, suggesting that Ded1 activity also diminishes initiation on mAUG codons in these mRNAs in vivo.

      A limitation of the methodology is that, as an endpoint assay, Rec-Seq does not readily decouple effects of Ded1p on PIC-mRNA loading from those on the subsequent scanning step where the PIC locates the start codon. Considering that Ded1p activity may influence each of these initiation steps through distinct mechanisms - i.e., binding to the mRNA cap-recognition factor eIF4F, or direct mRNA interaction outside eIF4F - additional studies may be needed to gain deeper mechanistic insights.

      We agree that this is a limitation of the Rec-Seq assay and now mention this point in the Discussion section (lines 810-817). It is possible that future work using cross-linking agents to stabilize 43S complexes bound near the cap and scanning the 5’UTR, similar to the methodology used in 40S ribosome profiling, could enable us or others to disentangle these steps from one another.

      As the authors note, the achievable Ded1p concentrations in Rec-Seq may mask potential effects of Ded1p-based granule formation on translation initiation. Additional factors present in the cell could potentially also promote this mechanism. Consequently, the results do not fully rule out granule formation as a potential parallel Ded1p-mediated translation-inhibitory mechanism in cells.

      We agree. As stated in the Discussion section (lines 735-741): “It is possible that at higher concentrations of Ded1 than were achievable in these in vitro experiments or in the presence of additional factors that modify Ded1’s ATPase or RNA binding activities the factor could directly inhibit a subset of mRNAs, by acting as an mRNA clamp that impedes scanning by the PIC, or by sequestering the mRNAs in insoluble condensates. It might be interesting in the future to test candidate factors in Rec-Seq to determine if they switch Ded1 from being a stimulatory helicase to an inhibitory mRNA clamp that removes transcripts from the soluble phase.”

      It is certainly clear why the 15-minute timepoint was chosen for these assays. However, I wondered whether data from an earlier timepoint would provide useful information. The description on line 210 of the compiled PDF suggests data from different timepoints may be available; if it is, in my view it could be a useful addition. More generally, including language about the single-turnover nature of these reactions may be helpful for the benefit of a broad audience.

      In preliminary experiments, we have used the Rec-Seq system to measure the kinetics of 48S PIC formation transcriptome-wide. As you probably can imagine, this is a challenging experiment and requires additional work before we would feel comfortable publishing it. We very much agree with the reviewer that resolving the kinetics of these events will provide important additional information. As suggested, we have added caveats about the endpoint and single-turnover nature of the assay to the Discussion (lines 821-828).

      I wondered whether it might be useful to present additional information on the mRNAs not found in the assay. For example, are these the least abundant mRNAs, which may not have had time to recruit the 43S PIC?

      75% of mRNAs (2719 of 3640) not observed in the Rec-Seq analysis had densities below the median (2.3 reads per nucleotide). We now mention this in the Methods section (lines 855-856).

      The Rec-Seq recruitment reactions were carried out at 22 °C. Considering that remodeling of RNA structure by helicase enzymes is a focal point of the study, linking the results to the recruitment landscape at a closer-to-physiological temperature may bolster the conclusions.

      In the future, it would be interesting to test the effects of temperature on 48S PIC formation using the Rec-Seq system. As the reviewer suggests, the interplay between temperature and mRNA structure could reveal interesting phenomena. It is worth noting, however, that there is no clear “physiological” temperature for S. cerevisiae. For consistency and convenience, lab yeast is usually grown at 30 °C, but in the wild yeast live at a wide range of temperatures, which generally change throughout the day. From this standpoint, 22 °C seems reasonably physiological.

      Results from Rec-Seq experiments conducted at 15 °C might be more directly comparable to in vivo Ribo-seq data with the ded1-cs mutant. However, already ~90% of the Ded1-hyperdependent mRNAs identified by Ribo-seq analysis of that mutant were identified here as Ded1-stimulated mRNAs in Rec-Seq experiments at 22 °C. The Ribo-seq experiments conducted by Guenther et al. were conducted on the ded1-ts mutant at 37 °C; thus, any structures that confer Ded1-dependent leaky-scanning through uORFs detected in that study should have been stable in our Rec-Seq experiments.

      The introduction provides an important, detailed exposition of the state of the field with respect to Ded1p activity. Nevertheless, in my view, it is quite lengthy and could be streamlined for clarity. As just one example, the proposed function of Ded1p in the nucleus seems like a detail that could be dispensed with for the present work.

      We have attempted to shorten the Introduction, as suggested. However, we did not remove the short section describing Ded1’s possible roles in the nucleus and ribosome biogenesis because we felt it was important to emphasize that one of the strengths of the Rec-Seq system is that it allows us to isolate the early steps of translation initiation from later steps and from other cellular processes. In addition, at the suggestion of Reviewer #3, we added a brief explanation of Ded1’s possible role in the subunit joining step of translation.

      Reviewer #3

      Weaknesses:

      The slow nature of the biochemical experiments could bias results.

      We agree that the 15-minute time point used could mask effects that are manifested at a purely kinetic level. It should be noted that we have measured the observed rate constants for 48S formation on a variety of mRNAs in the in vitro reconstituted system in the presence of saturating Ded1 (Gupta et al. [2018] eLife, https://elifesciences.org/articles/38892) and found that they are generally in the range of estimates of rate constants for translation initiation in vivo in yeast (~1-10 min⁻¹; e.g., Siwiak and Zielenkiewicz [2010], PLOS Comput. Biol., 6: e100865). In preliminary experiments, we have used the Rec-Seq system to measure the kinetics of 48S PIC formation transcriptome-wide in the absence of Ded1 and find that the mean rate constant observed (~2 min⁻¹) is also within the range of estimates of the rate of translation initiation in vivo in yeast. We hope to publish this analysis in a future manuscript.

      It has been suggested that Ded1 and its human homolog DDX3X could play a role in subunit joining post-scanning (Wang et al. 2022, Cell and Geissler et al. 2012 Nucleic Acids Res). Could the authors potentially investigate this by adding GTP, eIF5B and 60S subunits into the reaction mixture and isolating 80S complexes?

      This is a very interesting suggestion. One of our plans with the Rec-Seq system is to see if we can also observe 80S formation with it and distinguish 80S from 48S complexes. Although we haven’t yet tried this and there might be technical obstacles to doing it, if it works we would like to examine the potential effects of Ded1, as suggested. We now mention this possibility in the Discussion section (lines 709-716 and 810-817).

      An incubation time of 15 minutes is quite long on the timescale of translation initiation. Presumably, the competition for 40S among mRNAs is partially kinetically controlled so it would be interesting if the authors could do a time series on the incubation time. Does Ded1 increase initiation on more structured UTRs even at shorter incubations or are those only observed with longer incubations?

      We agree. See the response to the question about kinetics above.

      Does GDPNP lead to off-pathway events? What happens when GTP is used in the TC? Presumably in the absence of eIF5B the 48S PIC should remain stalled at the start codon.

      In previous experiments in the reconstituted system, we showed that using GTP instead of GDPNP resulted in 48S complexes that were less stable than those stalled prior to GTP hydrolysis (e.g., Algire et al. [2002] RNA 8:382-397). This is presumably because eIF2•GDP and eIF5 release from the complex and the Met-tRNAi can dissociate in the absence of subunit joining. Although we haven’t tried it in the Rec-Seq system, we suspect that the resulting PICs would fall apart during sucrose gradient sedimentation.

      The authors use assembly of a 48S PIC at the start codon as evidence of scanning but could use more evidence to back this claim up. Does removing the cap structure on the two luciferase mRNA controls disrupt initiation using this approach? That would be direct evidence of 5' end 40S loading and scanning to the start codon.

      In previous work using the reconstituted system, we studied the effect of the 5’-cap on 48S PIC formation (Mitchell et al. [2010] Mol. Cell 39:950-962; Yourik et al. [2017] eLife https://elifesciences.org/articles/31476). We found that stable 48S PIC formation is strongly dependent on the presence of the 5’-cap. In addition, the cap prevents off-pathway events and enforces a requirement for the full set of initiation factors to achieve efficient 48S PIC formation. As the reviewer indicates, the cap-dependence of the system supports the conclusion that 5’ end loading and scanning take place. We have now added this information and the relevant citations to the Introduction (lines 147-153). We thank the reviewer for pointing out this oversight. It should also be noted that the cases of mRNAs in which 5’UTR translation is increased by addition of Ded1 support the conclusion that the factor promotes attachment of the PIC to the 5’ ends of mRNAs and subsequent 5’ to 3’ scanning, as noted in lines 608-618.

      The authors state that "The correlation between CDS length and RE could be indirect because CDS length also correlates with 5'UTR length". Could the authors bin the transcripts into different 5' UTR length ranges and then probe for CDS length differences on RE for each 5' UTR length bin? This could be useful to truly parse the mechanism by which CDS length is influencing RE.

      This was an excellent suggestion. We now include this analysis in a new supplementary figure, Figure 3-S2. Corresponding text was added in lines 380-387:

      “Importantly, correlations between Ded1 stimulation and 5’ UTR lengths are evident for all three groups of mRNAs containing distinct ranges of CDS lengths (Fig. 3-S2A-C). In contrast, a marked correlation between Ded1 stimulation and CDS length was detected only for the group of mRNAs with longest 5’UTRs (Fig. 3-S2D-F), and only the latter group showed a clear correlation between 5’UTR length and CDS length (Fig. 3-S2G-I). Thus, the correlation between Ded1 stimulation and CDS length appears to be indirect, driven by the tendency for the mRNAs with the longest 5’UTRs to also have correspondingly longer CDSs.”

      We thank the reviewer for this very useful idea.
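      The stratified analysis described in this exchange can be sketched as follows. This is a minimal illustration under stated assumptions: the record fields (`cds_len`, `utr_len`, `stimulation`), the helper names, and the toy values are all hypothetical, and Pearson correlation is used here only for simplicity; the manuscript's actual statistic and grouping thresholds are not given in this response.

```python
from math import sqrt

# Hypothetical sketch: field names, helpers, and data are invented for illustration.

def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def stratified_correlation(records, stratify_key, x_key, y_key, n_groups=3):
    """Sort records by one variable, split into groups, correlate x and y per group."""
    ordered = sorted(records, key=lambda r: r[stratify_key])
    size = len(ordered) // n_groups
    results = []
    for g in range(n_groups):
        start = g * size
        # The last group absorbs any remainder.
        end = (g + 1) * size if g < n_groups - 1 else len(ordered)
        group = ordered[start:end]
        results.append(pearson([r[x_key] for r in group],
                               [r[y_key] for r in group]))
    return results


# Toy data in which stimulation depends only on 5'UTR length.
cds = [300, 400, 500, 1000, 1100, 1200, 2000, 2100, 2200]
utr = [20, 80, 50, 40, 120, 90, 60, 200, 150]
records = [{"cds_len": c, "utr_len": u, "stimulation": u / 100}
           for c, u in zip(cds, utr)]

# Correlation of stimulation with 5'UTR length within each CDS-length group.
by_cds_group = stratified_correlation(records, "cds_len", "utr_len", "stimulation")
```

      If a correlation with 5’UTR length persists inside every CDS-length group, as in this toy case, CDS length alone cannot account for it; swapping the roles of the two keys gives the complementary test.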

      In Figure 3I, why does RE dip for the middle bins of CDS length in both 100 nM and 500 nM conditions, and then rise back up for the later bins? In other words, why do the shortest and longest CDS have the best RE in the presence of ded1?

      We do not know the reason for this dip and now say this in the Results on lines 377-378.

      The discussion section would be well served to discuss proposed roles of Ded1 post-scanning and how those fit, if at all, with the data presented throughout the manuscript.

      We have now added this to the Discussion (lines 709-716 and 810-817). We thank the reviewer for pointing out this oversight.

      Minor comments:

      • Define bins on figures rather than using bin number for axis labels. For example, Figure 3A-D x-axis labels indicate the length range of each bin.

      Thank you for the suggestion. We have made this change.

      • Figure 3I: the data seem to indicate that shortest CDSs have a ded1 dependency similar to the longest CDSs. This result seems inconsistent with the given relationship between UTR length, structure, CDS length. Please clarify.

      See answer to this question above.

      • Replace qualitative statements, such as "substantially smaller reductions" with percent change, numbers, etc.

      We have tried to replace qualitative statements with quantitative ones, where possible.

    2. eLife assessment

      This is an important paper as it is the first to use a reconstituted translation system to study competition among mRNAs for the initiation machinery. Understanding the principles of the biochemistry of mRNA competition for initiation factors cannot be achieved without such a system. The authors provide compelling evidence that Ded1 is required for efficient initiation of highly structured mRNAs. The findings are significant and validate the in vitro reconstituted system by recapitulating the effects of in-vivo perturbations of translation initiation by Ded1 mutants.

    3. Reviewer #1 (Public Review):

      The authors have developed and optimized a footprinting assay to monitor the recruitment of mRNAs to a reconstituted translation initiation system. This assay is named Recruitment-Sequencing (Rec-Seq) and enables the analysis of many purified mRNAs in the reconstituted system.

      This system possesses the ability to determine how competition occurs between mRNAs for the initiation machinery. This is the first approach using a reconstituted system that enables this important feature, and this is an important advance for the field.

      Using purified mRNAs in a fully reconstituted system together with the ability to monitor start site selection is an important advance. The method enables one to observe for the first time how competition between mRNAs is altered in response to the absence or presence of different initiation components or accessory proteins.

      Start site fidelity in purified reconstituted systems can be altered in different buffer conditions and by the concentration of various initiation factors involved in start site fidelity. Future experiments will reveal how these variables can regulate start site selection in this powerful system.

      Comments on revised version:

      The authors have addressed all of my original comments. This is an impressive manuscript.

    4. Reviewer #2 (Public Review):

      Summary:

      Zhou et al report development of a new method, Rec-Seq, that allows rigorous quantitation of the efficiency of 48S ribosomal pre-initiation complex (PIC) formation on messenger RNAs at transcriptome scale in vitro. With a next-generation deep-sequencing approach, Rec-Seq allows precisely targeted dissection of the roles of translation initiation factors in PIC assembly. This level of molecular precision is important to understanding mechanisms of translational control, making Rec-Seq a significant methodological advance. The authors leverage Rec-Seq to investigate the relative roles of two key helicase enzymes, Ded1p and eIF4A. While past work has pointed to differing roles for Ded1p and eIF4A helicase activity in PIC assembly, unambiguous interpretation of prior in-vivo data has been hindered by technical requirements for performing the experiments in cells. Rec-Seq circumvents these challenges, providing robust mechanistic insights. The authors find that Ded1p stimulates PIC formation selectively on mRNAs with long, structured leaders in the Rec-Seq system, while eIF4A provides much more general stimulation across mRNAs. The findings substantiate the past in-vivo results, along with adding new insights. They contrast with evidence that Ded1p promotes translation by suppressing inhibitory upstream initiation through structural remodeling, or through formation of intracellular, phase-separated granules. The conclusions of the study are well-supported by the data, and are likely to be of broad interest.

      Strengths:

      The quantitative nature of Rec-Seq, which uses an internal standard to measure absolute recruitment efficiencies, is an important strength.

      The methodology decisively overcomes past experimental limitations, allowing the authors to make clear conclusions with regard to the relative roles of Ded1p and eIF4A in PIC formation. An important and useful addition to the toolbox for studying translation and translational control mechanisms, Rec-Seq substantially expands the throughput and scope of mechanistic analyses for translation initiation.

      One significant finding to emerge is that the in-vitro reconstituted system used here recapitulates effects of in-vivo perturbations of translation initiation. Despite the lack of a cellular environment and its components, PIC formation appears to operate much as it does in the cell. Importantly, this highlights an inherent "modularity" to the system that is especially of interest in the context of how regulatory machinery beyond the PIC may control translation.

      Weaknesses:

      The study finds that Ded1p stimulates accumulation of PICs at internal AUG codons, i.e., within mRNA coding sequences, at an incidence of up to ~50% - thus, bypassing "canonical" translation start sites. Understanding the physiological significance of this activity will require further study. The authors address this in the text.

      A limitation of the methodology is that, as an endpoint assay, Rec-Seq does not readily decouple effects of Ded1p on PIC-mRNA loading from those on the subsequent scanning step where the PIC locates the start codon. Considering that Ded1p activity may influence each of these initiation steps through distinct mechanisms - i.e., binding to the mRNA cap-recognition factor eIF4F, or direct mRNA interaction outside eIF4F - additional studies will be needed to gain deeper mechanistic insights. The authors discuss this in the text.

      Comments on revised version:

      In revising their manuscript, the authors have responded very thoughtfully and insightfully to the initial review. The final manuscript is an important contribution to the field, and I am sure it will be of broad interest.

    1. eLife assessment

      This paper investigates how the EWS::FLI1 fusion protein organizes chromatin topology and regulates gene expression in an aggressive pediatric bone cancer known as Ewing sarcoma. The authors used the most recent genomics methodologies to provide a solid base of evidence for the role of a short alpha helix in the DNA binding domain of FLI1 in modulating binding to GGAA microsatellites and promoting enhancer activity. The study provides valuable insight into the underlying oncogenic mechanisms in Ewing sarcoma, but is limited to a single cell line and would benefit from consolidation of the main conclusions using additional techniques.

    1. eLife assessment

      This important manuscript details the characterization of ClpL from L. monocytogenes as an effective and autonomous AAA+ disaggregase that provides enhanced heat resistance to this food-borne pathogen. Supported by compelling evidence, the authors demonstrate that ClpL has DnaK-independent disaggregase activity towards a variety of aggregated model substrates, which is more potent than that observed with the endogenous canonical DnaK/ClpB bi-chaperone system. The work will be of broad interest to microbiologists and biochemists.

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      This fundamental study advances our understanding of the cell specific treatment of cone photoreceptor degeneration by Txnip. The evidence supporting the conclusions is convincing with rigorous genetic manipulation of Txnip mutations, however, there are a few areas in which the article may be improved through further analysis and application of the data. The work will be of broad interest to vision researchers, cell biologists and biochemists.

      Reviewer #1 (Public Review):

      Summary:

      This is a follow-up study to the authors' previous eLife report about the roles of an alpha-arrestin called protein thioredoxin interacting protein (Txnip) in cone photoreceptors and in the retinal pigment epithelium. The findings are important because they provide new information about the mechanism of glucose and lactate transport to cone photoreceptors and because they may become the basis for therapies for retinal degenerative diseases.

      Strengths:

      Overall, the study is carefully done and, although the analysis is fairly comprehensive with many different versions of the protein analyzed, it is clearly enough described to follow. Figure 4 greatly facilitated my ability to follow, understand and interpret the study.

      Weaknesses:

      I have just one concern that I would like the authors to address. It is about the text that begins at line 133: "We assayed their ability to clear GLUT1 from the RPE surface (Figure 2A)". Please provide more details about this. From the figure it appears that n = 1 for this experiment, but given how careful the authors are with these types of studies that seems unlikely. How did the authors quantify the ability to clear GLUT1 from the surface? Was it cleared from both the apical and basal surface? (It is hard to resolve the apical and basal surfaces in the images provided). The experiments shown in Fig. 1H and Fig. 1I of PMID 31365873 shows how GLUT1 disappears only from the apical surface (under the conditions of that experiment and through the mechanism described in their text). It would be helpful for the authors to discuss their current results in the context of that experiment.

      We repeated all eight AAV-Best1-Txnip alleles for RPE GLUT1 staining with more than three eyes of each condition. We also quantified the GLUT1 intensity on the RPE basal surface. A new Figure 2-figure supplement 1 with these data has been added to this submission. The results and conclusions are similar to those in our initial submission.

      As mentioned in our provisional responses: GLUT1 on the basal surface of the RPE is more easily scored than that on the apical surface. The photoreceptor inner segments and Müller glia microvilli also have GLUT1, and their processes are juxtaposed and/or intertwined with the apical processes of the RPE, making the apical process GLUT1 staining of the RPE much more difficult to score. In some sections where the RPE and the retina separate, we can score the apical process GLUT1 staining of the RPE, but we do not always have this situation in our sections. The current quantification in the new Figure 2-figure supplement 1 thus concerns only the basal staining.

      As a separate issue, Reviewer #1 mentioned the work of another group (Wang et al., 2019, PMID: 31365873), which claimed that, on the apical surface of the RPE, GLUT1 is down-regulated in a RP mouse strain, RhoP23H. We have not consistently observed such a down-regulation of GLUT1 in other RP mouse strains such as rd1, rd10 or Rho-/- (unpublished data; see review Xue and Cepko, 2023, PMID: 37460158). However, as we pointed out above, it is difficult to score GLUT1 staining on the RPE apical surface. It is even more difficult in the degenerating retina where RPE and photoreceptor processes degenerate. For reference, one can see images of degenerating RPE apical processes in Wu et al. 2021 (PMID: 33491671).

      Reviewer #2 (Public Review):

      The hard work of the authors is much appreciated. With overexpression of α-arrestin Txnip in RPE, in cones, and in both combined, the authors show a potential gene-agnostic treatment that can be applied to retinitis pigmentosa. Furthermore, since Txnip is related to multiple intracellular signaling pathways, this study is of value for research in the mechanism of secondary cone dystrophy as well.

      There are a few areas in which the article may be improved through further analysis and application of the data, as well as some adjustments that should be made to clarify specific points in the article.

      Reviewer #3 (Public Review):

      Summary:

      Xue et al. extended their groundbreaking discovery demonstrating the protective effect of Txnip on cone photoreceptor survival. This was achieved by investigating the protection of cone degeneration through the overexpression of five distinct mutated variants of Txnip within the retinal pigment epithelium (RPE). Moreover, the study explored the roles of two proteins, HSP90AB1 and Arrdc4, which share similarities or associations with Txnip. They found that the protection conferred by Txnip in RPE cells operates through a mechanism different from that of its protection in cone cells. These discoveries have significant implications for advancing our understanding of the mechanisms underlying Txnip's protection on cone cells.

      Strengths:

      (1) Identify the roles of different Txnip mutations in RPE and their effects on the expression of glucose transporter

      (2) Dissect the mechanism of Txnip in RPE vs Cone photoreceptors in retinal degeneration models.

      (3) Explore the functions of Arrdc4, a protein similar to Txnip, and of HSP90AB1 in cone degeneration.

      Weaknesses:

      (1) Arrdc4 has a deleterious effect on cone survival, but its mechanism is not discussed.

      (2) Inhibition of HSP90 is known to cause retinal degeneration. It is unclear why inhibition enhances the protection of Txnip.

      As mentioned in our provisional responses, little was known about the function of Arrdc4 or HSP90AB1 in cones. We summarize some of the recent discoveries regarding these two proteins in the new Discussion:

      “Arrdc4, the most similar α-arrestin protein to Txnip that also has Arrestin N- and C- domains, accelerated RP cone death when transduced via AAV (Figure 1). This observation suggests that Txnip has unique functions that protect RP cones. Recently, Arrdc4 has been proposed to be critical for liver glucagon signaling, which could be negated by insulin (Dagdeviren et al. 2023). The implication of this potential role in RP cone survival is unclear, but interestingly, the activation of the insulin/mTORC1 pathway is beneficial to RP cone survival (Punzo et al. 2009; Venkatesh et al. 2015).”

      “Little is known about the function of HSP90AB1. Knocking down Hsp90ab1 improved mitochondrial metabolism of skeletal muscle in a diabetic mouse model (Jing et al. 2018). Knocking out HSP90AA1, a paralog of HSP90AB1 which has 14% different amino acids, led to rod death and correlated with PDE6 dysregulation (Munezero et al. 2023). Inhibiting HSP90AA1 with small molecules transiently delayed cone death in human retinal organoids under low glucose conditions (Spirig et al. 2023). However, the exact role of HSP90AA1 in photoreceptors needs to be clarified, and the implications for HSP90AB1 in RP cones are still unclear.”

      In addition, we used AlphaFold Multimer, an AI algorithm based on AlphaFold-2, to explore the possible interaction between TXNIP, PARP1 and HSP90AB1 in the revision. One of the predicted models is shown as the new Figure 5-figure supplement 2. The C-terminus of Txnip is predicted to link HSP90AB1 and PARP1 together in this model.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      I have just one concern that I would like the authors to address. It is about the text that begins at line 133: "We assayed their ability to clear GLUT1 from the RPE surface (Figure 2A)". Please provide more details about this. From the figure it appears that n = 1 for this experiment, but given how careful the authors are with these types of studies that seems unlikely. How did the authors quantify the ability to clear GLUT1 from the surface? Was it cleared from both the apical and basal surface? (It is hard to resolve the apical and basal surfaces in the images provided). The experiments shown in Fig. 1H and Fig. 1I of PMID 31365873 show how GLUT1 disappears only from the apical surface (under the conditions of that experiment and through the mechanism described in their text). It would be helpful for the authors to discuss their current results in the context of that experiment.

      See our responses to Reviewer #1’s public review section above.

      Also, is the clearance from the RPE plasma membrane homogenous throughout the RPE monolayer?

      In the area of AAV infection, the effects are very homogenous. In the uninfected area, the clearance does not occur, and we consider the uninfected area of the same eye to be an excellent internal control.

      A statistical analysis (as was provided for other experiments in the manuscript) would help to make the surprising conclusion about C.Txnip.C247S more convincing.

      In this revision, we used the Mann-Whitney U test with the Bonferroni correction for GLUT1 intensity quantification. For the cone survival statistics, we used the t-test or ANOVA with Dunnett multiple comparison test. The information has been added to each figure legend.
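
      As an illustration of the GLUT1 intensity statistics described above, the following is a minimal sketch of a Mann-Whitney U test followed by a Bonferroni correction. It is not the authors' analysis code: the function names, the normal approximation to the U distribution (without tie or continuity correction), and the intensity values are all assumptions made for illustration. With only a few eyes per condition, an exact test from a statistics package would be the more appropriate choice.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (illustrative sketch).

    Uses the normal approximation to the U distribution, with no tie or
    continuity correction, so p-values are rough for small samples.
    """
    n1, n2 = len(x), len(y)
    # U1 counts the (x, y) pairs in which the x value is larger; ties count 0.5.
    u1 = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    u2 = n1 * n2 - u1
    u = min(u1, u2)
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided tail probability
    return u, min(p, 1.0)


def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


if __name__ == "__main__":
    # Hypothetical normalized basal GLUT1 intensities, NOT data from the study.
    control = [1.00, 0.95, 1.10, 1.05, 0.98]
    treated = {
        "allele_A": [0.40, 0.35, 0.50, 0.45, 0.38],
        "allele_B": [0.97, 1.02, 0.99, 1.08, 0.93],
    }
    raw = [mann_whitney_u(control, values)[1] for values in treated.values()]
    for name, p_adj in zip(treated, bonferroni(raw)):
        print(f"{name}: Bonferroni-adjusted p = {p_adj:.3f}")
```

      Each treated condition is compared with the control separately, and the raw p-values are then multiplied by the number of comparisons, which is the simplest way to keep the family-wise error rate at the nominal level.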

      Another improvement I suggest for this figure is to include normal full length Txnip as a positive control to show how completely it removes GLUT1 from the surface.

      Added. See the new Figure 2-figure supplement 1.

      Another point that should be discussed is whether, when Txnip prevents GLUT1 from reaching the surface, all the GLUT1 gets fully degraded within the cell. A brief description of how Txnip influences GLUT1 stability and localization would be helpful.

      We are unable to track the fate of the GLUT1 after it is removed, i.e. we do not see definitive intracellular staining. We do not know if this is due to degradation or a hidden epitope.

      Minor point

      (1) Confusing citation on lines 99-100: "We previously showed that overexpressing the Txnip wt allele in the RPE using an RPE specific promoter, derived from the Best1 gene (Esumi et al. 2009),.." makes it sound like Esumi et al. is the citation for their previous study, which is not correct.

      We have amended this to: "We previously showed (Xue et al. 2021) that overexpressing the Txnip wt allele in the RPE using an RPE-specific promoter, derived from the Best1 gene (Esumi et al., 2009), did not improve RP cone survival."

      Reviewer #2 (Recommendations For The Authors):

      Regarding the manuscript, here are some suggestions that authors can take into consideration for the completeness of the study:

      (1) The text references the relationship between α-arrestin and glucose metabolism in cone cells, but fails to provide an explanation for its specific involvement in glucose metabolism. Consequently, readers may struggle to discern the targeted metabolic pathway.

      We understand this point from the Reviewer, and would love to know more about its mechanism, which is one reason why we undertook the current study. The mechanism(s) by which Txnip affects metabolism remains to be elucidated. To summarize our findings from our previous study, we showed that LDHB, which converts lactate to pyruvate, was required for Txnip-mediated rescue. Addition of the LDHB gene, however, did not boost rescue. We also showed that mitochondrial size and membrane potential were improved, and the Na/K pump function was improved, in Txnip-treated cones. Improved mitochondria were not sufficient, however, as revealed by a PARP-1 KO mouse with improved mitochondria that did not extend cone survival. In addition, using a Txnip mutant that does not remove the glucose transporter, we still saw cone rescue, so this function cannot be required for Txnip-mediated rescue. How does Txnip lead to improved mitochondria and to a reliance on lactate? We do not know.

      (2) Although the authors conducted an experiment on Arrdc4 due to its similarity to Txnip, why Arrdc4, with a 60% amino acid similarity, did not yield the same effects as Txnip remains unaddressed. Highlighting structural disparities or differences in intracellular signaling pathways could potentially shed light on this incongruity. Subsequently, an additional experiment may be warranted to test the hypothesis regarding the effective component of α-arrestin for cone rescue.

      Additional experiments are needed to learn of the relevant differences between Arrdc4 and Txnip, but are beyond the scope of our work at present. However, we have added a paragraph on newly published data on the function of Arrdc4 in the new Discussion:

      “Arrdc4, the most similar α-arrestin protein to Txnip that also has Arrestin N- and C- domains, accelerated RP cone death when transduced by AAV (Figure 1). This observation suggests that Txnip has unique functions that protect RP cones. Recently, Arrdc4 has been proposed to be critical for liver glucagon signaling, which could be negated by insulin (Dagdeviren et al. 2023). The implication of this potential role regarding RP cone survival is unclear, but interestingly, the activation of the insulin/mTORC1 pathway is beneficial to RP cone survival (Punzo et al. 2009; Venkatesh et al. 2015).”

      (3) The utilization of distinct mutant Txnip variants to impact RPE, cones, and their combined influence is noted. A comparative table elucidating the impact of cone rescue on these three targets would greatly enhance clarity.

      We presented these data in Figure 4 in a table format.

      Additionally, the text does not definitively establish whether Txnip.C247S.LL351 and 352AA, as well as Txnip.C247S, indeed manifest discrepancies when exclusively affecting RPE.

      We edited a sentence in Results to: “Similar to Best1-wt Txnip (Xue et al., 2021), Best1-Txnip.C247S did not show significant improvement of cone survival, ruling out the C247S mutation alone as promoting the cone survival by Best1-Txnip.C247S.LL351 and 352AA.”

      (4) While the text mentions that Txnip stimulates lactate utilization within cones, it remains unclear whether this effect extends to RPE. If applicable, this trait could potentially contribute to its role in cone rescue.

      We agree with the Reviewer, and hope to address this question in our next study.

      (5) The discussion introduces the notion that one potential mechanism for cone rescue by Txnip.C247S involves facilitating unhindered movement of Thioredoxin for redox processes. To validate this hypothesis and elucidate the mechanics of Txnip's involvement in cone rescue, it may be prudent to conduct further experiments concentrating on the interaction between Txnip and thioredoxin. Alternatively, an experiment aimed at upregulating Thioredoxin expression would be a valuable addition.

      We hope to address this question in the future. However, the effect may be more complicated than our simple hypothesis regarding release of Thioredoxin. More than a dozen proteins were found to differentially interact with Txnip vs. Txnip.C247S (Forred et al. 2016).

      Reviewer #3 (Recommendations For The Authors):

      (1) Glucose transporter 1 is identified as an important mechanism in the protection of cone degeneration. It is unclear why GLUT1 is upregulated in retinal cells although the expression of Txnip mutants is specifically in the RPE in Figure 2.

      This retinal GLUT1 upregulation was not consistently observed in the treated eyes, so we did not comment on it in the text.

      (2) Mutant N. Txnip was mentioned in the discussion that it causes obvious retinal degeneration. The quantification of retinal thickness from Figure 2 will be more rigorous.

      Unlike the robust effects of Best1-N.Txnip on RPE GLUT1 level, this negative effect of Best1-N.Txnip on ONL thickness was not consistent. This result does not undermine the other major conclusions. Therefore, we deleted the related sentence of the original text: “This hypothesis is supported by the observation that N.Txnip led to an obvious thinning of the outer nuclear layer of the wt retina, reflecting a loss of photoreceptors”. We did leave in the related finding as follows:

      “The N-terminal half of Txnip (1-228aa) might exert harmful effects in the RPE, that negate the beneficial effects from the C-terminal half, suggested by the observation that its removal, in the C-terminal 149-397 allele, led to better cone survival when expressed in the RPE (Figure 2). In cones, the C-terminal half, including the C-terminal IDR tail, may cooperate with the N-terminal half, or negate its negative effects, to benefit RP cone survival. However, the C-terminal half is not sufficient for cone rescue when expressed in cones, as the 149-397 allele did not rescue.”

    2. eLife assessment

      This fundamental study advances our understanding of the cell specific treatment of cone photoreceptor degeneration by Txnip. The evidence supporting the conclusions is compelling with rigorous genetic manipulation of Txnip mutations. The work will be of broad interest to vision researchers, cell biologists and biochemists.

    3. Reviewer #1 (Public Review):

      Summary:

      This is a follow-up study to the authors' previous eLife report about the roles of an alpha-arrestin called protein thioredoxin interacting protein (Txnip) in cone photoreceptors and in the retinal pigment epithelium. The findings are important because they provide new information about the mechanism of glucose and lactate transport to cone photoreceptors and because they may become the basis for therapies for retinal degenerative diseases.

      Strengths:

      Overall, the study is carefully done and, although the analysis is fairly comprehensive with many different versions of the protein analyzed, it is clearly enough described to follow. Figure 4 greatly facilitated my ability to follow, understand and interpret the study. The authors have appropriately addressed a few concerns about statistical significance and the relationship between their findings and previous studies of the possible roles of Txnip on GLUT1 expression and localization on the surfaces of RPE cells.

    4. Reviewer #2 (Public Review):

      The hard work of the authors is much appreciated. With overexpression of α-arrestin Txnip in RPE, in cones, and in both combined, the authors show a potential gene-agnostic treatment that can be applied to retinitis pigmentosa. Furthermore, since Txnip is related to multiple intracellular signaling pathways, this study is of value for research in the mechanism of secondary cone dystrophy as well.

      There are a few areas in which the article may be improved through further analysis and application of the data, as well as some adjustments that should be made to clarify specific points in the article.

      Strengths

      - The follow-up study builds on innovative ground by exploring the impact of TxnipC247S and its combination with HSP90AB1 knockdown on cone survival, offering novel therapeutic pathways.
      - Testing of different Txnip deletion mutants provides a nuanced understanding of its functional domains, contributing valuable insights into the mechanism of action in RP treatment.
      - The findings regarding GLUT1 clearance and the differential effects of Txnip mutants on cone and RPE cells lay the groundwork for targeted gene therapy in RP.

      Weaknesses

      - The focus on specific mutants and overexpression systems might overlook broader implications of Txnip interactions and its variants in the wider context of retinal degeneration.
      - The study's reliance on cell count and GLUT1 expression as primary outcomes misses an opportunity to include functional assessments of vision or retinal health, which would strengthen the clinical relevance.
      - The paper could benefit from a deeper exploration of why certain treatments (like Best1-146 Txnip.C247S) do not lead to cone rescue and the potential for these approaches to exacerbate disease phenotypes through glucose shortages.
      - Minor inconsistencies, such as the missing space in text references and the need for clarification on data representation (retinas vs. mice), should be addressed for clarity and accuracy.
      - The observation of promoter leakage and potential vector tropism issues raise questions about the specificity and efficiency of the gene delivery system, necessitating further discussion and validation.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this manuscript, Sang et al. proposed that a pair of IR60b-expressing pharyngeal neurons in Drosophila use IR25a, IR76b, and IR60b channels to detect high Na+ and limit its consumption. Some of the key findings that support this thesis are: 1) animals that lacked any one of these channels - or with their IR60b-expressing neurons selectively silenced - showed much reduced rejection of high Na+, but restored rejection when these channels were reintroduced back in the IR60b neurons; 2) animals with TRPV artificially expressed in their IR60b neurons rejected capsaicin-laced food whereas WT did not; 3) IR60b-expressing neurons exhibited increased Ca2+ influx in response to high Na+ and such response went away when animals lacked any of the three channels.

      Strengths:

      The experiments were thorough and well designed. The results are compelling and support the main claim. The development and use of the DrosoX two-choice assay allow a more quantitative and automatic/unbiased assessment of ingestion volume and preference.

      Weaknesses:

      There are a few inconsistencies with respect to the exact role by which IR60b neurons limit high salt consumption and the contribution of external (labellar) high-salt sensors in regulating high salt consumption. These weaknesses do not significantly impact the main conclusion, however.

      Reviewer #2 (Public Review):

      Summary:

      In this paper, Sang et al. set out to identify gustatory receptors involved in salt taste sensation in Drosophila melanogaster. In a two-choice assay screen of 30 Ir mutants, they identified that Ir60b is required for avoidance of high salt. In addition, they demonstrate that activation of Ir60b neurons is sufficient for gustatory avoidance using either optogenetics or TRPV1 to specifically activate Ir60b neurons. Then, using tip recordings of labellar gustatory sensory neurons and proboscis extension response behavioral assays in Ir60b mutants, the authors demonstrate that Ir60b is dispensable for labellar taste neuron responses to high salt and the suppression of proboscis extension by high salt. Since external gustatory receptor neurons (GRNs) are not implicated, they look at Poxn mutants, which lack external chemosensory sensilla but have intact pharyngeal GRNs. High salt avoidance was reduced in Poxn mutants but was still greater than Ir60b mutants, suggesting that pharyngeal gustatory sensory neurons alone are sufficient for high salt avoidance. The authors use a new behavioral assay to demonstrate that Ir60b mutants ingest a higher volume of sucrose mixed with high salt than control flies do, suggesting that the action of Ir60b is to limit high salt ingestion. Finally, they identify that Ir60b functions within a single pair of gustatory sensory neurons in the pharynx, and that these neurons respond to high salt but not bitter tastants.

      Strengths:

      A great strength of this paper is that it rigorously corroborates previously published studies that have implicated specific Irs in salt taste sensation. It further introduces a new role for Ir60b in limiting high salt ingestion, demonstrating that Ir60b is necessary and sufficient for high salt avoidance and convincingly tracing the action of Ir60b to a particular subset of gustatory receptor neurons. Overall, the authors have achieved their aim by identifying a new gustatory receptor involved in limiting high salt ingestion. They use rigorous genetic, imaging, and behavioral studies to achieve this aim, often confirming a given conclusion with multiple experimental approaches. They have further done a great service to the field by replicating published studies and corroborating the roles of a number of other Irs in salt taste sensation. An aspect of this study that merits further investigation is how the same gustatory receptor neurons and Ir in the pharynx can be responsible for regulating the ingestion of both appetitive (sugar) and aversive tastants (high salt).

      A previous report published in eLife from John Carlson’s lab (Joseph et al, 2017) showed that the Ir60b GRN in the pharynx responds to sucrose resulting in sucrose repulsion. Thus, stimulation of this pharyngeal GRN results in gustatory avoidance only, not both attraction and avoidance. (lines 205-207)

      Weaknesses:

      There are several weaknesses that, if addressed, could greatly improve this work.

      (1) The authors combine the results and discussion but provide a very limited interpretation of their results. More discussion of the results would help to highlight what this paper contributes, how the authors interpret their results, and areas for future study.

      We agree and have now separated the Results and Discussion, and in so doing have greatly expanded discussion of the results.

      (2) The authors rename previously studied populations of labellar GRNs to arbitrary letters, which makes it difficult to understand the experiments and results in some places. These GRN populations would be better referred to according to the gustatory receptors they are known to express.

      One of the corresponding authors (Craig Montell) introduced this alternative GRN nomenclature in a 2021 review (Montell, C. Drosophila sensory receptors—a set of molecular Swiss Army Knives. Genetics 217, 1-34) (Montell, 2021). We are not fans of referring to different classes of GRNs based on the receptors that they express since it is not obvious which receptors to use. For example, the GRNs that respond to bitter compounds all express multiple GR co-receptors. The same is true for the GRNs that respond to sugars. The former system of referring to GRNs simply as sugar, bitter, salt and water GRNs is also not ideal since the repertoire of chemicals that stimulates each class is complex. For example, the Class A GRNs (formerly sugar GRNs) are also activated by low Na+, glycerol, fatty acids, and acetic acid, while the B GRNs (former bitter GRNs) are also stimulated by high Na+, acids, polyamines, and tryptophan. In addition, there are five classes of GRNs. At first mention of the Class A–E GRNs, we mention the most commonly used former nomenclature of sugar, bitter, salt and water GRNs. In addition, for added clarity, we now also include a mention of one of the receptors that mark each class. (lines 51-59)

      (3) The conclusion that GRNs responsible for high salt aversion may be inhibited by those that function in low salt attraction is not well substantiated. This conclusion seems to come from the fact that overexpression of Ir60b in salt attraction and salt aversion sensory neurons still leads to salt aversion, but there need not be any interaction between these two types of sensory neurons if they act oppositely on downstream circuits.

      We did not make this claim.

      (4) The authors rely heavily on a new Droso-X behavioral apparatus that is not sufficiently described here or in the previous paper the authors cite. This greatly limits the reader's ability to interpret the results.

      We expanded the description of the apparatus in the Droso-X assay section of the Materials and Methods. (lines 588-631)

      Reviewer #3 (Public Review):

      Summary:

      Sang et al. successfully demonstrate that a set of single sensory neurons in the pharynx of Drosophila promotes avoidance of food with high salt concentrations, complementing previous findings on Ir7c neurons with an additional internal sensing mechanism. The experiments are well-conducted and presented, convincingly supporting their important findings and extending the understanding of internal sensing mechanisms. However, a few suggestions could enhance the clarity of the work.

      Strengths:

      The authors convincingly demonstrate the avoidance phenotype using different behavioral assays, thus comprehensively analyzing different aspects of the behavior. The experiments are straightforward and well-contextualized within existing literature.

      Weaknesses:

      Discussion

      While the authors effectively relate their findings to existing literature, expanding the discussion on the surprising role of Ir60b neurons in both sucrose and salt rejection would add depth. Additionally, considering Yang et al. 2021's (https://doi.org/10.1016/j.celrep.2021.109983) result that Ir60b neurons activate feeding-promoting IN1 neurons, the authors should discuss how this aligns with their own findings.

      Yang et al. demonstrated that the activation of Ir60b neurons can trigger the activation of IN1 neurons akin to pharyngeal multimodal (PM) neurons, potentially leading to enhanced feeding (Yang et al, 2021). However, our research reveals a specific pattern of activation for Ir60b neurons. Instead of being generalists, they are specialized for certain sugars, such as sucrose, and for high salt. Consequently, while Ir60b GRNs activate IN1 neurons, we contend that there are other neurons in the brain responsible for inhibiting feeding. (lines 412-417)

      Lines 187: The discussion primarily focuses on taste sensillae outside the labellum, neglecting peg-type sensillae on the inner surface. Clarification on whether these pegs contribute to the described behaviors and if the Poxn mutants described also affect the pegs would strengthen the discussion.

      We added the following to the Discussion section. “We also found that the requirement for Ir60b appears to be different when performing binary liquid capillary assay (DrosoX), versus solid food binary feeding assays. When we employed the DrosoX assay to test mutants that were missing salt aversive GRNs in labellar bristles but still retained functional Ir60b GRNs, the flies behaved the same as wild-type flies (e.g. Figure 3J and 3L). However, using solid food binary assays, Poxn mutants, which are missing labellar taste bristles but retain Ir60b GRNs (LeDue et al, 2015), displayed repulsion to high salt food that was intermediate between control flies and the Ir60b mutant (Figure 2J). Poxn mutants retain taste pegs (LeDue et al., 2015), and these hairless taste organs become exposed to food only when the labial palps open. We suggest that there are high-salt sensitive GRNs associated with taste pegs, which are accessed when the labellum contacts a solid substrate, but not when flies drink from the capillaries used in DrosoX assays. This explanation would also account for the findings that the Ir60b mutant is indifferent to 300 mM NaCl in the DrosoX assay (Figure 3B), but prefers 1 mM sucrose alone over 300 mM NaCl and 5 mM sucrose in the solid food binary assay (Figure 1B).” (lines 430-444)

      In line 261 the authors state: "We attempted to induce salt activation in the I-type sensilla by ectopically expressing Ir60b, similar to what was observed with Ir56b 8; however, this did not generate a salt receptor (Figures S6A)"

      An obvious explanation would be that these neurons are missing the identified necessary co-receptors Ir76b and Ir25a. The authors should discuss here if the Gr33a neurons they target also express these co-receptors, if yes this would strengthen their conclusion that an additional receptor might be missing.

      We clarified this point in the Discussion section as follows: “An open question is the subunit composition of the pharyngeal high Na+ receptor, and whether the sucrose/glucose and Na+ receptors in the Ir60b GRN are the same or distinct. Our results indicate that the high salt sensor in the Ir60b GRN includes IR25a, IR60b and IR76b since all three IRs are required in the pharynx for sensing high levels of NaCl. I-type sensilla do not elicit a high salt response, and we were unable to induce salt activation in I-type sensilla by ectopically expressing Ir60b, under control of the Gr33a-GAL4. This indicates that IR25a, IR60b and IR76b are insufficient for sensing high Na+. The inability to confer a salt response by ectopic expression of Ir60b was not due to absence of Ir25a and Ir76b in Gr33a GRNs since Gr33a and Gr66a are co-expressed (Moon et al, 2009), and Gr66a GRNs express Ir25a and Ir76b (Li et al, 2023). Thus, the high salt receptor in Ir60b GRNs appears to require an additional subunit. Given that Na+ and sugars are structurally unrelated, we suggest that the Na+ and sucrose/glucose receptors do not include the identical set of subunits, or that they activate a common receptor through disparate sites.” (lines 464-477)

      Methods

      The description of the Droso-X assay seems to be missing some details. Currently, it is not obvious how the two-choice is established. Only one capillary is mentioned, I assume there were two used? Also, the meaning of the variables used in the equation (DrosoX and DrosoXD) are not explained.

      We expanded the description of the apparatus in the Droso-X assay section of the Materials and Methods. (lines 588-631)

      The description of the ex-vivo calcium imaging prep. is unclear in several points:

      (1) It is lacking information on how the stimulus was applied (was it manually washed in? If so how was it removed?).

      We expanded the description of the apparatus in the ex vivo calcium imaging section of the Materials and Methods. (lines 682-716)

      (2) The authors write: "A mild swallow deep well was prepared for sample fixation." I assume they might have wanted to describe a "shallow well"?

      We deleted the word “deep.” (line 691)

      (3) "...followed by excising a small portion of the labellum in the extended proboscis region to facilitate tastant access to pharyngeal organs." It is not clear to me how one would excise a small portion of the labellum, the labellum depicts the most distal part of the proboscis that carries the sensillae and pegs. Did the authors mean to say that they cut a part of the proboscis?

      Yes. We changed the sentence to “…followed by excising a small portion of the extended proboscis to facilitate tastant access to the pharyngeal organs.” (lines 693-695)

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      In this manuscript, Sang et al. proposed that a pair of IR60b-expressing pharyngeal neurons in Drosophila use IR25a, IR76b, and IR60b channels to detect high Na+ and limit its consumption. Some of the key findings that support this thesis are: 1) animals that lacked any one of these channels - or with their IR60b-expressing neurons selectively silenced - showed much reduced rejection of high Na+, but restored rejection when these channels were reintroduced back in the IR60b neurons; 2) animals with TRPV artificially expressed in their IR60b neurons rejected capsaicin-laced food whereas WT did not; 3) IR60b-expressing neurons exhibited increased Ca2+ influx in response to high Na+ and such response went away when animals lacked any of the three channels. In general, I find the collective evidence presented by the authors convincing. But I feel the MS can benefit from having a discussion section and a few simple experiments. Below I listed some inconsistencies I hope the authors can address or at least discuss.

      We have now added a Discussion section, and expanded the discussion.

      (1) The role of IR60b neurons on suppressing PER appeared inconsistent. On the one hand, optogenetic activation of these neurons suppressed PER (Fig 1D), on the other hand, IR60b mutants were as competent to suppress PER in response to high salt as WT (Fig 2G). Are pharyngeal neurons expected to modulate PER? It might be worth including a retinal-free or genotype control to ascertain the PER suppression exhibited by IR60b>CsChrimson is genuine.

      Please note that Figure 2G is now Figure 2H.

      Our interpretation is that activation of aversive GRNs by high salt either in labellar bristles or in the pharynx is sufficient to inhibit repulsion to high salt. Consistent with this conclusion, optogenetic activation of Ir60b GRNs, which are specific to the pharynx, is sufficient to reduce the PER to sucrose containing food (Figure 1D). However, mutation of Ir60b has no impact on the PER to sucrose plus high (300 mM) NaCl since the high-salt activated GRNs in labellar bristles are not impaired by the Ir60b mutation. In contrast, Ir25a and Ir76b are required in both labellar bristles and in the pharynx to reject high salt. As a consequence, mutation of either Ir25a or Ir76b impairs the repulsion to high salt. Thus, there is no inconsistency between the optogenetics and PER results. We clarified this point in the Discussion section. In terms of controls for IR60b>CsChrimson, we show that UAS-CsChrimson alone or UAS-CsChrimson in combination with the Gr5a driver has no impact on the PER (Figure 1D). In addition, we now include a retinal free control (Figure 1D). These findings provide the key genetic controls and are described in the Results section. (lines 167-170)

      (2) The role of labellar high-salt sensors in regulating salt intake appeared inconsistent. On the one hand, they appeared to have a role in limiting high salt consumption because poxn mutants were significantly more receptive to high salt than WT (Fig. 2J). On the other hand, selectively restoring IR76b or IR25a in only the IR60b neurons in these mutants - thus leaving the labellar salt sensors still defective - reverted the flies to behave like WT when given a choice between sucrose vs. sucrose+high salt (Fig 3J, L).

      We now offer an explanation for these seemingly conflicting results in the Discussion section. When we employed the DrosoX assay with mutants that had functional Ir60b GRNs but were missing salt aversive GRNs in labellar bristles, the flies behaved the same as control flies (e.g. Figure 3J and L). However, using solid food binary assays, Poxn mutants, which are missing labellar taste bristles but retain Ir60b GRNs (LeDue et al., 2015), display aversion to high salt food that is intermediate between control and Ir60b mutant flies (Figure 2J). Poxn mutants retain taste pegs (LeDue et al., 2015), which are exposed to food substrates only when the labial palps open. We suggest that the taste pegs harbor high salt sensitive GRNs, and they may be exposed to solid substrates, but not to the liquid in capillary tubes used in the DrosoX assays. This explanation would also account for the findings that the Ir60b mutant is indifferent to 300 mM NaCl in the DrosoX assay (Figure 3B), but prefers 1 mM sucrose alone over 300 mM NaCl and 5 mM sucrose in the solid food binary assay (Figure 1B). (lines 433-444)

      (3) The behavior sensitivity of IR60b mutant to high salt again appeared somewhat inconsistent when assessed in the two different choice assays. IR60b mutant flies were indifferent to 300 mM NaCl when assayed with DrosoX (Fig 3A, B) but were clearly still sensitive to 300 mM NaCl when assayed with "regular" assay - they showed much reduced preference for 5 mM sucrose over 1 mM sucrose when the 5 mM sucrose was adulterated with 300 mM NaCl (Fig 1B).

      The explanation provided above may also account for the findings that the Ir60b mutant is indifferent to 300 mM NaCl in the DrosoX assay (Figure 3B), but not when selecting between 300 mM NaCl and 5 mM sucrose versus 1 mM sucrose in the solid food binary assay (Figure 1B). Alternatively, the different behavioral responses might be due to the variation in sucrose concentrations in each of these two assays, which employed 5 mM sucrose in the solid food binary assay, as opposed to 100 mM sucrose in the DrosoX assay. This disparity in attractive valence between these two concentrations of sucrose might consequently impact feeding amount and preference. This point is now also included in the Discussion section. (lines 441-449)

      (4) Given the IR60b neurons exhibited clear IR60b/IR25a/IR76b-dependent sucrose sensitivity, too, I am curious how the various mutant animals behave when given a choice between 100 mM sorbitol vs. 100 mM sorbitol + 300 mM NaCl, a food choice assay not complicated by the presence of sucrose. Similarly, I am curious if the Ca2+ response of IR60b neurons differs significantly when presented with 100 mM sucrose vs. when presented with 100 mM sucrose + 300 mM NaCl. In principle, the magnitude for the latter should be significantly larger than the former as animals appeared to be capable of discriminating these two choices solely relying on their IR60b neurons.

      To investigate the aversion induced by high salt in the absence of a highly attractive sugar, such as sucrose, we combined 300 mM salt with 100 mM sorbitol, which is a tasteless but nutritive sugar (Burke & Waddell, 2011; Fujita & Tanimura, 2011). Using two-way choice assays, we found that the Ir25a, Ir60b, and Ir76b mutants exhibited substantial reductions in high salt avoidance (Figure 3—figure supplement 2A). In addition, we performed DrosoX assays using 100 mM sorbitol alone, or sorbitol mixed with 300 mM NaCl. Sorbitol alone provoked less feeding than sucrose since it is a tasteless sugar (Figure 3—figure supplement 2B and C). Nevertheless, addition of high salt to the sorbitol reduced food consumption (Figure 3—figure supplement 2B and C). (lines 300-308)

      We also conducted a comparative analysis of the Ca2+ responses within the Ir60b GRN, examining its reaction to various stimuli, including 100 mM sucrose alone, 300 mM NaCl alone, and a combination of 100 mM sucrose and 300 mM NaCl. We found that the Ca2+ responses were significantly higher when we exposed the Ir60b GRN to 300 mM NaCl alone, compared with the response to 100 mM sucrose alone (Figure 4—figure supplement 1D). However, the GCaMP6f responses were not higher when we presented 100 mM sucrose with 300 mM NaCl, compared with the response to 300 mM NaCl alone (Figure 4—figure supplement 1D). (lines 360-367)

      Minor issues

      (1) The labels of sucrose concentration on Figure 2D were flipped.

      This has been corrected.

      (2) The phrasing of the sentence that begins in line 196 (i.e., "This suggests the internal sensor ...") is not optimal.

      We changed the sentence to, “We found that the aversive behavior to high salt was reduced in the Poxn mutants relative to the control (Figure 2J), consistent with previous studies demonstrating roles for GRNs in labellar bristles in high salt avoidance (Jaeger et al, 2018; McDowell et al, 2022; Zhang et al, 2013).” (lines 217-219)

      (3) In Line 231, I am not sure why the authors think ectopic expressing IR60b in labellar neurons would allow them to become activated by Na+. It seems highly unlikely to me, especially given IR60b also plays a role in sensing sugar.

      We added the following paragraph to the Discussion addressing this point, “An open question is the subunit composition of the pharyngeal high Na+ receptor, and whether the sucrose/glucose and Na+ receptors in the Ir60b GRN are the same or distinct. Our results indicate that the high salt sensor in the Ir60b GRN includes IR25a, IR60b and IR76b since all three IRs are required in the pharynx for sensing high levels of NaCl. I-type sensilla do not elicit a high salt response, and we were unable to induce salt activation in I-type sensilla by ectopically expressing Ir60b, under control of the Gr33a-GAL4. This indicates that IR25a, IR60b and IR76b are insufficient for sensing high Na+. The inability to confer a salt response by ectopic expression of Ir60b was not due to absence of Ir25a and Ir76b in Gr33a GRNs since Gr33a and Gr66a are co-expressed (Moon et al., 2009), and Gr66a GRNs express Ir25a and Ir76b (Li et al., 2023). Thus, the high salt receptor in Ir60b GRNs appears to require an additional subunit. Given that Na+ and sugars are structurally unrelated, we suggest that the Na+ and sucrose/glucose receptors do not include the identical set of subunits, or that they activate a common receptor through disparate sites.” (lines 464-477)

      Reviewer #2 (Recommendations For The Authors):

      Line 41, acutely excessive salt ingestion can lead to death, not just health issues

      We now state that, “consumption of excessive salt can contribute to various health issues in mammals, including hypertension, osteoporosis, gastrointestinal cancer, autoimmune diseases, and can lead to death.” (lines 41-43)

      Line 46, delete the comma after flies

      Done. (line 47)

      Lines 51-56: This description is unnecessarily confusing and does not cite proper sources. Renaming these GRNs arbitrarily can only create confusion, plus this description lacks nuance. If E GRNs are Ir94e positive, this description is out of date. Furthermore, if D GRNs are ppk23 and Gr66a positive then they will respond to both bitter and high salt.

      Papers to consult: https://elifesciences.org/articles/37167 10.1016/j.cell.2023.04.038

      We have now added citations. We prefer the A—E nomenclature, which was introduced in a 2021 Genetics review by one of the authors of this manuscript (Montell) (Montell, 2021) since naming different classes of GRNs on the basis of markers or as sweet, bitter, salt and water GRNs is misleading and an oversimplification. We cite the Genetics 2021 review, and for added clarity include both types of former names (markers and sweet, bitter, salt and water). Class D GRNs are not marked by Gr66a. The eLife reference cited above provided the initial rationale for stating that Class E GRNs are marked by Ir94e and activated by low salt. According to the Taisz et al reference (Cell 2023), the Class E GRNs, which are marked by Ir94e, are also activated by pheromones, which we now mention (Taisz et al, 2023). (lines 51-59)

      Line 62, E GRNs are not required for low salt behaviors

      We do not state that E GRNs are required for low salt behaviors, only that they sense low Na+ levels. (line 58)

      Line 70-81 - Great deal of emphasis on labellar GRNs but then no mention of how pharyngeal GRNs fit into categories A-E

      We devote the following paragraph to pharyngeal GRNs. We do not mention how they fit in with the A—E categories because it is not clear.

      “In addition to the labellum and taste bristles on other external structures, such as the tarsi, fruit flies are endowed with hairless sensilla on the surface of the labellum (taste pegs), and three internal taste organs lining the pharynx, the labral sense organ (LSO), the ventral cibarial sense organ (VCSO), and the dorsal cibarial sense organ (DCSO), which also function in the decision to keep feeding or reject a food (Chen & Dahanukar, 2017, 2020; LeDue et al., 2015; Nayak & Singh, 1983; Stocker, 1994). A pair of GRNs in the LSO express a member of the gustatory receptor family, Gr2a, and knockdown of Gr2a in these GRNs impairs the avoidance to slightly aversive levels of Na+ (Kim et al, 2017). Pharyngeal GRNs also promote the aversion to bitter tastants, Cu2+, L-canavanine, and bacterial lipopolysaccharides (Choi et al, 2016; Joseph et al., 2017; Soldano et al, 2016; Xiao et al, 2022). Other pharyngeal GRNs are stimulated by sugars and contribute to sugar consumption (Chen & Dahanukar, 2017; Chen et al, 2021; LeDue et al., 2015). Remarkably, a pharyngeal GRN in each of the two LSOs functions in the rejection rather than the acceptance of sucrose (Joseph et al., 2017).” (lines 74-89)

      Line 89, aversive --> aversion

      We changed this part.

      Line 90, gain of capsaicin avoidance suggests they are sufficient for avoidance, not essential for avoidance.

      We changed “essential” to “sufficient.” (line 100)

      Line 104, what are you recording from here? Labellar or pharyngeal GRNs?

      We added “S-type and L-type sensilla” to the sentence. (line 119)

      Line 107, How are A GRNS marked with tdTomato? It is important to mention how you are defining A GRNs.

      We modified the sentence as follows: “Using Ir56b-GAL4 to drive UAS-mCD8::GFP, we also confirmed that the reporter was restricted to a subset of Class A GRNs, which were marked with LexAop-tdTomato expressed under the control of the Gr64f-LexA (Figure 1—figure supplement 1D—F).” (lines 120-123)

      Line 124, should read "concentrated as sea water."

      We made the change. (line 142)

      Line 125, I am not sure what is meant by "alarm neurons"

      We changed “additional pain or alarm neurons” to “nociceptive neurons.” (line 144)

      Line 141, Are you defining A GRNs as only labellar GRNs, i.e. the Gr5a-GAL4 pattern with labellar plus few pharyngeal GRNs? Or are you defining them as Gr64f-GAL4 (i.e. labellar plus many pharyngeal GRNs)?

      We refer to the Class A—E GRNs as labellar GRNs. Therefore, in this instance, we removed the reference to A GRNs and B GRNs, and simply mention the drivers that we used (Gr5a-GAL4 and Gr66a-GAL4) to express UAS-CsChrimson. The modified sentence is, “As controls we drove UAS-CsChrimson under control of either the Gr5a-GAL4 or the Gr66a-GAL4.” (lines 51-59, 160-161)

      Line 180, labellar hairs--> labellar taste bristles

      We made the change. (line 204)

      Line 190, possess only --> only possess

      We made the change. (line 216)

      Line 202, Should this read increased?

      Yes. We changed “reduced” to “increased.” (line 225)

      Line 206, The information provided here and in reference 47 was not sufficient for me to understand how the Droso-X system works and whether it has been validated. Better diagrams and much more description is required for the reader to understand this system and assess its validity

      We now explain that the DrosoX “system consists of a set of five separately housed flies, each of which is exposed to two capillary tubes with different liquid food options. One capillary contained 100 mM sucrose and the other contained 100 mM sucrose mixed with 300 mM NaCl. The volume of food consumed from each capillary is then monitored automatically over the course of 6 hours and recorded on a computer.” (lines 238-243)

      Line 218-219, It would be helpful to expand on this to explain how the previous paper detected no difference. Is this because the contact time with the food is the same but the rate of ingestion is slower?

      Yes. This is correct. We now clarify this point by stating that, “In a prior study, it was observed that the repulsion to high salt exhibited by the Ir60b mutant was indistinguishable from wild-type (Joseph et al., 2017). Specifically, the flies were presented with a drop of liquid (sucrose plus salt) at the end of a probe, and the Ir60b mutant flies fed on the food for the same period of time as control flies (Joseph et al., 2017). However, this assay did not discern whether or not the volume of the high salt-containing food consumed by the Ir60b mutant flies was reduced relative to control flies. Therefore, to assess the volume of food ingested, we used the DrosoX system, which we recently developed (Figure 3—figure supplement 1A) (Sang et al, 2021). This system consists of a set of five separately housed flies, each of which is exposed to two capillary tubes with different liquid food options. One capillary contained 100 mM sucrose and the other contained 100 mM sucrose mixed with 300 mM NaCl. The volume of food consumed from each capillary was then monitored automatically over the course of 6 hours and recorded on a computer. We found that control flies consumed approximately four times more of the 100 mM sucrose than the sucrose mixed with 300 mM NaCl (Figure 3A). In contrast, the Ir25a, Ir60b, and Ir76b mutants consumed approximately two-fold less of the sucrose plus salt (Figure 3A). Consequently, they ingested similar amounts of the two food options (Figure 3B; ingestion index). Thus, while the Ir60b mutant and control flies spend similar amounts of time in contact with high salt-containing food when it is the only option (Joseph et al., 2017), the mutant consumes considerably less of the high salt food when presented with a sucrose option without salt.” (lines 226-251)

      Lines 231-235, Is this evidence for this, that Ir60b expression in the Ir25a or Ir76b pattern will induce high salt responses in the labellum? You should elaborate on this to clearly state what you mean rather than implying it. I do not think that overexpression of one Ir is enough evidence for this sweeping conclusion.

      We agree. We eliminated this point. (lines 227-232)

      Lines 261-263, Please elaborate here, how did you target the I-type sensilla and where are these neurons? So they already express Ir76b and Ir25a?

      We now explain in the Results that, “We attempted to induce salt activation in the I-type sensilla by ectopically expressing Ir60b, under control of the Gr33a-GAL4. Gr33a is co-expressed with Gr66a (Moon et al., 2009), which has been shown to be co-expressed with Ir25a and Ir76b (Li et al., 2023). When we performed tip recordings from I7 and I10 sensilla, we did not observe a significant increase in action potentials in response to 300 mM NaCl (Figure 4—figure supplement 1A), indicating that ectopic expression of Ir60b in combination with Ir25a and Ir76b is not sufficient to generate a high salt receptor.” (lines 324-330)

      Lines 300-303, The discussion needs to be greatly expanded. What is the proposed mechanism by which the same neurons/receptors can inhibit sucrose and high salt feeding? What is the author's interpretation of what this study adds to our understanding of taste aversion?

      We have now added a Discussion section and greatly expanded the discussion.

      Reviewer #3 (Recommendations For The Authors):

      In line 73 there is a typo in "esophagus"

      We changed this part.

      In line 331, the use of a mixture of sucrose and "saponin" seems to be a mistake; "NaCl" is likely intended.

      We made the correction. (lines 546 and 640)

      On several occasions, the authors refer to the pharynx as a taste organ (for example 1st sentence of the abstract). I am not sure this is correct; the actual pharyngeal taste organs are the LSO, DCSO, and VCSO, which are located in the pharynx.

      We made the corrections. (lines 24, 90, 92, 93, and 356)

      In line 155 the authors refer to Ir25a and Ir76b as "broadly tuned". I think it is not correct to refer to co-receptors this way, I'd suggest to just call them co-receptors.

      We made the correction. (lines 177-178)

      In line 182, stating "Gr2a is also expressed in the proboscis" is unclear. Clarify whether it refers to sensillae, pharyngeal taste organs, etc.

      We clarified it refers to pharyngeal taste organs. (lines 206-207)

      Line 253: "These finding imply that all three Irs are coexpressed in the pharynx." "The pharynx" is very unspecific, did the authors mean to say "the same neuron"?

      We now clarify by saying “in the Ir60b GRN in the pharynx.” (line 317)

      Figures & Legends

      I found it confusing that the same color scale is being reused for different panels with different meanings repeatedly and in inconsistent ways. For example in Figure 2, red and blue are being used for Ir25a² mutants, while blue is also being used for Gr64f-Gal4 and S type sensilla. It is also not easily visible nor mentioned in the caption which of the 3 color scales presented belong to which panels.

      We modified the colors in the figures so that they are used in a consistent way. We now also define the colors in the legends.

      In Figure 2 F-I, indicating the stimulus sequence in each panel would enhance clarity. The color scale in Figure 3 could benefit from explicit explanations of different shades in the caption for easier interpretation.

      For example: "The ingestion of (a, dark color) 100 mM sucrose alone and (b, light color) in combination with 300 mM"

      We made the suggested modification.

      In Figure 4a the authors highlight that Ir76b and Ir25a label 2 neurons in the LSO. Did the imaging in 4c also capture the second cell, and if so did it respond to their stimulation?

      No, the focal plane differs, and the signal in Figure 4C is considerably weaker compared to the immunohistochemistry shown in Figure 4A. Notably, the other neuron did not exhibit a response to NaCl.

      In Figure 4f a legend for the color scale is missing, or the color might not be necessary at all. Also, the asterisks seem to be shifted to the right.

      We fixed the shifted asterisks and eliminated the color.

      Figure 4i is mislabeled 4f

      We made the correction.

    2. eLife assessment

      This valuable study on the molecular and cellular mechanisms of ingestion avoidance of high salt in insects is focused in scope, but the authors present convincing evidence that a specific subset of gustatory receptors in a pair of pharyngeal taste neurons are necessary and sufficient for avoiding ingestion of high salt during feeding. This work will be of interest to Drosophila neuroscientists interested in taste coding and feeding behavior.

    3. Reviewer #1 (Public Review):

      Summary:

      In this manuscript, Sang et al. proposed that a pair of IR60b-expressing pharyngeal neurons in Drosophila use IR25a, IR76b, and IR60b channels to detect high Na+ and limit its consumption. Some of the key findings that support this thesis are: 1) animals that lacked any one of these channels - or with their IR60b-expressing neurons selectively silenced - showed much reduced rejection of high Na+, but restored rejection when these channels were reintroduced back in the IR60b neurons; 2) animals with TRPV artificially expressed in their IR60b neurons rejected capsaicin-laced food whereas WT did not; 3) IR60b-expressing neurons exhibited increased Ca2+ influx in response to high Na+ and such response went away when animals lacked any of the three channels.

      The experiments were thorough and well designed, and were further improved after revision. The results are compelling and support the main claim. The development and use of the DrosoX two-choice assay allow a more quantitative, automated, and unbiased assessment of ingestion volume and preference.

    4. Reviewer #2 (Public Review):

      Summary:

      In this paper, Sang et al. set out to identify gustatory receptors involved in salt taste sensation in Drosophila melanogaster. In a two-choice assay screen of 30 Ir mutants, they identify that Ir60b is required for avoidance of high salt. In addition, they demonstrate that activation of Ir60b neurons is sufficient for gustatory avoidance using either optogenetics or TRPV1 to specifically activate Ir60b neurons. Then, using tip recordings of labellar gustatory sensory neurons and proboscis extension response behavioral assays in Ir60b mutants, the authors demonstrate that Ir60b is dispensable for labellar taste neuron responses to high salt and the suppression of proboscis extension by high salt. Since external gustatory receptor neurons (GRNs) are not implicated, they look at Poxn mutants, which lack external chemosensory sensilla but have intact pharyngeal GRNs. High salt avoidance was reduced in Poxn mutants but was still greater than Ir60b mutants, suggesting that pharyngeal gustatory sensory neurons alone are sufficient for high salt avoidance. The authors use a new behavioral assay to demonstrate that Ir60b mutants ingest a higher volume of sucrose mixed with high salt than control flies do, suggesting that the action of Ir60b is to limit high salt ingestion. Finally, they identify that Ir60b functions within a single pair of gustatory sensory neurons in the pharynx, and that these neurons respond to high salt but not bitter tastants.

      Strengths:

      A great strength of this paper is that it rigorously corroborates previously published studies that have implicated specific Irs in salt taste sensation. It further introduces a new role for Ir60b in limiting high salt ingestion, demonstrating that Ir60b is necessary and sufficient for high salt avoidance and convincingly tracing the action of Ir60b to a particular subset of gustatory receptor neurons. Overall the authors have achieved their aim by identifying a new gustatory receptor involved in limiting high salt ingestion. They use rigorous genetic, imaging, and behavioral studies to achieve this aim, often confirming a given conclusion with multiple experimental approaches. They have further done a great service to the field by replicating published studies and corroborating the roles of a number of other Irs in salt taste sensation.

    5. Reviewer #3 (Public Review):

      Sang et al. successfully demonstrate that a set of single sensory neurons in the pharynx of Drosophila promotes avoidance of food with high salt concentrations, complementing previous findings on Ir7c neurons with an additional internal sensing mechanism. The experiments are well-conducted and presented, convincingly supporting their important findings and extending the understanding of internal sensing mechanisms.

      The authors convincingly demonstrate the avoidance phenotype using different behavioral assays, thus comprehensively analyzing different aspects of the behavior. The experiments are straightforward and well-contextualized within existing literature.

    1. Author Response

      The following is the authors’ response to the original reviews.

      This study highlights new insights into the mechanism of pheochromocytoma pathogenesis that remains poorly understood. In the context of hereditary syndromes, such as multiple endocrine neoplasia 2 (MEN-2), where RET mutation is the major driver of thyroid, parathyroid, and adrenal pathologies, including pheochromocytoma, this mechanistic dissection of RET and TMEM127 is fundamentally sound. While the significance was deemed important, the strength of the evidence was found to be solid.

      Recognizing the limitations of models available for study of neuroendocrine cancers, and specifically for pheochromocytomas, we have revised and clarified the text of the current manuscript version and provide specific responses to the additional comments provided below, highlighting changes and new data.

      Reviewer #1 (Recommendations For The Authors):

      The current lack of pheochromocytoma cell lines and the use of generated cell lines for mechanistic studies present a significant challenge that may undermine the inferred value of these findings in mock in vitro systems and call into question their reproducibility in pheochromocytoma. Considering 3-dimensional patient-derived pheochromocytoma organoid in vitro models and patient-derived organoid xenograft in vivo models would enable confirmation or refutation of the novel findings described by the authors.

      We agree completely with Reviewer 1 that ideally, we should replicate these findings with PCC-derived cells in vitro and in organoids. Despite many attempts, PCC cell lines have proved a major challenge for the field of neuroendocrine cancers. Cell line models are not available and PDOs have proven poorly growing and resistant to manipulations, such as CRISPR KOs or siRNA KD. In studies completed since the submission and review of the present manuscript, and subsequently published elsewhere, we have shown that RET protein is highly expressed in TMEM127-mutant PCC by immunohistochemistry. We also showed that the TMEM127-KO SH-SY5Y cell model does grow more robustly than Mock-KO cells in nude mice and that RET inhibition (Selpercatinib) does lead to tumor regression (Guo et al., 2023), suggesting that our findings may be reproducible in vivo. These findings, and potential caveats of the cell models used have been further discussed in the text.

      Reviewer #2 (Recommendations For The Authors):

      Most notably, all experiments are conducted in an isogenic single-cell line. This exposes the whole story to be potentially confounded by unknown variables.

      In addition, studies would benefit from the adding back of TMEM127, or other methods to modulate endosome and plasma membrane dynamics to mechanistically secure the cause of the findings.

      As suggested by Reviewer 2, we have generated a TMEM127 KO in HEK293, an unrelated cell line which expresses low levels of TMEM127 but does not express RET. Consistent with our findings in SH-SY5Y, we saw increased membrane accumulation of endogenous membrane proteins N-cadherin and transferrin receptor-1 in these cells in the absence of TMEM127. Additionally, re-expression of a wildtype TMEM127 (FLAG-TMEM127) in these cells led to dramatic decreases in membrane localization of these proteins (Supplemental Figure 1D). These data suggest that membrane accumulation is indeed TMEM127 dependent, and that these processes are not directly dependent on RET expression.

      References

      Guo, Q., Z.M. Cheng, H. Gonzalez-Cantu, M. Rotondi, G. Huelgas-Morales, P. Ethiraj, Z. Qiu, J. Lefkowitz, W. Song, B.N. Landry, H. Lopez, C.M. Estrada-Zuniga, S. Goyal, M.A. Khan, T.J. Walker, E. Wang, F. Li, Y. Ding, L.M. Mulligan, R.C.T. Aguiar, and P.L.M. Dahia. 2023. TMEM127 suppresses tumor development by promoting RET ubiquitination, positioning, and degradation. Cell Rep. 42:113070.

    2. eLife assessment

      This valuable paper provides convincing evidence that loss of the tumor suppressor TMEM127 causes disorganization of plasma membrane lipid domains, alters clathrin assembly, and inhibits endocytosis of a variety of cell surface receptors, leading to increased cell surface levels of signaling proteins including RET and other transmembrane receptor tyrosine kinases. The results are significant for understanding how TMEM127 loss contributes to pheochromocytoma, although the evidence is indirect owing to the lack of human pheochromocytoma cell lines. The results will be of interest for researchers studying pheochromocytoma and endocytosis mechanisms.

    3. Reviewer #4 (Public Review):

      Summary:

      Walker et al. investigated the function of TMEM127 on RET regulation and function that could contribute to the development of pheochromocytoma (PCC). The authors showed that deletion of TMEM127 causes RET protein accumulation on the cell surface and, thereby, increased its constitutive ligand-independent activity and downstream signaling. They also unveiled the mechanism of how TMEM127 regulates cell membrane dynamics, particularly focusing on clathrin distribution and its effects on cargo internalization.

      Strengths:

      They showed that the deletion of TMEM127 affected multiple classes of transmembrane proteins, including RTKs (RET, EGFR), cell adhesion molecules (N-Cadherin, Integrin Beta-3), and carrier proteins (Transferrin Receptor-1), suggesting a global effect on cell surface proteins. This, at least in part, may explain how TMEM127 mutations act as drivers in PCC as well as in other cancers, such as renal cell carcinoma, where RET is not highly expressed. Overall, these findings provide new insights into the understanding of pheochromocytoma pathogenesis and potentially other cancers.

      Weaknesses:

      The major weakness of this study is the lack of human PCC cell lines for evaluating the function of TMEM127. Currently, the cell line models for pheochromocytoma are unavailable, and manipulation of patient-derived organoids is challenging. To complement this weakness, they provided immunohistochemical data showing that RET protein is highly expressed in TMEM127-mutant PCC.

      Furthermore, some of the authors in this manuscript recently published a paper titled 'TMEM127 suppresses tumor development by promoting RET ubiquitination, positioning, and degradation' (Guo et al. Cell Reports 42, 113070, 2023, which is also cited in the current manuscript). In this manuscript, they showed that TMEM127 binds to RET and recruits the NEDD4 E3 ubiquitin ligase for RET ubiquitination and degradation via TMEM127. In general, the ubiquitination of proteins is highly specific to each molecule. In the current version of the manuscript, there is no description of the relevance between these two potentially different mechanisms (clathrin-mediated or ubiquitin-mediated) of accumulating RET and/or other proteins mentioned in two separate papers. I believe the authors should at least discuss this.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      In this manuscript by DeHaro-Arbona et al., the authors wish to understand how a signaling pathway (Notch) is dynamically decoded to elicit a specific transcriptional output. In particular, they investigate the kinetic properties of Notch-responsive nuclear complexes (the DNA binding factor CSL and its co-activator Mastermind (mam) along with several candidate interacting partners). Their experimental model is the polytene chromosome of the Drosophila salivary gland, in which the naturally inactive Notch can be artificially induced through the expression of a constitutively active form of Notch.

      The authors develop a series of CRISPR and transgenic lines enabling the live imaging of these complexes at a specific locus and in various backgrounds (genetic perturbations/drug treatments). This quantitative live imaging data suggests that Notch nuclear complexes form hubs, and the authors characterize their binding dynamics. Interestingly, they elegantly demonstrate that the content of these hubs and their kinetic properties can evolve, even within Notch ON cells. Hence, they propose the existence of distinct hubs, distinguishing an open (CSL), engaged (CSL-Mam), or active (CSL-Mam-Med-PolII) configuration in Notch ON cells and an inactive hub state (in Notch OFF cells previously exposed to Notch), that would explain the surprising transcriptional memory that the authors observe hours after Notch withdrawal.

      We thank the reviewer for this constructive summary of our work.

      Reviewer #2 (Public Review):

      The manuscript from deHaro-Arbona et al, entitled "Dynamic modes of Notch transcription hubs conferring memory and stochastic activation revealed by live imaging the co-activator Mastermind", uses single molecule microscopy imaging in live tissues to understand the dynamics and molecular determinants of transcription factor recruitment to the E(spl)-C locus in Drosophila salivary gland cells under Notch-ON and -OFF conditions. Previous studies have identified the major players that are involved in transcription regulation in the Notch pathway, as well as the importance of general transcriptional coregulators, such as CBP/P300 and the Mediator CDK module, but the detailed steps and dynamics involved in these processes are poorly defined. The authors present a wealth of single molecule data that provides significant insights into Notch pathway activation, including:

      (1) Activation complexes, containing CSL and Mam, have slower dynamics than the repressor complexes, containing CSL and Hairless.

      (2) Contribution of CSL, NICD, and Mam IDRs to recruitment.

      (3) CSL-Mam slow-diffusing complexes are recruited and form a hub of high protein concentrations around the target locus in Notch-ON conditions.

      (4) Mam recruitment is not dependent on transcription initiation or RNA production.

      (5) CBP/P300 or its associated HAT activity is not required for Mam recruitment.

      (6) Mediator CDK module and CDK8 activity are required for Mam recruitment, and vice-versa, but not CSL recruitment.

      (7) Mam is not required for chromatin accessibility but is dependent on CSL and NICD.

      (8) CSL recruitment and increased chromatin accessibility persist after NICD removal and loss of Mam, which confers a memory state that enables rapid re-activation in response to subsequent Notch activation.

      (9) Differences in the proportions of nuclei with both Pol II and with Mam enrichment, which results in transcription being probabilistic/stochastic. These data demonstrate that the presence of Mam complexes is not sufficient to drive all the steps required for transcription in every Notch-ON nucleus.

      (10) The switch from more stochastic to robust transcription initiation was elicited when ecdysone was added.

      Overall, the manuscript is well written, concise, and clear, and makes significant contributions to the Notch field, which are also important for a general understanding of transcription factor regulation and behavior in the nucleus. I recommend that the authors address my relatively minor criticisms detailed below.

      We thank the reviewer for their thorough and constructive summary of our work. We are glad that they overall found it insightful and interesting. Below we have addressed the points they have raised.

      Page 7, bottom. The authors speculate, "It is possible therefore that, once recruited, Mam can be retained at target loci independently of CSL by interactions with other factors so that it resides for longer." Is it possible that another interpretation of that data is that Mam is a limiting factor?

      As indicated, our comment is a speculation and is based on the observations summarized in the paragraph. We are not entirely sure what the reviewer is proposing as an alternate model. However, if it relates to the relative concentrations of the different factors, this would not account for the differences in trajectory durations. And for most aspects of our analysis, Koff has the most profound influence on the results. Furthermore, differences persist even when CSL levels are considerably reduced (as in conditions with Hairless RNAi).

      Page 9. The authors write, "A very low level of enrichment was evident for... for the CSL C-terminus..". The recruitment of CSL ct IDR does not appear to be statistically significant or there is no apparent difference (Figure S2C), suggesting the CSL ct IDR does not play a role in enrichment.

      We agree with the comments of the reviewer and have adjusted the text on page 9 accordingly.

      Page 9. The authors write, "Notably, MamnIDR::GFP fusion was present in droplets, suggesting it can self-associate when present in a high local concentration (Figure S2B)." Is this result only valid for Mam nIDR or does full-length Mam also localize into droplets, as has been previously observed for full-length mammalian Maml1 in transfected cells?

      We agree that the observed foci of MamL1 that have been detected in mammalian cells are interesting. We have not tried to replicate those data because the large size of Mam has made it challenging to produce a full-length form in over-expression. We note however that another portion of Mam, MamIDR, does not make droplets when over-expressed despite it containing a large section of the disordered region of the Drosophila Mam. We have now included a comment about the mammalian data in the text (page 9) to put our findings in context.

      Previous studies in mammalian cells suggest that Maml1 is a high-confidence target for phosphorylation by CDK8, see Poss et al 2016 Cell Reports https://doi.org/10.1016/j.celrep.2016.03.030. By sequence comparison, does fly Mam have similar potential phosphorylation sites, and might these be critical for Mam/CDK module recruitment?

      We thank the reviewer for highlighting this point. Indeed, we were very excited when we learnt that MamL1 was found to be a high confidence CDK8 target and we looked hard in the Mam sequence for potential phosphorylation sites. Sadly, there is very little conservation between the fly and the mammalian proteins beyond the helical region that contacts CSL and NICD. Furthermore, there are no identifiable putative CDK8 phosphorylation sites based on conventional motifs. It therefore remains to be established whether or not Mam is a direct target of the CDK8 kinase activity. We have added an explanatory comment in the text (page 11).

      Page 11: The authors write, "The differences in the effects on Mam and CSL imply that the CDK module is specifically involved in retaining Mam in the hub, and that in its absence other CSL complexes "win-out", either because the altered conditions favour them and/or because they are the more abundant." Are the "other" complexes the authors are referring to Hairless-containing complexes? With the reagents the authors have in hand couldn't this be explicitly shown for CSL complexes rather than speculated upon?

      The reviewer is correct that CSL complexes containing Hairless are good candidates to be recruited in these conditions. We have compared the levels of Hairless at E(spl)-C following treatments with Senexin and have not detected a difference. However, it appears that the high proportion of unbound Hairless makes it difficult to detect/quantify the enrichment at E(spl)-C. We have therefore taken a different strategy, which is to measure the recruitment of a mutant form of CSL that is compromised for Hairless binding. Recruitment of the mutant CSL is detected in Notch-ON conditions, but is significantly reduced/absent following Senexin treatment. These data favour the model proposed by the reviewer that in the absence of CDK8 activity, the CSL-Hairless complexes win out. These new data have been added in new Supplementary Figure S3F and S3G (and see text page 11).

      Page 12/13: The authors write, "Based on these results we propose that, after Notch activity decays, the locus remains accessible because when Mam-containing complexes are lost they are replaced by other CSL complexes (e.g. co-repressor complexes)." Again, why not actually test this hypothesis rather than speculate? The dynamics of Hairless complexes following the removal of Notch would be very interesting and build upon previously published results from the Bray lab.

      We thank the reviewer for this comment and we agree it’s possible that the proportion of Hairless complexes increases after Notch withdrawal. However, for the reasons outlined above, it is difficult to quantify changes in Hairless (and our preliminary experiment did not reveal any large-scale effect), and because of the complexity of the genetics we cannot straightforwardly extend the experiment to analyze the behaviour of the mutant CSL as above. Therefore, at present, we cannot say whether the loss of Mam is compensated by an increase in Hairless. We hope in future to investigate the characteristics of the memory in more depth.

      Page 13: The authors write, "As Notch removal leads to a loss of Mam, but not CSL, from the hub, it should recapitulate the effects of MamDN." While the data in Figure 5B seem to support this hypothesis, it's not clear to me that the loss of Mam and MamDN should phenocopy each other, because in the case of MamDN, NICD would still be present.

      We apologise that this sentence was a bit misleading. We have now rewritten it to improve accuracy (page 13) “As Notch removal leads to a loss of Mam, but not CSL, from the hub, we hypothesised it would recapitulate the effects of MamDN on chromatin accessibility and transcription of targets.”

      The temporal dynamics for Mam recruitment using the temperature- and optogenetic-paradigms are quite different. For example, in the optogenetic time course experiments, the preactivated cells are in the dark for 4 hours, while in the temperature-controlled experiments, there is still considerable enrichment of Mam at 4 hours. For the preactivated optogenetic experiments, how sure are the authors that Mam is completely gone from the locus, and alternatively, can the optogenetic experimental results be replicated in the temperature-controlled assays? My concern is whether the putative "memory" observation is just due to incomplete Mam removal from the previous activation event.

      We appreciate the concerns of the reviewer. However, we are confident that the 4-hour optogenetic inactivation is much more effective than the equivalent time for temperature shifts. The temperature-sensitive experiment involves a longer decay, because not only the protein but also the mRNA has to decay to fully remove NICD activity. The optogenetic experiments involve only protein decay and so are more acute. Furthermore, we have tested (and we show in Figure 5H) that Mam is fully depleted after 4 hours “Off” in the optogenetic experiments.

      In order to further strengthen the evidence in favour of the memory hub, we have extended the time-frame further to show that CSL is retained at the locus even after 24 hours “Notch OFF” in both the temperature and the optogenetic paradigm. We have also measured the effects on transcription after a 24hr OFF period using the optogenetic paradigm and seen that robust transcription is initiated in cells that have experienced a previous activation (preactivated) compared to those that have not (naïve). These new data have been added to new Figure 5 C-F and strongly support the memory model.

      Reviewer #3 (Public Review):

      Summary:

      DeHaro-Arbona and colleagues investigate the in vivo dynamics of Notch-dependent transcriptional activation with a focus on the role of the Mastermind (MAM) transcriptional co-activator. They use GFP and HALO-tagged versions of the CSL DNA-binding protein and MAM to visualize the complex, and Int/ParB to visualize the site of Notch-dependent E(Spl)-C transcription. They make several conclusions. First, MAM accumulates at E(Spl)-C when Notch signaling is active, just like CSL. Second, MAM recruits the CDK module of Mediator but does not initiate chromatin accessibility. Third, after signaling is turned off, MAM leaves the site quickly but CSL and chromatin accessibility are retained. Fourth, RNA pol II recruitment, Mediator recruitment, and active transcription were similar and stochastic. Fifth, ecdysone enhances the probability of transcriptional initiation.

      Strengths:

      The conclusions are well supported by multiple lines of extensive data that are carefully executed and controlled. A major strength is the strategic combination of Drosophila genetics, imaging, and quantitative analyses to conduct compelling and easily interpretable experiments. A second major strength is the focus on MAM to gain insights into the dynamics of transcriptional activation specifically.

      We thank the reviewer for their positive comments about the strengths of our work.

      Weaknesses:

      Weaknesses are minor. There were no p-values reported for data presented in Figure S1D and no indication of how variable measurements were. In addition, the discussion of stochasticity was not integrated optimally with relevant literature.

      We thank the reviewer for noting these points. The statistical tests have now been included for Figure S1D (now Figure S1F). We have amplified the discussion about stochasticity, to include more reference to the literature and to make clear also the distinction with transcription bursting (page 19, 20).

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      The authors have an elegant series of manipulations that provide strong evidence for their hypotheses and conclusions. Their exploitation of a unique biological system amenable to imaging in the larval salivary gland is well-considered and well-performed. Most of the conclusions are supported by the data. I only have the concerns below.

      (1) One of the main findings is the composition of Notch nuclear complexes and their interactions within a 'hub'. Yet most of the data showing hubs focus on labeling one protein component (+the locus or transcription), but multi-color imaging is rarely used to show how CSL-Mam, Mam-Med... protein signals coalescence to form a hub. Given the powerful tool developed, it would be important to show these multi-state hubs. Related to this, if the authors expect that hubs are formed independently of transcription or Notch pathway activation, do the authors see clustering at other non-specific loci in the nucleus? If not, can the authors comment on why they think that is the case? If so, do they demonstrate consistent residence time profiles with the tracked E(spl) locus?

      We apologise that it was not evident from the data shown that the proteins co-localize. First we stress that all the experiments are multicolor and most rely on very powerful methods to measure co-recruitment at a chromosomal locus- something that is very rarely achieved by others studying hubs. Second, we have in all cases confirmed that the proteins do colocalize. We have modified the diagram of our analysis pipeline to make more clear that this relies on multi-colour imaging, and adjusted all the figure labels to indicate the position of E(spl)-C. We have also added panels to new supplementary Figure S1C with examples of the co-localization between CSL and Mam and a plot confirming their levels of recruitment are correlated across multiple nuclei.

      We would like to clarify that our data show that the hubs do require Notch activation for their establishment. Other regions of enrichment are detected in Notch-ON conditions, but these are less prominent and, with no independent method for identifying them, can’t be compared between nuclei. In SPT experiments, other clusters with consistent residence are detected as reported in our recent paper which expanded on the SPT data (Baloul et al, 2023). We also detect co-localizations and “hubs” in other tissues, but those analyses are ongoing and beyond the scope of this paper.

      (2) The authors convincingly show that Notch hub complexes exhibit a memory. While the data showing rapid hub reformation upon Notch withdrawal are solid and convincing (Figure 5, in particular, F), the claim that this memory fosters rapid transcriptional reactivation is less clear. Yet in order to invoke transcriptional memory, it's necessary to solidify this transcriptional response angle. The authors should consider quantifying the changes in transcription activity (at the TS and not in the cytoplasm as currently shown), as well as the timing of transcriptional reactivation (with the MS2 system or smFISH). Manipulating the duration of the activation and dark recovery periods could help to draw a better correlation between the timing of hub reformation and that of transcriptional response and would also help determine how persistent this phenomenon is.

      We thank the reviewer for these suggestions. We have carried out several new experiments to probe further the persistence of memory and to show the effects on transcription when Notch is inactivated/reactivated. First, we have extended the time period for Notch inactivation by temperature control and show that the CSL hub persists even at 24 hours and that no transcription from the target E(spl)m3 is detected, neither at the transcription start-site nor in the cytoplasm. Second, we have extended the Notch OFF time period to 24 hours using the optogenetic approach and show that transcription is robustly reinitiated in preactivated nuclei when Notch is re-activated with 30 mins light treatment, while little if any E(spl)m3 transcription is detected in naïve nuclei with the same treatment. These new data are included in new Figure 5 C-F and see page 13-14. Both these new experiments substantiate the model that the nuclei retain transcriptional memory.

      (3) The manuscript ends with the finding that the presence of a Mam hub does not always correlate with transcription. They conclude that transcription is initially stochastic. The authors find this surprising and even state that this could not be observed without their in vivo live imaging approaches. I don't understand why this result is surprising or unexpected, as we now know that transcription is generally a stochastic process and that most (if not all) loci are transcribed in a bursting manner. The fact that E(spl)-C locus is bursty is already obvious from the smFISH data. The fact that active nascent transcription does not correlate with local TF hubs was already observed in early Drosophila embryos (with Zelda hubs and two MS2 reporters, hb-MS2, sna-MS2). If, in spite of the inherent stochasticity of transcription (bursting), the data are surprising for other reasons, the authors should explain it better.

      We apologise that we had not made clear the reasons why the results were unexpected. We have substantially rewritten this section, and the discussion section, to clarify. We have also moderated the language used to better reflect the overall context of our results. We briefly summarise here. As the reviewer correctly states, it is well known that transcription is inherently bursty. Indeed, the MS2 transcription profiles in “ON” nuclei are bursty, which likely reflects the switching of the promoter. However, in other contexts where we have monitored transcription, although it is bursty, it has nevertheless been initiated synchronously in response to Notch in all nuclei in a manner that was fully penetrant. What we observe in our current conditions is that some nuclei never initiate transcription over the time-course of our experiments (2-3 hours), and those that are ON rarely switch off. This implies that there is another rate-limiting step. Supplying a second signal can modulate this so that it occurs with much higher frequency/penetrance. We consider this to be a second tier of regulation above the fundamental transcriptional bursting.

      The fact that Mam is recruited in all nuclei, whether or not they are actively transcribing was surprising because recruitment of the activation complex has been considered as the limiting step. This is somewhat different from Zelda, which is thought to be permissive and needed at an early step to prime genes for later activation rather than to be the last step needed to fire transcription. We note also that we are not monitoring the position of the hub with respect to the promoter, as in the Zelda experiments (Zelda hubs may still persist, but they are not overlapping with the nascent RNA), we are monitoring the presence or absence of Mam hub in proximity to a genomic region.

      Minor suggestions:

      (1) The genotypes of the samples should be indicated in the figure legends.

      We thank the reviewer for this suggestion. We have provided a table (new Table S3) where all of the genetic combinations are provided in detail for each figure. We considered that this approach would be preferable because it would be quite cumbersome to have the genotypes in each legend as they would become very long and repetitive.

      (2) While the schematic Fig1A explains how the locus is detected, the presence of ParS/ParB is never indicated in subsequent panels and Figure. I assume that all panels depicting enrichment profiles, use a given radius from the ParS/ParB dot to determine the zero of the x-axis (grey zone). This should be clearly stated in all panels/figure legends concerned.

      We apologise if this was not made explicit. Yes, all panels depicting enrichment profiles use immunofluorescence signal from ParA/ParB recruitment to determine the zero of the x-axis. We have now marked this more clearly in all figures (grey bar, grey shading or labelled 0). All images where the locus is indicated by an arrowhead, by a coloured bar above the intensity plots or by grey shading in the graphs have been captured with dual colour and the signal from ParA/B recruitment used to define its location. This is now clearly stated in the analysis methods and in the legend. We have also modified the diagram in new supplementary Figure S1B, showing our analysis pipeline, to make that more explicit.

      (3) FRAP/SPT experiments: the author should provide more details. How many traces? Are traces showing bleaching removed?

      P7: does the statement ' The residences are likely an underestimation because bleaching and other technical limitations also affect track durations' imply that traces showing bleaching have not been removed from the analysis?

      The authors could justify the choice of the model for fitting FRAP/SPT experiments and be cautious about their interpretation. For example, interpreting a kinetic behavior as a DNA-specific binding event can be accurate only if backed up with measurements with a mutant version of the DNA binding domain.

      We apologise if some of this information was not evident. The number of trajectories is provided in new Figure S1F, which indicates the number of trajectories analyzed for each condition in Figure 1.

      We have now added also the numbers of trajectories analyzed for the ring experiments.

      The comments on page 7 about bleaching refer to the technical limitations of the SPT approach. However, as bleached particles cannot be distinguished from those that leave the plane of imaging, they have not been filtered or removed. We have not sought to make claims about absolute residence times for that reason. Rather the point is to make a comparison between the different molecules. As the same fluorescent ligand and imaging conditions are used in all the experiments, all the samples are equivalently affected by bleaching. We subdivide trajectories according to their properties and infer that those which are essentially stationary are bound to chromatin, as is common practice in the field. We note that we have previously shown that a DNA binding mutant of CSL does not produce a hub at E(spl)-C in Notch-ON conditions and has a markedly more rapid recovery in FRAP experiments (Gomez-Lamarca et al, 2018) consistent with the slow recovery being related to DNA binding. This point has been added to the text (page 8).

      (4) The authors should quantify their RNAi efficiency for Hairless-RNAi, Med13-RNAi, white-RNAi, yellow-RNAi, CBP-RNAi, and CDK8-RNAi.

      We thank the reviewer for this comment. We have made sure that we are using well validated RNAis in all our experiments and have included the references in Table S2 where they have been used. We have now evaluated the knock-down in the precise conditions used in our experiments by quantitative RT-PCR and added those data, which show efficient knock-down is occurring, to new Supplementary Figure S1D and Figure S3J. We note also that the RNAi experiments are complemented by experiments inhibiting the complexes with specific drugs and that these yield similar results.

      (5) Figure 3 A: could the author show that transcription is indeed inhibited upon triptolide treatment with smFISH (with for example m3 probes)? Why not use alpha-amanitin?

      We thank the reviewer for this suggestion. We had omitted the smFISH data from this experiment in error. These data have now been added to new Supplementary Figure S3A and clearly show that transcription is inhibited following 1 hour exposure to triptolide. Triptolide is a very fast acting and very efficient inhibitor of transcription that acts at a very early step in transcription initiation. In our experience it is much more efficient than alpha-amanitin and is now the inhibitor of choice in many transcription studies.

      (6) Figure 4 typo: panel B should be D and vice versa. Accessibility panels are referred to as Figure 4D, D' in the text but presented as panel B in the Figure.

      We thank the reviewer for noting this mistake; it has now been corrected in the main text.

      (7) The authors must add their optogenetic manipulation protocol to their methods section.

      The method is described in detail in a recently published paper that reports its design and use. We have now also added a section explaining the paradigm in the methods (Page 31) as requested.

      (8) Figure 3G needs a Y-axis label.

      Our apologies, this has now been added.

      (9) The authors should note why there was a change of control in Figure 3D compared to 3E and G (yellow RNAi vs white RNAi).

      This is a pragmatic choice that relates to the chromosomal site of the RNAis being tested. Controls were chosen according to the chromosome that carries the UAS-RNAi: for the second chromosome this was yellow RNAi and for the third white RNAi. This is explained in the methods.

      (10) Figure 1 would benefit from a diagram describing the genomic structure of the E(spl) locus and the relative position of the labelled locus within it.

      We thank the reviewer for this suggestion and have added a diagram to Supplementary Figure S1A.

      Reviewer #2 (Recommendations For The Authors):

      Minor criticisms and typos:

      Pet peeve: in some of the figure panels they are labeled Notch ON or OFF, but in others they are not, albeit that info is included in the figure legend. For the ease of the reader/reviewer, would it be possible to label all relevant figure panels either Notch ON or OFF for clarity?

      We thank the reviewer for this suggestion and have modified the figures accordingly.

      Page 7, top. "In comparison to their average distribution across the nucleus, both CSL and Mam trajectories were significantly enriched in a region of approximately 0.5 μm around the target locus in Notch-ON conditions, reflecting robust Notch dependent recruitment to this gene complex." Are the authors referring to Figure 1D here?

      Thank you, this figure call-out has been added in the text.

      Page 9. "...reported to interact with p300 and other factors (Figure S2B)." I believe the authors mean Figure S2C and not S2B.

      Thank you, this has been corrected in the text.

      Page 9. There is no Figure S2D.

      Apologies, this was referring to Figure S1D, and is now corrected in the text.

      Page 11: "...were at very reduced levels in nuclei co-expressing MamDN (Figure 4B).." Should be Figure 4CD.

      Thank you, this has been corrected in the text.

      Page 12: "...which was maintained in the presence of MamDN (Figure 4D, D')." Should be Figure 4B.

      Thank you, this has been corrected in the text.

      Reviewer #3 (Recommendations For The Authors):

      In the Results section on Hub, the paragraph starting with "Third, we reasoned . ." the callout to Figure S2D should be Fig S1D.

      Thank you, this has been corrected in the text.

      Figures: The font size in the Figures is so small that most words and numbers cannot be read on a printout. One has to go to the electronic version and increase the size to read it. This reviewer found that inconvenient and often annoying.

      We apologise for this oversight; the font size has now been adjusted on all the graphs etc.

      Figure legends: the legends are terse and in some cases leave explanations to the imagination (e.g. "px" in Figure 2E). It would be useful to go through them and make sure those who are not a Drosophila Notch person and not a transcription biochemist can make sense of them.

      Our apologies for the lack of clarity in the legends. We have gone over them to make them more accessible and less succinct.

    1. Author Response

      We are very pleased to hear the overall positive views and constructive criticisms of eLife Editors and Reviewers on our work. In particular, we appreciate their comments highlighting the value of our new pipeline for high-throughput quantification of fly embryonic movement and the positive views of reviewers and editors that our data on the roles of miR-2b-1 in embryonic movement are well supported.

      Regarding Reviewer 1, we thank them for their positive comments that our work is experimentally sound and well-written, their kind words on the value of our new embryonic movement pipeline, and their overall appreciation of the quality, scope, and significance of our work. In a revised version of the manuscript we will consider discussing and addressing some of the interesting points raised by Rev1.

      Turning to the comments by Rev2, we are grateful to them for their recognition of the novelty of our miRNA findings and appreciation of the utility of our novel quantitative pipeline for assessing embryonic movement. Nonetheless, we politely – but strongly – disagree with their suggestion that the findings are inflated by our language. For example, they criticise our use of the verb ‘control’, yet this is a standard textbook term in molecular biology to describe biological processes regulated by genetic factors: given that miR-2b-1 regulates movement patterns during embryogenesis, to say that miR-2b-1 ‘controls’ embryonic movement in the Drosophila embryo is reasonable and in line with the language used in the field. It is not inflation. In connection to other comments, in a revised manuscript we will propose a different name for the gene here described as Janus to avoid annotation issues at FlyBase due to other, unrelated genes that include this word as part of their names.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      This work provides new mechanistic insights into the competitive inhibition in the mammalian P2X7 receptors using structural and functional approaches. The authors solved the structure of panda (pd) P2X7 in the presence of the classical competitive antagonists PPNDS and PPADS. They find that both drugs bind to the orthosteric site employed by the physiological agonist ATP. However, owing to the presence of a single phosphate group, they prevent movements in the flipper domain required for channel opening. The authors performed structure-based mutational analysis together with electrophysiological characterization to understand the subtype-specific binding of these drugs. It is known from previous studies that P2X1 and P2X3 are more sensitive to these drugs as compared to P2X7, hence, the residues adjacent to the ATP binding site in pdP2X7 were mutated to those present in P2X1. They observed that mutations of Q143, I214, and Q248 into lysine (hP2X1) increased the P2X7 sensitivity to PPNDS, whereas in P2X1, mutations of these lysines to alanine reduced sensitivity to PPNDS, suggesting that these key residues contribute to the subunit-specific sensitivity to these drugs. Similar experiments were done in hP2X3 to demonstrate its higher sensitivity to PPNDS. This preprint provides a useful framework for developing subtype-specific drugs for the family of P2X receptor channels, an area that is currently relatively unexplored.

      We appreciate the time and effort Reviewer #1 devoted to this review, and we have addressed the specific comments below.

      (1) Why was the crystallization construct of panda P2X7 used for structural studies instead of rat P2X7 with the cytoplasmic ballast which is a more complete receptor that is closely related to the human receptor? Can the authors provide a justification for this choice?

      We appreciate this comment. We did try to express the rat P2X7 receptor in its full-length form based on a previous report (Cell 2019, PMID: 31587896), but the expression of the receptor was not successful for an unknown reason. Instead, we employed a truncated construct of panda P2X7 based on the findings described in another previous report (eLife 2016, PMID: 27935479). This truncated construct also possesses ATP-dependent channel activity (eLife 2016, PMID: 27935479). Thus, we understand that the full-length P2X7 construct would be preferable, particularly for addressing the function of the cytoplasmic domain; however, the main focus of this study was on PPNDS/PPADS recognition and the associated structural changes in the ATP binding pocket, which we believe are less likely to be severely affected by truncation of the cytoplasmic domain. In support of this expectation, our mutational analyses are consistent with the structures in this study. Therefore, we believe that the use of the truncation construct in this study is justified.

      (2) Was there a good reason why hP2X1 and hP2X3 currents were recorded in perforated patches, whereas pdP2X7 currents were recorded using the whole-cell configuration? It seems that the extent of rundown is less of a problem with perforated patch recordings. Can the authors comment and perhaps provide a justification? It would also be good to present data for repeated applications of ATP alone using protocols similar to those for testing antagonists so the reader can better appreciate the extent of run down with different recording configurations for the different receptors.

      We thank the reviewer for bringing up this point. The whole-cell configuration is the most commonly used method in patch-clamp experiments; therefore, we used this method to record the current of pdP2X7 (Author response image 1). However, the whole-cell configuration is not suitable for all experiments; for example, the currents of P2X1 and P2X3 recorded by this method show a severe "rundown" effect. The "rundown" effect prevents accurate calculation of the inhibition rate of the antagonist, and to obtain more accurate results, we used perforated patches to record the currents of hP2X1 and hP2X3.

      Author response image 1.

      Representative current traces of pdP2X7, hP2X3, and hP2X1 after repeated applications of ATP. The pdP2X7 currents were recorded using the whole-cell configuration, and the hP2X1 and hP2X3 currents were recorded using perforated patches.

      (3) The data in Fig. S1, panel A shows multiple examples where the currents activated by ATP after removal of the antagonist are considerably smaller than the initial ATP application. Is this due to rundown or incomplete antagonist unbinding? It is interesting that this wasn't observed with hP2X1 and hP2X3 even though they have a higher affinity for the antagonist. Showing examples of rundown without antagonist application would help to distinguish these distinct phenomena and it would be good for the authors to comment on this in the text. It is also curious why a previous study on pdP2X7 did not seem to have problems with rundown (see Karasawa and Kawate. eLife, 2016).

      We thank the reviewer for bringing up this point. We believe that this difference may be the result of incomplete antagonist unbinding. A similar phenomenon has been observed in previous studies of pdP2X7 (eLife 2016, PMID: 27935479). In the previous experiment, the currents activated by ATP after removal of the antagonist A740003 did not return to the initial value upon ATP application, whereas activation by ATP after removal of the antagonist GW791343 immediately restored the initial value upon ATP application (Fig. 1C of eLife 2016, PMID: 27935479). This may be because different inhibitors dissociate differently from pdP2X7. In our experiments, we assumed that PPNDS/PPADS was not completely dissociated from P2X7 even after 20 min of elution. The activation of P2X7 by ATP without antagonists showed no rundown effect (Author response image 1); therefore, we calculated the inhibition rate of the antagonist according to the precontrol.

      (4) The written presentation could be improved as there are many instances where the writing lacks clarity and the reader has to guess what the authors wish to communicate.

      To address this comment, we made changes to the text, particularly by following the Recommendations for The Authors.

      Reviewer #1 (Recommendations For The Authors):

      (1) The way the manuscript is written could be greatly improved. There are many confusing sections where the reader has to guess what the authors wish to convey. For example, on page 9 "In addition, the mutation of Val173 to aspartate, as observed in pdP2X7, significantly decreased the sensitivity to PPNDS (Fig. 6B)." It appears from this sentence that Asp is present in P2X7, which is incorrect, please rephrase. There are many more examples of confusing sentences that need to be carefully edited to improve comprehension.

      To address this comment, we extensively modified the text to avoid this kind of misunderstanding. Please see the manuscript file with the track changes.

      (2) Please use either a 1-letter or 3-letter code for amino acid residues throughout the manuscript to maintain uniformity.

      We made this correction throughout the revised manuscript.

      (3) In Figure 1 on the right side, including the nearby density and side chains for interacting residues of PPNDS and PPADS would give more information and reliability for the density of the drugs.

      We appreciate this comment. The corresponding information is shown in Fig. S7.

      (4) Typo: Figure S1, E, and F panels - please correct the y-axis label to Inhibition.

      We corrected the typo in Fig. S1.

      (5) Please rewrite the legends for Fig. S3 and S5. They are confusing. The figure shows 3D classification using Relion, however, the legend suggests it was done using Cryosparc. Please clarify.

      We apologize for the confusion. Before applying C3 symmetry, all steps including 3D classification were performed in Relion 3.1. With C3 symmetry, we performed further refinement using Cryosparc v4.2.1 by non-uniform refinement. We have corrected the figure legends accordingly.

      (6) For Fig. S3 and S5 increase the resolution and size of representative micrographs, and also please provide scale bars.

      We have corrected Figures S3 and S5 accordingly.

      (7) Please add the 3D classification protocol performed in Relion/Cryosparc in the methods section as well.

      We added the corresponding description to the revised manuscript (Lines 9-14, Page 16).

      (8) In Table S1, under the initial model the authors state 'this study' when they should report the use of 5U1L according to the methods section.

      We corrected Table S1 in accordance with this comment.

      (9) The authors should consider combining the raw data shown in Figure S1 in Figure 6 as it provides stronger support for the conclusions than the bar graphs shown in Figure 6B.

      We appreciate the comment and fully understand the intention of Reviewer #1. Nevertheless, we would like to keep Figure S1, since it was also mentioned earlier together with Figure 1. In addition, if we combine Figure S1 with Figure 6, the result would be too large to present as a single figure.

      (10) In Figure 6A, please provide colored labels for both P2X7 and P2X1 to aid comprehension of the structural models.

      Based on this comment, we corrected the labels in Figure 6.

      (11) In the discussion, the authors write about comparisons with the docking study by Huo et al. JBC, 2018. Can they show the superimposition of their EM model with the previous studies' docking model in a supplementary figure for more clarity?

      We appreciate the constructive comments. However, unfortunately, the docking model in the previous study (JBC 2018, PMID: 29997254) is not available, so it is not possible to show the superimposition.

      Reviewer #2 (Public Review):

      Summary:

      P2X receptors play pivotal roles in physiological processes such as neurotransmission and inflammation, making them promising drug targets. This study, through cryo-EM and functional experiments, reveals the structural basis of the competitive inhibition of the PPNDS and PPADS on mammalian P2X7 receptors. Key findings include the identification of the orthosteric site for these antagonists, the revelation of how PPADS/PPNDS binding impedes channel-activating conformational changes, and the pinpointing of specific residues in P2X1 and P2X3 subtypes that determine their heightened sensitivity to these antagonists. These insights present a comprehensive understanding that could guide the development of improved drugs targeting P2X receptors. This work will be a valuable addition to the field.

      Strengths and weaknesses:

      The combination of structural experiments and mutagenesis analyses offers a deeper understanding of the mechanism. While the inclusion of MD simulation is appreciated, providing more insights from the simulation might further strengthen this already compelling story.

      We appreciate the time and effort Reviewer #2 devoted to this review, and we have addressed the specific comments below.

      Reviewer #2 (Recommendations For The Authors):

      (1) On page 3, the sentence "ATP analogs are the most competitive inhibitors of P2X receptors but are typically unsuitable due to a lack of high specificity in vivo," might need additional context. Could the authors clarify if they are referring to the unsuitability of ATP analogs for medical applications?

      To address this comment, we have rewritten the sentence as follows (Lines 13-16, Page 3):

      ATP analogs are most common among competitive inhibitors for P2X receptors; however, they are generally unsuitable for in vivo applications due to their relatively low specificity, which may result in off-target toxicity. This issue arises because the human body contains numerous ATP-binding proteins.

      (2) Fig. S1. I am curious why, for P2X7, the ATP-only current after removal of PPNDS/PPADS does not recover and become larger than the current in the presence of PPNDS/PPADS? Such behavior was not as pronounced in P2X1. Does that suggest PPNDS/PPADS might remain bound and can not be removed when the P2X7 channel is closed?

      We thank the reviewer for bringing up this point. We believe that this difference may be the result of incomplete antagonist unbinding. A similar phenomenon has been observed in previous studies of pdP2X7 (eLife 2016, PMID: 27935479). In the previous experiment, the currents activated by ATP after removal of the antagonist A740003 did not return to the initial value upon ATP application, whereas activation by ATP after removal of the antagonist GW791343 immediately restored the initial value upon ATP application (Fig. 1C of eLife 2016, PMID: 27935479). We strongly agree with the reviewer that this may be due to the difficulty of dissociating the antagonist from pdP2X7.

    1. Author Response:

      Reviewer #1 (Public Review):

      [...] Weaknesses are the absence of correlation between the results from the animal studies and human pancreatic cancers.

      Author response: We appreciate the reviewer’s attention to the importance of human pancreatic cancer studies. In a previous study (D’Amico et al. Genes & Development 2018 doi: 10.1101/gad.311852.118), we evaluated the expression of STAT3 in human pancreatic tissue microarrays and data from the Human Protein Atlas. Mutations in Stat3 are infrequent in human pancreatic cancers; however, there is a trend of decreased STAT3 activity in poorly differentiated carcinomas.

      In the current study, STAT3 and SMAD4 gene signature scores (computed from KO KPC cells) were aligned with human pancreatic ductal adenocarcinoma samples from the TCGA cohort, and statistical analyses supported the selective antagonism of STAT3 and SMAD4 (Fig 4D, Fig 4E).

      The complex process of EMT is difficult to characterize rigorously in human cancers. Mouse models offer an opportunity to study the relationships between cancer phenotypes and genetic alterations.

      Reviewer #2 (Public Review):

      [...] While correlations are strong, the study would benefit from additional cause-and-effect type experiments. It would also be beneficial to better tie together the first and second parts of the paper.

      Author response: We understand the Reviewer’s interest in additional experiments that could further elucidate mechanisms that drive EMT and/or KRAS dependency in relation to STAT3 and TGF-beta antagonism. We previously investigated the development of mutant KRAS knockout tumors (Ischenko et al. Nature Communications 2021 doi:10.1038/s41467-021-21736) and found that loss of KRAS promotes EMT, similar to loss of STAT3. Additional experiments are underway but are outside the scope of the current study.

      The first part of the paper is mechanistic and used KRAS-transformed mouse embryo fibroblasts to perform in vitro studies with foci formation. The cell-based foci formation assay has been shown to best evaluate malignant transformation and oncogenic potential. In the second part we transitioned to epithelial cells and pancreatic ductal adenocarcinomas to combine mechanistic relationships with genetic models.

    1. eLife assessment

      This study aims to explore the diabetes-bone paradox using the Mendelian Randomization approach. That diabetes itself is not the direct cause, but rather the complications or associated risk factors increase the risk of fracture, constitutes a valuable insight. Mendelian randomization to explain the relationship of two complex conditions is solid and conducted properly; however, the efforts to reconcile the discrepancies between the Mendelian Randomization analysis and observational studies are incomplete.

    1. Reviewer #3 (Public Review):

      Summary:

      The authors found two endosomal fusion modes by live cell imaging of endosomes in yolk sac lateral endoderm cells of 8.5-day-old embryonic mice and described the fusion modes by mathematical models and simulations. They also showed that actin polymerization is involved in the regulation of one of the fusion modes.

      Strengths:

      The strength of this study is that the authors' claims are well supported by beautiful live cell images and theoretical models. By using specialized cells, yolk sac visceral endoderm cells, the live images of endosomal fusion, localization of actin-related molecules, and validation data from multiple inhibitor experiments are clear.

      Weaknesses:

      This study does not include any assessment of whether the two types of endosome fusions claimed by the authors occur in general cells, so the article is limited to showing a phenomenon specific to yolk sac lateral endoderm cells. Also, the study does not show the physiological importance of the two types of fusion. There are some unclear points in the method of image analysis and some of the descriptions in the text are not logical.