26,869 Matching Annotations
  1. Apr 2024
    1. eLife assessment

      This fundamental study reports differential expression of key genes in full-term placenta between Tibetans and Han Chinese at high elevations, which is more pronounced in the placentae of male than of female fetuses. If validated as functionally relevant, these results will help us understand how human populations adapt to high elevation by mitigating the negative effects of low oxygen on fetal growth. While the differential gene expression analyses are solid, the downstream analyses offer incomplete support for the connection to hypoxia-specific responses and adaptive genetic variation.

    2. Joint Public Review:

      This manuscript by Yue et al. aims to understand the molecular mechanisms underlying the better reproductive outcomes of Tibetans at high altitude by characterizing the transcriptome and histology of full-term placentas of Tibetans and comparing them to those of Han Chinese at high elevations.

      The approach is innovative, and the data collected are valuable for testing hypotheses regarding the contribution of the placenta to the better reproductive success of populations that adapted to hypoxia. The authors identified hundreds of differentially expressed genes (DEGs) between Tibetans and Han, including the EPAS1 gene that harbors the strongest signals of genetic adaptation. The authors also found that such differential expression is more prevalent and pronounced in the placentas of male fetuses than in those of female fetuses, which is particularly interesting, as it echoes the more severe reduction in birth weight of male neonates at high elevation observed by the same group of researchers (He et al., 2022).

      This revised manuscript addressed several concerns raised by reviewers in the last round. However, we still find the evidence for natural selection on the identified DEGs, as a group, to be very weak, despite more convincing evidence on a few individual genes, such as EPAS1 and EGLN1.

      The authors first examined the overlap between DEGs and genes showing signals of positive selection in Tibetans and used a permutation analysis to evaluate whether the overlap is larger than expected. A minor issue with this analysis is that the significance is overstated (the p-value is underestimated), as the authors count permutation replicates with MORE overlapping genes than observed, whereas the more appropriate way is to count replicates with EQUAL or MORE overlapping genes. Using the latter method of p-value calculation, the "sex-combined" and "female-only" DEGs are no longer significantly enriched in genes with evidence of selection, and the signal appears to come solely from male-specific DEGs. A thornier issue with this type of enrichment analysis is whether conditioning on placental expression is sufficient, as other genomic or transcriptomic features (e.g., expression level, local sequence divergence level) may also confound the analysis.
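
      The difference between the two counting rules can be illustrated with a minimal sketch. The function name, the toy null distribution, and the observed overlap below are hypothetical and are not taken from the manuscript; they only show how a strict comparison (MORE) can yield a smaller p-value than an inclusive one (EQUAL or MORE) when ties with the observed overlap are common.

```python
from typing import Sequence


def permutation_p_value(observed: int,
                        null_overlaps: Sequence[int],
                        inclusive: bool = True) -> float:
    """One-sided permutation p-value for an overlap count.

    inclusive=True  counts replicates with overlap >= observed (recommended).
    inclusive=False counts replicates with overlap >  observed (anti-conservative).
    """
    if len(null_overlaps) == 0:
        raise ValueError("need at least one permutation replicate")
    if inclusive:
        extreme = sum(1 for n in null_overlaps if n >= observed)
    else:
        extreme = sum(1 for n in null_overlaps if n > observed)
    return extreme / len(null_overlaps)


# Hypothetical toy data: overlap counts from 10 permutation replicates.
toy_null_overlaps = [3, 4, 5, 5, 5, 6, 2, 5, 4, 5]
toy_observed = 5

strict_p = permutation_p_value(toy_observed, toy_null_overlaps, inclusive=False)
inclusive_p = permutation_p_value(toy_observed, toy_null_overlaps, inclusive=True)
print(f"strict (>): {strict_p:.2f}, inclusive (>=): {inclusive_p:.2f}")
```

      Because overlap counts are small integers, ties with the observed value are frequent, which is why the choice of comparison can change the conclusion.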

      The authors next aimed to detect polygenic signals of adaptation of gene expression by applying the PolyGraph method (Racimo et al., 2018) to eQTLs of genes expressed in the placenta. This approach is ambitious but problematic, as the method is designed for testing evidence of selection on single polygenic traits. The expression levels of different genes should be considered as "different traits" with differential impacts on downstream phenotypic traits (such as birth weight). As a result, the eQTLs of different genes cannot be naively aggregated in the calculation of the polygenic score, unless the authors have a specific, oversimplified hypothesis that an increase in the expression of every gene with an identified eQTL will improve pregnancy outcome and that all of these genes are equally important to downstream phenotypes. In general, the PolyGraph method is inapplicable to eQTL data, especially those of different genes (but see Colbran et al., 2023, Genetics, for an example where the polygenic score is used for testing selection on the expression of individual genes).
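
      The aggregation concern can be made concrete with a small sketch. It assumes the commonly used population-level polygenic score, twice the sum of effect size times allele frequency over the contributing variants; the gene names, effect sizes, and frequencies are invented for illustration and do not come from the manuscript or from PolyGraph itself.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (gene, effect of the tested allele on expression, frequency of that allele)
EQTL = Tuple[str, float, float]


def per_gene_scores(eqtls: List[EQTL]) -> Dict[str, float]:
    """Polygenic score per gene: each gene's expression is its own trait."""
    scores: Dict[str, float] = defaultdict(float)
    for gene, effect, freq in eqtls:
        scores[gene] += 2.0 * effect * freq
    return dict(scores)


def naive_pooled_score(eqtls: List[EQTL]) -> float:
    """Pools eQTLs of all genes into one score, implicitly assuming that
    every gene's expression acts in the same direction and with equal
    weight on the downstream phenotype."""
    return sum(2.0 * effect * freq for _, effect, freq in eqtls)


# Hypothetical toy eQTLs for two made-up genes.
toy_eqtls = [
    ("GENE_A", 0.5, 0.8),
    ("GENE_A", 0.25, 0.4),
    ("GENE_B", -0.5, 0.6),
]

by_gene = per_gene_scores(toy_eqtls)
pooled = naive_pooled_score(toy_eqtls)
print(by_gene, pooled)
```

      In this toy case the per-gene scores have opposite signs, and the pooled value hides that. Its interpretation depends entirely on the unstated assumption about how each gene's expression relates to the phenotype.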

      We would recommend removing these analyses and focusing the discussion on individual genes with more compelling evidence of selection (e.g., EPAS1, EGLN1).

    1. eLife assessment

      This fundamental study provides insights into the mechanism controlling cell cycle reentry, establishing a regulatory role for Mecp2 degradation in shifting transcription from metabolic to proliferation genes during quiescence exit. The evidence, which includes experimental data from in vitro cell culture and an in vivo injury-induced liver regeneration model, is convincing but the trigger for MeCP2 degradation and how MeCP2 differentially regulates proliferation and metabolic genes remain unclear.

    2. Reviewer #1 (Public Review):

      In the study described in the manuscript, the authors identified Mecp2, a methyl-CpG binding protein, as a key regulator involved in the transcriptional shift during the exit of quiescent cells into the cell cycle. Their data show that Mecp2 levels were remarkably reduced during the priming/initiation stage of partial hepatectomy-induced liver regeneration and that altered Mecp2 expression affected the quiescence exit. Additionally, the authors identified the E3 ligase Nedd4 as required for the downregulation of Mecp2 during quiescence exit. This is an interesting study with well-presented data that support the authors' conclusions regarding the role of Mecp2 in transcription regulation during the G0/G1 transition. However, the significance of the study is limited by a lack of mechanistic insights into the function of Mecp2 in the process. This weakness can be addressed by identifying the signaling pathway(s) that trigger Mecp2 degradation during the quiescence exit.

    1. eLife assessment

      This valuable study reports that miR-199b-5p is elevated in human osteoarthritis patients. There is solid evidence for the finding that inhibiting miR-199b-5p alleviates symptoms in mice with knee osteoarthritis. Additionally, potential targets of miR-199b-5p are identified but whether miR-199b-5p truly functions through Fzd6 and/or Gcnt2 requires further investigation.

    2. Reviewer #1 (Public Review):

      Summary:

      In this manuscript, the authors reported that miR-199b-5p is elevated in osteoarthritis (OA) patients. They also found that overexpression of miR-199b-5p induced OA-like pathological changes in normal mice and inhibiting miR-199b-5p alleviated symptoms in knee OA mice. They concluded that miR-199b-5p is not only a potential micro target for knee OA, but also provides a potential strategy for future identification of new molecular drugs.

      Strengths:

      The data are generated from both human patients and animal models. The data presented in this revised manuscript are solid and support the authors' conclusions. The questions from reviewers have also been properly addressed, and the quality of the manuscript has been significantly improved.

      There are no significant weaknesses identified in this revised manuscript.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors identified miR-199b-5p as a potential OA target gene using serum exosomal small RNA-seq from healthy individuals and OA patients. Their RNA-seq results were further compared with publicly available datasets to validate their finding of miR-199b-5p. In vitro chondrocyte culture with miR-199b-5p mimic/inhibitor and in vivo animal models were used to evaluate the function of miR-199b-5p in OA. Genes potentially regulated by miR-199b-5p were also predicted (i.e., Fzd6 and Gcnt2) and then validated using luciferase assays.

      Strengths:

      (1) Strong in vivo animal models including pain tests.
      (2) Validation of the binding of miR-199b-5p to Fzd6 and to Gcnt2.

      The authors have addressed my concerns.

    1. eLife assessment

      The authors build upon prior data implicating the secreted peptidoglycan hydrolase SagA produced by Enterococcus faecium in immunotherapy. Leveraging new strains with sagA deletion/complementation constructs, the investigators reveal that sagA is non-essential, with sagA deletion leading to a marked growth defect due to impaired cell division, and sagA being necessary for the immunogenic and anti-tumor effects of E. faecium. In aggregate, the study utilizes compelling methods to provide both fundamental new insights into E. faecium biology and host interactions and a proof-of-concept for identifying the bacterial effectors of immunotherapy response.

    1. eLife assessment

      This important mouse study shows that wild-type female progeny of Khdc3 mutants have abnormal gene expression relating to hepatic metabolism, which persists over multiple generations and passes through both female and male lineages. Information about litter size and a full description of the phenotype of each progeny should be included to evaluate the impact of KHDC3 mutation on the progeny; in its current state, the evidence for the authors' claims is incomplete. A role for small RNAs in this phenomenon is proposed but has not been functionally validated. The work will be of interest to researchers in the field of DNA-independent mechanisms of inheritance. Mentioning the experimental organism in the title and abstract would ensure that it targets the appropriate audience.

    2. Reviewer #1 (Public review):

      The key discovery of the manuscript is that genetically wild-type females descended from Khdc3 mutants show abnormal gene expression relating to hepatic metabolism, which persists over multiple generations and passes through both female and male lineages. The authors also find dysregulation of hepatically-metabolized molecules in the blood of these wild-type mice with Khdc3 mutant ancestry. These data provide solid evidence further supporting that phenotypes can be transmitted to multiple generations without altering the DNA sequence, supporting the involvement of epigenetic mechanisms. The authors further performed exploratory studies on the small RNA profiles in the oocytes of Khdc3-null females and their wild-type descendants, suggesting that altered small RNA expression could be a contributor to the observed phenotype transmission, although this has not been functionally validated.

    3. Reviewer #2 (Public review):

      Summary:

      This manuscript aimed to investigate the non-genetic impact of KHDC3 mutation on liver metabolism. To do so, the authors analyzed the female liver transcriptome of genetically wild-type mice descended from female ancestors with a mutation in the Khdc3 gene. They found that genetically wild-type females descended from Khdc3 mutants have hepatic transcriptional dysregulation, which persists over multiple generations in the progeny descended from female ancestors with a mutation in the Khdc3 gene. This transcriptomic deregulation was associated with dysregulation of hepatically-metabolized molecules in the blood of these wild-type mice with female mutational ancestry. Furthermore, to determine whether small non-coding RNAs could be involved in the maternal non-genetic transmission of the hepatic transcriptomic deregulation, they performed small RNA-seq of oocytes from Khdc3-/- mice and from genetically wild-type female mice descended from female ancestors with a Khdc3 mutation, and claimed that oocytes of wild-type female offspring from Khdc3-null females have dysregulation of multiple small RNAs.

      Finally, they claimed that their data demonstrate that ancestral mutation in Khdc3 can produce transgenerational inherited phenotypes.

      However, at this stage and considering the information provided in the paper, I think that these conclusions are too preliminary. Indeed, several controls/experiments need to be added to reach those conclusions.

      Additional context you think would help readers interpret or understand the significance of the work
      • Line 25: this first sentence is very strong and needs to be documented in the introduction.
      • Line 48: Reference 5 is not appropriate, since that paper shows the remodeling of small RNAs during post-testicular maturation of mammalian sperm and their sensitivity to the environment. Please change it.
      • Line 51: "implies" is too strong and should be replaced by "suggests".
      • Line 67: a reference is missing.
      • References showing the maternal transmission of non-genetically inherited phenotypes in mice via small RNAs need to be added.
      • Line 378: All RNA-Seq and small RNA-Seq data are stated to be available in the NCBI GEO database, but the accession numbers are lacking.

    1. eLife assessment

      This important study uses the Jurkat T cell model to study the role of Formin-like 1 β phosphorylation at S1086 on actin dynamics and exosome release at the immunological synapse. While the evidence is compelling within the framework of the Jurkat model, it is limited in a broader immunological and cell-biological context due to the limitations of the model system. Jurkat is known to have a bias toward formin-mediated actin filament formation at the expense of Arp2/3-mediated branched F-actin foci observed in primary T cells. In this light, confirming major findings in primary T cells will be of importance.

    2. Reviewer #1 (Public Review):

      Summary:

      In their article entitled "Formin-like 1 beta phosphorylation at S1086 is necessary for secretory polarized traffic of exosomes at the immune synapse", Javier Ruiz-Navarro and co-workers address the question of the mechanisms regulating the polarization of the microtubule organizing center (MTOC) and of the multivesicular bodies (MVB) at the immunological synapse (IS) in T lymphocytes.

      This work is a follow-up of previous studies published by the same team showing that TCR-stimulated protein kinase C delta (PKCdelta) phosphorylates FMNL1beta, which plays a crucial role in cortical actin reorganization at the IS, and controls MTOC/MVB polarization and thus exosome secretion by T lymphocytes at the IS.

      The authors first compare the amino acid sequences of FMNL2 and of FMNL1beta, to seek similarities in the DID-DAD auto-inhibition sequences and find that the sequence surrounding S1086 in the arginine-rich DAD of FMNL1beta displays high similarity to that around S1072 in FMNL2 which is phosphorylated by PKCdelta. They then interrogate the role of the phosphorylation of S1086 in the arginine-rich DAD of FMNL1beta by introducing S1086A and S1086D mutations that, respectively, cannot be phosphorylated or mimic the phosphorylated form of FMNL1beta, in cells expressing an FMNL1 shRNA.

      Using these tools, they show that:

      - FMNL1beta is phosphorylated upon treatment with PMA, an activator of PKCs.

      - The S1086A mutant of FMNL1beta does not restore the defect in MTOC and MVB polarization at the IS present in FMNL1 deficient T cells, whereas the phosphomimetic mutant does.

      - Although FMNL1beta phosphorylation at S1086 is necessary, it is not sufficient for MTOC polarization, since it does not restore the defect of polarization observed in PKCdelta deficient T cells.

      - FMNL1beta translocates to the IS. This neither requires PKC expression nor phosphorylation of S1086.

      - Phosphorylation of FMNL1beta on S1086 regulates actin remodeling at the immune synapse.

      - Phosphorylation of FMNL1beta on S1086 regulates secretion of extracellular vesicles containing CD63 by T lymphocytes.

      Strengths:

      This work shows for the first time the role of the phosphorylation of FMNL1beta on S1086 on the regulation of the IS formation and secretion of extracellular vesicles by T lymphocytes.

      Weaknesses:

      Although of interest, this work has several weaknesses. First, all the experiments are performed in Jurkat T cells that may not recapitulate the regulation of polarization in primary T cells. Moreover, all the experiments analyzing the role of PKCdelta are performed in one clone of wt or PKCdelta KO Jurkat cells. This is problematic since clonal variation has been reported in Jurkat T cells. Moreover, the remodeling of F-actin at the IS lacks careful quantification as well as detailed analysis of the actin structure in mutant cells. Finally, although convincing, the defect in the secretion of vesicles by T cells lacking phosphorylation of FMNL1beta on S1086 is preliminary. It would be interesting to analyze more precisely this defect. The expression of the CD63-GFP in mutants by WB is not completely convincing. Are other markers of extracellular vesicles affected, e.g. CD3 positive?

    3. Reviewer #2 (Public Review):

      Summary:

      The authors have addressed the role of S1086 in the FMNL1beta DAD domain in F-actin dynamics, MVB polarization, and exosome secretion, and investigated the potential implication of PKCdelta, which they had previously shown to regulate these processes, in FMNL1beta S1086 phosphorylation. This is based on:
      (1) the documented role of FMNL1 proteins in IS formation;
      (2) their ability to regulate F-actin dynamics;
      (3) the implication of PKCdelta in MVB polarization to the IS and FMNL1beta phosphorylation;
      (4) the homology of the C-terminal DAD domain of FMNL1beta with FMNL2, where a phosphorylatable serine residue regulating its auto-inhibitory function had been previously identified.

      They demonstrate that FMNL1beta is indeed phosphorylated on S1086 in a PKCdelta-dependent manner and that S1086-phosphorylated FMNL1beta acts downstream of PKCdelta to regulate centrosome and MVB polarization to the IS and exosome release. They provide evidence that FMNL1beta accumulates at the IS where it promotes F-actin clearance from the IS center, thus allowing for MVB secretion.

      Strengths

      The work is based on a solid rationale, which includes previous findings by the authors establishing a link between PKCdelta, FMNL1beta phosphorylation, synaptic F-actin clearance, and MVB polarization to the IS. The authors have thoroughly addressed the working hypotheses using robust tools. Among these, of particular value is an expression vector that allows for simultaneous RNAi-based knockdown of the endogenous protein of interest (here all FMNL1 isoforms) and expression of wild-type or mutated versions of the protein as YFP-tagged proteins to facilitate imaging studies. The imaging analyses, which are the core of the manuscript, have been complemented by immunoblot and immunoprecipitation studies, as well as by the measurement of exosome release (using a transfected MVB/exosome reporter to discriminate exosomes secreted by T cells).

      Weaknesses

      The data on F-actin clearance in Jurkat T cells knocked down for FMNL1 and expressing wild-type FMNL1 or the non-phosphorylatable or phosphomimetic mutants thereof would need to be further strengthened, as this is a key message of the manuscript. Also, the entire work has been carried out on Jurkat cells. Although this is an excellent model easily amenable to genetic manipulation and biochemical studies, the key finding should be validated on primary T cells.

    1. Reviewer #3 (Public Review):

      In this important work, the authors show compelling evidence that the Rapid Alkalinisation Factor1 (RALF1) peptide acts as an interlink between pectin methyl esterification status and FERONIA receptor-like kinase in mediating extracellular sensing. Moreover, the RALF1-mediated pectin perception is surprisingly independent of LRX-mediated extracellular sensing in roots. The authors also show that the peptide directly binds demethylated pectin and the positively charged amino acids are required for pectin binding as well as for its physiological activity.

      Some present findings are surprising; previously, the FERONIA extracellular domain was shown to bind pectin directly, and the mode of operation in the pollen tube involves the LRX8-RALF4 complex, which seems not the case for RALF1 in the present study. Although some aspects remain controversial, this work is a very valuable addition to the ongoing debate about this elusive complex regulation and signaling.

      The authors drafted the manuscript well, so I do not have a lot of criticism or suggestions. The experiments are well-designed, executed, and presented, and they solidly support the authors' claims.

    2. eLife assessment

      This fundamental study provides mostly convincing evidence for pectin modification as a requirement for RALF peptide signalling altering the apoplastic pH, adding further support for a key role of RALF peptides in linking the assembly and dynamics of the extracellular matrix with cellular activity and function. A small number of additional controls would further enhance the study.

    3. Reviewer #1 (Public Review):

      Summary:

      Rößling et al. report in this study that the perception of RALF1 by the FER receptor is mediated by the association of RALF1 with deesterified pectin, contributing to the regulation of the cell wall matrix and plasma membrane dynamics. In addition, they report that this mode of action is independent of the previously reported cell wall sensing mechanism mediated by the FER-LRX complex.

      This manuscript reproduces and aligns with the results from a recently published study (Liu et al., Cell) where they also report that RALF1 can interact with deesterified pectin, forming coacervates and promoting the recruitment of LLG-FER at the membrane.

    4. Reviewer #2 (Public Review):

      Summary:

      The study by Rößling et al. addresses the link between the biochemical constitution of the cell wall, in particular the methylesterification state of pectin, and signalling induced by the extracellular RALF peptide. The work suggests that RALF is able to trigger activation of its receptor FERONIA (FER) only in the presence of demethylesterified pectin.

      Remarkably, the application of RALF peptides leads to rather dramatic FER-dependent changes in wall integrity and plasma membrane invaginations not observed before. Interestingly, RALF can be out-titrated from the wall by short pectin fragments. In addition, the study provides further evidence for multiple FER-dependent pathways by showing that the presence of LRX proteins is not required for pectin/RALF-mediated signalling.

      Strengths:

      This work provides fundamental insight into a complex emerging pathway, or perhaps several pathways, linking pectin sensing, pectin structure and RALF/FER signalling. The study provides convincing evidence that pectin methylesterase activity is required for RALF sensing, indicating that the physical interaction of RALFs with the cell wall is important for signalling. Beyond that, the study documents very clearly how profoundly RALF signalling can affect cell wall integrity and membrane topology.

      Weaknesses:

      The genetic material used by the authors to strengthen the connection of RALF signalling and PME activity might not be as suitable as an acute inhibition of PME activity.

      The PMEI3ox line generated by Peaucelle et al., 2008 is alcohol-inducible. Was expression of the PMEI induced during the experiments? As ethanol inducible systems can be rather leaky, it would not be surprising if PME activity would be reduced even without induction, but maybe this would warrant testing whether PMEI3 is actually overexpressed and/or whether PME activity is decreased. On a similar note, the PMEI5ox plants do not appear to show the typical phenotype described for this line. I personally don't think these lines are necessary to support the study. Short-term interference with PME activity (such as with EGCG) might be more meaningful than life-long PMEI overexpression, in light of the numerous feedback pathways and their associated potential secondary effects. This might also explain why EGCG leads to an increase in pH, as one would expect from decreased PME activity, while PMEI expression (caveats from above apply) apparently does not (Fig 3A-D).

      At least at first sight, the observation that OGs are able to titrate RALF from pectin binding seems at odds with the idea of cooperative binding with low affinity, leading to high avidity oligomers. Perhaps the authors can provide a speculative conceptual model of these interactions?

      I could not find a description of the OG treatment/titration experiments, but I think it would be important to understand how these were performed with respect to OG concentration, timing of the application, etc.

    1. Reviewer #1 (Public Review):

      Summary:

      The authors report evidence for a microprotein derived from AtHB2 (AtHB2-miP). The authors came across HB2 in a screen for alternative transcription start sites in Arabidopsis in response to white light or white light followed by far-red light representative of shade. Out of 337 potential microproteins, the authors selected AtHB2. The beginning of the manuscript investigates whether an alternative transcription start site of the HB2 gene can be used in response to far-red light. The resulting shorter protein form seems to interact with HB2 protein forms, altering the localization of HB2 in transient expression assays. The functionality of HB2-miP overexpression has been addressed in transgenic Arabidopsis lines using a 35S promoter. The responses and phenotypes were compared with either WT or various types of athb2 mutant lines with a disrupted HB2 gene. Such mutants and the 35S promoter-driven AtHB2-miP line showed various types of phenotypes versus each other that can be classified as mild or none, e.g. small effects on root growth, iron homeostasis gene expression, and iron contents.

      Strengths:

      The authors performed an interesting screen for alternative transcription start sites which resulted in 337 candidates (Figure 1A). In principle, it is interesting to find that plants may use alternative start sites for HB2 in response to shading light. The authors provide evidence that alternative transcription start sites of HB2 can be present and used in response to FR. The possibility that the resulting small protein may have effects under FR light, altering root growth and physiology, is an interesting idea.

      Weaknesses:

      In the present manuscript, there are several signs of incomplete analysis.

      (1) The transient expression experiments are not conducted with much detail to demonstrate that indeed HB2 miP is produced and can interact with regular protein. The localization of HB2 was found to be linked with condensates, but perhaps not in the presence of HB2 miP. Clearly, the lack of quantitative and qualitative analysis hampers a clear assessment of this point.

      (2) The authors, unfortunately, did not provide the data of the screen to demonstrate which concrete candidates may have miPs and whether there is enrichment of certain functions. There is no supplemental table accompanying Figure 1A.

      (3) One of the major unclear points, which is also not addressed in the discussion, is that the function of miP is studied in overexpression plants (35S promoter::miP). The effects are only compared to wild type and various lines of HB2 knockouts or knockdowns, partly with fairly uncharacterized phenotypes. It therefore cannot be clearly determined whether the miP effects are due to a regular function of miP or due to its overexpression. A needed control would be a 35S::AtHB2 line, or better, at least two different lines (only a single miP overexpression line was investigated). Since deletion mutant analysis has not been used to determine which parts of miP are involved in the protein regulation, it cannot be ruled out that the observed miP effects are not naturally occurring but the result of ectopic expression of a protein. Clearly, the effect of miP would ideally be studied in an environment where the levels can be controlled and the resulting phenotypes and protein levels quantified.

      (4) It is not shown that the microprotein is generated in Arabidopsis in response to shade, e.g. through Western blot or fluorescent protein detection. The main idea that the authors want to claim, namely that miP binds to the regular protein and thereby controls its localization or activity, has not been addressed in Arabidopsis. There are no localization data for HB2 protein in the presence of miP in Arabidopsis.

      (5) The plants with altered HB2 forms seem to grow well and the recorded phenotypes are rather minor. Photos are not shown. At some point, the authors discuss that there could be redundancy or that HB miP might interact with other HB proteins. However, such protein interactions have not been experimentally investigated.

    2. Reviewer #2 (Public Review):

      The first portion of the manuscript centered on identifying and confirming the ATHB2 microprotein (ATHB2miP), which constitutes the core message of this study. Overall, I find no issue with the selection criteria employed for identifying alternative microprotein mRNA transcripts. However, I do have some queries that I hope the authors can address for clarity.

      (1) Upon reviewing the supplemental dataset where the authors listed the 377 unique novel miPs, along with those specifically in WL or shade treatments, I sought to comprehend the rationale behind focusing on ATHB2. Have the authors examined the shade response of all 377 potential microprotein candidates? Readers may be intrigued to learn how many of these candidates exhibit induction or repression under shade conditions, and whether such changes correlate positively or negatively with alterations in the full-length TSSs in response to shade. Essentially, I aim to discern the prevalence of microprotein production during shade responses and any shared characteristics among these microprotein transcripts. This inquiry also aims to uncover the existence of a common mechanism regulating microprotein transcription.

      (2) To confirm that ATHB2miP stems from an independent transcription event, the authors sequenced full-length cDNAs using PacBio isoseq. However, I find the information regarding isoseq missing from the manuscript. My assumption is that the full-length cDNAs were reverse transcribed from mRNAs isolated from whole seedlings, where mature mRNAs in the cytoplasm predominate, making it challenging to evaluate whether a specific mRNA undergoes post-transcriptional processing. One approach to confirming ATHB2miP as a product of independent transcription involves examining nascent mRNA produced in the nucleus. The authors may need to isolate nascent mRNAs associated with RNA Polymerase II in the nucleus from seedlings treated with shade for 45 min, and then perform reverse transcription and PacBio isoseq.

      (3) The authors noted the identification of two potential start codons, TTG and CTG, in the alternative TSS of ATHB2 using TISpredictor. Yet, it's imperative to identify the actual translation initiation site and the full-length sequence of ATHB2miP. I suggest the authors fuse an epitope tag (e.g., 3xFLAG) to the C-terminus of ATHB2 (utilizing the genomic sequence of ATHB2) and generate transgenic lines to be treated with shade to induce ATHB2miP-3xFLAG production. Affinity purification (anti-FLAG beads) and mass spectrometry can then identify the actual start site of ATHB2miP. This step is crucial, as the current ATHB2miP used may not be the exact sequence, and any observed phenotype could be artifacts arising from these lines.

      (4) My confusion arose when analyzing the results in Figures 1E - G. The authors didn't specify whether these plants were subjected to shade treatment. What are the sequences within the second intron and third exon excluded from pATHB2control::GUS that promote transcription and translation? Have the authors examined the sequence features? This information is pivotal and related to the above question #1 because it may tell us whether the sequence feature is shared by other miP candidates.

      The latter part of the manuscript focused on the functional characterization of ATHB2miP. The approaches adopted by the authors resemble those used in studying antimorphic (dominant negative) alleles. However, I have several concerns regarding the approaches and conclusions.

      (5) Firstly, as mentioned in question #3, the authors did not map the actual translation initiation site of ATHB2miP. Therefore, all constructs involving ATHB2miP, such as eGFP-ATHB2miP, BD-ATHB2miP, and mCherry-ATHB2miP in Figure 2, and 35S::miP in Figures 3-5, may contain extra amino acids in the N-terminus, given that epitope tags were all added to the N terminus. These additional amino acids could potentially impact the behavior of ATHB2miP and lead to artifacts. Identifying the translation initiation site in ATHB2miP would facilitate the development of tools to disrupt ATHB2miP expression without affecting full-length ATHB2 expression. For instance, if the "CTG" before the leucine zipper domain is confirmed as the translation initiation site, mutating it to another Leu codon (e.g., TTA) could generate transgenic lines using the genomic sequence of ATHB2, including this mutation, to evaluate the impact of losing ATHB2miP on shade responses.

      (6) Another concern pertains to the 35S::miP line utilized in Figures 3-5. The authors only presented results from one 35S::miP line, raising the possibility of T-DNA insertion disrupting an endogenous gene in the transgenic plant genome. It is essential to clarify how many individual T1 plants were generated and how many of them showed the same phenotype as the line used in the manuscript. Additionally, the use of the constitutive CaMV35S promoter could generate artifacts akin to neomorphic mutations. For example, the authors identified Cluster 1 genes that were only induced in 35S::miP, but not in t-athb2 or WT plants (Figure 3B); moreover, they found an overrepresentation of genes involved in root development in this cluster. This observation correlated well with the root phenotype of 35S::miP under the proximity shade (Figure 4D), in which the short-root phenotype was only observed in lines expressing 35S::miP. These data could be artifacts due to the constitutive expression of ATHB2miP in roots but didn't necessarily reflect the natural function of ATHB2miP.

      (7) Furthermore, I seek clarification regarding the rationale behind employing different shade conditions, including deep shade, canopy shade, and proximity shade, and the significance of treating plants with these conditions. The results were challenging to interpret, and I have reservations about some statements made. The authors claimed that ATHB2 acts as a growth repressor in deep shade but a growth promoter in the canopy and proximity shade (Lines 366-368). However, it appears that regardless of the shade conditions, most mutant and transgenic lines were not significantly different from WT (Figure 4C). Additionally, the definition of proximity shade in this manuscript (R:FR = 0.06) differs from that in Roig-Villanova & Martinez-Garcia (Front. Plant Sci., 2016; R:FR, 0.5-0.3). Clarity on this disparity would be appreciated.

      (8) No statistical analyses were presented in Figure 5C, so it remains unclear whether the differences observed are statistically significant. Moreover, the values appear quite similar among all three genotypes. Even if statistically significant, do these minor differences in Fe concentrations significantly impact plant physiology? Additionally, some statements related to Figure 5 do not align with the data presented. For instance, claims about longer hypocotyls in t-athb2, athb2∆, and athb2∆LZ mutants compared to wild type under shade conditions on high iron media (lines 453-455) were not supported by the data in Figure 5D. Similarly, statements about the differences between mutants (lines 458-460) were not substantiated by the data.

    3. Reviewer #3 (Public Review):

      Summary and Strengths:

      In this interesting manuscript, the authors identify a large number of alternative transcription start sites (TSS) and focus their functional analysis on an alternative TSS that is expected to produce a micro-protein (miP) encoding the C-terminus of ATHB2 (ATHB2miP). ATHB2miP is expected to comprise the leucine zipper part of ATHB2 and hence interact with the full-length protein through this dimerization motif. Such interactions are shown using yeast two-hybrid and FRET-FLIM assays. ATHB2 is a well-known shade-induced gene that has been implicated in shade-regulated growth responses. The authors then test the potential role for ATHB2miP genetically by comparing several athb2 loss-of-function (LOF) alleles: one does not express either full-length ATHB2 or the short ATHB2miP (t-ATHB2), two CRISPR alleles give rise to frameshift mutations in the full-length transcript but still express a potentially functional short ATHB2miP (athb2deltaLZ and athb2delta). The authors also use plants that over and ectopically express ATHB2miP (35S:miP). Overall, the results are consistent with the hypothesis that ATHB2miP inhibits the function of ATHB2, which constitutes a novel negative feedback loop. Potentially ATHB2miP may also inhibit the activity of other related HD ZIP proteins (based on 35S:miP). The effects of these genetic alterations on shade-regulated hypocotyl growth are relatively modest. Effects on root growth are also investigated and in one intriguing case, the negative feedback model does not appear to explain the data (Figure 4D, effect on lateral roots, because for this phenotype 35S:miP is very different from the lof alleles). The authors also identify a potentially interesting link between shade-regulated hypocotyl growth and iron uptake. A number of text changes and corrections to the figures would be important for clarity. 
      They primarily concern three issues: names of the alleles, names of the studied shade conditions, and statements about significant differences between genotypes. Also, it would be interesting to know whether the effects of ATHB2 on iron uptake are due to local effects of ATHB2. Is ATHB2 expressed in roots?

      Weaknesses:

      (1) The naming of the different shade conditions is difficult to follow and not consistent with the way most authors in the field call such conditions. Deep shade is ok (low PAR and low R/FR, WL, PAR 13microE, R/FR 0.13). This condition is clearly defined for experiments in Figure 4. However, data in Figure 1 also use Deep shade (line 174) but PAR is not defined there. I suggest that all light conditions are clearly defined in the figure legends and in the M&M (not the case in this ms). Regarding Canopy shade (WL, PAR 45microE, R/FR 0.15) and proximity shade (WL, PAR 45microE, R/FR 0.06), see lines 355-357, this nomenclature is unclear. First, proximity shade conventionally has a higher R/FR ratio than canopy shade, whereas here it is lower. Second, for canopy shade (compared to the WL control), PAR should decrease, which is not what is done here. What is called proximity shade and canopy shade are 2 WL conditions with different R/FR ratios, which are compared to WL controls with the same PAR. It would make more sense to call them both proximity shade and indicate the different R/FR ratios. Finally, extensive literature from many plant species and numerous labs has shown that hypocotyl elongation increases as R/FR decreases. In the data shown in Figure 4, it is the opposite. Hypocotyls in Canopy shade (WL, PAR 45microE, R/FR 0.15) are longer than those in proximity shade (WL, PAR 45microE, R/FR 0.06), while with these R/FR ratios the opposite is expected. Could this be a mistake in the text? Please check.

      (2) In several instances (in particular regarding data from Figures 4 and 5), the authors write that 2 genotypes are significantly different while the statistical analysis of the data does not support such statements. For example, in lines 392-395, the authors write that in WL the t-DNA mutant, both CRISPR mutants, and 35S:miP lines all had significantly fewer lateral roots than the WT. This is true for the t-DNA mutant (group bc, while the WT is in group a); however, all other genotypes are in group ab, hence not significantly different from the WT. Please carefully check all such statements about significant differences.

      (3) The naming of the CRISPR mutants is problematic. In particular athb2delta, such a name suggests that the gene is deleted (also suggested by Figure 4A), which is not the case in this CRISPR allele leading to a frameshift early in the coding sequence. This is particularly problematic because in this allele ATHB2miP is still expressed, while based on such a name one would expect that in this mutant both the full length and the miP are lost. Both CRISPR alleles lead to a frameshift and this should be clarified in Figure 4A and in the text.

      (4) Overall hypocotyl growth phenotypes of athb2 lof mutants and 35S:miP are similar and consistent with a model according to which ATHB2miP inhibits the full-length protein. However, this is not the case for the root phenotype described in 4D. It would be interesting to discuss this.

      (5) The authors propose a role for ATHB2 in the root, in particular linked to iron uptake. Is this due to a local effect of ATHB2 in the roots? Is ATHB2 expressed in roots? It would be very informative if the authors would show such data, e.g. using the reporter lines used in Figure 1. Are both the FL and the miP expressed in roots?

      (6) From the description regarding 5'PEAT.seq data presented in Figure 1 (see lines 174-177) it is not clear in which light conditions the seedlings were grown. It appears that samples were collected in 3 conditions: WL, and after 45 and 90 minutes of low R/FR treatment. However, the data are then discussed collectively. Do the 12398 TSSs correspond to what was found in all three conditions together? Are the authors showing shade-regulation of TSS? This is clearly the case for ATHB2miP. This needs to be clarified.

      (7) The way the effects of low R/FR on gene expression are analyzed might conflate circadian effects and low R/FR effects because the samples from different light conditions are not collected at the same ZT. This is how I understood the text. If I'm wrong, please clarify the text. If I am right, this potential problem should be mentioned in the text.

      (8) Could the authors envisage a way to genetically test the role of ATHB2miP by using an allele that makes the full length but not the miP? Currently, the authors use lof alleles that either make none of the transcripts (t-DNA) or potentially only the miP (CRISPR alleles). Overall, these alleles do not appear to differ in their phenotypes, suggesting that most of the effect of ATHB2miP is through ATHB2 FL. Having an allele only producing the FL would be nice (but technically I'm not sure how one could do that).

    1. eLife assessment

      This potentially important paper reports on interactions between L1TD1, an RNA binding protein (RBP), and the ancestral LINE-1 retrotransposon from which it originates. Overall, the results support a model in which L1TD1 and LINE-1 ORF1p have synergistic effects on LINE-1 retrotransposition, but the evidence for whether this is through direct protein-protein interaction or through simultaneous interaction with LINE-1 RNA is currently incomplete.

    2. Reviewer #1 (Public Review):

      Summary:

      In their manuscript entitled 'The domesticated transposon protein L1TD1 associates with its ancestor L1 ORF1p to promote LINE-1 retrotransposition', Kavaklıoğlu and colleagues delve into the role of L1TD1, an RNA binding protein (RBP) derived from a LINE1 transposon. L1TD1 proves crucial for maintaining pluripotency in embryonic stem cells and is linked to cancer progression in germ cell tumors, yet its precise molecular function remains elusive. Here, the authors uncover an intriguing interaction between L1TD1 and its ancestral LINE-1 retrotransposon.

      The authors delete the DNA methyltransferase DNMT1 in a haploid human cell line (HAP1), inducing widespread DNA hypo-methylation. This hypomethylation prompts abnormal expression of L1TD1. To scrutinize L1TD1's function in a DNMT1 knock-out setting, the authors create DNMT1/L1TD1 double knock-out cell lines (DKO). Curiously, while the loss of global DNA methylation doesn't impede proliferation, additional depletion of L1TD1 leads to DNA damage and apoptosis.

      To unravel the molecular mechanism underpinning L1TD1's protective role in the absence of DNA methylation, the authors dissect L1TD1 complexes in terms of protein and RNA composition. They unveil an association with the LINE-1 transposon protein L1-ORF1 and LINE-1 transcripts, among others.

      Surprisingly, the authors note fewer LINE-1 retro-transposition events in DKO cells than in DNMT1 KO alone.

      Strengths:

      The authors present compelling data suggesting the interplay of a transposon-derived human RNA binding protein with its ancestral transposable element. Their findings spur interesting questions for cancer types, where LINE1 and L1TD1 are aberrantly expressed.

      Weaknesses:

      Suggestions for refinement:

      The initial experiment, inducing global hypo-methylation by eliminating DNMT1 in HAP1 cells, is intriguing and warrants a more detailed description. How many genes experience misregulation or aberrant expression? What phenotypic changes occur in these cells? Why did the authors focus on L1TD1? Providing some of this data would be helpful to understand the rationale behind the thorough analysis of L1TD1.

      The finding that L1TD1/DNMT1 DKO cells exhibit increased apoptosis and DNA damage but decreased L1 retro-transposition is unexpected. Considering the DNA damage associated with retro-transposition and the DNA damage and apoptosis observed in L1TD1/DNMT1 DKO cells, one would anticipate the opposite outcome. Could it be that the observation of fewer transposition-positive colonies stems from the demise of the most transposition-positive colonies? Further exploration of this phenomenon would be intriguing.

    3. Reviewer #2 (Public Review):

      In this study, Kavaklıoğlu et al. investigated and presented evidence for the role of domesticated transposon protein L1TD1 in enabling its ancestral relative, L1 ORF1p, to retrotranspose in HAP1 human tumor cells. The authors provided insight into the molecular function of L1TD1 and shed some clarifying light on previous studies that showed somewhat contradictory outcomes surrounding L1TD1 expression. Here, L1TD1 expression was correlated with L1 activation in a hypomethylation-dependent manner, due to DNMT1 deletion in the HAP1 cell line. The authors then identified L1TD1-associated RNAs using RIP-Seq, which display a disconnect between transcript and protein abundance (via Tandem Mass Tag multiplex mass spectrometry analysis). The one exception was for L1TD1 itself, which is consistent with a model in which the RNA transcripts associated with L1TD1 are not directly regulated at the translation level. Instead, the authors found the L1TD1 protein associated with L1-RNPs, and this interaction is associated with increased L1 retrotransposition, at least in the context of HAP1 cells. Overall, these results support a model in which L1TD1 is restrained by DNA methylation, but in the absence of this repressive mark, L1TD1 is expressed and collaborates with L1 ORF1p (either directly or through interaction with L1 RNA, which remains unclear based on current results), leading to enhanced L1 retrotransposition. These results establish the feasibility of this relationship existing in vivo in either development, disease, or both.

    1. eLife assessment

      This valuable study showcases a novel and exciting vaccine platform, but the evidence supporting the claims is incomplete. The work would benefit from robust statistical analysis of experimental groups with a larger number of individuals. There is also no comparison to other existing vaccine platforms (such as the mosaic nanoparticle where hemagglutinin trimers are used).

    2. Reviewer #1 (Public Review):

      Summary:

      In this manuscript by Thornlow Lamson et al., the authors develop a "beads-on-a-string" or BOAS strategy to link diverse hemagglutinin head domains, to elicit broadly protective antibody responses. The authors are able to generate varying formulations and lengths of the BOAS, and immunization of mice shows induction of antibodies against a broad range of influenza subtypes. However, several major concerns are raised, including the stability of the BOAS, that only 3 mice were used for most immunization experiments, and that important controls and analyses related to how the BOAS alone, and not the inclusion of diverse heads, impacts humoral immunity are missing.

      Strengths:

      Vaccine strategy is new and exciting.

      Analyses were performed to support conclusions and improve paper quality.

      Weaknesses:

      Controls are lacking for how different hemagglutinin heads impact immunity versus the multivalency of the BOAS.

      Only 3 mice were used for most experiments.

      There were limited details on size exclusion data.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors describe a "beads-on-a-string" (BOAS) immunogen, where they link, using a non-flexible glycine linker, up to eight distinct hemagglutinin (HA) head domains from circulating and non-circulating influenzas and assess their immunogenicity. They also display some of their immunogens on ferritin NP and compare the immunogenicity. They conclude that this new platform can be useful to elicit robust immune responses to multiple influenza subtypes using one immunogen and that it can also be used for other viral proteins.

      Strengths:

      The paper is clearly written. While the use of flexible linkers has been used many times, this particular approach (linking different HA subtypes in the same construct resembling adding beads on a string, as the authors describe their display platform) is novel and could be of interest.

      Weaknesses:

      The authors did not compare to individual HAs immunized as cocktails and did not compare to other mosaic NPs published earlier. It is thus difficult to assess how their BOAS compares.

      Other weaknesses include the rationale as to why these subtypes were chosen and also an explanation of why there are different sizes of the HA1 construct (apart from expression). Have the authors tried other lengths? Have they expressed all of them as FL HA1?

    4. Reviewer #3 (Public Review):

      This work describes the tandem linkage of influenza hemagglutinin (HA) receptor binding domains of diverse subtypes to create 'beads on a string' (BOAS) immunogens. They show that these immunogens elicit ELISA binding titers against full-length HA trimers in mice, as well as varying degrees of vaccine mismatched responses and neutralization titers. They also compare these to BOAS conjugated on ferritin nanoparticles and find that this did not largely improve immune responses. This work offers a new type of vaccine platform for influenza vaccines, and this could be useful for further studies on the effects of conformation and immunodominance on the resulting immune response. 

      Overall, the central claims of immunogenicity in a murine model of the BOAS immunogens described here are supported by the data. 

      Strengths included the adaptability of the approach to include several, diverse subtypes of HAs. The determination of the optimal composition of strains in the 5-BOAS that overall yielded the best immune responses was an interesting finding and one that could also be adapted to other vaccine platforms. Lastly, as the authors discuss, the ease of translation to an mRNA vaccine is indeed a strength of this platform. 

      One interesting and counter-intuitive result is the high levels of neutralization titers seen in vaccine-mismatched, group 2 H7 in the 5-BOAS group that differs from the 4-BOAS with the addition of a group 1 H5 RBD. At the same time, no H5 neutralization titers were observed for any of the BOAS immunogens, yet they were seen for the BOAS-NP. Uncovering where these immune responses are being directed and why these discrepancies are being observed would constitute informative future work. 

      There are a few caveats in the data that should be noted: 

      (1) 20 ug is a pretty high dose for a mouse and the majority of the serology presented is after 3 doses at 20 ug. By comparison, 0.5-5 ug is a more typical range (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6380945/, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9980174/). Also, the authors state that 20 ug per immunogen was used, including for the BOAS-NP group, which would mean that the BOAS-NP group was given a lower gram dose of HA RBD relative to the BOAS groups.

      (2) Serum was pooled from all animals per group for neutralization assays, instead of testing individual animals. This could mean that a single animal with higher immune responses than the rest in the group could dominate the signal and potentially skew the interpretation of this data. 

      (3) In Figure S2, there appears to be an increase in MW when the order of strains is changed, which may be due to differences in glycosylation. Further analysis would be needed to determine if there are discrepancies in glycosylation amongst the BOAS immunogens and how those differ from native HAs.

    1. eLife assessment

      The study reports on a previously unrecognized function of ATG6 in plant immunity. The work is valuable because it proposes a direct interaction between ATG6 and a well-studied salicylic acid receptor protein, NPR1, which may interest researchers investigating plant immunity regulation. While the data presented are compelling, more information regarding the specificity of ATG6's role would improve the overall impact of the study, especially with an eye towards consistency with prior work.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors showed that autophagy-related genes are involved in plant immunity by regulating the protein level of the salicylic acid receptor, NPR1.

      Strengths:

      The experiments are carefully designed and the data are convincing. The authors did a good job of understanding the relationship between ATG6 and NPR1.

      Weaknesses:

      - The authors can do a few additional experiments to test the role of ATG6 in plant immunity.

      - I recommend that the authors test the interaction between ATGs and other NPR1 homologs (such as NPR2).

      - The concentration of SA used in the experiment (0.5-1 mM) seems pretty high. Does a lower concentration of SA induce ATG6 accumulation in the nucleus?

      - Does the silencing of ATG6 affect the cell death (or HR) triggered by AvrRPS4?

      - SA and NPR1 are also required for immunity and are activated by other NLRs (such as RPS2 and RPM1). Is ATG6 also involved in immunity activated by these NLRs?

    3. Reviewer #2 (Public Review):

      Summary:

      The manuscript by Zhang et al. explores the effect of autophagy regulator ATG6 on NPR1-mediated immunity. The authors propose that ATG6 directly interacts with NPR1 in the nucleus to increase its stability and promote NPR1-dependent immune gene expression and pathogen resistance. This novel role of ATG6 is proposed to be independent of its role in autophagy in the cytoplasm. The authors demonstrate through biochemical analysis that ATG6 interacts with NPR1 in yeast and very weakly in vitro. They further demonstrate using overexpression transgenic plants that in the presence of ATG6-mCherry the stability of NPR1-GFP and its nuclear pool is increased.

      However, the overall conclusions of the study are not well supported experimentally. The significance of the findings is low because of their mostly correlational nature, and lack of consistency with earlier reports on the same protein.

      Based on the integrity and quality of the data as well as the depth of analysis, it is not yet clear if ATG6 is a specific regulator of NPR1 or if it is affecting NPR1's stability indirectly, through inducing an elevation of SA levels in plants. As such, the current study demonstrates a correlation between overexpression of ATG6, SA accumulation, and NPR1 stability, however, whether and how these components work together is not yet demonstrated.

      Based on the provided biochemical data, it is not yet clear if the ATG6 functions specifically through NPR1 or through its paralogs NPR3 and NPR4, which are negative regulators of immunity. It is quite possible that interaction with NPR1 (or any NPR) is not the major regulatory step in the activity of ATG6 in plant immunity. The effect of ATG6 on NPR1 could well be indirect, through a change in the SA level and redox environment of the cell during the immune response. Both SA level and redox state of the cell were reported to induce accumulation of NPR1 in the nucleus and increase in stability.

      Another major issue is the poor quality of the subcellular analyses. In contradiction to previous studies, ATG6 in this study is not localized to autophagosome puncta, which suggests that the soluble localization pattern presented here does not reflect the true localization of ATG6. Even if the authors propose a novel, non-canonical nuclear localization for ATG6, they still should have detected the canonical autophagy-like localization of this protein.

    1. eLife assessment

      The investigation of the functional significance of the X-linked ciliary protein gene OFD1 in regulating the fate of cranial neural crest-derived cells (CNCCs) and its potential effect on myogenic progenitors during tongue development is interesting because the Ofd1 conditional knockout mouse model has a very striking phenotype and nicely mimics the phenotype in humans. It is a valuable model to understand human disease. This study will require additional experiments to support its conclusions.

    2. Reviewer #1 (Public Review):

      In this study, the authors reported that disruption of the X-linked ciliary protein OFD1 in the cranial neural crest-derived cells (CNCCs) leads to a migration defect in the CNCCs and that aberrant CNCCs abnormally differentiate into osteoblasts due to a lack of Hh signal. Furthermore, CNCC defects lead to the failure of mesoderm-derived cells to differentiate into myoblasts and instead result in abnormal differentiation of mesoderm-derived cells into adipocytes. The Ofd1 cko mouse model has a very striking phenotype and nicely mimics the phenotype of human patients, making it a very valuable model to understand human disease.

    3. Reviewer #2 (Public Review):

      In this study, the authors report that both mice and human patients carrying function-disrupting mutations in the OFD1 gene exhibited ectopic brown adipose tissue formation in the malformed tongue. The OFD1 gene is located on the X-chromosome and encodes a protein product required for the formation and function of the primary cilium, which is required for cells to properly receive and activate several signaling pathways, particularly the hedgehog signaling pathway. Loss of OFD1 function causes prenatal lethality of male fetuses and mosaic disruption of tissues in females due to random inactivation of the X-chromosome carrying either the mutant or wildtype allele. Using cell type-specific gene inactivation and genetic lineage labeling, the manuscript shows that the ectopic brown adipose tissue in the mutant tongue was not derived from cranial neural crest cells (CNCCs). Additional genetic and embryological studies led to the conclusion that loss of Ofd1 function in the CNCC cells in the embryonic hypoglossal cord, via which the tongue myoblast precursor cells migrate from anterior somites to the tongue primordia, caused disruption of cell-cell interactions between the CNCCs and migrating muscle precursor cells, resulting in altered differentiation of those myoblast precursor cells into brown adipocytes. The authors provided data that disruption of Smo in a subset of CNCCs also resulted in ectopic adipose tissue formation in the tongue, indicating that this phenotype in the Ofd1 mutant mice was likely caused by disruption of hedgehog signaling in CNCCs. However, no experimental evidence is provided to support a major conclusion of the manuscript regarding altered differentiation of the tongue myoblast precursor cells into brown adipocytes in the Ofd1 mutant mice. 
      Since it is well established that hedgehog signaling in the CNCCs is required for them to direct tongue myoblast cell migration as well as for tongue muscle differentiation/organization after the myoblasts arrived in the tongue primordia, the finding of tongue muscle defects in the Ofd1 mutant mice is not surprising. However, if proven true that disruption of Ofd1 function in CNCCs caused tongue myoblast precursor cells to alter their fate and differentiate into brown adipocytes, it would be an interesting new finding. Further identification of the signals produced by the Ofd1 mutant CNCCs for directing the cell fate switch will be a highly significant new advance in understanding the cellular and molecular mechanisms regulating tongue morphogenesis.

    4. Reviewer #3 (Public Review):

      The authors observed phenotypes of ciliopathy model mice, and these seem to coincide with those in human patients. They used mutants in which ciliary function genes are deleted in cranial neural crest cells, and found that the mutants exhibit abnormal cell differentiation in both neural crest- and mesoderm-lineage cells. The finding clearly shows the importance of tissue/cell interaction. The authors mainly studied mice in which the Ofd1 gene, which is encoded on the X chromosome, is deleted; therefore, Ofd1fl/WT;Wnt1Cre (HET) mice show that about one-fourth of neural crest cells can exhibit Ofd1 function, whereas Ofd1fl;Wnt1Cre (HM) mice show null Ofd1 function and more severe phenotypes than HET.

      The ectopic brown adipose tissue in the tongue is derived from mesoderm, and the authors tried to show that the hypoglossal cord failed to acquire myogenic lineage after entering the branchial arches in HET and HM due to lack of communication with neural crest cells. For ectopic bone formation, they found that it is due to the lack of Hedgehog signaling in neural crest cells, which was consistent with the reports in the Smofl/fl;Wnt1-Cre (Xu et al., 2019) and Ift88fl/fl;Wnt1Cre (Kitamura et al. 2020). The ectopic bone is connected to the original mandibular bone. The authors attribute the ectopic bone formation to the migration of mandibular bone neural crest cells into the tongue-forming area.

      For the poor tongue frenum formation, the authors found the importance of cell migration from the lateral sides of the branchial arch to the midline, and that its formation relies on non-canonical Wnt signaling. The authors observed phenotypes in human patients similar to those in the mutants. Adipose tissue in the tongue area is normally found in the salivary gland region and intermuscular space, and it is intriguing to find brown adipose tissue anterior to the cervical area in which the most anterior brown adipose tissue develops. qRT-PCR indicates that some of the marker genes are expressed in the laser micro-dissected sections of the ectopic brown adipose tissue. However, histology does not show the typical brown adipose tissue features. In addition, brown adipose tissue is normally recognized in the sixth pharyngeal region as the cervical brown adipose tissue from around E14.5 (Schulz and Tseng 2013), not E12 as the authors observe. Although the mutants develop under abnormal conditions, is it possible to say that this is brown adipose tissue? This point has to be further investigated with more marker expression by immunohistochemical detection and other methods. Since the mutants seem to show impaired midline formation (which is consistent with the condition of human ciliopathy), is it possible to hypothesize that the adipose-like tissue is derived from the mesoderm of posterior branchial arch levels if the tissue is brown adipose tissue?

      Cranial neural crest cells start migrating around E8.0 and reach their destination by E9.5. The authors show a lack of neural crest cells in the midline (the fluorescence is absent from the midline in HM); however, they studied this in the E11 mandible (Fig. 4E), more than two days after neural crest migration is complete. Since the mandibular arch seems to form initially in the mutants, is there a failure in allocating the neural crest and mesoderm at the beginning of mandibular arch formation?

      The authors tried to disturb the interaction between the hypoglossal cord and neural crest cells by making incisions in the dorsal area of the branchial arches. That area contains both neural crest and mesoderm but not the hypoglossal cord-derived mesoderm. The hypoglossal cord passes through the posterior edge of the caudal (6th) pharyngeal arch, along the lateral side of the pericardium towards the anterior, ventral to the branchial arches, and then inside the 2nd and 1st branchial arches (Adachi et al., 2018). It expresses Pax3 before entering the branchial arches, then Myf5 in the branchial arches. It seems that the migration of the hypoglossal cord does not require interaction with neural crest cells, but this has to be confirmed, as does neural crest migration into the branchial arches from the beginning. Although the hypoglossal cord migrates mostly in mesoderm-derived mesenchyme, we cannot exclude the possibility that hypoglossal cord migration is affected.

      The lack of Myf5 expression in Ofd1fl;Wnt1Cre (HM) was explained as a failure in the differentiation of the hypoglossal cord into myoblasts on entrance into the branchial arches. Most of the cervical brown adipose tissue is derived from either Myf5- or Pax3- expressing lineage (Sanchez-Gurmaches and Guertin, 2014). Although the authors suggest that brown adipose cells are fate-changed mesoderm in the branchial arches, how do they explain the association with Myf5- or Pax3- expression?

      In addition, the cervical brown tissue is supposed to be derived from the branchial arch mesoderm (Mo et al., 2017). Is the formation of the cervical brown tissue affected in the Ofd1fl/WT;Wnt1Cre(HET) or Ofd1fl;Wnt1Cre (HM) if dysfunction of neural crest cells results in the cell fate change of mesoderm?

      For tongue frenum development, it is hard to understand the hypothesis that its formation is unlikely to be associated with midline formation. Although Lgr5 and Tbx22 are not expressed in the midline, the defect in midline formation could cause unnecessary interaction between the right and left tissues.

      Tissue morphogenesis takes place in three dimensions, which were not considered in the data, especially in the labeling experiments. When the authors labelled the cells, which cells in which area were labelled? In the textbook, tongue formation is a result of the fusion of the midline processes derived from the branchial arches, therefore, it is important to identify which cells in which area are labelled.

      The weakest point is that the authors demonstrate many interesting phenotypes but fail to show the mechanism of altered cell differentiation or direct evidence of the tissue origin of the ectopic brown tissue. Without these data, the authors' argument is weak, which is reflected in the conclusion of the abstract.

    1. eLife assessment

      This useful study defines developmental roles for a protein kinase involved in endocytosis and reports a surprising finding that the kinase catalytic activity is unnecessary. However, several claims of the authors are only partially supported by the data. Although in its current form, this work is incomplete, it will be of broad interest to cell biologists and biochemists because this kinase was previously suggested to be a target of drug design efforts.

    2. Reviewer #1 (Public Review):

      Recent work reported that the AP2-associated kinase 1 (AAK1) downregulates Wnt signaling by phosphorylating, thus activating, the µ-subunit of the AP2 complex (AP2M1), which recognizes an endocytic signal on the intracellular domain of the Wnt co-receptor LRP6 leading to its internalization (Agajanian, et al., 2018). It has also long been known that DPY-23/AP2M1 and the retromer complex, which controls trafficking between endosomes and the trans-golgi network and recycling from endosomes to the plasma membrane, regulate Wnt signaling in C. elegans, at least in part by modulating trafficking of the Wnt-secretion factor MIG-14/WLS (Pan, et al., 2008; Yan et al., 2008).

      Here the authors first set out to ask whether SEL-5/AAK1 plays a conserved role in Wnt signaling via phosphorylation of DPY-23/AP2M1 by assessing the function of SEL-5 in Wnt-regulated morphogenetic events; specifically, the well-characterized migration and polarization of several neurons and the less-understood process of excretory canal cell outgrowth.

      The authors found that the simultaneous removal of sel-5 and the retromer complex gene vps-29 resulted in synthetic neuronal and excretory canal outgrowth phenotypes, indicating that sel-5 and the retromer complex function in parallel in these processes. Genetic interactions between sel-5 and Wnt pathway components were also examined, and for QL neuroblast migration, loss of sel-5 exacerbated phenotypes caused by loss of the Wnt receptor LIN-17/FZD, but not those caused by loss of a different receptor, MIG-1/FZD. The authors assessed the site of sel-5 function in neuronal migration defects via tissue-specific rescue and identified the hypodermis, a known source of Wnt ligands, and muscles as sites where sel-5/AAK1 activity is required.

      The novelty in this work comes from the discovery of a function for sel-5/AAK1 and the retromer complex in excretory canal outgrowth, identified by phenotypes caused by simultaneous loss of sel-5 and retromer components. This synthetic phenotype is rescued by restoring sel-5 to either the excretory canal cell or the hypodermis, suggesting autonomous and non-autonomous functions for sel-5 in canal outgrowth. The authors also confirmed previous results showing that loss of LIN-17/FZD results in excretory canal overgrowth, and by carrying out an extensive survey of Wnt-pathway mutants they discovered that LIN-44/Wnt is likely the ligand that functions via LIN-17 as a "stop" signal in canal outgrowth. They also implicate a CWN-1/Wnt-CFZ-2/FZD pathway as required for canal outgrowth and find genetic interactions between sel-5/AAK1 and the lamellipodin ortholog mig-10, suggesting that these genes function in parallel to promote excretory canal outgrowth.

      The most intriguing claim in this work is the suggestion that neither DPY-23 phosphorylation nor SEL-5 kinase activity is required for their function in Wnt signaling. However, the tools used to support these conclusions are not well-characterized. First, a new dpy-23 phosphorylation site-mutant is not genetically characterized, thus it is difficult to interpret the negative results obtained with this allele. Second, although the mutations introduced into SEL-5 are expected to abolish kinase activity, this is not demonstrated biochemically, nor are the effects, if any, of mutations on protein stability/localization assessed. Finally, experiments testing the function of SEL-5 kinase mutants are reported using only one multi-copy extrachromosomal array per construct. Because these types of transgenes vastly overexpress proteins, it is likely that even proteins with reduced function will rescue, raising concerns regarding the conclusion that kinase activity is not necessary for SEL-5 function.

      In conclusion, it is not clear that the findings presented here will be of great general interest, as they mostly support previously-known functions for SEL-5/AAK1, DPY-23/AP2M1, and the retromer complex in Wnt-mediated signaling. Thus, this work will mainly be of interest to researchers studying Wnt-mediated cell outgrowth, and more specifically to those studying the C. elegans excretory canal. Moreover, the study lacks coherence: initially, there is a clear hypothesis testing a role for SEL-5/AAK1 in DPY-23/AP2M1 phosphorylation and how this impinges on Wnt signaling. This model appears to be refuted (although, as noted above the tools used to do this need to be better validated), but the authors do not explore alternative targets or functions for SEL-5/AAK1, nor do they directly assess how SEL-5 or the retromer complex impinge on Wnt signaling in excretory canal outgrowth. Thus, there is little mechanistic insight provided by this work.

    3. Reviewer #2 (Public Review):

      Summary

      This study by Knop, et al. defines two different developmental roles for the conserved SEL-5/AAK1 protein kinase in Caenorhabditis elegans. In other organisms, AAK1 was known to promote the recycling of the Wntless sorting receptor and endocytosis of Wnt receptors. This study establishes that SEL-5 acts in two roles in C. elegans: in Wnt-producing cells, a role that promotes migration of a neuroblast termed QL.d, and in Wnt-receiving cells, a role that promotes outgrowth of the excretory cell (EXC). Before this study, SEL-5/AAK1 was thought to regulate endocytosis through phosphorylation of AP2M1 and other endocytic adaptor proteins. This study shows convincing data that SEL-5 makes a partial contribution to AP2M1 phosphorylation, and more surprisingly, that its roles in Wnt-producing and Wnt-receiving cells of C. elegans do not require SEL-5 catalytic activity. Human AAK1 was previously suggested to be a target of drug design efforts due to its roles in neuropathic pain, viral infection, and Alzheimer's disease. The discovery that some roles for SEL-5/AAK1 are independent of catalytic activity will be of broad interest to cell biologists and biochemists.

      Strengths

      (1) The data establishing the requirement for SEL-5 in QL.d migration and EXC outgrowth (Fig. 1 and Fig. 4) are rigorous and convincing. My assessment of the rigor is based on the following: First, the authors show that two independently derived sel-5 deletion mutations result in defects in QL.d and EXC. Second, the authors show that providing wild-type, GFP-tagged SEL-5 results in significant rescue of both phenotypes. Importantly, they use tissue-specific transgenes to show that the requirement for SEL-5 in QL.d migration is non-cell-autonomous, and the requirement for SEL-5 in EXC outgrowth is cell-autonomous (Fig. 2). For rescue experiments, they show that each tissue-specific transgene is indeed expressed strongly in the tissue of interest. This establishes two different roles for SEL-5, in Wnt-producing and Wnt-receiving cells.

      (2) The authors present three lines of convincing biochemical and genetic evidence that SEL-5 kinase catalytic activity is not important for its roles in Wnt-producing and Wnt-receiving cells.

      Taking a biochemical approach, they use quantitative Westerns to assess the degree of AP2M1 phosphorylation in sel-5 mutants (Fig. 3). Their results show that AP2M1 phosphorylation is diminished, but not absent in mutants. Their results are convincing because they make use of GFP-tagged AP2M1 to probe for total and phospho-AP2M1. I note that they included uncropped Western blots in supplemental data. Furthermore, they make use of a GFP-tagged AP2M1 mutant (T160A) to confirm which residue is phosphorylated. Their results suggest that some mechanism other than AP2M1 phosphorylation may account for the sel-5 mutant phenotypes.

      Taking a genetic approach, they make use of a unique allele, dpy-23(mew25), that alters the known AP2M1 phosphorylation site. They show that animals carrying this allele do not display the QL.d and EXC phenotypes (Fig. 3 and Fig. 5). Finally, in a more direct test of whether SEL-5 requires catalytic activity, they make use of GFP-tagged SEL-5 forms mutated at either the active site or the ATP-binding site of the SEL-5 kinase domain. They show that either SEL-5 mutant form successfully rescues the QL.d and EXC defects seen in sel-5 mutants (Fig. 3), suggesting that SEL-5 catalytic activity is unnecessary.

      (3) The authors have produced an elegant GFP knock-in allele of the sel-5 gene, allowing analysis of expression and localization in living animals (Fig. 2).

      (4) The authors make use of genetic interactions with Wnt signaling mutants to show that SEL-5 acts in a role that promotes Wnt signaling for the QL.d cell (Fig. 1) and counteracts Wnt signaling for the EXC (Fig. 5).

      Weaknesses

      (1) Some changes to statistical analyses are needed in this study.

      Fig. 1B, 1D, 2A, 3E, and 3F report the QL.d phenotype as a percentage of animals scored that were defective in migration. The methods make it clear this data is categorical rather than quantitative. Therefore, a t-test or any test designed for quantitative data is not appropriate. I suggest that the authors should investigate using a chi-squared or Fisher's exact test.

      For the reasons mentioned above, the calculation of standard deviation (as shown in error bars) is also not appropriate for Fig. 1B, 1D, 2A, 3E, and 3F. Of course, it is excellent that the authors scored multiple trials. For experiments with mutants, I suggest the authors might combine these trials or show separate results of each trial. For experiments using RNAi (Fig. 1B), each trial should be plotted separately because RNAi effectiveness can vary. If there is not enough space to show multiple trials, then I would ask that a representative trial be shown in the main figure and additional trials in a supplement.

      In Fig. 1, 2, 3, and 5, it is not specified whether/how p-values were adjusted for multiple tests.
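
      As a minimal sketch of the kind of analysis suggested above, the snippet below runs a two-sided Fisher's exact test on a 2x2 table of animal counts (defective vs. normal migration in two genotypes) and then applies a Holm adjustment across several comparisons. It uses only the Python standard library. The function names and all counts are hypothetical illustrations, not values or methods taken from the manuscript, and Holm is only one of several reasonable multiple-testing corrections.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are genotypes; columns are (defective, normal) animal counts.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x defective animals in row 1
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum probabilities of all tables no more likely than the observed one
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def holm_adjust(pvalues):
    """Holm step-down adjustment; returns adjusted p-values in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # Multiply the k-th smallest p-value by (m - k) and enforce monotonicity
        running_max = max(running_max, (m - rank) * pvalues[i])
        adjusted[i] = min(1.0, running_max)
    return adjusted


# Hypothetical counts: (defective, normal) for a control and two mutant strains
control = (2, 98)
mutants = {"mutant_A": (30, 70), "mutant_B": (5, 95)}

raw = [fisher_exact_two_sided(*control, *counts) for counts in mutants.values()]
for name, p_raw, p_adj in zip(mutants, raw, holm_adjust(raw)):
    print(f"{name}: raw p = {p_raw:.3g}, Holm-adjusted p = {p_adj:.3g}")
```

      Pooling trials into one table, as done here, is only appropriate if the trials are comparable; for RNAi experiments each trial would need its own table.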

      (2) I felt the authors' interpretation of the sel-5 mutant phenotypes in EXC, and the genetic interactions with Wnt signaling mutants, might be improved. The authors show convincing data that the sel-5 mutants display a shortened EXC outgrowth phenotype. Conversely, mutants with reduced Wnt signaling, such as the lin-17 or lin-44 mutants, displayed lengthened EXC outgrowth. The authors show that in double mutants, loss of sel-5 partially suppressed the EXC overgrowth defects of lin-17 or lin-44 mutants (Fig. 5). In my opinion, this data is consistent with a model where SEL-5 acts to inhibit Wnt signaling in EXC. An inhibitory role in a Wnt-receiving cell would be consistent with the known activity for human AAK1 in promoting negative feedback and endocytosis of LRP6. Interestingly, the authors mention in their discussion that a mutant of plr-1, which acts in the internalization of Frizzled receptors, has a shortened EXC phenotype similar to that of sel-5 mutants. These observations all seem consistent with an inhibitory role, yet the authors do not state this as their conclusion. A clarification of their interpretation is needed.

      Impact/significance

      (1) Among researchers using C. elegans, this study provides a foundation for further investigation of the role of endocytosis, SEL-5, and the retromer, in Wnt trafficking. It is particularly useful that the authors define two different phenotypes that arise from Wnt-producing and Wnt-receiving cells.

      (2) Among a broader community of cell biologists and biochemists, this study will be of interest in its finding that SEL-5/AAK1 kinase catalytic activity is unnecessary for the regulation of Wnt signaling.

    1. eLife assessment

      This important study advances our understanding of the molecular mechanism underlying salt stress-induced inhibition of seed germination and seedling growth. The evidence supporting the conclusions is convincing, with rigorous genetic, physiological, and metabolic analyses. This paper will be of interest to plant stress biologists and crop breeders.

    2. Reviewer #1 (Public Review):

      Salt inhibits germination and growth in Arabidopsis and other plant species. Here the authors demonstrate that part of that inhibitory effect is caused by arginine-derived urea hydrolysis, a novel mechanism. They also postulate that urea transport is involved in germination inhibition, but they do not link urea transport from cotyledons to pH changes in roots. Finally, they generalize the mechanism to other glycophytic crops and halophytic plants, but the salt concentration used is the same for the four groups, which are expected to have very different salt tolerance ranges; this calls the validity of the generalization into question.

      Overall, the authors have provided well-organized genetic and pharmacological evidence to support most of their conclusions.

    3. Reviewer #2 (Public Review):

      Urea is widely utilized in agriculture. In this study, the authors investigate the mechanism underlying the adverse impact of urea on seed germination and seedling growth under salt stress conditions. The results show that salt stress induces a pronounced hydrolysis of urea, resulting in an elevation of cytoplasmic pH and subsequent inhibition of seed germination. These findings challenge the previous notion that ammonium accumulation is the primary cause of salt-induced inhibition of germination, thereby offering novel insights into this process.

      The authors have provided well-organized genetic or biochemical evidence to support most of their conclusions.

    4. Reviewer #3 (Public Review):

      This work submitted by Bu et al. investigated mechanisms of how salt stress-induced arginine catabolism, which is catalyzed by arginase and urease, inhibits seed germination and seedling growth in Arabidopsis using a combination of genetic, biochemical, and live-cell imaging approaches. Their results showed that the two steps for the turnover of arginine into ammonia and the transport of urea from the cotyledon to the root are required for the salt-induced inhibition of seed germination (SISG). Further analysis showed that the cellular accumulation of the end product ammonia is not associated with SISG, but it is the cytoplasmic alkaline stress that primarily causes SISG. Interestingly, they found that the mechanism underlying SISG is conserved in other plant species. In general, this work will be valuable for plant biologists to deeply dissect the complex mechanism that controls salt stress-induced inhibition of plant growth and development in the future.

      The conclusions derived from this work are well supported by the data, but some aspects of data analysis need to be clarified and extended.

      (1) Inhibition of arginine hydrolysis by enzyme inhibitors (NOHA for arginase and PPD for urease) significantly improved seed germination and seedling growth (Figure 2). It seems that the suppressive effect of NOHA against the salt-induced inhibition of seedling growth is dose-dependent (Figure 2b). Whether NOHA effect on SISG is also dose-dependent and application of a certain level of NOHA can fully rescue the phenotype of SISG remains to be answered. The answers may help to explain the genetic data shown in Figure 3c, where either single (argah1 and argah2) or double (argah1/argah2) mutants partially rescued the phenotype of SISG. However, arginase activity, particularly in argah1 and argah2, is not closely correlated to the phenotype shown in Figure 3c and 3d.

      (2) The data shown in Figure 4b and 4e were not fully consistent. The percentage of seed germination rate was about 70% when treated with the highest concentration (7.5 μM) of PPD, but was less than 40% for the aturease mutant.

      (3) Cellular pH values detected at the seed germination stage were not convincing. In the text, they did not describe the results showing that the cytoplasmic pH values in hypocotyl and cotyledon cells were alkaline and not affected by NaCl treatment, and PPD treatment only restored the alkaline cytoplasmic pH to that of the control (Figure 7b). This raises two questions: is it true that cytoplasmic pH values are different between root and cotyledon/hypocotyl cells under normal growth conditions? and does PPD treatment alter the cytoplasmic pH only in roots?

    1. eLife assessment

      This study presents a useful finding on a virally encoded immune-evasin which differentially inhibits antigen presentation by cellular protein complexes called Major histocompatibility complex (MHC) class I, thereby diminishing the activation of cytotoxic T cells. The evidence supporting the claims of the authors is solid, although the addition of more mechanistic insights would strengthen the study. The work will be of interest to virologists and immunologists working on the adaptive immune response to herpesviral infection. Some conclusions would require additional experimental support.

    2. Reviewer #1 (Public Review):

      HCMV encodes various immunoevasins to avoid having its antigens presented by MHC class I molecules to the cytotoxic cells of the immune system. Here, the authors studied the role and specificity of US10, a relatively uncharacterized immunoevasin from HCMV. They found that US10 differentially affects antigen presentation by different MHC class I allotypes. HLA-A and certain HLA-B and C alleles (so-called "tapasin-independent") were unaffected, while other HLA-B and C alleles (so-called "tapasin-dependent") as well as HLA-G were negatively affected. US10 can bind to different MHC class I allotypes, which inhibits their incorporation into the peptide loading complex and slows maturation. By comparing US10 to the other well-studied immunoevasins from HCMV, US2, US3, and US11, the authors demonstrated only partial overlap between them, suggesting the cumulative action of immunoevasins in inhibiting MHC class I antigen presentation of HCMV epitopes. This work contributes to our understanding of the complex immune evasion mechanism of HCMV.

      The strengths include the broad use of available techniques, including overexpression of US10 and US10 siRNA in the infection context, which allowed comparison of its net and cumulative effects, and bioinformatic analysis of US10 and US11 to describe how transcription and expression of these two gene products contribute to the control of immunoevasion by HCMV. The conclusions are mostly supported by the experiments.

    3. Reviewer #2 (Public Review):

      The manuscript entitled "Multimodal HLA-I genotypes regulation by human cytomegalovirus US10 and resulting surface patterning" by Gerke et al. describes the biochemical analysis of US10-mediated downregulation of HLA-I molecules. The authors systematically examine the surface expression of different HLA-I alleles in cells expressing US10 and the interactions of US10 with HLA-I and the antigen presentation machinery. Further studies examining genotypic and allotypic differences during expression of US10/US11 transcripts suggest differential allelic class I downregulation. In general, the authors have included data supporting the major claims. Yet, the conclusions and findings of the study only marginally advance the overall understanding of HCMV viral evasion and the mechanism of US10 function.

      Strengths:

      The studies are well characterized and utilize diverse HLA-I and HCMV viral molecules. The biochemistry is excellent and of high quality. Importantly, the study describes HLA-I allele-specific HCMV downregulation at the cell surface and molecular levels.

      Weaknesses:

      (1) The authors use overly expressive language such as "strong binding" that does not have a quantitative value and is relative to the specific assay, with only small differences among the factors.

      (2) The US10 binding to the HLA-I did not correlate with class I surface levels suggesting that binding to the APC machinery (Figure 1); hence, why does the binding of US10 to the APC define its mechanism of action?

      (3) The innovative and significant aspects of the study are limited. The study does not delineate the US10 mechanism of action or show data in which US10-mediated MHC class I downregulation impacts adaptive or innate immune function.

    4. Reviewer #3 (Public Review):

      Correlation of the HLA-B effects with previously demonstrated allelic differences in dependence on the peptide loading complex (PLC) component chaperone/editor tapasin and demonstration that US10 does not bind the PLC reflect on possible mechanisms of US10 function. Thus, this paper adds new information that may be integrated into evolving models of the steps of MHC-I dependent antigen presentation and how viruses counter immune recognition for their own benefit. Clearer focus on the proposed models for the function of US10 and its mechanism--i.e. what experiments address the mechanism and what additional finding might clarify the mechanism would be helpful.

    1. Author response:

      Reviewer #1 (Public Review):

      Summary:

      [...] This study is a fundamental step towards our better understanding of the mechanisms underlying light effects on cognition and consequently optimising lighting standards.

      Strengths:

      While it is still impossible to distinguish individual hypothalamic nuclei, even with the high-resolution fMRI, the authors split the hypothalamus into five areas encompassing five groups of hypothalamic nuclei. This allowed them to reveal that different parts of the hypothalamus respond differently to an increase in illuminance. They found that higher illuminance increased the activity of the posterior part of the hypothalamus encompassing the MB and parts of the LH and TMN, while decreasing the activity of the anterior parts encompassing the SCN and another part of TMN. These findings are somewhat in line with studies in animals. It was shown that parts of the hypothalamus such as SCN, LH, and PVN receive direct retinal input in particular from ipRGCs. Also, acute chemogenetic activation of ipRGCs was shown to induce activation of LH and also increased arousal in mice.

      Weaknesses:

      While the light characteristics are well documented and EDI calculated for all of the photoreceptors, it is not very clear why these irradiances and spectra were chosen. It would be helpful if the authors explained the logic behind the four chosen light conditions tested. Also, the lights chosen have cone-opic EDI values in a high correlation with the melanopic EDI, therefore we can't distinguish if the effects seen here are driven by melanopsin and/or other photoreceptors. In order to provide a more mechanistic insight into the light-driven effects on cognition ideally one would use a silent substitution approach to distinguish between different photoreceptors. This may be something to consider when designing the follow-up studies.

      We thank the reviewer for acknowledging the quality and interest of our work and agree with the weaknesses they pointed out.

      Blue-enriched light illuminances were set according to the technical characteristics of the light source and to keep the overall photon flux similar to prior 3T MRI studies of our team (between ~10¹² and 10¹⁴ ph/cm²/s) (Vandewalle et al. 2010 PNAS, Vandewalle et al. 2011 Biol. Psy.). The orange light was introduced as a control visual stimulation for potential secondary whole-brain analyses. Its photopic illuminance should ideally have been set similar to the low illuminance blue-enriched light condition, but this was not the case. For the present region of interest analyses, we discarded colour differences between the light conditions and only considered illuminance as indexed by mel EDI lux. This indeed constitutes a limitation of our study as it does not allow attributing the findings to a particular photoreceptor class.

      The revised version of the manuscript will include a better explanation as to the choice of illuminances and spectra. The discussion will make clear that these choices limit the interpretation about the photoreceptors involved. The discussion will also point out that silent substitution could be used in the future to resolve such question.

      Reviewer #2 (Public Review):

      [...] By shedding light on these complex interactions, this research endeavors to contribute to the foundational knowledge necessary for developing innovative therapeutic strategies aimed at enhancing cognitive function through environmental modulation.

      Strengths:

      (1) Considerable Sample Size and Detailed Analysis: The study leverages a robust sample size and conducts a thorough analysis of hypothalamic dynamics, which enhances the reliability and depth of the findings.

      (2) Use of High-Resolution Imaging: Utilizing 7 Tesla fMRI to analyze brain activity during cognitive tasks offers high-resolution insights into the differential effects of illuminance on hypothalamic activity, showcasing the methodological rigor of the study.

      (3) Novel Insights into Illuminance Effects: The manuscript reveals new understandings of how different regions of the hypothalamus respond to varying illuminance levels, contributing valuable knowledge to the field.

      (4) Exploration of Potential Therapeutic Applications: Discussing the potential therapeutic applications of light modulation based on the findings suggests practical implications and future research directions.

      Weaknesses:

      (1) Foundation for Claims about Orexin and Histamine Systems: The manuscript needs to provide a clearer theoretical or empirical foundation for claims regarding the impact of light on the orexin and histamine systems in the abstract.

      (2) Inclusion of Cortical Correlates: While focused on the hypothalamus, the manuscript may benefit from discussing the role of cortical activation in cognitive performance, suggesting an opportunity to expand the scope of the manuscript.

      (3) Details of Light Exposure Control: More detailed information about how light exposure was controlled and standardized is needed to ensure the replicability and validity of the experimental conditions.

      (4) Rationale Behind Different Exposure Protocols: To clarify methodological choices, the manuscript should include more in-depth reasoning behind using different protocols of light exposure for executive and emotional tasks.

      We thank the reviewer for recognising the interest and strength of our study. We agree that corrections and clarifications to the text were needed. We will address the weaknesses they pointed out as follows:

      (1) As detailed in the discussion, we do believe orexin and histamine are excellent candidates for mediating the results we report. As also pointed out, however, we are in no position to know which neurons, nuclei, neurotransmitters and neuromodulators underlie the results. We will therefore remove the last sentence of the abstract as we agree our final statement in the abstract was too strong. We will carefully reconsider the discussion to avoid such overstatements.

      (2) We are unsure at this stage how to address the comment of the reviewer without considerably lengthening the manuscript with statements which can only be putative. Hypothalamus nuclei are connected to multiple cortical (and subcortical) structures. The relevance of these projections will vary with the cognitive task considered. In addition, we have not yet considered the cortex in our analyses such that truly integrating cortical structures appears premature. We will nevertheless refer to the general statement that subcortical structures (and particularly those receiving direct retinal projections) are likely to receive light illuminance signal first before passing on the light modulation to the cortical regions involved in the ongoing cognitive process.

      (3) Illuminance and spectra could not be directly measured within the MRI scanner due to the ferromagnetic nature of measurement systems. The MR coil and the associated optic fibre stand, together with the entire lighting system, were therefore placed outside of the MR room to reproduce the experimental conditions in a completely dark room. A sensor was placed 2 cm away from the mirror of the coil (mounted at eye level), i.e. where the eye of the first author of the paper would be positioned, to measure illuminance and spectra. The procedure was repeated 4 times for illuminance and twice for spectra, and measurements were averaged. This procedure does not take into account inter-individual variation in head size and orbit shape, such that the reported illuminance levels may have varied slightly across subjects. The relative differences between illuminances are very unlikely to vary substantially across participants, such that statistics consisting of tests for the impact of relative differences in illuminance were not affected. We will report these methodological details in the supplementary material file associated with the paper.

      (4) The comment is similar to the issue raised by reviewer 1 (and reviewer 3) so we refer to the response provided to reviewer 1 to address the final comment of reviewer 2.

      Reviewer #3 (Public Review):

      [...] The authors find evidence in support of a posterior-to-anterior gradient of increased blood flow in the hypothalamus during task performance that they later relate to performance on two different tasks. The results provide an enticing link between light levels, hypothalamic activity, and cognitive/affective function, however, clarification of some methodological choices will help to improve confidence in the findings.

      Strengths:

      The authors' focus on the hypothalamus and its relationship to light intensity is an important and understudied question in neuroscience.

      Weaknesses:

      I found it challenging to relate the authors' hypotheses, which I found to be quite compelling, to the apparatus used to test the hypotheses - namely, the use of orange light vs. different light intensities; and the specific choice of the executive and emotional tasks, which differed in key features (e.g., block-related vs. event-related designs) that were orthogonal to the psychological constructs being challenged in each task.

      Given the small size of the hypothalamus and the irregular size of the hypothalamic parcels, I wondered whether a more data-driven examination of the hypothalamic time series would have provided a more parsimonious test of their hypothesis.

      We thank the reviewer for acknowledging the originality and interest of our study. We agree that some methodological choices needed more explanation. We will address the weaknesses they pointed out as follows:

      The first comment questions the choices of the light conditions and of the tasks. Regarding light conditions, since reviewer 1 (and reviewer 2) raised a similar issue, we refer to the response provided to reviewer 1. We agree that many different tasks could have been used to test our hypotheses. Prior work of our team showed that the n-back task and emotional task we used were successful probes to demonstrate that light illuminance modulates cognitive activity, including within subcortical structures (though resolution did not allow precise isolation of nuclei or subparts). When taking the step of ultra-high field imaging we therefore opted for these tasks as our goal was to show that illuminance affects subcortical brain activity across cognitive domains in general and we were not interested in tasks that would test specific aspects of these domains. The fact that one task is event-related while the other consists of a block design adds, in our view, to the robustness of our finding that a similar anterior-posterior gradient of activity modulation by illuminance is present in hypothalamus. We will update the discussion to highlight this aspect.

      As mentioned in the text, the protocol also included an auditory attentional task that could have further broadened the potential generalisability of our findings, but it was not part of the analyses as it could only include 2 illuminance levels due to time constraints.

      We agree that a data-driven approach could have constituted an alternative means to test our hypothesis. We opted for the approach that we mastered best while still allowing us to conclusively test for regional differences in activity across the hypothalamus. Examination of the time series of the very same data we used will mainly confirm the results of our analyses – an anterior-posterior gradient in the impact of illuminance – and may yield slight differences in the limits of the subparts of the hypothalamus undergoing decreased or increased activity with increasing illuminance. While the suggested approach might have been envisaged had we been facing negative results (i.e. no differences between subparts, potentially because subparts would not correspond to functional differences in response to illuminance change), it would now constitute a circular confirmation of our main findings (i.e. using the same data). While we truly appreciate the suggestion, we do not consider that it would constitute a more parsimonious test of our hypothesis now that we have successfully applied GLM/parcellation and GLMM approaches.

    2. Reviewer #2 (Public Review):

      Summary:

      The interplay between environmental factors and cognitive performance has been a focal point of neuroscientific research, with illuminance emerging as a significant variable of interest. The hypothalamus, a brain region integral to regulating circadian rhythms, sleep, and alertness, has been posited to mediate the effects of light exposure on cognitive functions. Previous studies have illuminated the role of the hypothalamus in orchestrating bodily responses to light, implicating specific neural pathways such as the orexin and histamine systems, which are crucial for maintaining wakefulness and processing environmental cues. Despite advancements in our understanding, the specific mechanisms through which varying levels of light exposure influence hypothalamic activity and, in turn, cognitive performance, remain inadequately explored. This gap in knowledge underscores the need for high-resolution investigations that can dissect the nuanced impacts of illuminance on different hypothalamic regions. Utilizing state-of-the-art 7 Tesla functional magnetic resonance imaging (fMRI), the present study aims to elucidate the differential effects of light on the hypothalamic dynamics and establish a link between regional hypothalamic activity and cognitive outcomes in healthy young adults. By shedding light on these complex interactions, this research endeavors to contribute to the foundational knowledge necessary for developing innovative therapeutic strategies aimed at enhancing cognitive function through environmental modulation.

      Strengths:

      (1) Considerable Sample Size and Detailed Analysis: The study leverages a robust sample size and conducts a thorough analysis of hypothalamic dynamics, which enhances the reliability and depth of the findings.

      (2) Use of High-Resolution Imaging: Utilizing 7 Tesla fMRI to analyze brain activity during cognitive tasks offers high-resolution insights into the differential effects of illuminance on hypothalamic activity, showcasing the methodological rigor of the study.

      (3) Novel Insights into Illuminance Effects: The manuscript reveals new understandings of how different regions of the hypothalamus respond to varying illuminance levels, contributing valuable knowledge to the field.

      (4) Exploration of Potential Therapeutic Applications: Discussing the potential therapeutic applications of light modulation based on the findings suggests practical implications and future research directions.

      Weaknesses:

      (1) Foundation for Claims about Orexin and Histamine Systems: The manuscript needs to provide a clearer theoretical or empirical foundation for claims regarding the impact of light on the orexin and histamine systems in the abstract.

      (2) Inclusion of Cortical Correlates: While focused on the hypothalamus, the manuscript may benefit from discussing the role of cortical activation in cognitive performance, suggesting an opportunity to expand the scope of the manuscript.

      (3) Details of Light Exposure Control: More detailed information about how light exposure was controlled and standardized is needed to ensure the replicability and validity of the experimental conditions.

      (4) Rationale Behind Different Exposure Protocols: To clarify methodological choices, the manuscript should include more in-depth reasoning behind using different protocols of light exposure for executive and emotional tasks.

    3. eLife assessment

      This fundamental work describes the complex interplay between light exposure, hypothalamic activity, and cognitive function. The evidence supporting the conclusion is compelling with potential therapeutic applications of light modulation. The work will be of broad interest to basic and clinical neuroscientists.

    4. Reviewer #1 (Public Review):

      Summary:

      Campbell et al. investigated the effects of light on the human brain, in particular the subcortical part of the hypothalamus during auditory cognitive tasks. The mechanisms and neuronal circuits underlying light effects in non-image forming responses have so far mostly been studied in rodents but are not easily translated to humans. Therefore, this is a fundamental study aiming to establish the impact light illuminance has on the subcortical structures using high-resolution 7T fMRI. The authors found that parts of the hypothalamus respond differently to illuminance. In particular, they found that the activity of the posterior hypothalamus increases while the activity of the anterior and ventral parts of the hypothalamus decreases under high illuminance. The authors also report that the performance of the 2-back executive task was significantly better in higher illuminance conditions. However, it seems that the activity of the posterior hypothalamus subpart is negatively related to the performance of the executive task, implying that it is unlikely that this part of the hypothalamus is directly involved in the positive impact of light on performance observed. Interestingly, the activity of the posterior hypothalamus was, however, associated with an increased behavioural response to emotional stimuli. This suggests that the role of this posterior part of the hypothalamus is not as simple regarding light effects on cognitive and emotional responses. This study is a fundamental step towards our better understanding of the mechanisms underlying light effects on cognition and consequently optimising lighting standards.

      Strengths:

      While it is still impossible to distinguish individual hypothalamic nuclei, even with the high-resolution fMRI, the authors split the hypothalamus into five areas encompassing five groups of hypothalamic nuclei. This allowed them to reveal that different parts of the hypothalamus respond differently to an increase in illuminance. They found that higher illuminance increased the activity of the posterior part of the hypothalamus encompassing the MB and parts of the LH and TMN, while decreasing the activity of the anterior parts encompassing the SCN and another part of TMN. These findings are somewhat in line with studies in animals. It was shown that parts of the hypothalamus such as SCN, LH, and PVN receive direct retinal input in particular from ipRGCs. Also, acute chemogenetic activation of ipRGCs was shown to induce activation of LH and also increased arousal in mice.

      Weaknesses:

      While the light characteristics are well documented and EDI calculated for all of the photoreceptors, it is not very clear why these irradiances and spectra were chosen. It would be helpful if the authors explained the logic behind the four chosen light conditions tested. Also, the lights chosen have cone-opic EDI values in a high correlation with the melanopic EDI, therefore we can't distinguish if the effects seen here are driven by melanopsin and/or other photoreceptors. In order to provide a more mechanistic insight into the light-driven effects on cognition ideally one would use a silent substitution approach to distinguish between different photoreceptors. This may be something to consider when designing the follow-up studies.

    5. Reviewer #3 (Public Review):

      Summary:

      Campbell and colleagues use a combination of high-resolution fMRI, cognitive tasks, and different intensities of light illumination to test the hypothesis that the intensity of illumination differentially impacts hypothalamic substructures that, in turn, promote alterations in arousal that affect cognitive and affective performance. The authors find evidence in support of a posterior-to-anterior gradient of increased blood flow in the hypothalamus during task performance that they later relate to performance on two different tasks. The results provide an enticing link between light levels, hypothalamic activity, and cognitive/affective function, however, clarification of some methodological choices will help to improve confidence in the findings.

      Strengths:

      * The authors' focus on the hypothalamus and its relationship to light intensity is an important and understudied question in neuroscience.

      Weaknesses:

      * I found it challenging to relate the authors' hypotheses, which I found to be quite compelling, to the apparatus used to test the hypotheses - namely, the use of orange light vs. different light intensities; and the specific choice of the executive and emotional tasks, which differed in key features (e.g., block-related vs. event-related designs) that were orthogonal to the psychological constructs being challenged in each task.

      * Given the small size of the hypothalamus and the irregular size of the hypothalamic parcels, I wondered whether a more data-driven examination of the hypothalamic time series would have provided a more parsimonious test of their hypothesis.

    1. Reviewer #3 (Public Review):

      Summary:

      In this study, Han and co-authors showed that implantation of Pik3ca-deficient KPC cells (aKO) induced clonal expansion of CD8 T cells in the tumor microenvironment. Using aKO cells, they conducted an in vivo genome-wide gene-deletion screen, which showed that deletion of the propionyl-CoA carboxylase subunit B gene (Pccb) in aKO cells (p-aKO) leads to immune evasion and tumor progression. Eventually, mice injected with p-aKO but not aKO succumbed to their tumors. Similar to the parental aKO cell line, p-aKO tumors were still infiltrated with clonally expanded CD8+ and CD4+ T cells, as shown by the IHC. Further analyses showed that T cells infiltrating p-aKO tumors expressed high levels of exhaustion markers (PD-1, CTLA-4, TIM3, and TIGIT). Furthermore, PD-1 signaling blockade using PD-1 mAb or genetic depletion of PD-1 reactivated the infiltrated T cells, controlling tumor progression and improving overall mouse survival. Thus, the authors concluded in the abstract that "Pccb can modulate the activity of cytotoxic T cells infiltrating some pancreatic cancers." Although the data clearly showed that the loss of Pccb facilitated the immune evasion of pancreatic cancer cells, there is no clear evidence provided that Pccb deletion can actually modulate the activity of CD8 T cells. One may argue that the deletion of Pccb reduces the immunogenicity of the p-aKO cancer cells, making them less susceptible to killing by normally functional CD8+ T cells.

      Strengths:

      In vivo CRISPR-Cas9 screen using tumor cell lines.

      Identify a gene that could reduce the immunogenicity of cancer cells.

      Weaknesses:

      The IHC technique that was used to stain and characterize the exhaustion status of the tumor-infiltrating T cells.

    2. eLife assessment

      The significance of the findings is valuable, with implications for immunotherapy design in pancreatic ductal adenocarcinoma. The evidence was considered incomplete and partially supportive of the major claims.

    3. Reviewer #1 (Public Review):

      Summary:

      Pancreatic ductal adenocarcinoma (PDAC) is an aggressive disease that does not respond to immunotherapy. This work represents an extension of the authors' prior observation that PI3Ka deletion in an orthotopic KPC pancreatic tumor model confers susceptibility to immune-mediated elimination. The authors' major claims in the present manuscript are as follows:

      (1) PI3Ka (Pik3ca) knockout in KPC pancreatic tumor cells induces clonal T cell expansion.

      (2) Genome-wide LOF screen in aKPC cells to identify tumor-intrinsic determinants of PI3Ka-KO-enhanced T cell response identified Pccb.

      (3) When Pccb is knocked out in the context of Pi3ka knockout KPC, anti-tumor T cell response is reduced as measured by
         a. Increased tumor progression
         b. Decreased survival
         c. T cells are still clonally expanded but less functional

      (4) ICB is able to "reactivate" clonally expanded T cells.

      (5) Conclusion: Pccb modulates the activity of T cells in PDAC.

      Overall, the experiments were appropriately executed and technically sound, albeit underpowered for single-cell analyses. Upon careful consideration of the data, the biggest weakness of the paper is the authors' interpretations of results, particularly for claims 1 and 4 (see below for details). Much of the data is correlative and does not delve into causation, leaving this reviewer wishing for experiments that would clearly demonstrate that Pccb in tumor cells directly impacts T cell anti-tumor activity.

      Strengths:

      (1) Tumor intrinsic determinants of intratumoral T cell infiltration in PDAC are less commonly evaluated as combination therapies for ICB. This is a point of conceptual innovation and importance.

      (2) A sensitized CRISPR screen to identify mutations that rescue KPC/PI3Ka-KO tumors from immune-mediated killing is an elegant method to better understand the molecular mechanisms contributing to KPC immunosurveillance. Further, one screen candidate (Pccb) was experimentally validated.

      (3) Single-cell clonotype analyses hold promise for identifying tumor-reactive T cells (though authors never demonstrated that specific clones were tumor antigen specific).

      Weaknesses:

      (1) "Clonal expansion of cytotoxic T cells infiltrating the pancreatic αKO tumors"
         a. Only two tumor-bearing hosts were evaluated by single-cell TCR sequencing, thus limiting conclusions that may be drawn regarding repertoire diversity and expansion.
         b. High abundance clones in the TME do not necessarily have tumor specificity, nor are they necessarily clonally expanded. They may be clones which are tissue-resident or highly chemokine-responsive and accumulate in larger numbers independent of clonal expansion. Please consider softening language to clonal enrichment or refer to clone size as clonal abundance throughout the paper.
         c. The whole story would be greatly strengthened by cytotoxicity assays of abundant TCR clones to show tumor antigen specificity.

      (2) "A genome-wide CRISPR gene-deletion screen to identify molecules contributing to Pik3ca-mediated pancreatic tumor immune evasion"
         a. CRISPR mutagenesis yielded outgrowth of only 2/8 tumors. A more complete screen with an increased total number of tumors would yield much stronger gene candidates with better statistical power. It is unsurprising that candidates were observed in only one of the two tumors. Nevertheless, the authors moved forward successfully with Pccb.

      (3) T cells infiltrate p-αKO tumors with increased expression of immune checkpoints
         a. In Figure 4D, cell counts are not normalized to total CD8+ T cell counts, making it difficult to directly compare aKO to p-aKO tumors. Based on quantifications from Figure 4D, I suspect normalization will strengthen the conclusion that CD8+ infiltrate is more exhausted in p-aKO tumors.
         b. Flow cytometric analysis to further characterize the myeloid compartment is incomplete (single replicate) and does not strengthen the argument that p-aKO TME is more immunosuppressive.
         c. It could, however, strengthen the argument that TIL has less anti-tumor potential if effector molecule expression in CD8+ infiltrating cells were quantified.

      (4) Inhibition of PD1/PD-L1 checkpoint leads to elimination of most p-αKO tumors
         a. It is reasonable to conclude that p-aKO tumors are responsive to immune checkpoint blockade. However, there is no data presented to support the statement that checkpoint blockade reactivates an existing anti-tumor CD8+ T cell response and does not instead induce a de novo response.
         b. The discussion of these data implies that anti-PD-1 would not improve aKO tumor control, but these data are not included. As such, it is difficult to compare the therapeutic response in aKO versus p-aKO. Further, these data are at best an indirect comparison of the T cell responsiveness against tumor, as the only direct comparison is infiltrating cell count in Figure 4 and there are no public TCR clones with confirmed anti-tumor specificity to follow in the aKO versus p-aKO response.

    4. Reviewer #2 (Public Review):

      Summary:

      Pancreatic ductal adenocarcinoma is generally considered a "cold" tumor type with little T cell infiltration. This group demonstrated previously that deletion of the PIK3CA isoform of PI3K in the orthotopic pancreatic ductal adenocarcinoma KPC mouse tumor model led to the elimination of tumors by T cells. Here they performed a genome-wide gene-deletion screen in this tumor using CRISPR to determine what was required for this T cell-mediated infiltration and tumor rejection. Deletion of Pccb in the tumors, which encodes propionyl-CoA carboxylase subunit B, allowed for the outgrowth of the PIK3CA-deleted KPC tumors. This was confirmed with the specific deletion of Pccb in the tumor cells. Demonstrating a likely role in tumor progression in human patients as well, high expression of PCCB in pancreatic ductal adenocarcinoma correlated with lower patient survival. T cells still infiltrated these tumors, but had much higher expression of exhaustion markers. Blockade of PD-1 signaling allowed for the rejection of these tumors. While these are intriguing data demonstrating that loss of PCCB by pancreatic ductal adenocarcinoma is a mechanism to escape T cell immunity, the mechanism by which this occurs is not determined. In addition, there are a few issues that suggest the conclusions of the manuscript should be tempered.

      Strengths:

      In vivo analysis of tumor CRISPR deletion screen.

      The study describes a possible novel mechanism by which a tumor maintains a "cold" microenvironment.

      Weaknesses:

      (1) A major issue is that it seems these data are based on the use of a single tumor cell clone with PIK3CA deleted. Therefore, there could be other changes in this clone in addition to the deletion of PIK3CA that could contribute to the phenotype.

      (2) The conclusion that the change in the PCCB-deficient tumor cell line is unrelated to mitochondrial metabolic changes may be incorrect based on the data provided. While it is true that in the experiments performed, there was no statistically significant change in the oxygen consumption rate or metabolite levels, this could be due to experimental error. There is a trend in the OCR being higher in the PCCB-deficient cells, although due to a high standard deviation, the change is not statistically significant. There is also a trend for there being more aKG in this cell line, but because there were only 3 samples per cell line, there is no statistically significant difference.

      (3) More data are required to support the authors' conclusion that there are myeloid changes in the PCCB-deficient tumors. Flow data are shown from only one tumor of each type.

      (4) The previously published study demonstrated increased MHC and CD80 expression in the PIK3CA-deficient tumors and these differences were suggested to be the reason the tumors were rejected. However, no data concerning the levels of these proteins were provided in the current manuscript.

    1. Reviewer #1 (Public Review):

      Summary:

      In this manuscript, the authors delineate the crucial role of the SIRT2-ACSS2 axis in ACSS2 degradation. They demonstrate that SIRT2 acts as an ACSS2 deacetylase specifically under nutrient stress conditions, notably during amino acid deficiency. The SIRT2-mediated deacetylation of ACSS2 at K271 consequently triggers its proteasomal degradation. Additionally, they illustrate that acetylation of ACSS2 at K271 enhances ACSS2 protein levels, thereby promoting de novo lipogenesis.

      Strengths:

      The findings presented in this manuscript are clearly interesting.

      Weaknesses:

      Further support is required for the model put forward by the authors.

    2. eLife assessment

      This useful study describes a role for acetylation in controlling the stability of acetyl-CoA synthetase 2, which converts acetate to acetyl-CoA for de novo lipid synthesis. While some aspects of the study are solid, the overall evidence supporting these findings is incomplete. Including additional critical controls for protein levels and stability and extending the findings to additional cell lines will strengthen the study. This work will be of interest to researchers studying lipid metabolism and related diseases.

    3. Reviewer #2 (Public Review):

      Summary:

      Karim et al. investigated the regulation of ACSS2 by SIRT2. The authors identified a previously undescribed acetylation site that they then show is important for the regulation and stability of ACSS2 in cells. The authors show that ACSS2 ubiquitination and degradation by the proteasome is regulated by SIRT2-mediated deacetylation of ACSS2 and that stabilizing ACSS2 by blocking SIRT2 can alter lipid accumulation in adipocytes.

      Strengths:

      Identification of a novel acetylation site on ACSS2 that regulates its protein stability and that has consequences on its activity in adipocytes. Multiple standard approaches were used to manipulate the expression and function of SIRT2 and ACSS2 (i.e., overexpression, knockdown, inhibitors).

      Weaknesses:

      The authors do not show direct deacetylation of ACSS2 by SIRT2 in an in vitro biochemical assay.

      It would have been nice to have included a bona-fide SIRT2 target as a control throughout the study.

      Throughout the manuscript, normalizing the data to 1 and then comparing the fold-change using a t-test is not the best statistical approach in that situation since every normalized value for control is 1 with zero standard deviation. The authors should consider an alternative statistical approach.
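
      The statistical concern above can be illustrated with a minimal sketch. The intensities below are made-up numbers, not data from the manuscript, and the one-sample test on log2 fold changes is only one possible alternative, named here as an assumption rather than as the approach the authors should necessarily adopt. Once each experiment is divided by its own control, every control value equals 1 and the control group carries no variance, so the test can instead ask whether the log fold changes differ from 0.

```python
import math
import statistics

# Hypothetical raw band intensities from three independent experiments
# (illustrative numbers only, not data from the manuscript).
control = [1200.0, 950.0, 1430.0]
treated = [2100.0, 1800.0, 2500.0]

# Normalizing each experiment to its own control forces every control value to 1.
norm_control = [c / c for c in control]
norm_treated = [t / c for t, c in zip(treated, control)]

# The normalized control group therefore has a standard deviation of zero.
control_sd = statistics.stdev(norm_control)

# Possible alternative: one-sample t statistic on log2 fold changes, tested against 0.
log_fc = [math.log2(r) for r in norm_treated]
n = len(log_fc)
t_stat = statistics.mean(log_fc) / (statistics.stdev(log_fc) / math.sqrt(n))
df = n - 1

print(f"SD of normalized controls: {control_sd}")
print(f"t = {t_stat:.2f} with {df} degrees of freedom")
```

      The t statistic would then be compared against a t distribution with n - 1 degrees of freedom; that lookup is omitted here to keep the sketch within the standard library.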

      Though not necessary, using 13C-acetate or D3-acetate tracing would be better for understanding the impact of acetylation on the activity of ACSS2 and its impact on lipogenesis.

      Did the authors also consider investigating SIRT1 in their assays? SIRT1 activates ACSS2 while SIRT2 leads to degradation of ACSS2. They should at least discuss these seemingly opposing roles of SIRT1 and SIRT2 in the regulation of ACSS2 and acetate metabolism in more depth, particularly as it concerns situations (i.e., diseases, pathologies) where either SIRT1, SIRT2, or both sirtuins, are active. This would enhance the significance of the findings to the broader research community.

      In Figure 3, the authors should consider immunoblotting for endogenous ACSS2 throughout the differentiation and lipogenesis study, since the total ACSS2 level is the crucial factor affecting acetate-dependent promotion of lipogenesis in adipocytes, and to confirm TM-dependent stabilization of ACSS2 in that assay.

      Do the authors have any data proving the K271 mutants of ACSS2 are still functional? Or that K271 ACSS2 protein is folded correctly?

    4. Reviewer #3 (Public Review):

      Summary:

      The manuscript shows that SIRT2 can regulate acetylation of ACSS2 at residue K271 and that acetylation of K271 protects ACSS2 from proteasomal degradation in a SIRT2-dependent manner. Lastly, the authors show that ACSS2 acetylation at K271 promotes lipid accumulation.

      Strengths:

      The authors provide solid data showing that ACSS2 acetylation can be regulated by targeting SIRT2 and that SIRT2 regulates ACSS2 ubiquitination. They identify K271 as a site of acetylation and show that mutating this site alters SIRT2-mediated ubiquitination.

      Weaknesses:

      However, the data for this manuscript seem preliminary: nearly all experiments were performed in one cell line, some of the conclusions are not well supported by the data, and the overall role of ACSS2 K271 acetylation is not well characterized.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this manuscript, Bell et al. provide an exhaustive and clear description of the diversity of a new class of predicted type IV restriction systems that the authors denote as CoCoNuTs, for their characteristic presence of coiled-coil segments and nuclease tandems. Along with a comprehensive analysis that includes phylogenetics, protein structure prediction, extensive protein domain annotations, and an in-depth investigation of encoding genomic contexts, they also provide detailed hypotheses about the biological activity and molecular functions of the members of this class of predicted systems. This work is highly relevant: it underscores the wide diversity of defence systems that are used by prokaryotes and demonstrates that there are still many systems to be discovered. The work is sound and backed up by a clear and reasonable bioinformatics approach. I do not have any major issues with the manuscript, but only some minor comments.

      Strengths:

      The analysis provided by the authors is extensive and covers the three most important aspects that can be covered computationally when analysing a new family/superfamily: phylogenetics, genomic context analysis, and protein-structure-based domain content annotation. With this, one can directly have an idea about the superfamily of the predicted system and infer their biological role. The bioinformatics approach is sound and makes use of the most current advances in the fields of protein evolution and structural bioinformatics.

      Weaknesses:

      It is not clear how coiled-coil segments were assigned if only based on AF2-predicted models or also backed by sequence analysis, as no description is provided in the methods. The structure prediction quality assessment is based solely on the average pLDDT of the obtained models (with a threshold of 80 or better). However, this is not enough, particularly when multimeric models are used. The PAE matrix should be used to evaluate relative orientations, particularly in the case where there is a prediction that parts from 2 proteins are interacting. In the case of multimers, interface quality scores, such as the ipTM or pDockQ, should also be considered and, at minimum, reported.

      A description of the coiled-coil predictions has been added to the Methods. For multimeric models, PAE matrices and ipTM+pTM scores have been included in Supplementary Data File S1.
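
      As a rough sketch of why the two kinds of score are worth reporting side by side: mean pLDDT summarizes per-residue local confidence, whereas AlphaFold-Multimer ranks multimeric models by the weighted combination 0.8·ipTM + 0.2·pTM, which is sensitive to the predicted interface. The class, function names and example values below are hypothetical and are not taken from Supplementary Data File S1; the pLDDT cut-off of 80 is the one quoted in the review.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class ModelScores:
    """Confidence scores for one predicted model (hypothetical container)."""
    plddt: List[float]  # per-residue pLDDT values, 0-100
    ptm: float          # predicted TM-score, 0-1
    iptm: float         # interface predicted TM-score, 0-1


def mean_plddt(model: ModelScores) -> float:
    """Average per-residue pLDDT of the model."""
    return mean(model.plddt)


def passes_plddt(model: ModelScores, threshold: float = 80.0) -> bool:
    """True if the average pLDDT reaches the threshold quoted in the review."""
    return mean_plddt(model) >= threshold


def multimer_confidence(model: ModelScores) -> float:
    """AlphaFold-Multimer ranking confidence: 0.8 * ipTM + 0.2 * pTM."""
    return 0.8 * model.iptm + 0.2 * model.ptm


# Illustrative values only: locally confident chains with an uncertain interface.
example = ModelScores(plddt=[85.0, 90.0, 78.0, 88.0], ptm=0.70, iptm=0.40)

print(f"mean pLDDT: {mean_plddt(example):.2f}")
print(f"passes pLDDT threshold: {passes_plddt(example)}")
print(f"0.8*ipTM + 0.2*pTM: {multimer_confidence(example):.2f}")
```

      In this made-up case the model clears the pLDDT threshold while its interface-weighted confidence stays low, which is the situation the reviewer's comment is concerned with.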

      Reviewer #2 (Public Review):

      Summary:

      In this work, using in-depth computational analysis, Bell et al. explore the diverse repertoire of type IV McrBC modification-dependent restriction systems. The prototypical two-component McrBC system has been structurally and functionally characterised and is known to act as a defence by restricting phage and foreign DNA containing methylated cytosines. Here, the authors find previously unanticipated complexity and versatility of these systems and focus on detailed analysis and classification of a distinct branch, the so-called CoCoNuT, named after its composition of coiled-coil structures and tandem nucleases. These CoCoNuT systems are predicted to target RNA as well as DNA and to utilise defence mechanisms with some similarity to type III CRISPR-Cas systems.

      Strengths:

      This work is enriched with a plethora of ideas and a myriad of compelling hypotheses that now await experimental verification. The study comes from the group that was amongst the first to describe, characterize, and classify CRISPR-Cas systems. By analogy, the findings described here can similarly promote ingenious experimental and conceptual research that could further drive technological advances. It could also instigate vigorous scientific debates that will ultimately benefit the community.

      Weaknesses:

      The multi-component systems described here function in the context of large oligomeric complexes. Some of the single chain AF2 predictions shown in this work are not compatible, for example, with homohexameric complex formation due to incompatible orientation of domains. The recent advances in protein structure prediction, in particular AlphaFold2 (AF2) multimer, now allow us to confidently probe potential protein-protein interactions and protein complex formation. This predictive power could be exploited here to produce a better glimpse of these multimeric protein systems. It can also provide a more sound explanation for some of the observed differences amongst different McrBC types.

      Hexameric CnuB complexes with CnuC stimulatory monomers for Type I-A, I-B, I-C, II, and III-A CoCoNuT systems have been modeled with AF2 and included in Supplementary Data File S1, albeit without the domains fused to the GTPase N-terminus (with the exception of Type I-B, which lacks the long coiled-coil domain fused to the GTPase and was modeled with its entire sequence). Attempts to model the other full-length CnuB hexamers did not lead to convincing results.

      Recommendations for the authors:

      Reviewing Editor:

      The detailed recommendations by the two reviewers will help the authors to further strengthen the manuscript, but two points seem particularly worth considering: 1. The methods are barely sketched in the manuscript, but it could be useful to detail them more closely. Particularly regarding the coiled-coil segments, which currently serve as mere extras, useful mainly for naming the family, more detail on their prediction, structural properties, and purpose would be very helpful. 2. Due to its encyclopedic nature, the wealth of material presented in the paper makes it hard to penetrate in one go. Any effort to make it more accessible would be very welcome. Reviewer 1 in particular has made a number of suggestions regarding the figures, which would make them provide more support for the findings described in the text.

      A description of the techniques used to identify coiled-coil segments has been added to the Methods. Our predictions ranged from near certainty in the coiled-coils detected in CnuB homologs, to shorter helices at the limit of detection in other factors. We chose to report all probable coiled-coils, as the extensive coiled-coils fused to CnuB, which are often the only domain present other than the GTPase, imply involvement in mediating complex formation by interacting with coiled-coils in other factors, particularly the other CoCoNuT factors. The suggestions made by Reviewer 1 were thoughtful and we made an effort to incorporate them.

      Reviewer #1 (Recommendations For The Authors):

      I do not have any major issues with the manuscript. I have however some minor comments, as described below.

      • The last sentence of the abstract at first reads as a fact and not a hypothesis resulting from the work described in the manuscript. After the second read, I noticed the nuances in the sentence. I would suggest a rephrasing to emphasize that the activity described is a theoretical hypothesis not backed up by experiments.

      This sentence has been rephrased to make explicit the hypothetical nature of the statement.

      • In line 64, the authors rename DUF3578 as ADAM because indeed its function is not unknown. Did the authors consider reaching out to InterPro to add this designation to this DUF? A search in InterPro with DUF3578 results in "MrcB-like, N-terminal domain" and if a name is suggested, it may be worthwhile to take it to the InterPro team.

      We will suggest this nomenclature to InterPro.

      • I find Figure 1E hard to analyse and think it occupies too much space for the information it provides. The color scheme, the large number of small slices, and the lack of numbers make its information content very small. I would suggest moving this to the supplementary and making it a bar plot instead. If removed from Figure 1, more space is made available for the other panels, particularly the structural superpositions, which in my opinion are much more important.

      We have removed Figure 1E from the paper as it adds little information beyond the abundance and phyletic distribution of sequenced prokaryotes, in which McrBC systems are plentiful.

      • In Figure 2, it is not clear due to the presence of many colorful "operon schemes" that the tree is for a single gene and not for the full operon segment. Highlighting the target gene in the operons or signalling it somehow would make the figure easy to understand even in the absence of the text and legend. The same applies to Supplementary Figure 1.

      The legend has been modified to show more clearly that this is a tree of McrB-like GTPases.

      • In line 146, the authors write "AlphaFold-predicted endonuclease fold" to say that a protein contains a region that AF2 predicts to fold like an endonuclease. This is a weird way of writing it and can be confusing to non-expert readers. I would suggest rephrasing for increased clarity.

      This sentence has been rephrased for greater clarity.

      • In line 167, there is a [47]. I believe this is probably due to a previous reference formatting.

      Indeed, this was a reference formatting error and has been fixed.

      • In most figures, the color palette and the use of very similar color palettes for taxonomy pie charts, genomic context composition schemes, and domain composition diagrams make it really hard to have a good understanding of the image at first. Legends are often close to each other, and it is not obvious at first which belong to what. I would suggest changing the layouts and maybe some color schemes to make it easier to extract the information that these figures want to convey.

      It seemed that Figure 4 was the most glaring example of these issues, and it has been rearranged for easier comprehension.

      • In the paragraph that starts at line 199, the authors mention an Ig-like domain that is often found at the N-terminus of Type I CoCoNuTs. Are they all related to each other? How conserved are these domains?

      These domains are all predicted to adopt a similar beta-sandwich fold and are found at the N-terminus of most CoCoNuT CnuC homologs, suggesting they are part of the same family, but we did not undertake a more detailed sequence-based analysis of these regions.

      We also find comparable domains in the CnuC/McrC-like partners of the abundant McrB-like NxD motif GTPases that are not part of CoCoNuT systems, and given the similarity of some of their predicted structures to Rho GDP-dissociation inhibitor 1, we suspect that they have coevolved as regulators of the non-canonical NxD motif GTPase type. Our CnuBC multimer models showing consistent proximity between these domains in CnuC and CnuB GTPase domains suggest this could indeed be the case. We plan to explore these findings further in a forthcoming publication.

      • In line 210, the authors write "suggesting a role in overcrowding-induced stress response". Why so? In all other cases, the authors justify their hypothesis, which I really appreciated, but not here.

      A supplementary note justifying this hypothesis has been added to Supplementary Data File S1.

      • At the end of the paragraph that starts in line 264, the authors mention that they constructed AF2 multimeric models to predict if 2 proteins would interact. However, no quality scores were provided, particularly the PAE matrix. This would allow for a better judgement of this prediction, and I would suggest adding the PAE matrix as another panel in the figure where the 3D model of the complex is displayed.

      The PAE matrix and ipTM+pTM scores for this and other multimer models have been added to Supplementary Data File S1. For this model in particular, the surface charge distribution of the model has been presented to support the role of the domains that have a higher PAE in RNA binding.

      • In line 306, "(supplementary data)" refers to what part of the file?

      This file has been renamed Supplementary Table S3 and referenced as such.

      • In line 464, the authors suggest that ShdA could interact with CoCoNuTs. Why not model the complex as done for other cases? what would co-folding suggest?

      As we were not able to convincingly model full-length CnuB hexamers with N-terminal coiled-coils, we did not attempt modeling of this hypothetical complex with another protein with a long coiled-coil, but it remains an interesting possibility.

      • In line 528, why and how were some genes additionally analyzed with HHPred?

      Justification for this analysis has been added to the Methods, but briefly, these genes were additionally analyzed if there were no BLAST hits or to confirm the hits that were obtained.

      • In the first section of the methods, the first and second (particularly the second) paragraphs are extremely long. I would suggest breaking them to facilitate reading.

      This change has been made.

      • In line 545, what do the authors mean by "the alignment (...) were analyzed with HHPred"?

      A more detailed description of this step has been added to the Methods.

      • The authors provide the models they produced as well as extensive supplementary tables that make their data reusable, but they do not provide the code for the automated steps, such as excising target sequence sections from multiple sequence alignments.

      The code used for these steps has been in use in our group at the NCBI for many years. It will be difficult to utilize outside of the NCBI software environment, but for full disclosure, we have included a zipped repository with the scripts and custom-code dependencies, although there are external dependencies as well such as FastTree and BLAST. In brief, it involves PSI-BLAST detection of regions with the most significant homology to one of a set of provided alignments (seals-2-master/bin/wrappers/cog_psicognitor). In this case, the reference alignments of McrB-like GTPases and DUF2357 were generated manually using HHpred to analyze alignments of clustered PSI-BLAST results. This step provided an output of coordinates defining domain footprints in each query sequence, which were then combined and/or extended using scripts based on manual analysis of many examples with HHpred (footprint_finders/get_GTPase_frags.py and footprint_finders/get_DUF2357_frags.py), then these coordinates were used to excise such regions from the query amino acid sequence with a final script (seals-2-master/bin/misc/fa2frag).
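
      The last two steps described above, combining domain footprints and excising the corresponding regions, can be sketched in a few lines. This is a hypothetical, simplified stand-in and not the authors' `fa2frag` or `get_GTPase_frags.py` scripts: the function names, the 1-based inclusive coordinate convention, and the `max_gap` parameter are all assumptions made for illustration.

```python
def merge_footprints(footprints, max_gap=0):
    """Merge (start, end) footprints that overlap or are separated by
    at most `max_gap` residues. Coordinates are assumed to be 1-based
    and inclusive (an illustrative convention, not taken from the source).
    """
    merged = []
    for start, end in sorted(footprints):
        if merged and start <= merged[-1][1] + max_gap + 1:
            # Extend the previous footprint instead of starting a new one.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


def excise(sequence, footprints):
    """Return the subsequence covered by each footprint."""
    fragments = []
    for start, end in footprints:
        if start < 1 or end > len(sequence) or start > end:
            raise ValueError(f"footprint ({start}, {end}) is out of range")
        fragments.append(sequence[start - 1:end])
    return fragments


# Toy query sequence and footprints (made up for this example).
query = "ABCDEFGHIJKLMNOPQRST"
hits = [(3, 8), (7, 12), (18, 20)]
regions = merge_footprints(hits)
print(regions)
print(excise(query, regions))
```

      Sorting before merging keeps the pass linear after the sort, and the `max_gap` argument loosely mimics extending footprints across short unaligned stretches.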

      Reviewer #2 (Recommendations For The Authors):

      (1) Page 4, line 77 - 'PUA superfamily domains' could be more appropriate to use instead of "EVE superfamily".

      While this statement could perhaps be applied to PUA superfamily domains, our previous work we refer to, which strongly supports the assertion, was restricted to the EVE-like domains and we prefer to retain the original language.

      (2) Page 5. lines 128-130 - AF2 multimer prediction model could provide a more sound explanation for these differences.

      Our AF2 multimer predictions added in this revision indeed show that the NxD motif McrB-like CoCoNuT GTPases interact with their respective McrC-like partners such that an immunoglobulin-like beta-sandwich domain, fused to the N-termini of the McrC homologs and similar to Rho GDP-dissociation inhibitor 1, has the potential to physically interact with the GTPase variants. However, we did not probe this in greater detail, as it is beyond the scope of this already highly complex article, but we plan to study it in the future.

      (3) Page 8, line 252 - The surface charge distribution of the CnuH OB fold domain looks very different from SmpB (PDB 3IYR). In fact, the regions that are in contact with RNA in SmpB are highly acidic in CoCoNuT CnuH. Although it looks likely that this domain is involved in RNA binding, the mode of interaction should be very different.

      We did not detect a strong similarity between the CnuH SmpB-like SPB domain and PDB 3IYR, but when we compare the surface charge distribution of PDB 1WJX and the SPB domain, while there is a significant area that is positively charged in 1WJX that is negatively charged in SPB, there is much that overlaps with the same charge in both domains.

      The similarity between SmpB and the SPB domain is significant, but definitely not exact. An important question for future studies is: If the domains are indeed related due to an ancient fusion of SmpB to an ancestor of CnuH, would this degree of divergence be expected?

      In other words, can we say anything about how the function of a stand-alone tmRNA-binding protein could evolve after being fused to a complex predicted RNA helicase with other predicted RNA binding domains already present? Experimental validation will ultimately be necessary to resolve these kinds of questions, but for now, it may be safe to say that the presence of this domain, especially in conjunction with the neighboring RelE-like RTL domain and UPF1-like helicase domain, signals a likely interaction with the A-site of the ribosome, and perhaps restriction of aberrant/viral mRNA.

    2. eLife assessment

      This paper marks a fundamental advance in our understanding of prokaryotic Type IV restriction systems. The authors provide an encyclopedic overview of a hitherto uncharacterized branch of these systems, which they name CoCoNuTs, for coiled-coil nuclease tandems. They provide compelling evidence that these nucleases target RNA and are part of an echeloned defense response following viral infection. This article will be of great interest to scientists studying prokaryotic immunity mechanisms, as well as broadly to protein scientists engaged in the analysis, classification, and functional annotation of the proteome of life.

    3. Reviewer #1 (Public Review):

      Summary:

      In this manuscript, Bell et al. provide an exhaustive and clear description of the diversity of a new class of predicted type IV restriction systems that the authors denote as CoCoNuTs, for their characteristic presence of coiled-coil segments and nuclease tandems. Along with a comprehensive analysis that includes phylogenetics, protein structure prediction, extensive protein domain annotations, and in-depth investigation of encoding genomic contexts, they also provide detailed hypotheses about the biological activity and molecular functions of the members of this class of predicted systems. This work is highly relevant: it underscores the wide diversity of defence systems that are used by prokaryotes and demonstrates that there are still many systems to be discovered. The work is sound and backed up by a clear and reasonable bioinformatics approach.

      Strengths:

      The analysis provided by the authors is extensive and covers the three most important aspects that can be covered computationally when analysing a new family/superfamily: phylogenetics, genomic context analysis, and protein-structure-based domain content annotation. With this, one can directly get an idea of the superfamily of the predicted system and infer its biological role. The bioinformatics approach is sound and makes use of the most current advances in the fields of protein evolution and structural bioinformatics.

      Weaknesses:

      It is not clear whether coiled-coil segments were assigned based only on AF2-predicted models or were also backed by sequence analysis, as no description is provided in the methods. The structure prediction quality assessment is based solely on the average pLDDT of the obtained models (with a threshold of 80 or better). However, this is not enough, particularly when multimeric models were used. The PAE matrix should be used to evaluate relative orientations, particularly in cases where parts of two proteins are predicted to interact. In the case of multimers, interface quality scores, such as the ipTM or pDockQ, should also be considered and, at minimum, reported.

      These weaknesses were addressed during revision, and the results provided by the authors support their conclusions. The data resulting from this work will be useful for the general life sciences community, particularly the prokaryotic defense and microbiology communities. It also underscores the large number of functional unknowns in sequenced genomes that are now much easier to find and interpret due to the success of deep-learning-based methods and robust automated bioinformatics pipelines.

    4. Reviewer #2 (Public Review):

      Summary:

      In this work, using in-depth computational analysis, Bell et al. explore the diverse repertoire of type IV McrBC modification-dependent restriction systems. The prototypical two-component McrBC system has been structurally and functionally characterised and is known to act as a defence by restricting phage and foreign DNA containing methylated cytosines. Here, the authors find previously unanticipated complexity and versatility of these systems and focus on detailed analysis and classification of a distinct branch, the so-called CoCoNuT, named after its composition of coiled-coil structures and tandem nucleases. These CoCoNuT systems are predicted to target RNA as well as DNA and to utilise defence mechanisms with some similarity to type III CRISPR-Cas systems.

      Strengths:

      This work is enriched with a plethora of ideas and a myriad of compelling hypotheses that now await experimental verification. The study comes from the group that was amongst the first to describe, characterise, and classify CRISPR-Cas systems. By analogy, the findings described here can similarly promote ingenious experimental and conceptual research that could further drive technological advances. It could also instigate vigorous scientific debates that will ultimately benefit the community.

      Weaknesses:

      The multi-component systems described here function in the context of large oligomeric complexes similarly to the prototypical McrBC system. While the AlphaFold2 (AF2) multimer predictions are provided in this work, these are not compared with the known McrBC structures. These comparisons could have been helpful not only for providing insights into these multimeric protein systems but also for giving more sound explanations of the differences observed amongst different McrBC types.

    1. eLife assessment

      The paper presents valuable insights into the success of the parasitoid Trichopria drosophilae on Drosophila suzukii, elucidating the importance of both molecular adaptations, such as specialized venom proteins and unique cell types, and ecological strategies, including tolerance of intraspecific competition and avoidance of interspecific competition. Through convincing methodological approaches, the authors demonstrate how these adaptations optimize nutrient uptake and enhance parasitic success, highlighting the intricate coordination between molecular and ecological factors in driving parasitization success.

    2. Reviewer #1 (Public Review):

      Summary:

      Major findings or outcomes include a genome for the wasp, characterization of the venom constituents and teratocyte and ovipositor expression profiles, as well as information about Trichopria ecology and parasitism strategies. It was found that Trichopria cannot discriminate among hosts by age, but can identify previously parasitized hosts. The authors also investigated whether superparasitism by Trichopria wasps improved parasitism outcomes (it did), presumably by increasing venom and teratocyte concentrations/densities. Elegant use of Drosophila ectopic expression tools allowed for functional characterization of venom components (Timps), and showed that these proteins are responsible for parasitoid-induced delays in host development. After finding that teratocytes produce a large number of proteases, experiments showed that these contribute to digestion of host tissues for parasite consumption. The discussion ties these elements together by suggesting that genes used for aiding in parasitism via different parts of the parasitism arsenal arise from gene duplication and shifts in tissue of expression (to venom glands or teratocytes).

      Strengths:

      The strength of this manuscript is that it describes the parasitism strategies used by Trichopria wasps at a molecular and behavioral level with broad strokes. It represents a large amount of work that in previous decades might have been published in several different papers. Including all of these data in a manuscript together makes for a comprehensive and interesting study.

      Weaknesses:

      The weakness is that the breadth of the study results in fairly shallow mechanistic or functional results for any given facet of Trichopria's biology. Although none of the findings are especially novel given results from other parasitoid species in previous publications, integrating results together provides significant information about Trichopria biology.

    3. Reviewer #2 (Public Review):

      Summary:

      Key findings of this research include the sequencing of the wasp's genome, identification of venom constituents and teratocytes, and examination of Trichopria drosophilae (Td)'s ecology and parasitic strategies. It was observed that Td doesn't distinguish between hosts based on age but can recognize previously parasitized hosts. The study also explored whether multiple parasitisms by Td improved outcomes, which indeed it did, possibly by increasing venom and teratocyte levels. Utilizing Drosophila ectopic expression tools, the authors functionally characterized venom components, specifically tissue inhibitors of metalloproteinases (Timps), which were found to cause delays in host development. Additionally, experiments revealed that teratocytes produce numerous proteases, aiding in the digestion of host tissues for parasite consumption. The discussion suggests that genes involved in different aspects of parasitism may arise from gene duplication and shifts in tissue expression to venom glands or teratocytes.

      Strengths:

      This manuscript provides an in-depth and detailed depiction of the parasitic strategies employed by Td wasps, spanning both molecular and behavioral aspects. It consolidates a significant amount of research that, in the past, might have been distributed across multiple papers. By presenting all this data in a single manuscript, it delivers a comprehensive and engaging study that could help future developments in the field of biological control against a major insect pest.

      Weaknesses:

      While none of the findings are particularly groundbreaking, as similar results have been reported for other parasitoid species in prior research, the integration of these results into one comprehensive overview offers valuable biological insights into an interesting new potential biocontrol species.

    1. eLife assessment

      This important study asks whether motor neurons within the vestibulo-ocular circuit of zebrafish are required to determine the identity, connectivity, and function of upstream premotor neurons. The authors provide convincing genetic, anatomical and behavioral evidence that the answer is no. This work is of general interest to developmental neurobiologists and motivates future studies of whether motor neurons are dispensable for assembly of other sensorimotor neural circuits.

    2. Reviewer #1 (Public Review):

      Summary:

      This study has as its goal to determine how the structure and function of the circuit that stabilizes gaze in the larval zebrafish depends on the presence of the output cells, the motor neurons. A major model of neural circuit development posits that the wiring of neurons is instructed by their postsynaptic cells, transmitting signals retrogradely on which cells to contact and, by extension, where to project their axons. Goldblatt et al. remove the motor neurons from the circuit by generating null mutants for the phox2a gene. The study then shows that, in this mutant that lacks the isl1-labelled extraocular motor neurons, the central projection neurons have 1) largely normal responses to vestibular input; 2) normal gross morphology; 3) minimally changed transcriptional profiles. From this, the authors conclude that the wiring of the circuit is not instructed by the output neurons, refuting the major model.

      Strengths:

      I found the manuscript to be exceptionally well-written and presented, with clear and concise writing and effective figures that highlight key concepts. The topic of neural circuit wiring is central to neuroscience, and the paper's findings will interest researchers across the field, and especially those focused on motor systems.

      The experiments conducted are clever and of a very high standard, and I liked the systematic progression of methods to assess the different potential effects of removing phox2a on circuit structure and function. Analyses (including statistics) are comprehensive and appropriate and show the authors are meticulous and balanced in most of the conclusions that they draw. Overall, the findings are interesting, and with a few tweaks, should leave little doubt about the paper's main conclusions.

      Weaknesses:

      The main point is the incomplete characterisation of the effects of removing phox2a on the extra-ocular motor neurons. Are these cells no longer there, or are they there but no longer labelled by isl1:GFP? If they are indeed removed, might they have developed early on, and subsequently lost? These questions matter as the central focus of the manuscript is whether the presence of these cells influences the connectivity and function of their presynaptic projection neurons. Therefore, for the main conclusions to be fully supported by the data, the authors would need to test whether 1) the motor neurons that otherwise would have been labelled by the isl1:GFP line are physically no longer there; 2) that this removal (if, indeed, it is that) is developmental. If these experiments are not feasible, then the text should be adjusted to take this into account. A further point to address is the context of the manipulation. If the phox2a removal does indeed take out the extra-ocular motor neurons, what percentage of postsynaptic neurons to the projection neurons are still present? In other words, how does the postsynaptic nMLF output relate to the motor neurons? If, for instance, the nMLF (which, as the authors state, are likely still innervated by the projection neurons) are the main output of the projection neurons, then this would affect the interpretation of the results.

    3. Reviewer #2 (Public Review):

      Summary:

      This study was designed to test the hypothesis that motor neurons play a causal role in circuit assembly of the vestibulo-ocular reflex circuit, which is based on the retrograde model proposed by Hans Straka. This circuit consists of peripheral sensory neurons, central projection neurons, and motor neurons. The authors hypothesize that loss of extraocular motor neurons, through CRISPR/Cas9 mutagenesis of the phox2a gene, will disrupt sensory selectivity in presynaptic projection neurons if the retrograde model is correct.

      Account of the major strengths and weaknesses of the methods and results:

      The work presented is impressive in both breadth and depth, including the experimental paradigms. Overall, the main results were that the loss of function paradigm to eliminate extraocular motor neurons did not 1) alter the normal functional connections between peripheral sensory neurons and central projection neurons, 2) affect the position of central projection neurons in the sensorimotor circuit, or 3) significantly alter the transcriptional profiles of central projection neurons. Together, these results strongly indicate that retrograde signals from motor neurons are not required for the development of the sensorimotor architecture of the vestibulo-ocular circuit.

      Appraisal of whether the authors achieved their aims, and whether the results support their conclusions:

      The results of this study showed that extraocular motor neurons were not required for central projection neuron specification in the vestibulo-ocular circuit, which countered the prevailing retrograde hypothesis proposed for circuit assembly. A concern is that the results presented may be limited to this specific circuit and may not be generalizable to other circuit assemblies, even to other sensorimotor circuits.

      Discussion of the likely impact of the work on the field, and the utility of the methods and data to the community:

      As mentioned above, this study provides valuable new insights into the developmental organization of the vestibulo-ocular circuit. However, different circuits likely utilize various mechanisms, extrinsic or intrinsic (or both), to establish proper functional connectivity. So, the results shown here, although they begin to explain the developmental organization of the vestibulo-ocular circuit, are not likely to be generalizable to other circuits; though this remains to be seen. At a minimum, this study provides a starting point for the examination of patterning of connections in this and other sensorimotor circuits.

    4. Reviewer #3 (Public Review):

      In this manuscript by Goldblatt et al. the authors study the development of a well-known sensorimotor system, the vestibulo-ocular reflex circuit, using Danio rerio as a model. The authors address whether motor neurons within this circuit are required to determine the identity, upstream connectivity and function of their presynaptic partners, central projection neurons. They approach this by generating a CRISPR-mediated knockout line for the transcription factor phox2a, which specifies the fate of extraocular muscle motor neurons. After showing that phox2a knockout ablates these motor neurons, the authors show that functionally, morphologically, and transcriptionally, projection neurons develop relatively normally.

      Overall, the authors present a convincing argument for the dispensability of motor neurons in the wiring of this circuit, although their claims about the generalizability of their findings to other sensorimotor circuits should be tempered. The study is comprehensive and employs multiple methods to examine the function, connectivity and identity of projection neurons.

      Specific comments:

      (1) In the introduction the authors set up the controversy on whether or not motor neurons play an instructive role in determining "pre-motor fate". This statement is somewhat generic and a bit misleading as it is generally accepted that many aspects of interneuron identity are motor neuron-independent. The authors might want to expand on these studies and better define what they mean by "fate", as it is not clear whether the studies they are citing in support of this hypothesis actually make that claim.

      (2) Although it appears unchanged from their images, the authors do not explicitly quantitate the number of total projection neurons in phox2a knockouts.

      (3) For figures 2C and 3C, please report the proportion of neurons in each animal, either showing individual data points here or in a separate supplementary figure; and please perform and report the results of an appropriate statistical test.

      (4) In the topographical mapping of calcium responses (figures 2D, E and 3D), the authors say they see no differences but this is hard to appreciate based on the 3D plotting of the data. Quantitating the strength of the responses across the three axes shown individually and including statistical analyses would help make this point, especially since the plots look somewhat qualitatively different.

      (5) The transcriptional analysis is very interesting, however, it is not clear why it was performed at 72 hpf, while functional experiments were performed at 5 days. Is it possible that early aspects of projection neuron identity are preserved, while motor neuron-dependent changes occur later? The authors should better justify and discuss their choice of timepoint. The inclusion of heterozygotes as controls is problematic, given that the authors show there are notable differences between phox2a+/+ and phox2a+/- animals; pooling these two genotypes could potentially flatten differences between controls and phox2a-/-.

      (6) Projection neurons appear to be topographically organized and this organization is maintained in the absence of motor neurons. Are there specific genes that delineate ventral and dorsal projection neurons? If so, the authors should look at those as candidate genes as they might be selectively involved in connectivity. Showing that generic synaptic markers (Figure 4E) are maintained in the entire population is not convincing evidence that these neurons would choose the correct synaptic partners.

    1. eLife assessment

      This is a fundamental study that addresses the key question of how the tetraspanin Tspan12 functions biochemically as a co-receptor for Norrin to initiate β-catenin signaling. The strength of the work lies in the rigorous and compelling binding analyses involving various purified receptors, co-receptors, and ligands, as well as molecular modeling by AlphaFold that was subsequently validated by an extensive series of mutagenesis experiments. The study advances the field by providing a novel mechanism of co-receptor function and shedding new light on how signaling specificity is achieved in the complex Wnt/Norrin signaling system.

    2. Reviewer #1 (Public Review):

      Though the Norrin protein is structurally unrelated to the Wnt ligands, it can activate the Wnt/β-catenin pathway by binding to the canonical Wnt receptors Fzd4 and Lrp5/6, as well as the tetraspanin Tspan12 co-receptor. Understanding the biochemical mechanisms by which Norrin engages Tspan12 to initiate signaling is important, as this pathway plays an important role in regulating retinal angiogenesis and maintaining the blood-retina-barrier. Numerous mutations in this signaling pathway have also been found in human patients with ocular diseases. The overarching goal of the study is to define the biochemical mechanisms by which Tspan12 mediates Norrin signaling. Using purified Tspan12 reconstituted in lipid nanodiscs, the authors conducted detailed binding experiments to document the direct, high-affinity interactions between purified Tspan12 and Norrin. To further model this binding event, they used AlphaFold to dock Norrin and Tspan12 and identified four putative binding sites. They went on to validate these sites through mutagenesis experiments. Using the information obtained from the AlphaFold modeling and through additional binding competition experiments, it was further demonstrated that Tspan12 and Fzd4 can bind Norrin simultaneously, but Tspan12 binding to Norrin is competitive with other known co-receptors, such as HSPGs and Lrp5/6. Collectively, the authors proposed that the main function of Tspan12 is to capture low concentrations of Norrin at the early stage of signaling, and then "hand over" Norrin to Fzd4 and Lrp5/6 for further signal propagation. Overall, the study is comprehensive and compelling, and the conclusions are well supported by the experimental and modeling data.

      Strengths:

      • Biochemical reconstitution of Tspan12 and Fzd4 in lipid nanodiscs is an elegant approach for testing the direct binding interaction between Norrin and its co-receptors. The proteins used for the study seem to be of high purity and quality.

      • The various binding experiments presented throughout the study were carried out rigorously. In particular, BLI allows accurate measurement of equilibrium binding constants as well as on and off rates.

      • It is nice to see that the authors followed up on their AlphaFold modeling with an extensive series of mutagenesis studies to experimentally validate the potential binding sites. This adds credence to the AlphaFold models.

      • Table S1 is a further testament to the rigor of the study.

      • Overall, the study is comprehensive and compelling, and the conclusions are well supported by the experimental and modeling data.

      Suggestions for improvement:

      • It would be helpful to show Coomassie-stained gels of the key mutant Norrin and Tspan12 proteins presented in Figures 2E and 2F.

      • Many Norrin and Tspan12 mutations have been identified in human patients with FEVR. It would be interesting to comment on whether any of the mutations might affect the Norrin-Tspan12 binding sites described in this study.

      • Some of the negative conclusions (e.g. the lack of involvement of Tspan12 in the formation of the Norrin-Lrp5/6-Fzd4-Dvl signaling complex) can be difficult to interpret. There are many possible reasons as to why certain biological effects are not recapitulated in a reconstitution experiment. For instance, the recombinant proteins used in the experiment may not be presented in the correct configurations, and certain biochemical modifications, such as phosphorylation, may also be missing.

    3. Reviewer #2 (Public Review):

      This is an interesting study of high quality with important and novel findings. Bruguera et al. report a biochemical and structural analysis of the Tspan12 co-receptor for Norrin. Major findings are that Norrin directly binds Tspan12 with high affinity (this is consistent with a report on bioRxiv: Antibody Display of cell surface receptor Tetraspanin12 and SARS-CoV-2 spike protein) and a predicted structure of Tspan12 alone or in complex with Norrin. The Norrin/Tspan12 binding interface is largely verified by mutational analysis. An interaction of the Tspan12 large extracellular loop (LEL) with Fzd4 cannot be detected, and interactions of full-length Tspan12 and Fzd4 cannot be tested using nanodisc-based BLI; however, Fzd4/Tspan12 heterodimers can be purified and inserted into nanodiscs when aided by split GFP tags. An analysis of a potential composite binding site of a Fzd4/Tspan12 complex is somewhat inconclusive, as no major increase in affinity is detected for the complex compared to the individual components. A caveat to these data is that affinity measurements were performed for complexes with approximately 1 molecule each of Tspan12 and Fzd4 per nanodisc, while the composite binding site could potentially be formed only in higher-order complexes, e.g., 2:2 Fzd4/Tspan12 complexes. Interestingly, the authors find that the Norrin/Tspan12 binding site and the Norrin/Lrp6 binding site partially overlap and that the Lrp6 ectodomain competes with Tspan12 for Norrin binding. This result leads the authors to propose a model according to which Tspan12 captures Norrin and then has to "hand it off" to allow for Fzd4/Lrp6 complex formation. By increasing the local concentration of Norrin, Tspan12 would enhance the formation of the Fzd4/Lrp5 or Fzd4/Lrp6 complex.

      The experiments based on membrane proteins inserted into nano-discs and the structure prediction using AlphaFold yield important new insights into a protein complex that has critical roles in normal CNS vascular biology, retinal vascular disease, and is a target for therapeutic intervention. However, it remains unclear how Norrin would be "handed off" from Tspan12 or Tspan12/Fzd4 complexes to Fzd4/Lrp6 complexes, as the relatively high affinity of Norrin to Fzd4/Tspan12 dimers likely does not favor the "handing off" to Fzd4/Lrp6 complexes.

      Areas that would benefit from further experiments, or a discussion, include:

      - The authors test a potential composite binding site of Fzd4/Tspan12 heterodimers for norrin using nanodiscs that contain on average about 1 molecule Fzd4 and 1 molecule Tspan12. The Fzd4/Tspan12 heterodimer is co-inserted into the nanodiscs supported by split-GFP tags on Fzd4 and Tspan12. The authors find no major increase in affinity, although they find changes to the Hill slope, reflecting better binding of norrin at low norrin concentrations. In 293F cells overexpressing Fzd4 and Tspan12 (which may result in a different stoichiometry) they find more pronounced effects of norrin binding to Fzd4/Tspan12. This raises the possibility that the formation of a composite binding requires Fzd4/Tspan12 complexes of higher order, for example, 2:2 Fzd4/Tspan12 complexes, where the composite binding site may involve residues of each Fzd4 and Tspan12 molecule in the complex. This could be tested in nanodiscs in which Fzd4 and Tspan12 are inserted at higher concentrations or using Fzd4 and Tspan12 that contain additional tags for oligomerization.

      - While Tspan12 LEL does not bind to Fzd4, the successful reconstitution of GFP from Tspan12 and Fzd4 tagged with split GFP components provides evidence for Fzd4/Tspan12 complex formation. As a negative control, e.g., Fzd5, or Tspan11 with split GFP tags (Fzd5/Tspan12 or Fzd4/Tspan11) would clarify if FZD4/Tspan12 heterodimers are an artefact of the split GFP system.

      - Fzd4/Tspan12 heterodimers stabilized by split GFP may be locked into an unfavorable orientation that does not allow for the formation of a composite binding site of Fzd4 and Tspan12. This is another caveat for the interpretation that Fzd4/Tspan12 do not form a composite binding site, and it is not discussed.

      - Mutations that affect the affinity of Norrin/Fzd4 are not used to further test if Fzd4 and Tspan12 form a composite binding site. Norrin R41E or Fzd4 M105V were previously reported to reduce Norrin/Frizzled4 interactions and signaling, and both interaction and signaling were restored by Tspan12 (Lai et al. 2017). Whether a Fzd4/Tspan12 heterodimer has increased affinity for Norrin R41E was not tested. Similarly, the affinity of Fzd4 M105V vs. a Fzd4 M105V/Tspan12 heterodimer was not tested.

      - An important conclusion of the study is that Tspan12 or Lrp6 binding to Norrin is mutually exclusive. This could be corroborated by an experiment in which LRP5/6 is inserted into nanodiscs for BLI binding tests with Norrin, or Tspan12 LEL, or a combination of both. Soluble LRP6 may remove norrin from equilibrium binding/unbinding to Tspan12, therefore presenting LRP6 in a non-soluble form may yield different results.

      - The authors use LRP6 instead of LRP5 for their experiments. Tspan12 is less effective in increasing the Norrin/Fzd4/Lrp6 signaling amplitude compared to Norrin/Fzd4/Lrp5 signaling, and human genetic evidence (FEVR) implicates LRP5, not LRP6, in Norrin/Frizzled4 signaling. The authors find that Norrin binding to LRP6 and Tspan12 is mutually exclusive, however this may not be the case for Lrp5.

      - The biochemical data are largely not correlated with functional data. The authors suggest that the Norrin R115L FEVR mutation could be due to reduced norrin binding to tspan12, but do not test if Tspan12-mediated enhancement of the norrin signaling amplitude is reduced by the R115L mutation. Similarly, the impressive restoration of binding by charge reversal mutations in site 3 is not corroborated in signaling assays.

    4. Reviewer #3 (Public Review):

      Bruguera et al. present an impressive series of biochemical experiments that address the question of how Tspan12 acts to promote signaling by Norrin, a highly divergent TGF-beta family member that serves as a ligand for Fzd4 and Lrp5/6 to promote canonical Wnt signaling during CNS (and especially retinal) vascular development. The present study is distinguished from those of the past 15 years by its quantitative precision and its high-quality analyses of concentration dependencies, its use of well-characterized nanodisc-incorporated membrane proteins and various soluble binding partners, and its use of structure prediction (by AlphaFold) to guide experiments. The authors start by measuring the binding affinity of Norrin to Tspan12 in nanodiscs (~10 nM), and they then model this interaction with AlphaFold and test the predicted interface with various charge and size swap mutations. The test suggests that the prediction is approximately correct, but in one region (site 1) the experimental data do not support the model. [As noted by the authors, a failure of swap mutations to support a docking model is open to various interpretations. As AlphaFold docking predictions come increasingly into common use, the compendium of mutational tests and their interpretations will become an important object of study.] Next, the authors show that Tspan12 and Fzd4 can simultaneously bind Norrin, with modest negative cooperativity, and that together they enhance Norrin capture by cells expressing both Tspan12 and Fzd4 compared to Fzd4 alone, an effect that is most pronounced at low Norrin concentration. Similarly, at low Norrin concentration (~1 nM), signaling is substantially enhanced by Tspan12. By contrast, the authors show that LRP6 competes with Tspan12 for Norrin binding, implying a hand-off of Norrin from a Tspan12+Fzd4+Norrin complex to a LRP5/6+Fzd4+Norrin complex.

      Thanks to the authors' careful dose-response analyses, they observed that Norrin-induced signaling and Tspan12 enhancement of signaling both have bell-shaped dose-response curves, with strong inhibition at higher levels of Norrin or Tspan12. The implication is that the signaling system has been built for optimal detection of low concentrations of Norrin (most likely the situation in vivo), and that excess Tspan12 can titrate Norrin at the expense of LRP5/6 binding (i.e., reduction in the formation of the LRP5/6+Fzd4+Norrin signaling complex). In the view of this reviewer, the present work represents a foundational advance in understanding Norrin signaling and the role of Tspan12. It will also serve as an important point of comparison for thinking about signaling complexes in other ligand-receptor systems.

    1. Author Response

      The following is the authors’ response to the original reviews.

      eLife assessment

      This work provides a valuable contribution and assessment of what it means to replicate a null study finding, and what are the appropriate methods for doing so (apart from a rote p-value assessment). Through a convincing re-analysis of results from the Reproducibility Project: Cancer Biology using frequentist equivalence testing and Bayes factors, the authors demonstrate that even when reducing 'replicability success' to a single criterion, how precisely replication is measured may yield differing results. Less focus is directed to appropriate replication of non-null findings.

      Reviewer #1 (Public Review):

      Summary:

      The goal of Pawel et al. is to provide a more rigorous and quantitative approach for judging whether or not an initial null finding (conventionally with p ≥ 0.05) has been replicated by a second similarly null finding. They discuss important objections to relying on the qualitative significant/non-significant dichotomy to make this judgment. They present two complementary methods (one frequentist and the other Bayesian) which provide a superior quantitative framework for assessing the replicability of null findings.

      Strengths:

      Clear presentation; illuminating examples drawn from the well-known Reproducibility Project: Cancer Biology data set; R-code that implements suggested analyses. Using both methods as suggested provides a superior procedure for judging the replicability of null findings.

      Weaknesses:

      The proposed frequentist and the Bayesian methods both rely on binary assessments of an original finding and its replication. I'm not sure if this is a weakness or is inherent to making binary decisions based on continuous data.

      For the frequentist method, a null finding is considered replicated if the original and replication 90% confidence intervals for the effects both fall within the equivalence range. According to this approach, a null finding would be considered replicated if the p-values of both equivalence tests (original and replication) were, say, 0.049, whereas it would not be considered replicated if, for example, the equivalence test of the original study had a p-value of 0.051 and the replication had a p-value of 0.001. Intuitively, the evidence for replication would seem to be stronger in the second instance. The recommended Bayesian approach similarly relies on a dichotomy (e.g., Bayes factor > 1).

      Thanks for the suggestions, we now emphasize more strongly in the “Methods for assessing replicability of null results” and “Conclusions” sections that both TOST p-values and Bayes factors are quantitative measures of evidence that do not require dichotomization into “success” or “failure”.
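
      To make the criterion discussed above concrete, the following is a minimal sketch of a two one-sided tests (TOST) p-value under a normal approximation. The function names, the normal approximation, and the example numbers are illustrative assumptions and are not taken from the authors' R code. With a level of 0.05, requiring the TOST p-value to be at most 0.05 is equivalent to requiring the 90% confidence interval to lie inside the equivalence range.

```python
from statistics import NormalDist


def tost_p_value(estimate, se, margin):
    """TOST p-value for H0: |theta| >= margin (normal approximation).

    estimate: effect estimate (e.g., an SMD); se: its standard error;
    margin: half-width of the equivalence range (-margin, +margin).
    """
    z = NormalDist()
    # One-sided test of H0: theta <= -margin (small p if estimate is well above -margin).
    p_lower = 1 - z.cdf((estimate + margin) / se)
    # One-sided test of H0: theta >= +margin (small p if estimate is well below +margin).
    p_upper = z.cdf((estimate - margin) / se)
    # Equivalence is declared only if both one-sided tests reject.
    return max(p_lower, p_upper)


def both_equivalent(p_original, p_replication, alpha=0.05):
    """Dichotomous criterion: both studies must establish equivalence."""
    return p_original <= alpha and p_replication <= alpha


# Reviewer #1's example: the dichotomy treats (0.049, 0.049) as a success
# and (0.051, 0.001) as a failure, although the p-values themselves
# remain available as quantitative measures of evidence.
print(both_equivalent(0.049, 0.049))
print(both_equivalent(0.051, 0.001))
```

      Reporting the two TOST p-values themselves, rather than only the output of `both_equivalent`, is one way to avoid the dichotomization that the reviewer raises.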

      Reviewer #2 (Public Review):

      Summary:

      The study demonstrates how inconclusive replications of studies initially with p > 0.05 can be and employs equivalence tests and Bayes factor approaches to illustrate this concept. Interestingly, the study reveals that achieving a success rate of 11 out of 15, or 73%, as was accomplished with the non-significance criterion from the RPCB (Reproducibility Project: Cancer Biology), requires unrealistic margins of Δ > 2 for equivalence testing.

      Strengths:

      The study uses reliable and shareable/open data to demonstrate its findings, sharing as well the code for statistical analysis. The study provides sensitivity analysis for different scenarios of equivalence margin and alpha level, as well as for different scenarios of standard deviations for the prior of Bayes factors and different thresholds to consider. All analyses and code of the work are open and can be replicated. As well, the study demonstrates on a case-by-case basis how the different criteria can diverge, regarding one sample of a field of science: preclinical cancer biology. It also explains clearly what Bayes factors and equivalence tests are.

      Weaknesses:

      It would be interesting to investigate whether using Bayes factors and equivalence tests in addition to p-values results in a clearer scenario when applied to replication data from other fields. As mentioned by the authors, the Reproducibility Project: Experimental Philosophy (RPEP) and the Reproducibility Project: Psychology (RPP) have data attempting to replicate some original studies with null results. While the RPCB analysis yielded a similar picture when using both criteria, it is worth exploring whether this holds true for RPP and RPEP. Considerations for further research in this direction are suggested. Even if the original null results were excluded in the calculation of an overall replicability rate based on significance, sensitivity analyses considering them could have been conducted. The present authors can demonstrate replication success using the significance criteria in these two projects with initially p < 0.05 studies, both positive and non-positive.

      Other comments:

      • Introduction: The study demonstrates how inconclusive replications of studies initially with p > 0.05 can be and employs equivalence tests and Bayes factor approaches to illustrate this concept. Interestingly, the study reveals that achieving a success rate of 11 out of 15, or 73%, as was accomplished with the non-significance criterion from the RPCB (Reproducibility Project: Cancer Biology), requires unrealistic margins of Δ > 2 for equivalence testing.

      • Overall picture vs. case-by-case scenario: An interesting finding is that the authors observe that in most cases, there is no substantial evidence for either the absence or the presence of an effect, as evidenced by the equivalence tests. Thus, using both suggested criteria results in a picture similar to the one initially raised by the paper itself. The work done by the authors highlights additional criteria that can be used to further analyze replication success on a case-by-case basis, and I believe that this is where the paper's main contributions lie. Despite not changing the overall picture much, I agree that the p-value criterion by itself does not distinguish between (1) a situation where the original study had low statistical power, resulting in a highly inconclusive non-significant result that does not provide evidence for the absence of an effect and (2) a scenario where the original study was adequately powered, and a non-significant result may indeed provide some evidence for the absence of an effect when analyzed with appropriate methods. Equivalence testing and Bayes factor approaches are valuable tools in both cases.

      Regarding the 0.05 threshold, the choice of the prior distribution for the SMD under the alternative H1 is debatable, and this also applies to the equivalence margin. Sensitivity analyses, as highlighted by the authors, are helpful in these scenarios.

      Thank you for the thorough review and constructive feedback. We have added an additional “Appendix C: Null results from the RPP and EPRP” that shows equivalence testing and Bayes factor analyses for the RPP and EPRP null results.
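
      The dependence on the prior mentioned above can be illustrated with a minimal sketch of a Bayes factor contrasting H0: θ = 0 with H1: θ ~ N(0, τ²), for an effect estimate that is approximately normal with known standard error. The function name, the default prior standard deviation, and the example numbers are assumptions chosen for illustration and are not taken from the authors' code. Under these assumptions, the Bayes factor is the ratio of the marginal densities of the estimate under the two hypotheses.

```python
from math import exp, sqrt


def bf01(estimate, se, prior_sd=2.0):
    """Bayes factor for H0: theta = 0 against H1: theta ~ N(0, prior_sd^2).

    Values above 1 favor H0 (absence of an effect); values below 1 favor H1.
    prior_sd is an illustrative default, not a recommendation.
    """
    var_h0 = se ** 2                  # marginal variance of the estimate under H0
    var_h1 = se ** 2 + prior_sd ** 2  # marginal variance of the estimate under H1
    # Ratio of normal densities N(estimate; 0, var_h0) / N(estimate; 0, var_h1).
    return sqrt(var_h1 / var_h0) * exp(-0.5 * estimate ** 2 * (1 / var_h0 - 1 / var_h1))


# Sensitivity analysis: the same data evaluated under several prior standard deviations.
for tau in (0.5, 1.0, 2.0, 4.0):
    print(tau, bf01(estimate=0.1, se=0.5, prior_sd=tau))
```

      Because the result changes with `prior_sd`, reporting the Bayes factor over a range of prior standard deviations, as in the loop, is the kind of sensitivity analysis the reviewer refers to.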

      Reviewer #3 (Public Review):

      Summary:

      The paper points out that non-significance in both the original study and a replication does not ensure that the studies provide evidence for the absence of an effect. Also, it cannot be considered a "replication success". The main point of the paper is rather obvious. It may be that both studies are underpowered, in which case their non-significance does not prove anything. The absence of evidence is not evidence of absence! On the other hand, statistical significance is a confusing concept for many, so some extra clarification is always welcome.

      One might wonder if the problem that the paper addresses is really a big issue. The authors point to the "Reproducibility Project: Cancer Biology" (RPCB, Errington et al., 2021). They criticize Errington et al. because they "explicitly defined null results in both the original and the replication study as a criterion for replication success." This is true in a literal sense, but it is also a little bit uncharitable. Errington et al. assessed replication success of "null results" with respect to 5 criteria, just one of which was statistical (non-)significance.

      It is very hard to decide if a replication was "successful" or not. After all, the original significant result could have been a false positive, and the original null-result a false negative. In light of these difficulties, I found the paper of Errington et al. quite balanced and thoughtful. Replication has been called "the cornerstone of science" but it turns out that it's actually very difficult to define "replication success". I find the paper of Pawel, Heyard, Micheloud, and Held to be a useful addition to the discussion.

      Strengths:

      This is a clearly written paper that is a useful addition to the important discussion of what constitutes a successful replication.

      Weaknesses:

      To me, it seems rather obvious that non-significance in both the original study and a replication does not ensure that the studies provide evidence for the absence of an effect. I'm not sure how often this mistake is made.

      Thanks for the feedback. We do not have systematic data on how often the mistake of confusing absence of evidence with evidence of absence has been made in the replication context, but we do know that it has been made in at least three prominent large-scale replication projects (the RPP, RPEP, RPCB). We therefore believe that there is a need for our article.

      Moreover, we agree that the RPCB provided a nuanced assessment of replication success using five different criteria for the original null results. We emphasize this now more in the “Introduction” section. However, we do not consider our article as “a little bit uncharitable” to the RPCB, as we discuss all other criteria used in the RPCB and note that our intent is not to diminish the important contributions of the RPCB, but rather to build on their work and provide constructive recommendations for future researchers. Furthermore, in response to comments made by Reviewer #2, we have added an additional “Appendix B: Null results from the RPP and EPRP” that shows equivalence testing and Bayes factor analyses for null results from two other replication projects, where the same issue arises.

      Reviewer #1 (Recommendations For The Authors):

      The authors may wish to address the dichotomy issue I raise above, either in the analysis or in the discussion.

      Thank you, we now emphasize that Bayes factors and TOST p-values do not need to be dichotomized but can be interpreted as quantitative measures of evidence, both in the “Methods for assessing replicability of null results” and the “Conclusions” sections.

      Reviewer #2 (Recommendations For The Authors):

      Given that, here follow additional suggestions that the authors should consider in light of the manuscript's word count limit, to avoid confusing the paper's main idea:

      2) Referencing: Could you reference the three interesting cases among the 15 RPCB null results (specifically, the three effects from the original paper #48) where the Bayes factor differs qualitatively from the equivalence test?

      We now explicitly cite the original and replication study from paper #48.

      3) Equivalence testing: As the authors state, only 4 out of the 15 study pairs are able to establish replication success at the 5% level, in the sense that both the original and the replication 90% confidence intervals fall within the equivalence range. Among these 4, two (Paper #48, Exp #2, Effect #5 and Paper #48, Exp #2, Effect #6) were initially positive with very low p-values, one (Paper #48, Exp #2, Effect #4) had an initial p of 0.06 and was very precisely estimated, and the only one in which equivalence testing provides a clearer picture of replication success is Paper #41, Exp #2, Effect #1, which had an initial p-value of 0.54 and a replication p-value of 0.05. In this latter case (or in all these ones), one might question whether the "liberal" equivalence range of Δ = 0.74 is the most appropriate. As the authors state, "The post-hoc specification of equivalence margins is controversial."

      We agree that the post hoc choice of equivalence ranges is a controversial issue. The margins define an equivalence region where effect sizes are considered practically negligible, and we agree that in many contexts SMD = 0.74 is a large effect size that is not practically negligible. We therefore present sensitivity analyses for a wide range of margins. However, we do not think that the choice of this margin is more controversial for the mentioned studies with low p-values than for other studies with greater p-values, since the question of whether a margin plausibly encodes practically negligible effect sizes is not related to the observed p-value of a study. Nevertheless, for the new analyses of the RPP and EPRP data in Appendix B, we have added additional sensitivity analyses showing how the individual TOST p-values and Bayes factors vary as a function of the margin and the prior standard deviation. We think that these analyses provide readers with an even more transparent picture regarding the implications of the choice of these parameters than the “project-wise” sensitivity analyses in Appendix A.
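
      One way to look at the sensitivity to the margin described above is to ask, for each study, which margin would be just large enough to declare equivalence. The sketch below assumes a normal approximation, in which the TOST p-value is at most α exactly when the margin is at least |estimate| + z(1 − α) × SE. The function names and example numbers are hypothetical and are not drawn from the RPCB, RPP, or RPEP data.

```python
from statistics import NormalDist


def smallest_equivalence_margin(estimate, se, alpha=0.05):
    """Smallest margin at which TOST declares equivalence at level alpha.

    Under a normal approximation, both one-sided tests reject exactly when
    margin >= |estimate| + z_(1 - alpha) * se.
    """
    z = NormalDist().inv_cdf(1 - alpha)
    return abs(estimate) + z * se


def pair_margin(original, replication, alpha=0.05):
    """Smallest margin at which both studies of a pair establish equivalence.

    original, replication: (estimate, standard error) tuples.
    """
    return max(
        smallest_equivalence_margin(original[0], original[1], alpha=alpha),
        smallest_equivalence_margin(replication[0], replication[1], alpha=alpha),
    )


# Hypothetical study pair: a precise original and an imprecise replication.
print(pair_margin((0.1, 0.2), (0.0, 0.5)))
```

      Comparing the value returned by `pair_margin` with a candidate margin such as 0.74 shows directly whether a conclusion of equivalence depends on a liberal choice of margin.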

      4) Bayes factor suggestions: For the Bayes factor approach, it would be interesting to discuss examples where the BF differs slightly. This is likely to occur in scenarios where sample sizes differ significantly between the original study and replication. For example, in Paper #48, Exp #2 and Effect #4, the initial p is 0.06, but the BF is 8.1. In the replication, the BF dramatically drops to < 1/1000, as does the p-value. The initial evidence of 8.1 indicates some evidence for the absence of an effect, but not strong evidence ("strong evidence for H0"), whereas a p-value of 0.06 does not lead to such a conclusion; instead, it favors H1. It would be interesting if the authors discussed other similar cases in the paper. It's worth noting that in Paper #5, Exp #1, Effect #3, the replication p-value is 0.99, while the BF01 is 2.4, almost indicating "moderate" evidence for H0, even though the p-value is inconclusive.

      We agree that some of the examples nicely illustrate conceptual differences between p-values and Bayes factors, e.g., how they take into account sample size and effect size. As methodologists, we find these aspects interesting ourselves, but we think that emphasizing them is beyond the scope of the paper and would distract eLife readers from the main messages.

      Concerning the conceptual differences between Bayes factors and TOST p-values, we already discuss a case where there are qualitative differences in more detail (original paper #48). We added another discussion of this phenomenon in Appendix C as it also occurs for the replication of Ranganath and Nosek (2008) that was part of the RPP.

      5) p-values, magnitude and precision: It's noteworthy to emphasize, if the authors decide to discuss this, that the p-value is influenced by both the effect's magnitude and its precision, so in Paper #9, Exp #2, Effect #6, BF01 = 4.1 has a higher p-value than a BF01 = 2.3 in its replication. However, there are cases where both p-values and BF agree. For example, in Paper #15, Exp #2, Effect #2, both the original and replication studies have similar sample sizes, and as the p-value decreases from p = 0.95 to p = 0.23, BF01 decreases from 5.1 ("moderate evidence for H0") to 1.3 (region of "Absence of evidence"), moving away from H0 in both cases. This also occurs in Paper #24, Exp #3, Effect #6.

      We appreciate the suggestions but, as explained before, think that the message of our paper is better understood without additional discussion of more general differences between p-values and Bayes factors.

      6) The grey zone: Given the above topic, it is important to highlight that in the "Absence of evidence grey zone" for the null hypothesis, for example, in Paper #5, Exp #1, Effect #3 with a p = 0.99 and a BF01 = 2.4 in the replication, BF and p-values reach similar conclusions. It's interesting to note, as the authors emphasize, that Dawson et al. (2011), Exp #2, Effect #2 is an interesting example, as the p-value decreases, favoring H1, likely due to the effect's magnitude, even with a small sample size (n = 3 in both original and replications). Bayes factors are very close to one due to the small sample sizes, as discussed by the authors.

      We appreciate the constructive comments. We think that the two examples from Dawson et al. (2011) and Goetz et al. (2011) already nicely illustrate absence of evidence and evidence of absence, respectively, and therefore decided not to discuss additional examples in detail, to avoid redundancy.

      7) Using meta-analytical results (?): For papers from RPCB, comparing the initial study with the meta-analytical results using Bayes factor and equivalence testing approaches (thus, increasing the sample size of the analysis, but creating dependency of results since the initial study would affect the meta-analytical one) could change the conclusions. This would be interesting to explore in initial studies that are replicated by much larger ones, such as: Paper #9, Exp #2, Effect #6; Goetz et al. (2011), Exp #1, Effect #1; Paper #28, Exp #3, Effect #3; Paper #41, Exp #2, Effect #1; and Paper #47, Exp #1, Effect #5.

      Thank you for the suggestion. We considered adding meta-analytic TOST p-values and Bayes factors before, but decided that Figure 3 and the results section are already quite technical, so adding more analyses may confuse more than help. Nevertheless, these meta-analytic approaches are discussed in the “Conclusions” section.
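
      For readers who want to see what the meta-analytic variant mentioned above could look like, the following is a minimal sketch that pools an original and a replication estimate with inverse-variance weights. It assumes a fixed-effect model and independent studies, which the text above does not specify. The function name and the numbers are illustrative only.

```python
def pool_fixed_effect(estimates, standard_errors):
    """Inverse-variance (fixed-effect) pooling of independent effect estimates.

    Returns the pooled estimate and its standard error.
    """
    weights = [1 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * est for w, est in zip(weights, estimates)) / total_weight
    pooled_se = (1 / total_weight) ** 0.5
    return pooled, pooled_se


# Hypothetical original and replication estimates with equal precision.
pooled, pooled_se = pool_fixed_effect([0.2, 0.0], [0.5, 0.5])
print(pooled, pooled_se)
```

      The pooled estimate and standard error could then be passed to an equivalence test or a Bayes factor in place of the individual study results, bearing in mind the dependency with the original study that the reviewer notes.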

      8) Other samples of fields of science: It would be interesting to investigate whether using Bayes factors and equivalence tests in addition to p-values results in a clearer scenario when applied to replication data from other fields. As mentioned by the authors, the Reproducibility Project: Experimental Philosophy (RPEP) and the Reproducibility Project: Psychology (RPP) have data attempting to replicate some original studies with null results. While the RPCB analysis yielded a similar picture when using both criteria, it is worth exploring whether this holds true for RPP and RPEP. Considerations for further research in this direction are suggested. Even if the original null results were excluded in the calculation of an overall replicability rate based on significance, sensitivity analyses considering them could have been conducted. The present authors can demonstrate replication success using the significance criteria in these two projects with initially p < 0.05 studies, both positive and non-positive.

      Thank you for the excellent suggestion. We added an Appendix B where the null results from the RPP and EPRP are analyzed with our proposed approaches. The results are also discussed in the “Results” and “Conclusions” sections.

      9) Other approaches: I am curious about the potential impact of using an approach based on equivalence testing (as described in https://arxiv.org/abs/2308.09112). It would be valuable if the authors could run such analyses or reference the mentioned work.

      Thank you. We were unaware of this preprint. It seems related to the framework proposed by Stahel W. A. (2021) New relevance and significance measures to replace p-values. PLoS ONE 16(6): e0252991. https://doi.org/10.1371/journal.pone.0252991

      We now cite both papers in the discussion.

      10) Additional evidence: There is another study in which replications of initially p > 0.05 studies with p > 0.05 replications were also considered as replication successes. You can find it here: https://www.medrxiv.org/content/10.1101/2022.05.31.22275810v2. Although it involves a small sample of initially p > 0.05 studies with already large sample sizes, the work is currently under consideration for publication in PLOS ONE, and all data and materials can be accessed through OSF (links provided in the work).

      Thank you for sharing this interesting study with us. We feel that it is beyond the scope of the paper to include further analyses as there are already analyses of the RPCB, RPP, and EPRP null results. However, we will keep this study in mind for future analysis, especially since all data are openly available.

      11) Additional evidence 02: Ongoing replication projects, such as the Brazilian Reproducibility Initiative (BRI) and The Sports Replication Centre (https://ssreplicationcentre.com/), continue to generate valuable data. BRI is nearing completion of its results, and it promises interesting data for analyzing replication success using p-values, equivalence regions, and Bayes factor approaches.

      We now cite these two initiatives as examples of ongoing replication projects in the introduction. As with your previous point, we think that it is beyond the scope of the paper to include further analyses as there are already analyses of the RPCB, RPP, and EPRP null results.

      Reviewer #3 (Recommendations For The Authors):

      I have no specific recommendations for the authors.

      Thank you for the constructive review.

      Reviewing Editor (Recommendations For the Authors):

      I recognize that it was suggested to the authors by the previous Reviewing Editor to reduce the amount of statistical material to be made more suitable for a non-statistical audience, and so what I am about to say contradicts advice you were given before. But, with this revised version, I actually found it difficult to understand the particulars of the construction of the Bayes Factors and would have appreciated a few more sentences on the underlying models that fed into the calculations. In my opinion, the provided citations (e.g., Dienes Z. 2014. Using Bayes to get the most out of non-significant results) did not provide sufficient background to warrant a lack of more technical presentation here.

      Thank you for the feedback. We added a new “Appendix C: Technical details on Bayes factors” that provides technical details on the models, priors, and calculations underlying the Bayes factors.
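
      To indicate the kind of calculation such an appendix covers, here is a minimal sketch of a Bayes factor for a single effect estimate. It is not the authors' implementation. It assumes a normal likelihood with known standard error, a point null H0: theta = 0, and a normal prior for theta under H1; the prior mean and standard deviation used below are placeholder assumptions.

```python
from math import sqrt
from statistics import NormalDist


def bf01_normal(estimate, se, prior_mean=0.0, prior_sd=2.0):
    """Bayes factor BF01 for H0: theta = 0 against H1: theta ~ N(prior_mean, prior_sd^2).

    Under H0 the estimate is distributed N(0, se^2); under H1 its marginal
    distribution is N(prior_mean, se^2 + prior_sd^2).
    """
    density_h0 = NormalDist(0.0, se).pdf(estimate)
    density_h1 = NormalDist(prior_mean, sqrt(se**2 + prior_sd**2)).pdf(estimate)
    return density_h0 / density_h1


# Illustrative values only: an estimate of exactly zero with unit standard error
bf = bf01_normal(0.0, 1.0, prior_mean=0.0, prior_sd=sqrt(3))
```

      BF01 above 1 indicates that the data favour H0 over H1, and BF01 close to 1 indicates that the data are uninformative, as in the small-sample example discussed above.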

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      Bendzunas, Byrne et al. explore two highly topical areas of protein kinase regulation in this manuscript. Firstly, the idea that Cys modification could regulate kinase activity. The senior authors have published some standout papers exploring this idea of late, and the current work adds to the picture of how active site Cys might have been favoured in evolution to serve critical regulatory functions. Second, BRSK1/2 are understudied kinases listed as part of the "dark kinome" so any knowledge of their underlying regulation is of critical importance to advancing the field.

      Strengths:

      In this study, the authors pinpoint highly-conserved, but BRSK-specific, Cys residues as key players in kinase regulation. There is a delicate balance between equating what happens in vitro with recombinant proteins relative to what the functional consequence of Cys mutation might be in cells or organisms, but the authors are very clear with the caveats relating to these connections in their descriptions and discussion. Accordingly, by extension, they present a very sound biochemical case for how Cys modification might influence kinase activity in cellular environs.

      Weaknesses:

      I have very few critiques for this study, and my major points are barely major.

      Major points

      (1) My sense is that the influence of Cys mutation on dimerization is going to be one of the first queries readers consider as they read the work. It would be, in my opinion, useful to bring forward the dimer section in the manuscript.

      We agree that the influence of Cys on BRSK dimerization is a topic of significant interest. Our primary focus was to explore oxidative regulation of the understudied BRSK kinases as they contain a conserved T-loop Cys, and we have previously demonstrated that equivalent residues at this position in related kinases were critical drivers of oxidative modulation of catalytic activity. We have demonstrated here that BRSK1 & 2 are similarly regulated by redox and this is due to oxidative modification of the T+2 Cys, in addition to Cys residues that are conserved amongst related ARKs as well as BRSK-specific Cys. Although we also provide evidence for limited redox-sensitive higher order BRSK species (dimers) in our in vitro analysis, these represent a small population of the total BRSK protein pool (this was validated by SEC-MALS analysis). As such, we do not have strong evidence to suggest that these limited dimers significantly contribute to the pronounced inhibition of BRSK1 & 2 in the presence of oxidizing agents, and instead believe that other biochemical mechanisms likely drive this response. This may result from oxidized Cys altering the conformation of the activation loop. Indeed, the formation of an intramolecular disulfide within the T-loop of BRSK1 & 2, which we detected by MS, is one such regulatory modification. It is noteworthy that intramolecular disulfide bonds within the T-loop of AKT and MELK have already been shown to induce an inactive state in the kinase, and we posit a similar mechanism for BRSKs.

      While we recognize the potential importance of dimerization in this context, our current data from in vitro and cell-based assays do not provide substantial evidence to assert dimerization as a primary regulatory mechanism. Hence, we maintained a more conservative stance in our manuscript, discussing dimerization in later sections where it naturally followed from the initial findings. That being said, we acknowledge the potential significance of dimerization in the regulation of the BRSK T-loop cysteine. We believe this aspect merits further investigation and could indeed be the focus of a follow-up study.

      (2) Relatedly, the effect of Cys mutation on the dimerization properties of preparations of recombinant protein is not very clear as it stands. Some SEC traces would be helpful; these could be included in the supplement.

      In order to determine whether our recombinant BRSK proteins (and T-loop mutants) existed as monomers or dimers, we performed SDS-PAGE under reducing and non-reducing conditions (Fig 7). This unambiguously revealed that a monomer was the prominent species, with little evidence of dimers under these experimental conditions (even in the presence of oxidizing agents). Although we cannot discount a regulatory role for BRSK dimers in other physiological contexts, we could not produce sufficient evidence to suggest that multimerization played a substantial role in modifying BRSK kinase activity in our assays. We note that our in vitro analysis was performed using truncated forms of the protein, and as such it is entirely possible that regions of the protein that flank the kinase domain may serve additional regulatory functions that may include higher order BRSK conformations. In this regard, although we have not included SEC traces of our recombinant proteins, we have included analytical SEC-MALS of the truncated proteins (Supplementary Figure 6) which we believe to be more informative. We have also now included additional SEC-MALS data for BRSK2 C176A and C183A (Supplementary Figure 6d and e), which supports our findings in Fig 7, demonstrating the presence of limited dimer species under non-reducing conditions.

      (3) Is there any knowledge of Cys mutants in disease for BRSK1/2?

      We have conducted an extensive search across several databases: COSMIC (Catalogue of Somatic Mutations in Cancer), ProKinO (Protein Kinase Ontology), and TCGA (The Cancer Genome Atlas). These databases are well-regarded for their comprehensive and detailed records of mutations related to cancer and protein kinases. Our analysis using the COSMIC and TCGA databases focused on identifying any reported instances of Cys mutations in BRSK1/2 that are implicated in cancer. Additionally, we utilized the ProKinO database to explore the broader landscape of protein kinase mutations, including any potential disease associations of Cys mutations in BRSK1/2. However, we found no evidence to indicate the presence of Cys mutations in BRSK1/2 that are associated with cancer or disease. This lack of association in the current literature and database records suggests that, as of our latest search, Cys mutations in BRSK1/2 have not been reported as significant contributors to pathogenesis.

      (4) In bar charts, I'd recommend plotting data points. Plus, it is crucial to report in the legend what error measure is shown, the number of replicates, and the statistical method used in any tests.

      We have added the data points to the bar charts and included statistical methods in figure legends.

      (5) In Figure 5b, the GAPDH loading control doesn't look quite right.

      The blot has been repeated and updated.

      (6) In Figure 7 there is no indication of what mode of detection was used for these gels.

      We have updated the figure legend to confirm that the detection method was western blot.

      (7) Recombinant proteins - more detail should be included on how they were prepared. Was there a reducing agent present during purification? Where did they elute off SEC... consistent with a monomer or higher order species?

      We have added ‘produced in the absence of reducing agents unless stated otherwise’ in the methods section to improve clarity. Although we have not added additional sentences to describe the elution profile of the BRSK proteins by SEC during purification, we believe that the inclusion of analytical SEC-MALS data is sufficient evidence that the proteins are largely monomeric under non-reducing conditions.

      Reviewer #2 (Public Review):

      Summary:

      In this study by Bendzunas et al, the authors show that the formation of intra-molecular disulfide bonds involving a pair of Cys residues near the catalytic HRD motif and a highly conserved T-Loop Cys with a BRSK-specific Cys at an unusual CPE motif at the end of the activation segment function as repressive regulatory mechanisms in BRSK1 and 2. They observed that mutation of the CPE-Cys only, contrary to the double mutation of the pair, increases catalytic activity in vitro and drives phosphorylation of the BRSK substrate Tau in cells. Molecular modeling and molecular dynamics simulations indicate that oxidation of the CPE-Cys destabilizes a conserved salt bridge network critical for allosteric activation. The occurrence of spatially proximal Cys amino acids in diverse Ser/Thr protein kinase families suggests that disulfide-mediated control of catalytic activity may be a prevalent mechanism for regulation within the broader AMPK family. Understanding the molecular mechanisms underlying kinase regulation by redox-active Cys residues is fundamental as it appears to be widespread in signaling proteins and provides new opportunities to develop specific covalent compounds for the targeted modulation of protein kinases.

      The authors demonstrate that intramolecular cysteine disulfide bonding between conserved cysteines can function as a repressing mechanism as indicated by the effect of DTT and the consequent increase in activity by BRSK1 and 2 (WT). The cause-effect relationship of why mutation of the CPE-Cys only increases catalytic activity in vitro and drives phosphorylation of the BRSK substrate Tau in cells is not clear to me. The explanation given by the authors based on molecular modeling and molecular dynamics simulations is that oxidation of the CPE-Cys (that will favor disulfide bonding) destabilizes a conserved salt bridge network critical for allosteric activation. However, no functional evidence of the impact of the salt-bridge network is provided. If you mutated the two main Cys-pairs (aE-CHRD and A-loop T+2-CPE) you lose the effect of DTT, as the disulfide pairs cannot be formed, hence no repression mechanisms take place, however when looking at individual residues I do not understand why mutating the CPE only results in the opposite effect unless it is independent of its connection with the T+2 residue on the A-loop.

      Strengths:

      This is an important and interesting study providing new knowledge in the protein kinase field with important therapeutic implications for the rational design and development of next-generation inhibitors.

      Weaknesses:

      There are several issues with the figures that this reviewer considers should be addressed.

      Reviewer #1 (Recommendations for The Authors):

      Major points

      Page 26 - the discussion could be more concise. There's an element of recapping the results, which should be avoided.

      Regarding the conciseness of the discussion section, we have thoroughly revised it to ensure a more succinct presentation, deliberately avoiding the recapitulation of results. The revised discussion now focuses on interpreting the findings and their implications, steering clear of redundancy with the results section.

      Figure 1b seems to be mislabeled/annotated. I recommend checking whether the figure legends match more broadly. Figure 1 appears to be incorrectly cited throughout the results.

      Thank you for pointing out the discrepancies in the labeling and citation of Figure 1b. We have carefully reviewed and corrected these issues to ensure that all figure labels, legends, and citations accurately reflect the corresponding data and illustrations. We appreciate your attention to detail and the opportunity to improve the clarity and accuracy of our presentation.

      Figure 6 - please include a color-coding key in the figure. Further support for these simulations could be provided by supplementary movies or plots of the interaction. Figure 4 colour palette should be adjusted for the spheres in the Richardson diagrams to have greater distinction.

      As suggested, we have amended the colour palette in Figure 4 to improve conformity throughout the figure.

      Minor points

      Figure 2 - it'd be helpful to know what the percentage coverage of peptides is.

      We have updated the figure legend to include peptide coverage for both proteins.

      Some typos - Supp 2 legend "Domians".

      Fixed

      Figure 6 legend - analyzed by needs a space;

      Fixed

      Fig 8 legend schematic misspelled.

      Fixed

      Broadly, if you Google T-loop you get a pot pourri of enzyme answers. Why not just use Activation loop?

      The choice of "T-loop" over "Activation loop" in our manuscript was made to maintain consistency with other literature in the field, and in particular our previous paper “Aurora A regulation by reversible cysteine oxidation reveals evolutionarily conserved redox control of Ser/Thr protein kinase activity” where we refer to the activation loop cysteine as T-loop + 2. We acknowledge the varied enzyme contexts in which "T-loop" is used and agree on the importance of clarity. To address this, we made an explicit note in the manuscript that the "T-loop" is also referred to as the "Activation loop", ensuring readers are aware of the interchangeable use of these terms. Additionally, this nomenclature facilitates a more straightforward designation of cysteine residues within the loop (T+2 Cysteine). We believe this approach balances adherence to established conventions with the need for clarity and precision in our descriptions.

      Methods - what is LR cloning. Requires some definition. Some manufacturer detail is missing in methods, and referring to prior work is not sufficient to empower readers to replicate.

      We agree, and have added the following to the methods section:

      “BRSK1 and 2 were sub-cloned into pDest vectors (to encode the expression of N-terminal Flag or HA tagged proteins) using the Gateway LR Clonase II system (Invitrogen) according to the manufacturer’s instructions. pENTR BRSK1/2 clones were obtained in the form of Gateway-compatible donor vectors from Dr Ben Major (Washington University in St. Louis). The Gateway LR Clonase II enzyme mix mediates recombination between the attL sites on the Entry clone and the attR sites on the destination vector. All cloned BRSK1/2 genes were fully sequenced prior to use.”

      Page 7 - optimal settings should be reported. How were pTau signals quantified and normalised?

      We have added the following to the methods section:

      “Two-color Western blot detection method employing infrared fluorescence was used to measure the ratio of Tau phospho serine 262 to total Tau. Total GFP Tau was detected using a mouse anti GFP antibody and visualized at 680 nm using goat anti mouse IRDye 680 while phospho-tau was detected using a Tau phospho serine 262 specific antibody and visualized at 800 nm using goat anti rabbit IRDye 800. Imaging was performed using a LI-COR Odyssey CLx with scan control settings set to 169 μm, medium quality, and 0.0 mm distance. Quantification was performed using LI-COR Image Studio on the raw image files. The phospho Tau to total Tau ratio was determined by measuring the ratio of the fluorescence intensities measured at 800 nm (pTau) to those at 680 nm (total tau).”
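
      The ratio described in this methods text can be sketched in a few lines. This is an illustrative helper, not the Image Studio workflow: the function names and intensity values are hypothetical, and the normalisation to a control lane is an assumption about a common follow-up step rather than something stated in the methods.

```python
def ptau_to_total_tau(intensity_800, intensity_680):
    """Ratio of the 800 nm signal (pTau S262) to the 680 nm signal (total GFP-Tau)."""
    if intensity_680 <= 0:
        raise ValueError("total Tau intensity must be positive")
    return intensity_800 / intensity_680


def normalise_to_control(ratios, control):
    """Express each lane's ratio relative to a chosen control lane (assumed step)."""
    reference = ratios[control]
    return {lane: value / reference for lane, value in ratios.items()}


# Hypothetical band intensities per lane: (800 nm, 680 nm)
lanes = {"WT": (200.0, 400.0), "mutant": (300.0, 400.0)}
ratios = {lane: ptau_to_total_tau(i800, i680) for lane, (i800, i680) in lanes.items()}
relative = normalise_to_control(ratios, control="WT")
```

      Dividing by the total Tau signal from the same lane corrects for differences in Tau expression or loading between samples.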

      In the Figure 6g-j legend, the salt bridge is incorrectly annotated as E185-R248 rather than 258.

      Fixed

      Lines 393-395 provides a repeat statement on BRSKs phosphorylating Tau (from 388-389).

      We have removed the repetition and reworded the opening lines of the results section to improve the overall flow of the manuscript.

      Supp. Figure 1 is difficult to view - would it be possible to increase the size of the phylogenetic analysis?

      We thank the reviewer for this observation. We have rotated (90°) and expanded the figure so that it can be more clearly viewed.

      Supp. Figure 2 - BRSK1/2 incorrectly spelled.

      Fixed

      Please check the alignment of labels in Supp. Figure 3e.

      Fixed

      Reviewer #2 (Recommendations For The Authors):

      (1) In Figure 1, current panel b is not mentioned/described in the figure legend and as a consequence, the rest of the panels in the legends do not fit the content of the figure.

      Reviewer 1 also noted this error, and we have amended the manuscript accordingly.

      What is the rationale for using the HEK293T cells as the main experimental/cellular system? Are there cell lines that express both proteins endogenously so that the authors can recapitulate the results obtained from ectopic overexpression?

      The selection of HEK-293T cells was driven by their well-established utility in overexpression studies, which make them ideal for the investigation of protein interactions and redox regulation. This cell line's robust transfection efficiency and well-characterized biology provide a reliable platform for dissecting the molecular mechanisms underlying the redox regulation of proteins. Furthermore, the use of HEK-293T cells aligns with the broader scientific practice, serving as a common ground for comparability with existing literature in the field of BRSK1/2 signaling, protein regulation and interaction studies.

      The application of HEK-293T cells as a model system in our study serves as a foundational step towards eventually elucidating the functions of BRSK1/2 in neuronal cells, where these kinases are predominantly expressed and play critical roles. Given the fact that BRSKs are classed as ‘understudied’ kinases, the choice of a HEK-293T co-overexpression system allowed us to analyze the direct effects of BRSK kinase activity (using phosphorylation of Tau as a readout) in a cellular context and in a more controlled manner. This approach not only aids in the establishment of a baseline understanding of the redox regulation of BRSK1/2, but also sets the stage for subsequent investigations in more physiologically relevant neuronal models.

      In current panel d, could the authors recapitulate the same experimental conditions as in current panel c?

      Figure 1 panel c shows that both BRSK1 and 2 are reversibly inhibited by oxidizing agents such as H2O2, whilst panels d and e show the concentration dependent activation and inhibition of the BRSKs with increasing concentrations of DTT and H2O2 respectively. The experimental conditions were identical, other than changing amounts of reducing and oxidizing agents, and used the same peptide coupled assays. Data for all experiments were originally collected in ‘real time’ as depicted in Fig 1c (increase in substrate phosphorylation over time). However, to aid interpretation of the data, we elected to present the latter two panels as dose response curves by calculating the change in the rate of enzyme activity (shown as pmol phosphate incorporated into the peptide substrate per min) for each condition. To aid the reader, we now include an additional supplementary figure (new supplementary figure 2) depicting BRSK1 and 2 dependent phosphorylation of the peptide substrate in the presence of different concentrations of DTT and H2O2 in a real time (kinetic) assay. The new data shown is a subset of the unprocessed data that was used to calculate the rates of BRSK activity in Fig 1d & e.
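
      Converting a real-time trace into a single rate, as described above, can be illustrated with a least-squares slope. This is a sketch under assumptions: it presumes the chosen time points lie in the linear phase of the reaction, and it is not the authors' analysis procedure, which is not detailed here. The function name and data values are hypothetical.

```python
def reaction_rate(times_min, pmol_phosphate):
    """Least-squares slope of product formed against time (pmol phosphate per min)."""
    n = len(times_min)
    if n < 2 or n != len(pmol_phosphate):
        raise ValueError("need at least two paired time/product measurements")
    mean_t = sum(times_min) / n
    mean_p = sum(pmol_phosphate) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (p - mean_p) for t, p in zip(times_min, pmol_phosphate))
    return sxy / sxx


# Hypothetical kinetic trace: phosphate incorporated at each time point
times = [0.0, 1.0, 2.0, 3.0]
product = [0.0, 2.0, 4.0, 6.0]
rate = reaction_rate(times, product)
```

      Repeating this for each DTT or H2O2 concentration gives one rate per condition, which can then be plotted against concentration as a dose-response curve.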

      Why did the authors use full-length constructs in these experiments and did not in e.g. Figure 2 where they used KD constructs instead?

      In the initial experiments, illustrated in Figure 1, we employed full-length protein constructs to establish a proof of concept, demonstrating the overall behavior and interactions of the proteins in their full-length form. This confirmed that BRSK1 & 2, which both contain a conserved T + 2 Cys residue that is frequently prognostic for redox sensitivity in related kinases, displayed a near-obligate requirement for reducing agents to promote kinase activity.  

      Subsequently, in Figure 2, our focus shifted towards delineating the specific regions within the proteins that are critical for redox regulation. By using constructs that encompass only the kinase domain, we aimed to demonstrate that the redox-sensitive regulation of these proteins is predominantly mediated by specific cysteine residues located within the kinase domain itself. This strategic use of the kinase domain of the protein allowed for a more targeted investigation. Furthermore, in our hands these truncated forms of the protein were more stable at higher concentrations, enabling more detailed characterization of the proteins by DSF and SEC-MALS. We predict that the flanking disordered regions of the full-length protein (as predicted by AlphaFold) contribute to this effect.

      (2) In Figure 2, Did the authors try to do LC/MS-MS in the same experimental conditions as in Figure 1 (e.g. buffer minus/plus DTT, H2O2, H2O2 + DTT)?

      We would like to clarify that the mass spectrometry experiments were conducted exclusively on proteins purified under native (non-reducing) conditions. We did not extend the LC/MS-MS analyses to include proteins treated with various buffer conditions such as minus/plus DTT, H2O2, or H2O2 + DTT as used in the experiments depicted in Figure 1. Given that we could readily detect disulfides in the absence of oxidizing agents, we did not see the benefit of additional treatment conditions as peroxide treatment of protein samples can frequently complicate interpretation of MS data. However, it should be noted that prior to MS analysis, tryptic peptides were subjected to a 50:50 split, with one half alkylated in the presence of DTT (as described in the methods section) to eliminate disulfides and other transiently oxidized Cys forms. Comparative analysis between reduced and non-reduced tryptic peptides improved our confidence when assigning disulfide bonds (which were eliminated in identical peptides in the presence of DTT).

      On panel b, why did the authors show alphafold predictions and not empiric structural information (e.g. X-ray, EM,..)?

      The AlphaFold models were primarily utilized to map the general locations of redox-sensitive cysteine pairs within the proteins of interest. Although we have access to the crystal structure of mouse BRSK2, it does not fully capture the active conformation seen in the AlphaFold model of the human version. The use of AlphaFold models for human proteins in this study aids in consistently tracking residue numbering across the manuscript, offering a useful framework for understanding the spatial arrangement of these critical cysteine pairs in their potentially active-like states. This approach facilitates our analysis and discussion by providing a reference for the structural context of these residues in the human proteins.

      What was the rationale for using the KD construct and not the FL as in Figure 1?

      The rationale to use the kinase domain was primarily based on the significantly lower confidence in the structural predictions for regions outside the kinase domain (KD). Our experimental focus was to investigate the role of conserved cysteine residues within the kinase domain, which are critical for the protein's function and regulation. This targeted approach allowed us to concentrate our analyses on the most functionally relevant and structurally defined portion of the protein, thereby enhancing the precision and relevance of our findings. As is frequently the case, truncated forms of the protein, consisting only of the kinase domain, are much more stable than their full length counterparts and are therefore more amenable to in vitro biochemical analysis. In our hands this was true for both BRSK1 and 2, and as such much of the data collected here was generated using kinase-domain (KD) constructs. Simulations using the KD structures are therefore much more representative of our original experimental setup.

      The BRSK1 KD construct appears to be rather inactive and not responsive to DTT treatment. Could the authors comment on the differences observed with the FL construct of Figure 1?

      It is important to note that BRSK1, in general, exhibits lower intrinsic activity compared to BRSK2. This reduced activity could be attributed to a range of factors, including the need for activation by upstream kinases such as LKB1, as well as potential post-translational modifications (PTMs) that may be absent in the bacterially expressed KD construct. The full-length forms of the protein were purified from Sf21 cells, and as such may have additional modifications that are lacking in the bacterially derived KD counterparts. We also cannot discount additional regulatory roles of the regions that flank the KD, and these may contribute in part to the modest discrepancy observed between constructs.  Despite these differences, it is crucial to emphasize that both the KD and FL constructs of BRSK1 are regulated by DTT, indicating a conserved redox-dependent activation for both of the related BRSK proteins.  

      (3) In Figure 4, on panel A wouldn't the authors expect that mutating on the pairs e.g. C198A in BRSK1 would have the same effect as mutating the C191 from the T+2 site? Did they try mutating individual sites of the aE/CHRD pair? The same will apply to BRSK2.

      We appreciate the insightful comment. It's important to clarify that the redox regulation of these proteins is influenced not solely by the formation of disulfide bonds but also by the oxidation state of individual cysteine residues, particularly the T+2 Cys. This nuanced mechanism of regulation allows for a diverse range of functional outcomes based on the specific cysteine involved and its state of oxidation. This aspect forms a key finding of our paper, highlighting the complexity of redox regulation beyond mere disulfide bond formation. For example, AURA kinase activity is regulated by oxidation of a single T+2 Cys (Cys290, equivalent to Cys191 and Cys176 of BRSK1 and 2 respectively), but this regulation can be supplemented through artificial incorporation of a secondary Cys at the DFG+2 position (Byrne et al., 2020). This targeted genetic modification of AURA mirrors equivalent regulatory disulfide-forming Cys pairs that naturally occur in kinases such as AKT and MELK, and which provide an extra layer of regulatory fine tuning (and a possible protective role to prevent deleterious over-oxidation) to the T+2 Cys. We surmise that the CPE Cys is also an accessory regulatory element to the T+2 Cys in BRSK1 & 2, which is the dominant driver of BRSK redox sensitivity (as judged by the fact that CPE Cys mutants are still potently regulated by redox [Fig 4]), by locking it in an inactive disulfide configuration.

      In our preliminary analysis of BRSK1, we observed that mutations of individual sites within the aE/CHRD pair were similarly detrimental to kinase activity as a tandem mutation (see Author response image 1). As discussed in the manuscript, we think that these Cys may serve important structural regulatory functions and opted to focus on co-mutations of the aE/CHRD pair for the remainder of our investigation.

      Author response image 1.

      In vitro kinase assays showing rates of in vitro peptide phosphorylation by WT and Cys-to-Ala (aE/CHRD residues) variants of BRSK1 after activation by LKB1.

      In panels C and D, the same experimental conditions should have been measured as in A and B.

      Panels A and B were designed to demonstrate the enzymatic activity and the response to DTT treatment to establish the baseline redox regulation of the kinase and a panel of Cys-to-Ala mutant variants. In contrast, panels C and D were specifically focused on rescue experiments with mutants that showed a significant effect under the conditions tested in A and B. These panels were intended to further explore the role of redox regulation in modulating the activity of these mutants, particularly those that retained some level of activity or exhibited a notable response to redox changes.

      The rationale for this experimental design was to prioritize the investigation of mutants, such as those at the T+2 and CPE cysteine sites, which provided the most insight into the redox-dependent modulation of kinase activity. Other mutants, which resulted in inactivation, were deprioritized in this context as they offered limited additional information regarding the redox regulation mechanism. This focused approach allowed us to delve deeper into understanding how specific cysteine residues contribute to the redox-sensitive control of kinase function, aligning with the overall objective of elucidating the nuanced roles of redox regulation in kinase activity.

      (4) In figure 5: Why did the authors use reduced Glutathione instead of DTT? The authors should have recapitulated the same experimental conditions as in Figure 4 and not focused only on the T+2 or the CPE single mutants but using the double and the aE/CHRD mutants as well, as internal controls and validation of the enzymatic assays using the modified peptide

      Regarding the use of reduced glutathione (GSH) instead of DTT in Figure 5, we chose GSH for its well characterized biological relevance as an antioxidant in cellular responses to oxidative stress. Furthermore, while DTT has been widely used in experimental setups, it is also potentially cytotoxic at high concentrations.

      Addressing the point on experimental consistency with Figure 4, we appreciate the suggestion and indeed had already conducted such experiments (Previously Supp Fig 3, now changed to current Supp Fig 4). These experiments include analyses of BRSK mutant activity in a HEK-293T model. However, we chose not to focus on inactivating mutants (such as the aE/CHRD mutants which had depleted expression levels possibly as a consequence of compromised structural integrity) or pursue the generation of double mutant CMV plasmids, as these were deemed unlikely to add significant insights into the core narrative of our study. Our focus remained on the mutants that yielded the most informative results regarding the redox regulation mechanisms in the in vitro setting, ensuring a clear and impactful presentation of our findings.

      A time course evaluation of the reducing or oxidizing reagents should have been performed. Would we expect that in WT samples, and in the presence of GSH, and also in the case of the CPE mutant, an increment in the levels of Tau phosphorylation as a readout of BRSK1-2 activity?

      We acknowledge the importance of such analyses in understanding the dynamic nature of redox regulation on kinase activity and have included a time course (Supp Fig 2 e-g). These results confirm a depletion of Tau phosphorylation over time in response to peroxide generated by the enzyme glucose oxidase.

      (5) In Figure 6, did the authors look at the functional impact of the residues with which the T+2 and the CPE motifs interact, e.g. T174 and the E185-R258 tether?

      Our primary focus was on the salt bridges, as this is a key regulatory structural feature that is conserved across many kinases. Regarding the additional interactions mentioned, we have thoroughly evaluated their roles and dynamics through molecular dynamics (MD) simulations but did not find any results of significant relevance to warrant inclusion.

      (6) In Figure 7: Did the authors look at the oligomerization state of the BRSK1-2 multimers under non-reducing conditions? Were they also observed in the case of the FL constructs? What was the stoichiometry?

      Our current work indicates that the kinase domain of BRSK1-2 primarily exists in a monomeric state, with some evidence of dimerization or multimer formation under specific conditions. Our SEC-MALS (Supp Fig 6) and SDS-PAGE analyses (Figure 7) clearly demonstrate that monomers are overwhelmingly the dominant species under non-reducing conditions (>90%). We also conclude that these limited oligomeric species can be removed by inclusion of reducing agents such as DTT (Figure 7), which may suggest a role for a Cys residue(s). Notably, removal of the T+2 Cys was insufficient to prevent multimerization.

      We were unable to obtain reliable SEC-MALS data for the full-length forms of the protein, likely due to the presence of disordered regions that flank the kinase domain which results in a highly heterodispersed and unstable preparation (at the concentrations required for SEC-MALS). Although we are therefore unable to comment on the stoichiometry of FL BRSK dimers, we can detect BRSK1 and 2 hetero- and homo-complexes in HEK-293T cells by IP, which supports the existence of limited BRSK1 & 2 dimers (Supp Fig 6a). However, we were unable to detect intermolecular disulfide bonds by MS, although this does not necessarily preclude their existence. The physiological role of BRSK multimerization (if any) and establishing specifically which Cys residues drive this phenomenon is of significant interest to our future investigations.

    2. eLife assessment

      This study provides fundamental new knowledge into the role of reversible cysteine oxidation and reduction in protein kinase regulation. The data provide convincing evidence that intra-molecular disulfide bonds serve a repressive regulatory role in the Brain Selective Kinases (BRSK) 1 & 2; part of the as yet understudied 'dark kinome'. The findings will be of broad interest to biochemists, structural biologists, and those interested in the rational design and development of next-generation kinase inhibitors.

    3. Reviewer #1 (Public Review):

      Summary:

      Bendzunas, Byrne et al. explore two highly topical areas of protein kinase regulation in this manuscript. Firstly, the idea that Cys modification could regulate kinase activity. The senior authors have published some standout papers exploring this idea of late, and the current work adds to the picture of how active site Cys might have been favoured in evolution to serve critical regulatory functions. Second, BRSK1/2 are understudied kinases listed as part of the "dark kinome" so any knowledge of their underlying regulation is of critical importance to advancing the field.

      Strengths:

      In this study, the authors pinpoint highly-conserved, but BRSK-specific, Cys residues as key players in kinase regulation. There is a delicate balance between equating what happens in vitro with recombinant proteins relative to what the functional consequence of Cys mutation might be in cells or organisms, but the authors are very clear with the caveats relating to these connections in their descriptions and discussion. Accordingly, by extension, they present a very sound biochemical case for how Cys modification might influence kinase activity in cellular environs.

      Comments on revised version:

      The authors have satisfactorily addressed my concerns.

    4. Reviewer #2 (Public Review):

      Summary:

      In this study by Bendzunas et al, the authors show that the formation of intra-molecular disulfide bonds involving a pair of Cys residues near the catalytic HRD motif and a highly conserved T-Loop Cys with a BRSK-specific Cys at an unusual CPE motif at the end of the activation segment function as repressive regulatory mechanisms in BRSK1 and 2. They observed that mutation of the CPE-Cys only, contrary to the double mutation of the pair, increases catalytic activity in vitro and drives phosphorylation of the BRSK substrate Tau in cells. Molecular modeling and molecular dynamics simulations indicate that oxidation of the CPE-Cys destabilizes a conserved salt bridge network critical for allosteric activation. The occurrence of spatially proximal Cys amino acids in diverse Ser/Thr protein kinase families suggests that disulfide-mediated control of catalytic activity may be a prevalent mechanism for regulation within the broader AMPK family. Understanding the molecular mechanisms underlying kinase regulation by redox-active Cys residues is fundamental as it appears to be widespread in signaling proteins and provides new opportunities to develop specific covalent compounds for the targeted modulation of protein kinases.

      The authors demonstrate that intramolecular cysteine disulfide bonding between conserved cysteines can function as a repressing mechanism as indicated by the effect of DTT and the consequent increase in activity by BRSK1 and 2 (WT). The cause-effect relationship of why mutation of the CPE-Cys only increases catalytic activity in vitro and drives phosphorylation of the BRSK substrate Tau in cells is not clear to me. The explanation given by the authors based on molecular modeling and molecular dynamics simulations is that oxidation of the CPE-Cys (that will favor disulfide bonding) destabilizes a conserved salt bridge network critical for allosteric activation. However, no functional evidence of the impact of the salt-bridge network is provided. If you mutated the two main Cys-pairs (aE-CHRD and A-loop T+2-CPE) you lose the effect of DTT, as the disulfide pairs cannot be formed, hence no repression mechanisms take place, however when looking at individual residues I do not understand why mutating the CPE only results in the opposite effect unless it is independent of its connection with the T+2 residue on the A-loop.

      Strengths:

      This is an important and interesting study providing new knowledge in the protein kinase field with important therapeutic implications for the rational design and development of next-generation inhibitors.

      Comments on revised version:

      I have one remark related to question number 5 (my question was not clear enough). I meant if the authors did look at the functional relevance of the residues implicated in the identified salt-bridge network/tethers. What happens to the proteins functionally when you mutate those residues? (represented on Fig. 8).

      Otherwise, the authors have satisfactorily addressed my concerns.

    1. Author response:

      We thank the reviewers for their attention to our study and for their fair and reasonable assessment of the strengths and weaknesses of our work. We believe the reviewers adequately captured both the potential implications of our work as well as its major current limitations. As both reviewers noted, we believe the work presented in this manuscript is an exciting first step in adapting minibinders as antigen sensors for synthetic receptors but many questions remain before these new tools can be widely adopted. We hope that this work will catalyze others to try minibinders as potential antigen sensors when developing novel synthetic receptors, and we hope that future work will more thoroughly test a wide range of linkers to better optimize antigen sensor function across synthetic receptors.

      In our future work, we intend to evaluate a greater diversity of minibinders across different relevant therapeutic targets. We are working to test both existing minibinders as well as generate novel minibinders using deep-learning-based de novo protein design methods. We further hope to explore additional linker modifications, especially focusing on modifications that will allow minibinder-coupled synthetic receptors to escape the glycocalyx of engineered cells. We hope to share findings on these topics in either an update to this manuscript or in future manuscripts, depending on the results of our studies in progress.

      Finally, reviewers noted a mismatch in the data displayed in Figure 5A and 5C, whereby LCB-CAR-expressing cells induced higher lysis in Figure 5C than in Figure 5A. This is due to Figure 5C showing only 24 hours of incubation between effector and target cells, as opposed to the 72 hours of incubation that is quantitated in 5A. These mismatched timepoints were selected because linker-dependent differences in lysis were most readily apparent at 24 hours and were negligible at 72 hours. The full time course of lysis for this experiment can be seen in Supplemental Figure 2D.

    2. eLife assessment

      This study presents a useful investigation to test de novo-designed minibinders against the Spike protein of SARS-CoV-2 within two classes of synthetic receptors (SNIPRs and CARs). The methods and evidence supporting the focused claims are very solid, although the small-scale nature of the investigation (number of modifications, number of minibinders, etc.) makes it difficult to determine how generalizable these results and potential design principles are. This work will be of interest to synthetic biologists and cell engineers as a starting point for systematic, larger-scale analysis and optimization of synthetic receptor designs for cellular therapy and other applications.

    3. Reviewer #1 (Public Review):

      Summary:

      The authors want to explore how much two known minibinder protein domains against the Spike protein of SARS-CoV-2 can function as a binding domain of 2 sets of synthetic receptors (SNIPR and CAR); the authors also want to know how some modifications of the linkers of these new receptors affect their activation profile.

      Major strengths and weaknesses of the methods and results:

      - Strengths include: analysis of synthetic receptor function for 2 classes of synthetic receptors, with robust and appropriate assays for both kinds of receptors. The modifications of the linkers are also interesting and the types of modifications that are often used in the field.

      - Weaknesses include: none of the data analysis provides statistical interpretation of the results (that I could find). One dataset is confusing: Figures 5A and C, are said to be the same assay with the same constructs, but the results are 30% in A, and 70% in C.

      An appraisal of whether the authors achieved their aims, and whether the results support their conclusions:

      Given the open-ended nature of the goal (implicit in it being an exploration), it is hard to say if the authors have reached their aims; they have done an exploration for sure; is it big enough an exploration? This reviewer is not sure.

      The results are extremely clearly presented, both in the figures and in the text, both for the methods and the results. The claims put forward (with limited exceptions see below) are very solidly supported by the presented data.

      A discussion of the likely impact of the work on the field, and the utility of the methods and data to the community:

      The work may stimulate others to consider minibinders as potential binding domains for synthetic receptors. The modifications that are presented although not novel, do provide a starting point for larger-scale analysis.

      It is not clear how much this is generalizable to other binders (the authors don't make such claims though). The claims are very focused on the tested modifications, and the 2 receptors and minibinder used, a scope that I would define as narrow; the take-home message if one wants to try it with other minibinders or other receptors seems to be: test a few things, and your results may surprise you.

      Any additional context you think would help readers interpret or understand the significance of the work:

      We are at the infancy stage of synthetic receptor optimization and next-generation derivation; there is a dearth of systematic studies, as most focus is on developing a few that work. This work is an interesting attempt to catalyze more research with these new minibinders. Will it be picked up based on this? Not sure.

    4. Reviewer #2 (Public Review):

      Summary:

      Weinberg et al. show that spike LCB minibinders can be used as the extracellular domain for SynNotch, SNIPR, and CAR. They evaluated their designs against cells expressing the target proteins and live virus.

      Strengths:

      This is a good fundamental demonstration of alternative use of the minibinder. The results are unsurprising but robust and solid in most cases.

      Weaknesses:

      The manuscript would benefit from better descriptions of the study's novelty. Given that LCB previously worked in SynNotch, what unexpected finding was uncovered by this study? It is well known that the extracellular domain of CAR is amenable to different types of binding domains (e.g., scFv, nanobody, DARPin, natural ligands). So, it is not surprising that a minibinder also works with CAR. We don't know if the minibinders are more or less likely to be compatible with CAR or SNIPR.

      The demonstrations are all done using just 1 minibinder. It is hard to conclude that minibinders, as a unique class of protein binders, are generalizable in different contexts. All it can conclude is that this specific Spike minibinder can be used in synNotch, SNIPR, and CAR. The LCB3 minibinder seems to be much weaker.

      The sensing of live viruses is interesting, but the output is very weak. It is difficult to imagine a utility for such a weak response.

    1. Reviewer #3 (Public Review):

      Distant metastasis is the major cause of death in patients with breast cancer. In this manuscript, Liu et al. show that RGS10 deficiency elicits distant metastasis via epithelial-mesenchymal transition in breast cancer. As a prognostic indicator of breast cancer, RGS10 regulates the progress of breast cancer and affects tumor phenotypes such as epithelial-mesenchymal transformation, invasion, and migration. The conclusions of this paper are mostly well supported by data, but some analyses need to be clarified.

      (1) Because diverse biomarkers have been identified for EMT, it is recommended to declare the advantages of using RGS10 as an EMT marker.

      (2) The authors utilized databases to study the upstream regulatory mechanisms of RGS10. It is recommended to clarify why the authors focused on miRNAs rather than other epigenetic modifications.

      (3) The role of miR-539-5p in breast cancer has been described in previous studies. Hence, it is recommended to provide detailed elaboration on how miR-539-5p regulates the expression of RGS10.

      (4) To enhance the clarity and interpretability of the Western blot results, it would be advisable to mark the specific kilodalton (kDa) values of the proteins.

    2. eLife assessment

      This study presents a valuable finding on the mechanism to promote distant metastasis in breast cancer. The evidence supporting the claims of the authors is convincing. The work will be of interest to medical biologists working on breast cancer.

    3. Reviewer #1 (Public Review):

      Strengths

      The paper has shown the expression of RGS10 is related to the molecular subtype, distant metastasis, and survival status of breast cancer. The study utilizes bioinformatic analyses, human tissue samples, and in vitro and in vivo experiments which strengthen the data. RGS10 was validated to inhibit EMT through a novel mechanism dependent on LCN2 and miR-539-5p, thereby reducing cancer cell proliferation, colony formation, invasion, and migration. The study elaborated the function of RGS10 in influencing the prognosis and biological behavior which could be considered as a potential drug target in breast cancer.

      Weakness

      The mechanism by which the miR-539-5p/RGS10/LCN2 axis may be related to the prognosis of cancer patients still needs to be elucidated. In addition, the sample size used is relatively limited. In particular, the data would be more convincing if the pathways and mechanisms related to LCN2 were explored further using organoid models, and if the potential of RGS10 as a biomarker and therapeutic target were verified for clinical translation.

    4. Reviewer #2 (Public Review):

      Liu et al., by focusing on regulator of G protein signaling 10 (RGS10), reported that RGS10 expression was significantly lower in breast cancer tissue than in normal adjacent tissue. Genetic inhibition of RGS10 caused epithelial-mesenchymal transition and enhanced cell proliferation, migration, and invasion. These results suggest an inhibitory role of RGS10 in tumor metastasis. Furthermore, bioinformatic analyses determined signaling cascades for RGS10-mediated breast cancer distant metastasis. More importantly, both in vitro and in vivo studies evidenced that alteration of RGS10 expression by modulating its upstream regulator miR-539-5p affects breast cancer metastasis. Altogether, these findings provide insight into the pathogenesis of breast tumors and hence identify potential therapeutic targets in breast cancer.

      The conclusions of this study are mostly well supported by data. However, there is a weakness in the study that needs to be clarified.

      In Figure 2A, although some references supported that SKBR3 and MCF-7 possess poorly aggressive and less invasive abilities, examining only RGS10 expression in those cells, it could not be concluded that 'RGS10 acts as a tumor suppressor in breast cancer'. It would be better to introduce a horizontal comparison of the invasive ability of these 3 types of cells using an invasion assay.

    1. Author response:

      The following is the authors’ response to the original reviews.

      We thank the reviewers for their thorough review of and overall positive comments on our manuscript. We have revised the manuscript to address most of the concerns raised. Below is a point-by-point response to the reviewers’ comments outlining these changes.

      The novelty of the study is compromised due to the recently published structure of unliganded P-Rex1 (Chang et al. 2022). The unliganded and IP4-bound structures of P-Rex1 appear virtually identical, however, no clear comparison is presented in the manuscript. In the same paper, a very similar model of P-Rex1 activation upon binding to PIP3 membranes and Gbeta/gamma is presented.

      This comparison has been added as Supplemental Figure 5. Although similar models of activation are presented in our manuscript and in that of Chang et al. 2022, our model is extended to incorporate inhibition by IP4 and other aspects of regulation not previously incorporated, shown in both schematic form (Figure 6B) and including supporting data (Figure 6A). We also point out that in the work by Chang et al. they used domain insertions to stabilize the structure, and here we present the native protein structure. It turns out that they look similar, but our work reduces concerns over possible engineering artifacts. Finally, our model is further informed by HDX-MS measurements of the enzyme bound to PIP3 in liposomes (Figure 6A and Supplemental figure 8), which reveal the regions of the protein subject to higher dynamics and are consistent with a more fully extended conformation.

      The authors demonstrate that IP4 binding to P-Rex1 results in catalytic inhibition and increased protection of autoinhibitory interfaces, as judged by HDX. The relevance of this in a cellular setting is not clear and is not experimentally demonstrated. Further, mechanistically, it is not clear whether the biochemical inhibition by IP4 of PIP3 activated P-Rex1 is due to competition of IP4 with activating PIP3 binding to the PH domain of P-Rex1, or due to stabilizing the autoinhibited conformation, or both.

      We feel that both occur. IP4 and PIP3 bind to the same site of the PH domain, thus they must be competitive at the very least. We also show that IP4 stabilizes the autoinhibited conformation (based on both our cryo-EM and HDX-MS data). Because PIP3 does not activate either DH/PH or DH/PH-DEP1 (nor does IP4 inhibit, see Sup. Fig. 1), it is not possible for us to tell with this suite of experiments how much the inhibition is due to competition versus stabilization of the autoinhibited conformation.

      It is difficult to judge the error in the HDX experiments presented in Sup. data 1 and 2. In the method section, it is stated that the results represent the average from two samples. How is the SD error calculated in Fig.1B-C?

      To clarify, the following passages have been revised:

      Figure 1 legend – “Graphs show the exchange over time for select regions in the P-Rex1 (B) PH domain and (C) an IP4P region that was disordered in the P-Rex1–Gβγ structure. Shown is the average of two experiments with error bars representing the mean ± standard deviation.” Methods section – “Each sample was analyzed twice by HDX-MS, and the data shown in graphs represent the average of these experiments. For each peptide, the average of all five time points was calculated and used to plot the difference data onto the coordinates.”
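      The averaging described in the revised legend and Methods can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' analysis pipeline: the uptake values, the function names `summarize_peptide` and `difference`, and the use of the sample (n-1) standard deviation are all hypothetical, since the response does not say which SD estimator was used. With two replicates, the sample SD reduces to |a - b| / sqrt(2).

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical deuterium-uptake values for one peptide: two replicate
# HDX-MS runs, each with five exchange time points. Illustrative only.
minus_ip4 = ([1.0, 1.5, 2.0, 2.5, 3.0], [1.2, 1.7, 2.4, 2.9, 3.2])
plus_ip4 = ([0.8, 1.1, 1.6, 2.0, 2.6], [0.8, 1.3, 1.6, 2.2, 2.6])


def summarize_peptide(rep_a, rep_b):
    """Return per-time-point mean and SD, plus the average over all time points.

    Assumption: the SD is the sample standard deviation (n - 1 denominator).
    """
    pairs = list(zip(rep_a, rep_b))
    per_time_mean = [mean(p) for p in pairs]
    per_time_sd = [stdev(p) for p in pairs]
    overall = mean(per_time_mean)  # one number per peptide
    return per_time_mean, per_time_sd, overall


def difference(cond_a, cond_b):
    """Difference of the per-peptide averages (cond_b minus cond_a)."""
    return summarize_peptide(*cond_b)[2] - summarize_peptide(*cond_a)[2]


means, sds, overall = summarize_peptide(*minus_ip4)
delta = difference(minus_ip4, plus_ip4)
```

      In this sketch a negative `delta` would correspond to lower exchange (more protection) in the presence of IP4; the sign convention is likewise an assumption.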

      As mentioned, from the explanations in the manuscript it is difficult to judge the differences between the unliganded and the IP4 bound structure. A superposition, pointing to the main differences, would help. Are there any additional interactions observed that could explain a more stable autoinhibitory conformation?

      Added as Supplemental Figure 5. Although there are global shifts in some of the domains, the overall structures are similar to one another. Due to the moderate resolution of both structures (~4.2 Å), accurate placement of sidechains is difficult, in some places more than others. Because of this, we cannot pinpoint many specific sidechain interactions with certainty. There are no obvious interactions observed in our IP4 bound structure compared to that of 7SYF that would explain a more stable autoinhibited conformation, and thus the evidence comes primarily from the HDX-MS data.

      The cellular significance of IP4 regulation is not clear. Finding a way to manipulate intracellular IP4 levels and showing that this affects P-Rex1 cellular activity would greatly increase the significance of this finding.

      We agree that this would be an informative experiment, but not one that we currently have the means to perform.

      From the presented data it is not clear if inhibition by IP4 is due to competition with PIP3 or due to the proposed stabilization of P-Rex1 autoinhibition. Performing a study as shown in Fig.1D, but with the DH/PH construct could resolve this question.

      First, please see our response to the similar concern from Reviewer 1 above. It is not possible for us to test the DH/PH construct and assess if there is direct competition with PIP3. To emphasize this point (and to correct the error that we never made a call to Sup. Fig. 1C in the original manuscript), we added the following lines to the first paragraph of the Results.

      “Negatively charged liposomes (containing PC/PS), including those that also contain PIP3, unexpectedly inhibit the GEF activity of the DH/PH-DEP1 and DH/PH fragments (Sup. Fig. 1C). Because full-length P-Rex1 is not affected by PC/PS liposomes, it suggests that the observed inhibition represents a non-productive interaction of the DH/PH-DEP1 and DH/PH fragments with negatively charged surfaces in our assay. The lack of activation of DH/PH-DEP1 by PIP3 prevents us from testing whether IP4 can directly inhibit via direct competition with PIP3.”

      If I understand correctly, the data shown in Supplementary Data 1 and 2 are averages of 2 measurements, which makes it difficult to judge real signals from outliers. Perhaps, rather than showing the average, the results from the two experiments could be shown. Also, please explain how the SD error is calculated in Fig.1B-C if the data points indeed are averages of 2 measurements.

      We are sorry for the confusion. The data shown in Sup. Data 1 and 2 are not averages of two experiments. The Methods section has therefore been modified to read: “Each image in Supplemental Data 1 and 2 shows one experiment (rainbow plots) or a difference analysis from those experiments (red to blue plots). Only one of the two sets of experiments performed for each condition (+/- liposomes or +/- IP4) is shown here.” As described above, text has been added to clarify the SD error calculated in Fig. 1B and 1C.

      The authors claim that the data presented in Fig 4B suggests that the salt bridge formed by K207 and E251 is important for autoinhibition. If so, the authors should explain why the K207C mutant is not activated.

      Multiple reviewers had problems with this panel, and we now recognize that we misinterpreted the data, which did not help with this. Because this data is largely just supportive of our structure and SAXS data, Figure 4 was moved to the Supplement and this section of the results now reads:

      “Flexibility of the hinge in the α6-αN helix of the DH/PH module is important for autoinhibition.

      One of our initial goals in this project was to determine a high-resolution structure of the autoinhibited DH/PH-DEP1 core by X-ray crystallography. To this end, we started with the DH/PH-DEP1 A170K variant, which was more inhibited than wild-type but still dynamic, and then introduced S235C/M244C and K207C/E251C double mutants to completely constrain the hinge in the α6-αN helix via disulfide bond formation in a redox-sensitive manner. Single cysteine variants K207C and M244C were generated as controls. The S235C/M244C variant performed as expected, decreasing the activity of the A170K variant to nearly background in the oxidized but not the reduced state (Supplemental Fig. 4). However, the M244C single mutant exhibited similar effects, suggesting that it forms disulfide bonds with cysteine(s) other than S235C. Indeed, the side chains of Cys200 and Cys234 are very close to that of M244C. The K207C/E251C mutant was similar to S235C/M244C under oxidized conditions, but ~15-fold more active (similar to WT DH/PH levels, see Fig. 3C) under reducing conditions. The K207C variant, on the other hand, exhibited higher activity than A170K on its own under oxidizing conditions, but similar activity to all the variants except K207C/E251C when reduced. These results suggest that K207C/E251C in a reduced state and K207C in an oxidized state favor a configuration where the DEP1 domain is less able to engage the DH domain and maintain the kinked state. The mechanism for this is not known. Regardless, these data show that perturbation of contacts between the kinked segments of the α6-αN helix can have profound consequences on the activity of the DH/PH-DEP1 core.”

      In the low-resolution cryo-EM study, it is mentioned that only a few classes exhibit the extra density that ultimately corresponds to autoinhibited P-Rex1. If so, is this also the case in the high-resolution study and how many of the most populated classes contribute to the autoinhibited structure? It would be informative for the reader to provide this information.

      Indeed, only a small subset of the particles are in the autoinhibited conformation in the Krios data set, similar to the Glacios. How many classes these particles partition to is dependent on how many classes are asked for during 2D classification and how many “garbage” particles are present at the different stages of particle stack cleaning during 2D classification. Also, because of the preferred orientation problem, many of the particles in this conformation segregate together during 2D classification. Therefore, in addition to the information shown in Sup. Fig. 2, we think a more informative metric to answer the reviewer’s question is the number of particles at the start of data processing compared to at the end, which is shown in Table 1.

      Page 10, line 217: "The kink .... is important for autoinhibition". It seems unlikely that there is no kink in the activated state. Perhaps it should say something like "Mobility in the kink is important ..."

      Agreed. In fact, the SAXS data we reported on the DH/PH module in Ravala et al. (2020) is most consistent with a DH/PH that exhibits both extended and condensed conformations in solution.

      Fig. 4A: It would help to label helices alpha6 and alphaN.

      These helices have now been labeled.

      Page 11, lines 223 and 228 are contradictory: In line 223 it is stated that K207C/E251C exhibit reduced GEF activity, while on line 228 it says this has little effect under non-reducing conditions.

      We thank the reviewer for this catch. We have modified the text to make it self-consistent.

      In Fig.5B, it would help if the authors mention in the legend that a trans-well migration assay was used, in order to know what the increase in stained cells signifies.

      The legend has been modified to include this information.

      The previous work by Chang et al., 2022 (PMID: 35864164) found that the final DH domain α6 formed the hinge helix (the kink in this manuscript), which undergoes a significant conformational change between closed and opened conformations of P-Rex1. Could the authors discuss the state of the kink in the presence of IP4 and in the P-Rex1 variants A170K and L177E?

      We have now included an alignment of our structure in the presence of IP4 with the Chang et al., 2022 structure (Supplemental Figure 5). There is very little difference in the kink region. Because the A170K variant exhibits reduced GEF activity and a smaller Dmax, it could be speculated that the kink might be further stabilized as compared to wild-type. The L177E variant exhibited activity similar to that of DH/PH alone, implying a relief of the kink. This interpretation is supported by our SAXS analysis of A170K and L177E in Fig. 3.

      I am a bit confused about the set of experiments with the intended DH-DEP1 interface disruptive mutation A170K, which later turned out to enhance P-Rex1 activity inhibition. The authors explained that the DH K170 salt bridges with DEP1 Glu411 stabilize the DH-DEP1 interaction. Next, the authors used P-Rex1 A170K mutant as the backbone for the introduction of disulfide bonds to block the closed configuration of the DH-PH hinge region by creating some mutants S235C/M244C and K207C/E251C. The first intended C235-C244 disulfide bond did not show any effect on the GEF activity because C235 is so close to the native C234 for a potential disulfide bond. I would recommend putting the data of S235C/M244C into a supplemental figure. Also, I am wondering if the GEF activity measurements in Fig 4B could be performed in the presence or absence of IP4 to see whether the IP4-induced autoinhibition form is distinct from the natural autoinhibitory once the kink was unblocked by reducing agent DTT.

      The confusion was warranted by our poor analysis of this data, rectified as discussed above.

      With regards to experiments plus/minus IP4, due to the absence of the IP4P domain, IP4 had no inhibitory effect on the activity of DH/PH or DH/PH-DEP1 (Supplemental Figure 1A and 1B) and as such this experiment would not likely be informative (or at best very hard to interpret).

      For the IP4 versus PIP3 activity assays, the authors indicated that P-Rex1 inhibition is dependent on the Inositol 3-phosphate. Have the authors tested and could they test with either Ins (1,3,4)P3 or Ins(1,3,5)P3?

      In these assays (Figure 1D), we show that inhibition does not occur with Ins(1,4,5)P3. Based on previous structures of IP4 bound to the PH domain and supporting biochemical assays (Cash et al., 2016, Structure), the 3- and 4-phosphates are the most highly coordinated and the next most thermostabilizing headgroup other than IP4 was Ins(1,3,4)P3. Therefore, we would anticipate that Ins(1,3,4)P3 might stabilize the autoinhibited state, perhaps at higher concentrations, but we have not directly tested this.

      The authors should provide the electron density maps of the P-REX1-IP4 complex in the supplemental figure and highlight the maps for two key interactions between DEP1 and DH and between PH and IP4P 4-helix bundle subdomain.

      The Coulomb potential map of this complex is shown in Figure 2A. Due to the moderate resolution of the reconstruction, side chain details cannot be unambiguously modeled at these interfaces, which is why we do not highlight any observed, specific interactions between sidechains.

      The manuscript was written very well and there is only one typing error in the legend of Supplemental Figure 1.

      Thank you for this catch.

      Details of EM density at significant domain interfaces and at the IP4 binding site should be provided as supplementary material.

      Beyond our comment about interfaces above, we have now provided the map representing the bound IP4 as Figure 4B.

      Line 123: It is difficult to discern in Figure 2A the "severe bend" in the helix that connects the DH and PH domains. It was not apparent (to me, at least) where this helix is located until eventually encountering Figure 4. It would be helpful to highlight or label (maybe with an asterisk) the bend site in Fig 2A.

      This has been labeled in Figure 2A.

      Line 125-126: likewise, It would be helpful to the reader to highlight the GTPase binding site in the DH domain.

      This has been labeled in Figure 2A.

      Line 159. Consider adding a supplementary figure showing a superposition of the two pREX-1 regulatory interfaces in the present structure and in 7SYF.

      A superposition of the two structures has now been added as Supplemental Figure 5. Because both structures are of moderate resolution, it is difficult to place side chains with a high degree of certainty. Thus, we did not think it wise to draw conclusions from comparisons between the details of these interfaces.

      Is the positioning of IP4 dictated by the EM density, prior knowledge from high-resolution structures, or both? A rendering of the EM density over the stick model as a supplementary figure would be helpful.

      This was modeled based on both. This image has now been added as Figure 4B.

      It should be emphasized that the jackknife model is similar to the hinge model proposed by Chang et al (2022).

      Mention of similarity between our model and the model proposed by Chang et al., 2022 occurs twice in the manuscript.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors perform a multidisciplinary approach to describe the conformational plasticity of P-Rex1 in various states (autoinhibited, IP4 bound and PIP3 bound). Hydrogen-deuterium exchange (HDX) is used to reveal how IP4 and PIP3 binding affect intramolecular interactions. While IP4 is found to stabilize autoinhibitory interactions, PIP3 does the opposite, leading to deprotection of autoinhibitory sites. Cryo-EM of IP4 bound P-Rex1 reveals a structure in the autoinhibited conformation, very similar to the unliganded structure reported previously (Chang et al. 2022). Mutations at observed autoinhibitory interfaces result in a more open structure (as shown by SAXS), reduced thermal stability and increased GEF activity in biochemical and cellular assays. Together their work portrays a dynamic enzyme that undergoes long-range conformational changes upon activation on PIP3 membranes. The results are technically sound and the conclusions are justified. The main drawback is the limited novelty due to the recently published structure of unliganded P-Rex1, which is virtually identical to the IP4 bound structure presented here. Novel aspects suggest a regulatory role for IP4, but the exact significance and mechanism of this regulation has not been explored.

      Strengths:

      The authors use a multitude of techniques to describe the dynamic nature and conformational changes of P-Rex1 upon binding to IP4 and PIP3 membranes. The different approaches together fit well with the overall conclusion that IP4 binding negatively regulates P-Rex1, while binding to PIP3 membranes leads to conformational opening and catalytic activation. The experiments are performed very thoroughly and are technically sound. The results are clear and support the conclusions.

      Weaknesses:

      (1) The novelty of the study is compromised due to the recently published structure of unliganded P-Rex1 (Chang et al. 2022). The unliganded and IP4 bound structure of P-Rex1 appear virtually identical, however, no clear comparison is presented in the manuscript. In the same paper a very similar model of P-Rex1 activation upon binding to PIP3 membranes and Gbeta-gamma is presented.

      (2) The authors demonstrate that IP4 binding to P-Rex1 results in catalytic inhibition and increased protection of autoinhibitory interfaces, as judged by HDX. The relevance of this in a cellular setting is not clear and is not experimentally demonstrated. Further, mechanistically, it is not clear whether the biochemical inhibition by IP4 of PIP3 activated P-Rex1 is due to competition of IP4 with activating PIP3 binding to the PH domain of P-Rex1, or due to stabilizing the autoinhibited conformation, or both.

      (3) Fig.1B-C: To give a standard deviation from 2 data points has no statistical significance. In this case it would be better to define as range/difference of the 2 data points.
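
      For two measurements, the sample standard deviation is fully determined by their difference, s = |x1 - x2| / sqrt(2), so it conveys nothing beyond the range. A minimal Python sketch of this point follows; the values are made up for illustration and are not taken from the manuscript.

```python
import math
from statistics import stdev


def summarize_duplicate(x1, x2):
    """Return (mean, range, sample SD) for a pair of replicate measurements."""
    spread = abs(x1 - x2)
    return (x1 + x2) / 2, spread, stdev([x1, x2])


# Hypothetical duplicate readings (not data from Fig. 1B-C).
mean_value, spread, sd = summarize_duplicate(1.0, 3.0)

# With n = 2 the sample SD is just the range rescaled by 1/sqrt(2).
assert math.isclose(sd, spread / math.sqrt(2))
```

      Reporting the mean together with the range therefore describes duplicate measurements without implying a variance estimate.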

    3. eLife assessment

      This important study contributes insights into the regulatory mechanisms of a protein governing cell migration at the membrane. The integration of approaches revealing protein structure and dynamics provides convincing data for a model of regulation and suggests a new allosteric role for a solubilized phospholipid headgroup. The work will be interesting to researchers focusing on signaling mechanisms, cell motility, and cancer metastasis.

    4. Reviewer #2 (Public Review):

      Summary:

      In this new paper, the authors used biochemical, structural, and biophysical methods to elucidate the mechanisms by which IP4, the PIP3 headgroup, can induce an autoinhibited form of P-Rex1 and propose a model of how PIP3 can trigger long-range conformational changes of P-Rex1 to relieve this autoinhibition. The main findings of this study are that a new P-Rex1 autoinhibition is driven by an IP4-induced binding of the PH domain to the DH domain active site and that this autoinhibited form is stabilized by two key interactions: between DEP1 and DH, and between PH and the IP4P 4-helix bundle (4HB) subdomain. Moreover, they found that the binding of phospholipid PIP3 to the PH domain can disrupt these interactions to relieve P-Rex1 autoinhibition.

      Strengths:

      The study provides good evidence that binding of IP4 to the P-Rex1 PH domain can establish two long-range interactions, between the catalytic DH domain and the first DEP domain and between the PH domain and the C-terminal IP4P 4HB subdomain, that together generate a novel P-Rex1 autoinhibition mechanism. This valuable finding adds an extra layer of P-Rex1 regulation (perhaps in the cytoplasm) to the synergistic activation by phospholipid PIP3 and the heterotrimeric Gβγ subunits at the plasma membrane. Overall, this manuscript's goal is interesting, and the experiments were carried out carefully and reliably.

      Weakness:

      The set of experiments with the S235C/M244C disulfide bond was somewhat confusing to interpret. It has been moved into the supplement, and the text and Figure 4 were altered accordingly.

    5. Reviewer #3 (Public Review):

      Summary:

      In this report, Ravala et al demonstrate that IP4, the soluble head-group of phosphatidylinositol 3,4,5-trisphosphate (PIP3), is an inhibitor of pREX-1, a guanine nucleotide exchange factor (GEF) for Rac1 and related small G proteins that regulate cell migration. This finding is perhaps unexpected since pREX-1 activity is PIP3-dependent. By way of Cryo-EM (revealing the structure of the p-REX-1/IP4 complex at 4.2Å resolution), hydrogen-deuterium mass spectrometry and small angle X-ray scattering, they deduce a mechanism for IP4 activation, and conduct mutagenic and cell-based signaling assays that support it. The major finding is that IP4 stabilizes two interdomain interfaces that block access of the DH domain, which conveys GEF activity towards small G protein substrates. One of these is the interface between the PH domain that binds to IP4 and a 4-helix bundle extension of the IP4 Phosphatase domain and the DEP1 domain. The two interfaces are connected by a long helix that extends from PH to DEP1. Although the structure of fully activated pREX-1 has not been determined, the authors propose a "jackknife" mechanism, similar to that described earlier by Chang et al (2022) (referenced in the authors' manuscript) in which binding of PIP3 relieves a kink in a helix that links the PH/DH modules and allows the DH-PH-DEP triad to assume an extended conformation in which the DH domain is accessible. While the structure of the activated pREX-1 has not been determined, cysteine mutagenesis that enforces the proposed kink is consistent with this hypothesis. SAXS and HDX-MS experiments suggest that IP4 acts by stiffening the inhibitory interfaces, rather than by reorganizing them. Indeed, the cryo-EM structure of ligand-free pREX-1 shows that interdomain contacts are largely retained in the absence of IP4.

      Strengths:

      The manuscript thus describes a novel regulatory role for IP4 and is of considerable significance to our understanding of regulatory mechanisms that control cell migration, particularly in immune cell populations. Specifically, they show how the inositol polyphosphate IP4 controls the activity of pREX-1, a guanine nucleotide exchange factor that controls the activity of small G proteins Rac and CDC42. In their clearly-written discussion, the authors explain how PIP3, the cell membrane and the Gbeta-gamma subunits of heterotrimeric G proteins together localize pREX-1 at the membrane and induce activation. The quality of experimental data is high and both in vitro and cell-based assays of site-directed mutants designed to test the authors' hypotheses are confirmatory. The results strongly support the conclusions. The combination of cryo-EM data, which describe the static (if heterogeneous) structures, with experiments (small angle x-ray scattering and hydrogen-deuterium exchange-mass spectrometry) that report on dynamics is well employed by the authors.

      Manuscript revision:

      The reviewers noted a number of weaknesses, including error analysis of the HDX data, interpretation of the mutagenesis data, the small fraction of the total number of particles used to generate the EM reconstruction, the novelty of the findings in light of the previous report by Chang et al., 2022, various details regarding presentation of structural results and questions regarding the interpretation of the inhibition data (Figure 1D). The authors have responded adequately to these critiques. It appears that pREX-1 is a highly dynamic molecule, and considerable heterogeneity among particles might be expected.

      While, indeed, the conformation of pREX presented in this report is not novel, the finding that this inactive conformational state is stabilized by IP4 is significant and important. The evidence for this is both structural and biochemical, as indicated by micromolar competition of IP4 with PIP3-enriched vesicles resulting in the inhibition of pREX-1 GEF activity.

    1. Reviewer #3 (Public Review):

      Summary:

      Studying evolutionary trajectories provides important insight into the genetic architecture of adaptation and can help evaluate the predictability (or unpredictability) of biological processes involving adaptation. While many papers in the field address adaptation to environmental challenges, relatively few studies examine how genomic contexts, such as large-scale variation, can impact the evolutionary outcomes of adaptation. This research experimentally evolved a genome-reduced strain for ~1000 generations with 9 replicates and dissected their evolutionary changes. Using a fitness assay based on OD measurements, the authors claimed there is a general trend of increasing growth rate and decreasing carrying capacity, despite a positive correlation among all replicates. The authors also performed genomic and transcriptomic analyses at the end of experimental evolution, claiming dissimilarity in the evolution at the molecular level.

      Strengths:

      The experimental evolution approach with a high number of replicates provides a good way to reveal the generality/diversity of the evolutionary routes.

      The assay of fitness, genome, and transcriptome all together allows a more thorough understanding of the evolutionary scenarios and genetic mechanisms.

      Comments on revised version:

      5 in the last round of comments: When the authors mentioned no overlap at the single-mutation level, I thought the authors would directly use this statement to support their next sentence about no bias of these mutations. As the authors responded, I suspected that the absence of overlap among 65 mutations across the entire genome is unlikely to be statistically significant. In the revised version, the authors emphasized and specified their simulation and argument in the following sentences, so I do not have questions on this point anymore.

      14 in the last round of comments: As the authors responded, "short-term responses" meant transcriptional or physiological changes within a few hours after environmental or genetic fluctuation. "Long-term responses" involve new compensatory mutations and selection. The point was that the authors found that "the transcriptome reorganization for fitness increase triggered by evolution differed from that for fitness decrease caused by genome reduction." That is short- vs long-term responses to genetic perturbation. Some other experimental evolution studies compared short- vs long-term responses to environmental perturbation and usually also found that the short-term responses are reverted in the long-term responses (e.g., https://academic.oup.com/mbe/article/33/1/25/2579742). I hope this explanation makes more sense. And I think the authors can make their own decisions on whether they would like to add this discussion or not.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Response to Reviewer #1:

      Thank you for the careful reading and the positive evaluation of our manuscript. As you mentioned, the present study tried to address the question of how the lost genomic functions could be compensated by evolutionary adaptation, indicating the potential mechanism of "constructive" rather than "destructive" evolution. Thank you for the instructive comments that helped us to improve the manuscript. We sincerely hope the revised manuscript and the following point-by-point response address your concerns.

      • Line 80 "Growth Fitness" is this growth rate?

      Yes. The sentence was revised as follows.

      (L87-88) “The results demonstrated that most evolved populations (Evos) showed improved growth rates, in which eight out of nine Evos were highly significant (Fig. 1B, upper).”

      • Line 94 a more nuanced understanding of r/K selection theory, allows for trade-ups between R and K, as well as trade-offs. This may explain why you did not see a trade-off between growth and carrying capacity in this study. See this paper https://doi.org/10.1038/s41396-023-01543-5. Overall, your evos lineages evolved higher growth rates and lower carrying capacity (Figures 1B, C, E). If selection was driving the evolution of higher growth rates, it may have been that there was no selective pressure to maintain high carrying capacity. This means that the evolutionary change you observed in carrying capacity may have been neutral "drift" of the carrying capacity trait, during selection for growth rate, not because of a trade-off between R and K. This is especially likely since carrying capacity declined during evolution. Unless the authors have convincing evidence for a tradeoff, I suggest they remove this claim.

      • Line 96 the authors introduce a previous result where they use colony size to measure growth rate, this finding needs to be properly introduced and explained so that we can understand the context of the conclusion.

      • Line 97 This sentence "the collapse of the trade-off law likely resulted from genome reduction." I am not sure how the authors can draw this conclusion, what is the evidence supporting that the genome size reduction causes the breakdown of the tradeoff between R and K (if there was a tradeoff)?

      Thank you for the reference information and the thoughtful comments. The recommended paper was newly cited, and the description of the trade-off collapse was deleted. Accordingly, the corresponding paragraph was rewritten as follows.

      (L100-115) “Intriguingly, a positive correlation was observed between the growth fitness and the carrying capacity of the Evos (Fig. 1D). It was somehow consistent with the positive correlations between the colony growth rate and the colony size of a genome-reduced strain 11 and between the growth rates and the saturated population size of an assortment of genome reduced strains 13. Nevertheless, the negative correlation between growth rate and carrying capacity, known as the r/K selection30,31 was often observed as the trade-off relationship between r and K in the evolution and ecology studies 32 33,34. As the r/K trade-off was proposed to balance the cellular metabolism that resulted from the cost of enzymes involved 34, the deleted genes might play a role in maintaining the metabolism balance for the r/K correlation. On the other hand, the experimental evolution (i.e., serial transfer) was strictly performed within the exponential growth phase; thus, the evolutionary selection was supposed to be driven by the growth rate without selective pressure to maintain the carrying capacity. The declined carrying capacity might have been its neutral "drift" but not a trade-off to the growth rate. Independent and parallel experimental evolution of the reduced genomes selecting either r or K is required to clarify the actual mechanisms.”

      • Line 103 Genome mutations. The authors claim that there are no mutations in parallel but I see that there is a 1199 base pair deletion in eight of the nine evo strains (Table S3). I would like the author to mention this and I'm actually curious about why the authors don't consider this parallel evolution.

      Thank you for your careful reading. According to your comment, we added a brief description of the 1199-bp deletion detected in the Evos as follows.

      (L119-122) “The number of mutations largely varied among the nine Evos, from two to 13, and no common mutation was detected in all nine Evos (Table S3). A 1,199-bp deletion of insH was frequently found in the Evos (Table S3, highlighted), which well agreed with its function as a transposable sequence.”

      • Line 297 Please describe the media in full here - this is an important detail for the evolution experiment. Very frustrating to go to reference 13 and find another reference, but no details of the method. Looked online for the M63 growth media and the carbon source is not specified. This is critical for working out what selection pressures might have driven the genetic and transcriptional changes that you have measured. For example, the parallel genetic change in 8/9 populations is a deletion of insH and tdcD (according to Table S3). This is acetate kinase, essential for the final step in the overflow metabolism of glucose into acetate. If you have a very low glucose concentration, then it could be that there was selection to avoid fermentation and devote all the pyruvate that results from glycolysis into the TCA cycle (which is more efficient than fermentation in terms of ATP produced per pyruvate).

      Sorry for the missing information on the medium composition, which was additionally described in the Materials and Methods. The glucose concentration in M63 was 22 mM, which was supposed to be enough for bacterial growth. Thank you for your intriguing thinking about linking the medium component to the genome mutation-mediated metabolic changes. As there was no experimental result regarding the biological function of gene mutation in the present study, please allow us to address this issue in our future work.

      (L334-337) “In brief, the medium contains 62 mM dipotassium hydrogen phosphate, 39 mM potassium dihydrogen phosphate, 15 mM ammonium sulfate, 15 μM thiamine hydrochloride, 1.8 μM Iron (II) sulfate, 0.2 mM magnesium sulfate, and 22 mM glucose.”

      • Line 115. I do not understand this argument "They seemed highly related to essentiality, as 11 out of 49 mutated genes were essential (Table S3)." Is this a significant enrichment compared to the expectation, i.e. the number of essential genes in the genome? This enrichment needs to be tested with a Hypergeometric test or something similar.

      • Also, "As the essential genes were known to be more conserved than nonessential ones, the high frequency of the mutations fixed in the essential genes suggested the mutation in essentiality for fitness increase was the evolutionary strategy for reduced genome." I do not think that there is enough evidence to support this claim, and it should be removed.

      Sorry for the unclear description. Yes, the mutations were significantly enriched in the essential genes (11 out of 45 genes) compared to the essential genes in the whole genome (286 out of 3290 genes). The improper description linking the mutation in essential genes to the fitness increase was removed, and an additional explanation on the ratio of essential genes was newly supplied as follows.

      (L139-143) “The ratio of essential genes in the mutated genes was significantly higher than in the total genes (286 out of 3290 genes, Chi-square test p=0.008). As the essential genes were determined according to the growth35 and were known to be more conserved than nonessential ones 36,37, the high frequency of the mutations fixed in the essential genes was highly intriguing and reasonable.”
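
      The reviewer suggested a hypergeometric test for this enrichment, whereas the response reports a Chi-square test. As an illustration only (this is not the authors' analysis), the following Python sketch computes the one-sided hypergeometric tail probability from the counts quoted in the response: 11 essential genes among 45 mutated genes, against 286 essential genes among 3290 genes in total. Note that the reviewer's quotation gives 49 mutated genes, so the sample size is an input to check rather than a settled value.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn without replacement from N genes,
    of which K are essential (one-sided enrichment test)."""
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return hits / total


# Counts as quoted in the author response; treat them as assumptions.
mutated_genes = 45
essential_mutated = 11
essential_total = 286
genes_total = 3290

expected = mutated_genes * essential_total / genes_total
p_value = hypergeom_upper_tail(essential_mutated, mutated_genes,
                               essential_total, genes_total)
print(f"expected essential hits under the null: {expected:.2f}")
print(f"one-sided hypergeometric p-value: {p_value:.3g}")
```

      The exact tail probability does not rely on a large-sample approximation, which matters here because the expected count in the mutated set is small.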

      • Line 124 Regarding the mutation simulations, I do not understand how the observed data were compared to the simulated data, and how conclusions were drawn. Can the authors please explain the motivation for carrying out this analysis, and clearly explain the conclusions?

      Random simulation was additionally explained in the Materials and Methods and the conclusion of the random simulation was revised in the Results, as follows.

      (L392-401) “The mutation simulation was performed with Python in the following steps. A total of 65 mutations were randomly generated on the reduced genome, and the distances from the mutated genomic locations to the nearest genomic scars caused by genome reduction were calculated. Subsequently, Welch's t-test was performed to evaluate whether the distances calculated from the random mutations were significantly longer or shorter than those calculated from the mutations that occurred in Evos. The random simulation, distance calculation, and statistic test were performed 1,000 times, which resulted in 1,000 p values. Finally, the mean of p values (μp) was calculated, and a 95% reliable region was applied. It was used to evaluate whether the 65 mutations in the Evos were significantly close to the genomic scars, i.e., the locational bias.”

      (L148-157) “Random simulation was performed to verify whether there was any bias or hotspot in the genomic location for mutation accumulation due to the genome reduction. A total of 65 mutations were randomly generated on the reduced genome (Fig. 2B), and the genomic distances from the mutations to the nearest genome reduction-mediated scars were calculated. Welch's t-test was performed to evaluate whether the genomic distances calculated from random mutations significantly differed from those from the mutations accumulated in the Evos. As the mean of p values (1,000 times of random simulations) was insignificant (Fig. 2C, μp > 0.05), the mutations fixed on the reduced genome were either closer or farther to the genomic scars, indicating there was no locational bias for mutation accumulation caused by genome reduction.”
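
      The simulation steps described above can be sketched in Python roughly as follows. This is a hedged illustration, not the authors' script: the genome length, scar positions and observed mutation positions are placeholders, the chromosome is assumed to be circular when measuring distances, and the two-sided p-value uses a normal approximation to the Welch t statistic (reasonable for samples of 65, but not identical to an exact t-distribution p-value).

```python
import random
from statistics import NormalDist, mean, variance


def nearest_scar_distance(pos, scars, genome_len):
    """Distance from pos to the nearest scar on a circular genome (assumption)."""
    return min(min(abs(pos - s), genome_len - abs(pos - s)) for s in scars)


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / (va + vb) ** 0.5
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def approx_two_sided_p(t):
    """Two-sided p-value from a normal approximation to the t statistic."""
    return 2 * (1 - NormalDist().cdf(abs(t)))


def mean_p_value(observed_positions, scars, genome_len, n_sim=1000, seed=0):
    """Mean p-value over n_sim comparisons of random vs observed distances."""
    rng = random.Random(seed)
    observed = [nearest_scar_distance(p, scars, genome_len)
                for p in observed_positions]
    p_values = []
    for _ in range(n_sim):
        # Draw as many random mutation sites as there are observed mutations.
        random_positions = [rng.randrange(genome_len) for _ in observed_positions]
        simulated = [nearest_scar_distance(p, scars, genome_len)
                     for p in random_positions]
        t, _df = welch_t(simulated, observed)
        p_values.append(approx_two_sided_p(t))
    return mean(p_values)


if __name__ == "__main__":
    # Placeholder inputs: hypothetical genome length, scars and mutation sites.
    demo_rng = random.Random(42)
    genome_len = 4_000_000
    scars = sorted(demo_rng.randrange(genome_len) for _ in range(40))
    observed_positions = [demo_rng.randrange(genome_len) for _ in range(65)]
    print(mean_p_value(observed_positions, scars, genome_len))
```

      A mean p-value above 0.05 would be read, as in the response, as no detectable tendency for mutations to lie closer to or farther from the scars than random placement would give.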

      • Line 140 The authors should give some background here - explain the idea underlying chromosomal periodicity of the transcriptome, to help the reader understand this analysis.

      • Line 142 Here and elsewhere, when referring to a method, do not just give the citation, but also refer to the methods section or relevant supplementary material.

      The analytical process (references and methods) was described in the Materials and Methods, and the reason we performed the chromosomal periodicity was added in the Results as follows.

      (L165-172) “As the E. coli chromosome was structured, whether the genome reduction caused the changes in its architecture, which led to the differentiated transcriptome reorganization in the Evos, was investigated. The chromosomal periodicity of gene expression was analyzed to determine the structural feature of genome-wide pattern, as previously described 28,38. The analytical results showed that the transcriptomes of all Evos presented a common six-period with statistical significance, equivalent to those of the wild-type and ancestral reduced genomes (Fig. 3A, Table S4).”

      • Line 151 "The expression levels of the mutated genes were higher than those of the remaining genes (Figure 3B)"- did this depend on the type of mutation? There were quite a few early stops in genes, were these also more likely to be expressed? And how about the transcriptional regulators, can you see evidence of their downstream impact?

      Sorry, we didn't investigate the detailed regulatory mechanisms of 49 mutated genes, which was supposed to be out of the scope of the present study. Fig. 3B was the statistical comparison between 3225 and 49 genes. It didn't mean that all mutated genes expressed higher than the others. The following sentences were added to address your concern.

      (L181-185) “As the regulatory mechanisms or the gene functions were supposed to be disturbed by the mutations, the expression levels of individual genes might have been either up- or down-regulated. Nevertheless, the overall expression levels of all mutated genes tended to be increased. One of the reasons was assumed to be the mutation essentiality, which remained to be experimentally verified.”

      • Line 199 onward. The authors used WGCNA to analyze the gene expression data of evolved organisms. They identified distinct gene modules in the reduced genome, and through further analysis, they found that specific modules were strongly associated with key biological traits like growth fitness, gene expression changes, and mutation rates. Did the authors expect that there was variation in mutation rate across their populations? Is variation from 3-16 mutations that they observed beyond the expectation for the wt mutation rate? The genetic causes of mutation rate variation are well understood, but I could not see any dinB, mutT,Y, rad, or pol genes among the discovered mutations. I would like the authors to justify the claim that there was mutation rate variation in the evolved populations.

      Thank you for the intriguing thought. We don't think the mutation rates varied significantly across the nine populations, as no mutation occurred in the MMR genes, as you noticed. Our previous study showed that the spontaneous mutation rate of the reduced genome was higher than that of the wild-type genome (Nishimura et al., 2017, mBio). As nonsynonymous mutations were not detected in all nine Evos, the spontaneous mutation rate couldn't be calculated (because it should be evaluated according to the ratio of nonsynonymous and synonymous single-nucleotide substitutions in molecular evolution). Therefore, the mutation rate could not be discussed in the present study. The following sentence was added for a better understanding of the gene modules.

      (L242-245) “These modules M2, M10 and M16 might be considered as the hotspots for the genes responsible for growth fitness, transcriptional reorganization, and mutation accumulation of the reduced genome in evolution, respectively.”

      • Line 254 I get the idea of all roads leading to Rome, which is very fitting. However, describing the various evolutionary strategies and homeostatic and variable consequence does not sound correct - although I am not sure exactly what is meant here. Looking at Figure 7, I will call strategy I "parallel evolution", that is following the same or similar genetic pathways to adaptation and strategy ii I would call divergent evolution. I am not sure what strategy iii is. I don't want the authors to use the terms parallel and divergent if that's not what they mean. My request here would be that the authors clearly describe these strategies, but then show how their results fit in with the results, and if possible, fit with the naming conventions, of evolutionary biology.

      Thank you for your kind consideration and excellent suggestion. It's our pleasure to adopt your idea in our study. The evolutionary strategies were renamed according to your recommendation. Both the main text and Fig. 7 were revised as follows.

      (L285-293) “Common mutations22,44 or identical genetic functions45 were reported in the experimental evolution with different reduced genomes, commonly known as parallel evolution (Fig. 7, i). In addition, as not all mutations contribute to the evolved fitness 22,45, another strategy for varied phenotypes was known as divergent evolution (Fig. 7, ii). The present study accentuated the variety of mutations fixed during evolution. Considering the high essentiality of the mutated genes (Table S3), most or all mutations were assumed to benefit the fitness increase, partially demonstrated previously 20. Nevertheless, the evolved transcriptomes presented a homeostatic architecture, revealing the divergent to convergent evolutionary strategy (Fig. 7, iii).”

      Author response image 1.

      • Line 327 Growth rates/fitness. I don't think this should be called growth fitness- a rate is being calculated. I would like the authors to explain how the times were chosen - do the three points have to be during the log phase? Can you also explain what you mean by choosing three ri that have the largest mean and minor variance?

      Sorry for the confusing term usage. The fitness assay was changed to the growth assay. The three ri with the largest mean and minor variance were chosen to avoid the occasional large values (blue circle), as shown in the following figure. In addition, the details of the growth analysis can be found at https://doi.org/10.3791/56197 (ref. 59), where the video of experimental manipulation, protocol, and data analysis is deposited. The following sentence was added accordingly.

      Author response image 2.

      (L369-371) “The growth rate was determined as the average of three consecutive ri, showing the largest mean and minor variance to avoid the unreliable calculation caused by the occasionally occurring values. The details of the experimental and analytical processes can be found at https://doi.org/10.3791/56197.”
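
      The selection rule quoted above can be sketched as follows. This is only an illustration under stated assumptions, not the protocol of ref. 59: the stepwise rate is taken as ri = ln(OD(i+1)/OD(i)) / (t(i+1) - t(i)), and "largest mean and minor variance" is read as "the window of three consecutive ri with the largest mean among windows whose variance stays below a tolerance". The function names and the `max_variance` tolerance are hypothetical.

```python
import math
from statistics import mean, pvariance


def stepwise_rates(times_h, od600):
    """Return r_i = ln(OD_{i+1} / OD_i) / (t_{i+1} - t_i) for consecutive readings."""
    return [
        math.log(od600[i + 1] / od600[i]) / (times_h[i + 1] - times_h[i])
        for i in range(len(od600) - 1)
    ]


def growth_rate(times_h, od600, max_variance=0.01):
    """Average of three consecutive r_i with the largest mean and a small variance.

    max_variance is a hypothetical tolerance (not taken from the source); windows
    above it are discarded so that a single outlying r_i cannot dominate.
    """
    rates = stepwise_rates(times_h, od600)
    if len(rates) < 3:
        raise ValueError("at least four OD600 readings are required")
    windows = [rates[i:i + 3] for i in range(len(rates) - 2)]
    stable = [w for w in windows if pvariance(w) <= max_variance]
    candidates = stable or windows  # fall back to all windows if none is stable
    return mean(max(candidates, key=mean))


if __name__ == "__main__":
    # Synthetic readings: steady growth at 0.5 per hour, then one spurious jump.
    step_rates = [0.5, 0.5, 0.5, 0.5, 1.5, 0.1]
    times = list(range(len(step_rates) + 1))
    od = [0.01]
    for r in step_rates:
        od.append(od[-1] * math.exp(r))
    print(growth_rate(times, od))
```

      In this reading, the variance filter is what excludes windows containing an occasional large value, while the mean criterion picks the fastest stable stretch of the curve.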

      • Line 403 Chromosomal periodicity analysis. The windows chosen for smoothing (100kb) seem big. Large windows make sense for some things - for example looking at how transcription relates to DNA replication timing, which is a whole-genome scale trend. However, here the authors are looking for the differences after evolution, which will be local trends dependent on specific genes and transcription factors. 100kb of the genome would carry on the order of one hundred genes and might be too coarse-grained to see differences between evos lineages.

      Thank you for the advice. We agree that the present analysis focused on the global trend of gene expression, and that varying the window size may lead to different patterns. Additional analysis was performed according to your comment. The results showed that changing the window size (1, 10, 50, 100, and 200 kb) didn't alter the periodicity of the reduced genome, which agreed with the conserved periodicity reported in a previous study on a different reduced genome, MDS42 (Ying et al., 2013, BMC Genomics). The following sentence was added in the Materials and Methods.

      (L460-461) “Note that altering the moving average did not change the max peak.”
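
      The kind of check described above can be illustrated on synthetic data. The sketch below is not the authors' pipeline: the bin count, the two-component signal, and the function names are all hypothetical, and "max peak" is assumed to mean the strongest non-zero frequency in a periodogram. It smooths a circular profile with moving averages of different widths and reports the dominant period each time.

```python
import cmath
import math


def circular_moving_average(values, window):
    """Moving average that wraps around the ends, as for a circular chromosome."""
    n = len(values)
    half = window // 2
    return [
        sum(values[(i + j - half) % n] for j in range(window)) / window
        for i in range(n)
    ]


def dominant_period(values):
    """Period (in bins) of the strongest non-zero frequency in a plain DFT."""
    n = len(values)
    centre = sum(values) / n
    centred = [v - centre for v in values]
    best_k, best_power = 1, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum(
            c * cmath.exp(-2j * math.pi * k * i / n) for i, c in enumerate(centred)
        )
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return n / best_k


if __name__ == "__main__":
    n_bins = 400  # hypothetical number of positional bins
    # Synthetic profile: a strong long-range wave plus a weaker short-range one.
    profile = [
        math.sin(2 * math.pi * 6 * i / n_bins)
        + 0.3 * math.sin(2 * math.pi * 40 * i / n_bins)
        for i in range(n_bins)
    ]
    for window in (1, 10, 50):
        smoothed = circular_moving_average(profile, window)
        print(window, dominant_period(smoothed))
```

      A moving average rescales each frequency component without shifting it, so the dominant period survives unless the window suppresses that component, for example when the window width is a multiple of the period.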

      • Figures - the figures look great. Figure 7 needs a legend.

      Thank you. The following legend was added.

      (L774-777) “Three evolutionary strategies are proposed. Pink and blue arrowed lines indicate experimental evolution and genome reduction, respectively. The size of the open cycles represents the genome size. Black and grey indicate the ancestor and evolved genomes, respectively.”

      Response to Reviewer #2:

      Thank you for reviewing our manuscript and for your fruitful comments. We agree that our study leaned towards elaborating observed findings rather than explaining the detailed biological mechanisms. We focused on the genome-wide biological features rather than the specific biological functions. The underlying mechanisms indeed remained unknown, leaving the questions you raised. We didn't perform the fitness assay on reconstituted (single and combinatorial) mutants because the research purpose was not to clarify the regulatory or metabolic mechanisms. This is why the RNA-Seq analysis provided findings on genome-wide patterns and the chromosomal view, which were supposed to be biologically valuable. We understand your concern that the conclusions were biologically meaningless, as ALE studies commonly favor a story that identifies a specific gene regulation or an improved pathway, which was not the flow of the present study.

      For this reason, our revision may not address all these concerns. Considering your comments, we tried our best to revise the manuscript. The changes made were highlighted. We sincerely hope the revision and the following point-to-point response are acceptable.

      Major remarks:

      (1) The authors outlined the significance of ALE in genome-reduced organisms and important findings from published literature throughout the Introduction section. The description in L65-69, which I believe pertains to the motivation of this study, seems vague and insufficient to convey the novelty or necessity of this study i.e. it is difficult to grasp what aspects of genome-reduced biology that this manuscript intends to focus/find/address.

      Sorry for the unclear writing. The sentences were rewritten for clarity as follows.

      (L64-70) “Although the reduced growth rate caused by genome reduction could be recovered by experimental evolution, it remains unclear whether such an evolutionary improvement in growth fitness was a general feature of the reduced genome and how the genome-wide changes occurred to match the growth fitness increase. In the present study, we performed the experimental evolution with a reduced genome in multiple lineages and analyzed the evolutionary changes of the genome and transcriptome.”

      (2) What is the rationale behind the lineage selection described in Figure S1 legend "Only one of the four overnight cultures in the exponential growth phase (OD600 = 0.01~0.1) was chosen for the following serial transfer, highlighted in red."?

      The four wells (cultures of different initial cell concentrations) were measured every day, and only the well that showed OD600=0.01~0.1 (red) was transferred at four different dilution rates (e.g., 10-, 100-, 1000-, and 10000-fold). This resulted in four wells of different initial cell concentrations. Multiple dilutions ensured that at least one of the wells would show an OD600 within the range of 0.01 to 0.1 after the overnight culture. That well was then used for the next serial transfer. Fig. S1 provides the details of the experimental records. The experimental evolution was strictly controlled within the exponential phase, quite different from the commonly conducted ALE that transfers a single culture at a fixed dilution rate. Serial transfer with multiple dilution rates was previously applied in our evolution experiments and well described in Nishimura et al., 2017, mBio; Lu et al., 2022, Comm Biol; Kurokawa et al., 2022, Front Microbiol, etc. The following sentence was added in the Materials and Methods.

      (L344-345) “Multiple dilutions changing in order promised at least one of the wells within the exponential growth phase after the overnight culture.”
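
      The daily transfer rule can be summarized in a few lines. The sketch below is only a schematic of the procedure as described in this response: the function names are hypothetical, the dilution factors are the example values given above, and taking the first well in range when several qualify is an assumption not stated in the source.

```python
DILUTIONS = (10, 100, 1000, 10000)  # example fold-dilutions quoted in the response
OD_RANGE = (0.01, 0.1)              # exponential-phase window for OD600


def pick_well(od600_readings, od_range=OD_RANGE):
    """Index of the well whose overnight OD600 lies within od_range, else None.

    If several wells qualify, the first one is returned (an assumption).
    """
    low, high = od_range
    in_range = [i for i, od in enumerate(od600_readings) if low <= od <= high]
    return in_range[0] if in_range else None


def next_inocula(source_od, dilutions=DILUTIONS):
    """Starting OD600 of the next four wells after diluting the chosen culture."""
    return [source_od / d for d in dilutions]


if __name__ == "__main__":
    overnight = [0.5, 0.06, 0.007, 0.001]  # hypothetical readings of four wells
    chosen = pick_well(overnight)
    if chosen is not None:
        print(chosen, next_inocula(overnight[chosen]))
```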

      (3) The measured growth rate of the end-point 'F2 lineage' shown in Figure S2 seemed comparable to the rest of the lineages (A1 to H2), but the growth rate of 'F2' illustrated in Figure 1B indicates otherwise (L83-84). What is the reason for the incongruence between the two datasets?

      Sorry for the unclear description. The growth rates shown in Fig. S2 were obtained during the evolution experiment using the daily transfer's initial and final OD600 values. The growth rates shown in Fig. 1B were obtained from the final population (Evos) growth assay and calculated from the growth curves (biological replication, N=4). Fig. 1B shows the precisely evaluated growth rates, and Fig. S2 shows the evolutionary changes in growth rates. Accordingly, the following sentence was added to the Results.

      (L84-87) “As the growth increases were calculated according to the initial and final records, the exponential growth rates of the ancestor and evolved populations were obtained according to the growth curves for a precise evaluation of the evolutionary changes in growth.”

      (4) Are the differences in growth rate statistically significant in Figure 1B?

      Eight out of nine Evos were significant, except F2. The sentences were rewritten and associated with the revised Fig. 1B, indicating significance.

      (L87-90) “The results demonstrated that most evolved populations (Evos) showed improved growth rates, in which eight out of nine Evos were highly significant (Fig. 1B, upper). However, the magnitudes of growth improvement were considerably varied, and the evolutionary dynamics of the nine lineages were somehow divergent (Fig. S2).”

      (5) The evolved lineages showed a decrease in their maximal optical densities (OD600) compared to the ancestral strain (L85-86). ALE could accompany changes in cell size and morphologies, (doi: 10.1038/s41586-023-06288-x; 10.1128/AEM.01120-17), which may render OD600 relatively inaccurate for cell density comparison. I suggest using CFU/mL metrics for the sake of a fair comparison between Anc and Evo.

      The methods evaluating the carrying capacity (i.e., cell density, population size, etc.) do not change the results. Even CFU is unfair to living cells that cannot form colonies, and unfair if the cell size changes. Optical density (OD600) provides the temporal changes of cell growth at 15-minute intervals, which results in an exact evaluation of the growth rate in the exponential phase. CFU is poor at recording the temporal changes of the population, which tends to result in an inappropriate growth rate. Taken together, we believe that our method was reasonable and reliable. We hope you can accept this different approach.

      (6) Please provide evidence in support of the statement in L115-119. i.e. statistical analysis supporting that the observed ratio of essential genes in the mutant pool is not random.

      The statistical test was performed, and the following sentence was added.

      (L139-141) “The ratio of essential genes in the mutated genes was significantly higher than in the total genes (286 out of 3290 genes, Chi-square test p=0.008).”
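
      For readers who wish to see how such a comparison can be set up, a minimal sketch of a 2x2 chi-square test follows. The counts (11 essential among 49 mutated genes; 286 essential among 3290 genes) are taken from the text, but the table layout (mutated genes versus all remaining genes), the absence of a continuity correction, and the function name are assumptions; the source does not state how its test was configured, so the resulting p-value need not match the reported one.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (chi2, p), where p is the upper-tail probability for 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = math.erfc(math.sqrt(chi2 / 2))  # chi-square survival function, df = 1
    return chi2, p


if __name__ == "__main__":
    mutated_essential, mutated_total = 11, 49     # from the text
    genome_essential, genome_total = 286, 3290    # from the text

    # Assumed layout: rows = mutated / not mutated, columns = essential / nonessential.
    a = mutated_essential
    b = mutated_total - mutated_essential
    c = genome_essential - mutated_essential
    d = (genome_total - mutated_total) - c
    print(chi_square_2x2(a, b, c, d))
```

      Alternative set-ups, such as a goodness-of-fit test against the genome-wide proportion or a Yates-corrected test, would give somewhat different values.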

      (7) The assumption that "mutation abundance would correlate to fitness improvement" described in L120-122: "The large variety in genome mutations and no correlation of mutation abundance to fitness improvement strongly suggested that no mutations were specifically responsible or crucially essential for recovering the growth rate of the reduced genome" is not easy to digest, in the sense that (i) the effect of multiple beneficial mutations are not necessarily summative, but are riddled with various epistatic interactions (doi: 10.1016/j.mec.2023.e00227); (ii) neutral hitchhikers are of common presence (you could easily find reference on this one); (iii) hypermutators that accumulate greater number of mutations in a given time are not always the eventual winners in competition games (doi: 10.1126/science.1056421). In this sense, the notion that "mutation abundance correlates to fitness improvement" in L120-122 seems flawed (for your perusal, doi: 10.1186/gb-2009-10-10-r118).

      Sorry for the improper description and confusing writing, and thank you for the fruitful knowledge on molecular evolution. The sentence was deleted, and the following one was added.

      (L145-146) “Nevertheless, it was unclear whether and how these mutations were explicitly responsible for recovering the growth rate of the reduced genome.”

      (8) Could it be possible that the large variation in genome mutations in independent lineages results from a highly rugged fitness landscape characterized by multiple fitness optima (doi: 10.1073/pnas.1507916112)? If this is the case, I disagree with the notion in L121-122 "that no mutations were specifically responsible or crucially essential" It does seem to me that, for example, the mutations in evo A2 are specifically responsible and essential for the fitness improvement of evo A2 in the evolutionary condition (M63 medium). Fitness assessment of individual (or combinatorial) mutants reconstituted in the Ancestral background would be a bonus.

      Thank you for the intriguing thinking. The sentence was deleted. Please allow us to adapt your comment to the manuscript as follows.

      (L143-145) “The large variety of genome mutations fixed in the independent lineages might result from a highly rugged fitness landscape 38.”

      (9) L121-122: "...no mutations were specifically responsible or crucially essential for recovering the growth rate of the reduced genome". Strictly speaking, the authors should provide a reference case of wild-type E. coli ALE in order to reach definitive conclusions that the observed mutation events are exclusive to the genome-reduced strain. It is strongly recommended that the authors perform comparative analysis with an ALEed non-genome-reduced control for a more definitive characterization of the evolutionary biology in a genome-reduced organism, as it was done for "JCVI-syn3.0B vs non-minimal M. mycoides" (doi: 10.1038/s41586-023-06288-x) and "E. coli eMS57 vs MG1655" (doi: 10.1038/s41467-019-08888-6).

      The improper description was deleted in response to comments 7 and 8. The mentioned references were cited in the manuscript (refs 21 and 23). Thank you for the experimental advice. We are sorry that the comparison of wild-type and reduced genomes was not in the scope of the present study and will probably be reported soon in our future work.

      (10) L146-148: "The homeostatic periodicity was consistent with our previous findings that the chromosomal periodicity of the transcriptome was independent of genomic or environmental variation" A Previous study also suggested that the amplitudes of the periodic transcriptomes were significantly correlated with the growth rates (doi: 10.1093/dnares/dsaa018). Growth rates of 8/9 Evos were higher compared to Anc, while that of Evo F2 remained similar. Please comment on the changes in amplitudes of the periodic transcriptomes between Anc and each Evo.

      Thank you for the suggestion. The correlation between the growth rates and the amplitudes of chromosomal periodicity was statistically insignificant (p>0.05). It might be a result of the limited data points. Compared with the only nine data points in the present study, the previous study analyzed hundreds of transcriptomes associated with the corresponding growth rates, which are suitable for statistical evaluation. In addition, the changes in growth rates were more significant in the previous study than in the present study, which might influence the significance. It's why we did not discuss the periodic amplitude.

      (11) Please elaborate on L159-161: "It strongly suggested the essentiality mutation for homeostatic transcriptome architecture happened in the reduced genome.".

      Sorry for the improper description. The sentence was rewritten as follows.

      (L191-193) “The essentiality of the mutations might have participated in maintaining the homeostatic transcriptome architecture of the reduced genome.”

      (12) Is FPKM a valid metric for between-sample comparison? The growing consensus in the community adopts Transcripts Per Kilobase Million (TPM) for comparing gene expression levels between different samples (Figure 3B; L372-379).

      Sorry for the unclear description. The FPKM indicated here was globally normalized, statistically equivalent to TPM. The following sentence was added to the Materials and Methods.

      (L421-422) “The resulting normalized FPKM values were statistically equivalent to TPM.”
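
      The standard relation between the two units can be stated compactly: within one sample, TPM is FPKM rescaled so that the values sum to one million. The sketch below illustrates that relation only; the response does not describe its global normalization procedure, so this is not a reconstruction of it, and the function and gene names are hypothetical.

```python
def fpkm_to_tpm(fpkm):
    """Rescale one sample's FPKM values so that they sum to 1e6 (i.e., TPM)."""
    total = sum(fpkm.values())
    if total <= 0:
        raise ValueError("FPKM values must have a positive sum")
    return {gene: value / total * 1e6 for gene, value in fpkm.items()}


if __name__ == "__main__":
    sample = {"geneA": 12.0, "geneB": 30.0, "geneC": 58.0}  # hypothetical values
    print(fpkm_to_tpm(sample))
```

      Because the rescaling is a single constant per sample, the relative expression of genes within a sample is unchanged; what changes is that every sample shares the same total, which is what makes between-sample comparison more straightforward.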

      (13) Please provide % mapped frequency of mutations in Table S3.

      They were all 100%. The partially fixed mutations were excluded in the present study. The following sentence was added to the caption of Table S3.

      (Supplementary file, p 9) “Note that the entire population held the mutations, i.e., 100% frequency in DNA sequencing.”

      (14) To my knowledge, M63 medium contains glucose and glycerol as carbon sources. The manuscript would benefit from discussing the elements that impose selection pressure in the M63 culture condition.

      Sorry for the missing information on M63, which contains 22 mM glucose as the only carbon source. The medium composition was added in the Materials and Methods, as follows.

      (L334-337) “In brief, the medium contains 62 mM dipotassium hydrogen phosphate, 39 mM potassium dihydrogen phosphate, 15 mM ammonium sulfate, 15 μM thiamine hydrochloride, 1.8 μM Iron (II) sulfate, 0.2 mM magnesium sulfate, and 22 mM glucose.”

      (15) The RNA-Seq datasets for Evo strains seemed equally heterogenous, just as their mutation profiles. However, the missing element in their analysis is the directionality of gene expression changes. I wonder what sort of biological significance can be derived from grouping expression changes based solely on DEGs, without considering the magnitude and the direction (up- and down-regulation) of changes? RNA-seq analysis in its current form seems superficial to derive biologically meaningful interpretations.

      We agree that most studies often discuss the direction of transcriptional changes. The present study aimed to capture a global view of the magnitude of transcriptome reorganization. Thus, the analyses focused on the overall features, such as the abundance of DEGs, instead of the details of the changes, e.g., the up- and down-regulation of DEGs. The biological meaning of the DEGs' overview was how significantly the genome-wide gene expression fluctuated, which might be short of an in-depth view of individual gene expression. The following sentence was added to indicate the limitation of the present analysis.

      (L199-202) “Instead of an in-depth survey on the directional changes of the DEGs, the abundance and functional enrichment of DEGs were investigated to achieve an overview of how significant the genome-wide fluctuation in gene expression, which ignored the details of individual genes.”

      Minor remarks

      (1) L41: brackets italicized "(E. coli)".

      It was fixed as follows.

      (L40) “… Escherichia coli (E. coli) cells …”

      (2) Figure S1. It is suggested that the x-axis of ALE monitor be set to 'generations' or 'cumulative generations', rather than 'days'.

      Thank you for the suggestion. Fig. S1 describes the experimental procedure, so "day" was used. Fig. S2 presents the evolutionary process, so "generation" was used, as you recommended here.

      (3) I found it difficult to digest through L61-64. Although it is not within the job scope of reviewers to comment on the language style, I must point out that the manuscript would benefit from professional language editing services.

      Sorry for the unclear writing. The sentences were revised as follows.

      (L60-64) “Previous studies have identified conserved features in transcriptome reorganization, despite significant disruption to gene expression patterns resulting from either genome reduction or experimental evolution 27-29. The findings indicated that experimental evolution might reinstate growth rates that have been disrupted by genome reduction to maintain homeostasis in growing cells.”

      (4) Duplicate references (No. 21, 42).

      Sorry for the mistake. It was fixed (leaving ref. 21).

      (5) Inconsistency in L105-106: "from two to 13".

      "From two to 13" was adopted from the language editing. It was changed as follows.

      (L119) “… from 2 to 13, …”

      Response to Reviewer #3:

      Thank you for reviewing our manuscript and for the helpful comments, which improved the strength of the manuscript. The recommended statistical analyses that essentially supported the statements in the manuscript were performed, whereas those that would constitute new results within the scope of further studies were not conducted. The changes made in the revision were highlighted. We sincerely hope the revised manuscript and the following point-to-point response meet your concerns. You will find all your suggested statistical tests in our future work, which will report an extensive study on the experimental evolution of an assortment of reduced genomes.

      (1) Line 106 - "As 36 out of 45 SNPs were nonsynonymous, the mutated genes might benefit the fitness increase." This argument can be strengthened. For example, the null expectation of nonsynonymous SNPs should be discussed. Is the number of observed nonsynonymous SNPs significantly higher than the expected one?

      (2) Line 107 - "In addition, the abundance of mutations was unlikely to be related to the magnitude of fitness increase." Instead of just listing examples, a regression analysis can be added.

      Yes, it's significant. Random mutations lead to ~33% of nonsynonymous SNP in a rough estimation. Additionally, the regression is unreliable because there's no statistical significance between the number of mutations and the magnitude of fitness increase. Accordingly, the corresponding sentences were revised with additional statistical tests.

      (L123-129) “As 36 out of 45 SNPs were nonsynonymous, which was highly significant compared to random mutations (p < 0.01), the mutated genes might benefit fitness increase. In addition, the abundance of mutations was unlikely to be related to the magnitude of fitness increase. There was no significant correlation between the number of mutations and the growth rate in a statistical view (p > 0.1). Even from an individual close-up viewpoint, the abundance of mutations poorly explained the fitness increase.”
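
      One way to compare the observed count with a null proportion is an exact one-sided binomial test, sketched below. The source does not name the test it used, so the choice of test and the function name are assumptions, and the null proportion of 0.33 is simply the rough estimate quoted in this response; the outcome depends strongly on that value.

```python
from math import comb


def binomial_upper_tail(successes, trials, p_null):
    """P(X >= successes) for X ~ Binomial(trials, p_null)."""
    return sum(
        comb(trials, k) * p_null ** k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )


if __name__ == "__main__":
    nonsynonymous, total_snps = 36, 45  # from the text
    p_null = 0.33                       # rough estimate quoted in the response
    print(binomial_upper_tail(nonsynonymous, total_snps, p_null))
```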

      (3) Line 114 - "They seemed highly related to essentiality, as 11 out of 49 mutated genes were essential (Table S3)." Here, the information mentioned in line 153 ("the ratio of essential to all genes (302 out of 3,290) in the reduced genome.") can be used. Then a statistical test for a contingency table can be used.

      (4) Line 117 - "the high frequency of the mutations fixed in the essential genes suggested the mutation in essentiality for fitness increase was the evolutionary strategy for reduced genome." What is the expected number of fixed mutations in essential genes vs non-essential genes? Is the observed number statistically significantly higher?

      Sorry for the improper and insufficient information on the essential genes. Yes, it's significant. The statistical test was additionally performed. The corresponding part was revised as follows.

      (L134-146) “They seemed highly related to essentiality7 (https://shigen.nig.ac.jp/ecoli/pec/genes.jsp), as 11 out of 49 mutated genes were essential (Table S3). Although the essentiality of genes might differ between the wild-type and reduced genomes, the experimentally determined 302 essential genes in the wild-type E. coli strain were used for the analysis, of which 286 were annotated in the reduced genome. The ratio of essential genes in the mutated genes was significantly higher than in the total genes (286 out of 3290 genes, Chi-square test p=0.008). As the essential genes were determined according to the growth35 and were known to be more conserved than nonessential ones 36,37, the high frequency of the mutations fixed in the essential genes was highly intriguing and reasonable. The large variety of genome mutations fixed in the independent lineages might result from a highly rugged fitness landscape 38. Nevertheless, it was unclear whether and how these mutations were explicitly responsible for recovering the growth rate of the reduced genome.”

      (5) The authors mentioned no overlapping in the single mutation level. Is that statistically significant? The authors can bring up what the no-overlap probability is given that there are in total x number of fixed mutations observed (either theory or simulation is good).

      Sorry, we are confused by this comment. It's unclear to us why it needs to be statistically simulated. Firstly, the mutations were experimentally observed. The result that no overlapping mutated genes were detected was an experimental fact, not a computational prediction. We are afraid that our finding may have been over-interpreted as an evolutionary rule, which would always require statistical testing of its reliability. We didn't conclude that the evolution had no overlapping mutations. Secondly, considering that 65 random mutations occurred in a ~3.9 Mb sequence, the statistical test would be meaningful only if the experimental results had found overlapping mutations. How often random mutations cause overlapping mutations in parallel evolutionary lineages as the number of lineages increases is an interesting question, but it seems to be out of the scope of the present study. We are happy to include the analysis in our ongoing study on the experimental evolution of reduced genomes.

      (6) The authors mentioned no overlapping in the single mutation level. How about at the genetic level? Some fixed mutations occur in the same coding gene. Is there any gene with a significantly enriched number of mutations?

      No mutations were fixed in the same gene of biological function, as shown in Table S3. If we say the coding region, the only exception is the IS sequences, well known as the transposable sequences without genetic function. The following description was added.

      (L119-122) “The number of mutations largely varied among the nine Evos, from 2 to 13, and no common mutation was detected in all nine Evos (Table S3). A 1,199-bp deletion of insH was frequently found in the Evos (Table S3, highlighted), which well agreed with its function as a transposable sequence.”

      (7) Line 151-156- It seems like the authors argue that the expression level differences can be just explained by the percentage of essential genes that get fixed mutations. One further step for the argument could be to compare the expression level of essential genes with vs without fixed mutations. Also, the authors can compare the expression level of non-essential genes with vs without fixed mutations. And the authors can report whether the differences in expression level became insignificant after the control of the essentiality.

      It's our pleasure that the essentiality intrigued you. Thank you for the analytical suggestion, which is exciting and valuable for our studies. As only 11 essential genes were detected here and "Mutation in essentiality" was an indication but not the conclusion of the present study, we would like to apply the recommended analysis to the datasets of our ongoing study to demonstrate this statement. Thank you again for your fruitful analytical advice.

      (8) Line 169- "The number of DEGs partially overlapped among the Evos declined significantly along with the increased lineages of Evos (Figure 4B). " There is a lack of statistical significance here while the word "significantly" is used. One statistical test that can be done is to use re-sampling/simulation to generate a null expectation of the overlapping numbers given the DEGs for each Evo line and the total number of genes in the genome. The observed number can then be compared to the distribution of the simulated numbers.

      Sorry for the inappropriate usage of the term. Whether it's statistically significant didn't matter here. The word "significant" was deleted as follows.

      (L205-206) “The number of DEGs partially overlapped among the Evos declined along with the increased lineages of Evos (Fig. 4B).”

      (9) Line 177-179- "In comparison, 1,226 DEGs were induced by genome reduction. The common DEGs of genome reduction and evolution varied from 168 to 540, fewer than half of the DEGs responsible for genome reduction in all Evos" Is the overlapping number significantly lower than the expectation? The hypergeometric test can be used for testing the overlap between two gene sets.

      There's no expectation for how many DEGs were reasonable. Not all numbers experimentally obtained are required to be statistically meaningful, which is commonly essential in computational and data science.

      (10) The authors should give more information about the ancestral line used at the beginning of experimental evolution. I guess it is one of the KHK collection lines, but I can not find more details. There are many genome-reduced lines. Why is this certain one picked?

      Sorry for the insufficient information on the reduced genome used for the experimental evolution. The following descriptions were added in the Results and the Materials and Methods, respectively.

      (L75-79) “The E. coli strain carrying a reduced genome, derived from the wild-type genome W3110, showed a significant decline in its growth rate in the minimal medium compared to the wild-type strain 13. To improve the genome reduction-mediated decreased growth rate, the serial transfer of the genome-reduced strain was performed with multiple dilution rates to keep the bacterial growth within the exponential phase (Fig. S1), as described 17,20.”

      (L331-334) “The reduced genome has been constructed by multiple deletions of large genomic fragments 58, which led to an approximately 21% smaller size than its parent wild-type genome W3110.”

      (11) How was the saturated density in Figure 1 actually determined? In particular, the fitness assay of growth curves is 48h. But it seems like the experimental evolution is done for ~24 h cycles. If the Evos never experienced a situation like a stationary phase between 24-48h, and if the author reported the saturated density 48 h in Figure 1, the explanation of the lower saturated density can be just relaxation from selection and may have nothing to do with the increase of growth rate.

      Sorry for the unclear description. Yes, you are right. The evolution was performed within the exponential growth phase (keeping cell division constant), which means the Evos never experienced the stationary phase (saturation). The final evolved populations were subjected to the growth assay to obtain the entire growth curves for calculating the growth rate and the saturated density. Whether the decreased saturated density and the increased growth rate were in a trade-off relationship remained unclear. The corresponding paragraph was revised as follows.

      (L100-115) “Intriguingly, a positive correlation was observed between the growth fitness and the carrying capacity of the Evos (Fig. 1D). It was somehow consistent with the positive correlations between the colony growth rate and the colony size of a genome-reduced strain 11 and between the growth rates and the saturated population size of an assortment of genome reduced strains 13. Nevertheless, the negative correlation between growth rate and carrying capacity, known as the r/K selection30,31 was often observed as the trade-off relationship between r and K in the evolution and ecology studies 32 33,34. As the r/K trade-off was proposed to balance the cellular metabolism that resulted from the cost of enzymes involved 34, the deleted genes might play a role in maintaining the metabolism balance for the r/K correlation. On the other hand, the experimental evolution (i.e., serial transfer) was strictly performed within the exponential growth phase; thus, the evolutionary selection was supposed to be driven by the growth rate without selective pressure to maintain the carrying capacity. The declined carrying capacity might have been its neutral "drift" but not a trade-off to the growth rate. Independent and parallel experimental evolution of the reduced genomes selecting either r or K is required to clarify the actual mechanisms.”

      (12) What annotation of essentiality was used in this paper? In particular, the essentiality can be different in the reduced genome background compared to the WT background.

      Sorry for the unclear definition of the essential genes. They are strictly limited to the 302 essential genes experimentally determined in the wild-type E. coli strain. Detailed information can be found at the following website: https://shigen.nig.ac.jp/ecoli/pec/genes.jsp. We agree that the essentiality could differ between the WT and reduced genomes. Identifying the essential genes in the reduced genome would require an exhaustive amount of work. The information on the essential genes defined in the present study was added as follows.

      (L134-139) “They seemed highly related to essentiality7 (https://shigen.nig.ac.jp/ecoli/pec/genes.jsp), as 11 out of 49 mutated genes were essential (Table S3). Although the essentiality of genes might differ between the wild-type and reduced genomes, the experimentally determined 302 essential genes in the wild-type E. coli strain were used for the analysis, of which 286 were annotated in the reduced genome.”
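
      The claim that the mutated genes are "highly related to essentiality" can be checked with a one-sided hypergeometric test. The following is only a minimal sketch and not the authors' analysis: the function name `hypergeom_sf` is hypothetical, and the total number of annotated genes in the reduced genome (`N_GENES`) is not given in the text, so the value below is a labeled placeholder that must be replaced with the real count.

```python
from math import comb

def hypergeom_sf(k, N, K, n):
    """Return P(X >= k) when n genes are drawn without replacement
    from N genes, of which K are essential."""
    total = comb(N, n)
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total

# Values stated in the response: 49 mutated genes, 11 of them essential,
# 286 essential genes annotated in the reduced genome.
N_ESSENTIAL = 286
N_MUTATED = 49
N_MUTATED_ESSENTIAL = 11

# ASSUMPTION: placeholder only; the text does not state the gene count.
N_GENES = 3000

p_value = hypergeom_sf(N_MUTATED_ESSENTIAL, N_GENES, N_ESSENTIAL, N_MUTATED)
print(f"P(at least {N_MUTATED_ESSENTIAL} essential among {N_MUTATED}) = {p_value:.3g}")
```

      A small p-value would indicate that essential genes are over-represented among the mutated genes relative to random sampling; the result depends directly on the gene total supplied.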

      (13) The fixed mutations in essential genes are probably not rarely observed in experimental evolution. For example, fixed mutations related to RNA polymerase can be frequently seen when evolving to stressful environments. I think the author can discuss this more and elaborate more on whether they think these mutations in essential genes are important in adaptation or not.

      Thank you for your careful reading and the suggestion. As you mentioned, we noticed that the mutations in RNA polymerases (rpoA, rpoB, and rpoD) were identified in three Evos. As they were not shared across all Evos, we didn't discuss the contribution of these mutations to evolution. Instead of the individual functions of the mutated essential genes, we focused on the enriched gene functions related to the transcriptome reorganization because they were the common feature observed across all Evos and linked to the whole metabolic or regulatory pathways, which are supposed to be more biologically reasonable and interpretable. The following sentence was added to clarify our thinking.

      (L268-273) “In particular, mutations in the essential genes, such as RNA polymerases (rpoA, rpoB, rpoD) identified in three Evos (Table S3), were supposed to participate in the global regulation for improved growth. Nevertheless, the considerable variation in the fixed mutations without overlaps among the nine Evos (Table 1) implied no common mutagenetic strategy for the evolutionary improvement of growth fitness.”

      (14) In experimental evolution to new environments, several previous studies also show that long-term experimental evolution in transcriptome is not consistent or even reverts the short-term response; short-term responses were just rather considered as an emergency plan. They seem to echo what the authors found in this manuscript. I think the author can refer to some of those studies more and make a more thorough discussion on short-term vs long-term responses in evolution.

      Thank you for the advice. It's unclear to us what the short-term and long-term responses mentioned in this comment refer to. The term "Response" usually refers to the phenotypic or transcriptional changes within a few hours after environmental fluctuation, which are generally non-genetic (no mutation). In comparison, long-term or short-term experimental "Evolution" is associated with genetic changes (mutations). Concerning the Evolution (not the Response), the long-term experimental evolution (>10,000 generations) was performed only with the wild-type genome, and the short-term experimental evolution (500~2,000 generations) was more often conducted with both wild-type and reduced genomes, to our knowledge. Previous landmark studies have intensively discussed comparing the wild-type and reduced genomes. Our study was restricted to the reduced genome, which was constructed differently from those reduced genomes used in the reported studies. The experimental evolution of the reduced genomes has been performed in the presence of additional additives, e.g., antibiotics, alternative carbon sources, etc. That is, neither the genomic backgrounds nor the evolutionary conditions were comparable. Comparing studies with nothing in common seems unproductive. We sincerely hope the recommended topics can be applied in our future work.

      Some minor suggestions

      • Figures S3 & Table S2 need an explanation of the abbreviations of gene categories.

      Sorry for the missing information. Figure S3 and Table S3 were revised to include the names of gene categories. The figure is pasted below for quick reference.

      Author response image 3.

      • I hope the authors can re-consider the title; "Diversity for commonality" does not make much sense to me. For example, it can be simply just "Diversity and commonality."

      Thank you for the suggestion. The title was simplified as follows.

      (L1) “Experimental evolution for the recovery of growth loss due to genome reduction.”

      • It is not easy for me to locate and distinguish the RNA-seq vs DNA-seq files in DRA013662 at DDBJ. Could you make some notes on what RNA-seq actually are, vs what DNA-seq files actually are?

      Sorry for the mistakes in the DRA number of DNA-seq. DNA-seq and RNA-seq were deposited separately with the accession IDs of DRA013661 and DRA013662, respectively. The following correction was made in the revision.

      (L382-383) “The raw datasets of DNA-seq were deposited in the DDBJ Sequence Read Archive under the accession number DRA013661.”

    3. eLife assessment

      This is an important study of the recovery of genome-reduced bacterial cells in laboratory evolution experiments, to understand how they regain their fitness. Through the analysis of gene expression and a series of tests, the authors present convincing evidence indicating distinct molecular changes in the evolved bacterial strains, although the precise mechanisms remain uncharacterized. These findings imply that diverse mechanisms are employed to offset the effects of a reduced genome, offering intriguing insights into genome evolution.

    4. Reviewer #1 (Public Review):

      In this study, the authors explored how the reduced growth fitness, resulting from genome reduction, can be compensated through evolution. They conducted an evolution experiment with a strain of Escherichia coli that carried a reduced genome, over approximately 1,000 generations. The authors carried out sequencing, and found no clear genetic signatures of evolution across replicate populations. They carry out transcriptomics and a series of analyses that lead them to conclude that there are divergent mechanisms at play in individual evolutionary lineages. The authors used gene network reconstruction to identify three gene modules functionally differentiated, correlating with changes in growth fitness, genome mutation, and gene expression, respectively, due to evolutionary changes in the reduced genome.

      I think that this study addresses an interesting question. Many microbial evolution experiments evolve by loss of function mutations, but presumably a cell that has already lost so much of its genome needs to find other mechanisms to adapt. Experiments like this have the potential to study "constructive" rather than "destructive" evolution.

      Comments on revised version:

      I think the authors have carefully gone through the manuscript and addressed all of my concerns.

    5. Reviewer #2 (Public Review):

      This manuscript describes an adaptive laboratory evolution (ALE) study with a previously constructed genome-reduced E. coli. The growth performance of the end-point lineages evolved in M63 medium was comparable to the full-length wild-type level at lower cell densities.

      Subsequent mutation profiling and RNA-Seq analysis revealed many changes in the genomes and transcriptomes of the evolved lineages. The authors did a great deal of analysis of the patterns of evolutionary changes between independent lineages, such as the chromosomal periodicity of transcriptomes, pathway enrichment analysis, weighted gene co-expression analysis, and so on. They observed a striking diversity in the molecular characteristics amongst the evolved lineages, which, as they suggest, reflects divergent evolutionary strategies adopted by the genome-reduced organism.

      As for the overall quality of the manuscript, I am rather torn. The manuscript leans towards elaborating observed findings, rather than explaining their biological significance. For this reason, readers are left with more questions than answers. For example, a fitness assay on reconstituted (single and combinatorial) mutants was not performed, nor was any supporting evidence provided on the proposed contribution of each mutant. This leaves the nature of the mutations (be they beneficial, neutral or deleterious), the presence of epistatic interactions, and the magnitude of fitness contribution largely elusive. Also, it is difficult to tell whether the RNA-Seq analysis in this study managed to draw biologically meaningful conclusions, or instill insight into the nature of genome-reduced bacteria. The analysis primarily highlighted the differences in transcriptome profiles among each lineage based on metrics such as 'DEG counts' and the 'GO enrichment'. However, I could not see any specific implications drawn regarding the biology of the evolved minimal genome. In their concluding remark, 'Multiple evolutionary paths for the reduced genome to improve growth fitness were likely all roads leading to Rome,' the authors observed the first half of the sentence, but the distinctive characteristics of 'all roads' or 'evolutionary paths', which I think should have been the key aspect in this investigation, remain elusive.

      Comments on revised version:

      I appreciate the authors' responses. They responded to most of the comments, but I still think that there is room for improvement. Please refer to the following comments. Quoted below are the authors' responses.

      "We agree that our study leaned towards elaborating observed findings rather than explaining the detailed biological mechanisms."

      - Comment: I doubt if there are scientific merits in merely elaborating observed findings. The conclusion of this study suggests that evolutionary paths in reduced genomes are highly diverse. But if you think about the nature of adaptive evolution, which relies upon spontaneous mutation events followed by selection, a certain degree of divergence is always expected. The problem with the current experimental setting is that there is no way to quantitatively assess whether the degree of evolutionary divergence increases as a function of genome reduction, as the authors claimed. In addition, this notion is in direct contradiction to the prediction that genome reduction constrains evolution by shrinking the solution space. It is more logical to think and predict that genome reduction would, in turn, lead to the loss of evolutionary divergence. We are also interested to know whether the solution space of the optimization problem was altered in response to the genome reduction. In this regard, a control ALE experiment on the non-reduced wild type seems to be a mandatory experimental control. I highly suggest that the authors present a control experiment, as was done for "JCVI syn3.0B vs non-minimal M. mycoides" (doi: 10.1038/s41586-023-06288-x) and "E. coli eMS57 vs MG1655" (doi: 10.1038/s41467-019-08888-6).

      "We focused on the genome-wide biological features rather than the specific biological functions."

      - Comment: The 'biological features' delivered in the current manuscript do not give insight as to which genomic changes translated into strain fitness improvement. Rather than explaining the genotype-phenotype relationships and/or the mechanistic basis of fitness improvement, the authors merely elaborated on the observed phenotypes. I question the scientific merits of such 'findings'.

      "Although the reduced growth rate caused by genome reduction could be recovered by experimental evolution, it remains unclear whether such an evolutionary improvement in growth fitness was a general feature of the reduced genome and how the genome-wide changes occurred to match the growth fitness increase."

      - Comment: This response is very confusing. "it remains unclear whether such an evolutionary improvement in growth fitness was a general feature of the reduced genome" - what aspects remain unclear? What assumption led the authors to believe that the reduced genome's fitness cannot be evolutionarily improved?

      - Comment: "and how the genome-wide changes occurred to match the growth fitness increase" - this is exactly the aspect that the authors should deliver, instead of just elaborating the observed findings. Why don't the authors select one or two of the fastest-growing (or the fittest) lineages and specifically analyze the underlying adaptive changes (i.e. genotype-phenotype relationships)?

    1. Author response:

      eLife assessment

      In this valuable study, Kumar et al., provide evidence suggesting that the p130Cas drives the formation of condensates that sprout from focal adhesions to cytoplasm and suppress translation. Pending further substantiation, this study was found to be likely to provide previously unappreciated insights into the mechanisms linking focal adhesions to the regulation of protein synthesis and was thus considered to be of broad general interest. However, the evidence supporting the proposed model was incomplete; additional evidence is warranted to substantiate the relationship between p130Cas condensates and mRNA translation and establish corresponding functional consequences.

      We thank the eLife editorial team for their positive assessment of the broad significance of our manuscript. We fully agree that the functional consequences need to be explored in more detail. We feel that many of the criticisms are valid points that are not easily addressed via available tools and thus should be considered limitations of present approaches. We hope that readers appreciate that identification of a new class of liquid-liquid phase separations calls for much more work to fully explore their characteristics, regulation and function, which will likely advance many areas of cell biology and perhaps even medicine.

      Reviewer #1 (Public Review):

      Summary:

      The authors demonstrated the phenomenon of p130Cas, a protein primarily localized at focal adhesions, and its formation of condensates. They identified the constituents within the condensates, which include other focal adhesion proteins, paxillin, and RNAs. Furthermore, they proposed a link between p130Cas condensates and translation.

      Strengths:

      Adhesion components undergo rapid exchange with the cytoplasm for some unclear biological functions. Given that p130Cas is recognized as a prominent mechanical focal adhesion component, investigating its role in condensate formation, particularly its impact on the translation process, is intriguing and significant.

      We thank the reviewer for recognizing the functional significance of the work.

      Weaknesses:

      The authors identified the disordered region of p130Cas and investigated the formation of p130Cas condensate. They attempted to demonstrate that p130Cas condensates inhibit translation, but the results did not fully support this assertion. There are several comments below:

      (1) Despite isolating p130Cas-GFP protein using GFP-trap beads, the authors cannot conclusively eliminate the possibility of isolating p130Cas from focal adhesions. While the characterization of the GFP-tagged pulls can reveal the proteins and RNAs associated with p130Cas, they need to clarify their intramolecular mechanism of localization within p130Cas droplets. Whether the protein condensates retain their liquid phase or these GFP-p130Cas pulls represent protein aggregate remains uncertain.

      We agree, the isolation from cell lysates does not distinguish between focal adhesions and cytoplasmic LLPS. We note that p130Cas in focal adhesions also appears to be in LLPS. But there are no methods available to isolate them separately. We acknowledge this is a limitation of the study.

      (2) The authors utilized hexanediol and ammonium acetate to highlight the phenomenon of p130Cas condensates. Although hexanediol is an inhibitor for hydrophobic interactions and ammonium acetate is a salt, a more thorough explanation of the intramolecular mechanisms underlying p130Cas protein-protein interaction is required. Additionally, given that the size of p130Cas condensates can exceed 100 µm², classification is needed to differentiate between p130Cas condensates and protein aggregation.

      Ammonium acetate, which works by promoting hydrophobic interactions and weak van der Waals forces, has been widely used in phase separation studies to change ionic strength without altering intracellular pH. Conversely, hexanediol weakens the hydrophobic/van der Waals interactions that commonly mediate phase separation of IDRs. In the case of p130Cas, the multiple tyrosines within the scaffolding domain are obvious targets. If the reviewer is asking us to resolve the detailed hydrophobic interactions within the scaffolding domain, this is far beyond the scope of the current paper.

      Protein aggregates are defined by their characteristics (e.g., irreversibility, departure from spherical shape), not by size. Older, larger droplets remain circular and show slower but still measurable rates of exchange. Moreover, droplets are essentially absent after trypsinizing and replating cells. All these results argue against aggregates.

      (3) The connection between p130Cas condensates and translation inhibition appears tenuous. The data only suggests a correlation between p130Cas expression and translation inhibition. Further evidence is required to bolster this hypothesis.

      The optogenetic experiment shows that triggering LLPS by dimerizing p130Cas results in inhibition of translation. This is a causal, not a correlative, experiment. The reviewer may be thinking that dimerizing p130Cas could stimulate focal adhesion signaling, activating FAK or a Src family kinase or other signals. However, none of these signals has been linked to inhibition of cell growth or migration. Thus, we agree that this is a limitation but consider it a low probability mechanism.

      Reviewer #2 (Public Review):

      Summary:

      In this article, Kumar et al., report on a previously unappreciated mechanism of translational regulation whereby p130Cas induces LLPS condensates that then traffic out from focal adhesion into the cytoplasm to modulate mRNA translation. Specifically, the authors employed EGFP-tagged p130Cas constructs, endogenous p130Cas, and p130Cas knockouts and mutants in cell-based systems. These experiments in conjunction with various imaging techniques revealed that p130Cas drives assembly of LLPS condensates in a manner that is largely independent of tyrosine phosphorylation. This was followed by in vitro EGFP-tagged p130Cas-dependent induction of LLPS condensates and determination of their composition by mass spectrometry, which revealed enrichment of proteins involved in RNA metabolism in the condensates. The authors excluded the plausibility that p130Cas-containing condensates co-localize with stress granules or p-bodies. Next, the authors determined mRNA compendium of p130Cas-containing condensates which revealed that they are enriched in transcripts encoding proteins implicated in cell cycle progression, survival, and cell-cell communication. These findings were followed by the authors demonstrating that p130Cas-containing condensates may be implicated in the suppression of protein synthesis using puromycylation assay. Altogether, it was found that this study significantly advances the knowledge pertinent to the understanding of molecular underpinnings of the role of p130Cas and more broadly focal adhesions on cellular function, and to this end, it is likely that this report will be of interest to a broad range of scientists from a wide spectrum of biomedical disciplines including cell, molecular, developmental and cancer biologists.

      Strengths:

      Altogether, this study was found to be of potentially broad interest inasmuch as it delineates a hitherto unappreciated link between p130Cas, LLPS, and regulation of mRNA translation. More broadly, this report provides unique molecular insights into the previously unappreciated mechanisms of the role of focal adhesions in regulating protein synthesis. Overall, it was thought that the provided data sufficiently supported most of the authors' conclusions. It was also thought that this study incorporates an appropriate balance of imaging, cell and molecular biology, and biochemical techniques, whereby the methodology was found to be largely appropriate.

      We thank the reviewer for this positive assessment.

      Weaknesses:

      Two major weaknesses of the study were noted. The first issue is related to the experiments establishing the role of p130Cas-driven condensates in translational suppression, whereby it remained unclear whether these effects are affecting global mRNA translation or are specific to the mRNAs contained in the condensates. Moreover, some of the results in this section (e.g., experiments using cycloheximide) may be open to alternative interpretation. The second issue is the apparent lack of functional studies, and although the authors speculate that the described mechanism is likely to mediate the effects of focal adhesions on e.g., quiescence, experimental testing of this tenet was lacking.

      We appreciate the reviewer’s insights. Assessing translational inhibition for specific genes rather than global measurement of translation is an important direction for future work.

      Regarding the cycloheximide experiments, we are unsure what the reviewer means. We used it as a control for puromycin labeling, but this is a very standard approach. It seems more likely that the question concerns Fig 5G, where we used it to sequester mRNAs on ribosomes to deplete them from other pools. In this case, p130Cas condensates decrease after 2 minutes. The reviewer may be suggesting that this effect could be due to blocked translation per se and loss of short-lived proteins. We acknowledge that this is possible but, given the very rapid effect (2 min), we think it unlikely.

      Lastly, we agree with the reviewer that further functional studies in quiescence or senescence are warranted; however, these are extensive, open-ended studies and we will not be able to include them as part of the current paper.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      In this valuable study, the authors investigate the transcriptional landscape of tuberculous meningitis, revealing important molecular differences contributed by HIV co-infection. Whilst some of the evidence presented is compelling, the bioinformatics analysis is limited to a descriptive narrative of gene-level functional annotations, which are somewhat basic and fail to define aspects of biology very precisely. Whilst the work will be of broad interest to the infectious disease community, validation of the data is critical for future utility.

      We appreciate eLife’s positive assessment, although we challenge the conclusion that we ‘fail to define aspects of biology very precisely’. Our stated objective was to use bioinformatics tools to identify the biological pathways and hub genes associated with TBM pathogenesis, and the eLife assessment affirms we have investigated ‘the transcriptional landscape of tuberculous meningitis’. Defining aspects of the biology more precisely will require another study with a different design and methods.

      Reviewer #1 (Public Review):

      Summary:

      Tuberculous meningitis (TBM) is one of the most severe forms of extrapulmonary TB. TBM is especially prevalent in people who are immunocompromised (e.g. HIV-positive). Delays in diagnosis and treatment could lead to severe disease or mortality. In this study, the authors performed the largest-ever host whole blood transcriptomics analysis on a cohort of 606 Vietnamese participants. The results indicated that TBM mortality is associated with increased neutrophil activation and decreased T and B cell activation pathways. Furthermore, increased angiogenesis was also observed in HIV-positive patients who died from TBM, whereas activated TNF signaling and down-regulated extracellular matrix organisation were seen in the HIV-negative group. Despite similarities in transcriptional profiles between PTB and TBM compared to healthy controls, inflammatory genes were more active in HIV-positive TBM. Finally, 4 hub genes (MCEMP1, NELL2, ZNF354C, and CD4) were identified as strong predictors of death from TBM.

      Strengths:

      This is a really impressive piece of work, both in terms of the size of the cohort, which took years of effort to recruit, sample, and analyse, and also the meticulous bioinformatics performed. The biggest advantage of obtaining a whole blood signature is that it allows an easier translational development into a test that can be used in the clinic with a minimally invasive sample. Furthermore, the data from this study have also revealed important insights into the mechanisms associated with mortality and the differences in pathogenesis between HIV-positive and HIV-negative patients, which would have diagnostic and therapeutic implications.

      Weaknesses:

      The data on blood neutrophil count is really intriguing and seems to provide a very powerful yet easy-to-measure method to differentiate survival vs. death in TBM patients. It would be quite useful in this case to perform predictive analysis to see if neutrophil count alone, or in combination with gene signature, can predict (or better predict) mortality, as it would be far easier for clinical implementation than the RNA-based method. Moreover, genes associated with increased neutrophil activation and decreased T cell activation both have significantly higher enrichment scores in TBM (Figure 9) and in mortality (Figure 8). While I understand the basis of selecting hub genes in the significant modules, they often do not represent these biological pathways (at least not directly associated in most cases). If genes were selected based on these biologically relevant pathways, would they have better predictive values?

      We conducted a sensitivity analysis including blood neutrophil count as a potential predictor in the multivariate Cox elastic-net regression model for important predictor selection (Table S14). In this analysis, all six important predictors (genes and clinical risk factors) identified in the original analysis (Table S13) were again selected, together with blood neutrophil count. Additionally, we evaluated the predictive value of blood neutrophil count alone, which demonstrated poor performance, with an optimism-corrected AUC of 0.63 for all TBM, 0.67 for HIV-negative TBM, and 0.70 for HIV-positive TBM. Even when combined with the identified gene signatures, blood neutrophil count did not improve the overall performance of the predictive model (optimism-corrected AUC of 0.79 for all TBM, 0.76 for HIV-negative TBM, and 0.80 for HIV-positive TBM). These results indicate that the identified hub genes have better predictive value than blood neutrophil count alone or in combination. These findings have been incorporated into our manuscript results.
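
      For readers unfamiliar with the term, an "optimism-corrected" AUC is commonly obtained by bootstrap: the model is refitted on each resample, and the average gap between its performance on the resample and on the original data is subtracted from the apparent AUC. The following is only a rough sketch under strong simplifying assumptions and is not the authors' pipeline: it uses a binary outcome instead of time-to-event data, and a toy one-predictor "model" (`fit_direction`, a hypothetical helper) in place of Cox elastic-net regression. The data values are synthetic.

```python
import random

def auc(scores, labels):
    """AUC as the Mann-Whitney statistic; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def fit_direction(x, y):
    """Toy stand-in for model fitting: +1 if higher x predicts the event."""
    return 1.0 if auc(x, y) >= 0.5 else -1.0

def optimism_corrected_auc(x, y, n_boot=200, seed=0):
    """Apparent AUC minus the mean bootstrap optimism."""
    rng = random.Random(seed)
    n = len(x)
    sign = fit_direction(x, y)
    apparent = auc([sign * v for v in x], y)
    optimism = []
    while len(optimism) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        xb = [x[i] for i in idx]
        yb = [y[i] for i in idx]
        if len(set(yb)) < 2:
            continue  # resample lacks one outcome class; draw again
        sb = fit_direction(xb, yb)                # refit on the resample
        boot_auc = auc([sb * v for v in xb], yb)  # performance on resample
        orig_auc = auc([sb * v for v in x], y)    # same fit, original data
        optimism.append(boot_auc - orig_auc)
    return apparent - sum(optimism) / len(optimism)

# Synthetic example: 20 survivors (0) and 10 deaths (1), one predictor.
y = [0] * 20 + [1] * 10
x = list(range(1, 21)) + list(range(15, 25))
corrected = optimism_corrected_auc(x, y)
print(f"apparent AUC = {auc(x, y):.2f}, corrected AUC = {corrected:.2f}")
```

      With a single fixed-direction predictor the correction is small; it matters most when many candidate predictors are screened, as in the gene-selection setting described above.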

      To test whether pathway representative genes have better predictive values than hub genes, we included all these genes in the analysis for important predictor selection. Pathway representative genes comprised ANXA3 and CXCR2 representing neutrophil activation and IL1b representing acute inflammatory response. We observed that all hub genes (MCEMP1, NELL2, ZNF354C, and CD4) consistently emerged as the most important genes with the highest selection in the models, compared to the rest, in both the HIV-negative TBM and HIV-positive TBM cohorts. Additionally, these identified hub genes were still selected when testing together with other hub genes representing relevant biological pathways associated with TBM mortality, such as CYSTM1 involved in neutrophil activation, TRAF5 involved in NF-kappa B signaling pathway, CD28 and TESPA1 involved in T cell receptor signaling. These results show that selected genes based on known biologically relevant pathways did not give better predictive values than the identified hub genes in the significant modules.

      Reviewer #2 (Public Review):

      Summary:

      This manuscript describes the analysis of blood transcriptomic data from patients with TB meningitis, with and without HIV infection, with some comparison to those of patients with pulmonary tuberculosis and healthy volunteers. The objectives were to describe the comparative biological differences represented by the blood transcriptome in TBM associated with HIV co-infection or survival/mortality outcomes and to identify a blood transcriptional signature to predict these outcomes. The authors report an association between mortality and increased levels of acute inflammation and neutrophil activation, but decreased levels of adaptive immunity and T/B cell activation. They propose a 4-gene prognostic signature to predict mortality.

      Strengths:

      Biological evaluations of blood transcriptomes in TB meningitis and their relationship to outcomes have not been extensively reported previously.

      The size of the data set is a major strength and is likely to be used extensively for secondary analyses in this field of research.

      Weaknesses:

      The bioinformatic analysis is limited to a descriptive narrative of gene-level functional annotations curated in GO and KEGG databases. This analysis cannot be used to make causal inferences. In addition, the functional annotations are limited to 'high-level' terms that fail to define biology very precisely. At best, they require independent validation for a given context. As a result, the conclusions are not adequately substantiated. The identification of a prognostic blood transcriptomic signature uses an unusual discovery approach that leverages weighted gene network analysis that underpins the bioinformatic analyses. However, the main problem is that authors seem to use all the data for discovery and do not undertake any true external validation of their gene signature. As a result, the proposed gene signature is likely to be overfitted to these data and not generalisable. Even this does not achieve significantly better prognostic discrimination than the existing clinical scoring.

      As explained in response to the eLife assessment, our objective was to use bioinformatics tools to identify the biological pathways and hub genes associated with TBM pathogenesis. We agree that ‘This analysis cannot be used to make causal inferences’: that would require a different study design and approaches. The proposed gene signature has higher AUC values than the existing clinical model alone or in combination with clinical risk factors (Table 4). We agree that independent validation of the gene signature will be a crucial next step for future utility. We have performed qPCR in another sample set and have added these results in the revision (Table 4 and supplementary figure S8).

      Reviewer #1 (Recommendations For The Authors):

      I have a few additional comments most of which are relatively minor:

      (1) Can the authors please clarify if all the PTB cases are also HIV-negative?

      This has been added to the methods section.

      (2) For Table 1, can the authors please list the total number of patients with microbiologically confirmed TB regardless of the methods used? And for the two TBM groups, was the positive microbiology based on CSF findings?

      The total number of patients with microbiologically confirmed TB is presented in Table 2 as the definite TBM group, that is, TB microbiologically confirmed by microscopy, culture, and Xpert testing of cerebrospinal fluid (CSF) samples. We have updated the note in Table 2 to clarify this definition.

      (3) How was the discovery and validation set selected? Was it based on randomisation?

      We randomly split the TBM data into two datasets, a discovery cohort (n=142) and a validation cohort (n=139), to ensure the reproducibility of the data analysis. We have described this in the methods section.
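
      A seeded random partition of this kind can be sketched as follows. This is only an illustration of the idea, not the authors' code: the sample IDs, the seed, and the function name `split_cohort` are assumptions, and only the cohort sizes (142 and 139) come from the response above.

```python
import random


def split_cohort(sample_ids, n_discovery, seed=0):
    """Randomly partition sample IDs into discovery and validation sets."""
    rng = random.Random(seed)      # fixed seed makes the split reproducible
    shuffled = list(sample_ids)    # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    return shuffled[:n_discovery], shuffled[n_discovery:]


# Hypothetical sample IDs; 281 = 142 + 139, the cohort sizes stated above.
tbm_ids = [f"TBM{i:03d}" for i in range(281)]
discovery, validation = split_cohort(tbm_ids, n_discovery=142, seed=2024)
print(len(discovery), len(validation))
```

      Recording the seed alongside the split lets the same discovery and validation sets be regenerated later.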

      (4) Line 107 can be better clarified by stating that the overall 3-month mortality rate is 21.7% for TBM regardless of HIV status.

      Thank you, we have restated this sentence in the results section.

      (5) The authors stated that samples were collected at enrolment when patients would have received less than 6 days of anti-tubercular treatment. Is there information on the median and IQR on the number of days that the patients would have received Rx, especially between the groups? Did the authors control for this variable when analysing for DEGs?

      One of the enrollment criteria for the LAST-ACT and ACT-HIV trials is that participants must have received fewer than 6 consecutive days of two or more drugs active against M. tuberculosis. However, the number of days of treatment each patient had received was not recorded, so we could not control for this variable in the differential expression analysis. This has been clarified further in the methods section: ‘The samples were taken at enrollment, when patients could not have received more than 6 consecutive days of two or more drugs active against M. tuberculosis.’

      (6) I am a little bit concerned with the reads mapping accuracy (57%) to the human genome, which is fairly low. Did the authors investigate the reasons behind this low accuracy?

      Thank you. It was indeed a typo. We have corrected it in the results section.

      (7) On Tables S2-S4, can the authors please clarify what the last column (labelled as "B") shows?

      Tables S2-S4 now have been changed to S3-S5. We have updated the legend of these tables to provide clarification regarding the meaning of the last column.

      Reviewer #2 (Recommendations For The Authors):

      If the authors wish to revise their manuscript, I suggest the following amendments:

      (1) Provide a consort diagram for the selection of samples included in the present analysis (from parent study cohorts), allocation to test and validation splits for bioinformatics analysis, and outcomes.

      We have provided our consort diagram in supplementary Figure S10.

      (2) Provide details of inclusion criteria for pulmonary TB cohort, and how samples from this cohort were selected for inclusion in the present analysis. Please clarify whether this cohort excluded HIV-positive participants by design or by chance.

      The inclusion criteria for the pulmonary TB cohort were described in the methods section. Due to the very low prevalence of HIV in this prospective observational study, HIV-positive participants were excluded. We have clarified in the amended manuscript that the pulmonary TB cohort only included HIV-negative participants.

      (3) Baseline characteristics of HIV-positive participants (Table 1) should include CD4 count, HIV viral load, and whether anti-retroviral therapy was naïve or experienced.

      We have included pre-treatment CD4 cell count, information on anti-retroviral therapy, and HIV viral load data in Table 1, and have described this information in the results section.

      (4) I note that the TBM samples were derived from RCTs of adjunctive steroid therapy, but not stratified in the present analysis by treatment arm allocation. Clearly, this may affect the survival/mortality outcomes that are the central focus of this manuscript. Therefore, they should be included in the models for differential gene expression analysis and prognostic signature discovery. To do so, the authors may need to wait until they are able to unblind the trial metadata.

      With permission from the trial investigators, we were able to adjust the analyses for treatment with corticosteroids. The investigators remained blind to the allocation and we have not reported any direct effects of corticosteroids on outcome; such an analysis can only be done once the LAST-ACT trial has been reported (which won’t be until the end of 2024). Treatment outcome and effect were kept blinded by extracting only the fold change difference between survival and death from the linear regression model, in which gene expression was the outcome and survival and treatment were covariates.
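
      The adjustment described above can be illustrated with a minimal sketch. It assumes expression is on a log2 scale, so the coefficient for death is a log2 fold change adjusted for treatment. The function names and the toy data are hypothetical, and the plain least-squares fit stands in for whatever differential expression software the authors actually used.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):                      # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(design, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    p = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)]
           for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def death_effect(expression, died, treated):
    """Return only the death coefficient from expression ~ died + treated."""
    design = [[1.0, float(d), float(t)] for d, t in zip(died, treated)]
    _intercept, beta_death, _beta_treatment = fit_ols(design, expression)
    return beta_death  # the treatment coefficient is deliberately not exposed


# Toy data (not from the study): expression = 5 + 1.5*died - 0.8*treated.
died = [0, 0, 1, 1, 0, 1, 0, 1]
treated = [0, 1, 0, 1, 1, 0, 0, 1]
expression = [5 + 1.5 * d - 0.8 * t for d, t in zip(died, treated)]
print(round(death_effect(expression, died, treated), 6))
```

      Returning only the death coefficient mirrors the blinding described above: the fit uses treatment allocation, but no treatment effect is reported.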

      (5) I understood from the methods (lines 460-461) that batch correction of the RNAseq data was necessary. However, it is not clear how the samples were batched. PCA of the transcriptomes before and after batch correction with batch and study group labels should be provided. I would also advocate for a sensitivity analysis to check the robustness of the main findings without batch correction. I assume Fig2A represents batch-corrected data, but this is not clear.

      We have now added information about the RNA sequencing batches and the batch correction approach to the methods section, and have stated there that the analyses and data visualizations used batch-corrected data. We have also updated the results related to batch correction in Fig. 2A and Supplementary Figure S9.

      (6) I would encourage the authors to include a differential gene expression analysis to directly compare the transcriptome of TBM to that of pulmonary TB. I think it would add additional value to their focus on describing the transcriptome in TBM.

      We thank the reviewer for this suggestion. A differential gene expression analysis comparing the transcriptome of TBM with that of PTB is beyond the scope of this manuscript, and we will examine this question separately.

      (7) I don't really understand the purpose of splitting their data set into test and validation for the purposes of showing that WGCNA analysis is mostly reproduced in the two halves of the data. I would advocate that they scrap this approach to maximise the statistical power of their analysis in the descriptive work.

      As mentioned in our response to question #3 from reviewer #1, the purpose of splitting the data is to ensure the reproducibility of the data analysis, as suggested by Langfelder et al. (PMID: 21283776). This approach served two purposes: (i) to affirm the existence of functional modules in an independent cohort and (ii) to validate the association of modules of interest or their hub genes with survival outcomes.

      (8) The authors should soften the confidence in their interpretation of the GO/KEGG annotations of WGCNA modules. At least, they should include a paragraph that explicitly details the limitations of their analyses, including (i) that the accuracy of GO/KEGG annotations is not validated in this context (if at all), (ii) that none of the data can be used to make causal inferences and (iii) that peripheral blood assessments that are obviously impacted by changes in cellular composition of peripheral blood do not necessarily reflect immunopathogenesis at the site of disease. In fact, if circulating cells are being recruited to the site of disease or other immune compartments, then quite the opposite interpretations may be true.

      We appreciate the reviewer's comment. (i) In our analysis, we initially confirmed the existence of Weighted Gene Co-expression Network Analysis (WGCNA) modules in the discovery cohort and validated the association of these modules with mortality outcomes in the validation cohort. We then applied GO/KEGG annotations to define the biological functions involved in the WGCNA modules. Finally, we performed Qusage analysis to directly test the association of the top-hit pathways of each WGCNA module with mortality outcomes (see supplementary S6). This approach helped to identify and validate modules and biological pathways associated with TBM mortality in this context, avoiding potential false positives in the GO/KEGG annotations of WGCNA modules. (ii) We agree with the assessment that 'This analysis cannot be used to make causal inferences,' as that would require a different study design and approach. (iii) The focus of this study is to investigate the pathogenesis of TBM in the systemic immune system. We have highlighted this focus in the title and the aim of the manuscript.

      (9) For the prognostic signature discovery and validation, I strongly recommend the authors include more robust validation. For example, to undertake an 80:20 split for sequential discovery (for feature selection and derivation of a prognostic model), followed by validation of a 'locked' model in data that made no contribution to discovery. In two separate sensitivity analyses, I also suggest they split their dataset (i) by treatment allocation in the RCT and (ii) by HIV status. In addition, their method for feature selection has to be clearer: precisely how they select hub genes from their WGCNA analysis as candidate predictors is not explained. Since this is such a prominent output of their manuscript, the results of this analysis should really be included in the main manuscript, and all performance metrics for discrimination should include confidence intervals.

      Employing an 80:20 split for training and testing models is a good approach for internal validation. However, we addressed the risk of overestimating the performance of the prognostic model using the bootstrap resampling approach proposed by Steyerberg et al. (PMID: 11470385). This approach has been shown to provide stable estimates with low bias. The overall model performance for discrimination reported in our manuscript was corrected for “optimism” to ensure internal validity. This adjustment used 1,000 bootstrap resamples, which effectively accounted for estimation uncertainty. As such, there is no need to present confidence intervals for these metrics.
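
      For readers unfamiliar with optimism correction, the general bootstrap procedure can be sketched as below. This is a simplified illustration under stated assumptions, not the authors' pipeline: the "model" is a toy rule that picks the single best feature, the data are simulated, and the sketch uses 200 resamples for speed where the response above describes 1,000. All names are hypothetical.

```python
import random


def auc(scores, labels):
    """AUC: probability that a random case outscores a random control."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit(features, labels):
    """Toy model: pick the feature (and sign) with the highest AUC."""
    best = None
    for j in range(len(features[0])):
        a = auc([row[j] for row in features], labels)
        sign = 1.0 if a >= 0.5 else -1.0
        a = max(a, 1.0 - a)
        if best is None or a > best[0]:
            best = (a, j, sign)
    return best[1], best[2]


def predict(model, features):
    j, sign = model
    return [sign * row[j] for row in features]


def optimism_corrected_auc(features, labels, n_boot=200, seed=0):
    """Apparent AUC minus the mean bootstrap optimism."""
    rng = random.Random(seed)
    n = len(labels)
    apparent = auc(predict(fit(features, labels), features), labels)
    optimism = []
    while len(optimism) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        yb = [labels[i] for i in idx]
        if len(set(yb)) < 2:
            continue                                # AUC needs both outcomes
        xb = [features[i] for i in idx]
        model = fit(xb, yb)                         # refit in the resample
        auc_boot = auc(predict(model, xb), yb)      # performance where fitted
        auc_orig = auc(predict(model, features), labels)  # on original data
        optimism.append(auc_boot - auc_orig)
    mean_optimism = sum(optimism) / len(optimism)
    return {"apparent": apparent, "optimism": mean_optimism,
            "corrected": apparent - mean_optimism}


# Simulated data (not from the study): one informative and two noise features.
data_rng = random.Random(1)
labels = [1 if i < 20 else 0 for i in range(60)]
features = [[data_rng.gauss(0.8 * y, 1), data_rng.gauss(0, 1),
             data_rng.gauss(0, 1)] for y in labels]
result = optimism_corrected_auc(features, labels)
print(result)
```

      The essential point is that model fitting, including any feature selection, is repeated inside every resample, so the correction reflects the whole modeling procedure rather than only the final model.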

      Moreover, in our revision, to confirm prognostic signatures independently, we have evaluated the predictive value of identified gene signatures using qPCR in another set of samples. The results have been added in Table 4, supplementary Figure S8 and the results section.

      For the reasons given above (comment 4), we are unable to split our dataset by treatment allocation in this analysis. But as described, we have adjusted the analysis for corticosteroid treatment. Once the primary results of the LAST ACT trial have been published, we will examine the impact of corticosteroids on TBM pathophysiology and outcomes, seeking to better understand the mechanisms by which steroids have their therapeutic effects.

      Given the differences in pathogenesis and immune response associated with HIV co-infection, we stratified our analysis by HIV status. As the reviewer suggested, we have provided additional details in the methods section regarding the selection of hub genes from associated WGCNA modules and the feature selection process for predictive modeling.

    2. eLife assessment

      In this valuable study, the authors investigate the transcriptional landscape of tuberculous meningitis. They reveal potentially significant molecular differences contributed by HIV co-infection, and derive a prognostic model to predict mortality combining a gene expression signature with clinical parameters. Whilst some of the evidence presented is compelling, the bioinformatics analysis remains limited and cannot be used to make causal inferences and conclusions about immunopathogenesis for tuberculous meningitis. The work will be of broad interest to the infectious disease community; however, further validation of the findings is critical for future utility.

    3. Reviewer #1 (Public Review):

      Summary:

      Tuberculous meningitis (TBM) is one of the most severe forms of extrapulmonary TB. TBM is especially prevalent in people who are immunocompromised (e.g. HIV-positive). Delays in diagnosis and treatment could lead to severe disease or mortality. In this study, the authors performed the largest ever host whole blood transcriptomics analysis on a cohort of 606 Vietnamese participants. The results indicated that TBM mortality is associated with increased neutrophil activation and decreased T and B cell activation pathways. Furthermore, increased angiogenesis was also observed in HIV-positive patients who died from TBM, whereas activated TNF signaling and down-regulated extracellular matrix organisation were seen in the HIV-negative group. Despite similarities in transcriptional profiles between PTB and TBM compared to healthy controls, inflammatory genes were more active in HIV-positive TBM. Finally, 4 hub genes (MCEMP1, NELL2, ZNF354C and CD4) were identified as strong predictors of death from TBM.

      Strengths:

      This is a really impressive piece of work, both in terms of the size of the cohort, which took years of effort to recruit, sample and analyse, and the meticulous bioinformatics performed. The biggest advantage of obtaining a whole blood signature is that it allows easier translational development into a test that can be used in the clinic with a minimally invasive sample. Furthermore, the data from this study have also revealed important insights into the mechanisms associated with mortality and the differences in pathogenesis between HIV-positive and HIV-negative patients, which would have diagnostic and therapeutic implications.

      Weaknesses:

      The authors have addressed all the weaknesses in the revised version.

    4. Reviewer #2 (Public Review):

      Summary:

      This manuscript describes the analysis of blood transcriptomic data from patients with TB meningitis, with and without HIV infection, with some comparison to those of patients with pulmonary tuberculosis and healthy volunteers. The objectives were to describe the comparative biological differences represented by the blood transcriptome in TBM associated with HIV co-infection or survival/mortality outcomes, and to identify a blood transcriptional signature to predict these outcomes. The authors report an association between mortality and increased levels of acute inflammation and neutrophil activation, but decreased levels of adaptive immunity and T/B cell activation. They propose a 4-gene prognostic signature to predict mortality.

      Strengths:

      Biological evaluations of blood transcriptomes in TB meningitis and their relationship to outcomes have not been extensively reported previously.

      The size of the data set is a major strength and is likely to be used extensively for secondary analyses in this field of research.

      The addition of a new validation cohort to evaluate the generalisability of their prognostic model in the revised manuscript is welcome.

      Weaknesses:

      The bioinformatic analysis is limited to a descriptive narrative of gene-level functional annotations curated in GO and KEGG databases. This analysis cannot be used to make causal inferences. In addition, the functional annotations are limited to 'high-level' terms that fail to define the biology very precisely. As a result, the conclusions about the immunopathogenesis of TBM are not adequately substantiated.

      The lack of AUROC confidence intervals and direct comparison to the reference prognostic model in the validation cohort undermines confidence in their conclusion that their new prognostic model combining gene expression data and clinical variables performs better than the reference model.

    1. Author response:

      The following is the authors’ response to the previous reviews

      We extend our sincere gratitude for the invaluable comments provided by the reviewers and yourself, along with the constructive suggestions to enhance the quality of our manuscript. In response to this invaluable feedback, we have diligently revised and resubmitted our paper as an article, introducing five primary figures, seven supplementary figures, and two supplementary data files. Importantly, this work represents a significant contribution to the field, presenting novel findings for the first time without any prior publication.

      Within the enclosed document, we have provided a comprehensive response to the editor and reviewer comments, addressing each point meticulously and specifically. We extend our heartfelt thanks to the reviewers and yourself for your diligent examination of our manuscript and for offering insightful recommendations.

      In our latest revision, we have taken great care to address every comment, ensuring that we clarify the manuscript and provide robust evidence where required. We have meticulously highlighted the modifications within the manuscript in yellow for your convenience, while also including the modifications made in response to each specific comment. The primary focus of these revisions was to provide additional context regarding the relationship between PARP-1 and mono-methylated histones. Substantial modifications were made to our discussion section to address this point.

      Another concern raised was regarding the discrepancy in the relationship of PR-SET7 and PARP-1 between our study and the recent study by Estève et al. (PMID: 36434141). We have revised the results and discussion sections to discuss this concern.

      To address Reviewer 2’s concern that PARP1 may regulate some metabolic genes indirectly, despite binding directly to the loci that encode them, we revised the discussion section to highlight this possibility.

      Enclosed, you will find a detailed, point-by-point response to each of the editor’s and reviewers' comments, showcasing our commitment to addressing their concerns with precision.

      We firmly believe that our revisions successfully resolve all the concerns raised by the editor and the reviewers, and we are confident that this improved version of our manuscript contributes significantly to the scientific discourse. Once again, we thank you for considering our work, and please feel free to contact me if you require any additional information.

      In the revised manuscript, most of the concerns raised by the reviewers have been addressed satisfactorily. However, as suggested by Reviewer #2, the work would have been more significant if PARP1-mediated reading of global histone mono-methylation had been addressed. At a minimum, the mechanisms of PARP1 selectivity need a more convincing discussion.

      We thank the editor for their valuable comments. We have extended our discussion section to discuss in more detail the relationship between PARP1 and mono-methylated histones. In our refined Discussion section, we have endeavored to articulate more clearly how PARP-1 may be selectively recruited to active chromatin domains through its interaction with mono-methylated histone marks. We propose a model wherein PARP-1 actively participates in the turnover process, contributing to the maintenance of an active chromatin environment. This mechanism entails PARP-1 selectively binding to mono-methylated active histone marks associated with highly transcribed genes. Upon activation, PARP-1 undergoes automodification, leading to its release from chromatin and facilitating the reassembly of nucleosomes carrying the mono-methylated marks. Subsequently, the enzymatic action of Poly(ADP)-ribose glycohydrolase (PARG) cleaves pADPr, enabling the restoration of PARP-1's binding affinity to mono-methylated active histone marks. This proposed hypothesis is consistent with existing research across various model organisms and aligns with the known association of PARP-1 with highly expressed genes, as well as its role in mediating nucleosome dynamics and assembly.

      Our modified Discussion section unfolds as follows:

      "Finally, highly transcribed genes have been reported to present a high turnover of mono-methylated modifications, maintaining a state of low methylation (50). Moreover, our previous study revealed that PARP1 preferentially binds to highly active genes (34). Consequently, our findings suggest an active involvement of PARP-1 in the turnover process to maintain an active chromatin environment. This proposed mechanism unfolds in the following steps: 1) PARP-1 selectively binds to mono-methylated active histone marks associated with highly transcribed genes. 2) Upon activation, PARP-1 undergoes automodification and subsequently disengages from chromatin, facilitating the reassembly of nucleosomes carrying the mono-methylated marks. 3) The enzymatic action of Poly(ADP)-ribose glycohydrolase (PARG) cleaves pADPr, restoring PARP-1's binding affinity to mono-methylated active histone marks. This proposed hypothesis is consistent with existing research conducted across various model organisms, including mice, Drosophila, and humans (7, 24, 30, 51-53). Notably, previous studies have consistently demonstrated that PARP-1 predominantly associates with highly expressed genes and plays a crucial role in mediating nucleosome dynamics and assembly. Thus, our proposed model provides a molecular framework that may contribute to understanding the relationship between PARP-1 and the epigenetic regulation of gene expression."

      We trust that these revisions effectively address the editor’s comment and enhance the overall strength and clarity of our manuscript.

      Furthermore, recent developments in the area are omitted, as an important publication hasn't been discussed anywhere in the work (PMID: 36434141).

      We appreciate the editor's thorough review of our revised manuscript and the responses to the previous reviewer's comments. To address this important concern, we have carefully investigated the levels of PR-SET7 in parp1 hypomorphic conditions.

      Supplemental Fig. S4 and S5 demonstrate that in the absence of Parp1, there were no significant changes in PR-SET7 RNA or protein levels, respectively. This finding supports the conclusion that Parp1 is not directly involved in the regulation of PR-SET7 in Drosophila, in contrast to the findings of Estève et al. (PMID: 36434141). This discrepancy may arise from differing relationships between PARP-1 and PR-SET7, which could cooperate in the context of Drosophila development while playing antagonistic roles in specific cell lines or under particular conditions.

      We have updated the Results section to explicitly mention this observation:

      "Interestingly, in the absence of PARP-1, neither PR-SET7 RNA nor protein levels were affected (Supplemental Fig.S4-5), indicating that PARP-1 is not directly implicated in the regulation of pr-set7. This finding contrasts with recent evidence demonstrating PARP1-induced degradation of PR-SET7/SET8 in human cells (16)."

      Furthermore, we have modified the discussion section to address this discrepancy:

      "A recent study demonstrated that in human cells overexpressing PARP-1, PR-SET7/SET8 is degraded, whereas depletion of PARP-1 leads to an increase in PR-SET7/SET8 levels (16). However, our study of parp-1 mutant Drosophila third-instar larvae revealed a more nuanced scenario: we detected a minor but not significant reduction in both PR-SET7 RNA and protein levels (Supplemental Fig.S4 and S5). This outcome stands in stark contrast to the previous study's findings. The discrepancy could be due to the distinct experimental approaches used: the previous research focused on mammalian cells and in vitro experiments, whereas our study examined the functions of PARP-1 in whole Drosophila third-instar larvae during development. Consequently, while PARP-1 may cooperate with PR-SET7 in the context of Drosophila development, it could exhibit antagonistic roles against PR-SET7 in specific cell lines and under certain biological or developmental conditions."

      We believe that these modifications effectively address the raised concern and provide a more comprehensive understanding of the relationship between PARP1 and PR-SET7 in our study. We hope these clarifications enhance the overall robustness and clarity of our findings.

      Reviewer #2 (Public Review):

      Summary:

      This study from Bamgbose et al. identifies a new and important interaction between H4K20me and Parp1 that regulates inducible genes during development and heat stress. The authors present convincing experiments that form a mostly complete manuscript that significantly contributes to our understanding of how Parp1 associates with target genes to regulate their expression.

      Strengths:

      The authors present 3 compelling experiments to support the interaction between Parp1 and H4K20me, including:

      (1) PR-Set7 mutants remove all H4K20me and phenocopy Parp mutant developmental arrest and defective heat shock protein induction.

      (2) PR-Set7 mutants have dramatically reduced Parp1 association with chromatin and reduced poly-ADP ribosylation.

      (3) Parp1 directly binds H4K20me in vitro.

      Weaknesses:

      (1) The RNAseq analysis of Parp1/PR-Set7 mutants is reasonable, but there is a caveat to the authors' conclusion (Line 251): "our results indicate H4K20me1 may be required for PARP-1 binding to preferentially repress metabolic genes and activate genes involved in neuron development at co-enriched genes." An alternative possibility is that many of the gene expression changes are indirect consequences of altered development induced by Parp1 or PR-Set7 mutants. For example, Parp1 could activate a transcription factor that represses metabolic genes. The authors counter this model by stating that Parp1 directly binds to "repressed" metabolic genes. While this argument supports their model, it does not rule out the competing indirect transcription factor model. Therefore, they should still mention the competing model as a possibility.

      We appreciate Reviewer 2's insightful comments during both rounds of revision, which have significantly enriched the quality of our manuscript. The binding of PARP1 to loci encoding metabolic genes indeed suggests a direct role of PARP1 in their regulation. However, we acknowledge Reviewer 2's point that some of these targets might be regulated indirectly, with PARP1 potentially modulating the expression of intermediary transcription factors.

      To address this possibility, we have revised the discussion section of our manuscript accordingly:

      "Remarkably, our observations indicate a notable affinity of PARP-1 for binding to the gene bodies of these metabolic genes (34), suggesting a direct involvement of PARP1 in their regulation. Nonetheless, it remains plausible that certain genes may be indirectly regulated by PARP1 through intermediary transcription factors."

      We trust that this modification adequately addresses Reviewer 2's concern.

      (2) The section on inducibility of heat shock genes is interesting but missing an important control that might significantly alter the authors' conclusions. Hsp23 and Hsp83 (group B genes) are transcribed without heat shock, which likely explains why they have H4K20me without heat shock. The authors made the reasonable hypothesis that this H4K20me would recruit Parp-1 upon heat shock (line 270). However, they observed a decrease of H4K20me upon heat shock, which led them to conclude that "H4K20me may not be necessary for Parp1 binding/activation" (line 275). However, their RNA expression data (Fig4A) argue that both Parp1 and H4K20me are important for activation. An alternative possibility is that group B genes indeed recruit Parp1 (through H4K20me) upon heat shock, but then Parp1 promotes H3/H4 dissociation from group B genes. If Parp1 depletes H4, it will also deplete H4K20me1. To address this possibility, the authors should also do a ChIP for total H4 and plot both the raw signal of H4K20me1 and total H4 as well as the ratio of these signals. The authors could also note that Group A genes may similarly recruit Parp1 and deplete H3/H4 but with different kinetics than Group B genes because their basal state lacks H4K20me/Parp1. To test this possibility, the authors could measure Parp association, H4K20 methylation, and H4 depletion at more time points after heat shock at both classes of genes.
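
      The normalization the reviewer proposes amounts to dividing the H4K20me1 ChIP signal by the total H4 signal at each condition. A minimal sketch is given below; the function name and the numbers are placeholders chosen for illustration and are not measurements from the study.

```python
def normalize_to_h4(h4k20me1, total_h4):
    """Return the H4K20me1 signal per unit of total H4 for each condition."""
    if len(h4k20me1) != len(total_h4):
        raise ValueError("signal lists must have the same length")
    # Report NaN where total H4 is absent, since the ratio is undefined there.
    return [mark / h4 if h4 > 0 else float("nan")
            for mark, h4 in zip(h4k20me1, total_h4)]


# Placeholder values (e.g. percent of input), NOT data from the manuscript.
# Order: before heat shock, after heat shock.
h4k20me1_signal = [4.0, 2.0]
total_h4_signal = [8.0, 4.0]
print(normalize_to_h4(h4k20me1_signal, total_h4_signal))
```

      In this placeholder case the raw H4K20me1 signal halves after heat shock while the ratio is unchanged, which is the pattern expected if histone loss rather than demethylation explained the decrease.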

      We sincerely appreciate Reviewer 2 for their insightful comment on our manuscript. Your hypothesis regarding the potential induction of H3/H4 dissociation from group B genes by PARP-1, leading to a reduction in H4K20me1, offers a thought-provoking perspective. However, our findings suggest an alternative interpretation.

      Our data indicate that while H4K20me1 is indeed present under normal conditions at group B genes, its reduction following heat shock does not seem to impede PARP-1's role in transcriptional activation (Fig. 4A, C, and E). Instead, we propose that this decrease in H4K20me1 might signify a regulatory shift in chromatin structure, facilitating transcriptional activation during heat shock, with PARP-1 playing an independent facilitating role. Moreover, existing studies have highlighted the dual role of H4K20me1, acting as a promoter of transcription elongation in certain contexts and as a repressor in others.

      The elevated enrichment of H4K20me1 in group B genes under normal conditions may indeed indicate a repressive state that requires alleviation for transcriptional activation. Additionally, we cannot discount the possibility of unique regulatory functions associated with PR-SET7, extending beyond its recognized role as a histone methylase. Non-catalytic activities and potential interactions with non-histone substrates might contribute to the nuanced control exerted by PR-SET7 on group B genes during heat stress.

      Furthermore, our exploration of pr-set720 and ParpC03256 mutants reveals distinct roles for PARP-1 and H4K20me1 in modulating gene expression (Fig 3E). This reinforces the notion that the interplay between PR-SET7 and PARP-1 involves a multifaceted regulatory mechanism.

      To address these points, we have revised the discussion section of our manuscript accordingly:

      "Another plausible explanation could be that the recruitment of PARP-1 to group B gene loci promotes H4 dissociation and then leads to a reduction of H4K20me1. However, our findings suggest an alternative interpretation: the decrease in H4K20me1 at group B genes during heat shock does not seem to impede PARP-1's role in transcriptional activation (Fig.4A, C and E). Rather than disrupting PARP-1 function, we propose that this reduction in H4K20me1 may signify a regulatory shift in chromatin structure, priming these genes for transcriptional activation during heat shock, with PARP-1 playing an independent facilitating role. Moreover, existing studies have highlighted the dual role of H4K20me1, acting as a promoter of transcription elongation in certain contexts and as a repressor in others (13, 26, 39, 40, 42-46). The elevated enrichment of H4K20me1 in group B genes under normal conditions may indicate a repressive state that requires alleviation for transcriptional activation. Additionally, we cannot discount the possibility of unique regulatory functions associated with PR-SET7, extending beyond its recognized role as a histone methylase. Non-catalytic activities and potential interactions with non-histone substrates might contribute to the nuanced control exerted by PR-SET7 on group B genes during heat stress (47, 48). Furthermore, our exploration of pr-set720 and parp-1C03256 mutants reveals distinct roles for PARP-1 and H4K20me1 in modulating gene expression (Fig 3E). This reinforces the notion that the interplay between PR-SET7 and PARP-1 involves a multifaceted regulatory mechanism. Understanding the intricate relationship between these molecular players is crucial for elucidating the complexities of gene expression modulation under heat stress conditions."

      We believe that this modification enhances the clarity of our conclusions and adequately addresses Reviewer 2's concerns regarding the intricate relationship between PARP-1, H4K20me1, and PR-SET7 in transcriptional regulation under heat stress conditions.

    2. eLife assessment

      This valuable study presents convincing evidence for an association between PARP-1 and H4K20me1 in transcriptional regulation, supported by biochemical and ChIP-seq analyses. The work contributes significantly to our understanding of how Parp1 associates with target genes to regulate their expression.

    3. Reviewer #2 (Public Review):

      Summary:

      This study from Bamgbose et al. identifies a new and important interaction between H4K20me and Parp1 that regulates inducible genes during development and heat stress. The authors present convincing experiments that form a mostly complete manuscript that significantly contributes to our understanding of how Parp1 associates with target genes to regulate their expression.

      Strengths:

      The authors present 3 compelling experiments to support the interaction between Parp1 and H4K20me, including:

      (1) PR-Set7 mutants remove all H4K20me and phenocopy Parp mutant developmental arrest and defective heat shock protein induction.

      (2) PR-Set7 mutants have dramatically reduced Parp1 association with chromatin and reduced poly-ADP ribosylation.

      (3) Parp1 directly binds H4K20me in vitro.

    1. eLife assessment

      Using new cannabinoid receptor (CNR1 and CNR2) knockout mouse models, this important paper shows how dysregulation of the endocannabinoid system is involved in endometriosis progression. The transcriptomic evidence is solid, but a major limitation of the work is the absence of detailed measurements of lesion size and burden by histopathology.

    2. Reviewer #1 (Public Review):

      Summary:

      The endocannabinoid system (ECS) components are dysregulated within the lesion microenvironment and systemic circulation of endometriosis patients. Using endometriosis mouse models and genetic loss of function approaches, Lingegowda et al. report that canonical ECS receptors, CNR1 and CNR2, are required for disease initiation, progression, and T-cell dysfunction.

      Strengths:

      The approach uses genetic approaches to establish in vivo causal relationships between dysregulated ECS and endometriosis pathogenesis. The experimental design incorporates bulk RNAseq approaches, as well as imaging mass spectrometry to characterize the mouse lesions. The identification of immune-related and T-cell-specific changes in the lesion microenvironment of CNR1 and CNR2 knockout (KO) mice represents a significant advance.

      Weaknesses:

      Although the mouse phenotypic analyses involve a detailed molecular characterization of the lesion microenvironment using genomic approaches, detailed measurements of lesion size/burden and histopathology would provide a better understanding of how CNR1 or CNR2 loss contributes to endometriosis initiation and progression. The cell or tissue-specific effects of the CNR1 and CNR2 are not incorporated into the experimental design of the studies. Although this aspect of the approach is recognized as a major limitation, global CNR1 and CNR2 KO may affect normal female reproductive tract function, ovarian steroid hormone levels, decidualization response, or lead to preexisting alterations in host or donor tissues, which could affect lesion establishment and development in the surgically induced, syngeneic mouse model of endometriosis.

    3. Reviewer #2 (Public Review):

      Summary:

      The endocannabinoid system (ECS) regulates many critical functions, including reproductive function. Recent evidence indicates that dysregulated ECS contributes to endometriosis pathophysiology and the microenvironment. Therefore, the authors further examined the dysregulated ECS and its mechanisms in endometriosis lesion establishment and progression using two different endometrial sources of mouse models of endometriosis with CNR1 and CNR2 knockout mice. The authors presented differential gene expressions and altered pathways, especially those related to the adaptive immune response in CNR1 and CNR2 ko lesions. Interestingly, the T-cell population was dramatically reduced in the peritoneal cavity lacking CNR2, with a loss of proliferative activity of CD4+ T helper cells. Imaging mass cytometry analysis provided spatial profiling of cell populations and potential relationships among immune cells and other cell types. This study provided fundamental knowledge of the endocannabinoid system in endometriosis pathophysiology.

      Strengths:

      Dysregulated ECS and its mechanisms in endometriosis pathogenesis were assessed using two different endometrial sources of mouse models of endometriosis with CNR1 and CNR2 knockout mice. Not only endometriotic lesions, but also peritoneal exudate (and splenic) cells were analyzed to understand the specific local disease environment under the dysregulated ECS.

      The transcriptional profiles and pathways, immune cell profiles, and spatial profiles of cell populations support altered immune cell populations and their disrupted functions in endometriosis pathogenesis via dysregulation of ECS.

      In line 386: Role of CNR2 in T cells. The finding that CD3+ T cells are nearly absent in the peritoneal cavity of CNR2 ko mice is intriguing.

      The interpretation of the results is well-described in the Discussion.

      Weaknesses:

      The study was terminated and characterized 7 days after EM induction surgery without the details for selecting the time point to perform the experiments.

      The authors also mentioned that altered eutopic endometrium contributes to the establishment and progression of endometriosis. This reviewer agrees with lines 324-325. If so, DEGs are likely identified between eutopic endometrium (with/without endometriosis lesion induction) and ectopic lesions. It would be nice to see the data (even though using publicly available data sets).

      Figure 7 CDEF. The results of the statistical analyses and analyzed sample numbers should be added. Lines 444-450 cannot be reviewed without them.

      This reviewer agrees with lines 498-500. In contrast, retrograded menstrual debris is not decidualized. The section could be modified to avoid misunderstanding.

    4. Author response:

      Reviewer #1 (Public Review):

      Summary:

      The endocannabinoid system (ECS) components are dysregulated within the lesion microenvironment and systemic circulation of endometriosis patients. Using endometriosis mouse models and genetic loss of function approaches, Lingegowda et al. report that canonical ECS receptors, CNR1 and CNR2, are required for disease initiation, progression, and T-cell dysfunction.

      Strengths:

      The approach uses genetic approaches to establish in vivo causal relationships between dysregulated ECS and endometriosis pathogenesis. The experimental design incorporates bulk RNAseq approaches, as well as imaging mass spectrometry to characterize the mouse lesions. The identification of immune-related and T-cell-specific changes in the lesion microenvironment of CNR1 and CNR2 knockout (KO) mice represents a significant advance.

      Weaknesses:

      Although the mouse phenotypic analyses involve a detailed molecular characterization of the lesion microenvironment using genomic approaches, detailed measurements of lesion size/burden and histopathology would provide a better understanding of how CNR1 or CNR2 loss contributes to endometriosis initiation and progression. The cell or tissue-specific effects of the CNR1 and CNR2 are not incorporated into the experimental design of the studies. Although this aspect of the approach is recognized as a major limitation, global CNR1 and CNR2 KO may affect normal female reproductive tract function, ovarian steroid hormone levels, decidualization response, or lead to preexisting alterations in host or donor tissues, which could affect lesion establishment and development in the surgically induced, syngeneic mouse model of endometriosis.

      We appreciate the reviewer's thoughtful and constructive feedback. We agree that the additional measurements of lesion size/burden and histopathology would provide valuable insights into the specific contributions of CNR1 and CNR2 to endometriosis progression. However, the focus of this study was on assessing the alterations in complex immune microenvironment due to the absence of CNR1 and CNR2, given their close relation in regulating immune cell populations. We will plan to incorporate these measurements in future studies to further strengthen the understanding of the disease pathogenesis. Regarding the potential effects of global knockout, the reviewer raises a valid concern. To address this, we will explore cell and/or tissue-specific knockout models in future experiments to better isolate the direct effects of CNR1 and CNR2 on the disease process, while minimizing potential confounding factors from systemic alterations.

      Reviewer #2 (Public Review):

      Summary:

      The endocannabinoid system (ECS) regulates many critical functions, including reproductive function. Recent evidence indicates that dysregulated ECS contributes to endometriosis pathophysiology and the microenvironment. Therefore, the authors further examined the dysregulated ECS and its mechanisms in endometriosis lesion establishment and progression using two different endometrial sources of mouse models of endometriosis with CNR1 and CNR2 knockout mice. The authors presented differential gene expressions and altered pathways, especially those related to the adaptive immune response in CNR1 and CNR2 ko lesions. Interestingly, the T-cell population was dramatically reduced in the peritoneal cavity lacking CNR2, with a loss of proliferative activity of CD4+ T helper cells. Imaging mass cytometry analysis provided spatial profiling of cell populations and potential relationships among immune cells and other cell types. This study provided fundamental knowledge of the endocannabinoid system in endometriosis pathophysiology.

      Strengths:

      Dysregulated ECS and its mechanisms in endometriosis pathogenesis were assessed using two different endometrial sources of mouse models of endometriosis with CNR1 and CNR2 knockout mice. Not only endometriotic lesions, but also peritoneal exudate (and splenic) cells were analyzed to understand the specific local disease environment under the dysregulated ECS.

      The transcriptional profiles and pathways, immune cell profiles, and spatial profiles of cell populations support altered immune cell populations and their disrupted functions in endometriosis pathogenesis via dysregulation of ECS.

      In line 386: Role of CNR2 in T cells. The finding that CD3+ T cells are nearly absent in the peritoneal cavity of CNR2 ko mice is intriguing.

      The interpretation of the results is well-described in the Discussion.

      Weaknesses:

      The study was terminated and characterized 7 days after EM induction surgery without the details for selecting the time point to perform the experiments.

      The authors also mentioned that altered eutopic endometrium contributes to the establishment and progression of endometriosis. This reviewer agrees with lines 324-325. If so, DEGs are likely identified between eutopic endometrium (with/without endometriosis lesion induction) and ectopic lesions. It would be nice to see the data (even though using publicly available data sets).

      Figure 7 CDEF. The results of the statistical analyses and analyzed sample numbers should be added. Lines 444-450 cannot be reviewed without them.

      This reviewer agrees with lines 498-500. In contrast, retrograded menstrual debris is not decidualized. The section could be modified to avoid misunderstanding.

      We would like to thank the reviewer for the insightful comments and suggestions, and for acknowledging the importance of the work presented in this manuscript.

      Regarding the 7-day time point, we provided the rationale in lines 479-481, but agree that it isn’t sufficient; hence, we have provided additional details on the selection of the 7-day time point for the experiments in the methods section (Mouse model of EM). We have also noted the suggestion to compare differentially expressed genes in the eutopic endometrium vs ectopic lesions. Since there are publications comparing the eutopic vs ectopic gene expression patterns (PMIDs: 33868805 and 18818281), including a study exploring the ECS genes in the endometrium throughout different menstrual cycles (PMID: 35672435), we believe additional analysis using the same dataset may not yield new information. However, we see the value in the reviewer’s comment, and we will look at the gene expression patterns in the uterine vs endometriosis-like lesions in our future studies with tissue- or cell-specific CNR1 and CNR2 knockout models to understand the functional relevance of ECS in endometriosis initiation.

      Since the IMC study was exploratory for proof of concept, we did not have enough biological replicates for meaningful statistical validation (n = 2-3). We have clarified this information in the methods, results, and figure legends for appropriately representing the limitations of the current setup.

      Finally, we appreciate the feedback on the section discussing retrograded menstrual debris. Even though the menstrual debris may not be decidualized, some endometriotic lesions have the ability to decidualize based on their response to estrogen and progesterone in a cycling manner (PMID: 26450609), similar to the endometrium in the uterine cavity. We have clarified this in the revised MS.

    1. eLife assessment

      In this useful study, the authors show that N-acetylation of synuclein increases clustering of synaptic vesicles in vitro and that this effect is mediated by enhanced interaction with lysophosphatidylcholine. While the evidence for enhanced clustering is largely solid, the biological significance remains unclear.

    2. Reviewer #1 (Public Review):

      α-synuclein (syn) is a critical protein involved in many aspects of human health and disease. Previous studies have demonstrated that post-translational modifications (PTMs) play an important role in regulating the structural dynamics of syn. However, how post-translational modifications regulate syn function remains unclear. In this manuscript, Wang et al. reported an exciting discovery that N-acetylation of syn enhances the clustering of synaptic vesicles (SVs) through its interaction with lysophosphatidylcholine (LPC). Using an array of biochemical reconstitution, single vesicle imaging, and structural approaches, the authors uncovered that N-acetylation caused distinct oligomerization of syn in the presence of LPC, which is directly related to the level of SV clustering. This work provides novel insights into the regulation of synaptic transmission by syn and might also shed light on new ways to control neurological disorders caused by syn mutations.

    3. Reviewer #2 (Public Review):

      Summary:

      In this manuscript, the authors provide evidence that posttranslational modification of synuclein by N-acetylation increases clustering of synaptic vesicles in vitro. When using liposomes the authors found that while clustering is enhanced by the presence of either lysophosphatidylcholine (LPC) or phosphatidylcholine in the membrane, N-acetylation enhanced clustering only in the presence of LPC. Enhancement of binding was also observed when LPC micelles were used, which was corroborated by increased intra/intermolecular cross-linking of N-acetylated synuclein in the presence of LPC.

      Strengths:

      It has been known for many years that synuclein binds to synaptic vesicles, but the physiological role of this interaction is still debated. The strength of this manuscript is clearly in the structural characterization of the interaction of synuclein and lipids (involving NMR-spectroscopy) showing that the N-terminal 100 residues of synuclein are involved in LPC-interaction, and the demonstration that N-acetylation enhances the interaction between synuclein and LPC.

      Weaknesses:

      Lysophosphatides form detergent-like micelles that destabilize membranes, with their steady-state concentrations in native membranes being low, questioning the significance of the findings. Oddly, no difference in binding between the N-acetylated and unmodified form was observed when the acidic phospholipid phosphatidylserine was included. It remains unclear to which extent binding to LPC is physiologically relevant, particularly in the light of recent reports from other laboratories showing that synuclein may interact with liquid-liquid phases of synapsin I that were reported to cause vesicle clustering.

    1. eLife assessment

      This manuscript reports important data on the interaction of Rev7 with the Rad50-Mre11-Xrs2 complex in budding yeast providing evidence that a 42 amino acid region of Rev7 is necessary and sufficient for interaction. Rev7 is found to inhibit the Rad50 ATPase and the Mre11 nuclease activities, with the exception of the ssDNA exonuclease activity. Overall, the study is incomplete: controls are lacking, there is little evidence to support the conclusion about DSB repair pathway usage, and the work on the role of Mre11 in G4 metabolism is underdeveloped.

    2. Reviewer #1 (Public Review):

      Summary:

      The mammalian Shieldin complex consisting of REV7 (aka MAD2L2, MAD2B) and SHLD1-3 affects pathway usage in DSB repair favoring non-homologous end joining (NHEJ) at the expense of homologous recombination (HR) by blocking resection and/or priming fill-in DNA synthesis to maintain or generate near blunt ends suitable for NHEJ. While the budding yeast Saccharomyces cerevisiae does not have homologs to SHLD1-3, it does have Rev7, which was identified to function in conjunction with Rev3 in the translesion DNA polymerase zeta. Testing the hypothesis that Rev7 also affects DSB resection in budding yeast, the work identified a direct interaction between Rev7 and the Rad50-Mre11-Xrs2 complex by two-hybrid and direct protein interaction experiments. Deletion analysis identified that the 42 amino acid C-terminal region was necessary and sufficient for the 2-hybrid interaction. Direct biochemical analysis of the 42 aa peptide was not possible. Rev7-deficient cells were found to be sensitive to HU only in synergy with G4 tetraplex-forming DNA. Importantly, the 42 aa peptide alone suppressed this phenotype. Biochemical analysis with full-length Rev7 and a C-terminal truncation lacking the 42 aa region shows G4-specific DNA binding that is abolished in the C-terminal truncation and with a substrate containing mutations to prevent G4 formation. Rev7 lacks nuclease activity but inhibits the dsDNA exonuclease activity of Mre11. The C-terminal truncation protein lacking the 42 aa region also showed some inhibition suggesting the involvement of additional binding sites besides the 42 aa region. Also, the Mre11 ssDNA endonuclease activity is inhibited by Rev7 but not the degradation of linear ssDNA. Rev7 does not affect ATP binding by Rad50 but inhibits in a concentration-dependent manner the Rad50 ATPase activity. The C-terminal truncation protein lacking the 42 aa region also showed some inhibition but significantly less than the full-length protein.

      Using an established plasmid-based NHEJ assay, the authors provide strong evidence that Rev7 affects NHEJ, showing a four-fold reduction in this assay. The mutations in the other Pol zeta subunits, Rev3 and Rev1, show a significantly smaller effect (~25% reduction). A strain expressing only the Rev7 C-terminal 42 aa peptide showed no NHEJ defect, while the truncation protein lacking this region exhibited a smaller defect than the deletion of REV7. The conclusion that Rev7 supports NHEJ mainly through the 42 aa region was validated using a chromosomal NHEJ assay. The effect on HR was assessed using a plasmid:chromosome system containing G4-forming DNA. The rev7 deletion strain showed an increase in HR in this system in the presence and absence of HU. Cells expressing the 42 aa peptide were indistinguishable from the wild type as were cells expressing the Rev7 truncation lacking the 42 aa region. The authors conclude that Rev7 suppresses HR, but the context appears to be system-specific and the conclusion that Rev7 abolished HR repair of DSBs is unwarranted and overly broad.

      Strength:

      This is a well-written manuscript with many well-executed experiments that suggest that Rev7 inhibits MRX-mediated resection to favor NHEJ during DSB repair. This finding is novel and provides insight into the potential mechanism of how the human Shieldin complex might antagonize resection.

      Weaknesses:

      The nuclease experiments were conducted using manganese as a divalent cation, and it is unclear whether there is an effect with the more physiological magnesium cation. Additional controls for the ATPase and nuclease experiments to eliminate non-specific effects would be helpful. Evidence for an effect on resection in cells is lacking. The major conclusion about the role of Rev7 in regulating the choice between HR and NHEJ is not justified, as only a highly specialized assay is used that does not warrant the broad conclusion drawn. Specifically, the result that the Rev7 C-terminal truncation lacking the 42 aa region still suppresses HR is unexpected and unexplained. The effect of Rev7 on G4 metabolism is underdeveloped and distracts from the main result that Rev7 modulates MRX activity. The authors should consider removing this part and developing a more complete story on this later.

    3. Reviewer #2 (Public Review):

      In this study, Badugu et al. investigate the roles of Rev7 in regulating the Mre11-Rad50-Xrs2 complex and in the metabolism of G4 structures. The authors also try to conclude that REV7 can regulate the DSB repair choice between homologous recombination and non-homologous end joining.

      The major observations of this study are:

      (1) Rev7 interacts with the individual components of the MRX complex in a two-hybrid assay and in a protein-protein interaction assay (microscale thermophoresis) in vitro.

      (2) Modeling using AlphaFold-Multimer also indicated that Rev7 can interact with Mre11 and Rad50.

      (3) Using a two-hybrid assay, a 42 amino acid C-terminal domain in Rev7 responsible for the interaction with MRX was identified.

      (4) Rev7 inhibits Mre11 nuclease and Rad50 ATPase activities in vitro.

      (5) Rev7 promotes NHEJ in a plasmid cutting/religation assay.

      (6) Rev7 inhibits recombination between the chromosomal ura3-1 allele and a plasmid ura3 allele containing a G4 structure.

      (7) Using an assay developed in V. Zakian's lab, it was found that rev7 mutants grow poorly when both G4 is present in the genome and yeast are treated with HU.

      (8) In vitro, purified Rev7 binds to G4-containing substrates.

      In general, a lot of experiments have been conducted, but the major conclusion about the role of Rev7 in regulating the choice between HR and NHEJ is not justified.

      (1) Two stories that do not overlap (regulation of MRX by Rev7 and Rev7's role in G4 metabolism) are brought under one umbrella in this work. There is no connection unless the authors demonstrate that Rev7 inhibits the cleavage of G4 structures by the MRX complex.

      (2) The authors cannot conclude based on the recombination assay between G4-containing 2-micron plasmid and chromosomal ura3-1 that Rev7 "completely abolishes DSB-induced HR". First of all, there is no evidence that DSBs are formed at G4. Why is there no induction of recombination when cells are treated with HU? Second, as the authors showed, Rev7 binds to G4, therefore it is not clear if the observed effects are the result of Rev7 interaction with G4 or its impact on HR. The established HO-based assays where the speed of resection can be monitored (e.g., Mimitou and Symington, 2010) have to be used to justify the conclusion that Rev7 inhibits MRX nuclease activity in vivo.

    4. Reviewer #3 (Public Review):

      Summary:

      REV7 facilitates the recruitment of the Shieldin complex and thereby inhibits end resection and controls DSB repair choice in metazoan cells. Puzzlingly, Shieldin is absent in many organisms and it is unknown if and how Rev7 regulates DSB repair in these cells. The authors surmised that yeast Rev7 physically interacts with Mre11/Rad50/Xrs2 (MRX), the short-range resection nuclease complex, and tested this premise using yeast two-hybrid (Y2H) and microscale thermophoresis (MST). The results convincingly showed that the individual subunits of MRX interact robustly with Rev7. AlphaFold Multimer modelling followed by Y2H confirmed that the carboxy-terminal 42 amino acids are essential for interaction with MR and G4 DNA binding by REV7. The mutant rev7 lacking the binding interface (Rev7-C1) to MR shows moderate inhibition of the nuclease and the ATPase activity of Mre11/Rad50 in biochemical assays. Deletion of REV7 also causes a mild reduction in NHEJ using both plasmid and chromosome-based assays and increases mitotic recombination between chromosomal ura3-1 and the plasmid ura3 allele interrupted by G4. The authors concluded that Rev7 facilitates NHEJ and antagonizes HR even in budding yeast, but it achieves this by blocking Mre11 nuclease and Rad50 ATPase.

      Weaknesses:

      There are many strengths to the studies, and a broad range of well-established assays was used to reach the conclusions. Nevertheless, I have several concerns about the validity of experimental settings due to the lack of several key controls essential to interpret the experimental results. The manuscript also needs a few additional functional assays to reach the accurate conclusions as proposed.

      (1) AlphaFold model predicts that Mre11-Rev7 and Rad50-Rev7 binding interfaces overlap and Rev7 might bind only to Mre11 or Rad50 at a time. Interestingly, however, Rev7 appears dimerized (Figure 1). Since the MR complex also forms with 2M and 2R in the complex, it should still be possible for REV7 to interact with both M and R in the MR complex. The authors should perform MST using the MR complex instead of individual MR components. The authors should also analyze if Rev7-C1 is indeed deficient in interaction with M and R individually and with the complex using the MST assay.

      (2) The nuclease and the ATPase assays require additional controls. Does Rev7 inhibit the other nuclease or ATPase non-specifically? Are these outcomes due to the non-specific or promiscuous activity of Rev7? In Figure 6, the effect of REV7 on the ATP binding of Rad50 could be hard to assess because the maximum Rad50 level (1 uM) was used in the experiments. The author should use the suboptimal level of Rad50 to check if REV7 still does not influence ATP binding by Rad50.

      (3) The moderate deficiency in NHEJ using plasmid-based assay in REV7 deleted cells can be attributed to aberrant cell cycle or mating type in rev7 deleted cells. The authors should demonstrate that rev7 deleted cells retain largely normal cell cycle patterns and the mating type phenotypes. The author should also analyze the breakpoints in plasmid-based NHEJ assays in all mutants, especially from rev7 and rev7-C1 cells.

      (4) It is puzzling why the authors did not analyze end resection defects in rev7 deleted cells after a DSB. The author should employ the widely used resection assay after a HO break in rev3, rev7, and mre11 rev7 cells as described previously.

      (5) Is it possible that Rev7 also contributes to NHEJ as the part of TLS polymerase complex? Although NHEJ largely depends on Pol4, the authors should not rule out that the observed NHEJ defect in rev7 cells is due at least partially to its TLS defect. In fact, both rev3 or rev1 cells are partially defective in NHEJ (Figure 7). Rev7-C1 is less deficient in NHEJ than REV7 deletion. These results predict that rev7-C1 rev3 should be as defective as the rev7 deletion. Additionally, the authors should examine if Rev7-C1 might be deficient in TLS. In this regard, does rev7-C1 reduce TLS and TLS-dependent mutagenesis? Is it dominant? The authors should also check if Rev3 or Rev1 are stable in Rev7 deleted or rev7-C1 cells by immunoblot assays.

      (6) Due to the G4 DNA and G4 binding activity of REV7, it is not clear which class of events the authors are measuring in plasmid-chromosome recombination assay in Figure 9. Do they measure G4 instability or the integrity of recombination or both in rev7 deleted cells? Instead, the effect of rev7 deletion or rev7-C1 on recombination should be measured directly by more standard mitotic recombination assays like mating type switch or his3 repeat recombination.

    1. eLife assessment

      This fundamental study advances our understanding of the role of bacterial derived extracellular ATP in the pathogenesis of sepsis. The evidence supporting the conclusions is solid, particularly with the analysis of E. coli mutants to address different aspects of bacterial release of ATP. The work will be of broad interest to researchers on microbiology and infectious diseases.

    2. Reviewer #1 (Public Review):

      Summary:

      Extracellular ATP represents a danger-associated molecular pattern associated with tissue damage and can also act in an autocrine fashion in macrophages to promote proinflammatory responses, as observed in a previous paper by the authors in abdominal sepsis. The present study addresses an important aspect possibly conditioning the outcome of sepsis, namely the release of ATP by bacteria. The authors show that sepsis-associated bacteria do in fact release ATP in a growth-dependent and strain-specific manner. However, whether this bacteria-derived ATP plays a role in the pathogenesis of abdominal sepsis has not been determined. To address this question, a number of mutant strains of E. coli have been used first to correlate bacterial ATP release with growth and then with outer membrane integrity and bacterial death. By using E. coli transformants expressing the ATP-degrading enzyme apyrase in the periplasmic space, the paper nicely shows that abdominal sepsis by these transformants results in significantly improved survival. This effect was associated with a reduction of peritoneal macrophages and CX3CR1+ monocytes, and an increase in neutrophils. To extrapolate the function of bacterial ATP from the systemic response to microorganisms, the authors exploited bacterial OMVs either loaded or not with ATP to investigate the systemic effects devoid of living microorganisms. This approach showed that ATP-loaded OMVs induced degranulation of neutrophils after lysosomal uptake, suggesting that this mechanism could contribute to sepsis severity.

      Strengths:

      A strong part of the study is the analysis of E. coli mutants to address different aspects of bacterial release of ATP that could be relevant during systemic dissemination of bacteria in the host.

      Weaknesses:

      As pointed out in the limitations of the study, whether ATP-loaded OMVs provide a mechanistic proof of the pathogenetic role of bacteria-derived ATP independently of live microorganisms in sepsis is interesting but not definitively convincing. It could be useful to see whether degranulation of neutrophils is differentially induced by apyrase-expressing vs control E. coli transformants. Also, the increase of neutrophils in bacterial ATP-depleted abdominal sepsis, which has better outcomes than "ATP-proficient" sepsis, seems difficult to correlate with the hypothesized tissue damage induced by ATP delivered via non-infectious OMVs. Are the neutrophil counts affected by ATP delivered via OMVs? A comparison of cytokine profiles in the abdominal fluids of E. coli and OMV-treated animals could be helpful in defining the different responses induced by OMV-delivered vs bacterial-released ATP. The analyses performed on OMV-treated versus E. coli-infected mice are not closely related and are difficult to combine when trying to draw a hypothesis for bacterial ATP in sepsis. Also, it was not clear why lung neutrophils were used for the RNAseq data generation and analysis.

    3. Reviewer #2 (Public Review):

      Summary:

      In their manuscript "Released Bacterial ATP Shapes Local and Systemic Inflammation during Abdominal Sepsis", Daniel Spari et al. explored the dual role of ATP in exacerbating sepsis, revealing that ATP from both host and bacteria significantly impacts immune responses and disease progression.

      Strengths:

      The study meticulously examines the complex relationship between ATP release and bacterial growth, membrane integrity, and how bacterial ATP potentially dampens inflammatory responses, thereby impairing survival in sepsis models. Additionally, this compelling paper proposes that bacterial OMVs act as vehicles for the systemic distribution of ATP, influencing neutrophil activity and exacerbating sepsis severity.

      Weaknesses:

      (1) The researchers extracted and cultivated abdominal fluid on LB agar plates, then randomly picked 25 colonies for analysis. However, they did not conduct 16S rRNA gene amplicon sequencing on the fluid itself. It is worth noting that the bacterial species present may vary depending on the individual patients. It would be beneficial if the authors could specify whether they've verified the existence of unculturable species capable of secreting high levels of Extracellular ATP.

      (2) Do mice lacking commensal bacteria show a lack of extracellular ATP following cecal ligation puncture?

      (3) The authors isolated various bacteria from abdominal fluid, encompassing both Gram-negative and Gram-positive types. Nevertheless, their emphasis appeared to be primarily on the Gram-negative E. coli. It would be beneficial to ascertain whether the mechanisms of Extracellular ATP release differ between Gram-positive and Gram-negative bacteria. This is particularly relevant given that the Gram-positive bacterium E. faecalis, also isolated from the abdominal fluid, is recognized for its propensity to release substantial amounts of Extracellular ATP.

      (4) The authors observed changes in the levels of LPM, SPM, and neutrophils in vivo. However, it remains uncertain whether the proliferation or migration of these cells is modulated or inhibited by ATP receptors like P2Y receptors. This aspect requires further investigation to establish a convincing connection.

      (5) Additionally, is it possible that the observed in vivo changes could be triggered by bacterial components other than Extracellular ATP? In this research field, a comprehensive collection of inhibitors is available, so it is desirable to utilize them to demonstrate clearer results.

      (6) Have the authors considered the role of host-derived Extracellular ATP in the context of inflammation?

      (7) The authors mention that Extracellular ATP is rapidly hydrolyzed by ectonucleotidases in vivo. Are the changes in immune cells within the peritoneal cavity caused by Extracellular ATP released from bacterial death or by OMVs?

      (8) In the manuscript, the sample size (n) for the data consistently remains at 2. I would suggest expanding the sample size to enhance the robustness and rigor of the results.

    1. eLife assessment

      This important study investigates, from Drosophila to mammals, the role of the Forkhead box O (FoxO) transcription factors in airway epithelial cells' response to stressors including hypoxia, temperature variations, and oxidative stress. The findings suggest a conserved role of FoxO in maintaining airway homeostasis across species. However, limitations in the specificity and concerns with the loss-of-function experiments render the evidence presented incomplete. Nonetheless, this study highlights FoxO's potential relevance in respiratory diseases like asthma and offers insights into potential therapeutic targets for conditions affecting airway health.

    2. Joint Public Review

      This work investigates the evolutionary conservation and functional significance of FoxO transcription factors in the response of airway epithelia to diverse stressors, ranging from hypoxia to temperature fluctuations and oxidative stress. Utilizing a comprehensive approach encompassing Drosophila, murine models, and human samples, the study investigates FoxO's role across species. The authors demonstrate that hypoxia triggers a dFOXO-dependent immune response in Drosophila airways, with subsequent nuclear localization of dFOXO in response to various stressors. Transcriptomic analysis reveals differential regulation of crucial gene categories in respiratory tissues, highlighting FoxO's involvement in metabolic pathways, DNA replication, and stress resistance mechanisms.

      The study underscores FoxO's importance in maintaining homeostasis by revealing reduced stress resistance in dFOXO Drosophila mutants, shedding light on its protective role against stressors. In mammalian airway cells, FoxO exhibits nuclear translocation in response to hypoxia, accompanied by upregulation of cytokines with antimicrobial activities. Intriguingly, mouse models of asthma show FoxO downregulation, which is also observed in sputum samples from human asthma patients, implicating FoxO dysregulation in respiratory pathologies.

      Overall, the manuscript suggests that FoxO signaling plays a critical role in preserving airway epithelial cell homeostasis under stress conditions, with implications for understanding and potentially treating respiratory diseases like asthma. By providing compelling evidence of FoxO's involvement across species and its correlation with disease states, the study underscores the importance of further exploration into FoxO-mediated mechanisms in respiratory health.

      Strengths

      (1) This study shows that FoxO transcription factors are critical for regulating immune and inflammatory responses across species, and for orchestrating responses to various stressors encountered by airway epithelial cells, including hypoxia, temperature changes, and oxidative stress. Understanding the intricate regulation of FoxO transcription factors provides insights into modulating immune and inflammatory pathways, offering potential avenues for therapeutic interventions against respiratory diseases and other illnesses.

      (2) The work employs diverse model systems, including Drosophila, murine models, and human samples, thereby establishing a conserved role for FoxOs in airway epithelium and aiding translational relevance to human health.

      (3) The manuscript establishes a strong correlation between FoxO expression levels and respiratory diseases such as asthma. Through analyses of both murine models of asthma and asthmatic human samples, the study demonstrates a consistent reduction in FoxO expression, indicating its potential involvement in the pathogenesis of respiratory disorders. This correlation underscores the clinical relevance of FoxO dysregulation and opens avenues for developing treatments for respiratory conditions like asthma, COPD, and pulmonary fibrosis, addressing significant unmet clinical needs.

      (4) The study unveils intriguing mechanistic details regarding FoxO regulation and function. Particularly noteworthy is the observation of distinct regulatory mechanisms governing dFOXO translocation in response to different stressors. The independence of hypoxia-induced dFOXO translocation from JNK signaling adds complexity to our understanding of FoxO-mediated stress responses. Such mechanistic insights deepen our understanding of FoxO biology and pave the way for future investigations into the intricacies of FoxO signaling pathways in airway epithelial cells.

      Weaknesses

      (1) The manuscript does not distinguish between FoxO expression levels and FoxO activation status. While FoxO nuclear localization is observed in Drosophila and murine models, it remains unclear whether this reflects active FoxO signaling or merely FoxO expression, limiting the mechanistic understanding of FoxO regulation.

      (2) The manuscript utilizes various stressors across different experiments without providing a clear rationale for their selection. This lack of coherence in stressor choice complicates the interpretation of results and diminishes the ability to draw meaningful comparisons across experiments.

      (3) The manuscript frequently refers to "FoxO signaling" without providing specific signaling readouts. This ambiguity undermines the clarity of the conclusions drawn from the data and hinders the establishment of clear cause-and-effect relationships between FoxO activation and cellular responses to stress.

      (4) Many conclusions drawn in the manuscript rely heavily on the quantification of immunostaining images for FoxO nuclear localization. While this is an important observation, it does not provide a sufficient mechanistic understanding of FoxO expression or activation regulation.

      (5) The primary weakness in the Drosophila experiments is the analysis of dFoxO in homozygous dFoxO mutant animals, which precludes determining the specific role of dFoxO in airway cells. Despite available tools for tissue-specific gene manipulation, such as tissue-specific RNAi and CRISPR techniques, these approaches were not employed, limiting the precision of the findings.

      (6) In mammalian experiments, the results are primarily correlative, lacking causal evidence. While changes in FoxO expression are observed under pathological conditions, the absence of experiments on FoxO-deficient cells or tissues precludes establishing a causal relationship between FoxO dysregulation and respiratory pathologies.

      (7) Although the evidence suggests a critical role for FoxO in airway tissues, the precise nature of this role remains unclear. With gene expression changes analyzed only in Drosophila, the extent of conservation in downstream FoxO-mediated pathways between mammals and Drosophila remains uncertain. Additionally, the functional consequences of FoxO deficiency in airway cells were not determined, hindering comparisons between species and limiting insights into FoxO's functional roles in different contexts.

    1. eLife assessment

      This fundamental study provides insights into how pathogens respond, on a systemic level including several gene targets and clusters, to selected antimicrobial molecules. Compelling evidence is provided, through multi-omics and functional approaches, that very similar molecules originally designed to target the same bacterial protein act differently within the context of the whole set of cellular transcripts, expressed proteins, and pre-lethal metabolic changes. Given the incredibly fast accumulation of omics data to date and the much slower capacity of extracting biologically relevant insights from big data, this work exemplifies how the development of sensitive data analysis is still a major necessity in modern research.

    2. Reviewer #1 (Public Review):

      In this manuscript, entitled "Merging Multi-OMICs with Proteome Integral Solubility Alteration Unveils Antibiotic Mode of Action", Dr. Maity and colleagues aim to elucidate the mechanisms of action of antibiotics by combining omics approaches with the PISA tool to discover new targets of five drugs developed against Helicobacter pylori.

      Strengths:

      Using transcriptomics, proteomic analysis, protein stability (PISA), and integrative analysis, Dr. Maity and colleagues have identified pathways targeted by five compounds initially discovered as inhibitors against H. pylori flavodoxin. This study underscores the necessity of a global approach to comprehensively understanding the mechanisms of drug action. The experiments conducted in this paper are well-designed and the obtained results support the authors' conclusions.

      Weaknesses:

      This manuscript describes several interesting findings. A few points listed below require further clarification:

      (1) Compound IVk exhibits markedly different behavior compared to the other compounds. The authors are encouraged to discuss these findings in the context of existing literature or chemical principles.

      (2) The incubation time for treating H. pylori with the drugs was set at 4 hours for transcriptomic and proteomic analyses, compared to 20 min for PISA analysis. The authors need to explain the reason for these differences in treatment duration.

      (3) The PISA method facilitates the identification of proteins stabilized by drug treatment. DnaJ and Trigger factor (tig), well-known molecular chaperones, prevent protein aggregation under stress. Their enrichment in the soluble fraction is expected and does not necessarily indicate direct stabilization by the drugs. The possibility that their stabilization results from binding to other proteins destabilized by the drugs should be considered. To prevent any misunderstanding, the authors should clarify that their methodology does not solely identify direct targets. Instead, the combination of their findings sheds light on various pathways affected by the treatment.

      (4) At the end of the manuscript, the authors conclude that four compounds "strongly interact with CagA". However, detailed molecule/protein interaction studies are necessary to definitively support this claim. The authors should exercise caution in their statement. As the authors mentioned, additional research (not mandated in the scope of this current paper) is necessary to determine the drug's binding affinity to the proposed targets.

      (5) The authors should clarify the PISA-Express approach over standard PISA. A detailed explanation of the differences between both methods in the main text is important.

    3. Reviewer #2 (Public Review):

      Summary:

      This work has an important and ambitious goal: understanding the effects of drugs, in this case antimicrobial molecules, from a holistic perspective. This means that the effect of drugs on a group of genes and whole metabolic pathways is unveiled, rather than its immediate effect on a protein target only. To achieve this goal the authors successfully implement the PISA-Express method (Protein Integral Solubility Alteration), using combined transcriptomics, proteomics, and drug-induced changes in protein stability to retrieve a large number of genes and proteins affected by the compounds used. The compounds used in the study (compounds IVa, IVb, IVj, and IVk) were all derived from the precursor compound IV, are effective against Helicobacter pylori, and their mode of action on clusters of genes and proteins has been compared to that of the known H. pylori drug metronidazole (MNZ). From this comparison, and as confirmed by the diversity of responses induced by these very similar compounds, it can be understood that the approach used is reliable and very informative. Notably, although all compound IV derivatives were designed to target H. pylori flavodoxin (Fld), only one showed a statistically significant shift of Fld solubility (compound IVj, FIG S11). For most other compounds, instead, the involvement of other possible targets affecting diverse metabolic pathways was also observed, notably concerning a series of genes with other important functions: CagA (virulence factor), FtsY/FtsA (cell division), AtpD (ATP-synthase complex), the essential GTPase ObgE, Tig (protein export), as well as other proteins involved in ribosomal synthesis, chemotaxis/motility and DNA replication/repair. Finally, for all tested molecules, in vivo functional data have been collected that parallel the omics predictions, supporting them and showing that compound IV derivatives differentially affect cellular generation of reactive oxygen species (ROS), oxygen consumption rates (OCR), DNA damage, and ATP synthesis.

      Strengths:

      The approach used is very powerful in retrieving the effects of chemically active molecules (in this case antimicrobial ones) on whole cells, revealing protein and gene networks that are involved in cell sensitivity to the studied molecules. The choice of these compounds against H. pylori is perfect, showcasing how different the real biological response is, compared to the hypothetical one. In fact, although all molecules were retrieved based on their activity on Fld, the authors unambiguously show that large unexpected gene clusters may be, and in fact are, affected by these compounds, and each of them in different manners.

      Impact:

      The present work is the first report relying on PISA-Express performed on living bacterial cells. Because of its findings, this work will certainly have a high impact on the way we design research to develop effective drugs, allowing us to understand the fine effects of a drug on gene clusters, drive molecule design towards specific metabolic pathways, and eventually better plan the combination of multiple active molecules for drug formulation. Beyond this, however, we expect this article to impact other related and unrelated fields of research as well. The same holistic approaches might also allow gaining deep, and sometimes unexpected, insight into the cellular targets involved in drug side effects, drug resistance, toxicity, and cellular adaptation, in fields beyond the medicinal one, such as cellular biology and environmental studies on pollutants.

    1. eLife assessment

      This important study reveals how Drosophila may be used to investigate the role of missense variants in the gene PLCG1 related to human disease in case studies. The evidence that most of these variants have a gain-of-function effect in the fly is convincing and supportive of their pathogenic effect. With some additional control experiments to assess overexpression toxicity, this work would be of relevance to human and Drosophila geneticists alike.

    2. Reviewer #1 (Public Review):

      Summary:

      This manuscript provides an initial characterization of three new missense variants of the PLCG1 gene associated with diverse disease phenotypes, utilizing a Drosophila model to investigate their molecular effects in vivo. Through the meticulous creation of genetic tools, the study assesses small wing (sl), the fly ortholog of PLCG1, across an array of phenotypes from longevity to behavior in both sl null mutants and variants. The findings indicate that the Drosophila PLCG1 ortholog displays aberrant functions. Notably, it is demonstrated that overexpression of both human and Drosophila PLCG1 variants in fly tissue leads to toxicity, underscoring their pathogenic potential in vivo.

      Strengths:

      The research effectively highlights the physiological significance of sl in Drosophila. In addition, the study establishes the in vivo toxicity of disease-associated variants of both human PLCG1 and Drosophila sl.

      Weaknesses:

      The study's limitations include the human PLCG1 transgene's inability to compensate for the Drosophila sl null mutant phenotype, suggesting potential functional divergence between the species. This discrepancy signals the need for additional exploration into the mechanistic nuances of PLCG1 variant pathogenesis, especially regarding their gain-of-function effects in vivo.

      Overall:

      The study offers compelling evidence for the pathogenicity of newly discovered disease-related PLCG1 variants, manifesting as toxicity in a Drosophila in vivo model, which substantiates the main claim by the authors. Nevertheless, a deeper inquiry into the specific in vivo mechanisms driving the toxicity caused by these variants in Drosophila could significantly enhance the study's impact.

    3. Reviewer #2 (Public Review):

      The manuscript by Ma et al. reports the identification of three unrelated people who are heterozygous for de novo missense variants in PLCG1, which encodes phospholipase C-gamma 1, a key signaling protein. These individuals present with partially overlapping phenotypes including hearing loss, ocular pathology, cardiac defects, abnormal brain imaging results, and immune defects. None of the patients present with all of the above phenotypes. PLCG1 has also been implicated as a possible driver for cell proliferation in cancer.

      The three missense variants found in the patients result in the following amino acid substitutions: His380Arg, Asp1019Gly, and Asp1165Gly. PLCG1 (and the closely related PLCG2) have a single Drosophila ortholog called small wing (sl). sl-null flies are viable but have small wings with ectopic wing veins and supernumerary photoreceptors in the eye. As all three amino acids affected in the patients are conserved in the fly protein, in this work Ma et al. tested whether they are pathogenic by expressing either reference or patient variant fly or human genes in Drosophila and determining the phenotypes produced by doing so.

      Expression in Drosophila of the variant forms of PLCG1 found in these three patients is toxic; highly so for Asp1019Gly and Asp1165Gly, much more modestly for His380Arg. Another variant, Asp1165His which was identified in lymphoma samples and shown by others to be hyperactive, was also found to be toxic in the Drosophila assays. However, a final variant, Ser1021Phe, identified by others in an individual with severe immune dysregulation, produced no phenotype upon expression in flies.

      Based on these results, the authors conclude that the PLCG1 variants found in patients are pathogenic, producing gain-of-function phenotypes through hyperactivity. In my view, the data supporting this conclusion are robust, despite the lack of a detectable phenotype with Ser1021Phe, and I have no concerns about the core experiments that comprise the paper.

      Figure 6, the last in the paper, provides information about PLCG1 structure and how the different variants would affect it. It shows that His380, Asp1019, and Asp1165 all lie within catalytic domains or intramolecular interfaces and that variants in the latter two affect residues essential for autoinhibition. It also shows that Ser1021 falls outside the key interface occupied by Asp1019, but more could have been said about the potential effects of Ser1021Phe.

      Overall, I believe the authors fully achieved the aims of their study. The work will have a substantial impact because it reports the identification of novel disease-linked genes, and because it further demonstrates the high value of the Drosophila model for finding and understanding gene-disease linkages.

    4. Reviewer #3 (Public Review):

      Summary:

      The paper attempts to model the functional significance of variants of PLCG2 in a set of patients with variable clinical manifestations.

      Strengths:

      A study attempting to use the Drosophila system to test the function of variants reported from human patients.

      Weaknesses:

      Additional experiments are needed to shore up the claims in the paper. These are listed below.

      Major Comments:

      (1) Does the pLI/missense constraint Z score prediction algorithm take into consideration whether the gene exhibits monoallelic or biallelic expression?

      (2) Figure 1B: Include human PLCG2 in the alignment that displays the species-wide conserved variant residues.

      (3) Figure 4A:
      Given that
      (i) sl is predicted to be the fly ortholog for both mammalian PLCγ isozymes: PLCG1 and PLCG2 [Line 62]
      (ii) they are shown to have non-redundant roles in mammals [Line 71] and
      (iii) reconstituting PLCG1 is highly toxic in flies, leading to increased lethality.
      This raises questions about whether sl mutant phenotypes are specifically caused by the absence of PLCG1 or PLCG2 functions in flies. Can hPLCG2 reconstitution in sl mutants be used as a negative control to rule out the possibility of the same?

      (4) Do slT2A/Y; UAS-PLCG1Reference flies survive when grown at 22°C? Since transgenic flies expressing PLCG1 cDNA under the ubiquitous gal4 drivers Tubulin and Da can yield viable progeny at 22°C, the survival of slT2A/Y; UAS-PLCG1Reference should be possible.
      Similarly, do slT2A flies exhibit the phenotypes of (i) reduced eclosion rate, (ii) reduced wing size and ectopic wing veins, and (iii) extra R7 photoreceptor in the fly eye at 22°C?
      If so, will it be possible to get a complete rescue of the slT2A mutant phenotypes with the hPLCG1 cDNA at 22°C? This dataset is essential to establish Drosophila as an ideal model to study the PLCG1 de novo variants.

      (5) Localisation and western blot assays to check if the introduction of the de novo mutations can have an impact on the sub-cellular targeting of the protein or protein stability respectively.

      (6) Analysing the nature of the reported gain of function (experimental proof for the same is missing in the manuscript) variants:
      Instead of directly showing the effect of introducing the de novo variant transgenes in the Drosophila model especially when the full-length PLCG1 is not able to completely rescue the slT2A phenotype;
      (i) Show that the gain-of-function variants can have an impact on the protein function or signalling via one of the three signalling outputs in the mammalian cell culture system: (i) inositol-1,4,5-trisphosphate production, (ii) intracellular Ca2+ release or (iii) increased phosphorylation of extracellular signal-related kinase, p65, and p38.
      OR
      (ii) Run a molecular simulation to demonstrate how the protein's auto-inhibited state can be disrupted and basal lipase activity increased by introducing D1019G and D1165G, which destabilise the association between the C2 and cSH2 domains. The H380R variant may also exhibit characteristics similar to the previously documented H335A mutation which leaves the protein catalytically inactive as the residue is important to coordinate the incoming water molecule required for PIP2 hydrolysis.

      (7) Clarify the reason for carrying out the wing-specific and eye-specific experiments using nub-gal4 and eyless-gal4 at 29°C despite the high gal4 toxicity at this temperature.

      (8) For the sake of completeness the authors should also report other variants identified in the genomes of these patients that could also contribute to the clinical features.

    1. eLife assessment

      This useful manuscript challenges the utility of current paradigms for estimating brain-age with magnetic resonance imaging measures, but presents inadequate evidence to support the suggestion that an alternative approach focused on predicting cognition is better. The paper would benefit from a clearer explication of the methods and a more critical evaluation of the conceptual basis of the different models. This work will be of interest to researchers working on brain-age and related models.

    2. Reviewer #1 (Public Review):

      In this paper, the authors evaluate the utility of brain age derived metrics for predicting cognitive decline by performing a 'commonality' analysis in a downstream regression that enables the different contribution of different predictors to be assessed. The main conclusion is that brain age derived metrics do not explain much additional variation in cognition over and above what is already explained by age. The authors propose to use a regression model trained to predict cognition ('brain cognition') as an alternative suited to applications of cognitive decline. While this is less accurate overall than brain age, it explains more unique variance in the downstream regression.

      REVISED VERSION: while the authors have partially addressed my concerns, I do not feel they have addressed them all. I do not feel they have addressed the weight instability and concerns about the stacked regression models satisfactorily. I also must say that I agree with Reviewer 3 about the conceptual limitations of the brain age and brain cognition methods. In particular, the regression model used to predict fluid cognition will by construction explain more variance in cognition than a brain age model that is trained to predict age. This suffers from the same problem the authors raise with brain age, and the advantage would indeed disappear if the authors had a separate measure of cognition against which to validate and were then to regress this out as they do for age correction. I am aware that these conceptual problems are more widespread than this paper alone (in fact throughout the brain age literature), so I do not believe the authors should be penalised for that. However, I do think they can make these concerns more explicit and further tone down the comments they make about the utility of brain cognition. I have indicated the main considerations about these points in the recommendations section below.

      In this paper, the authors evaluate the utility of brain age derived metrics for predicting cognitive decline by performing a 'commonality' analysis in a downstream regression that enables the different contribution of different predictors to be assessed. The main conclusion is that brain age derived metrics do not explain much additional variation in cognition over and above what is already explained by age. The authors propose to use a regression model trained to predict cognition ('brain cognition') as an alternative that explains more unique variance in the downstream regression.

      This is a reasonably good paper and the use of a commonality analysis is a nice contribution to understanding variance partitioning across different covariates. I have some comments that I believe the authors ought to address, which mostly relate to clarity and interpretation.
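
      For readers unfamiliar with the technique, the variance partitioning behind a two-predictor commonality analysis can be sketched as follows. This is only a minimal illustration under stated assumptions: the data are synthetic, the variable names (`age`, `brain_age`, `cognition`) and the helper functions are hypothetical, and the authors' actual pipeline (which involves more predictors and cross-validated models) is not reproduced here.

```python
import numpy as np


def r_squared(predictors, target):
    """In-sample R^2 of an ordinary least-squares fit with an intercept."""
    design = np.column_stack([np.ones(len(target)), predictors])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    residuals = target - design @ coef
    centered = target - target.mean()
    return 1.0 - (residuals @ residuals) / (centered @ centered)


def commonality_two_predictors(a, b, target):
    """Split the R^2 of the joint model into unique and common parts."""
    r2_a = r_squared(a, target)
    r2_b = r_squared(b, target)
    r2_ab = r_squared(np.column_stack([a, b]), target)
    return {
        "unique_a": r2_ab - r2_b,      # variance only predictor a explains
        "unique_b": r2_ab - r2_a,      # variance only predictor b explains
        "common": r2_a + r2_b - r2_ab,  # variance shared by a and b
        "total": r2_ab,
    }


# Synthetic illustration (assumed data, not the study's): the brain-age
# index is mostly a noisy copy of age, so it adds little unique variance.
rng = np.random.default_rng(0)
n = 500
age = rng.normal(size=n)
brain_age = age + 0.5 * rng.normal(size=n)
cognition = -0.6 * age + rng.normal(size=n)

parts = commonality_two_predictors(age, brain_age, cognition)
for name, value in parts.items():
    print(f"{name}: {value:.3f}")
```

      By construction the three components sum to the R² of the joint model, which is why a predictor that is highly collinear with age can have a sizeable common component and still a unique component close to zero.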

      First, from a conceptual point of view, the authors focus exclusively on cognition as a downstream outcome. I would suggest the authors nuance their discussion to provide broader considerations of the utility of their method and on the limits of interpretation of brain age models more generally.

      Second, from a methods perspective, there is not a sufficient explanation of the methodological procedures in the current manuscript to fully understand how the stacked regression models were constructed. I would request that the authors provide more information to enable the reader to better understand the stacked regression models used and to ensure that these models are not overfit.

      Please also provide an indication of the different regression strengths that were estimated across the different models and cross-validation splits. Also, how stable were the weights across splits?

      Please provide more details about the task designs and MRI processing procedures that were employed on this sample, in addition to the regression methods and bias correction methods used. For example, there are several different parameterisations of the elastic net; please provide equations to describe the method used here so that readers can easily determine how the regularisation parameters should be interpreted.
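
      To illustrate why the parameterisation matters, two commonly used forms of the elastic net objective are sketched below. Which form (if either) the authors used is not stated in the review, so the symbols here ($\lambda_1$, $\lambda_2$, $\lambda$, $\alpha$, $n$) are assumptions for illustration only; the intercept is omitted for brevity.

```latex
% Form 1: two separate penalty weights
\hat{\beta} = \arg\min_{\beta}\;
  \lVert y - X\beta \rVert_2^2
  + \lambda_1 \lVert \beta \rVert_1
  + \lambda_2 \lVert \beta \rVert_2^2

% Form 2: overall strength \lambda and mixing parameter \alpha \in [0, 1]
\hat{\beta} = \arg\min_{\beta}\;
  \frac{1}{2n} \lVert y - X\beta \rVert_2^2
  + \lambda \left( \alpha \lVert \beta \rVert_1
  + \frac{1 - \alpha}{2} \lVert \beta \rVert_2^2 \right)

% Multiplying Form 2 by 2n shows the two agree when
\lambda_1 = 2 n \lambda \alpha, \qquad
\lambda_2 = n \lambda (1 - \alpha)
```

      Because the same numerical value of a "regularisation strength" means different things under the two forms (and depends on the sample size $n$ in the second), reported hyperparameters are hard to interpret unless the objective is written out.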

    1. Author response:

      Public Reviews:

      Reviewer #1:

      Summary:

      Casas-Tinto et al. present convincing data that injury of the adult Drosophila CNS triggers transdifferentiation of glial cells and even the generation of neurons from glial cells. This observation opens up the possibility of getting a handle on the molecular basis of neuronal and glial generation in the vertebrate CNS after traumatic injury caused by stroke or crush injury. The authors use an array of sophisticated tools to follow the development of glial cells at the injury site in very young and mature adults. The results in mature adults reveal a remarkable plasticity in the fly CNS and dispel the notion that repair after injury may be only possible in nerve cords which are still developing. The observation of so-called VC cells which do not express the glial marker repo could point to the generation of neurons by former glial cells.

      Conclusion:

      The authors present an interesting story that is technically sound and could form the basis for an in-depth analysis of the molecular mechanism driving repair after brain injury in Drosophila and vertebrates.

      Strengths:

      The evidence for transdifferentiation of glial cells is convincing. In addition, the injury to the adult CNS shows an inherent plasticity of the mature ventral nerve cord which is unexpected.

      Weaknesses:

      Traumatic brain injury in Drosophila has been previously reported to trigger mitosis of glial cells and generation of neural stem cells in the larval CNS and the adult brain hemispheres. Therefore this report adds to but does not significantly change our current understanding. The origin and identity of VC cells is unclear.

      The Reviewer correctly points out that traumatic brain injury has been reported to trigger the generation of neural stem cells. However, according to previous reports, those cells were quiescent Dpn+ neuroblasts. We now report that already differentiated adult neuropil glia transdifferentiate into neurons, which is a new mechanism not previously reported.

      We agree with the reviewer that the identity of VC neurons is unclear. According to the results of the G-TRACE experiments, however, their origin is clear: they originate from neuropil glia (i.e. astrocyte-like glia and ensheathing glia). We will use a battery of previously reported antibodies that identify specific neuronal subtypes to characterize these newly generated neurons.

      Reviewer #2:

      Summary:

      Casas-Tinto et al. provide new insight into glial plasticity using a crush injury paradigm in the ventral nerve cord (VNC) of adult Drosophila. The authors find that both astrocyte-like glia (ALG) and ensheathing glia (EG) divide under homeostatic conditions in the adult VNC and identify ALG as the glial population that specifically ramps up proliferation in response to injury, whereas the number of EGs decreases following the insult. Using lineage-tracing tools, the authors interestingly observe the interconversion of glial subtypes, especially of EGs into ALGs, which occurs independent of injury and is dependent on the availability of the transcription factor Prospero in EGs, adding to the plasticity observed in the system. Finally, when tracing the progeny of differentiated glia, Casas-Tinto and colleagues detect cells of neuronal identity and provide evidence that such glia-derived neurogenesis is specifically favored following ventral nerve cord injury, which puts forward a remarkable way in which glia can respond to neuronal damage.

      Numerous experiments have been carried out in 7-day-old flies, showing that the observed plasticity is not due to residual developmental remodeling or a still immature VNC.

      By elegantly combining different genetic tools, the authors show glial divisions with mitotic-dependent tracing and find that the number of generated glia is refined by apoptosis later on.

      The work identifies Prospero in glia as an important coordinator of glial cell fate, from development to the adult context, which draws further attention to the upstream regulatory mechanisms.

      We express our gratitude to the reviewer for their keen appreciation of our efforts and their enthusiasm for the outcomes of this research.

      Weaknesses:

      Although the authors do use a variety of methods to show glial proliferation, the EdU data (Figure 1B) could be more informative by displaying images of non-injured animals and providing quantifications or the mention of these numbers based on results previously acquired in the system.

      We appreciate the Reviewer’s comment. We believed that adding images of non-injured animals would not add new information, as we already quantified the increase of glial proliferation upon injury in Losada-Perez et al. 2021. Besides, the purpose of this experiment was to determine whether the dividing cells were astrocyte-like glia, rather than to count the dividing cells. Comparing independent experiments can be tricky, but if we compare the quantifications of G2-M glia (repo>fly-Fucci) in Losada-Perez et al. 2021 (fig 1C) with the quantifications of G2-M neuropil glia in this work (fig 1C), we can see that the numbers are comparable.

      The experiments relying on the FUCCI cell cycle reporter suggested considerable baseline proliferation for EGs and ALGs, but when using an independent method (Twin Spot MARCM), mitotic marking was only detected for ALGs. This discrepancy could be addressed by assessing the co-localization of the different glia subsets using the identified driver lines with mitotic markers such as PH3.

      In our understanding, this discrepancy could be explained by the magnitude of proliferation. The lower proliferation rate of EG (as indicated by the fly-Fucci experiments), combined with the incomplete efficiency of MARCM clone induction, considerably reduces the chances of finding EG MARCM clones. PH3 is a mitotic marker, but it is also found in apoptotic cells (Kim and Park 2012. DOI: 10.1371/journal.pone.0044307); nevertheless, we can do the suggested experiment and quantify the results.

      The data in Figure 1C would be more convincing in combination with images of the FUCCI Reporter as it can provide further information on the location and proportion of glia that enter the cell cycle versus the fraction that remains quiescent.

      We will add the suggested images.

      The analyses of inter-glia conversion in Figure 3 are complicated by the fact that Prospero RNAi is used to suppress EG-to-ALG conversion while Prospero also serves as a marker to establish ALG nature. Clarification of whether the GFP+ cells still expressed Pros or were classified as NP-like GFP cells is required here.

      As described in the text, Pros is a marker for ALG, and the results suggest that Prospero expression is required for the EG-to-ALG transition. We will clarify these concepts in the text accordingly. In Figure 3 we show images of NP-like cells originating from EG that are Prospero+, which supports the transdifferentiation from EG to ALG.

      The conclusion that ALG and EG glial cells can give rise to cells of neuronal lineage is based on glial lineage information (GFP+ cells from glial G-trace) and staining for the neuronal marker Elav. The use of other neuronal markers apart from Elav or morphological features would provide a more compelling case that GFP+ cells are mature neurons.

      We completely agree with the reviewer's observation regarding the identity of VC neurons. We will try to determine the identity of these cells using previously described antibodies that label neuronal populations. We would also appreciate any suggestions regarding the antibodies we could use.

      Although the text discusses the contexts in which glial plasticity is observed or increased upon injury, the figures are less clear regarding this aspect. A more systematic comparison of injured VNCs versus homeostatic conditions, combined with clear labelling of the injury area, would facilitate the understanding of the panels.

      We appreciate the Reviewer’s observation. We will carefully check all figures in order to increase their clarity.

      Context/Discussion

      The study finds that glia in the ventral cord of flies have latent neurogenic potential. Such observations have not been made regarding glia in the fly brain, where injury is reported to drive glial divisions or the proliferation of undifferentiated progenitor cells with neurogenic potential.

      Discussing this different strategy for cell replacement adopted by glia in the VNC and pointing out differences to other modes seems fascinating. Highlighting differences in the reactiveness of glia in the VNC compared to the brain also seems highly relevant as they may point to different properties to repair damage.

      Based on the assays employed, the study points to a significant amount of glial "identity" changes or interconversions, which is surprising under homeostatic conditions. The significance of this "baseline" plasticity remains undiscussed, although glia unarguably show extensive adaptations during nervous system development.

      It would be interesting to know if the "interconversion" of glia is determined by the needs in the tissue or would shift in the context of selective ablation/suppression of a glial type.

      We deeply appreciate the Reviewer’s enthusiasm for this subject; it is indeed fascinating. We kept the discussion brief in order to fit the eLife Short Report requirements, but the specific conditions that trigger glial interconversion are of great interest to us. Compromising EG or ALG viability and evaluating the behaviour of the remaining glial cells is of great interest for developmental biology and regeneration, but the precise scenario in which to carry out these experiments is not well defined. In this report, we aim to reproduce an injury in Drosophila brain, and this model should serve to analyze cellular behaviours. The scenario in which we deplete one specific subpopulation of glial cells is conceptually attractive, but far beyond the scope of this report.

      Reviewer #3:

      In this manuscript, Casas-Tintó et al. explore the role of glial cells in the response to a neurodegenerative injury in the adult brain. They used Drosophila melanogaster as a model organism and found that glial cells are able to generate new neurons through the mechanism of transdifferentiation in response to injury.

      This paper provides a new mechanism in regeneration and gives an understanding of the role of glial cells in the process.

    2. eLife assessment

      In this work, the authors use a Drosophila adult ventral nerve cord injury model extending and confirming previous observations; this important study reveals key aspects of adult neural plasticity. Taking advantage of several genetic reporter and fate tracing tools, the authors provide solid evidence for different forms of glial plasticity, that are increased upon injury. The data on detected plasticity under physiologic conditions and especially the extent of cell divisions and cell fate changes upon injury would benefit from validation by additional markers. The experimental part would improve if strengthened and accompanied by a more comprehensive integration of results regarding glial reactivity in the adult CNS.

    3. Reviewer #1 (Public Review):

      Summary:

      Casas-Tinto et al. present convincing data that injury of the adult Drosophila CNS triggers transdifferentiation of glial cells and even the generation of neurons from glial cells. This observation opens up the possibility of getting a handle on the molecular basis of neuronal and glial generation in the vertebrate CNS after traumatic injury caused by stroke or crush injury. The authors use an array of sophisticated tools to follow the development of glial cells at the injury site in very young and mature adults. The results in mature adults reveal a remarkable plasticity in the fly CNS and dispel the notion that repair after injury may be only possible in nerve cords which are still developing. The observation of so-called VC cells, which do not express the glial marker repo, could point to the generation of neurons by former glial cells.

      Conclusion:

      The authors present an interesting story that is technically sound and could form the basis for an in-depth analysis of the molecular mechanism driving repair after brain injury in Drosophila and vertebrates.

      Strengths:

      The evidence for transdifferentiation of glial cells is convincing. In addition, the injury to the adult CNS shows an inherent plasticity of the mature ventral nerve cord which is unexpected.

      Weaknesses:

      Traumatic brain injury in Drosophila has been previously reported to trigger mitosis of glial cells and generation of neural stem cells in the larval CNS and the adult brain hemispheres. Therefore this report adds to but does not significantly change our current understanding. The origin and identity of VC cells is unclear.

    4. Reviewer #2 (Public Review):

      Summary:

      Casas-Tinto et al. provide new insight into glial plasticity using a crush injury paradigm in the ventral nerve cord (VNC) of adult Drosophila. The authors find that both astrocyte-like glia (ALG) and ensheathing glia (EG) divide under homeostatic conditions in the adult VNC and identify ALG as the glial population that specifically ramps up proliferation in response to injury, whereas the number of EGs decreases following the insult. Using lineage-tracing tools, the authors interestingly observe the interconversion of glial subtypes, especially of EGs into ALGs, which occurs independent of injury and is dependent on the availability of the transcription factor Prospero in EGs, adding to the plasticity observed in the system. Finally, when tracing the progeny of differentiated glia, Casas-Tinto and colleagues detect cells of neuronal identity and provide evidence that such glia-derived neurogenesis is specifically favored following ventral nerve cord injury, which puts forward a remarkable way in which glia can respond to neuronal damage.

      Strengths:

      This study highlights a new facet of adult nervous system plasticity at the level of the ventral nerve cord, supporting the view that proliferative capacity is maintained in the mature CNS and stimulated upon injury.

      The injury paradigm is well chosen, as the organization of the neuromeres allows specific targeting of one segment, compared to the remaining intact, and with the potential to later link observed plasticity to behavior such as locomotion.

      Numerous experiments have been carried out in 7-day-old flies, showing that the observed plasticity is not due to residual developmental remodeling or a still immature VNC.

      By elegantly combining different genetic tools, the authors show glial divisions with mitotic-dependent tracing and find that the number of generated glia is refined by apoptosis later on.

      The work identifies Prospero in glia as an important coordinator of glial cell fate, from development to the adult context, which draws further attention to the upstream regulatory mechanisms.

      Weaknesses:

      Although the authors do use a variety of methods to show glial proliferation, the EdU data (Figure 1B) could be more informative by displaying images of non-injured animals and providing quantifications or the mention of these numbers based on results previously acquired in the system.

      The experiments relying on the FUCCI cell cycle reporter suggested considerable baseline proliferation for EGs and ALGs, but when using an independent method (Twin Spot MARCM), mitotic marking was only detected for ALGs. This discrepancy could be addressed by assessing the co-localization of the different glia subsets using the identified driver lines with mitotic markers such as PH3.

      The data in Figure 1C would be more convincing in combination with images of the FUCCI Reporter as it can provide further information on the location and proportion of glia that enter the cell cycle versus the fraction that remains quiescent.

      The analyses of inter-glia conversion in Figure 3 are complicated by the fact that Prospero RNAi is used to suppress EG-to-ALG conversion while Prospero also serves as a marker to establish ALG nature. Clarification of whether the GFP+ cells still expressed Pros or were classified as NP-like GFP cells is required here.

      The conclusion that ALG and EG glial cells can give rise to cells of neuronal lineage is based on glial lineage information (GFP+ cells from glial G-trace) and staining for the neuronal marker Elav. The use of other neuronal markers apart from Elav or morphological features would provide a more compelling case that GFP+ cells are mature neurons.

      Although the text discusses the contexts in which glial plasticity is observed or increased upon injury, the figures are less clear regarding this aspect. A more systematic comparison of injured VNCs versus homeostatic conditions, combined with clear labelling of the injury area, would facilitate the understanding of the panels.

      Context/Discussion

      The study finds that glia in the ventral cord of flies have latent neurogenic potential. Such observations have not been made regarding glia in the fly brain, where injury is reported to drive glial divisions or the proliferation of undifferentiated progenitor cells with neurogenic potential.

      Discussing this different strategy for cell replacement adopted by glia in the VNC and pointing out differences to other modes seems fascinating. Highlighting differences in the reactiveness of glia in the VNC compared to the brain also seems highly relevant as they may point to different properties to repair damage.

      Based on the assays employed, the study points to a significant amount of glial "identity" changes or interconversions, which is surprising under homeostatic conditions. The significance of this "baseline" plasticity remains undiscussed, although glia unarguably show extensive adaptations during nervous system development.

      It would be interesting to know if the "interconversion" of glia is determined by the needs in the tissue or would shift in the context of selective ablation/suppression of a glial type.

    5. Reviewer #3 (Public Review):

      In this manuscript, Casas-Tintó et al. explore the role of glial cells in the response to a neurodegenerative injury in the adult brain. They used Drosophila melanogaster as a model organism and found that glial cells are able to generate new neurons through the mechanism of transdifferentiation in response to injury.

      This paper provides a new mechanism in regeneration and gives an understanding of the role of glial cells in the process.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Public Review):

      In this manuscript, Huang and colleagues explored the role of iron in bacterial therapy for cancer. Using proteomics, they revealed the upregulation of bacterial genes that take up iron, and reasoned that such regulation is an adaptation to the iron-deficient tumor microenvironment. Logically, they engineered E. coli strains with enhanced iron-uptake efficiency, and showed that these strains, together with iron scavengers, suppress tumor growth in a mouse model. Lastly, they reported that tumor suppression by IroA-E. coli provides immunological memory via CD8+ T cells. In general, I find the findings in the manuscript novel and the evidence convincing.

      (1) Although the genetic and proteomic data are convincing, would it be possible to directly quantify the iron concentration in (1) E. coli in different growth environments, and (2) the tumor microenvironment? This will provide the functional consequences of upregulating genes that import iron into the bacteria.

      We appreciate the reviewer’s comment regarding the precise quantification of iron concentrations. In our study, we attempted various experimental approaches, including immunohistochemistry with a Fe3+ probe, an iron assay kit (ab83366), and Inductively Coupled Plasma Mass Spectrometry (ICP-MS). Despite these attempts, quantifying oxidized Fe3+ concentrations proved challenging because of the inherently low levels of Fe ions and the difficulty of distinguishing Fe2+ from Fe3+. The measurements fell below the detection threshold of even the sensitive ICP-MS technique. To circumvent this limitation, we designed an experiment in which bacteria were cultured in a medium supplemented with Chrome Azurol S (CAS) reagent, which colorimetrically detects siderophore activity. We compared WT bacteria and IroA-expressing bacteria at varying levels of Lcn2 proteins. The outcome, as depicted in the updated Fig. 3b, reveals an enhanced iron acquisition capability in IroA-E. coli in the presence of Lcn2 proteins, in comparison to the wild-type E. coli strains. In addition to the Lcn2 study, the proteomic study in Figure 4 highlights the competitive landscape between cancer cells and bacteria. We observed that IroA-E. coli showed reduced stress responses and exerted elevated iron-associated stress on cancer cells, further supporting the iron-scavenging capability of IroA-E. coli against nutritional immunity.

      (2) Related to 1, the experiment to study the synergistic effect of CDG and VLX600 (lines 139-175) is very nice and promising, but one flaw here is a lack of the measurement of iron concentration. Therefore, a possible explanation could be that CDG acts in another manner, unrelated to iron uptake, that synergizes with VLX600's function to deplete iron from cancer cells. Here, a direct measurement of iron concentration will show the effect of CDG on iron uptake, thus complementing the missing link.

      We appreciate the reviewer’s comment and would like to point the reviewer to our results in Figure S3, which shows that the expression of CDG enhances bacteria survival in the presence of LCN2 proteins, which reflects the competitive relationship between CDG and enterobactin for LCN2 proteins as previously shown by Li et al. [Nat Commun 6:8330, 2015]. We regret to inform the reviewer that direct measurement of iron concentration was attempted to no avail due to the limited sensitivity of iron detecting assays. We do acknowledge that CDG may exert different effects in addition to enhancing iron uptake, particularly the potentiation of the STING pathway. We pointed out such effect in Fig 2c that shows enhanced macrophage stimulation by the CDG-expressing bacteria. We would like to accentuate, however, that a primary objective of the experiment is to show that the manipulation of nutritional immunity for promoting anticancer bacterial therapy can be achieved by combining bacteria with iron chelator VLX600. The multifaceted effects of CDG prompted us to focus on IroA-E. coli in subsequent experiments to examine the role of nutritional immunity on bacterial therapy. We have updated the associated text to better convey our experimental design principle.

      Lines 250-268: Although the results are statistically significant, I would recommend that the authors characterize the CD8+ T cells a little more, as the mechanism now seems quite elusive. What signals or memories do CD8+ T cells acquire after IroA-E. coli treatment to confer their long-term immunogenicity?

      We apologize for the overinterpretation of the immune memory response in our previous manuscript and appreciate the reviewer’s recommendation to further characterize CD8+ T cells post-IroA-E. coli treatment. Our findings, which show robust tumor inhibition in rechallenge studies, indicate establishment of anticancer adaptive immune responses. As the scope of the present work is aimed at demonstrating the value of engineered bacteria for overcoming nutritional immunity, expounding on the memory phenotypes of the resulting cellular immunity is beyond the scope of the study. We do acknowledge that our initial writing overextended our claims and have revised the manuscript accordingly. The revised manuscript highlights induction of anticancer adaptive immunity, attributable to CD8+ T cells, following the bacterial therapy.

      (3) Perhaps this goes beyond the scope of the current manuscript, but how broadly applicable is the observed iron-transport phenomenon in other tumor models? I would recommend the authors to either experimentally test it in another model or at least discuss this question.

      We highly appreciate the reviewer’s suggestion regarding the generalizability of the iron-transport phenomenon in diverse tumor models. To address this, we extended our investigations beyond the initial model, employing B16-F10 melanoma and E0771 breast cancer in mouse subcutaneous models. The results, as depicted in Figures 3g to 3j and Figure S5, demonstrate the superiority of IroA-E. coli over WT bacteria in tumor inhibition. These findings support the broad implication of nutritional immunity as well as the potential of iron-scavenging bacteria for different solid tumor treatments.

      Reviewer #2 (Public Review):

      Summary:

      The authors provide strong evidence that bacteria, such as E. coli, compete with tumor cells for iron resources and consequently reduce tumor growth. When sequestration between LCN2 and enterobactin is blocked by upregulating CDG (DGC-E. coli) or salmochelin (IroA-E. coli), E. coli increase iron uptake from the tumor microenvironment (TME) and restrict iron availability for tumor cells. Long-term remission in IroA-E. coli-treated mice is associated with enhanced CD8+ T cell activity. Additionally, systemic delivery of IroA-E. coli shows a synergistic effect with the chemotherapy reagent oxaliplatin to reduce tumor growth.

      Strengths:

      It is important to identify the iron-related crosstalk between E. coli and the TME. Blocking LCN2-enterobactin sequestration by different strategies consistently reduces tumor growth.

      Weaknesses:

      As engineered E. coli upregulate their capacity to take up iron, they may increase the likelihood of escaping from nutritional immunity (LCN2 can no longer sequester iron from the bacteria). Would this raise the chance of developing sepsis? Do the authors think that it is safe to administer these engineered bacteria in mice or humans?

      We appreciate the reviewer’s comment on the safety evaluation of the iron-scavenging bacteria. To address the concern, we assessed the potential risk of sepsis development by measuring the bacterial burden and performing whole blood cell analyses following intravenous injection of the engineered bacteria. As illustrated in Figures 3k and 3l, our findings indicate that the administration of these engineered bacteria does not elevate the risk of sepsis. The blood cell analysis suggests that mice treated with the bacteria eventually return to baseline levels comparable to untreated mice, supporting the safety of this approach in our experimental models.

      Reviewer #3 (Public Review):

      Summary:

      Based on their observation that tumor has an iron-deficient microenvironment, and the assumption that nutritional immunity is important in bacteria-mediated tumor modulation, the authors postulate that manipulation of iron homeostasis can affect tumor growth. They show that iron chelation and engineered DGC-E. coli have synergistic effects on tumor growth suppression. Using engineered IroA-E. coli that presumably have more resistance to LCN2, they show improved tumor suppression and survival rate. They also conclude that the IroA-E. coli treated mice develop immunological memory, as they are resistant to repeat tumor injections, and these effects are mediated by CD8+ T cells. Finally, they show synergistic effects of IroA-E. coli and oxaliplatin in tumor suppression, which may have important clinical implications.

      Strengths:

      This paper uses straightforward in vitro and in vivo techniques to examine a specific and important question of nutritional immunity in bacteria-mediated tumor therapy. They are successful in showing that manipulation of iron regulation during nutritional immunity does affect the virulence of the bacteria, and in turn the tumor. These findings open future avenues of investigation, including the use of different bacteria, different delivery systems for therapeutics, and different tumor types.

      Weaknesses:

      • There is no discussion of the cancer type and why this cancer type was chosen. Colon cancer is not one of the more prominently studied cancer types for LCN2 activity. While this is a proof-of-concept paper, there should be some recognition of the potential different effects on different tumor types. For example, this model is dependent on significant LCN production, and different tumors have variable levels of LCN expression. Would the response of the tumor depend on the role of iron in that cancer type? For example, breast cancer aggressiveness has been shown to be influenced by FPN levels and labile iron pools.

      We highly appreciate the reviewer’s insightful comment on the varying LCN2 activities across different tumor types. In light of the reviewer’s suggestion, we extended our investigations beyond the initial colon cancer model, employing B16-F10 melanoma and E0771 breast cancer in mouse subcutaneous models. The results, as depicted in Figures 3g to 3j and Figure S5, demonstrate that IroA-E. coli consistently outperforms WT bacteria in tumor inhibition. We acknowledge the reviewer’s comment regarding LCN2 being more prominently examined in breast cancer and have highlighted this aspect in the revised manuscript. For colon and melanoma cancers, several reports have pointed out the correlation of LCN2 expression and the aggressiveness of these cancers [Int J Cancer. 2021 Oct 1;149(7):1495-1511][Nat Cancer. 2023 Mar;4(3):401-418], albeit to a lesser extent. These findings support the broad implication of nutritional immunity as well as the potential of iron-scavenging bacteria for different solid tumor treatments. The manuscript has been revised to reflect the reviewer’s insightful comment.

      • Are the effects on tumor suppression assumed to be from E. coli virulence, i.e. Does the higher number of bacteria result in increased immune-mediated tumor suppression? Or are the effects partially from iron status in the tumor cells and the TME?

      We appreciate the reviewer’s question regarding the therapeutic mechanism of IroA-E. coli. Bacterial therapy exerts its anticancer action through several different mechanisms, including bacterial virulence, nutrient and ecological competition, and immune stimulation. Decoupling one mechanism from another would be technically challenging and beyond the scope of the present work. With the objective of demonstrating that iron-scavenging bacteria can elevate anticancer activity by circumventing nutritional immunity, we highlight our data in Fig. S6, which shows that IroA-E. coli administration resulted in higher bacterial colonization within solid tumors compared to WT-E. coli on Day 15. This increased bacterial presence supports our iron-scavenging bacteria design, and we highlight a few anticancer mechanisms mediated by the engineered bacteria. Firstly, as shown in Fig. 4d, IroA-E. coli induces an elevated iron stress response in tumor cells, as the treated tumor cells show increased expression of transferrin receptors. Secondly, our experiments involving CD8+ T cell depletion indicate that IroA-E. coli establishes a more robust anticancer CD8+ T cell response than WT bacteria. Both immune-mediated responses and alterations in iron status within the tumor microenvironment are demonstrated to contribute to the enhanced anticancer activity of IroA-E. coli in the present study.

      • If the effects are iron-related, could the authors provide some quantification of iron status in tumor cells and/or the TME? Could the proteomic data be queried for this data?

      We appreciate the reviewer’s query regarding the quantification of iron concentrations. In our study, we attempted various experimental approaches, including immunohistochemistry with a Fe3+ probe, an iron assay kit (ab83366), and Inductively Coupled Plasma Mass Spectrometry (ICP-MS). Despite these attempts, quantifying oxidized Fe3+ concentrations proved challenging because of the inherently low levels of Fe ions and the difficulty of distinguishing Fe2+ from Fe3+. The measurements fell below the detection threshold of even the sensitive ICP-MS technique. Consequently, to circumvent this limitation, we designed an experiment in which bacteria were cultured in a medium supplemented with Chrome Azurol S (CAS) reagent, which colorimetrically detects siderophore activity. We compared WT bacteria and IroA-expressing bacteria at varying levels of Lcn2 proteins. The outcome, as depicted in the updated Fig. 3b, reveals an enhanced iron acquisition capability in IroA-E. coli in the presence of Lcn2 proteins, in comparison to the wild-type E. coli strains. In addition to the Lcn2 study, the proteomic study in Figure 4 highlights the competitive landscape between cancer cells and bacteria. We observed that IroA-E. coli showed reduced stress responses and exerted elevated iron-associated stress on cancer cells, further supporting the iron-scavenging capability of IroA-E. coli against nutritional immunity.

      Reviewing Editor:

      The authors provide compelling technically sound evidence that bacteria, such as E. coli, can be engineered to sequester iron to potentially compete with tumor cells for iron resources and consequently reduce tumor growth. Long-term remission in IroA-E.coli treated mice is associated with enhanced CD8+ T cell activity and a synergistic effect with chemotherapy reagent oxaliplatin is observed to reduce tumor growth. The following additional assessments are needed to fully evaluate the current work for completeness; please see individual reviews for further details.

      We appreciate the editor’s positive comment.

      (1) The premise is one of translation yet the authors have not demonstrated that manipulating bacteria to sequester iron does not provide a potential for sepsis or other evidence that this does not increase the competitiveness of bacteria relative to the host. Only tumor volume was provided rather than animal survival and cause of death, but bacterial virulence is enhanced including the possibility of septic demise. Alternatively, postulated by the authors, that tumor volume is decreased due to iron sequestration but they do not directly quantify the iron concentration in (1) E. Coli in different growth environments, and (2) tumor microenvironment. These important endpoints will provide the functional consequences of upregulating genes that import iron into the bacteria.

      We appreciate the editor’s comment and have added substantial data to support the translational potential of the iron-scavenging bacteria. In particular, we added evidence that the iron-scavenging bacteria do not increase the risk of sepsis (Fig. 3k, l), evidence of increased bacterial competitiveness and survival in tumor (Fig. S6), and evidence of the iron-scavenging bacteria’s superior anticancer ability and survival benefit across 3 different tumor models (Fig. 3e-j; Fig. S5). While direct measurement of iron concentration in the tumor environment is technically difficult due to the challenge of differentiating Fe2+ and Fe3+ by available techniques, we added a colorimetric CAS assay to demonstrate that the iron-scavenging bacteria can utilize Fe more effectively than WT bacteria in the presence of LCN2 (Fig. 3b). These results substantiate the translational relevance of the engineered bacteria.

      (2) There is no discussion of the cancer type and why this cancer type was chosen. If the current tumor modulation system is dependent on LCN2 activity, there would need to be some recognition that different tumors have variable levels of LCN expression. Would the response of the tumor depend on the role of iron in that cancer type?

      We appreciate the comment and added relevant text and citations describing clinical relevance of LCN2 expression associated with the tumor types used in the study (breast cancer, melanoma, and colon cancer). Elevated LCN2 has been associated with higher aggressiveness for all three cancer types.

      (3) To demonstrate long-term anti-cancer memory was established through enhancement of CD8+ T cell activity (Fig 5c), the "2nd seeding tumor cells" experiment may need to be done in CD8 antibody-treated IronA mice since CD8+ T cells may play a role in tumor suppression regardless of whether or not iron regulation is being manipulated. It appears that the control group for this experiment is naive mice (and not WT-E. coli treated mice), in which case the immunologic memory could be from having had tumor/E. coli rather than the effect of IroA-E. coli.

      We acknowledge that our prior writing may have overstated our claim on immunological memory. Our intention is to show that upon treatment and tumor eradication by iron-scavenging bacteria, adaptive immunity mediated by CD8 T cells can be elicited. We also did not consider a WT-E. coli control as no WT-E. coli treated group achieved complete tumor regression. We have modified our text to reflect our intended message.

      Reviewer #1 (Recommendations For The Authors):

      All the figures seem to be in low resolution and pixelated. Please upload high-resolution ones.

      We have updated figures to high-resolution ones.

      Reviewer #2 (Recommendations For The Authors):

      Some specific comments towards experiments:

      (1) For Fig 2 f/ Fig 3f/ Fig 5d/Fig6c, the survival rate is based on the tumor volume (the mouse was considered dead when the tumor volume exceeded 1,500 mm3). Did the mice die from the experiment (how many from each group)? If it only reflects the tumor size, do these figures deliver the same information as the tumor growth figure?

      We appreciate the reviewer’s comment. The survival rate is indeed based on tumor volume, and we used a cutoff of 1500 mm3. No death event was observed prior to the tumors reaching 1500 mm3. Although the survival figures overlap with some of the information conveyed by the tumor volume tracking, they offer additional temporal resolution of tumor progression. Presenting both tumor volume and survival tracking is a commonly adopted way to depict tumor progression. We have added the protocol regarding survival monitoring to the materials and method section.
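
      The volume-based endpoint described above can be sketched in a few lines. This is a minimal illustration, not the authors' analysis code: the only value taken from the response is the 1500 mm3 cutoff, and the measurement days, tumor volumes, and function names are hypothetical.

```python
from typing import List, Optional, Sequence

# Endpoint stated in the response: a mouse is scored as an event
# once its tumor volume exceeds 1500 mm3.
VOLUME_CUTOFF_MM3 = 1500.0


def endpoint_day(days: Sequence[float],
                 volumes_mm3: Sequence[float],
                 cutoff: float = VOLUME_CUTOFF_MM3) -> Optional[float]:
    """Return the first measurement day on which the volume exceeds the
    cutoff, or None if the mouse never reaches the endpoint (censored)."""
    for day, volume in zip(days, volumes_mm3):
        if volume > cutoff:
            return day
    return None


def fraction_surviving(endpoints: Sequence[Optional[float]],
                       day: float) -> float:
    """Fraction of mice that have not reached the endpoint by `day`."""
    alive = sum(1 for e in endpoints if e is None or e > day)
    return alive / len(endpoints)


# Hypothetical tumor volumes (mm3) for three mice, measured every 5 days.
days = [0, 5, 10, 15, 20]
group = [
    [50, 200, 700, 1600, 2400],   # crosses the cutoff on day 15
    [50, 150, 400, 900, 1550],    # crosses the cutoff on day 20
    [50, 100, 120, 80, 0],        # regresses, never reaches the endpoint
]
endpoints: List[Optional[float]] = [endpoint_day(days, v) for v in group]
```

      Because each survival event is derived from the same volume series, the survival curve is a thresholded summary of the growth curves rather than an independent measurement, which is the point the reviewer raises.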

      (2) Fig 3a, not sure if entE is a good negative control for this experiment. Neg. Ctrl should maintain its CFU/ml at a certain level regardless of Lcn2 conc. However, entE conc. is at 100 CFU/ml throughout the experiment, suggesting either that there is no entE in the media or that it is so sensitive to Lcn2 that the bacteria die at a dose of 0.1 nM?

      We appreciate the reviewer’s comment. The △entE-E. coli was indeed observed to be highly sensitive to LCN2. We included the control to highlight the competitive relationship between entE and LCN2 for iron chelation, which is previously reported in literature [Biometals 32, 453–467 (2019)].

      (3) Fig 4, the authors harvested bacteria from the tumor by centrifuging homogenized samples at different speeds. Internal controls confirming sample purity (positive for bacteria and negative for cells for panels a,b,c; or vice versa for panel d) may be necessary. This comment may also apply to samples from Fig 1.

      We acknowledge the reviewer’s concern and would like to point out that the proteomic analysis was performed using a highly cited protocol that provides reference and normalization standards for E. coli proteins [Mol Cell Proteomics. 2014 Sep; 13(9): 2513–2526]. The reference is cited in the Materials and Method section associated with the proteomic analysis.

      (4) To demonstrate long-term anti-cancer memory was established through enhancement of CD8+ T cell activity, the "2nd seeding tumor cells" experiment may need to be done in CD8 antibody-treated IronA mice.

      We have modified our claims to highlight that the tumor eradication by iron scavenging bacteria can establish adaptive anticancer immunity through the elicitation of CD8 T cells. We apologize for overstating our claim in the previous manuscript draft.

      Minor suggestions:

      (1) Please include the tumor re-challenge experiment in the method section.

      The re-challenge experiment has been added to the method section as instructed.

      (2) Please cite others' and your previous work. E.g. line 281, 282, line 306-307.

      We have added the citations as instructed.

      (3) Line 448, BL21 is bacteria, not cells.

      We have made the correction accordingly.

      Reviewer #3 (Recommendations For The Authors):

      • The authors postulate that IroA-E. coli is more potent than DGC-E. coli in resisting LCN2 activity, and that this potency is the cause of the increased tumor suppression of this engineered strain. If so, Fig 3a should include DGC-E. coli for direct comparison.

      We thank the reviewer for the comment and would like to clarify that we intended to construct IroA-E. coli as a more specific iron-scavenging strategy, which can aid the discussion of nutritional immunity and minimize confounding factors from the immune-stimulatory effect of CDG. We have modified our text to clarify our stance.

      • The data refers to the effects of WT bacteria-mediated tumor suppression, e.g. Figure 3e shows that even WT bacteria have a significant suppressive effect on tumor growth. Could the authors provide background on what is known about the mechanism of this tumor suppression, outside of tumor targeting and engineerability? They only reference "immune system stimulation."

      We appreciate the reviewer’s comment and would like to refer the reviewer to our recently published article [Lim et al., EMBO Molecular Medicine 2024; DOI: 10.1038/s44321-023-00022-w], which shows that in addition to immune system stimulation, WT bacteria can also be perceived as an invading species in the tumor that can exert differential selective pressure against cancer cells. Competition for nutrients is highlighted as a major contributor to containing tumor growth. In fact, the nutrient competition that we observed in the prior article inspired the design of the iron-scavenging bacteria towards overcoming nutritional immunity. We have cited this recently published article in the revised manuscript to enrich the background.

      • The authors claim that there is immunologic memory because of tumor resistance in re-challenged mice after IroA-E. coli treatment (Fig 5c). It appears that the control group for this experiment is naive mice (and not WT-E. coli treated mice), in which case the immunologic memory could be from having had tumor/E. coli rather than the effect of IroA-E. coli.

      We have modified our claims to highlight that the tumor eradication by iron scavenging bacteria can establish adaptive anticancer immunity through the elicitation of CD8 T cells. We did not intend to highlight that the adaptive immunity stemmed from IroA-E. coli only, and we intend to build upon current literature that has reported CD8+ T cell elicitation by bacterial therapy. The IroA-E.coli is shown to enhance adaptive immunity. We also did not consider a WT-E. coli control as no WT-E. coli treated group achieved complete tumor regression.

      • The authors claim that CD8+ T cells are mechanistically important in the effects of iron status manipulation in E. coli-mediated tumor suppression (Fig 5). In order to show this, it seems that Fig 5c should include WT-E. coli and WT-E. coli+CD8 ab groups, as it may be that CD8+ T cells play a role in tumor suppression regardless of whether or not iron regulation is being manipulated.

      We apologize for the confusion from our prior writing. We have modified our claims to highlight that the tumor eradication by iron scavenging bacteria can establish adaptive anticancer immunity through the elicitation of CD8 T cells. We did not intend to convey that CD8+ T cells are mechanistically important in the effects of iron status manipulation.

    2. eLife assessment

      This valuable study combines proteomics and a mouse model to reveal the importance of iron uptake in bacterial therapy for cancer. The evidence presented is convincing. Notably, the authors showed upregulation of iron uptake of bacteria significantly inhibits tumor growth in vivo. This paper will be of interest to a broad audience including researchers in cancer biology, cell biology, and microbiology.

    3. Reviewer #1 (Public Review):

      In this manuscript, Huang and colleagues explored the role of iron in bacterial therapy for cancer. Using proteomics, they revealed the upregulation of bacterial genes that uptake iron, and reasoned that such regulation is an adaptation to the iron-deficient tumor microenvironment. Logically, they engineered E. Coli strains with enhanced iron-uptake efficiency, and showed that these strains, together with iron scavengers, suppress tumor growth in a mouse model. Lastly, they reported the tumor suppression by IroA-E. Coli provides immunological memory via CD8+ T cells. In general, I find the findings in the manuscript novel and the evidence convincing.

      (1) Although the genetic and proteomic data are convincing, would it be possible to directly quantify the iron concentration in (1) E. Coli in different growth environments, and (2) tumor microenvironment? This will provide functional consequence of upregulating genes that import iron into the bacteria.

      (2) Related to 1, the experiment to study the synergistic effect of CDG and VLX600 (lines 139-175) is very nice and promising, but one flaw here is a lack of the measurement of iron concentration. Therefore, a possible explanation could be that CDG acts in another manner, unrelated to iron uptake, that synergizes with VLX600's function to deplete iron from cancer cells. Here, a direct measurement of iron concentration will show the effect of CDG on iron uptake, thus complementing the missing link.

      (3) Lines 250-268: Although statistically significant, I would recommend the authors characterize the CD8+ T cells a little more, as the mechanism now seems quite elusive. What signals or memories do CD8+ T cells acquire after IroA-E. Coli treatment to confer their long-term immunogenicity?

      (4) Perhaps this goes beyond the scope of the current manuscript, but how broadly applicable is the observed iron-transport phenomenon in other tumor models? I would recommend the authors to either experimentally test it in another model, or at least discuss this question.

    4. Reviewer #2 (Public Review):

      Summary:

      The authors provide strong evidence that bacteria, such as E. coli, compete with tumor cells for iron resources and consequently reduce tumor growth. When sequestration between LCN2 and bacterobactin is blocked by upregulating CDG(DGC-E. coli) or salmochelin(IroA-E.coli), E. coli increase iron uptake from the tumor microenvironment (TME) and restrict iron availability for tumor cells. Long-term remission in IroA-E.coli treated mice is associated with enhanced CD8+ T cell activity. Additionally, systemic delivery of IroA-E.coli shows a synergistic effect with chemotherapy reagent oxaliplatin to reduce tumor growth.

      Strengths:

      It is important to identify the iron-related crosstalk between E. coli and TME. Blocking lcn2-bacterobactin sequestration by different strategies consistently reduce tumor growth.

      Weaknesses:

      As engineered E.coli upregulate their function to uptake iron, they may increase the likelihood of escaping from nutritional immunity (LCN2 becomes insensitive to sequester iron from the bacteria). Would this raise the chance of developing sepsis? Do authors think that it is safe to administrate these engineered bacteria in mice or humans?

    5. Reviewer #3 (Public Review):

      Summary:

      Based on their observation that tumor has an iron-deficient microenvironment, and the assumption that nutritional immunity is important in bacteria-mediated tumor modulation, the authors postulate that manipulation of iron homeostasis can affect tumor growth. This paper uses straightforward in vitro and in vivo techniques to examine a specific and important question of nutritional immunity in bacteria-mediated tumor therapy. They are successful in showing that manipulation of iron regulation during nutritional immunity does affect the virulence of the bacteria, and in turn the tumor. These findings open future avenues of investigation, including the use of different bacteria, different delivery systems for therapeutics, and different tumor types. The authors were also successful in addressing the reviewer's concerns adequately.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      We thank the editorial team and reviewers for their continued contributions to improve our work.

      Below we have addressed the final recommendations to the authors.

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors):

      I asked previously why the suppression depth should vary based on the contrast change speed. I now understand that the authors expect this variation from a working model based on neural adaptation (lines 274-277 and 809-820). I suggest the authors specify this prediction also on lines 473-479, where there is room for improved clarity (the words/phrases 'impact,' 'be sensitive to,' and 'covary' are non-directional).

      We have now specified this prediction to improve clarity:

      Line 475 – 486

      “In the context of the tCFS method, the steady increases and decreases in the target’s actual strength (i.e., its contrast) should, respectively, boost its emergence from suppression (bCFS) and facilitate its reversion to suppression (reCFS) as it competes against the mask. Whether construed as a consequence of neural adaptation or error signal, we surmise that these cycling state transitions defining suppression depth should be sensitive to the rate of contrast change of the monocular target. Specifically, the slower the contrast change, the greater the amount of accrued adaptation, which will contract the range between breakthrough and suppression thresholds according to an adapting reciprocal inhibition model. For fast contrast change, there will be less accrual of adaptation meaning that the range between breakthrough and suppression thresholds will exhibit less contraction. Expressed in operational terms, the depth of suppression should be positively related to the rate of target change. Experiment 3 tested this supposition using three rates of contrast change.”

      Line 108: 'By comparing the thresholds for a target to transition into (reCFS) and out of awareness (bCFS)'-are 'into' and 'out of' reversed?

      They were, thank you, these have now been corrected.

      Lines 696-698 read, 'Figure 3 shows that polar patterns tend to emerge from suppression at slightly lower contrasts than do gratings.' In the same paragraph, lines 716-171 read, 'Figure 3 shows that bCFS and reCFS thresholds are very similar for all image categories.' There is a statistically significant effect of category in these results; meanwhile, the differences among categories are arguably small. Which side do the authors intend to emphasize? Are the readers meant to interpret this as a glass-half-full, half-empty situation?

      We have now revised this paragraph. We emphasise that the small differences do not support ‘preferential processing’ of the magnitude that would be expected from category specific neural CRFs.

      From Line 702

      “Next we turn to another question raised about our conclusion concerning invariant depth of suppression. If a certain image type had overall lower bCFS and reCFS contrast thresholds relative to another image type (despite equivalent suppression depth), would that imply the former image enjoyed “preferential processing” relative to the latter? And, what would determine the differences in bCFS and reCFS thresholds? Figure 3 shows that polar patterns tend to emerge from suppression at slightly lower contrasts than do gratings and that polar patterns, once dominant, tend to maintain dominance to lower contrasts than do gratings and this happens even though the rate of contrast change is identical for both types of stimuli. But while rate of contrast change is identical, the neural responses to those contrast changes may not be the same: neural responses to changing contrast will depend on the neural contrast response functions (CRFs) of the cells responding to each of those two types of stimuli, where the CRF defines the relationship between neural response and stimulus contrast. CRFs rise monotonically with contrast and typically exhibit a steeply rising initial response as stimulus contrast rises from low to moderate values, followed by a reduced growth rate for higher contrasts. CRFs can vary in how steeply they rise and at what contrast they achieve half-max response. CRFs for neurons in mid-level vision areas such as V4 and FFA (which respond well to polar stimuli and faces, respectively) are generally steeper and shifted towards lower contrasts than CRFs for neurons in primary visual cortex (which respond well to gratings). Therefore, the effective strength of the contrast changes in our tCFS procedure will depend on the shape and position of the underlying CRF, an idea we develop in more detail in Supplementary Appendix 1, comparing the case of V1 and V4 CRFs. 
Interestingly, the comparison of V1 and V4 CRFs shows two interesting points: (i) that V4 CRFs should produce much lower bCFS and reCFS thresholds than V1 CRFs, and (ii) that V4 CRFs should produce much more suppression than V1 CRFs. Our data do not support either prediction: bCFS and reCFS thresholds for the polar shape are not ‘much lower’ than those for gratings (Fig. 3) and neither is there ‘much more’ suppression depth for the polar form. There is no room in these results to support the claim that certain images are special and receive “preferential processing” or processing outside of awareness. Instead, the similar data patterns for all image types is most parsimoniously explained by a single mechanism processing all images (see Appendix 1), although there are many other kinds of images still to be tested in tCFS and exceptions may yet be found. As a first step in exploring this idea, one could use standard psychophysical techniques (e.g., (Ling & Carrasco, 2006)) to derive CRFs for different categories of patterns and then measure suppression depth associated with those patterns using tCFS.”
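
      The CRF argument quoted above can be made concrete with a small numerical sketch. It assumes the Naka-Rushton form, a standard descriptive model of contrast response functions; the response does not state which form Supplementary Appendix 1 uses, and the "V1-like" and "V4-like" parameter values below are hypothetical, chosen only so that the second curve is steeper and shifted towards lower contrasts.

```python
def naka_rushton(contrast: float, r_max: float = 1.0,
                 c50: float = 0.2, n: float = 2.0) -> float:
    """Naka-Rushton contrast response: r_max * c^n / (c^n + c50^n).
    c50 is the contrast giving half-maximal response; n sets the slope."""
    return r_max * contrast ** n / (contrast ** n + c50 ** n)


def contrast_for_response(target: float, r_max: float = 1.0,
                          c50: float = 0.2, n: float = 2.0) -> float:
    """Invert the function: the contrast at which the response reaches
    `target`, valid for 0 < target < r_max."""
    return c50 * (target / (r_max - target)) ** (1.0 / n)


# Hypothetical parameters, for illustration only.
V1_LIKE = {"c50": 0.30, "n": 2.0}   # shallower, higher half-max contrast
V4_LIKE = {"c50": 0.10, "n": 3.0}   # steeper, shifted to lower contrasts

# Contrast needed to reach the same criterion response under each CRF.
criterion = 0.5
v1_contrast = contrast_for_response(criterion, **V1_LIKE)
v4_contrast = contrast_for_response(criterion, **V4_LIKE)
```

      Under these assumed parameters the steeper, left-shifted curve reaches any fixed criterion response at a lower stimulus contrast, which is the sense in which identical physical contrast ramps can differ in effective strength across stimulus types.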

    2. Reviewer #2 (Public Review):

      Summary

      The paper concerns the phenomenon of continuous flash suppression (CFS), relevant to questions about the extent and nature of subconscious visual processing. Whereas standard CFS studies only measure the breakthrough threshold-the contrast at which an initially suppressed target stimulus with steadily increasing contrast becomes visible-this study also measures the re-suppression threshold, the contrast at which a visible target with decreasing contrast becomes suppressed. Thus, the authors could calculate suppression depth, the ratio between the breakthrough and re-suppression thresholds. To measure both thresholds, the study introduces the tracking-CFS method, a continuous-trial design that results in faster, better controlled, and lower-variance threshold estimates compared to the discrete trials standard in the literature. The study finds that suppression depths are similar for different image categories, providing an interesting contrast to previous results that breakthrough thresholds differ for different image categories. The new finding calls for a reassessment of interpretations based solely on the breakthrough threshold that subconscious visual processing is category-specific.

      Strengths

      (1) The tCFS method quickly estimates breakthrough and re-suppression thresholds using continuous trials, which also better control for slowly varying factors such as adaptation and attention. Indeed, tCFS produces estimates with lower across-subject variance than the standard discrete-trial method (Fig. 2). The tCFS method is straightforward to adopt in future research on CFS and binocular rivalry.

      (2) The CFS literature has lacked re-suppression threshold measurements. By measuring both breakthrough and re-suppression thresholds, this work calculated suppression depth (i.e., the difference between the two thresholds), which warrants different interpretations from the breakthrough threshold alone.

      (3) The work found that different image categories show similar suppression depths, suggesting some aspects of CFS are not category-specific. This result enriches previous findings that breakthrough thresholds vary with image categories. Re-suppression thresholds vary symmetrically, such that their differences are constant.
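
      The relationship between the two thresholds and suppression depth, described as a ratio in the summary and as a difference in point (2), can be illustrated with a short sketch. It assumes contrast is expressed in decibels as 20*log10(contrast), in which case a ratio of linear contrasts and a difference of dB values are the same quantity; the threshold values and function names are hypothetical and are not taken from the paper.

```python
import math


def to_db(contrast: float) -> float:
    """Convert a linear contrast (0 < contrast <= 1) to decibels.
    The 20*log10 convention is an assumption of this sketch."""
    return 20.0 * math.log10(contrast)


def suppression_depth_db(bcfs_contrast: float, recfs_contrast: float) -> float:
    """Suppression depth: breakthrough (bCFS) threshold minus
    re-suppression (reCFS) threshold, both in dB."""
    return to_db(bcfs_contrast) - to_db(recfs_contrast)


# Hypothetical thresholds for two image categories. The second category
# breaks through at twice the contrast of the first, but both thresholds
# shift together, so the depth is unchanged.
depth_a = suppression_depth_db(bcfs_contrast=0.10, recfs_contrast=0.02)
depth_b = suppression_depth_db(bcfs_contrast=0.20, recfs_contrast=0.04)
```

      In this toy case the breakthrough thresholds differ between categories while the suppression depths are equal, which is the pattern point (3) describes.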

      Weakness

      I do not follow the authors' reasoning as to why the suppression depth is a better (or fuller, superior, more informative) indication of subconscious visual processing than the breakthrough threshold alone. To my previous round of comments, the authors replied that 'breakthrough provides only half of the needed information.' I do not understand this. One cannot infer the suppression depth from the breakthrough threshold alone, but *one cannot obtain the breakthrough threshold from the suppression depth alone*, either. The two measures are complementary. (To be sure, given *both* the suppression depth and the re-suppression threshold, one can recover the breakthrough threshold. The discussion concerns the suppression depth *alone* and the breakthrough threshold *alone*.) I am fully open to being convinced that there is a good reason why the suppression depth may be more informative than the breakthrough threshold about a specific topic, e.g., inter-ocular suppression or subconscious visual processing. I only request that the authors make such an argument explicit. For example, in the significance statement, the authors write, 'all images show equal suppression when both thresholds are measured. We *thus* find no evidence of differential unconscious processing and *conclude* reliance on breakthrough thresholds is misleading' (emphasis added). Just what supports the 'thus' and the 'conclude'? Similarly, at the end of the introduction, the authors write, '[...] suppression depth was constant for faces, objects, gratings and visual noise. *In other words*, we find no evidence to support differential unconscious processing among these particular, diverse categories of suppressed images' (emphasis added). I am not sure the statements in the two sentences are equivalent.

      The authors' reply included a discussion of neural CRFs, which may explain why the bCFS thresholds differ across image categories. A further step seems necessary to explain why CRFs do not qualify as a form of subconscious processing.

    3. eLife assessment

      This valuable study introduces an innovative method for measuring interocular suppression depth, which implicates mechanisms underlying subconscious visual processing. The evidence is solid in suggesting that the new method yields provocative uniform suppression depth results across image categories that differ from conventional bCFS threshold. It will be of interest not only to cognitive psychologists and neuroscientists who study sensation and perception but also to philosophers who work on theories of consciousness.

    4. Reviewer #1 (Public Review):

      Summary

      A new method, tCFS, is introduced to offer richer and more efficient measurement of interocular suppression. It generates a new index, the suppression depth, based on the contrast difference between the up-ramped contrast for the target to break through suppression and the down-ramped contrast for the target to disappear into suppression. A uniform suppression depth regardless of image types (e.g., faces, gratings and scrambles) was discovered in the paper, favoring an early-stage mechanism involving CFS. The paper also discusses claims of unconscious processing and the related mechanisms.

      Strength

      The tCFS method adds to the existing bCFS paradigms by providing the (re-)suppression threshold and thereby the suppression depth. Benefiting from adaptive procedures with continuous trials, the tCFS is able to give fast and efficient measurements. It also provides a new opportunity to test theories and models about how information is processed outside visual awareness.

      Weakness:

      This paper reports the surprising finding of uniform suppression depth over a variety of stimuli. This is novel and interesting. But given the limited samples being tested, the claim of uniform suppression depth needs to be further examined, with respect to different complexities and semantic meanings.

      From an intuitive aspect, the results challenged previous views about "preferential processing" for certain categories, though it invites further research to explore what exactly could suppression depth tell us about unconscious visual processing.

    5. Reviewer #3 (Public Review):

      Summary:

      In the 'bCFS' paradigm, a monocular target gradually increases in contrast until it breaks interocular suppression by a rich monocular suppressor in the other eye. The present authors extend the bCFS paradigm by allowing the target to reduce back down in contrast until it becomes suppressed again. The main variable of interest is the contrast difference between breaking suppression and (re) entering suppression. The authors find this difference to be constant across a range of target types, even ones that differ substantially in the contrast at which they break interocular suppression (the variable conventionally measured in bCFS). They also measure how the difference changes as a function of other manipulations. Interpretation is in terms of the processing of unconscious visual content, as well as in terms of the mechanism of interocular suppression.

      Strengths:

      Interpretation of bCFS findings is mired in controversy, and this is an ingenious effort to move beyond the paradigm's exclusive focus on breaking suppression. The notion of using the contrast difference between breaking and entering suppression as an index of suppression depth is interesting. The finding that this difference is similar for a range of target types that do differ in the contrast at which they break suppression suggests a common mechanism of suppression across those target types.

    1. Author response:

      The following is the authors’ response to the original reviews.

      The reviewers praised multiple aspects of our study. Reviewer 1 noted that “the work aligns well with current research trends and will greatly interest researchers in the field.” Reviewer 2 highlighted the unique capability of our imaging approach, which “allows for investigation of the heterogeneity of response across individual dopamine axons, unlike other common approaches such as fiber photometry.” Reviewer 3 commented that “the experiments are beautifully executed” and “are revealing novel information about how aversive and rewarding stimuli is encoded at the level of individual axons, in a way that has not been done before.”

      In addition to the positive feedback, the reviewers also provided useful criticisms and suggestions, some of which may not be fully addressed in a single study. For instance, questions regarding whether dopamine axons encode the valence or specific identity of the stimuli, or the most salient aspects of the environment, remain open. At the same time, as all the reviewers agreed, our report on the diversity of dopamine axonal responses using a novel imaging design introduces significant new insights to the neuroscience community. Following the reviewers’ recommendations, we have refrained from making interpretations that could be perceived as overinterpretation, such as concluding that “dopamine axons are involved in aversive processing.” This has necessitated extensive revisions, including modifying the title of our manuscript to make clear that the novelty of our work is revealing ‘functional diversity’ using our new imaging approach.

      Below, we respond to the reviewers’ comments point by point.

      eLife assessment

      This valuable study shows that distinct midbrain dopaminergic axons in the medial prefrontal cortex respond to aversive and rewarding stimuli and suggest that they are biased toward aversive processing. The use of innovative microprism based two-photon calcium imaging to study single axon heterogeneity is solid, although the experimental design could be optimized to distinguish aversive valence from stimulus salience and identity in this dopamine projection. This work will be of interest to neuroscientists working on neuromodulatory systems, cortical function and decision making.

      Reviewer #1

      Summary:

      In this manuscript, Abe and colleagues employ in vivo 2-photon calcium imaging of dopaminergic axons in the mPFC. The study reveals that these axons primarily respond to unconditioned aversive stimuli (US) and enhance their responses to initially-neutral stimuli after classical association learning. The manuscript is well-structured and presents results clearly. The utilization of a refined prism-based imaging technique, though not entirely novel, is well-implemented. The study's significance lies in its contribution to the existing literature by offering single-axon resolution functional insights, supplementing prior bulk measurements of calcium or dopamine release. Given the current focus on neuromodulator neuron heterogeneity, the work aligns well with current research trends and will greatly interest researchers in the field.

      However, I would like to highlight that the authors could further enhance their manuscript by addressing study limitations more comprehensively and by providing essential details to ensure the reproducibility of their research. In light of this, I have a number of comments and suggestions that, if incorporated, would significantly contribute to the manuscript's value to the field.

      Strengths:

      • Descriptive.

      • Utilization of a well-optimized prism-based imaging method.

      • Provides valuable single-axon resolution functional observations, filling a gap in existing literature.

      • Timely contribution to the study of neuromodulator neuron heterogeneity.

      We thank the reviewer for this positive assessment.

      Weaknesses:

      (1) It's important to fully discuss the fact that the measurements were carried out only on superficial layers (30-100um), while major dopamine projections target deep layers of the mPFC as discussed in the cited literature (Vander Weele et al., 2018) and as illustrated in FigS1B,C. This limitation should be explicitly acknowledged and discussed in the manuscript, especially given the potential functional heterogeneity among dopamine neurons in different layers. This potential across-layer heterogeneity could also be the cause of discrepancy among past recording studies with different measurement modalities. Also, mentioning technical limitations would be informative. For example: how deep the authors can perform 2p-imaging through the prism? was the "30-100um" maximum depth the authors could get?

      Thank you for pointing out this important issue about layer differences.

      It is possible that the mesocortical pathway has layer-specific channels, with some neurons targeting supragranular layers and others targeting infragranular ones. Alternatively, it is also plausible that the axons of the same neurons branch into both superficial and deep layers. This is a critical issue that has not been investigated in anatomical studies and will require single-cell labeling of dopamine neurons (Matsuda et al., 2009; Aransay et al., 2015). We now discuss this issue in the Discussion.

      As for the imaging depth of 30–100 µm, we were unable to visualize deeper axons in a live view mode. Our imaging system has already been optimized to detect weak signals (e.g., we have employed an excitation wavelength of 980 nm, dispersion compensation, and a hybrid photodetector). It is possible that future studies using improved imaging approaches may be able to visualize deeper layers. Importantly, sparse axons in the supragranular layers are advantageous in detecting weak signals; dense labeling of axons would increase the background fluorescence relative to signals. We now reference this layer issue in the Results and Discussion sections.

      (2) In the introduction, it seems that the authors intended to refer to Poulin et al. 2018 regarding molecular/anatomical heterogeneity of dopamine neurons, but they inadvertently cited Poulin et al. 2016 (a general review on scRNAseq). Additionally, the statement that "dopamine neurons that project to the PFC show unique genetic profiles (line 85)" requires clarification, as Poulin et al. 2018 did not specifically establish this point. Instead, they found at least the Vglut2/Cck+ population projects into mPFC, and they did not reject the possibility of other subclasses projecting to mPFC. Rather, they observed denser innervation with DAT-cre, suggesting that non-Vglut2/Cck populations would also project to mPFC. Discuss the potential molecular heterogeneity among mPFC dopamine axons in light of the sampling limitation mentioned earlier.

      We thank the reviewer for pointing this out. Genetic profiles of PFC-projecting DA neurons are still being investigated, so describing them as “unique” was misleading. We have edited the Introduction accordingly, and now discuss this issue in detail in the Discussion.

      (3) I find the data presented in Figure 2 to be odd. Firstly, the latency of shock responses in the representative axons (right panels of G, H) is consistently very long - nearly 500ms. It raises a query whether this is a biological phenomenon or if it stems from a potential technical artifact, possibly arising from an issue in synchronization between the 2-photon imaging and stimulus presentation. My reservations are compounded by the notable absence of comprehensive information concerning the synchronization of the experimental system in the method section.

      The synchronization of the stimulus and data acquisition is accomplished at a sub-millisecond resolution. We use a custom-made MATLAB program that sends TTL commands to standard imaging software (ThorImage or ScanImage) and a stimulator for electrical shocks. All events are recorded as analogue inputs to a different DAQ to ensure synchronization. We have provided additional details regarding the configuration in the Methods section.

      We consider that the long latency of shock response is biological. For instance, a similar long latency was found after electrical shock in a photometry imaging study (Kim, …, Deisseroth, 2016).

      Secondly, there appear to be irregularities in Panel J. While the authors indicate that "Significant axons were classified as either reward-preferring (cyan) or aversive-preferring (magenta), based on whether the axons are above or below the unity line of the reward/aversive scatter plot (Line 566)," a cyan dot slightly but clearly deviates above the unity line (around coordinates (x, y) = (20, 21)). This needs clarification. Lastly, when categorizing axons for analysis of conditioning data in Fig3 (not Fig2), the authors stated "The color-coded classification (cyan/magenta) was based on k-means clustering, using the responses before classical conditioning (Figure 2J)". I do not understand why the authors used different classification methods for two almost identical datasets.

      We thank the reviewer for pointing out these insufficient descriptions. We classified the axons using k-means clustering, and the separation of the two clusters happened to roughly coincide with the unity line of the reward/aversive scatter plot in Fig 2J. In other words, we did not use the unity line to classify the data points (which is why the color separation of the histogram is not at 45 degrees). We have clarified this point in the Methods section.
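
      As an illustration of the classification described above, the following minimal sketch clusters axons by their reward and aversive response amplitudes with k-means (k = 2). It is not the analysis code used in the study: the response values, the function name `kmeans_2d`, and the choice of initial centers are all hypothetical.

```python
import math

def kmeans_2d(points, centers, n_iter=100):
    """Plain Lloyd's algorithm for 2-D points; `centers` are the initial guesses."""
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest center.
        labels = [
            min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append((
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                ))
            else:
                new_centers.append(centers[k])  # keep the center of an empty cluster
        if new_centers == centers:
            break  # converged: centers no longer move
        centers = new_centers
    return labels, centers

# Hypothetical (reward response, aversive response) pairs, one per axon.
responses = [
    (22.0, 4.0), (18.0, 6.0), (25.0, 3.0),               # larger reward response
    (5.0, 30.0), (8.0, 45.0), (3.0, 28.0), (12.0, 38.0),  # larger aversive response
]
labels, centers = kmeans_2d(responses, centers=[responses[0], responses[3]])
```

      With two clusters, the boundary produced by k-means is the perpendicular bisector of the segment joining the two centroids, so it need not coincide with the unity line of the scatter plot.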

      (4) In connection with Point 3, conducting separate statistical analyses for aversive and rewarding stimuli would offer a fairer approach. This could potentially reveal a subset of axons that display responses to both aversive and appetitive stimuli, aligning more accurately with the true underlying dynamics. Moreover, the characterization of Figure 2J as a bimodal distribution while disregarding the presence of axons responsive to both aversive and appetitive cues seems somewhat arbitrary and circular logic. A more inclusive consideration of this dual-responsive population could contribute to a more comprehensive interpretation.

      We also attempted k-means clustering with additional dimensions (e.g., temporal domains as shown in Fig. 3I, J), but no additional clusters were evident. We note that the lack of other clusters does not exclude the possibility of their existence, which may only become apparent with a substantial increase in the number of samples. In the current report, we present the clusters that were the easiest/simplest for us to identify.

      Additionally, we have revised our manuscript to reflect that many axons respond to both reward and aversive stimuli, and that aversive-preferring axons do not exclusively respond to the aversive stimulus.

      (5) The contrast in initialization to novel cues between aversive and appetitive axons mirrors findings in other areas, such as the tail-of-striatum (TS) and ventral striatum (VS) projecting dopamine neurons (Menegas et al., 2017, not 2018). You might consider citing this very relevant study and discussing potential collateral projections between mPFC and TS or VS.

      Thank you for pointing this out. We have now included Menegas et al., 2017, and also discuss the possibility of collaterals to these areas. In addition, we also referred to Azcorra et al., 2023 - this was published after our initial submission.

      (6) The use of correlation values (here >0.65) to group ROIs into axons is common but should be justified based on axon density in the FOV and imaging quality. It's important to present the distribution of correlation values and demonstrate the consistency of results with varying cut-off values. Also, provide insights into the reliability of aversive/appetitive classifications for individual ROIs with high correlations. Importantly, if you do the statistical testing and aversive/appetitive classifications for individual ROIs with above-threshold high correlation (to be grouped into the same axon), do they always fall into the same category? How many false positives/false negatives are observed?


      "Our results remained similar for different correlation threshold values (Line 556)" (data not shown) is obsolete.

      We have conducted additional analysis using correlation values 0.5 and 0.3 that resulted in a smaller number of axon terminals. In essence, the relationship between reward responses and aversive responses remained very similar to Fig. 2J, K.

      Author response image 1.
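
      The effect of the correlation cut-off on the number of putative axons can be illustrated with a minimal sketch. The traces below are hypothetical, and transitive merging of ROIs with a union-find structure is an assumption made for this example, not a description of the grouping procedure used in the study.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def group_rois(traces, threshold):
    """Merge ROIs whose pairwise correlation exceeds `threshold`."""
    parent = list(range(len(traces)))

    def find(i):
        # Follow parent links to the group representative (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(traces)), 2):
        if pearson(traces[i], traces[j]) > threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(traces)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Hypothetical fluorescence traces: ROIs 0 and 1 are nearly identical,
# ROI 2 is moderately similar to ROI 0, and ROI 3 is unrelated.
traces = [
    [0, 1, 5, 1, 0, 0, 4, 1],
    [0, 1, 4, 1, 0, 0, 5, 1],
    [0, 0, 3, 0, 2, 0, 1, 2],
    [3, 0, 0, 2, 0, 4, 0, 0],
]
groups_by_threshold = {thr: group_rois(traces, thr) for thr in (0.65, 0.5, 0.3)}
```

      In this toy case, lowering the threshold from 0.65 to 0.5 merges ROI 2 into the first group, so the number of putative axons drops from three to two, in line with the smaller number of axon terminals noted above.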

      Reviewer #2 (Public Review):

      Summary:

      This study aims to address existing differences in the literature regarding the extent of reward versus aversive dopamine signaling in the prefrontal cortex. To do so, the authors chose to present mice with both a reward and an aversive stimulus during different trials each day. The authors used high spatial resolution two-photon calcium imaging of individual dopaminergic axons in the medial PFC to characterize the response of these axons to determine the selectivity of responses in unique axons. They also paired the reward (water) and an aversive stimulus (tail shock) with auditory tones and recorded across 12 days of associative learning.

      The authors find that some axons respond to both reward and aversive unconditioned stimuli, but overall, there is a strong preference to respond to aversive stimuli consistent with expectations from prior studies that used other recording methods. The authors find that both of their two auditory stimuli initially drive responses in axons, but that with training axons develop more selective responses for the shock-associated tone, indicating that associative learning led to changes in these axons' responses. Finally, the authors use anticipatory behaviors during the conditioned stimuli and facial expressions to determine stimulus discrimination and relate dopamine axon signals with this behavioral evidence of discrimination. This study takes advantage of cutting-edge imaging approaches to resolve the extent to which dopamine axons in PFC respond to appetitive or aversive stimuli. They conclude that there is a strong bias to respond to the aversive tail shock in most axons and weaker more sparse representation of water reward.

      Strengths:

      The strength of this study is the imaging approach that allows for investigation of the heterogeneity of response across individual dopamine axons, unlike other common approaches such as fiber photometry which provide a measure of the average population activity. The use of appetitive and aversive stimuli to probe responses across individual axons is another strength.

      We thank the reviewer for this positive assessment.

      Weaknesses:

      A weakness of this study is the design of the associative conditioning paradigm. The use of only a single reward and single aversive stimulus makes it difficult to know whether these results are specific to the valence of the stimuli versus the specific identity of the stimuli. Further, the reward presentations are more numerous than the aversive trials making it unclear how much novelty and habituation account for results. Moreover, the training seems somewhat limited by the low number of trials and did not result in strong associative conditioning. The lack of omission responses reported may reflect weak associative conditioning. Finally, the study provides a small advance in our understanding of dopamine signaling in the PFC and lacks evidence for if and what might be the consequence of these axonal responses on PFC dopamine concentrations and PFC neuron activity.

      We thank the reviewer for the suggestions.

      We agree that interpreting the response change during classical conditioning is not straightforward. Although the reward and aversive stimuli we employed are commonly used in the field, future studies with more sophisticated paradigms will be necessary to address whether dopamine axons encode the valence of the stimuli, the specific identity of the stimuli, or novelty and habituation. In our current manuscript, we refrain from concluding that distinct groups of neurons encode different valences. In fact, many axons respond to both stimuli, at different ratios. We have removed descriptions that may suggest exclusive coding of reward or aversive processing. Additionally, we have extensively discussed possible interpretations.

      In terms of the strength of the conditioning association, behavioral results indicated that the learning plateaued – anticipatory behaviors did not increase during the last two phases when the conditioned span was divided into six phases (Figure 3–figure supplement 1).

      Our goal in the current manuscript is to provide new insight into the functional diversity of dopamine axons in the mPFC. Investigating the impact of dopamine axons on local dopamine concentration and neural activity in the mPFC is important but falls beyond the scope of our current study. In particular, given the functional diversity of dopamine axons, interpreting bulk optogenetic or chemogenetic axonal manipulation experiments would not be straightforward. As suggested, measuring the dopamine concentration through two-photon imaging of dopamine sensors and monitoring the activity of dopamine recipient neurons (e.g., D1R- or D2R-expressing neurons) is a promising approach that we plan to undertake in the near future.

      Reviewer #3 (Public Review):

      Summary:

      The authors image dopamine axons in medial prefrontal cortex (mPFC) using microprism-mediated two-photon calcium imaging. They image these axons as mice learn that two auditory cues predict two distinct outcomes, tailshock or water delivery. They find that some axons show a preference for encoding of the shock and some show a preference for encoding of water. The authors report a greater number of dopamine axons in mPFC that respond to shock. Across time, the shock-preferring axons begin to respond preferentially to the cue predicting shock, while there is a less pronounced increase in the water-responsive axons that acquire a response to the water-predictive cue (these axons also increase non-significantly to the shock-predictive cue). These data lead the authors to argue that dopamine axons in mPFC preferentially encode aversive stimuli.

      Strengths:

      The experiments are beautifully executed and the authors have mastered an impressively complex technique. Specifically, they are able to image and track individual dopamine axons in mPFC across days of learning. This technique is used the way it should be: the authors isolate distinct dopamine axons in mPFC and characterize their encoding preferences and how this evolves across learning of cue-shock and cue-water contingencies. Thus, these experiments are revealing novel information about how aversive and rewarding stimuli is encoded at the level of individual axons, in a way that has not been done before. This is timely and important.

      We thank the reviewer for this positive assessment.

      Weaknesses:

      The overarching conclusion of the paper is that dopamine axons preferentially encode aversive stimuli. This is prevalent in the title, abstract, and throughout the manuscript. This is fundamentally confounded. As the authors point out themselves, the axonal response to stimuli is sensitive to outcome magnitude (Supp Fig 3). That is, if you increase the magnitude of water or shock that is delivered, you increase the change in fluorescence that is seen in the axons. Unsurprisingly, the change in fluorescence that is seen to shock is considerably higher than water reward.

      We agree that the interpretation of our results is not straightforward. Our current manuscript now focuses on our strength, which is reporting the functional diversity of dopamine axons. Therefore, we avoid using the word ‘encode’ when describing the response.

      We believe that our results could reconcile the apparent discrepancy as to why some previous studies reported only aversive responses while others reported reward responses. In particular, if the reward volume were very small, the reward response could go undetected.

      Further, when the mice are first given unexpected water delivery and have not yet experienced the aversive stimuli, over 40% of the axons respond [yet just a few lines below the authors write: "Previous studies have demonstrated that the overall dopamine release at the mPFC or the summed activity of mPFC dopamine axons exhibits a strong response to aversive stimuli (e.g., tail shock), but little to rewards", which seems inconsistent with their own data].

      We always recorded the reward and aversive response together, which might have confused the reviewer. Therefore, there is no inconsistency in our data. We have clarified our methods and reasoning accordingly.

      Given these aspects of the data, it could be the case that the dopamine axons in mPFC encodes different types of information and delegates preferential processing to the most salient outcome across time.

      This is certainly an exciting interpretation, so we have included it in our discussion. Meanwhile, ‘the most salient outcome’ alone cannot fully capture the diverse response patterns of the dopaminergic axons, particularly reward-preferring axons. We discuss our findings in more detail in the revised manuscript.

      The use of two similar sounding tones (9 kHz and 12 kHz) for the reward and aversive predicting cues is likely to enhance this as it requires a fine-grained distinction between the two cues in order to learn effectively. There is considerable literature on mPFC function across species that would support such a view. Specifically, theories of mPFC function (in particular prelimbic cortex, which is where the axon images are mostly taken) generally center around resolution of conflict in what to respond, learn about, and attend to. That is, mPFC is important for devoting the most resources (learning, behavior) to the most relevant outcomes in the environment. This data then, provides a mechanism for this to occur in mPFC. That is, dopamine axons signal to the mPFC the most salient aspects of the environment, which should be preferentially learned about and responded towards. This is also consistent with the absence of a negative prediction error during omission: the dopamine axons show increases in responses during receipt of unexpected outcomes, but do not encode negative errors. This supports a role for this projection in helping to allocate resources to the most salient outcomes and their predictors, and not learning per se. Below are just a few references from the rich literature on mPFC function (some consider rodent mPFC analogous to DLPFC, some mPFC), which advocate for a role in this region in allocating attention and cognitive resources to most relevant stimuli, and do not indicate preferential processing of aversive stimuli.

      Distinguishing between 9 kHz and 12 kHz sound tones may not be that difficult, considering anticipatory licking and running are differentially manifested. In addition, previous studies have shown that mice can distinguish between two sound tones when they are separated by 7% (de Hoz and Nelken 2014). Nonetheless, we agree with the attractive interpretation that “the mPFC devotes the most resources (learning, behavior) to the most relevant outcomes in the environment” and that dopamine is a mechanism for this. Therefore, we discuss this interpretation in the revised text.
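
      The separation between the two tones can be compared with the reported discrimination limit in a short worked calculation. This sketch assumes that the 7% figure refers to the relative frequency difference between two tones; that reading is an assumption made here and should be checked against de Hoz and Nelken (2014).

```python
import math

low_hz, high_hz = 9_000.0, 12_000.0   # tone frequencies, from the text
reported_limit = 0.07                  # 7% separation, from the text

# Relative difference with respect to the lower tone: (12 - 9) / 9, about 33%.
relative_separation = (high_hz - low_hz) / low_hz

# The same separation in octaves: log2(12 / 9), about 0.415 octaves.
octave_separation = math.log2(high_hz / low_hz)

# How many times larger the separation is than the reported limit (about 4.8).
margin = relative_separation / reported_limit
```

      Under this assumption, the two cues are separated by several times the reported limit.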

      References:

      (1) Miller, E. K., & Cohen, J. D. (2001). An integrative theory of prefrontal cortex function. Annual review of neuroscience, 24(1), 167-202.

      (2) Bissonette, G. B., Powell, E. M., & Roesch, M. R. (2013). Neural structures underlying set-shifting: roles of medial prefrontal cortex and anterior cingulate cortex. Behavioural brain research, 250, 91-101.

      (3) Desimone, R., & Duncan, J. (1995). Neural mechanisms of selective visual attention. Annual review of neuroscience, 18(1), 193-222.

      (4) Sharpe, M. J., Stalnaker, T., Schuck, N. W., Killcross, S., Schoenbaum, G., & Niv, Y. (2019). An integrated model of action selection: distinct modes of cortical control of striatal decision making. Annual review of psychology, 70, 53-76.

      (5) Ridderinkhof, K. R., Ullsperger, M., Crone, E. A., & Nieuwenhuis, S. (2004). The role of the medial frontal cortex in cognitive control. science, 306(5695), 443-447.

      (6) Nee, D. E., Kastner, S., & Brown, J. W. (2011). Functional heterogeneity of conflict, error, task-switching, and unexpectedness effects within medial prefrontal cortex. Neuroimage, 54(1), 528-540.

      (7) Isoda, M., & Hikosaka, O. (2007). Switching from automatic to controlled action by monkey medial frontal cortex. Nature neuroscience, 10(2), 240-248.

      Reviewer #1 (Recommendations For The Authors):

      Specific Suggestions and Questions on the Methods Section:

      In general, the methods part is not well documented and sometimes confusing. Thus, as it stands, it hinders reproducible research. Specific suggestions/questions are listed in the following section.

      (1) Broussard et al. 2018 introduced axon-GCaMP6 instead of axon-jGCaMP8m. The authors should provide details about the source of this material. If it was custom-made, a description of the subcloning process would be appreciated. Additionally, consider depositing sequence information or preferably the plasmid itself. Furthermore, the introduction of the jGCaMP8 series by Zhang, Rozsa, et al. 2023 should be acknowledged and referenced in your manuscript.

      We thank the reviewer for pointing this out. We have now included details on how we prepared the axon-jGCaMP8m, which was based on plasmids available at Addgene. Additionally, we have deposited our construct at Addgene (https://www.addgene.org/216533/). We have also cited Janelia’s report on jGCaMP8 (Zhang et al., 2023).

      (2) The authors should elaborate on the approach taken for experimental synchronization. Specifically, how was the alignment achieved between 2-photon imaging, treadmill recordings, aversive/appetitive stimuli, and videography? It would be important to document the details of the software and hardware components employed for generating TTLs that trigger the pump, stimulator, cameras, etc.

      We have now included a more detailed explanation about the timing control. We utilize a custom-made MATLAB program that sends TTL square waves and analogue waves via a single National Instruments board (USB-6229) to control two-photon image acquisition, behavior camera image acquisition, water syringe movement, current flow from a stimulator, and sound presentation. We also continuously recorded at 30 kHz, via a separate National Instruments board (PCIe-6363), the frame timing of two-photon imaging, the frame timing of a behavior camera, copies of command waves (sent to the syringe pump, the stimulator, and the speaker), and signals from the treadmill corresponding to running speed.
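
      One way such recordings can be used for offline alignment is sketched below: rising edges are detected in the recorded frame clock and command waves, and each event is assigned to the imaging frame during which it occurred. Only the 30 kHz sampling rate is taken from the text; the signal shapes, the 2.5 V threshold, and the function names are hypothetical, and the sketch is written in Python rather than the MATLAB used for acquisition.

```python
from bisect import bisect_right

SAMPLE_RATE_HZ = 30_000  # sampling rate of the recording board, from the text

def rising_edges(trace, threshold=2.5):
    """Sample indices at which the trace crosses `threshold` upward."""
    return [
        i for i in range(1, len(trace))
        if trace[i - 1] < threshold <= trace[i]
    ]

def frame_of_event(frame_starts, event_sample):
    """Index of the last imaging frame that started at or before the event."""
    return bisect_right(frame_starts, event_sample) - 1

# Hypothetical recording: a frame-clock pulse every 1000 samples
# and a single shock command starting at sample 2500.
n_samples = 5000
frame_clock = [5.0 if 100 <= i % 1000 < 200 else 0.0 for i in range(n_samples)]
shock_cmd = [5.0 if 2500 <= i < 2530 else 0.0 for i in range(n_samples)]

frame_starts = rising_edges(frame_clock)
shock_onset = rising_edges(shock_cmd)[0]
shock_frame = frame_of_event(frame_starts, shock_onset)

# Delay between the start of that frame and the shock onset, in milliseconds.
offset_ms = 1000.0 * (shock_onset - frame_starts[shock_frame]) / SAMPLE_RATE_HZ
```

      At 30 kHz, one sample corresponds to about 33 µs, which is the timing resolution available to this kind of alignment.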

      (3) The information regarding the cameras utilized in the study presents some confusion. In one instance, you mention, "To monitor licking behavior, the face of each mouse was filmed with a camera at 60 Hz (CM3-U3-13Y3M-CS, FLIR)" (Line 488). However, there's also a reference to filming facial expressions using an infrared web camera (Line 613). Could you clarify whether the FLIR camera (which is an industrial CMOS not a webcam) is referred to as a webcam? Alternatively, if it's a different camera being discussed, please provide product details, including pixel numbers and frame rate for clarity.

      We thank the reviewer for pointing this out. This was a mistake on our end. The camera used in the current project was a CM3-U3-13Y3M-CS, not a web camera. We have now corrected this.

      (4) Please provide more information about the methodology employed for lick detection. Specifically, did the authors solely rely on videography for this purpose? If so, why was an electrical (or capacitive) detector not used? It would provide greater accuracy in detecting licking.

      Lick detection was performed offline based on videography, using DeepLabCut. As licking occurs at a frequency of ~6.5 Hz (Xu, …, O’Connor Nature Neurosci, 2022), the movement can be detected at a frame rate of 60 Hz. Initially, we used both a lick sensor and videography. However, we favored videography because it could potentially provide non-binary information.
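
      The sampling argument above can be made concrete with a short sketch. The frame rate and lick frequency come from the text; the per-frame tongue score, the 0.9 threshold, and the function name `lick_onsets` are hypothetical and do not describe the actual DeepLabCut-based pipeline.

```python
FRAME_RATE_HZ = 60.0  # camera frame rate, from the text
LICK_RATE_HZ = 6.5    # approximate licking frequency, from the text

# Each lick cycle spans roughly nine video frames (60 / 6.5 is about 9.2).
frames_per_lick = FRAME_RATE_HZ / LICK_RATE_HZ

def lick_onsets(tongue_score, threshold=0.9):
    """Frame indices at which the score first rises above `threshold`."""
    above = [s > threshold for s in tongue_score]
    return [i for i in range(1, len(above)) if above[i] and not above[i - 1]]

# Hypothetical per-frame tongue-detection score: three licks, nine frames apart.
one_cycle = [0.1, 0.2, 0.95, 0.99, 0.97, 0.3, 0.1, 0.05, 0.1]
score = one_cycle * 3

onsets = lick_onsets(score)
lick_times_s = [i / FRAME_RATE_HZ for i in onsets]
```

      Because the score is continuous rather than binary, the same trace could also be used to estimate graded quantities such as how long the tongue remains visible in each lick.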

      Other Minor Points:

      (5) Ensure consistency in the citation format; both Vander Weele et al. 2018 and Weele et al. 2019, share the same first author.

      Thank you for pointing this out. EndNote processes the first author’s name differently depending on the journal. We fixed the error manually. The first paper (2018) is an original research paper, and the second one (2019) is a review about how dopamine modulates aversive processing in the mPFC. We cited the second one in three instances where we mentioned review papers.

      (6) The distinction between "dashed vs dotted lines" in Figure 3K and 3M appears to be very confusing. Please consider providing a clearer visualization/labeling to mitigate this confusion.

      We have now changed the line styles.

      (7) Additionally plotting mean polar angles of aversive/appetitive axons as vectors in the Cartesian scatter plots (2J, 3I,J) would make interpretation easier.

      We have now made this change to Figures 2, 3, 4.

      (8) Data and codes should be shared in a public database. This is important for reproducible research and we believe that "available from the corresponding author upon reasonable request" is outdated language.

      We have uploaded the data to GitHub, https://github.com/pharmedku/2024-elife-da-axon.

      Reviewer #2 (Recommendations For The Authors):

      (1) Authors don't show which mouse each axon data comes from, making it hard to know if differences arise from inter-mouse differences vs differences in axons. The best way to address this point is to show similar plots as Figure 2J & K but broken down by mouse to show whether each mouse had evidence of these two clusters.

      We have now made this change to Figure 2-figure supplement 3.

      (2) Line 166: Should this sentence point to panels 2F, G, H rather than 2I which doesn't show a shock response?

      We thank the reviewer for pointing this out. We have fixed the incorrect labels.

      (3) Line 195: The population level bias to aversive stimuli was shown previously using photometry so it is not justified to say "for the first time" regarding this statement.

      We have adjusted this sentence so that the claim of “for the first time” is not associated with the population-level bias.

      (4) The paper lacks a discussion of the potential role that novelty plays in the amplitude of the responses given that tail shocks occur less often than rewards. Is the amplitude of the first reward of the day larger than subsequent rewards? Would tail shock responses decay if they occurred in sequential trials?

      Following the reviewer's suggestion, we conducted a comparison of individual axonal responses to both conditioned and unconditioned stimuli across the first trial and subsequent trials. Our findings reveal a notable trend: aversive-preferring axons exhibited attenuation in response to CSreward, yet enhancement in response to CSaversive. Conversely, the response of these axons to USreward was attenuated, with no significant change observed for USaversive. In contrast, reward-preferring axons displayed an invariable activity pattern from the initial trial, highlighting the functional diversity present within dopamine axons. This analysis has been integrated into Figure 3-figure supplement 4 and is elaborated upon in the Discussion section.

      (5) Fix typo in Figure 1 - supplement 1. Shift

      We have now corrected this. Thank you.

      (6) The methods section needs information about trial numbers. Please indicate how many trials were presented to each mouse per day.

      We have now added the information about trial numbers to the Methods section.

      Reviewer #3 (Recommendations For The Authors):

      In line with the public review, my recommendation is for the authors to remain as objective about their data as possible. There are many points in the manuscript where the authors seem to directly contradict their own data. For example, they first detail that dopamine axons respond to unexpected water rewards. Indeed, they find that there are 40% of dopamine axons that respond in this way. Then, a few paragraphs later they state: "Previous studies have demonstrated that the overall dopamine release at the mPFC or the summed activity of mPFC dopamine axons exhibits a strong response to aversive stimuli (e.g., tail shock), but little to rewards". As detailed above, I do not think these data support an idea that dopamine axons in mPFC preferentially encode aversive outcomes. If the authors wanted to examine a role for mPFC in preferential encoding of aversive stimuli, you would first have to equate the outcomes by magnitude and then compare how the axons acquire preferences across time. Alternatively, a prediction of a more general process that I detail above would predict that you could give mice two rewards that differ in magnitude (e.g., lots of food vs. small water) and you would see the same results that the authors have seen here (i.e., a preference for the food, which is the larger and more salient outcome). Without other tests of how dopamine axons in mPFC respond to situations like this, I don't think any conclusion around mPFC in favoring aversive stimuli can be made.

      As suggested, we have made the current manuscript as objective as possible, removing interpretation aspects regarding what dopamine axons encode and emphasizing their functional diversity. In particular, we remove the word ‘encode’ when describing the response of dopamine axons.

      Although it may have appeared unclear, there was no contradiction within our data regarding the response to reward and aversive stimuli. We have now improved the readability of the Results and Methods sections. Concerning the interpretation of what exactly the mPFC dopamine axons encode, we have rewritten the discussion to be as objective about our data as possible, as suggested. We also have edited our title and abstract accordingly. Meanwhile, we wish to emphasize that our reward and aversive stimuli are standard paradigms commonly used in the field. We believe, and all the reviewers agreed, that reporting the diversity of dopamine axonal responses with a novel imaging design constitutes new insight for the neuroscience community. Therefore, we have decided to leave the introduction of new behavioral tasks for future studies and instead expanded our discussion.

      As mentioned, I think the experiments are executed really well and the technological aspects of the authors' methods are impressive. However, there are also some aspects of the data presentation that would be improved. Some of the graphs took a considerable amount of effort to unpack. For example, Figure 4 is hard going. Is there a way to better illustrate the main points that this figure wants to convey? Some of this might be helped by a more complete description in the figure captions about what the data are showing. It would also be great to see how the response of dopamine axons changes across trial within a session to the shock and water-predictive cues. Supp Figure 1 should be in the main text with standard error and analyses across time. Clarifying these aspects of the data would make the paper more relevant and accessible to the field.

      We thank the reviewer for pointing out that the legend of Figure 4 was incomplete. We have fixed it, along with improving the presentation of the figure. We have also prepared a new figure (Figure 3–figure supplement 4) to compare CSaversive and CSreward signals for the first and the remaining trials within daily sessions, revealing further functional diversity in dopamine axons. We have decided to keep Figure 1–figure supplement 2 as a figure supplement with an additional analysis, as another reviewer pointed out that the design is not completely new. Furthermore, as eLife readers can easily access figure supplements, we believe it is appropriate to maintain it in this way.

      Minor points:

      (1) What is the control period for the omission test? Was omission conducted for the shock?

      The control period for reward omission is a 2-second period just before the CS onset. We did not include shock omission, because a sufficient number of trials (> 6 trials) for the rare omission condition could not be achieved within a single day.

      (2) The authors should mention how similar the tones were that predicted water and shock.

      According to de Hoz and Nelken (2014), a frequency difference of 4–7% is enough for mice to discriminate between tones. In addition, anticipatory licking and running confirmed that the mice could discriminate between the frequencies. We have now included this information in the Discussion.

      (3) I realize the viral approach used in the current studies may not allow for an idea of where in VTA the dopamine neurons that project to mPFC are located. Are there data in the literature that speak to this? This is particularly important as we now know that there is considerable heterogeneity in dopamine neuronal responses, which is often captured by differences in medial/lateral position within VTA.

      Some studies have suggested that mesocortical dopamine neurons are located in the medial posterior VTA (e.g., Lammel et al., 2008). However, in mouse anterograde tracing, it is not possible to spatially confine the injection of conventional viruses/tracers. We now refer to Lammel et al., 2008 in the Introduction.

    2. Reviewer #2 (Public Review):

      Summary:

      This study aims to address existing differences in the literature regarding the extent of reward versus aversive dopamine signaling in the prefrontal cortex. To do so, the authors chose to present mice with both a reward and an aversive stimulus during different trials each day. The authors used high spatial resolution two-photon calcium imaging of individual dopaminergic axons in the medial PFC to characterize the response of these axons to determine the selectivity of responses in unique axons. They also paired the reward (water) and an aversive stimulus (tail shock) with auditory tones and recorded across 12 days of associative learning.

      The authors find that some axons respond to both reward and aversive unconditioned stimuli, but overall, there is a preference to respond to aversive stimuli, consistent with expectations from prior studies that used other recording methods. The authors find that both of their two auditory stimuli initially drive responses in axons, but that with training, axons develop more selective responses for the shock-associated tone, indicating that associative learning led to changes in these axons' responses. Finally, the authors use anticipatory behaviors during the conditioned stimuli and facial expressions to determine stimulus discrimination and relate dopamine axon signals to this behavioral evidence of discrimination. This study takes advantage of cutting-edge imaging approaches to resolve the extent to which dopamine axons in PFC respond to appetitive or aversive stimuli. They conclude that there is a bias to respond to the aversive tail shock in most axons and a weaker, sparser representation of water reward.

      Strengths:

      The strength of this study is the imaging approach that allows for investigation of the heterogeneity of response across individual dopamine axons unlike other common approaches such as fiber photometry which provide a measure of the average population activity. The use of appetitive and aversive stimuli to probe responses across individual axons is another strength as it reveals response diversity that is often overlooked in reward-only studies.

      Weaknesses:

      A weakness of this study is the design of the associative conditioning paradigm. The use of only a single reward and single aversive stimulus makes it difficult to know whether these results are specific to the valence of the stimuli versus the specific identity of the stimuli. Further, the reward presentations are more numerous than the aversive trials making it unclear how much novelty and habituation account for results. Moreover, the training seems somewhat limited by the low number of trials and did not result in strong associative conditioning. The lack of omission responses reported may reflect weak associative conditioning. Finally, the study provides a small advance in our understanding of dopamine signaling in the PFC and lacks evidence for if and what might be the consequence of these axonal responses on PFC dopamine concentrations and PFC neuron activity.

    3. eLife assessment

      This important study shows that distinct midbrain dopaminergic axons in the medial prefrontal cortex respond to aversive and rewarding stimuli and suggests that they are biased toward aversive processing. The use of innovative microprism-based two-photon calcium imaging to study single axon heterogeneity is convincing, although the experimental design makes it difficult to definitively distinguish aversive valence from stimulus salience in this dopamine projection. This work will be of interest to neuroscientists working on neuromodulatory systems, cortical function and decision making.

    4. Reviewer #1 (Public Review):

      Summary:

      In this manuscript, Abe and colleagues employ in vivo 2-photon calcium imaging of dopaminergic axons in the mPFC. The study reveals that these axons primarily respond to unconditioned aversive stimuli (US) and enhance their responses to initially-neutral stimuli after classical association learning. The manuscript is well-structured and presents results clearly. The utilization of a refined prism-based imaging technique, though not entirely novel, is well-implemented. The study's significance lies in its contribution to the existing literature by offering single-axon resolution functional insights, supplementing prior bulk measurements of calcium or dopamine release. Given the current focus on neuromodulator neuron heterogeneity, the work aligns well with current research trends and will greatly interest researchers in the field.

      Comment on the revised version:

      In my opinion, the authors did a great job with the revision of the manuscript.

    5. Reviewer #3 (Public Review):

      Summary:

      The authors image dopamine axons in medial prefrontal cortex (mPFC) using microprism-mediated two-photon calcium imaging. They image these axons as mice learn that two auditory cues predict two distinct outcomes, tailshock, or water delivery. They find that some axons show a preference for encoding of the shock and some show a preference for encoding of water. The authors report a greater number of dopamine axons in mPFC that respond to shock. Across time, the shock-preferring axons begin to respond preferentially to the cue predicting shock, while there is a less pronounced increase in the water-responsive axons that acquire a response to the water-predictive cue (these axons also increase non-significantly to the shock-predictive cue). These data lead the authors to argue that dopamine axons in mPFC preferentially encode aversive stimuli.

      Strengths:

      The experiments are beautifully executed and the authors have mastered an impressively complex technique. Specifically, they are able to image and track individual dopamine axons in mPFC across days of learning. And this technique is used the way it should be: the authors isolate distinct dopamine axons in mPFC and characterize their encoding preferences and how these evolve across learning of cue-shock and cue-water contingencies. Thus, these experiments reveal novel information about how aversive and rewarding stimuli are encoded at the level of individual axons, in a way that has not been done before. This is timely and important.

      Weaknesses:

      The overarching conclusion of the paper is that dopamine axons preferentially encode aversive stimuli. However, this is confounded by differences in the strength of the aversive and appetitive outcomes. As the authors point out, the axonal response to stimuli is sensitive to outcome magnitude (Supp Fig 3). That is, if you increase the magnitude of water or shock that is delivered, you increase the change in fluorescence that is seen in the axons. Unsurprisingly, the change in fluorescence that is seen to shock is considerably higher than that seen to water reward. Further, over 40% of the axons respond to water early in training [yet just a few lines below the authors write: "Previous studies have demonstrated that the overall dopamine release at the mPFC or the summed activity of mPFC dopamine axons exhibits a strong response to aversive stimuli (e.g., tail shock), but little to rewards", which seems inconsistent with their own data]. Given these aspects of the data, it could be the case that the dopamine axons in mPFC encode different types of information and delegate preferential processing to the most salient outcome across time. The use of two similar-sounding tones (9 kHz and 12 kHz) for the reward- and aversive-predicting cues is likely to enhance this, as it requires a fine-grained distinction between the two cues in order to learn effectively. That is not to say that the mice cannot distinguish between these cues, rather that they may require additional processes to resolve the similarity, which are known to be dependent on the mPFC.

      There is considerable literature on mPFC function across species that would support such a view. Specifically, theories of mPFC function (in particular prelimbic cortex, which is where the axon images are mostly taken) generally center around resolution of conflict in what to respond to, learn about, and attend to. That is, mPFC is important for devoting the most resources (learning, behavior) to the most relevant outcomes in the environment. These data, then, provide a mechanism for this to occur in mPFC. That is, dopamine axons signal to the mPFC the most salient aspects of the environment, which should be preferentially learnt about and responded towards. This is also consistent with the absence of a negative prediction error during omission: the dopamine axons show increases in responses during receipt of unexpected outcomes but do not encode negative errors. This supports a role for this projection in helping to allocate resources to the most salient outcomes and their predictors, and not learning per se. Below are just a few references from the rich literature on mPFC function (some consider rodent mPFC analogous to DLPFC, some mPFC), which advocate for a role for this region in allocating attention and cognitive resources to the most relevant stimuli, and do not indicate preferential processing of aversive stimuli.

      References:
      1. Miller, E. K., & Cohen, J. D. (2001). An integrative theory of prefrontal cortex function. Annual Review of Neuroscience, 24(1), 167-202.
      2. Bissonette, G. B., Powell, E. M., & Roesch, M. R. (2013). Neural structures underlying set-shifting: roles of medial prefrontal cortex and anterior cingulate cortex. Behavioural Brain Research, 250, 91-101.
      3. Desimone, R., & Duncan, J. (1995). Neural mechanisms of selective visual attention. Annual Review of Neuroscience, 18(1), 193-222.
      4. Sharpe, M. J., Stalnaker, T., Schuck, N. W., Killcross, S., Schoenbaum, G., & Niv, Y. (2019). An integrated model of action selection: distinct modes of cortical control of striatal decision making. Annual Review of Psychology, 70, 53-76.
      5. Ridderinkhof, K. R., Ullsperger, M., Crone, E. A., & Nieuwenhuis, S. (2004). The role of the medial frontal cortex in cognitive control. Science, 306(5695), 443-447.
      6. Nee, D. E., Kastner, S., & Brown, J. W. (2011). Functional heterogeneity of conflict, error, task-switching, and unexpectedness effects within medial prefrontal cortex. Neuroimage, 54(1), 528-540.
      7. Isoda, M., & Hikosaka, O. (2007). Switching from automatic to controlled action by monkey medial frontal cortex. Nature Neuroscience, 10(2), 240-248.

    1. Author response:

      eLife assessment

      This study provides valuable information on the mechanism of PepT2 through enhanced-sampling molecular dynamics, backed by cell-based assays, highlighting the importance of protonation of selected residues for the function of a proton-coupled oligopeptide transporter (hsPepT2). The molecular dynamics approaches are convincing, but with limitations that could be addressed in the manuscript, including lack of incorporation of a protonation coordinate in the free energy landscape, possibility of protonation of the substrate, errors with the chosen constant pH MD method for membrane proteins, dismissal of hysteresis emerging from the MEMENTO method, and the likelihood of other residues being affected by peptide binding. Some changes to the presentation could be considered, including a better description of pKa calculations and the inclusion of error bars in all PMFs. Overall, the findings will appeal to structural biologists, biochemists, and biophysicists studying membrane transporters.

      We would like to express our gratitude to the reviewers for providing their feedback on our manuscript, and also for recognising the variety of computational methods employed, the amount of sampling collected and the experimental validation undertaken. Following the individual reviewer comments, as addressed point-by-point below, we will shortly prepare a revised version of this paper. Intended changes to the revised manuscript are marked up in bold font in the detailed responses below, but before that we address some of the comments made above in the general assessment:

      • “lack of incorporation of a protonation coordinate in the free energy landscape”. We acknowledge that it would be highly desirable to treat protonation state changes explicitly and fully coupled to conformational changes. However, at this point in time, evaluating such a free energy landscape is not computationally feasible (especially considering that the non-reactive approach taken here already amounts to almost 1 ms of total sampling time). Previous reports in the literature tend to focus on either simpler systems or a reduced subset of a larger problem. As we were trying to obtain information on the whole transport cycle, we decided to focus here on non-reactive methods.

      • “possibility of protonation of the substrate”. The reviewers are correct in pointing out this possibility, which we had not discussed explicitly in our manuscript. Briefly, while we describe a mechanism in which protonation of only protein residues (with an unprotonated ligand) can account for driving all the necessary conformational changes of the transport cycle, there is some evidence for a further intermediate protonation site in our data (as we commented on in the first version of the manuscript as well), which may or may not be the substrate itself. A future explicit treatment of the proton movements through the transporter, once it becomes computationally tractable, will have to include the substrate as a possible protonation site; for the present, we will amend our discussion to alert the reader to the possibility that the substrate could be an intermediate in proton transport. This has repercussions for our study of the E56 pKa value, where – if protons reside with a significant population at the substrate C-terminus – our calculated shift in pKa upon substrate binding could be an overestimate, although we would qualitatively expect the direction of shift to be unaffected. However, we also anticipate that treating this potential coupling explicitly would make convergence of any CpHMD calculation impractical to achieve, and thus for now only a semi-quantitative conclusion may be obtainable.

      • “errors with the chosen constant pH MD method for membrane proteins”. We acknowledge that – as reviewer #1 has reminded us – the AMBER implementation of hybrid-solvent CpHMD is not rigorous for membrane proteins, and as such we will add a cautionary note to our paper. We will also explain how the use of the ABFE thermodynamic cycle calculations helps to validate the CpHMD results in a completely orthogonal manner (we will promote this validation which was in the supplementary figures into the main text in the revised version). We therefore remain reasonably confident in the results presented with regards to the reported pKa shift of E56 upon substrate binding, and suggest that if the impact of neglecting the membrane in the implicit-solvent stage of CpHMD is significant, then there is likely an error cancellation when considering shifts induced by the incoming substrate.

      • “dismissal of hysteresis emerging from the MEMENTO method”. We have shown in our method design paper how the use of the MEMENTO method drastically reduces hysteresis compared to steered MD and metadynamics for path generation, and find this improvement again for PepT2 in this study. We will address reviewer #3’s concern about our presentation on this point by revising our introduction of the MEMENTO method, as detailed in the response below.

      • “the likelihood of other residues being affected by peptide binding”. In this study, we have investigated in detail the involvement of several residues in proton-coupled di-peptide transport by PepT2. Short of the potential intermediate protonation site mentioned above, the set of residues we investigate form a minimal set of sorts within which the important driving forces of alternating access can be rationalised. We have not investigated in substantial detail here the residues involved in holding the peptide in the binding site, as they are well studied in the literature and ligand promiscuity is not the problem of interest here. It remains entirely possible that further processes contribute to the mechanism of driving conformational changes by involving other residues not considered in this paper. We will make our speculation that an ensemble of different processes may be contributing simultaneously more explicit in our revision, but do not believe any of our conclusions would be affected by this.

      As for the additional suggested changes in presentation, we will provide the requested details on the CpHMD analysis. Furthermore, we will use the convergence data presented separately in figures S12 and S16 to include error bars on our 1D-reprojections of the 2D-PMFs in figures 3, 4 and 5. (Note that we will opt not to do so in figures S10 and S15, which collate all 1D PMF reprojections for the OCC ↔ OF and OCC ↔ IF transitions in single reference plots, respectively, to avoid overcrowding those necessarily busy figures). We are also changing the colour schemes of these plots in our revision to improve accessibility.

      Reviewer #1 (Public Review):

      The authors have performed all-atom MD simulations to study the working mechanism of hsPepT2. It is widely accepted that conformational transitions of proton-coupled oligopeptide transporters (POTs) are linked with gating hydrogen bonds and salt bridges involving protonatable residues, whose protonation triggers gate openings. Through unbiased MD simulations, the authors identified extra-cellular (H87 and D342) and intra-cellular (E53 and E622) triggers. The authors then validated these triggers using free energy calculations (FECs) and assessed the engagement of the substrate (Ala-Phe dipeptide). The linkage of substrate release with the protonation of the ExxER motif (E53 and E56) was confirmed using constant-pH molecular dynamics (CpHMD) simulations and cell-based transport assays. An alternating-access mechanism was proposed. The study was largely conducted properly, and the paper was well-organized. However, I have a couple of concerns for the authors to consider addressing.

      We would like to note here that it may be slightly misleading to the reader to state that “The linkage of substrate release with the protonation of the ExxER motif (E53 and E56) was confirmed using constant-pH molecular dynamics (CpHMD) simulations and cell-based transport assays.” The cell-based transport assays confirmed the importance of the extracellular gating trigger residues H87, S321 and D342 (as mentioned in the preceding sentence), not of the substrate-protonation link as this line might be understood to suggest.

      (1) As a proton-coupled membrane protein, the conformational dynamics of hsPepT2 are closely coupled to protonation events of gating residues. Instead of using semi-reactive methods like CpHMD or reactive methods such as reactive MD, where the coupling is accounted for, the authors opted for extensive non-reactive regular MD simulations to explore this coupling. Note that I am not criticizing the choice of methods, and I think those regular MD simulations were well-designed and conducted. But I do have two concerns.

      a) Ideally, proton-coupled conformational transitions should be modelled using a free energy landscape with two or more reaction coordinates (or CVs), with one describing the protonation event and the other describing the conformational transitions. The minimum free energy path then illustrates the reaction progress, such as OCC/H87D342- → OCC/H87HD342H → OF/H87HD342H as displayed in Figure 3.

      We concur with the reviewer that the ideal way of describing the processes studied in our paper would be as higher-dimensional free energy landscapes obtained from a simulation method that can explicitly model proton-transfer processes. Indeed, it would have been particularly interesting and potentially informative with regards to the movement of protons down into the transporter in the OF → OCC → IF sequence of transitions. As we note in our discussion on the H87→E56 proton transfer:

      “This could be investigated using reactive MD or QM/MM simulations (both approaches have been employed for other protonation steps of prokaryotic peptide transporters, see Parker et al. (2017) and Li et al. (2022)). However, the putative path is very long (≈ 1.7 nm between H87 and E56) and may or may not involve a large number of intermediate protonatable residues, in addition to binding site water. While such an investigation is possible in principle, it is beyond the scope of the present study.”

      Where even sampling the proton transfer step itself in an essentially static protein conformation would be pushing the boundaries of what has been achieved in the field, we believe that considering the current state-of-the-art, a fully coupled investigation of large-scale conformational changes and proton-transfer reaction is not yet feasible in a realistic/practical time frame. We also note this limitation already when we say that:

      “The question of whether proton binding happens in OCC or OF warrants further investigation, and indeed the co-existence of several mechanisms may be plausible here”.

      Nonetheless, we are actively exploring approaches to treat uptake and movement of protons explicitly for future work.

      In our revision, we will expand on our discussion of the reasoning behind employing a non-reactive approach and the limitations that this imposes on the questions that can be answered in this study.

      Without including the protonation as a CV, the authors tried to model the free energy changes from multiple FECs using different charge states of H87 and D342. This is a practical workaround, and the conclusion drawn (the OCC→ OF transition is downhill with protonated H87 and D342) seems valid. However, I don't think the OF states with different charge states (OF/H87D342-, OF/H87HD342-, OF/H87D342H, and OF/H87HD342H) are equally stable, as plotted in Figure 3b. The concern extends to other cases like Figures 4b, S7, S10, S12, S15, and S16. While it may be appropriate to match all four OF states in the free energy plot for comparison purposes, the authors should clarify this to ensure readers are not misled.

      The reviewer is correct in their assessment that the aligning of PMFs in these figures is arbitrary; no relative free energies of the PMFs to each other can be estimated without explicit free energy calculations at least of protonation events at the end state basins. The PMFs in our figures are merely superimposed for illustrating the differences in shape between the obtained profiles in each condition, as discussed in the text, and we will make this clear in the appropriate figure captions in our revision.

      b) Regarding the substrate impact, it appears that the authors assumed fixed protonation states. I am afraid this is not necessarily the case. Variations in PepT2 stoichiometry suggest that substrates likely participate in proton transport, like the Phe-Ala (2:1) and Phe-Gln (1:1) dipeptides mentioned in the introduction. And it is not rigorous to assume that the N- and C-termini of a peptide do not protonate/deprotonate when transported. I think the authors should explicitly state that the current work and the proposed mechanism (Figure 8) are based on the assumption that the substrates do not uptake/release proton(s).

      This is indeed an assumption inherent in the current work. While we do “speculate that the proton movement processes may happen as an ensemble of different mechanisms, and potentially occur contemporaneously with the conformational change” we do not in the current version indicate explicitly that this may involve the substrate. We will make clear the assumption and this possibility in the revised version of our paper. Indeed, as we discuss, there is some evidence in our PMFs of an additional protonation site not considered thus far, which may or may not be the substrate. We will make note of this point in the revised manuscript.

      As for what information can be drawn from the given experimental stoichiometries, we note in our paper that “a 2:1 stoichiometry was reported for the neutral di-peptide D-Phe-L-Ala and 3:1 for anionic D-Phe-L-Glu. (Chen et al., 1999) Alternatively, Fei et al. (1999) have found 1:1 stoichiometries for either of D-Phe-L-Gln (neutral), D-Phe-L-Glu (anionic), and D-Phe-L-Lys (cationic).”

      We do not assume that it is our place to arbitrate among the apparent discrepancies in the experimental data here, although we believe that our assumed 2:1 stoichiometry is additionally “motivated also by our computational results that indicate distinct and additive roles played by two protons in the conformational cycle mechanism”.

      (2) I have more serious concerns about the CpHMD employed in the study.

      a) The CpHMD in AMBER is not rigorous for membrane simulations. The underlying generalized Born model fails to consider the membrane environment when updating charge states. In other words, the CpHMD places a membrane protein in a water environment to judge if changes in charge states are energetically favorable. While this might not be a big issue for peripheral residues of membrane proteins, it is likely unphysical for internal residues like the ExxER motif. As I recall, the developers have never used the method to study membrane proteins themselves. The only CpHMD variant suitable for membrane proteins is the membrane-enabled hybrid-solvent CpHMD in CHARMM. While I do not expect the authors to redo their CpHMD simulations, I do hope the authors recognize the limitations of their method.

      We will discuss the limitations of the AMBER CpHMD implementation in the revised version. However, despite that, we believe we have in fact provided sufficient grounds for our conclusion that substrate binding affects ExxER motif protonation in the following way:

      In addition to CpHMD simulations, we establish the same effect via ABFE calculations, where the substrate affinity is different at the E56 deprotonated vs protonated protein. This is currently figure S20, though in the revised version we will move this piece of validation into a new panel of figure 6 in the main text, since it becomes more important with the CpHMD membrane problem in mind. Since the ABFE calculations are conducted with an all-atom representation of the lipids and the thermodynamic cycle closes well, it would appear that if the chosen CpHMD method has a systematic error of significant magnitude for this particular membrane protein system, there may be the benefit of error cancellation. While the calculated absolute pKa values may not be reliable, the difference made by substrate binding appears to be so, as judged by the orthogonal ABFE technique.
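
      The link between the two binding free energies and the pKa shift follows from closing the thermodynamic cycle: pKa(holo) − pKa(apo) = (ΔG_bind(deprotonated) − ΔG_bind(protonated)) / (RT ln 10). The short sketch below only illustrates that relation. The function name, the temperature of 298.15 K, and the example free energies are assumptions chosen for illustration; they are not values or code from the manuscript.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def pka_shift_from_binding(dg_bind_deprot, dg_bind_prot, temperature=298.15):
    """Return pKa(holo) - pKa(apo) implied by a closed thermodynamic cycle.

    dg_bind_deprot, dg_bind_prot: substrate binding free energies (kcal/mol)
    to the deprotonated and protonated protein, respectively.
    temperature: assumed simulation temperature in kelvin (hypothetical default).
    """
    rt_ln10 = R_KCAL * temperature * math.log(10.0)
    return (dg_bind_deprot - dg_bind_prot) / rt_ln10


# Hypothetical illustrative numbers, NOT results from the study
shift = pka_shift_from_binding(dg_bind_deprot=-8.0, dg_bind_prot=-10.0)
print(f"pKa shift on substrate binding: {shift:+.2f}")
```

      A positive value means the substrate binds more tightly to the protonated protein, so binding raises the pKa of the titrating residue; equal affinities give no shift.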

      Although the reviewer does “not expect the authors to redo their CpHMD simulations”, we consider that it may be helpful to the reader to share in this response some results from trials using the continuous, all-atom constant pH implementation that has recently become available in GROMACS (Aho et al 2022, https://pubs.acs.org/doi/10.1021/acs.jctc.2c00516) and can be used rigorously with membrane proteins, given its all-atom lipid representation.

      Unfortunately, when trying to titrate E56 in this CpHMD implementation, we found few protonation-state transitions taking place, and the system often got stuck in protonation state–local conformation coupled minima (which need to interconvert through rearrangements of the salt bridge network involving slow side-chain dihedral rotations in E53, E56 and R57). Author response image 1 shows this for the apo OF state, and Author response image 2 shows how noisy attempts at pKa estimation from this data turn out to be, necessitating the use of a hybrid-solvent method.

      Author response image 1.

      All-atom CpHMD simulations of apo-OF PepT2. Red indicates protonated E56, blue is deprotonated.

      Author response image 2.

      Difficulty in calculating the E56 pKa value from the noisy all-atom CpHMD data shown in Author response image 1
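
      For readers unfamiliar with how a pKa is usually extracted from constant-pH titration data, a generic sketch follows. It assumes the common approach of fitting the deprotonated fraction s at each pH to the Hill form of the Henderson–Hasselbalch equation, log10(s / (1 − s)) = n (pH − pKa), here by an unweighted linear least-squares fit. The function name and the synthetic data are hypothetical; this is not the analysis pipeline used for the images above, nor part of any AMBER or GROMACS tool.

```python
import math


def fit_pka_hill(ph_values, deprot_fractions):
    """Least-squares fit of log10(s / (1 - s)) = n * (pH - pKa).

    Returns (pKa, n). Points with s of exactly 0 or 1 are skipped because
    the logit is undefined there (no transitions were sampled at that pH).
    """
    xs, ys = [], []
    for ph, s in zip(ph_values, deprot_fractions):
        if 0.0 < s < 1.0:
            xs.append(ph)
            ys.append(math.log10(s / (1.0 - s)))
    if len(xs) < 2:
        raise ValueError("need at least two pH points with 0 < s < 1")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    n = sxy / sxx               # Hill coefficient (slope of the Hill plot)
    pka = mean_x - mean_y / n   # pH at which the logit crosses zero
    return pka, n


# Synthetic, noise-free titration curve (hypothetical pKa of 6.5, n of 1.0)
ph_grid = [4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
fractions = [1.0 / (1.0 + 10.0 ** (6.5 - ph)) for ph in ph_grid]
print(fit_pka_hill(ph_grid, fractions))
```

      When few protonation-state transitions are sampled, the fractions at individual pH values are poorly determined (or pinned at 0 or 1), which is why a fit of this kind becomes unreliable for data such as those in Author response image 1.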

      b) It appears that the authors did not make the substrate (Ala-Phe dipeptide) protonatable in holo-simulations. This oversight prevents a complete representation of ligand-induced protonation events, particularly given that the substrate ion pairs with hsPepT2 through its N- & C-termini. I believe it would be valuable for the authors to acknowledge this potential limitation.

      In this study, we implicitly assumed from the outset that the substrate does not get protonated, which, as noted in our response to the comment above, we will acknowledge explicitly in revision. This potential limitation for the available mechanisms for proton transfer also applies to our investigation of the ExxER protonation states. In particular, a semi-grand canonical ensemble that takes into account the possibility of substrate C-terminus protonation may also sample states in which the substrate is protonated and oriented away from R57, thus leaving the ExxER salt bridge network in an apo-like state. The consequence would be that while the direction of shift in E56 pKa value will be the same, our CpHMD may overestimate its magnitude. It would thus be interesting to make the C-terminus protonatable for obtaining better quantitative estimates of the E56 pKa shift (as is indeed true in general for any other protein protonatable residue, though the effects are usually assumed to be negligible). We do note, however, that convergence of the CpHMD simulations would be much harder if the slow degree of freedom of substrate reorientation (which in our experience takes 10s to 100s of ns in this binding pocket) needs to be implicitly equilibrated upon protonation state transitions. We will discuss such considerations in the revision.

      Reviewer #2 (Public Review):

      This is an interesting manuscript that describes a series of molecular dynamics studies on the peptide transporter PepT2 (SLC15A2). The authors examine, in particular, the effect on the transport cycle of protonation of various charged amino acids within the protein. They then validate their conclusions by mutating two of the residues that they predict to be critical for transport in cell-based transport assays. The study suggests a series of protonation steps that are necessary for transport to occur in PepT2. Comparison with bacterial proteins from the same family shows that while the overall architecture of the proteins and likely mechanism are similar, the residues involved in the mechanism may differ.

      Strengths:

      This is an interesting and rigorous study that uses various state-of-the-art molecular dynamics techniques to dissect the transport cycle of PepT2 with nearly 1 ms of sampling. It gives insight into the transport mechanism, investigating how the protonation of selected residues can alter the energetic barriers between various states of the transport cycle. The authors have, in general, been very careful in their interpretation of the data.

      Weaknesses:

      Interestingly, they suggest that there is an additional protonation event that may take place as the protein goes from occluded to inward-facing but they have not identified this residue.

      We have indeed suggested that there may be an additional protonation site involved in the conformational cycle that we have not been able to capture, which – as we discuss in our paper – might be indicated by the shapes of the OCC ↔ IF PMFs given in Figure S15. One possibility is for this to be the substrate itself (see the response to reviewer #1 above) though within the scope of this study the precise pathway by which protons move down the transporter and the exact ordering of conformational change and proton transfer reactions remains a (partially) open question. We acknowledge this and denote it with question marks in the mechanistic overview we give in Figure 8, and also “speculate that the proton movement processes may happen as an ensemble of different mechanisms, and potentially occur contemporaneously with the conformational change”.

      Some things are a little unclear. For instance, where does the state that they have defined as occluded sit on the diagram in Figure 1a? - is it truly the occluded state as shown on the diagram or does it tend to inward- or outward-facing?

      Figure 1a is a simple schematic overview intended to show which structures of PepT2 homologues are available to use in simulations. This was not meant to be a quantitative classification of states. Nonetheless, we can note that the OCC state we derived has extra- and intracellular gate opening distances (as measured by the simple CVs defined in the methods and illustrated in Figure 2a) that indicate full gate closure at both sides. In particular, although it was derived from the IF state via biased sampling, the intracellular gate opening distance in the OCC state used for our conformational change enhanced sampling was comparable to that of the OF state (i.e., full closure of the gate), see Figure S2b and the grey bars therein. Therefore, we would schematically classify the OCC state to lie at the center of the diagram in Figure 1a. Furthermore, it is largely stable over triplicates of 1 μs-long unbiased MD, where in 2/3 replicates the gates remain stable, and in the remaining replicate there is partial opening of the intracellular gate (as shown in Figure 2 b/c under the “apo standard” condition). We comment on this in the main text by saying that “The intracellular gate, by contrast, is more flexible than the extracellular gate even in the apo, standard protonation state”, and link it to the lower barrier for transition to IF than to OF. We did this by saying that “As for the OCC↔OF transitions, these results explain the behaviour we had previously observed in the unbiased MD of Figure 2c.” We acknowledge this was not sufficiently clear and will add details to the latter sentence in revision to better clarify the nature of the occluded state.

      The pKa calculations and their interpretation are a bit unclear. Firstly, it is unclear whether they are using all the data in the calculations of the histograms, or just selected data and if so on what basis was this selection done. Secondly, they dismiss the pKa calculations of E53 in the outward-facing form as not being affected by peptide binding but say that E56 is when there seems to be a similar change in profile in the histograms.

      In our manuscript, we have provided two distinct analyses of the raw CpHMD data. Firstly, we analysed the data by the replicates in which our simulations were conducted (Figure 6, shown as bar plots with mean from triplicates +/- standard deviation), where we found that only the effect on E56 protonation was distinct as lying beyond the combined error bars. This analysis uses the full amount of sampling conducted for each replicate. However, since we found that the range of pKa values estimated from 10ns/window chunks was larger than the error bars obtained from the replicate analysis (Figures S17 and S18), we sought to verify our conclusion by pooling all chunk estimates and plotting histograms (Figure S19). We recover from those the effect of substrate binding on the E56 protonation state on both the OF and OCC states. However, as the reviewer has pointed out (something we did not discuss in our original manuscript), there is a shift in the pKa of E53 of the OF state only. In fact, the trend is also apparent in the replicate-based analysis of Figure 6, though here the larger error bars overlap. In our revision, we will add more details of these analyses for clarity (including more detailed figure captions regarding the data used in Figure 6) as well as a discussion of the partial effect on the E53 pKa value.
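      The chunk-wise pKa estimation discussed above can be illustrated with a minimal sketch. It assumes that each chunk yields a deprotonated fraction per pH window and that a pKa is obtained from the linearised Hill (generalised Henderson-Hasselbalch) equation; this is a common way to analyse CpHMD titrations, not necessarily the exact procedure used in the manuscript. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def fit_pka_hill(ph_values, deprot_fraction, eps=1e-6):
    """Estimate (pKa, Hill coefficient n) from deprotonated fractions.

    Uses the linearised Hill equation:
        log10(S / (1 - S)) = n * (pH - pKa)
    where S is the deprotonated fraction at each pH window.
    """
    ph = np.asarray(ph_values, dtype=float)
    # Clip to avoid log of zero when a window is fully (de)protonated.
    s = np.clip(np.asarray(deprot_fraction, dtype=float), eps, 1.0 - eps)
    y = np.log10(s / (1.0 - s))
    n, intercept = np.polyfit(ph, y, 1)  # y = n * pH + intercept
    return -intercept / n, n

def pka_per_chunk(ph_values, chunked_fractions):
    """One pKa estimate per chunk (rows = chunks, columns = pH windows)."""
    return np.array([fit_pka_hill(ph_values, c)[0] for c in chunked_fractions])

# Synthetic, noise-free titration data (illustrative values only).
ph_windows = np.arange(2.0, 9.0, 1.0)

def ideal_curve(pka, n=1.0):
    return 1.0 / (1.0 + 10.0 ** (n * (pka - ph_windows)))

pka_est, n_est = fit_pka_hill(ph_windows, ideal_curve(5.5))
chunk_pkas = pka_per_chunk(ph_windows, [ideal_curve(5.0), ideal_curve(6.0)])
spread = chunk_pkas.max() - chunk_pkas.min()
```

      Comparing the spread of per-chunk estimates with the standard deviation across replicates is one way to see whether replicate error bars understate the uncertainty, which is the concern that motivated pooling the chunk estimates into histograms.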

      We do not believe, however, that our key conclusions are negatively affected. If anything, a further effect on the E53 pKa which we had not previously commented on (since we saw the evidence as weaker, pertaining to only one conformational state) would strengthen the case for an involvement of the ExxER motif in ligand coupling.

      Reviewer #3 (Public Review):

      Summary:

      Lichtinger et al. have used an extensive set of molecular dynamics (MD) simulations to study the conformational dynamics and transport cycle of an important member of the proton-coupled oligopeptide transporters (POTs), namely SLC15A2 or PepT2. This protein is one of the most well-studied mammalian POT transporters and provides a good model, with enough insight and structural information, to be studied computationally using the advanced enhanced sampling methods employed in this work. The authors have used microsecond-level MD simulations, constant-pH MD, and alchemical binding free energy calculations along with cell-based transport assay measurements; however, the most important part of this work is the use of enhanced sampling techniques to study the conformational dynamics of PepT2 under different conditions.

      The study attempts to identify links between conformational dynamics and chemical events such as proton binding, ligand-protein interactions, and intramolecular interactions. The ultimate goal is of course to understand the proton-coupled peptide and drug transport by PepT2 and homologous transporters in the solute carrier family.

      Some of the key results include:

      (1) Protonation of H87 and D342 initiates the occluded (Occ) to outward-facing (OF) state transition.

      (2) In the OF state, through engaging R57, substrate entry increases the pKa value of E56 and thermodynamically facilitates the movement of protons further down.

      (3) E622 is not only essential for peptide recognition but also its protonation facilitates substrate release and contributes to the intracellular gate opening. In addition, cell-based transport assays show that mutation of residues such as H87 and D342 significantly decreases transport activity as expected from simulations.

      Strengths:

      (1) This is an extensive MD-based study of PepT2, which is beyond the typical MD studies both in terms of the sheer volume of simulations as well as the advanced methodology used. The authors have not limited themselves to one approach and have appropriately combined equilibrium MD with alchemical free energy calculations, constant-pH MD, and geometry-based free energy calculations. Each of these 4 methods provides a unique insight regarding the transport mechanism of PepT2.

      (2) The authors have not limited themselves to computational work and have performed experiments as well. The cell-based transport assays clearly establish the importance of the residues that have been identified as significant contributors to the transport mechanism using simulations.

      (3) The conclusions made based on the simulations are mostly convincing and provide useful information regarding the proton pathway and the role of important residues in proton binding, protein-ligand interaction, and conformational changes.

      Weaknesses:

      (1) Some of the statements made in the manuscript are not convincing and do not abide by the standards that are mostly followed in the manuscript. For instance, on page 4, it is stated that "the K64-D317 interaction is formed in only ≈ 70% of MD frames and therefore is unlikely to contribute much to extracellular gate stability." I do not agree that 70% is negligible. Particularly, Figure S3 does not include the time series so it is not clear whether the 30% of the time where the salt bridge is broken is in the beginning or the end of simulations. For instance, it is likely that the salt bridge is not initially present and then it forms very strongly. Of course, this is just one possible scenario but the point is that Figure S3 does not rule out the possibility of a significant role for the K64-D317 salt bridge.

      The reviewer is right to point out that the statement and Figure S3 as they stand do not adequately support our decision to exclude the K64-D317 salt-bridge in our further investigations. The violin plot shown in Figure S3, visualised as pooled data from unbiased 1 μs triplicates, does indeed not rule out a scenario where the salt bridge only formed late in our simulations (or only in some replicates), but then is stable. Therefore, in our revision, we will include the appropriate time-series of the salt bridge distances, showing how K64-D317 is initially stable but then falls apart in replicate 1, and is transiently formed and disengaged across the trajectories in replicates 2 and 3. We will also remake the data for this plot as we discovered a bug in the relevant analysis script that meant the D170-K642 distance was not calculated accurately. The results are however almost identical, and our conclusions remain.
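      The reviewer's point, that a pooled occupancy figure cannot distinguish a late-forming but stable salt bridge from one that flickers throughout, can be made concrete with a short sketch. It assumes a per-frame donor-acceptor distance trace is available as an array; the 4.0 Å cutoff, the function names, and the synthetic traces are illustrative assumptions and are not taken from the manuscript.

```python
import numpy as np

def saltbridge_occupancy(distances, cutoff=4.0):
    """Fraction of frames in which the salt-bridge distance is below the cutoff."""
    d = np.asarray(distances, dtype=float)
    return float(np.mean(d < cutoff))

def block_occupancy(distances, n_blocks=10, cutoff=4.0):
    """Occupancy in consecutive time blocks, i.e. a coarse time series."""
    d = np.asarray(distances, dtype=float)
    return np.array([np.mean(b < cutoff) for b in np.array_split(d, n_blocks)])

# Two synthetic 1000-frame traces (distances in angstrom, assumed values).
# Both have 70% overall occupancy but very different histories.
late_forming = np.concatenate([np.full(300, 8.0), np.full(700, 3.0)])
flickering = np.tile(np.concatenate([np.full(3, 8.0), np.full(7, 3.0)]), 100)

occ_late = saltbridge_occupancy(late_forming)
occ_flicker = saltbridge_occupancy(flickering)
blocks_late = block_occupancy(late_forming)
blocks_flicker = block_occupancy(flickering)
```

      The two traces give the same pooled occupancy, while the block-wise series separates them: the first is absent early and fully formed later, the second sits near 70% in every block. Per-replicate time series of this kind carry the information that a pooled violin plot hides.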

      (2) Similarly, on page 4, it is stated that "whether by protonation or mutation - the extracellular gate only opens spontaneously when both the H87 interaction network and D342-R206 are perturbed (Figure S5)." I do not agree with this assessment. The authors need to be aware of the limitations of this approach. Consider "WT H87-prot" and "D342A H87-prot": when D342 residue is mutated, in one out of 3 simulations, we see the opening of the gate within 1 us. When D342 residue is not mutated we do not see the opening in any of the 3 simulations within 1 us. It is quite likely that if rather than 3 we have 10 simulations or rather than 1 us we have 10 us simulations, the 0/3 to 1/3 changes significantly. I do not find this argument and conclusion compelling at all.

      If the conclusions were based on that alone, then we would agree. However, this section of work covers merely the observations of the initial unbiased simulations which we go on to test/explore with enhanced sampling in the rest of the paper, and which then lead us to the eventual conclusions.

      Figure S5 shows the results from triplicate 1 μs-long trajectories as violin-plot histograms of the extracellular gate opening distance, also indicating the first and final frames of the trajectories as connected by an arrow for orientation – a format we chose for intuitively comparing 48 trajectories in one plot. The reviewer reads the plot correctly when they analyse the “WT H87-prot” vs “D342A H87-prot” conditions. In the former case, no spontaneous opening in unbiased MD is taking place, whereas when D342 is mutated to alanine in addition to H87 protonation, we see spontaneous transition in 1 out of 3 replicates. However, the reviewer does not seem to interpret the statement in question in our paper (“the extracellular gate only opens spontaneously when both the H87 interaction network and D342-R206 are perturbed”) in the way we intended it to be understood. We merely want to note here a correlation in the unbiased dataset we collected at this stage, and indeed the one spontaneous opening in the case comparison picked out by the reviewer is in the condition where both the H87 interaction network and D342-R206 are perturbed. In noting this we do not intend to make statistically significant statements from the limited dataset. Instead, we write that “these simulations show a large amount of stochasticity and drawing clean conclusions from the data is difficult”. We do however stand by our assessment that from this limited data we can “already appreciate a possible mechanism where protons move down the transporter pore” – a hypothesis we investigate more rigorously with enhanced sampling in the rest of the paper. We will revise the section in question to make clearer that the unbiased MD is only meant to give an initial hypothesis here to be investigated in more detail in the following sections. In doing so, we will also incorporate, as we had not done before, the case (not picked out by the reviewer here but concerning the same figure) of S321A & H87 prot. In the third replicate, this shows partial gate opening towards the end of the unbiased trajectory (despite D342 not being affected), highlighting further the stochastic nature that makes even clear correlative conclusions difficult to draw.

      (3) While the MEMENTO methodology is novel and interesting, the method is presented as flawless in the manuscript, which is not true at all. It is stated on Page 5 with regards to the path generated by MEMENTO that "These paths are then by definition non-hysteretic." I think this is too big of a claim to say the paths generated by MEMENTO are non-hysteretic by definition. This claim is not even mentioned in the original MEMENTO paper. What is mentioned is that linear interpolation generates a hysteresis-free path by definition. There are two important problems here: (a) MEMENTO uses the linear interpolation as an initial step but modifies the intermediates significantly later so they are no longer linearly interpolated structures and thus the path is no longer hysteresis-free; (b) a more serious problem is the attribution of by-definition hysteresis-free features to the linearly interpolated states. This is based on conflating the hysteresis-free and unique concepts. The hysteresis in MD-based enhanced sampling is related to the presence of barriers in orthogonal space. For instance, one may use a non-linear interpolation of any type and get a unique pathway, which could be substantially different from the one coming from the linear interpolation. None of these paths will necessarily be hysteresis-free once subjected to MD-based enhanced sampling techniques.

      We certainly do not intend to claim that the MEMENTO method is flawless. The concern the reviewer raises around the statement "These paths are then by definition non-hysteretic" is perhaps best addressed by a clarification of the language used and considering how MEMENTO is applied in this work.

      Hysteresis in the most general sense denotes the dependence of a system on its history, or – more specifically – the lagging behind of the system state with regards to some physical driver (for example the external field in magnetism, whence the term originates). In the context of biased MD and enhanced sampling, hysteresis commonly denotes the phenomenon where a path created by a biased dynamics method along a certain collective variable lags behind in phase space in slow orthogonal degrees of freedom (see Figure 1 in Lichtinger and Biggin 2023, https://doi.org/10.1021/acs.jctc.3c00140). When used to generate free energy profiles, this can manifest as starting state bias, where the conformational state that was used to seed the biased dynamics appears lower in free energy than alternative states. Figure S6 shows this effect on the PepT2 system for both steered MD (heavy atom RMSD CV) + umbrella sampling (tip CV) and metadynamics (tip CV). There is, in essence, a coupled problem: without an appropriate CV (which we did not have to start with here), path generation that is required for enhanced sampling displays hysteresis, but the refinement of CVs is only feasible when paths connecting the true phase space basins of the two conformations are available. MEMENTO helps solve this issue by reconstructing protein conformations along morphing paths which perform much better than steered MD paths with respect to giving consistent free energy profiles (see Figure S7 and the validation cases in the MEMENTO paper), even if the same CV is used in umbrella sampling.

      There are still differences between replicates in those PMFs, indicating slow conformational flexibility propagated from end-state sampling through MEMENTO. We use this to refine the CVs further with dimensionality reduction (see the Methods section and Figure S8), before moving to 2D-umbrella sampling (Figure 3). Here, we think, the reviewer’s point does have some bearing. The MEMENTO paths are ‘non-hysteretic by definition’ with respect to given end states in the sense that they connect (by definition) the correct conformations at both end-states (unlike steered MD), which in enhanced sampling manifests as the absence of the strong starting-state bias we had previously observed (Figure S7 vs S6). They are not, however, hysteresis-free with regards to how representative of the end-state conformational flexibility the structures given to MEMENTO really were, which is where the iterative CV design and combination of several MEMENTO paths in 2D-PMFs comes in.

      We also cannot make a direct claim about whether in the transition region the MEMENTO paths might be separated from the true (lower free energy) transition paths by slow orthogonal degrees of freedom, which may conceivably result in overestimated barrier heights separating two free energy basins. We cannot guarantee that this is not the case, but neither in our MEMENTO validation examples nor in this work have we encountered any indications of a problem here.

      We hope that the reviewer will be satisfied by our revision, where we will replace the wording in question by a statement that the MEMENTO paths do not suffer from hysteresis that is otherwise incurred as a consequence of not reaching the correct target state in the biased run (in some orthogonal degrees of freedom).

    2. eLife assessment

      This study provides valuable information on the mechanism of PepT2 through enhanced-sampling molecular dynamics, backed by cell-based assays, highlighting the importance of protonation of selected residues for the function of a proton-coupled oligopeptide transporter (hsPepT2). The molecular dynamics approaches are convincing, but with limitations that could be addressed in the manuscript, including lack of incorporation of a protonation coordinate in the free energy landscape, possibility of protonation of the substrate, errors with the chosen constant pH MD method for membrane proteins, dismissal of hysteresis emerging from the MEMENTO method, and the likelihood of other residues being affected by peptide binding. Some changes to the presentation could be considered, including a better description of pKa calculations and the inclusion of error bars in all PMFs. Overall, the findings will appeal to structural biologists, biochemists, and biophysicists studying membrane transporters.

    3. Reviewer #1 (Public Review):

      The authors have performed all-atom MD simulations to study the working mechanism of hsPepT2. It is widely accepted that conformational transitions of proton-coupled oligopeptide transporters (POTs) are linked with gating hydrogen bonds and salt bridges involving protonatable residues, whose protonation triggers gate openings. Through unbiased MD simulations, the authors identified extra-cellular (H87 and D342) and intra-cellular (E53 and E622) triggers. The authors then validated these triggers using free energy calculations (FECs) and assessed the engagement of the substrate (Ala-Phe dipeptide). The linkage of substrate release with the protonation of the ExxER motif (E53 and E56) was confirmed using constant-pH molecular dynamics (CpHMD) simulations and cell-based transport assays. An alternating-access mechanism was proposed. The study was largely conducted properly, and the paper was well-organized. However, I have a couple of concerns for the authors to consider addressing.

      (1) As a proton-coupled membrane protein, the conformational dynamics of hsPepT2 are closely coupled to protonation events of gating residues. Instead of using semi-reactive methods like CpHMD or reactive methods such as reactive MD, where the coupling is accounted for, the authors opted for extensive non-reactive regular MD simulations to explore this coupling. Note that I am not criticizing the choice of methods, and I think those regular MD simulations were well-designed and conducted. But I do have two concerns.

      a) Ideally, proton-coupled conformational transitions should be modelled using a free energy landscape with two or more reaction coordinates (or CVs), with one describing the protonation event and the other describing the conformational transitions. The minimum free energy path then illustrates the reaction progress, such as OCC/H87D342- → OCC/H87HD342H → OF/H87HD342H as displayed in Figure 3. Without including the protonation as a CV, the authors tried to model the free energy changes from multiple FECs using different charge states of H87 and D342. This is a practical workaround, and the conclusion drawn (the OCC→OF transition is downhill with protonated H87 and D342) seems valid. However, I don't think the OF states with different charge states (OF/H87D342-, OF/H87HD342-, OF/H87D342H, and OF/H87HD342H) are equally stable, as plotted in Figure 3b. The concern extends to other cases like Figures 4b, S7, S10, S12, S15, and S16. While it may be appropriate to match all four OF states in the free energy plot for comparison purposes, the authors should clarify this to ensure readers are not misled.

      b) Regarding the substrate impact, it appears that the authors assumed fixed protonation states. I am afraid this is not necessarily the case. Variations in PepT2 stoichiometry suggest that substrates likely participate in proton transport, like the Phe-Ala (2:1) and Phe-Gln (1:1) dipeptides mentioned in the introduction. And it is not rigorous to assume that the N- and C-termini of a peptide do not protonate/deprotonate when transported. I think the authors should explicitly state that the current work and the proposed mechanism (Figure 8) are based on the assumption that the substrates do not take up or release protons.

      (2) I have more serious concerns about the CpHMD employed in the study.

      a) The CpHMD in AMBER is not rigorous for membrane simulations. The underlying generalized Born model fails to consider the membrane environment when updating charge states. In other words, the CpHMD places a membrane protein in a water environment to judge if changes in charge states are energetically favorable. While this might not be a big issue for peripheral residues of membrane proteins, it is likely unphysical for internal residues like the ExxER motif. As I recall, the developers have never used the method to study membrane proteins themselves. The only CpHMD variant suitable for membrane proteins is the membrane-enabled hybrid-solvent CpHMD in CHARMM. While I do not expect the authors to redo their CpHMD simulations, I do hope the authors recognize the limitations of their method.

      b) It appears that the authors did not make the substrate (Ala-Phe dipeptide) protonatable in holo-simulations. This oversight prevents a complete representation of ligand-induced protonation events, particularly given that the substrate ion pairs with hsPepT2 through its N- & C-termini. I believe it would be valuable for the authors to acknowledge this potential limitation.
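      The concern in point (1a), that PMFs computed at fixed charge states cannot by themselves be placed on a common free energy scale, can be summarised with a standard thermodynamic cycle. The relations below are a textbook linkage sketch for a single titratable site and are not taken from the manuscript; the symbols ($\Delta G_{\mathrm{prot}}^{X}$, $\mathrm{p}K_a^{X}$, and the superscripts H and 0 for the protonated and deprotonated PMFs) are our notation.

```latex
% Free energy of protonating the site in conformation X at a given pH
% (negative, i.e. favourable, when pKa > pH):
\begin{align}
\Delta G_{\mathrm{prot}}^{X}(\mathrm{pH})
  &= RT \ln(10)\,\bigl(\mathrm{pH} - \mathrm{p}K_a^{X}\bigr),
  \qquad X \in \{\mathrm{OCC}, \mathrm{OF}\} \\
% Closing the cycle OCC(0) -> OF(0) -> OF(H) -> OCC(H) -> OCC(0):
\Delta G_{\mathrm{OCC}\to\mathrm{OF}}^{\mathrm{H}}
  - \Delta G_{\mathrm{OCC}\to\mathrm{OF}}^{0}
  &= \Delta G_{\mathrm{prot}}^{\mathrm{OF}} - \Delta G_{\mathrm{prot}}^{\mathrm{OCC}}
   = -RT \ln(10)\,\bigl(\mathrm{p}K_a^{\mathrm{OF}} - \mathrm{p}K_a^{\mathrm{OCC}}\bigr)
\end{align}
```

      Under these assumptions, each fixed-charge PMF fixes only the conformational free energy difference within one protonation state. The vertical offset between PMFs is set by the conformation-dependent pKa values and the pH, so aligning all OF states at the same free energy is a plotting convention rather than a computed result.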

    1. Reviewer #2 (Public Review):

      The goal of the present study is to better understand the 'control objectives' that subjects adopt in a video-game-like virtual-balancing task. In this task, the hand must move in the opposite direction from a cursor. For example, if the cursor is 2 cm to the right, the subject must move their hand 2 cm to the left to 'balance' the cursor. Any imperfection in that opposition causes the cursor to move. E.g., if the subject were to move only 1.8 cm, that would be insufficient, and the cursor would continue to move to the right. If they were to move 2.2 cm, the cursor would move back toward the center of the screen. This return to center might actually be 'good' from the subject's perspective, depending on whether their objective is to keep the cursor still or keep it near the screen's center. Both are reasonable 'objectives' because the trial fails if the cursor moves too far from the screen's center during each six-second trial.

      This task was recently developed for use in monkeys (Quick et al., 2018), with the intention of being used for the study of the cortical control of movement, and also as a task that might be used to evaluate BMI control algorithms. The purpose of the present study is to better characterize how this task is performed. What sort of control policies are used? Perhaps more deeply, what kind of errors are those policies trying to minimize? To address these questions, the authors simulate control-theory style models and compare with behavior. They do so in both monkeys and humans.

      These goals make sense as a precursor to future recording or BMI experiments. The primate motor-control field has long been dominated by variants of reaching tasks, so introducing this new task will likely be beneficial. This is not the first non-reaching task, but it is an interesting one and it makes sense to expand the presently limited repertoire of tasks. The present task is very different from any prior task I know of. Thus, it makes sense to quantify behavior as thoroughly as possible in advance of recordings. Understanding how behavior is controlled is, as the authors note, likely to be critical to interpreting neural data.

      From this perspective - providing a basis for interpreting future neural results - the present study is fairly successful. Monkeys seem to understand the task properly, and to use control policies that are not dissimilar from humans. Also reassuring is the fact that behavior remains sensible even when task difficulty becomes high. By 'sensible' I simply mean that behavior can be understood as seeking to minimize error: position, velocity, or (possibly) both, and that this remains true across a broad range of task difficulties. The authors document why minimizing position and minimizing velocity are both reasonable objectives. Minimizing velocity is reasonable, because a near-stationary cursor can't move far in six seconds. Minimizing position error is reasonable, because the trial won't fail if the cursor doesn't stray far from the center. This is formally demonstrated by simulating control policies: both objectives lead to control policies that can perform the task and produce realistic single-trial behavior. The authors also demonstrate that, via verbal instruction, they can induce human subjects to favor one objective over the other. These all seem like things that are on the 'need to know' list, and it is commendable that this amount of care is being taken before recordings begin, as it will surely aid interpretation.

      Yet as a stand-alone study, the contribution to our understanding of motor control is more limited. The task allows two different objectives (minimize velocity, minimize position) to be equally compatible with the overall goal (don't fail the trial). Or more precisely, there exists a range of objectives with those two at the extreme. So it makes sense that different subjects might choose to favor different objectives, and also that they can do so when instructed. But has this taught us something about motor control, or simply that there is a natural ambiguity built into the task? If I ask you to play a game, but don't fully specify the rules, should I be surprised that different people think the rules are slightly different?

      The most interesting scientific claim of this study is not the subject-to-subject variability; the task design makes that quite likely and natural. Rather, the central scientific result is the claim that individual subjects are constantly switching objectives (and thus control policies), such that the policy guiding behavior differs dramatically even on a single-trial basis. This scientific claim is supported by a technical claim: that the authors' methods can distinguish which objective is in use, even on single trials. I am uncertain of both claims.

      Consider Figure 8B, which reprises a point made in Figures 1 and 3 and gives the best evidence for trial-to-trial variability in objective/policy. For every subject, there are two example trials. The top row of trials shows oscillations around the center, which could be consistent with position-error minimization. The bottom row shows tolerance of position errors so long as drift is slow, which could be consistent with velocity-error minimization. But is this really evidence that subjects were switching objectives (and thus control policies) from trial to trial? A simpler alternative would be a single control policy that does not switch, but still generates this range of behaviors. The authors don't really consider this possibility, and I'm not sure why. One can think of a variety of ways in which a unified policy could produce this variation, given noise and the natural instability of the system.

      Indeed, I found that it was remarkably easy to produce a range of reasonably realistic behaviors, including the patterns that the authors interpret as evidence for switching objectives, based on a simple fixed controller. To run the simulations, I assumed that subjects simply attempt to match their hand position to oppose the cursor position. Because subjects cannot see their hand, I assumed modest variability in the gain, with a range from -1 to -1.05. I assumed a small amount of motor noise in the outgoing motor command. The resulting (very simple) controller naturally displayed the basic range of behaviors observed across trials (see Image 1).

      Image 1.

      Some trials had oscillations around the screen center (zero), which is the pattern the authors suggest reflects position control. In other trials the cursor was allowed to drift slowly away from the center, which is the pattern the authors suggest reflects velocity control. This is true even though the controller was the same on every trial. Trial-to-trial differences were driven both by motor noise and by the modest variability in gain. In an unstable system, small differences can lead to (seemingly) qualitatively different behavior on different trials.
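      A controller of the kind described above can be sketched in a few lines. This is not the reviewer's actual simulation code: the cursor dynamics (cursor velocity proportional to the sum of cursor and hand position, with instability parameter `lam`), the absence of any sensorimotor delay, and all numerical values other than the gain range of -1 to -1.05 are assumptions made for illustration.

```python
import numpy as np

def simulate_trial(gain, lam=3.0, noise_sd=0.02, x0=0.5,
                   duration=6.0, dt=0.01, rng=None):
    """Simulate one trial with a fixed proportional controller.

    Assumed plant: dx/dt = lam * (x + hand), integrated with forward Euler.
    Controller:    hand = gain * x + motor noise (gain close to -1).
    Returns the cursor trajectory, including the initial position.
    """
    rng = np.random.default_rng() if rng is None else rng
    n_steps = int(round(duration / dt))
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        hand = gain * x[k] + noise_sd * rng.standard_normal()
        x[k + 1] = x[k] + dt * lam * (x[k] + hand)
    return x

# Same controller on every trial; only the gain (drawn once per trial)
# and the motor noise differ between trials.
rng = np.random.default_rng(0)
gains = rng.uniform(-1.05, -1.0, size=20)
trials = [simulate_trial(g, rng=rng) for g in gains]

# Noise-free limiting cases, for intuition.
still = simulate_trial(-1.0, noise_sd=0.0)       # perfect opposition
recentred = simulate_trial(-1.05, noise_sd=0.0)  # slight over-correction
```

      In the noise-free limit, a gain of exactly -1 leaves the cursor where it started, while a gain of -1.05 pulls it slowly back toward the center. With noise added, trials near the first case drift and trials near the second hover around zero, which is the qualitative contrast the argument relies on. Because the sketch omits delay, it should not be expected to reproduce every feature of the behavior.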

      This simple controller is also compatible with the ability of subjects to adapt their strategy when instructed. Anyone experienced with this task likely understands (or has learned) that moving the hand slightly more than 'one should' will tend to shepherd the cursor back to center, at the cost of briefly high velocity. Using this strategy more sparingly will tend to minimize velocity even if position errors persist. Thus, any subject using this control policy would be able to adapt their strategy via a modest change in gain (the gain linking visible cursor position to intended hand position).

      This model is simple, and there may be reasons to dislike it. But it is presumably a reasonable model. The nature of the task is that you should move your hand opposite where the cursor is. Because you can't see your hand, you will make small mistakes. Due to the instability of the system, those small mistakes have large and variable effects. This feature is likely common to other controllers as well; many may explicitly or implicitly blend position and velocity control, with different trials appearing more dominated by one versus the other. Given this, I think the study presents only weak evidence that individual subjects are switching their objective on individual trials. Indeed, the more parsimonious explanation may be that they aren't. While the study certainly does demonstrate that the control policy can be influenced by verbal instructions, this might be a small adjustment as noted above.

      I thus don't feel convinced that the authors can conclusively tell us the true control policy being used by human and monkey subjects, nor whether that policy is mostly fixed or constantly switching. The data are potentially compatible with any of these interpretations, depending on which control-style model one prefers.

      I see a few paths that the authors might take if they chose.

      --First, my reasoning above might be faulty, or there might be additional analyses that could rule out the possibility of a unified policy underlying variable behavior. If so, the authors may be able to reject the above concerns and retain the present conclusions. The main scientifically novel conclusion of the present study is that subjects are using a highly variable control policy, and switching on individual trials. If this is indeed the case, there may be additional analyses that could reveal that.

      --Second, additional trial types (e.g., with various perturbations) might be used as a probe of the control policy. As noted below, there is a long history of doing this in the pursuit system. That additional data might better disambiguate control policies both in general, and across trials.

      --Third, the authors might find that a unified controller is actually a good (and more parsimonious) explanation, which might be a good thing from the standpoint of future experiments. Interpretation of neural data is likely to be much easier if the control policy being instantiated isn't in constant flux.

      In any case, I would recommend altering the strength of some conclusions, particularly the conclusion that the presented methods can reliably discriminate amongst objectives/policies on individual trials. This is mentioned as a major motivation on multiple occasions, but in most of these instances, the subsequent analysis infers the objective only across trials (e.g., one must observe a scatterplot of many trials). By Figure 7, they do introduce a method for inferring the control policy on individual trials, and while this seems to work considerably better than chance, it hardly appears reliable.

      In this same vein I would suggest toning down aspects of the Introduction and Discussion. The Introduction in particular is overly long, and tries to position the present study as unique in ways that seem strained. Other studies have built links between human behavior, monkey behavior, and monkey neural data (for just one example, consider the corpus of work from the Scott lab that includes Pruszynski et al. 2008 and 2011). Other studies have used highly quantitative methods to infer the objective function used by subjects (e.g. Kording and Wolpert 2004). The very issue that is of interest in the present study - velocity-error-minimization versus position-error-minimization - has been extensively addressed in the smooth pursuit system. That field has long combined quantitative analyses of behavior in humans and monkeys, along with neural recordings. Many pursuit experiments used strategies that could be fruitfully employed to address the central questions of the present study. For example, error stabilization was important for dissecting the control policy used by the pursuit system. By artificially stabilizing the error (position or velocity) at zero, or at some other value, one can determine the system's response. The classic Rashbass step (1961) put position and velocity errors in opposition, to see which dominates the response. Step and sinusoidal perturbations were useful in distinguishing between models, as was the imposition of artificial delays. The authors note the 'richness' of the behavior in the present task, and while one could say the same of pursuit, it was still the case that specific and well-thought-out experimental manipulations were pretty critical. It would be better if the Introduction considered at least some of the above-mentioned work (or other work in a similar vein). While most would agree with the motivations outlined by the authors - they are logical and make sense - the present Introduction runs the risk of overselling the present conclusions while underselling prior work.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer 1

      (1) Given the low trial numbers, and the point of sequential vs clustered reactivation mentioned in the public review, it would be reassuring to see an additional sanity check demonstrating that future items that are currently not on-screen can be decoded with confidence, and if so, when in time the peak reactivation occurs. For example, the authors could show separately the decoding accuracy for near and far items in Fig. 5A, instead of plotting only the difference between them.

      We have now added the requested analysis showing the raw decoded probabilities for near and distant items separately in Figure 5A. We have also chosen to replace Figure 5B with the new figure, as we think it provides more information than the previous Figure 5B, which we have moved to the supplement. The median peak decoded accuracy for near and distant items is equivalent. We have added the following description to the figure:

      “Decoded raw probabilities for off-screen items that were up to two steps ahead of the current stimulus cue (‘near’) vs. distant items that were more than two steps away on the graph, on trials with correct answers. The median peak decoded probability for near and distant items was at the same time point for both probability categories. Note that displayed lines reflect the average probability while, to eliminate influence of outliers, the peak displays the median.”
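
      The near-versus-distant contrast described in this caption can be illustrated with a small sketch. The data layout (one probability time course per off-screen item, plus its graph distance from the cue), the function name, and the toy numbers are assumptions made for illustration; only the two-step boundary between ‘near’ and ‘distant’ is taken from the text.

```python
def differential_reactivation(probabilities, distances, near_max=2):
    """Mean decoded probability of near items minus that of distant items, per time point.

    probabilities: dict mapping item -> list of decoded probabilities over time
    distances:     dict mapping item -> graph distance (in steps) from the cue
    """
    near = [p for item, p in probabilities.items()
            if 1 <= distances[item] <= near_max]
    distant = [p for item, p in probabilities.items()
               if distances[item] > near_max]
    n_times = len(next(iter(probabilities.values())))
    diff = []
    for t in range(n_times):
        near_mean = sum(p[t] for p in near) / len(near)
        distant_mean = sum(p[t] for p in distant) / len(distant)
        diff.append(near_mean - distant_mean)
    return diff


# Toy example with three off-screen items and two time points (made-up values).
toy_probs = {"a": [0.1, 0.3], "b": [0.1, 0.2], "c": [0.1, 0.1]}
toy_dist = {"a": 1, "b": 2, "c": 4}
toy_diff = differential_reactivation(toy_probs, toy_dist)
peak_index = max(range(len(toy_diff)), key=lambda t: toy_diff[t])
```

      In this toy case the difference is zero at the first time point and positive at the second, so `peak_index` points at the second time point.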

      (2) The non-sequential reactivation analyses often use a time window of peak decodability, and it was not entirely clear to me what data this time window is determined on, e.g., was it determined based on all future reactivations irrespective of graph distance? This should be clarified in the methods.

      Thank you for raising this. We now clarify this in the relevant section to read: “First, we calculated a time point of interest by computing the peak probability estimate of decoders across all trials, i.e., the average probability for each timepoint of all trials (except previous onscreen items) of all distances, which is equivalent to the peak of the differential reactivation analysis.”

      (3) Fig 4 shows evidence for forward and backward sequential reactivation, suggesting that both forward and backward replay peak at a lag of 40-50msec. It would be helpful if this counterintuitive finding could be picked up in the discussion, explaining how plausible it is, physiologically, to find forward and backward replay at the same lag, and whether this could be an artifact of the TDLM method.

      This is an important point and we agree that it appears counterintuitive. However, we would highlight that this exact time range has been reported in previous studies, though never for both forward and backward replay. We now include a discussion of this finding. The section now reads:

      “[…] Even though we primarily focused on the mean sequenceness scores across time lags, there appears to be a (non-significant) peak at 40-60 milliseconds. While simultaneous forward and backward replay is theoretically possible, we acknowledge that it is somewhat surprising and, given our paradigm, could relate to other factors such as autocorrelations (Liu, Dolan, et al., 2021).”

      (4) It is reported that participants with below 30% decoding accuracy are excluded from the main analyses. It would be helpful if the manuscript included very specific information about this exclusion, e.g., was the criterion established based on the localizer cross-validated data, the temporal generalisation to the cued item (Fig. 2), or only based on peak decodability of the future sequence items? If the latter, is it applied based on near or far reactivations, or both?

      We now clarify this point to include more specific information, which reads:

      “[…] Therefore, we decided a priori that participants with a peak decoding accuracy of below 30% would be excluded from the analysis (nine participants in all) as obtained from the cross-validation of localizer trials”

      (5) Regarding the low amount of data for the reactivation analysis, the manuscript should be explicit about the number of trials available for each participant. For example, Supplemental Fig. 1 could provide this information directly, rather than the proportion of excluded trials.

      We have adapted the plot in the supplement to show the absolute number of rejected epochs per participant, in addition to the ratio.

      (6) More generally, the supplements could include more detailed information in the legends.

      We agree and have added more extensive explanation of the plots in the supplement legends.

      (7) The choice of comparing the 2 nearest with all other future items in the clustered reactivation analysis should be better motivated, e.g., was this based on the Wimmer et al. (2020) study?

      We have added our motivation for taking the two nearest items and contrasting them with the items further away. The paragraph reads:

      “[…] We chose to combine the following two items for two reasons: First, this doubled the number of included trials; secondly, using this approach the number of trials for each category (“near” and “distant”) was more balanced. […]”

      Reviewer 2

      (1) Focus exclusively on retrieval data (and here just on the current image trials).

      If I understand correctly, you focus all your analyses (behavioural as well as MEG analyses) on retrieval data only and here just on the current image trials. I am surprised by that since I see some shortcomings due to that. These shortcomings can likely be addressed by including the learning data (and predecessor image trials) in your analyses.

      a) Number of trials: During each block, you presented each of the twelve edges once. During retrieval, participants then did one "single testing session block". Does that mean that all your results are based on max. 12 trials? Given that participants remembered, on average, 80% this means even fewer trials, i.e., 9-10 trials?

      This is correct and a limitation of the paper. However, while we used only correct trials for the reactivation analysis, the sequential analysis was conducted using all trials, regardless of response behaviour. To retain comparability with previous studies we mainly focused on data from after a consolidation phase. Despite the trial limitation, we consider the results robust and worth reporting. Additionally, based on the suggestion of the referee, we now include results from learning blocks (see below).

      b) Extend the behavioural and replay/reactivation analysis to predecessor images.

      Why do you restrict your analyses to the current image trials? Especially given that you have such a low trial number for your analyses, I was wondering why you did not include the predecessor trials (except the non-deterministic trials, like the zebra and the foot according to Figure 2B) as well.

      We agree it would be great to increase power by adding the predecessor images (excluding the ambiguous trials) to the current image cue analysis. We did not do so because we consider that the underlying retrieval processes of these trial types are not the same, i.e., they cannot simply be combined. Nevertheless, we have performed the suggested analysis to check whether it increases our power. We found that the reactivation effect is robust and significant at the same time point of 220-230 ms. However, the effect size actually decreased: whereas peak differential reactivation was previously 0.13, it is now 0.07. This in fact makes conceptual sense. We suspect that the two processes that are elicited by showing a single cue and by showing a second, related, cue are distinct insofar as the predecessor image acts as a primer for the current image, potentially changing the time course/speed of retrieval. Given our concerns that the two processes are not actually the same, we consider it important to avoid mixing these data.

      We have added a statement to the manuscript discussing this point. The section reads:

      “Note that we only included data from the current image cue, and not from the predecessor image cue, as we assume the retrieval processes differ and should not be concatenated.”

      c) Extend the behavioural and replay/reactivation analysis to learning trials.

      Similar to point 1b, why did you not include learning trials in your analyses?

      Including (correct and incorrect) learning trials has the advantage that you do not have to exclude 7 participants due to ceiling performance (100%).

      Further, you could actually test the hypothesis that you outline in your discussion: "This implies that there may be a switch from sequential replay to clustered reactivation corresponding to when learned material can be accessed simultaneously without interference." Accordingly, you would expect to see more replay (and less "clustered" reactivation) in the first learning blocks compared to retrieval (after the rest period).

      To track reactivation and replay over the course of learning is a great idea. We have given a lot of thought as to how to integrate these findings but have not found a satisfying solution, as analysis of the learning data turned out to be quite tricky: We decided that each participant should perform as many blocks as necessary to reach at least 80% (with a limit of six and lower bound of two, see Supplement figure 4). Indeed, some participants learned 100% of the sequence after one block (these were mostly medical students; learning things by heart is their daily task). With the benefit of hindsight, we realise our design means that different blocks are not directly comparable between participants. In theory, we would expect that replay emerges in parallel with learning and then gradually changes to clustered reactivation as memory traces become consolidated/stronger. However, it is unclear when replay should emerge and when precisely a switch to clustered reactivation would happen. For this reason, we initially decided not to include the learning trials in the paper.

      Nevertheless, to provide some insight into the learning process, and to see how consolidation impacts differential reactivation and replay, we have split our data into pre and post resting state, aggregating all learning trials of each participant. While this does not allow us to track processes on a block basis, it does offer potential (albeit limited) insight into the hypothesis we outline in the discussion.

      For reactivation, we see a clear increase, further strengthening the outlined hypothesis. For replay, however, the evidence is less clear, as we do not know over how many learning blocks replay is expected.

      We calculated individual trajectories of how reactivation and replay change from learning to retrieval and related these to performance. Indeed, we see that an increase in reactivation is nominally associated with higher learning performance, while an increase in replay strength is associated with lower performance (both non-significant). However, for the above-mentioned reasons we think it would be premature to add this weak evidence to the paper.

      To mitigate problems of experiment design in relation to this question we are currently implementing a follow-up study, where we aim to normalize the learning process across participants and index how replay/reactivation changes over the course of learning and after consolidation.

      We have added plots showing clustered reactivation and sequential replay measures during learning (Figure 5D and Supplement 8).

      The added section(s) now read:

      “To provide greater detail on how the 8-minute consolidation period affected reactivation we, post-hoc, looked at relevant measures across learning trials in contrast to retrieval trials. For all learning trials, for each participant, we calculated differential reactivation for the same time point we found significant in the previous analysis (220-260 milliseconds). On average, differential reactivation probability increased from pre to post resting state (Figure 5D). […]

      Nevertheless, even though our results show a nominal increase in reactivation from learning to retrieval (see Figure 5D), due to experimental design features our data do not enable us to test for a hypothesized switch for sequential replay (see also “limitations” and Supplement 8).”

      d) Introduction (last paragraph): "We examined the relationship of graph learning to reactivation and replay in a task where participants learned a ..." If all your behavioural analyses are based on retrieval performance, I think that you do not investigate graph learning (since you exclusively focus the analyses on retrieving the graph structure). However, relating the graph learning performance and replay/reactivation activity during learning trials (i.e., during graph learning) to retrieval trials might be interesting but beyond the scope of this paper.

      We agree. We have changed the wording to be more accurate. Indeed, we do not examine graph learning but instead examine retrieval from a graph, after graph learning. The mentioned sentence now reads:

      “[…] relationship of retrieval from a learned graph structure to reactivation [...]”

      e) It is sometimes difficult to follow what phase of the experiment you refer to since you use the terms retrieval and test synonymously. Not a huge problem at all but maybe you want to stick to one term throughout the whole paper.

      Thank you for pointing this out. We have now adapted the manuscript to exclusively refer to “retrieval” and not to “test”.

      (2) Is your reactivation clustered?

      In Figure 5A, you compare the reactivation strength of the two items following the cue image (i.e., current image trials) with items further away on the graph. I do not completely understand why your results are evidence for clustered reactivation in contrast to replay.

      First, it would be interesting to see the reactivation of near vs. distant items before taking the difference (time course of item probabilities).

      (copied answer from response to Reviewer 1, as the same remark was raised)

      We have added the requested analysis showing the raw decoded probabilities for near and distant items separately in Figure 5A. We have chosen to replace Figure 5B with the new figure, as we think that it offers more information than the previous Figure 5B, which we have moved to the supplement. The median peak decoded accuracy for near and distant items is equivalent. We have added the following description to the figure:

      “Decoded raw probabilities for off-screen items that were up to two steps ahead of the current stimulus cue (‘near’) vs. distant items that were more than two steps away on the graph, on trials with correct answers. The median peak decoded probability for near and distant items was at the same time point for both probability categories. Note that displayed lines reflect the average probability while, to eliminate influence of outliers, the peak displays the median.”

      Second, could it still be that the first item is reactivated before the second item? By averaging across both items, it is no longer apparent what the temporal courses of probabilities of both items look like (and whether they follow a sequential pattern). Additionally, the Gaussian smoothing kernel across the time dimension might diminish sequential reactivation and favour clustered reactivation. (In the manuscript, what does a Gaussian smoothing kernel of σ = 1 refer to?) Could you please explain in more detail why you assume non-sequential clustered reactivation here and substantiate this with additional analyses?

      We apologise for the unclear description. Note the Gaussian kernel is in fact only used for the reactivation analysis and not the replay analysis, so any small temporal successions would have been picked up by the sequential analysis. We now clarify this in the respective section of the sequential analysis and also explain the parameter of σ = 1 in the reactivation analysis section. The paragraph now reads:

      “[…] As input for the sequential analysis, we used the raw probabilities of the ten classifiers corresponding to the stimuli. [...]

      […] Therefore, to address this we applied a Gaussian smoothing kernel (using scipy.ndimage.gaussian_filter with the default parameter of σ=1, which corresponds approximately to taking the surrounding timesteps in both directions with the following weighting: current time step: 40%, ±1 step: 25%, ±2 step: 5%, ±3 step: 0.5%) [...]”
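
      The approximate weights quoted above can be checked with a short, dependency-free sketch. It does not call SciPy; the helper names are hypothetical, the kernel is assumed to be truncated at four standard deviations, and the edges are handled by repeating the end values, which is an assumption made here for simplicity.

```python
import math


def gaussian_weights(sigma=1.0, truncate=4.0):
    """Normalised weights of a discrete Gaussian kernel (assumed truncation at 4 sigma)."""
    radius = int(truncate * sigma + 0.5)
    raw = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def smooth(series, sigma=1.0):
    """Smooth a 1-D time course; values beyond the edges repeat the end values (assumption)."""
    weights = gaussian_weights(sigma)
    radius = len(weights) // 2
    n = len(series)
    smoothed = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(weights):
            j = min(max(i + k - radius, 0), n - 1)  # clamp index to the valid range
            acc += w * series[j]
        smoothed.append(acc)
    return smoothed


weights = gaussian_weights()
center = len(weights) // 2
```

      With σ = 1 the centre weight comes out at roughly 0.40, the ±1 weights at roughly 0.24, the ±2 weights at roughly 0.05 and the ±3 weights at roughly 0.004, in line with the rounded percentages given in the quoted passage.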

      (3) Replay and/or clustered reactivation?

      The relationship between the sequential forward replay, differential reactivation, and graph reactivation analysis is not really apparent. Wimmer et al. demonstrated that high performers show clustered reactivation rather than sequential reactivation. However, you did not differentiate in your differential reactivation analysis between high vs. low performers. (You point out in the discussion that this is due to a low number of low performers.)

      We agree that a split into high vs low performers would have been preferable for our analysis. However, there is one major obstacle that made us opt for a correlational analysis instead: We employed criterion learning, rendering a categorical grouping conceptually biased. Even though not all participants reached the criterion of 80%, our sample did not naturally split between high and low performers but was biased towards higher performance, leaving the groups uneven. The median performance was 83% (mean ~81%), with six of our subjects (~1/4 of included participants) having this exact performance. This makes a median or mean split difficult, as either binning assignment choice would strongly affect the results. We have added a limitations section in which we extensively discuss this shortcoming and our reasoning for not performing a median split as in Wimmer et al. (2020). The section now reads:

      “There are some limitations to our study, most of which originate from a suboptimal study design. [...], as we performed criteria learning, a sub-group analysis as in Wimmer et al., (2020) was not feasible, as median performance in our sample would have been 83% (mean 81%), with six participants exactly at that threshold. [...]”

      It might be worth trying to bring the analysis together, for example by comparing sequential forward replay and differential reactivation at the beginning of graph learning (when performance is low) vs. retrieval (when performance is high).

      Thank you for the suggestion to include the learning segments, which we think improves the paper quite substantially. However, analysis of the learning data turned out to be quite tricky. We had decided that each participant should perform as many blocks as necessary to reach at least 80% accuracy (with a limit of six and lower bound of two, see Supplement figure 4). Some participants learned 100% of the sequence after one block (these were mostly medical students; learning things by heart is their daily task). This in hindsight is an unfortunate design feature in relation to learning, as it means different blocks are not directly comparable between participants.

      In theory, we would expect that replay emerges in parallel with learning and then gradually changes to clustered reactivation, as memory traces get consolidated/stronger. However, it is unclear when replay would emerge and when the switch to reactivation would happen. For this reason, we initially decided not to include the learning trials in the paper at all.

      Nevertheless, to give some insight into the learning process and to see how consolidation affects differential reactivation and replay, we have split our data into pre and post resting state, aggregating all learning trials of each participant. While this does not allow us to track measures of interest on a block basis, it gives some (albeit limited) insight into the hypothesis outlined in our discussion.

      For reactivation, we see a clear increase, further strengthening the outlined hypothesis. However, for replay the evidence is less obvious, potentially due to the fact that we do not know across how many learning blocks replay is to be expected.

      The added section(s) now read:

      “To examine how the 8-minute consolidation period affected reactivation we, post-hoc, looked at relevant measures during learning trials in contrast to retrieval trials. For all learning trials, for each participant, we calculated differential reactivation for the time point we found significant during the previous analysis (220-260 milliseconds). On average, differential reactivation probability increased from pre to post resting state (Figure 5D).

      […]

      Nevertheless, even though our results show a nominal increase in reactivation from learning to retrieval (see Figure 5D), our data do not enable us to show a hypothesized switch for sequential replay (see also “limitations” and Supplement 8).”

      Additionally, the main research question is not that clear to me. Based on the introduction, I thought the focus was on replay vs. clustered reactivation and high vs. low performance (which I think is really interesting). However, the title is more about reactivation strength and graph distance within cognitive maps. Are these two research questions related? And if so, how?

      We agree we need to be clearer on this point. We have added two sentences to the introduction, which should address this point. The section now reads:

      “[…] In particular, the question remains how the brain keeps track of graph distances for successful recall and whether the previously found difference between high and low performers also holds true within a more complex graph learning context.”

      (4) Learning the graph structure.

      I was wondering whether you have any behavioural measures to show that participants actually learn the graph structure (instead of just pairs or triplets of objects). For example, do you see that participants chose the distractor image that was closer to the target more frequently than the distractor image that was further away (close vs. distal target comparison)? It should be random at the beginning of learning but might become more biased towards the close target.

      Thanks, this is an excellent suggestion. Our analysis indeed shows that people take the near lure more often than the far lure in later blocks, while it is random in the first block.

      Nevertheless, we have decided to put these data into the supplement and reference it in the text. This is because analysis of the learning blocks is challenging and biased in general. Each participant had a different number of learning blocks based on their learning rate, and this makes it difficult to compare learning across participants. We have tried our best to accommodate and explain these difficulties in the figure legend. Nevertheless, we thank the referee for guidance here and this analysis indeed provides further evidence that participants learned the actual graph structure.

      The added section reads:

      “Additionally, we have included an analysis showing how wrong answers participants provided were random in the first block and biased towards closer graph nodes in later blocks. This is consistent with participants actually learning the underlying graph structure as opposed to independent triplets (see figure and legend of Supplement 6 for details).”

      (5) Minor comments

      a) "Replay analysis relies on a successive detection of stimuli where the chance of detection exponentially decreases with each step (e.g., detecting two successive stimuli with a chance of 30% leaves a 9% chance of detecting the replay event). " Could you explain in more detail why 30% is a good threshold then?

      Thank you. We have further clarified the section. As we are working mainly with probabilities, it is useful to keep in mind that accuracy is a class metric that only provides a rough estimate of classifier ability. Alternatively, something like a Top-3-Accuracy would be preferable, but also slightly silly in the context of 10 classes.

      Nevertheless, subtle changes in probability estimates are present and can be picked up by the methods we employ. Therefore, the 30% is a rough lower bound and decided based on pilot data that showed that clean MEG data from attentive participants can usually reach this threshold. The section now reads:

      “(e.g., detecting two successive stimuli with a chance of 30% leaves a 9% chance of detecting a replay event). However, one needs to bear in mind that accuracy is a “winner-takes-all” metric indicating whether the top choice also has the highest probability, disregarding subtle, relative changes in assigned probability. As the methods used in this analysis are performed on probability estimates and not class labels, one can expect that the 30% is a rough lower bound and that the actual sensitivity within the analysis will be higher. Additionally, based on pilot data, we found that attentive participants were able to reach 30% decodability, allowing us to use decodability as a data quality check.”
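
      The arithmetic behind the 9% figure can be made explicit. The sketch below assumes that each step of a sequence is detected independently and with the same probability, which is an idealisation; the function name is hypothetical.

```python
def chain_detection_probability(p_single, n_steps):
    """Chance of detecting all n_steps successive stimuli, assuming independent detections."""
    return p_single ** n_steps


two_step = chain_detection_probability(0.3, 2)    # 0.3 * 0.3
three_step = chain_detection_probability(0.3, 3)  # shrinks further with each added step
```

      As the quoted passage notes, this accuracy-based figure is best read as a rough lower bound, because the analyses operate on probability estimates rather than on class labels.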

      b) Could you make explicit how your decoders were designed? Especially given that you added null data, did you train individual decoders for one class vs. all other classes (n = 9 + null data) or one class vs. null data?

      We added detail to the decoder training. The section now reads:

      “Decoders were trained using a one-vs-all approach, which means that for each class, a separate classifier was trained using positive examples (target class) and negative examples (all other classes) plus null examples (data from before stimulus presentation, see below). In detail, null data was.”
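
      The one-vs-all scheme with null examples can be illustrated for the label construction step only. The function name, the use of `None` to mark null (pre-stimulus) examples, and the toy labels are assumptions made for illustration; the classifier itself and the way the null data were generated are not shown.

```python
def one_vs_all_targets(labels, classes):
    """Build one binary target vector per class.

    labels:  list of class names, with None marking null (pre-stimulus) examples
    classes: the stimulus classes for which decoders are trained
    Null examples and all other classes count as negatives (0) for every decoder.
    """
    return {c: [1 if lab == c else 0 for lab in labels] for c in classes}


# Toy example: three stimulus trials and two null examples (made-up labels).
toy_labels = ["zebra", "foot", "zebra", None, None]
toy_targets = one_vs_all_targets(toy_labels, classes=["zebra", "foot"])
```

      Each entry of `toy_targets` could then serve as the target vector for one binary classifier, so that ten stimulus classes would yield ten separate decoders.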

      c) Why did you choose a ratio of 1:2 for your null data?

      Our choice of a higher ratio was based upon previous publications reporting better sensitivity of TDLM using higher ratios, as spatial sensor correlations are decreasing. Nevertheless, this choice was not well investigated beforehand. We have added more information on this to the manuscript.

      d) You could think about putting the questionnaire results into the supplement if they are sanity checks.

      We have added the questionnaire results. However, due to the size of the tables, we have decided to add them as Excel files in the supplementary files of the code repository. We have mentioned the existence of these files in the publication.

      e) Figure 2. There is a typo in D: It says "Precessor Image" instead of "Predecessor Image".

      Fixed typo in figure.

      f) You write "Trials for the localizer task were created from -0.1 to 0.5 seconds relative to visual stimulus onset to train the decoders and for the retrieval task, from 0 to 1.5 seconds after onset of the second visual cue image." But the Figure legend 3D starts at -0.1 seconds for the retrieval test.

      We have now clarified this. For the classifier cross-validation, the transfer sanity check, and the clustered analysis, we used trials from -0.1 to 0.5 seconds, whereas for the sequenceness analysis of the retrieval, we used trials from 0 to 1.5 seconds.

    2. eLife assessment

      This magnetoencephalography study reports important new findings regarding the nature of memory reactivation during cued recall. It replicates previous work showing that such reactivation can be sequential or clustered, with sequential reactivation being more prevalent in low performers. It adds convincing evidence, even though based on limited amounts of data, that high memory performers tend to show simultaneous (i.e., clustered) reactivation, varying in strength with item distance in the learned graph structure. The study will be of interest to scientists studying memory replay.

    3. Reviewer #1 (Public Review):

      Summary:

      Previous work in humans and non-human animals suggests that during offline periods following learning, the brain replays newly acquired information in a sequential manner. The present study uses an MEG-based decoding approach to investigate the nature of replay/reactivation during a cued recall task directly following a learning session, where human participants are trained on a new sequence of 10 visual images embedded in a graph structure. During retrieval, participants are then cued with two items from the learned sequence, and neural evidence is obtained for the simultaneous or sequential reactivation of future sequence items. The authors find evidence for both sequential and clustered (i.e., simultaneous) reactivation. Replicating previous work, low-performing participants tend to show sequential, temporally segregated reactivation of future items, whereas high-performing participants show more clustered reactivation. Adding to previous work, the authors show that an image's reactivation strength varies depending on its proximity to the retrieval cue within the graph structure.

      Strengths:

      As the authors point out, work on memory reactivation has largely been limited to the retrieval of single associations. Given the sequential nature of our real-life experiences, there is clearly value in extending this work to structured, sequential information. State-of-the-art decoding approaches for MEG are used to characterize the strength and timing of item reactivation. The manuscript is very well written with helpful and informative figures in the main sections. The task includes an extensive localizer with 50 repetitions per image, allowing for stable training of the decoders and the inclusion of several sanity checks demonstrating that on-screen items can be decoded with high accuracy.

      Weaknesses:

      Of major concern, the experiment is not optimally designed for analysis of the retrieval task phase, where only 4 min of recording time and a single presentation of each cue item are available for the analyses of sequential and non-sequential reactivation. In their revision, the authors include data from the learning blocks in their analysis. These blocks follow the same trial structure as the retrieval task and, apart from adding more data points, could also reveal a possible shift from sequential to clustered reactivation as learning of the graph structure progresses. The new analyses are not entirely conclusive, perhaps because of the variability in the number of learning blocks that participants require to reach criterion. In principle, they suggest that reactivation strength increases from learning (pre-rest) to final retrieval (post-rest).

      On a more conceptual note, the main narrative of the manuscript implies that sequential and clustered reactivation are mutually exclusive, such that a single participant would show either one or the other type. With the analytic methods used here, however, it seems possible to observe both types of reactivation. For example, the observation that mean reactivation strength (across the entire trial, or in a given time window of interest) varies with graph distance does not exclude the possibility that this reactivation is also sequential. In fact, the approach of defining one peak time window of reactivation may bias towards simultaneous, graded reactivation. It would be helpful if the authors could clarify this conceptual point. A strong claim that the two types of reactivation are mutually exclusive would need to be substantiated by further evidence, for instance a suitable metric contrasting "sequenceness" vs "clusteredness".

      On the same point, the non-sequential reactivation analyses use a time window of peak decodability that is determined based on the average reactivation of all future items, irrespective of graph distance. In a sequential forward cascade of reactivations, it could be assumed that the reactivation of near items would peak earlier than the reactivation of far items. In the revised manuscript, the authors now show the "raw" timecourses of item decodability at different graph distances, clearly demonstrating their peak reactivation times, which show convincingly that reactivation for near and far items occurs at very similar time points. The question that remains, therefore, is whether the method of pre-selecting a time window of interest described above could exert a bias towards finding clustered reactivation.

    4. Reviewer #2 (Public Review):

      Summary:

      The authors investigate replay (defined as sequential reactivation) and clustered reactivation during retrieval of an abstract cognitive map. Replay and clustered reactivation were analysed based on MEG recordings combined with a decoding approach. While the authors report evidence for both replay and clustered reactivation during retrieval, replay was exclusively present in low performers. Further, the authors show that reactivation strength declined with increasing graph distance.

      Strengths:

      The paper raises interesting research questions, i.e., replay vs. clustered reactivation and how that supports retrieval of cognitive maps. The paper is well written, well structured and easy to follow. The methodological approach is convincing and definitely suited to address the proposed research questions.

      The paper is a great combination of replicating previous findings (Wimmer et al. 2020) with a new experimental approach and, at the same time, presenting novel evidence (reactivation strength declines as a function of graph distance).

      What I also want to positively highlight is their general transparency. For example, they pre-registered this study but with a focus on a different part of the data and outlined this explicitly in the paper.

      The paper has very interesting findings. However, there are some shortcomings, especially in the experimental design. These are briefly outlined below but are also openly discussed in detail by the authors.

      Weaknesses:

      The individual findings are interesting. However, due to some shortcomings in the experimental design they cannot be profoundly related to each other. For example, the authors show that replay is present in low but not in high performers with the assumption that high performers tend to simultaneously reactivate items. But then, the authors do not investigate clustered reactivation (= simultaneous reactivation) as a function of performance due to a low number of retrieval trials and ceiling performance in most participants.

      As a consequence of the experimental design, some analyses are underpowered (very low number of trials, n = ~10, and for some analyses, very low number of participants, n = 14).

    1. eLife assessment

      This useful study reports the behavioural and physiological effects of the longitudinal activation of neurons associated with negative experiences. The main claims of the paper are supported by solid experimental evidence, but the specificity of the long-term manipulation requires additional validation. This study will be of interest to neuroscientists working on memory.

    2. Reviewer #1 (Public Review):

      Summary:

      In this study, Jellinger et al. performed engram-specific sequencing and identified genes that were selectively regulated in positive/negative engram populations. In addition, they performed chronic activation of the negative engram population over 3 months and observed several effects on fear/anxiety behavior and cellular events such as upregulation of glial cells and decreased GABA levels.

      Strengths:

      They provide useful engram-specific GSEA data and the main concept of the study, linking negative valence/memory encoding to cellular level outcomes including upregulation of glial cells, is interesting and valuable.

      Weaknesses:

      A number of experimental shortcomings make the conclusion of the study largely unsupported. In addition, the observed differences in behavioral experiments are rather small, inconsistent, and the interpretation of the differences is not compelling.

      Major points for improvement:

      (1) Lack of essential control experiments

      With the current set of experiments, it is not certain that the DREADD system they used was potent and stable throughout the 3 months of manipulations. Basic confirmatory experiments (e.g., slice physiology at 1m vs. 3m) to show that the DREADD effects on these vHP are stable would be an essential bottom line to make these manipulation experiments convincing.

      Furthermore, although the authors use the mCherry vector as a control, they did not have a vehicle/saline control for the hM3Dq AAV. Thus, the long-term effects such as the increase in glial cells could simply be due to the toxicity of DREADD expression, rather than an induced activity of these cells.

      (2) Figure 1 and the rest of the study are disconnected

      The authors used the cFos-tTA system to label positive/negative engram populations, while the TRAP2 system was used for the chronic activation experiments. Although both genetic tools are based on the same IEG Fos, the sensitivity of the tools needs to be validated. In particular, the sensitivity of the TRAP2 system can be arbitrarily altered by the amount of tamoxifen (or 4OHT) and the administration protocols. The authors should at least compare and show the percentage of labeled cells in both methods and discuss that the two experiments target (at least slightly) different populations. In addition, the use of TRAP2 for vHP is relatively new; the authors should confirm that this method actually captures negative engram populations by checking for reactivation of these cells during recall by overlap analysis of Fos staining or by artificial activation.

      (3) Interpretation of the behavior data

      In Figures 3a and b, the authors show that the experimental group showed higher anxiety based on time spent in the center/open area. However, there were no differences in distance traveled and center entries, which are often reduced in highly anxious mice. Thus, it is not clear what the exact effect of the manipulation is. The authors may want to visualize the trajectories of the mice's locomotion instead of just showing bar graphs.

      In addition, the data shown in Figure 4b is somewhat surprising - the 14MO control showed more freezing than the 6MO control, which can be interpreted as "better memory in old". As this is highly counterintuitive, the authors may want to discuss this point. The authors stated that "Mice typically display increased freezing behavior as they age, so these effects during remote recall are expected" without any reference. This is nonsense, as just above in Figure 4a, older mice actually show less freezing than young mice.

      Overall, the behavioral effects are rather small and random. I would suggest that these data be interpreted more carefully.

      (4) Lack of citation and discussion of relevant study

      Khalaf et al. 2018 from Gräff lab showed that experimental activation of recall-induced populations leads to fear attenuation. Despite the differences in experimental details, the conceptual discrepancy should be discussed.

    3. Reviewer #2 (Public Review):

      Summary:

      Jellinger, Suthard, et al. investigated the transcriptome of positive and negative valence engram cells in the ventral hippocampus, revealing anti- and pro-inflammatory signatures of these respective valences. The authors further reactivated the negative valence engram ensembles to assay the effects of chronic negative memory reactivation in young and old mice. This chronic re-activation resulted in differences in aspects of working memory, and fear memory, and caused morphological changes in glia. Such reactivation-associated changes are putatively linked to GABA changes and behavioral rumination.

      Strengths:

      Much of the content of this manuscript is of benefit to the community, such as the discovery of differential engram transcriptomes dependent on memory valence. The chronic activation of neurons, and the resultant effects on glial cells and behavior, also provide the community with important data. Laudable points of this manuscript include the comprehensiveness of behavioral experiments, as well as the cross-disciplinary approach.

      Weaknesses:

      There are several key claims made that are unsubstantiated by the data, particularly regarding the anthropomorphic framing of "rumination" on a mouse model and the role of GABA. The conclusions and inferences in these areas need to be carefully considered.

      (1) There are many issues regarding the arguments for the behavioural data's human translation as "rumination." There is no definition of rumination provided in the manuscript, nor how rumination is similar/different to intrusive thoughts (which are psychologically distinct but used relatively interchangeably in the manuscript), nor how rumination could be modelled in the rodent. The authors mention that they are attempting to model rumination behaviours by chronically reactivating the negative engram ("To understand if our experimental model of negative rumination..."), but this occurs almost at the very end of the results section, and no concrete evidence from the literature is provided to attempt to link the behavioural results (decreased working memory, increased fear extinction times) to rumination-like behaviours. The arguments in the final paragraph of the Discussion section about human rumination appear to be unrelated to the data presented in the manuscript and contain some uncited statements. Finally, the rumination claims seem to be based largely upon a single data figure that needs to be further developed (Figure 6, see also point 2 below).

      (2) The staining and analysis in Figure 6 are challenging to interpret, and require more evidence to substantiate the conclusions of these results. The histological images are zoomed out, and at this resolution, it appears that only the pyramidal cell layer is being stained. A GABA stain should also label the many sparsely spaced inhibitory interneurons existing across all hippocampal layers, yet this is not apparent here. Moreover, both example images in the treatment group appear to have lower overall fluorescence intensity in both DAPI and GABA. The analysis is also unclear: the authors mention "ROIs" used to measure normalized fluorescence intensity but do not specify what the ROI encapsulates. Presumably, the authors have segmented each DAPI-positive cell body and assessed fluorescence - however, this is not explicated nor demonstrated, making the results difficult to interpret.

      (3) A smaller point, but more specific detail is needed for how genes were selected for GSEA analysis. As GSEA relies on genes to be specified a priori, to avoid a circular analysis, these genes need to be selected in a blind/unbiased manner to avoid biasing downstream results and conclusions. It's likely the authors have done this, but explicitly noting how genes were selected is an important context for this analysis.

    4. Reviewer #3 (Public Review):

      Summary:

      The authors note that negative ruminations can lead to pathological brain states and mood/anxiety dysregulation. They test this idea by using mouse engram-tagging technology to label dentate gyrus ensembles activated during a negative experience (fear conditioning). They show that chronic chemogenetic activation of these ensembles leads to behavioral changes (increased anxiety, increased fear generalization, reduced fear extinction) and neural changes (increases in neuroinflammation, microglia, and astrocytes).

      Strengths:

      The question the authors ask here is an intriguing one, and the engram activation approach is a powerful way to address the question. Examination of a wide range of neural and behavioral dependent measures is also a strength.

      Weaknesses:

      The major weakness is that the authors have found a range of changes that are correlates of chronic negative engram reactivation. However, they do not manipulate these outcomes to test whether microglia, astrocytes, or neuroinflammation are causally linked to the dysregulated behaviors.

    1. eLife assessment

      This important work provides insights into the neural mechanisms regulating specific parental behaviors. By identifying a key role for oxytocin synthesizing cells in the paraventricular nucleus of the hypothalamus and their projections to the medial prefrontal cortex in promoting pup care and inhibiting infanticide, the study advances our understanding of the neurobiological basis of these contrasting behaviors in male and female mandarin voles. The evidence supporting the authors' conclusions is solid but lacks some critical methodological detail. The work should be of interest to researchers studying neuropeptide control of social behaviors in the brain.

    2. Reviewer #1 (Public Review):

      Summary:

      This important study investigated the role of oxytocin (OT) neurons in the paraventricular nucleus (PVN) and their projections to the medial prefrontal cortex (mPFC) in regulating pup care and infanticide behaviors in mandarin voles. The researchers used techniques like immunofluorescence, optogenetics, OT sensors, and peripheral OT administration. Activating OT neurons in the PVN reduced the time it took pup-caring male voles to approach and retrieve pups, facilitating pup-care behavior. However, this activation had no effect on females. Interestingly, this same PVN OT neuron activation prolonged the time for both male and female infanticidal voles to approach and attack pups, suggesting PVN OT neuron activity can promote pup care while inhibiting infanticide behavior. Inhibition of these neurons promoted infanticide. Stimulating PVN->mPFC OT projections facilitated pup care in males; in infanticide-prone voles, activation of these terminals prolonged latency to approach and attack. Inhibition of PVN->mPFC OT projections promoted infanticide. Peripheral OT administration increased pup care in males and reduced infanticide in both sexes. However, some results differed in females, suggesting other mechanisms may regulate female pup care.

      Strengths:

      This multi-faceted approach provides converging evidence, strengthens the conclusions drawn from the study, and makes them very convincing. Additionally, the study examines both pup care and infanticide behaviors, offering insights into the mechanisms underlying these contrasting behaviors. The inclusion of both male and female voles allows for the exploration of potential sex differences in the regulation of pup-directed behaviors. The peripheral OT administration experiments also provide valuable information for potential clinical applications and wildlife management strategies.

      Weaknesses:

      While the study presents exciting findings, there are several weaknesses that should be addressed. The sample sizes used in some experiments, such as the Fos study and optogenetic manipulations, appear to be small, which may limit the statistical power and generalizability of the results. Effect sizes are not reported, making it difficult to evaluate the practical significance of the findings. The imaging parameters and analysis details for the Fos study are not clearly described, hindering the interpretation of these results (i.e., was the entire PVN counted?). Also, does the Fos colocalization align with previous studies that look at PVN Fos and maternal/ paternal care? Additionally, the study lacks electrophysiological data to support the optogenetic findings, which could provide insights into the neural mechanisms underlying the observed behaviors.

      The study has several limitations that warrant further discussion. Firstly, the potential effects of manipulating OT neurons on the release of other neurotransmitters (or the influence of other neurochemicals or brain regions) on pup-directed behaviors, especially in females, are not fully explored. Additionally, it is unclear whether back-propagation of action potentials during optogenetic manipulations causes the same behavioral effect as direct stimulation of PVN OT cells. Moreover, the authors do not address whether the observed changes in behavior could be explained by overall increases or decreases in locomotor activity.

      The authors do not specify the percentage of PVN->mPFC neurons labeled that were OT-positive, nor do they directly compare the sexes in their behavioral analysis (or if they did, it is not clear statistically). While the authors propose that the sex difference in pup-directed behaviors is due to females having greater OT expression, they do not provide evidence to support this claim from their labeling data. It is also uncertain whether more OT neurons were manipulated in females compared to males. The study could benefit from a more comprehensive discussion of other factors that could influence the neural circuit under investigation, especially in females.

    3. Reviewer #2 (Public Review):

      Summary:

      This series of experiments studied the involvement of PVN OT neurons and their projection to the mPFC in pup-care and attack behavior in virgin male and female Mandarin voles. Using Fos visualization, optogenetics, fiber photometry, and IP injection of OT the results converge on OT regulating caregiving and attacks on pups. Some sex differences were found in the effects of the manipulations.

      Strengths:

      Major strengths are the modern multi-method approaches and involving both sexes of Mandarin vole in every experiment.

      Weaknesses:

      Weaknesses include the lack of some specific details in the methods that would help readers interpret the results. These include:

      (1) No description of diffusion of centrally injected agents.

      (2) Whether all central targets were consistent across animals included in the data analyses. In particular, it is not stated whether the medial prelimbic mPFC was the target in all optogenetic study animals, as shown in Figure 4; if that is the case, there is no discussion of that subregion's function compared to other mPFC subregions.

      (3) How groups of pup-care and infanticidal animals were created since there was no obvious pre-test mentioned so perhaps there was the testing of a large number of animals until getting enough subjects in each group.

      (4) The apparent use of a 20-minute baseline data collection period for photometry that started right after the animals were stressed from handling and placement in the novel testing chamber.

      (5) A weakness in the results reporting is that it's unclear what statistics are reported (2 x 2 ANOVA main effect or interaction results, t-test results) and that the degrees of freedom expected for the 2 x 2 ANOVAs in some cases don't appear to match the numbers of subjects shown in the graphs; including sample sizes in each group would be helpful because the graph panels are very small and data points overlap.

      Additional context that could help readers of this study is that the authors overlook some important mPFC and pup caregiving and infanticide studies in the introduction, which would help put this work in better context in terms of what is known about the mPFC and these behaviors. These previous studies include Febo et al., 2010; Febo 2012; Pereira and Morrell, 2011 and 2020; and a very relevant study by Alsina-Llanes and Olazábal, 2021 on mPFC lesions and infanticide in virgin male and female mice. The introduction states that nothing is known about the mPFC and infanticide. In the introduction and discussion, stating the species and sex of the animals tested in all the previous studies mentioned would be useful. The authors also discuss PVN OT cell stimulation findings seen in other rodents, so the work seems less conceptually novel. Overall, the findings add to the knowledge about OT regulation of pup-directed behavior in male and female rodents, especially the PVN-mPFC OT projection.

    4. Reviewer #3 (Public Review):

      Summary:

      Here Li et al. examine pup-directed behavior in virgin Mandarin voles. Some males and females tend towards infanticide, others tend towards pup care. c-Fos staining showed more oxytocin cells activated in the paraventricular nucleus (PVN) of the hypothalamus in animals expressing pup care behaviors than in infanticidal animals. Optogenetic stimulation of PVN oxytocin neurons (with an oxytocin-specific virus to express the opsin transgene) increased pup-care, or in infanticidal voles increased latency towards approach and attack.

      Suppressing the activity of PVN oxytocin neurons promoted infanticide. The use of a recent oxytocin GRAB sensor (OT1.0) showed changes in medial prefrontal cortex (mPFC) signals as measured with photometry in both sexes. Activating mPFC oxytocin projections increased latency to approach and attack in infanticidal females and males (similar to the effects of peripheral oxytocin injections), whereas in pup-caring animals only males showed a decrease in approach. Inhibiting these projections increased infanticidal behaviors in both females and males and had no effect on pup caretaking.

      Strengths:

      Adopting these methods for Mandarin voles is an impressive accomplishment, especially the valuable data provided by the oxytocin GRAB sensor. This is a major achievement and helps promote systems neuroscience in voles.

      Weaknesses:

      The study would be strengthened by an initial figure summarizing the behavioral phenotypes of voles expressing pup care vs infanticide: the percentages and behavioral scores of individual male and female nulliparous animals for the behaviors examined here. Do the authors have data about the housing or life history/experiences of these animals? How bimodal and robust are these behavioral tendencies in the population?

      Optogenetics with the oxytocin promoter virus is a nice advance here. More details about their preparation and methods should be in the main text, and not simply relegated to the methods section. For optogenetic stimulation in Figure 2, how were the stimulation parameters chosen? There is a worry that oxytocin neurons can co-release other factors- are the authors sure that oxytocin is being released by optogenetic stimulation as opposed to other transmitters or peptides, and acting through the oxytocin receptor (as opposed to a vasopressin receptor)?

      Given that they are studying changes in latency to approach/attack, having some controls for motion when oxytocin neurons are activated or suppressed might be nice. Oxytocin is reported to be an anxiolytic and a sedative at high levels.

      The OT1.0 sensor is also amazing, these data are quite remarkable. However, photometry is known to be susceptible to motion artifacts and I didn't see much in the methods about controls or correction for this. It's also surprising to see such dramatic, sudden, and large-scale suppression of oxytocin signaling in the mPFC in the infanticidal animals - does this mean there is a substantial tonic level of oxytocin release in the cortex under baseline conditions?

      Figure 5 is difficult to parse as-is, and relates to an important consideration for this study: how extensive is the oxytocin neuron projection from PVN to mPFC?

      In Figures 6 and 7, the authors use the phrase 'projection terminals'; however, to my knowledge, there have not been terminals (i.e., presynaptic formations apposed to a target postsynaptic site) observed in oxytocin neuron projections into target central regions.

      Projection-based inhibition as in Figure 7 remains a controversial issue, as it is unclear if the opsin activation can be fast enough to reduce the fast axonal/terminal action potential. Do the authors have confirmation that this works, perhaps with the oxytocin GRAB OT sensor?

      As females and males had similar GRAB OT1.0 responses in mPFC, why would the behavioral effects of increasing activity be different between the sexes?

    1. eLife assessment

      This method paper proposes a valuable Oscillation Component Analysis (OCA) approach, in analogy to Independent Component Analysis (ICA), in which source separation is achieved through biophysically inspired generative modeling of neural oscillations. The empirical evidence justifying the approach's advantage is incomplete. This work will be of interest to cognitive neuroscience, neural oscillation, and MEG/EEG.

    2. Reviewer #1 (Public Review):

      Summary:

      The present paper introduces Oscillation Component Analysis (OCA), in analogy to ICA, where source separation is underpinned by a biophysically inspired generative model. It puts the emphasis on oscillations, which are a prominent characteristic of neurophysiological data.

      Strengths:

      Overall, I find the idea of disambiguating data-driven decompositions by adding biophysical constraints useful, interesting and worth pursuing. The model incorporates both a component modelling of oscillatory responses that is agnostic about the frequency content (e.g., it doesn't need bandpass filtering or predefinition of bands) and a component to map between sensor and latent space. I feel these elements can be useful in practice.

      Weaknesses:

      Lack of empirical support: I am missing empirical justification of the advantages that are theoretically claimed in the paper. I feel the method needs to be compared to existing alternatives.

    1. eLife assessment

      The manuscript looks at how dysregulated purine metabolism in mutants for the Aprt gene impacts survival, motor and sleep behavior in the fruit fly. Interestingly, although several deficits arise from dopaminergic neurons, dopamine levels are increased in Aprt mutants. Instead, the biochemical change responsible for Aprt mutant neurobehavioural phenotypes appears to be a reduction in levels of adenosine. This valuable study suggests that Drosophila Aprt mutants may serve as a model for understanding Lesch-Nyhan Disease (LND), caused by mutations in the human HPRT1 gene, and may also potentially serve as a model to screen for drugs for the neurobehavioural deficits observed in LND. The strength of evidence is solid.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment

      This important study advances our understanding of how past and future information is jointly considered in visual working memory by studying gaze biases in a memory task that dissociates the locations during encoding and memory tests. The evidence supporting the conclusions is convincing, with state-of-the-art gaze analyses that build on a recent series of experiments introduced by the authors. This work, with further improvements incorporating the existing literature, will be of broad interest to vision scientists interested in the interplay of vision, eye movements, and memory.

      We thank the Editors and the Reviewers for their enthusiasm and appreciation of our task, our findings, and our article. We also wish to thank the Reviewers for their constructive comments that we have embraced to improve our article. Please find below our point-by-point responses to this valuable feedback, where we also state relevant revisions that we have made to our article.

      In addition, please note that we have now also made our data and code publicly available.

      Reviewer 1, Comments:

      In this study, the authors offer a fresh perspective on how visual working memory operates. They delve into the link between anticipating future events and retaining previous visual information in memory. To achieve this, the authors build upon their recent series of experiments that investigated the interplay between gaze biases and visual working memory. In this study, they introduce an innovative twist to their fundamental task. Specifically, they disentangle the location where information is initially stored from the location where it will be tested in the future. Participants are tasked with learning a novel rule that dictates how the initial storage location relates to the eventual test location. The authors leverage participants' gaze patterns as an indicator of memory selection. Intriguingly, they observe that microsaccades are directed toward both the past encoding location and the anticipated future test location. This observation is noteworthy for several reasons. Firstly, participants' gaze is biased towards the past encoding location, even though that location lacks relevance to the memory test. Secondly, there's a simultaneous occurrence of an increased gaze bias towards both the past and future locations. To explore this temporal aspect further, the authors conduct a compelling analysis that reveals the joint consideration of past and future locations during memory maintenance. Notably, microsaccades biased towards the future test location also exhibit a bias towards the past encoding location. In summary, the authors present an innovative perspective on the adaptable nature of visual working memory. They illustrate how information relevant to the future is integrated with past information to guide behavior.

      Thank you for your enthusiasm for our article and findings as well as for your constructive suggestions for additional analyses that we respond to in detail below.

      This short manuscript presents one experiment with straightforward analyses, clear visualizations, and a convincing interpretation. For their analysis, the authors focus on a single time window in the experimental trial (i.e., 0-1000 ms after retro cue onset). While this time window is most straightforward for the purpose of their study, other time windows are similarly interesting for characterizing the joint consideration of past and future information in memory. First, assessing the gaze biases in the delay period following the cue offset would allow the authors to determine whether the gaze bias towards the future location is sustained throughout the entire interval before the memory test onset. Presumably, the gaze bias towards the past location may not resurface during this delay period, but it is unclear how the bias towards the future location develops in that time window. Also, the disappearance of the retro cue constitutes a visual transient that may leave traces on the gaze biases which speaks again for assessing gaze biases also in the delay period following the cue offset.

      Thank you for raising this important point. We initially focused on the time window during the cue given that our central focus was on gaze-biases associated with mnemonic item selection. By zooming in on this window, we could best visualize our main effects of interest: the joint selection (in time) of past and future memory attributes.

      At the same time, we fully agree that examining the gaze biases over a more extended time window yields a more comprehensive view of our data. To this end, we have now also extended our analysis to include a wider time range that includes the period between cue offset (1000 ms after cue onset) and test onset (1500 ms after cue onset). We present these data below. Because we believe our future readers are likely to be interested in this as well, we have now added this complementary visualization as Supplementary Figure 4 (while preserving the focus in our main figure on the critical mnemonic selection period of interest).

      Author response image 1.

      Supplementary Figure 4. Gaze biases in extended time window as a complement to Figure 1 and Supplementary Figure 2. This extended analysis reveals that while the gaze bias towards the past location disappears around 600 ms after cue onset, the gaze bias towards the future location persists (panel a) and that while the early (joint) future bias occurs predominantly in the microsaccade range below 1 degree visual angle, the later bias to the future location incorporates larger eye movements that likely involve preparing for optimally perceiving the anticipated test stimulus (panel b).

      This extended analysis reveals that while the gaze bias towards the past location disappears around 600 ms after cue onset (consistent with our prior reports of this bias), the gaze bias towards the future location persists. Moreover, as revealed by the data in panel b above, while the early (joint) future bias occurs predominantly in the microsaccade range below 1 degree visual angle, the later bias to the future location incorporates larger eye movements that likely involve preparing for optimally perceiving the anticipated test stimulus.

      We now also call out these additional findings and figure in our article:

      Page 2 (Results): “Gaze biases in both axes were driven predominantly by microsaccades (Supplementary Fig. 2) and occurred similarly in horizontal-to-vertical and vertical-to-horizontal trials (Supplementary Fig. 3). Moreover, while the past bias was relatively transient, the future bias continued to increase in anticipation of the test stimulus and increasingly incorporated eye-movements beyond the microsaccade range (see Supplementary Fig. 4 for a more extended time range)”.

      Moreover, assessing the gaze bias before retro-cue onset allows the authors to further characterize the observed gaze biases in their study. More specifically, the authors could determine whether the future location is considered already during memory encoding and the subsequent delay period (i.e., before the onset of the retro cue). In a trial, participants encode two oriented gratings presented at opposite locations. The future rule indicates the test locations relative to the encoding locations. In their example (Figure 1a), the test locations are shifted clockwise relative to the encoding location. Thus, there are two pairs of relevant locations (each pair consists of one stimulus location and one potential test location) facing each other at opposite locations and therefore forming an axis (in the illustration the axis would go from bottom left to top right). As the future rule is already known to the participants before trial onset it is possible that participants use that information already during encoding. This could be tested by assessing whether more microsaccades are directed along the relevant axis as compared to the orthogonal axis. The authors should assess whether such a gaze bias exists already before retro cue onset and discuss the theoretical consequences for their main conclusions (e.g., is the future location only jointly used if the test location is implicitly revealed by the retro cue).

      Thank you – this is another interesting point. We fully agree that additional analysis looking at the period prior to retrocue onset may also prove informative. In accordance with the suggested analysis, we have therefore now also analysed the distribution of saccade directions (including in the period from encoding to retrocue) as a function of the future rule (presented below, and now also included as Supplementary Fig. 5). Complementary recent work from our lab has shown how microsaccade directions can align to the axis of memory contents during retention (see de Vries & van Ede, eNeuro, 2024). Based on this finding, one may predict that if participants retain the items in a remapped fashion, their microsaccades may align with the axis of the future rule, and this could potentially already happen prior to cue onset.

      These complementary analyses show that saccade directions are predominantly influenced by the encoding locations rather than the test locations, as seen most clearly by the saccade distribution plots in the middle row of the figure below. To obtain time-courses, we categorized saccades as occurring along the axis of the future rule or along the orthogonal axis (bottom row of the figure below). Like the distribution plots, these time course plots also did not reveal any sign of a bias along the axis of the future rule itself.

      Importantly, note how this does not argue against our main findings of joint selection of past and future memory attributes, as for that central analysis we focused on saccade biases that were specific to the selected memory item, whereas the analyses we present below focus on biases in the axes in which both memory items are defined; not only the cued/selected memory item.

      Author response image 2.

      Supplementary Figure 5. Distribution of saccade directions relative to the future rule from encoding onset. (Top panel) The spatial layouts in the four future rules. (Middle panel) Polar distributions of saccades during 0 to 1500 ms after encoding onset (i.e., the period between encoding onset and cue onset). The purple quadrants represent the axis of the future rule and the grey quadrants the orthogonal axis. (Bottom panel) Time courses of saccades along the above two axes. We did not observe any sign of a bias along the axis of the future rule itself.
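
      The axis-based categorization described above (saccades along the axis of the future rule versus along the orthogonal axis) can be illustrated with a minimal sketch. This is not the authors' analysis code: the function names, the angle convention (0 degrees = horizontal, counterclockwise positive), and the criterion that a saccade within 45 degrees of an axis belongs to that axis are all assumptions made here for illustration.

```python
import math


def classify_saccade_axis(dx, dy, rule_axis_deg):
    """Label a saccade as 'rule' or 'orthogonal' (hypothetical helper).

    dx, dy: horizontal and vertical saccade displacement (any consistent unit).
    rule_axis_deg: orientation of the future-rule axis in degrees
        (assumed convention: 0 = horizontal, counterclockwise positive).
    """
    direction = math.degrees(math.atan2(dy, dx))
    # An axis is undirected, so fold the angular difference into [0, 90].
    diff = abs((direction - rule_axis_deg + 90.0) % 180.0 - 90.0)
    # Assumed criterion: within 45 degrees of the rule axis counts as 'rule'.
    return "rule" if diff < 45.0 else "orthogonal"


def axis_proportions(saccades, rule_axis_deg):
    """Proportion of saccades along each axis for a list of (dx, dy) pairs."""
    labels = [classify_saccade_axis(dx, dy, rule_axis_deg) for dx, dy in saccades]
    n = len(labels)
    if n == 0:
        return {"rule": 0.0, "orthogonal": 0.0}
    n_rule = labels.count("rule")
    return {"rule": n_rule / n, "orthogonal": (n - n_rule) / n}
```

      Under these assumptions, an absence of bias along the future-rule axis would correspond to roughly equal proportions for the two labels; computing the proportions within successive time bins would give time courses of the kind shown in the bottom row of the figure.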

      We agree that these additional results are important to bring forward when we interpret our findings. Accordingly, we now mention these findings at the relevant section in our Discussion:

      Page 5 (Discussion): “First, memory contents could have directly been remapped (cf. 4,24–26) to their future-relevant location. However, in this case, one may have expected to exclusively find a future-directed gaze bias, unlike what we observed. Moreover, using a complementary analysis of saccade directions along the axis of the future rule (cf. 24), we found no direct evidence for remapping in the period between encoding and cue (Supplementary Fig. 5)”.

      Reviewer 2, Comments:

      The manuscript by Liu et al. reports a task, combining a retro cue with rules that indicate the location of an upcoming test probe, that is designed to examine the extent to which "past" and "future" information is encoded in working memory. An analysis of microsaccades on a fine temporal scale shows the extent to which shifts of attention track the location of the encoded item (past) and the location of the future item (test probe). The locations of the encoded grating and the test probe were always on orthogonal axes (horizontal, vertical) so that biases in microsaccades could be used to track shifts of attention to one or the other axis (or mixtures of the two). The overall goal here was then to (1) create a methodology that could tease apart memory for the past and future, respectively, (2) look at the time course of attention to past/future, and (3) test the extent to which microsaccades might jointly encode past and future memoranda. Finally, some remarks are made about the plausibility of various accounts of working memory encoding/maintenance based on the examination of these time courses.

      Strengths:

      This research has several notable strengths. It has a clear statement of its aims, is lucidly presented, and uses a clever experimental design that neatly orthogonalizes "past" and "future" as operationalized by the authors. Figure 1b-d shows fairly clearly that saccade directions have an early peak (around 300 ms) for the past and a "ramping" up of saccades moving in the forward direction. This seems to be a nice demonstration that the method can measure shifts of attention at a fine temporal resolution and differentiate past from future-oriented saccades due to the orthogonal cue approach. The second analysis, shown in Figure 2, reveals a dependency in saccade direction such that saccades toward the probe future were more likely also to be toward the encoded location than away from the encoded direction. This suggests saccades are jointly biased by both locations "in memory".

      Thank you for your overall appreciation of our work and for highlighting the above strengths. We also thank you for your constructive comments and call for clarifications that we respond to below.

      Weaknesses:

      (1) The "central contribution" (as the authors characterize it) is that "the brain simultaneously retains the copy of both past and future-relevant locations in working memory, and (re)activates each during mnemonic selection", and that: "... while it is not surprising that the future location is considered, it is far less trivial that both past and future attributes would be retained and (re)activated together. This is our central contribution." However, to succeed at the task, participants must retain the content (grating orientation, past) and probe location (future) in working memory during the delay period. It is true that the location of the grating is functionally irrelevant once the cue is shown, but if we assume that features of a visual object are bound in memory, it is not surprising that location information of the encoded object would bias processing as indicated by microsaccades. Here the authors claim that joint representation of past and future is "far less trivial"; this needs to be evaluated from the standpoint of prior empirical data on memory decay in such circumstances, or with some reference to the time course of the "unbinding" of features in an encoded object.

      Thank you. We agree that our participants have to use the future rule – as otherwise they do not know to which test stimulus they should respond. This was a deliberate decision when designing the task. Critically, however, this does not require (nor imply) that participants have to incorporate and apply the rule to both memory items already prior to the selection cue. It is at least as conceivable that participants would initially retain the two items at their encoded (past) locations, then wait for the cue to select the target memory item, and only then consider the future location associated with the target memory item. After all, in every trial, there is only 1 relevant future location: the one associated with the cued memory item. The time-resolved nature of our gaze markers argues against such a scenario, by virtue of our observation of the joint (simultaneous) consideration of past and future memory attributes (as opposed to selection of past-before-future). These temporal dynamics are central to the insights provided by our study.

      In our view, it is thus not obvious that the rule would be applied at encoding. In this sense, we do not assume that the future location is part of both memory objects from encoding, but rather ask whether this is the case – and, if so, whether the future location takes over the role of the past location, or whether past and future locations are retained jointly.

      Our statements regarding what is “trivial” and what is “less trivial” regard exactly this point: it is trivial that the future is considered (after all, our task demanded it). However, it is less trivial that (1) the future location was already available at the time of initial item selection (as reflected in the simultaneous engagement of past and future locations), and (2) that in presence of the future location, the past location was still also present in the observed gaze biases.

      Having said that, we agree that an interesting possibility is that participants remap both memory items to their future-relevant locations ahead of the cue, but that the past location is not yet fully “unbound” by the time of the cue. This may trigger a gaze bias not only to the new future location but also to the “sticky” (unbound) past location. We now acknowledge this possibility in our discussion (also in response to comment 3 below) where we also suggest how future work may be able to tap into this:

      Page 6 (Discussion): “In our study, the past location of the memory items was technically irrelevant for the task and could thus, in principle, be dropped after encoding. One possibility is that participants remapped the two memory items to their future locations soon after encoding, and had started – but not finished – dropping the past location by the time the cue arrived. In such a scenario, the past signal is merely a residual trace of the memory items that serves no purpose but still pulls gaze. Alternatively, however, the past locations may be utilised by the brain to help individuate/separate the two memory items. Moreover, by storing items with regard to multiple spatial frames (cf. 37) – here with regard to both past and future visual locations – it is conceivable that memories may become more robust to decay and/or interference. Also, while in our task past locations were never probed, in everyday life it may be useful to remember where you last saw something before it disappeared behind an occluder. In future work, it will prove interesting to systematically vary the delay between encoding and cue to assess whether the reliance on the past location gradually dissipates with time (consistent with dropping an irrelevant feature), or whether the past trace remains preserved despite longer delays (consistent with preserving utility for working memory).”

      (2) The authors refer to "future" and "past" information in working memory and this makes sense at a surface level. However, once the retrocue is revealed, the "rule" is retrieved from long-term memory, and the feature (e.g. right/left, top/bottom) is maintained in memory like any other item representation. Consider the classic test of digit span. The digits are presented and then recalled. Are the digits of the past or future? The authors might say that one cannot know, because past and future are perfectly confounded. An alternative view is that some information in working memory is relevant and some is irrelevant. In the digit span task, all the digits are relevant. Relevant information is relevant precisely because it is thought to be necessary in the future. Irrelevant information is irrelevant precisely because it is not thought to be needed in the immediate future. In the current study, the orientation of the grating is relevant, but its location is irrelevant; and the location of the test probe is also relevant.

      Thank you for this stimulating reflection. We agree that in our set-up, past location is technically “task-irrelevant” while future location is certainly “task-relevant”. At the same time, the engagement of the past location suggests to us that the brain uses past location for the selection – presumably because the brain uses spatial location to help individuate/separate the items, even if encoded locations are never asked about. Therefore, whether something is relevant or irrelevant ultimately depends on how one defines relevance (past location may be relevant/useful for the brain even if technically irrelevant from the perspective of the task). In comparison, the use of “past” and “future” may be less ambiguous.

      It is also worth noting how we interpret our findings in relation to demands on visual working memory, inspired by dynamic situations whereby visual stimuli may be last seen at one location but expected to re-appear at another, such as a bird disappearing behind a building (the example in our introduction). Thus, past for us does not refer to the memory item per se (like in the digit span analogue) but, rather, quite specifically to the past location of a dynamic visual stimulus in memory (which, in our experiment, was operationalised by the future rule, for convenience).

      (3) It is not clear how the authors interpret the "joint representation" of past and future. Put aside "future" and "past" for a moment. If there are two elements in memory, both of which are associated with spatial bindings, the attentional focus might be a spatial average of the associated spatial indices. One might also view this as an interference effect, such that the encoded location attracts spatial attention since it has not been fully deleted/removed from working memory. Again, for the impact of the encoded location to be exactly zero after the retrieval cue would require zero interference or instantaneous decay of the bound location information. It would be helpful for the authors to expand their discussion to further explain how the results fit within a broader theoretical framework and how they fit with empirical data on how quickly an irrelevant feature of an object can be deleted from working memory.

      Thank you also for this point (that is related to the two points above). As we stated in our reply to comment 1 above, we agree that one possibility is that the past location is merely “sticky” and pulls the task-relevant future bias toward the past location. If so, our time courses suggest that such “pulling” occurs only until approximately 600 ms after cue onset, as the past bias is only transient. An alternative interpretation is that the past location may not be merely a residual irrelevant trace, but actually be useful and used by the brain.

      For example, the encoded (past) item locations provide a coordinate system in which to individuate/separate the two memory items. While the future locations also provide such a coordinate system, the brain may benefit from holding onto both coordinate systems at the same time, rendering our observation of joint selection in both frames. Indeed, in a recent VR experiment in which we had participants (rather than the items) rotate, we also found evidence for the joint use of two spatial frames, even if neither was technically required for the upcoming task (see Draschkow, Nobre, van Ede, Nature Human Behaviour, 2022). Though highly speculative at this stage, such reliance on multiple spatial frames may make our memories more robust to decay and/or interference. Moreover, while past location was never explicitly probed in our task, in daily life the past location may sometimes (unexpectedly) become relevant, hence it may be useful to hold onto it, just in case. Thus, considering the past location merely as an “irrelevant feature” (that takes time to delete) may not do sufficient justice to the potential roles of retaining past locations of dynamic visual objects held in working memory.

      As also stated in response to comment 1 above, we now added these relevant considerations to our Discussion:

      Page 5 (Discussion): “In our study, the past location of the memory items was technically irrelevant for the task and could thus, in principle, be dropped after encoding. One possibility is that participants remapped the two memory items to their future locations soon after encoding, and had started – but not finished – dropping the past location by the time the cue arrived. In such a scenario, the past signal is merely a residual trace of the memory items that serves no purpose but still pulls gaze. Alternatively, however, the past locations may be utilised by the brain to help individuate/separate the two memory items. Moreover, by storing items with regard to multiple spatial frames (cf. 37) – here with regard to both past and future visual locations – it is conceivable that memories may become more robust to decay and/or interference. Also, while in our task past locations were never probed, in everyday life it may be useful to remember where you last saw something before it disappeared behind an occluder. In future work, it will prove interesting to systematically vary the delay between encoding and cue to assess whether the reliance on the past location gradually dissipates with time (consistent with dropping an irrelevant feature), or whether the past trace remains preserved despite longer delays (consistent with preserving utility for working memory).”

      Reviewer 3, Comments:

      This study utilizes saccade metrics to explore what the authors term the "past and future" of working memory. The study features an original design: in each trial, two pairs of stimuli are presented, first a vertical pair and then a horizontal one. Between these two pairs comes the cue that points the participant to one target of the first pair and another of the second pair. The task is to compare the two cued targets. The design is novel and original but it can be split into two known tasks - the first is a classic working memory task (a post-cue informs participants which of two memorized items is the target), which the authors have used before; and the second is a classic spatial attention task (a pre-cue signal that attention should be oriented left or right), which was used by numerous other studies in the past. The combination of these two tasks in one design is novel and important, as it enables the examination of the dynamics and overlapping processes of these tasks, and this has a lot of merit. However, each task separately is not new. There are quite a few studies on working memory and microsaccades and many on spatial attention and microsaccades. I am concerned that the interpretation of "past vs. future" could mislead readers to think that this is a new field of research, when in fact it is the (nice) extension of an existing one. Since there are so many studies that examined pre-cues and post-cues relative to microsaccades, I expected the interpretation here to rely more heavily on the existing knowledge base in this field. I believe this would have provided a better context for these findings, which are not only on "past" vs. "future" but also on "working memory" vs. "spatial attention".

      Thank you for considering our findings novel and important, while at the same time reminding us of the parallels to prior tasks studying spatial attention in perception and working memory. We fully agree that our task likely engages both attention to the (past) memory item as well as spatial attention to the upcoming (future) test stimulus. At the same time, there is a critical difference in spatial attention for the future in our task compared with ample prior tasks engaging spatial cueing of attention for perception. In our task, the cue never directly cues the future location. Rather, it exclusively cues the relevant memory item. It is the memory item that is associated with the relevant future location, according to the future rule. This integration of the rule-based future location into the memory representation is distinct from classical spatial-attention tasks in which attention is cued directly to a specific location via, for example, a spatial cue such as an arrow.

      Thus, if we wish to think about our task as engaging cueing of spatial attention for perception, we have to at least also invoke the process of cueing the relevant location via the appropriate memory item. We feel it is more parsimonious to think of this as attending to both the past and future location of a dynamic visual object in working memory.

      If we return to our opening example, when we see a bird disappear behind a building, we can keep in working memory where we last saw it, while anticipating where it will re-appear to guide our external spatial attention. Here too, spatial attention is fully dependent on working-memory content (the bird itself) – mirroring the dynamic setting in our study. Thus, we believe our findings contribute a fresh perspective, while of course also extending established fields. We now contextualize our finding within the literature and clarify our unique contribution in our revised manuscript:

      Page 5 (Discussion): “Building on the above, at face value, our task may appear like a study that simply combines two established tasks: tasks using retro-cues to study attention in working memory (e.g., 2,31–33) and tasks using pre-cues to study orienting of spatial attention to an upcoming external stimulus (e.g., 31,32,34–36). A critical difference with common pre-cue studies, however, is that the cue in our task never directly informed the relevant future location. Rather, as also stressed above, the future location was a feature of the cued memory item (according to the future rule), and not of the cue itself. Note how this type of scenario may not be uncommon in everyday life, such as in our opening example of a bird flying behind a building. Here too, the future relevant location is determined by the bird – i.e. the memory content – itself.”

      Reviewer 2, Recommendations:

      It would be helpful to set up predictions based on existing working memory models. Otherwise, the claim that the joint coding of past/future is "not trivial" is simply asserted, rather than contradicting an existing model or prior empirical results. If the non-trivial aspect is simply the ability to demonstrate the joint coding empirically through a good experimental design, make it clear that this is the contribution. For example, it may be that prevailing models predict exactly this finding, but nobody has been able to demonstrate it cleanly, as the authors do here. So the non-triviality is not that the result contradicts working memory models, but rather relates to the methodological difficulty of revealing such an effect.

      Thank you for your recommendation. First, please see our point-by-point responses to the individual comments above, where we also state relevant changes that we have made to our article, and where we clarify what we meant with “non trivial”. As we currently also state in our introduction, our work took as a starting point the framework that working memory is inherently about the past while being for the future (cf. van Ede & Nobre, Annual Review of Psychology, 2023). By virtue of our unique task design, we were able to empirically demonstrate that visual contents in working memory are selected via both their past and their future-relevant locations – with past and future memory attributes being engaged together in time. With “not trivial” we merely intend to make clear that there are viable alternatives to the findings we observed. For example, past could have been replaced by the future, or it could have been that item selection (through its past location) was required before its future-relevant location could be considered (i.e. past-before-future, rather than joint selection as we reported). We outline these alternatives in the second paragraph of our Discussion:

      Page 5 (Discussion): “Our finding of joint utilisation of past and future memory attributes emerged from at least two alternative scenarios of how the brain may deal with dynamic everyday working memory demands in which memory content is encoded at one location but needed at another.

      First, [….]”

      Our work was not motivated by a particular theoretical debate and did not aim to challenge ongoing debates in the working-memory literature, such as: slot vs. resource, active vs. silent coding, decay vs. interference, and so on. To our knowledge, none of these debates makes specific claims about the retention and selection of past and future visual memory attributes – despite this being an important question for understanding working memory in dynamic everyday settings, as we hoped to make clear by our opening example.

      Reviewer 3, Recommendations:

      I recommend that the present findings be more clearly interpreted in the context of previous findings on working memory and attention. The task design includes two components - the first (post-cue) is a classic working memory task and the second (the pre-cue) is a classic spatial attention design. Both components were thoroughly studied in the past and this previous knowledge should be better integrated into the present conclusions. I specifically feel uncomfortable with the interpretation of past vs. future. I find this framework to be misleading because it reads like this paper is on a topic that is completely new and never studied before, when in fact this is a study on the interaction between working memory and spatial attention. I recommend the authors minimize this past-future framing or be more explicit in explaining how this new framework relates to the more common terminology in the field and make sure that the findings are not presented in a vacuum, as another contribution to the vibrant field that they are part of.

      Thank you for these recommendations. Please also see our point-by-point responses to the individual comments above. There, we explained our logic behind using the terminology of past vs. future (in addition, see also our response to point 2 of reviewer 2). There, we also stated relevant changes that we have made to our manuscript to explain how our findings complement – but are also distinct from – prior tasks that used pre-cues to direct spatial attention to an upcoming stimulus. As we explained above, in our task, the cue itself never contained information about the upcoming test location. Rather, the upcoming test location was a property of the memory item (given the future rule). Hence, we referred to this as a “future attribute” of the cued memory item, rather than as the “cued location” for external spatial attention. Still, we agree the future bias likely (also) reflects spatial allocation to the upcoming test array, and we explicitly acknowledge this in our discussion. For example:

      Page 5 (Discussion): “This signal may reflect either of two situations: the selection of a future-copy of the cued memory content or anticipatory attention to the anticipated location of its associated test-stimulus. Either way, by the nature of our experimental design, this future signal should be considered a content-specific memory attribute for two reasons. First, the two memory contents were always associated with opposite testing locations, hence the observed bias to the relevant future location must be attributed specifically to the cued memory content. Second, we cued which memory item would become tested based on its colour, but the to-be-tested location was dependent on the item’s encoding location, regardless of its colour. Hence, consideration of the item’s future-relevant location must have been mediated by selecting the memory item itself, as it could not have proceeded via cue colour directly.”

      Page 6 (Discussion): “Building on the above, at face value, our task may appear like a study that simply combines two established tasks: tasks using retro-cues to study attention in working memory (e.g.,2,31-33) and tasks using pre-cues to study orienting of spatial attention to an upcoming external stimulus (e.g., 31,32,34–36). A critical difference with common pre-cue studies, however, is that the cue in our task never directly informed the relevant future location. Rather, as also stressed above, the future location was a feature of the cued memory item (according to the future rule), and not of the cue itself. Note how this type of scenario may not be uncommon in everyday life, such as in our opening example of a bird flying behind a building. Here too, the future relevant location is determined by the bird – i.e. the memory content – itself.”

    2. eLife assessment

      This important study advances our understanding of how past and future information is jointly considered in visual working memory by studying gaze biases in a memory task that dissociates the locations during encoding and memory tests. The evidence supporting the conclusions is convincing, with state-of-the-art gaze analyses that build on a recent series of experiments introduced by the authors. This work will be of broad interest to vision scientists interested in the interplay of vision, eye movements, and memory.

    3. Reviewer #1 (Public Review):

      In this study, the authors offer a fresh perspective on how visual working memory operates. They delve into the link between anticipating future events and retaining previous visual information in memory. To achieve this, the authors build upon their recent series of experiments that investigated the interplay between gaze biases and visual working memory. In this study, they introduce an innovative twist to their fundamental task. Specifically, they disentangle the location where information is initially stored from the location where it will be tested in the future. Participants are tasked with learning a novel rule that dictates how the initial storage location relates to the eventual test location. The authors leverage participants' gaze patterns as an indicator of memory selection. Intriguingly, they observe that microsaccades are directed towards both the past encoding location and the anticipated future test location. This observation is noteworthy for several reasons. Firstly, participants' gaze is biased towards the past encoding location, even though that location lacks relevance to the memory test. Secondly, there's a simultaneous occurrence of an increased gaze bias towards both the past and future locations. To explore this temporal aspect further, the authors conduct a compelling analysis that reveals the joint consideration of past and future locations during memory maintenance. Notably, microsaccades biased towards the future test location also exhibit a bias towards the past encoding location. In summary, the authors present an innovative perspective on the adaptable nature of visual working memory. They illustrate how information relevant to the future is integrated with past information to guide behavior.

    4. Reviewer #2 (Public Review):

      Summary:

      The manuscript by Liu et al. reports a task that is designed to examine the extent to which "past" and "future" information is encoded in working memory that combines a retrocue with rules that indicate the location of an upcoming test probe. An analysis of microsaccades on a fine temporal scale shows the extent to which shifts of attention track the location of the encoded item (past) and the location of the future item (test probe). The location of the encoded grating and test probe were always on orthogonal axes (horizontal, vertical) so that biases in microsaccades could be used to track shifts of attention to one or the other axis (or mixtures of the two). The overall goal here was then to (1) create a methodology that could tease apart memory for the past and future, respectively, (2) to look at the time-course attention to past/future, and (3) to test the extent to which microsaccades might jointly encode past and future memoranda. Finally, some remarks are made about the plausibility of various accounts of working memory encoding/maintenance based on the examination of these time-courses.

      Strengths:

      This research has several notable strengths. It has a clear statement of its aims, is lucidly presented, and uses a clever experimental design that neatly orthogonalized "past" and "future" as operationalized by the authors. Figure 1b-d shows fairly clearly that saccade directions have an early peak (around 300ms) for the past and a "ramping" up of saccades moving in the forward direction. This seems to be a nice demonstration that the method can measure shifts of attention at a fine temporal resolution and differentiate past from future oriented saccades due to the orthogonal cue approach. The second analysis shown in Figure 2, reveals a dependency in saccade direction such that saccades toward the probe future were more likely also to be toward the encoded location than away from the encoded direction. This suggests saccades are jointly biased by both locations "in memory". The "central contribution" (as the authors characterize it) is that "the brain simultaneously retains the copy of both past and future-relevant locations in working memory, and (re)activates each during mnemonic selection", and that: "... while it is not surprising that the future location is considered, it is far less trivial that both past and future attributes would be retained and (re)activated together. This is our central contribution." The authors provide a nuanced analysis that offers persuasive evidence that past and future representations are jointly maintained in memory.

    1. Author response:

      Factual error in the eLife assessment to be corrected:

      In the eLife assessment, "ribosomal protein H59" should be changed to "helix 59 of the 28S ribosomal RNA" to make this factually correct.

      Provisional author response

      We thank the reviewers for their thorough and thoughtful readings of the manuscript. Our responses to the four suggestions made in their public reviews are below.

      Reviewer #1 (Public Review):

      Major points:

      (1) The identification of RAMP4 is a pivotal discovery in this paper. The sophisticated AlphaFold prediction, de novo model building of RAMP4's RBD domain, and sequence analyses provide strong evidence supporting the inclusion of RAMP4 in the ribosome-translocon complex structure.

      However, it is crucial to ensure the presence of RAMP4 in the purified sample. Particularly, a validation step such as western blotting for RAMP4 in the purified samples would strengthen the assertion that the ribosome-translocon complex indeed contains RAMP4. This is especially important given the purification steps involving stringent membrane solubilization and affinity column pull-down.

      As suggested, we will revise the manuscript to include Western blots showing that RAMP4 is retained at secretory translocons (and not multipass translocons) after solubilisation, affinity purification, and recovery of ribosome-translocon complexes.

      (2) Despite the comprehensive analyses conducted by the authors, it is challenging to accept the assertion that the extra density observed in TRAP class 1 corresponds to calnexin. The additional density in TRAP class 1 appears to be less well-resolved, and the evidence for assigning it as calnexin is insufficient. The extra density there can be any proteins that bind to TRAP. It is recommended that the authors examine the density on the ER lumen side. An investigation into whether calnexin's N-globular domain and P-domain are present in the ER lumen in TRAP class 1 would provide a clearer understanding.

      We agree that the Calnexin assignment is less confident than the other assignments in this manuscript, and that further support would be ideal. We have exhaustively searched our maps for any unexplained density connected with the putative Calnexin TMD, and have found none. This is consistent with Calnexin's lumenal domain being flexibly linked to its TMD, and thus would not be resolved in a ribosome-aligned reconstruction.

      Our assignment of this TMD to Calnexin was based on existing biochemical data (referenced in the paper) favouring this as the best working hypothesis by far: Calnexin is TRAP’s only abundant co-purifying factor, and their interaction is sensitive to point mutations in the Calnexin TMD. Recognising that this is not conclusive, we will ensure that the text and figures consistently describe this assignment as provisional or putative.

      (3) In the section titled 'TRAP competes and cooperates with different translocon subunits,' the authors present a compelling explanation for why TRAP delta defects can lead to congenital disorders of glycosylation. To enhance this explanation, it would be valuable if the authors could provide additional analyses based on mutations mentioned in the references. Specifically, examining whether these mutations align with the TRAP delta-OSTA structure models would strengthen the link between TRAP delta defects and the observed congenital disorders of glycosylation.

      We agree that mapping disease-causing point mutants to the TRAP delta structure could be potentially informative. Unfortunately, the referenced TRAP delta disease mutants act by simply impairing TRAP delta expression, and thus admit no such fine-grained analyses. However, sequence conservation is our next best guide to mutant function. We note in the text that the contact site charges on TRAP delta and RPN2 are conserved, and that the closest-juxtaposed interaction pair (K117 on TRAPδ and D386 on RPN2) is also the most conserved.

      Reviewer #2 (Public Review):

      Strengths:

      The manuscript contains numerous novel structural analyses and their potential functional implications. While all findings are exciting, the highlight is the discovery of RAMP4/SERP1 near the Sec61 lateral gate. Overall, the strength is the thorough and extensive structural analysis of the different high-resolution RTC classes as well as the expert bioinformatic evolutionary analysis.

      Weaknesses:

      A minor downside of the manuscript is the sheer volume of analyses and mechanistic hypotheses, which makes it sometimes difficult to follow. The authors might consider offloading some analyses based on weaker evidence to the supplement to maximize impact.

      We agree that the manuscript is long, and we will seek ways to streamline it in revision while avoiding the undesirable side effect of making important findings undiscoverable via literature searches (an unfortunate consequence of many supplemental data). Indeed, we chose eLife for its flexibility regarding article length and suitability for extended and detailed analyses.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1:

      We are grateful for the overall positive feedback from the reviewer.

      We agree with the reviewer that our data showing cellular co-localization between PRC1 and BIN1 require further investigation in future studies. However, we are confident that, in its current form, our manuscript already presents multiple lines of evidence for the role of BIN1 in mitotic processes. We would like to emphasize that PRC1 is not the sole BIN1 partner that connects it to mitotic processes, but only one out of more than a dozen that we identified in our study. Furthermore, the mitotic connection with BIN1 is not entirely novel, as BIN1 levels fluctuate mildly during the cell cycle, similar to other proteins involved in the regulation of the cell cycle (Santos et al., 2015), and because DNM2 is also a well-accepted actor during mitosis (Thompson et al., 2002).

      The less marked co-localization between BIN1 and PRC1 compared to the strong co-localization between BIN1 and DNM2 can be a consequence of their weaker affinity and their partial binding. Yet, this does not necessarily imply that stronger interactions have more biological significance. For example, weaker affinities can be compensated by local concentrations to achieve an even higher degree of cellular complexes than of strongly binding interactions that are separated within the cell. Furthermore, even the degree of complex formation cannot be used intuitively to estimate the biological significance of a complex because complexes can trigger very important biological processes even at very low abundances, e.g. by catalyzing enzymatic reactions. Deciding what is and what is not “biologically significant” among the identified interactions remains to be answered in the future, once we are able to overview complex biological processes in a holistic manner.

      In the revised version, we implemented minor changes to further clarify the raised points.

      Reviewer #2:

      We thank the reviewer for the careful assessment and we are pleased to see the positive enthusiasm regarding our affinity interactomic strategy.

      The reviewer points out that affinities were only measured with a single technique, which is relatively unproven. While it is true that our work uses two techniques building on the same holdup concept, we rather believe that this approach is well-proven. The original holdup method was described almost 20 years ago and since then, it has been used in more than 10 publications for quantitative interactomics. Over the years, at least five distinct generations of the assay were developed, all building on the expertise of the preceding one. In the past, we extensively proved that the resulting affinities show excellent agreement with affinities measured with other methods, such as fluorescence polarization, isothermal titration calorimetry, or surface plasmon resonance (for example in Vincentelli et al. Nat. Meth. 2015; Gogl et al. 2020 Structure; Gogl et al. 2022 Nat.Com.). However, it is true that the most recent variation of this method family, called native holdup, is a fairly new approach published just a bit more than a year ago and this is only the third work that utilizes this method. Yet, in our original work describing the method, we demonstrated good agreement with the results of previous holdup experiments, as well as with orthogonal affinity measurements (Zambo et al. 2022).

      Importantly, the reviewer raises concerns regarding the number of replicates used in our study, as well as the reliability of our methodology. We are glad for such a comment as it allows us to explain our motives behind experimental design which is most often left out from scientific works to save space and keep focus on results. The reason why we use technical replicates instead of the typical biological replicates lies in the nature of the holdup assay. In a typical interactomic assay, such as immunoprecipitation, a lot of variables can perturb the outcome of the measurement, such as bait immobilization, or captured prey leakage during washing steps. The output of such an experiment is a list of statistically significant partners and to minimize these variabilities, biological replicates are used. In the case of a native holdup approach, a panel of an equal amount of resins, all saturated with different baits or controls, is mixed with an equal amount of cell extract, taken from a single tube, and after a brief incubation, the supernatant of this mixture is analyzed. The output of such an experiment is a list of relative concentrations of prey and to maximize its accuracy, we use technical replicates. Using an ideal analytical method, such as fluorescence, it is not necessary to use technical replicates to reach accurate results. For example, the general accuracy of a holdup experiment coupled with a robust analytical approach can be seen clearly in our fragmentomic holdup data shown in Figure 7C where mutant domains that do not have any impact on the interactome show extreme agreement in affinities. Unfortunately, mass spectrometry is less accurate as an analytical method, hence we use technical triplicates to compensate for this. Finally, in the case of BIN1, an independent nHU measurement was also performed using a less capable mass spectrometer. 
Not counting the 117 partners of BIN1 that were detected in only one of these proteomic measurements, 29 partners were identified as common significant partners in both measurements, showing nearly identical affinities with a mean standard deviation between measured pKapp values of 0.18, meaning that the obtained dissociation constants are within a <2.5-fold range with >95% probability. There were also 61 BIN1 partners that were detected in both proteomic measurements but were identified as a significant interaction partner in only one of these experiments. Yet many of them show binding in both assays, although they were found to be not significant in one of them. For example, CDC20 shows 66% depletion in one assay (significant binding) while it shows 54% depletion in the other (not significant binding), and CKAP2 shows 58% depletion in one assay (significant binding) while it shows 41% depletion in the other (not significant binding). We hope that these examples show that statistical significance in nHU experiments signifies how certain we are in a particular affinity measurement rather than the accuracy of the affinity measurement itself. While there are true discrepancies between some of the affinity measurements in these experiments, which could be clarified with more experimental replicates, the raw data presented in our work clearly demonstrate the strength and robustness of a fully quantitative interactomic assay.
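
The step from a pKapp standard deviation of 0.18 to a "<2.5-fold range" can be checked with a short calculation. The sketch below is only an illustration and is not taken from the manuscript: it assumes that the error in pKapp is approximately normal and that the ">95% probability" statement corresponds to about 1.96 standard deviations. The function name `fold_range` and the default `z` value are hypothetical choices made for this example.

```python
def fold_range(sd_pk: float, z: float = 1.96) -> float:
    """Fold-change in a dissociation constant spanned by z standard deviations in pK.

    pK is a log10 quantity, so a shift of d units in pK multiplies the
    dissociation constant by 10**d. The normal-error assumption and the
    z value are illustrative assumptions, not the authors' stated method.
    """
    return 10 ** (z * sd_pk)


# Mean standard deviation between pKapp values quoted in the response above.
sd_pkapp = 0.18
print(f"approximate fold range: {fold_range(sd_pkapp):.2f}")
```

Under these assumptions the resulting factor stays below 2.5, in line with the figure quoted in the response; a different convention for the interval would give a somewhat different number.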

      In the revised version, we clarified the number of replicates in the text, in the figure legends, and included some of this discussion in the method section.

      The reviewer had some very useful comments regarding affinity differences between short fragments and full-length proteins. In his comment, he possibly made a typo, as we find that full-length proteins typically interact with higher, not weaker, affinities compared to short PxxP motif fragments in isolation. The reviewer also comments that we explain this difference with cooperativity. In a previous preprint version, which the reviewer may have seen, this was indeed the case, but since we realized that we did not have sufficient evidence supporting this model, we did not discuss it in detail in the last version submitted to eLife. To clarify this, we included more discussion about the observed differences in the affinities between fragments and full-length proteins, but since we have limited data to make solid conclusions, we do not go into details about underlying models.

      Instead of cooperativity, the reviewer suggests that the observed differences may originate from additional residues that were not included in our peptides. Indeed, many similar experiments fail because of suboptimal peptide library design. Our peptide library was constructed as 15-mer, xxxxxxPxxPxxxxx motifs and we do not see a strong contribution of residues at the far end of these peptides. Specificity logo reconstructions are expected to identify all key residues that participate in SH3 domain binding, and based on this, all key residues of the identified motifs can be included in shorter 10-mer, xxxPxxPxxx motifs. Therefore, it is unlikely that residues outside our peptide regions will greatly contribute to the site-specific interactions of SH3 domains. It is however possible that other sites, that are sequentially far away from the studied PxxP motifs, are also capable of binding to SH3 through a different surface, but in light of the small size of an isolated SH3 domain, we believe it is very unlikely. It is also possible that BIN1 could also interact with other types of SH3 binding motifs that were not included in our peptide library. We think a more likely explanation is some sort of cooperativity. Cooperativity, or rather synergism between different sites can be easily explained in typical situations, such as in the case of a bimolecular interaction that is mediated by two independent sites. In such an event, once one site is bound, the second binding event will likely also occur because of the high effective local concentration of the binding sites. However, cooperativity can also form in atypical conditions and a molecular explanation for these events is rather elusive. As BIN1 contains a single SH3 domain, its binding to targets containing more binding sites can be challenging to interpret. 
If these sites are part of a greater Pro-rich region, such as in the case of DNM2, it is possible that the entire region adopts a fuzzy, malleable, yet PPII-like helical conformation. Once the SH3 domain is recruited to this helical region, it can freely trans-locate within this region via lateral diffusion and it will pause on optimal PxxP motifs. As an alternative to this sliding mechanism, a diffusion-limited cooperative binding can also occur. If the two motifs are not part of the same Pro-rich region, but are relatively close in space, such as in the case of ITCH or PRC1, once a BIN1 molecule dissociates from one site, it has a higher chance to rebind to the second site due to higher local concentrations. Such an event can more likely occur if a transient, but relatively stable encounter complex exists between the two molecules, from which complex formation can occur at both sites (A+B↔AB; AB↔ABsite1; AB*↔ABsite2). However, this large effective local concentration in this encounter complex is only temporary because diffusion rapidly diminishes it, although weak electrostatic interactions can increase the lifetime of such encounter complexes. In contrast, the large effective local concentration in conventional multivalent binding is time-independent and only determined by the geometry of the complex. Finally, it may also occur that our empirical bait concentration estimation for immobilized biotinylated proteins is less accurate than the concentration estimation of peptide baits because we approximate this value based on peptide baits. For this technical reason, which was discussed in detail in the original paper describing the nHU approach, we are carefully using apparent affinities for nHU experiments. Nevertheless, even without accurate bait concentrations, our nHU experiment provides precise relative affinities and, thus partner ranking. 
Either of the mechanisms underlying the interactions we study would be difficult to further explore experimentally, especially at the proteomic level.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Recommendations For The Authors):

      The data is poorly dealt with, and the figures are shown poorly. For example, Figure 2A is not even shown totally.

      We apologize for any difficulties that the reviewer encountered while attempting to view the figures. We have confirmed that all figures, including all panels of Figure 2, display correctly on the HTML and PDF versions of the article hosted at bioRxiv. The HTML and PDF versions generated by eLife also appear to contain all figures and panels in their entirety.

      Reviewer #2 (Recommendations For The Authors):

      Please refer to the public review for possible revisions.

      We thank Reviewer #2 for the summary and thoughtful comments provided in the Public Review. We note the point of possible revision from the Public Review: “It can be informative to directly demonstrate DPYD promoter-enhancer interactions. However, the genetic variants support the integration of regulatory activities.” In Figure 4, we provide evidence for direct promoter-enhancer interaction through the use of 3C. We furthermore demonstrate that these interactions are dependent upon genotype at rs4294451, as stated by the reviewer. We have highlighted the promoter-enhancer interaction in the revised manuscript, lines 323-325. The role of genotype in this interaction is also specifically discussed in lines 378-381.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Gap junction channels establish gated intercellular conduits that allow the diffusion of solutes between two cells. Hexameric connexin26 (Cx26) hemichannels are closed under basal conditions and open in response to CO2. In contrast, when forming a dodecameric gap junction, channels are open under basal conditions and close with increased CO2 levels. Previous experiments have implicated Cx26 residue K125 in the gating mechanism by CO2, which is thought to become carbamylated by CO2. Carbamylation is a labile post-translational modification that confers negative charge to the K125 side chain. How the introduction of a negative charge at K125 causes a change in gating is unclear, but it has been proposed that carbamylated K125 forms a salt bridge with the side chain at R104, causing a conformational change in the channel. It is also unclear how overall gating is controlled by changes in CO2, since there is significant variability between structures of gap-junction channels and the cytoplasmic domain is generally poorly resolved. Structures of WT Cx26 gap-junction channels determined in the presence of various concentrations of CO2 have suggested that the cytoplasmic N-terminus changes conformation depending on the concentration of the gas, occluding the pore when CO2 levels are high.

      In the present manuscript, Deborah H. Brotherton and collaborators use an intercellular dye-transfer assay to show that Cx26 gap-junction channels containing the K125E mutation, which mimics carbamylation caused by CO2, are constitutively closed even at CO2 concentrations where WT channels are open. Several cryo-EM structures of WT and mutant Cx26 gap junction channels were determined at various conditions and using classification procedures that extracted more than one structural class from some of the datasets. Together, the features on each of the different structures are generally consistent with previously obtained structures at different CO2 concentrations and support the mechanism that is proposed in the manuscript. The most populated class for K125E channels determined at high CO2 shows a pore that is constricted by the N-terminus, and a cytoplasmic region that was better resolved than in WT channels, suggesting increased stability. The K125E structure closely resembles one of the two major classes obtained for WT channels at high CO2. These findings support the hypothesis that the K125E mutation biases channels towards the closed state, while WT channels are in an equilibrium between open and closed states even in the presence of high CO2. Consistently, a structure of K125E obtained in the absence of CO2 appeared to also represent a closed state but at lower resolution, suggesting that CO2 has other effects on the channel beyond carbamylation of K125 that also contribute to stabilizing the closed state. Structures determined for K125R channels, which are constitutively open because arginine cannot be carbamylated, and would be predicted to represent open states, yielded apparently inconclusive results.

      A non-protein density was found to be trapped inside the pore in all structures obtained using both DDM and LMNG detergents, suggesting that the density represents a lipid rather than a detergent molecule. It is thought that the lipid could contribute to the process of gating, but this remains speculative. The cytoplasmic region in the tentatively closed structural class of the WT channel obtained using LMNG was better resolved. An additional portion of the cytoplasmic face could be resolved by focusing classification on a single subunit, which had a conformation that resembled the AlphaFold prediction. However, this single-subunit conformation was incompatible with a C6-symmetric arrangement. Together, the results suggest that the identified states of the channel represent open states and closed states resulting from interaction with CO2. Therefore, the observed conformational changes illuminate a possible structural mechanism for channel gating in response to CO2.

      Some of the discussion involving comparisons with structures of other gap junction channels is relatively hard to follow as currently written, especially for a general readership. Also, no additional functional experiments are carried out to test any of the hypotheses arising from the data. However, structures were determined in multiple conditions, with results that were consistent with the main hypothesis of the manuscript. No discussion is provided, even if speculative, to explain the difference in behavior between hemichannels and gap junction channels. Also, no attempt was made to measure the dimensions of the pore, which is relevant because of the importance of identifying if the structures indeed represent open or closed states of the channel.

      We have considerably revised the manuscript in an attempt to make it more tractable. We respond to the individual comments below.

      Reviewer #2 (Public Review):

      Summary:

      The manuscript by Brotherton et al. describes a structural study of connexin-26 (Cx26) gap junction channel mutant K125E, which is designed to mimic the CO2-inhibited form of the channel. In the wild-type Cx26, exposure to CO2 is presumed to close the channel through carbamylation of the residue K125. The authors mutated K125 to a negatively charged residue to mimic this effect, and they observed by cryo-EM analysis of the mutated channel that the pore of the channel is constricted. The authors were able to observe conformations of the channel with resolved density for the cytoplasmic loop (in which K125 is located). Based on the observed conformations and on the position of the N-terminal helix, which is involved in channel gating and in controlling the size of the pore, the authors propose the mechanisms of Cx26 regulation.

      Strengths:

      This is a very interesting and timely study, and the observations provide a lot of new information on connexin channel regulation. The authors use state-of-the-art cryo-EM analysis and 3D classification approaches to tease out the conformations of the channel that can be interpreted as "inhibited", with important implications for our understanding of how the conformations of the connexin channels are controlled.

      Weaknesses:

      My fundamental question to the premise of this study is: to what extent can K125 carbamylation be recapitulated by a simple K125E mutation? Lysine has a large side chain, and its carbamylation would make it even slightly larger. While the authors make a compelling case for E125-induced conformational changes focusing primarily on the negative charge, I wonder whether they considered the extent to which their observation with this mutant may translate to the carbamylated lysine in the wild-type Cx26, considering not only the charge but also the size of the modified side-chain.

      This is an important point. We agree that the difference in size will have a different effect on the structure. For kinases, aspartate or glutamate are often used as mimics of phosphorylated serine or threonine and these will have the same issues. The fact that we cannot resolve the relevant side-chains in the density may be indicative that the mutation doesn’t give the whole story. It may be able to shift the equilibrium towards the closed conformation, but not stably trap the molecule in that conformation. We include a comment to this effect in the revised manuscript.

      Reviewer #3 (Public Review):

      Summary:

      The mechanism underlying the well-documented CO2-regulated activity of connexin 26 (Cx26) remains poorly understood. This is largely due to the labile nature of CO2-mediated carbamylation, making it challenging to visualize the effects of this reversible posttranslational modification. This paper by Brotherton et al. aims to address this gap by providing structural insights through cryo-EM structures of a carbamylation-mimetic mutant of the gap junction protein.

      Strengths:

      The combination of the mutation, elevated PCO2, and the use of LMNG detergent resulted in high-resolution maps that revealed, for the first time, the structure of the cytoplasmic loop between transmembrane helix (TM) 2 and 3.

      Weaknesses:

      The presented maps merely reinforce their previous findings, wherein wildtype Cx26 favored a closed conformation in the presence of high PCO2. While the structure of the TM2-TM3 loop may suggest a mechanism for stabilizing the closed conformation, no experimental data was provided to support this mechanism. Additionally, the cryo-EM maps were not effectively presented, making it difficult for readers to grasp the message.

      We have extensively revised the manuscript so that the novelty of this study is more apparent. There are three major points:

      (1) The carbamylation mimetic pushes the conformation towards the closed conformation. Previously we just showed that CO2 pushes the conformation towards this conformation. Though we could show this was not due to pH, and could speculate this was due to carbamylation as suggested by previous mutagenesis studies, our data did not provide any mechanism whereby Lys125 was involved.

      (2) In going from the open to closed conformations, not only is a conformational change in TM2 involved, as we saw previously, but also a conformational change in TM1, the linker to the N-terminus and the cytoplasmic loop. Thus there is a clear connection between Lys125 and the conformation of the pore-closing N-terminus.

      (3) We observe for the first time in any connexin structure, density for the cytoplasmic loop. Since this loop is important in regulation, knowing how it might influence the positions of the transmembrane helices is important information if we are to understand how connexins can be regulated.

      Reviewing Editor:

      The reviewers have agreed on a list of suggested revisions that would improve the eLife assessment if implemented, which are as follows:

      (1) For completeness, Figure 1 could be supplied with an example of what the experiment would look like in the presence of CO2 - for the wild-type and for the K125E mutant. Presumably for the wild-type this has been done previously in exactly this assay format, but this control would be an important part of characterization for the mutant. Page 4, lines 105-106; "unsurprisingly, Cx26K125E gap junctions remain closed at a PCO2 of 55 mmHg." The data should be presented in the manuscript.

      We have now included the data with a PCO2 of 55 mmHg. This is now Figure 4 in our revised manuscript.

      (2) Would AlphaFold predictions show any interpretable differences in the E125 mutant, compared to the K125 (the wild-type)?

      We tried this in response to the reviewer’s suggestion. We did not see any interpretable differences. In general AlphaFold is not recognised as giving meaningful information around point mutations.

      (3) The K125R mutant appears to be a more effective control for extracting significant features from the K125E maps. Given that the use of a buffer containing high PCO2 is essential for obtaining high-resolution maps, wildtype Cx26 is unsuitable as an appropriate control. The K125R map, obtained at a high resolution (2.1Å), supports its suitability as a robust control.

      Though we are unsure what the referee is referring to here, we have rewritten this section and compare against the K125R map (Figure 5a) as well as that derived from the wild-type protein. The important point is that the K125E mutant causes a structural change that is consistent with the closure of the gap junctions that we observe in the dye-transfer assays.

      (4) Likewise, the rationale for using wildtype Cx26 maps obtained in DDM is unclear. Wildtype Cx26 seems to yield much better cryo-EM maps in LMNG. We suggest focusing the manuscript on the higher-quality maps, and providing supporting information from the DDM maps to discuss consistency between observations and the likely possibility that the non-protein density in the pore is lipid and not detergent.

      The rationale for comparing the mutants against the wt Cx26 maps obtained in DDM was because the mutants were also solubilised in DDM. However, taking the lead from the referees’ comments, we have now rewritten the manuscript so that we first focus on the data we obtain from protein solubilised in LMNG. We feel this makes our message much clearer.

      (5) In general, the rationale for utilizing cryo-EM maps with the entire selected particles is unclear. Although the overall resolutions may slightly improve in this approach, the regions of interest, such as the N-terminus and the cytoplasmic loop, appear to be better ordered after further classifications. The paper would be more comprehensible if it focuses solely on the classes representing the pore-constricting N-terminus (PCN) and the pore-open flexible N-terminus (POFN) conformations. Also, the nomenclatures used in the manuscript, such as "WT90-Class1", "K125E90-1", "LMNG90-class1", "LMNG90-mon-pcn" are confusing.

      LMNG90s are also wildtype; K125E-90-1 is in Class1 for this mutant and is similar to WT90-Class2, which represents the PCN conformation. More consistent and intuitive nomenclatures would be helpful.

      We agree with the referees’ comments. This should now be clearer with our rewritten manuscript where we have simplified this considerably. We now call the conformations NConst (N-terminus defined and constricting the pore) and NFlex (N-terminus not visible) and keep this consistent throughout.

      (6) A potential salt bridge between the carbamylated K125 and R104 is proposed to account for the prevalence of Class-1 (i.e., PCN) in the majority of cryo-EM particles. However, the side chain densities are not well-defined, suggesting that such an interaction may not be strong enough to trap Cx26 in a closed conformation. Furthermore, the absence of experimental data to support this mechanism makes it unclear how likely this mechanism may be. Combining simple mutagenesis, such as R104E, with a dye transfer assay could offer support for this mechanism. Are there any published experimental results that could help address this question without the need for additional experimental work? Alternatively, as acknowledged in the discussion, this mechanism may be deemed as an "over-simplification." What is an alternative mechanism?

      R104 has been mutated to alanine in gap junctions and tested in a dye transfer assay, as now mentioned in the text (Nijar et al, J Physiol 2021), supporting this role. In hemichannels R104 has been mutated to both alanine and glutamate and tested through dye loading assays (Meigh et al, eLife 2013). Also in hemichannels R104 and K125 have been mutated to cysteines, allowing them to be cross-linked through a disulphide bond. This mutant responds to a change in redox potential in a similar way to which the wild type protein responds to CO2 (Meigh et al, Open Biol 2015). Therefore, there is no doubt that the residues are important for the mechanism, and the salt-bridge interaction seems a plausible mechanism to reconcile the mutagenesis data; however, we cannot be sure that there are not other interactions involved that are necessary for closure. This information has now been included in the text.

      (7) The cryo-EM maps presented in the manuscript propose that gap junctions are constitutively open under normal PCO2 as the flexible N-terminus clears the solute permeation pathway in the middle of the channel. However, hemichannels appear to be closed under normal PCO2. It is puzzling how gap junctions can open when hemichannels are closed under normal PCO2 conditions. If this question has been addressed in previous studies, the underlying mechanism should be explicitly described in the introduction. If it remains an open question, differences in the opening mechanisms between hemichannels and gap junctions should be investigated.

      We suspect this is due to the difference in flexibility of gap junctions relative to hemichannels. However, a discussion of this is beyond this paper and would be complete speculation based on hemichannel structures of other connexins, performed in different buffering systems. There are no high resolution structures of Cx26 hemichannels.

      (8) A mystery density likely representing a lipid is abruptly introduced, but the significance of this discovery is unclear. It is hard to place the lipid on Figure S6 in the wider context of everything else that is discussed in the text. It would be helpful for readers if a figure were provided to show where the density is located in relation to all the other regions that are extensively discussed in the text.

      In the revised text this section has been completely rewritten. We have now included a more informative view in a new figure (Figure 1 – figure supplement 3).

      (9) Including and displaying even tentative pore-diameter measurements for the different states - this would be helpful for readers and provide a more direct visual cue as to the difference between open and closed states.

      We have purposely avoided giving precise measurements of the pore diameter, since this depends on how we model the N-terminus. The first three residues are difficult to model into the density without causing steric clashes with the neighbouring subunits.

      (10) Given that no additional experiments for channel function were carried out, it would be useful to provide a more detailed discussion of additional mutagenesis results from the literature that are related to the experimental results presented.

      We have amplified this in the discussion (see answer to point 6).

      The reviewers also agreed that improvements in the presentation of the data would strengthen the manuscript. Here is a summary list of suggestions by reviewers aimed at helping improve how the data is presented:

      (1) Why is the pipette bright green in the top image, but rather weakly green in the bottom image in Figure 1 - is this the case for all images?

      (Now figure 4) This depends on whether the pipette was in the focal plane of view or not. The important point of these images is the difference in intensity of the donor vs the recipient cell. The graphs in figure 4c illustrate clearly the difference between the wild-type and the mutant gap junctions.

      (2) In figures 2-5, labels would help a lot in understanding what is shown - while the legends do provide the information on what is presented, it would help the reader to see the models/maps with labels directly in the panel. For example, Figure 2a/b - just indicating "WT90 Cx26" in pink and "K125E90" in blue directly in the panel would reduce the work for the reader.

      We have extensively modified the labels in the figures to address this issue.

      (3) Figure 4 - magenta and pink are fairly close, and to avoid confusion it might be useful to use a different color selection. This is especially true when structures are overlayed, as in this figure - the presentation becomes rather complicated, so the less confusion the color code can introduce, the better.

      (Now Figure 2) We have now changed pink to blue.

      (4) Figure 5 - a remarkably under-labelled figure.

      Now added labels.

      (5) Figure 6 - it would be interesting to add a comparison to Cx32 here as well for completeness, since the structure has been published in the meantime.

      Cx32 has now been included.

      (6) Figure 7 - please add equivalent labels on both sides of the model, left and right. Add the connecting lines for all of the tubes TM helices - this will help trace the structural elements shown. The legend does not quite explain the colors.

      We have modified the figure as suggested and explained the colours in the legend.

      (8) Fig.1 legend; Unclear what mCherry fluorescence represents. State that Cx26 was expressed as a translational fusion with mCherry.

      Now figure 4. We have now written “Montages each showing bright field DIC image of HeLa cells with mCherry fluorescence corresponding to the Cx26K125E-mCherry fusion superimposed (leftmost image) and the permeation of NBDG from the recorded cell to coupled cells.”

      (9) Fig. 3 b); Show R104 in the figure. Also E129-R98/R99 interaction is hard to acknowledge from the figure. It seems that the side chain density of E129 is not strong enough to support the modeled orientation.

      This is now Figure 1c. While the density in this region is sufficient to be confident of the main chain, we agree that the side chain density for the E129-R98/R99 interaction is not sufficiently clear to draw attention to and have removed the associated comment from the figure legend. The density is focussed on the linker between TM1 and the N-terminus and the KVRIEG motif. We prefer to omit R104, in order to keep the focus on this region. As described in the manuscript, the density for the R104 side chain is poor.

      (10) Fig. 3 c); Label the N-terminus and KVRIEG motif in the figure.

      Now Figure 1b. We have labelled the N-terminus. The KVRIEG motif is not visible in this map.

      (11) Page 9, lines 246-248; Restate, "We note, however, density near to Lys125, between Ser19 in the TM1-N-term linker, Tyr212 of TM4 and Tyr97 on TM3 of the neighbouring subunit, which we have been unable to explain with our modelling."

      We have reworded this.

      (12) Page 14, line 399; Patch clamp recording is not included in the manuscript.

      Patch clamp recordings were used to introduce dye into the donor cell.

      (13) On the same Figure 2, clashes are mentioned but these are hard to appreciate in any of the figures shown. Perhaps would be useful to include an inset showing this.

      We have modified Figure 2b slightly and added an explanation to highlight the clash. It is slightly confusing because the residues involved belong to neighbouring subunits.

      (14) The discussion related to Figure 6 is very hard to follow for readers who are not familiar with the context of abbreviations included on the figure labels. This figure could be improved to allow a general readership to identify more clearly each of the features and structural differences that are discussed in the text.

      We have extensively changed the text and updated the labels on the figure to make it much easier for the reader to follow.

      Below, you can find the individual reviews by each of the three reviewers.

      Reviewer #1 (Recommendations For The Authors):

      (1) In Figure 2d-e, the text discusses differences between K125E 90-1 and WT 90-class2 (7QEW), yet the figure compares K125E with 7QEQ. I suggest including a figure panel with a comparison between the two structures discussed in the manuscript text.

      This has been changed in the revised manuscript.

      Other comments have been addressed above.

    1. Reviewer #3 (Public Review):

      The authors delved into an important aspect of abortifacient diseases of livestock in Tanzania. The thoughts of the authors on the topic and its significance are implied, and the methodological approach needs further clarity: the number of wards in the study area, the statistical selection of wards, and the type of questionnaire (i.e., open- or closed-ended) were not specified. Statistical analyses of outcomes were not clearly elucidated in the manuscript. Fifteen wards were mentioned in the text but only 13 were used; what were the exclusion criteria? Observations were from pastoral, agropastoral, and smallholder agroecological farmers, but no sample numbers or questionnaires were attributed to these farming systems to correlate findings with management systems. The impacts of the research output are not visible clearly enough to warrant intervention methods. What pathogens were identified from laboratory investigation, particularly with the use of culture and PCR? Zoonotic pathogens encountered, if any, were not mentioned. The public health importance of any of the abortifacient agents was not highlighted.

      In conclusion, based on the intent of the authors and the content of this research, and the weight of the research topic, there are obvious weaknesses in the critical data analysis to demonstrate cause, effect, and impact.

    2. Reviewer #2 (Public Review):

      The paper "The Value of Livestock Abortion Surveillance in Tanzania: Identifying Disease Priorities and Informing Interventions" provides a comprehensive analysis of the importance of livestock abortion surveillance in Tanzania. The authors aim to highlight the significance of this surveillance system in identifying disease priorities and guiding interventions to mitigate the impact of livestock abortions on both animal and human health.

      Summary:

      The paper begins by discussing the context of livestock farming in Tanzania and the significant economic and social impact of livestock abortions. The authors then present a detailed overview of the livestock abortion surveillance system in Tanzania, including its objectives, methods, and data collection process. They analyze the data collected from this surveillance system over a specific period to identify the major causes of livestock abortions and assess their public health implications.

      Evaluation:

      Overall, this paper provides valuable insights into the importance of livestock abortion surveillance as a tool for disease prioritization and intervention planning in Tanzania. The authors effectively demonstrate the utility of this surveillance system in identifying emerging diseases, monitoring disease trends, and informing evidence-based interventions to control and prevent livestock abortions.

      Strengths:

      (1) Clear Objective: The paper clearly articulates its objective of highlighting the value of livestock abortion surveillance in Tanzania.

      (2) Comprehensive Analysis: The authors provide a thorough analysis of the surveillance system, including its methodology, data collection process, and findings as seen in the supplementary files.

      (3) Practical Implications: The paper discusses the practical implications of the surveillance system for disease control and public health interventions in Tanzania.

      (4) Well-Structured: The paper is well-organized, with clear sections and subheadings that facilitate understanding and navigation.

      Suggestions for Improvement:

      (1) Data Presentation: While the analysis is comprehensive, the presentation of data could be enhanced with the use of more visual aids such as tables, graphs, or charts to illustrate key findings.

      (2) Discussion Section: The paper could benefit from a more in-depth discussion of the implications of the findings for disease control strategies and policy formulation in Tanzania.

      (3) Future Directions: Including recommendations for future research or areas for further investigation would add depth to the paper.

      Summary:

      This paper contains thorough analysis and valuable insights. Overall, it makes a significant contribution to the literature on livestock abortion surveillance and its implications for disease control in Tanzania.

    3. Reviewer #1 (Public Review):

      Summary:

      The paper examined livestock abortion, as it is an important disease syndrome that affects productivity and livestock economies. If livestock abortion remains unexamined it poses risks to public health.

      Several pathogens are associated with livestock abortions across Africa; however, livestock disease surveillance data rarely include information from abortion events, little is known about the aetiology and impacts of livestock abortions, and data are not available to inform prioritisation of disease interventions. Therefore, the current study seeks to examine the issue in detail and proposes some solutions.

      The study took place in 15 wards in northern Tanzania spanning pastoral, agropastoral, and smallholder agro-ecological systems. The key objective is to investigate the causes and impacts of livestock abortion.

      The data collection system was set up such that farmers reported abortion cases to the livestock field officers of the Ministry of Livestock and Fisheries.

      The reports were made to the investigation teams. The teams only included abortion events that the livestock field officers could attend to within 72 hours of the event occurring.

      Also, a field investigation was carried out to collect diagnostic samples from aborted materials and aborting dams. In addition, questionnaires were administered to collect data on herd/flock management. Laboratory diagnostic tests were carried out for a range of abortigenic pathogens.

      Over the period of the study, 215 abortion events in cattle (n=71), sheep (n=44), and goats (n=100) were investigated. The number of investigated cases varied widely across wards. The aetiological attribution, achieved for 19.5% of cases through PCR-based diagnostics, was significantly affected by delays in the field investigation.

      The result also revealed that vaginal swabs from aborting dams provided a practical and sensitive source of diagnostic material for pathogen detection.

      Livestock abortion surveillance can generate valuable information on causes of zoonotic disease outbreaks and livestock reproductive losses, and can identify important pathogens that are not easily captured through other forms of livestock disease surveillance. The study demonstrated the feasibility of establishing an effective reporting and investigation system that could be implemented across a range of settings, including remote rural areas.

      Strengths:

      The paper combines both science and socio-economic methodology to achieve the aim of the study. The methodology was well presented and the sequence was great. The authors explain where and how the data was collected. Figure 2 was used to describe the study area, which was excellently done. The section on the investigation of cases was well written. The sample analysis was also well written. The authors devoted a section to summarizing the investigated cases and describing the livestock study population. The logit model was well presented.

    4. eLife assessment

      This important study reports the use of a surveillance approach in identifying emerging diseases, monitoring disease trends, and informing evidence-based interventions in the control and prevention of livestock abortions, as it relates to their public health implications. The data support the convincing finding that abortion incidence is higher during the dry season, and occurs more in cross-bred and exotic livestock breeds. Aetiological and epidemiological data can be generated through established protocols for sample collection and laboratory diagnosis. These findings are of potential interest to the fields of veterinary medicine, public health, and epidemiology.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      The reviewers' thoughtful comments have helped us make the manuscript both more comprehensive and clearer. Thank you for your time and effort. We know that this is a long and technical paper. In our responses we refer to three documents:

      • Original: the first original submission

      • Revision: the revised document (02 MillardFranklinHerzog2023 v2.pdf)

      • Difference: a document that shows the changes made to text (but not figures or tables) from the original to revision (03 MillardFranklinHerzog2023 diff.pdf).

      Reviewer #1 (Recommendations For The Authors):

      (1) In general, the paper is well written and addresses important questions of muscle mechanics and muscle modeling. In the current version, the model limitations are briefly summarized in the abstract. However, the discussion needs a more complete description of limitations as well as a discussion of types of data (in vivo, ex vivo, single fiber, whole muscle, MTU, etc.) that can be modeled using this approach.

      Please see the response to comment 23 for more details of the limitations that have been added to the revised document.

      (2) The choice of a model with several tendon parameters for simulating single muscle fiber experiments is not well justified.

      A rigid-tendon model with a slack length of zero was, in fact, used for these simulations for both the VEXAT and Hill models. In case this is still not clear: a rigid-tendon model of zero length is equivalent to no tendon at all. The text that first mentions the tendon model has now been modified to make it clearer that the parameters of the model were set to be consistent with no tendon at all:

      Please see the following text:

      Original:

      • page 17, column 1, line 28 ”... rigid tendon of zero length,”

      • page 17, column 1, line 51 ”... rigid tendon of zero length.”

      Revision:

      • page 19, column 1, line 19 ”... we used a rigid-tendon of zero length (equivalent to ignoring the tendon)”

      • page 19, column 1, line 38 ”... coupled with a rigid-tendon of zero-length.”

      Difference:

      • page 21, column 1, line 19 ”... we used a rigid-tendon... ”

      • page 21, column 1, line 45 ”... rigid-tendon of zero length ...”

      (3) A table that clarifies how all model parameters were estimated needs to be included in the main part of the manuscript.

      Two tables have been added to the manuscript that detail the parameters of the elastic-tendon cat soleus model (in the main body of the text) and the rabbit psoas fibril model (in an appendix). Each table includes:

      • A plain language parameter name

      • The mathematical symbol for the parameter

      • The value and unit of the parameter

      • A coded reference to the data source that indicates both the experimental animal and how the data was used to evaluate the parameter.

      Please see the following text:

      Revision:

      • page 11

      • page 42

      Difference:

      • page 11

      • page 46

      (4) The supplemental information is not properly referenced in the main text. There are a number of smaller issues that also need to be addressed.

      Thank you for your attention to detail. The following problems related to Appendix referencing have been fixed:

      • Appendices are now parenthetically referenced at the end of a sentence. However, a few references to figures (that are contained within an Appendix) still appear in the body of the sentence since moving these figure references makes the text difficult to understand.

      • All Appendices are now referenced in the main body of the text.

      (5) Abstract, line 6: While it is commonly assumed that the short range stiffness of muscle is due to cross bridges, Rack & Westbury (1974) noted that it occurs over a distance of 25-35 nm, and that many cross-bridges must be stretched even farther than this distance (their p. 348 middle). It seems unlikely that cross-bridges alone can actually account for the short-range stiffness.

      There are three parts to our response to this comment:

      (a) Rack & Westbury’s definition of short-range-stiffness and unrealistic cross-bridge stretches

      (b) Rack & Westbury’s definition of short-range-stiffness vs. linear-time-invariant system theory

      (c) Updates to the paper

      a. Rack & Westbury’s definition of short-range-stiffness and unrealistic cross-bridge stretches.

      As you note, on page 348, Rack and Westbury write that ”If the short range stiffness is to be explained in terms of extension of cross-bridges, then many of them must be extended further than the 25-35 nm mentioned above.” Having re-read the paper, it's not clear how these three factors are being treated in the 25−35 nm estimate:

      • the elasticity of the tendon and aponeurosis,

      • the elasticity of actin and myosin filaments,

      • and the cycling rate of the cross-bridges.

      Obviously the elasticity of the tendon, aponeurosis, actin, and myosin filaments will reduce the estimated amount of crossbridge strain during Rack and Westbury’s experiments. A potentially larger factor is the cycling rate of each cross-bridge. If each crossbridge cycles faster than 11 Hz (the maximum frequency Rack and Westbury used), then no single crossbridge would stretch by 25-35 nm. So why didn’t Rack and Westbury consider the cycling rate of crossbridges?

      Rack and Westbury reasoned that a perfectly elastic work loop would necessarily mean that all crossbridges stayed attached: as soon as a crossbridge cycles it would release its stored elastic energy and the work loop would no longer be elastic. Since Rack and Westbury measured some nearly perfect elastic work loops (the smallest loops in Fig. 2, 3, and 4), I guess they assumed crossbridges remained attached during the 25-35 nm crossbridge stretch estimate. However, even Rack and Westbury note that none of the work loops they measured were perfectly elastic and so there is room to entertain the idea that crossbridges are cycling.

      Fortunately, for this discussion, crossbridge cycling rates have been measured.

      In-vitro measurements by Uyeda et al. show that crossbridges are cycling at 30 Hz when moving at 0.5-1.2 length/s. At this rate, there would be enough time for a single crossbridge to cycle nearly 2.72 times for every cycle of the 11 Hz sinusoidal perturbations, reducing its expected strain from 25-35 nm down to 9.2−12.9 nm. This effect becomes even more pronounced if crossbridge cycling rate is used to explain the difference in sliding velocity between Uyeda et al.’s in-vitro data (0.5-1.2 length/s) and the maximum contraction velocity of an in-situ cat soleus (4.65 lengths/s, Scott et al.).
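
      The arithmetic behind this estimate can be reproduced in a few lines. The sketch below is only a back-of-envelope check, assuming (as the paragraph above does) that the strain carried by a single crossbridge scales inversely with the number of attachment cycles it completes during one perturbation cycle. The function and variable names are illustrative and are not taken from the paper's code repository.

```python
def strain_per_attachment(strain_nm, cycling_rate_hz, perturbation_hz):
    """Scale a crossbridge strain estimate by the number of crossbridge
    cycles completed during one cycle of the length perturbation."""
    cycles_per_perturbation = cycling_rate_hz / perturbation_hz
    return strain_nm / cycles_per_perturbation


# Values quoted in the text: 30 Hz cycling (Uyeda et al.), an 11 Hz
# perturbation (Rack and Westbury), and a 25-35 nm strain estimate.
cycles = 30.0 / 11.0
low_nm = strain_per_attachment(25.0, 30.0, 11.0)
high_nm = strain_per_attachment(35.0, 30.0, 11.0)
print(f"{cycles:.2f} cycles per perturbation: {low_nm:.1f}-{high_nm:.1f} nm")
```

      With the unrounded ratio 30/11 the upper value comes out slightly below the one quoted in the text, which appears to use the truncated ratio 2.72. Either way the estimate is on the order of 10 nm rather than 25-35 nm.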

      b. Rack & Westbury’s definition of short-range-stiffness vs. linear-time-invariant system theory

      Rack and Westbury defined short-range-stiffness to describe a specific kind of force response of the muscle to cyclical length changes:

      • muscle force is linear with length change,

      • and independent of velocity.

      Rack and Westbury’s definition therefore fails when viscous forces become noticeable, because viscous forces are velocity dependent.

      On line 6 of the abstract the term ‘short-range-stiffness’ is not used because Rack and Westbury’s definition is too narrow for our purposes. Instead we are using the more general approach of approximating muscle as a linear-time-invariant (LTI) system, where it is assumed that

      • the response of the system is linear

      • and time invariant.

      To unpack that a little, a muscle is considered in the ‘short-range’ in our work if it meets the criteria of a linear time-invariant (LTI) system:

      • the force response of muscle can be accurately described as a linear function of its length and velocity (its state)

      • and its response is not a function of time (which means constant stimulation, and no fatigue).

      In contrast to Rack and Westbury’s definition, the ‘short-range’ in linear systems theory is general enough to accommodate both elastic and viscous forces. In physical terms, small for an LTI approximation of muscle is larger than the short-range defined by Rack and Westbury: an LTI system can include velocity dependence, while short-range-stiffness ends when velocity dependence begins.

      c. Updates to the paper

      To make the differences between Rack and Westbury’s ‘short-range-stiffness’ and LTI system theory clearer:

      • We have removed all occurrences of ‘short-range’ that were associated with Kirsch et al. and have replaced this phrase with ‘small’.

      • On the first mention of Kirsch’s work we have made the wording more specific:

      Revision:

      • page 1, column 1, lines 4,5

      • page 1, column 2, lines 14-21 ”Under constant activation ...”

      Difference: page 1, column 2, line 19-26

      • page 1, column 1, lines 4,5

      • page 1, column 2, lines 20-27 ”Under constant activation ...”

      • A footnote has been added to contrast the definition of ‘small’ in the context of a linear time-invariant system with ‘short-range’ in the context of Rack and Westbury’s definition of short-range-stiffness.

      Revision: page 1, column 2, bottom

      Difference: page 1, column 2, bottom

      • In addition, we have added a brief overview of LTI system theory to make the analysis and results more easily understood:

      Revision: Figure 4 paragraph beginning on page 10, column 2, line 15 ”As long as ...”

      Difference: Figure 4 paragraph beginning on page 12, column 1, line 46 ”As long as ...”

      (6) Page 3, lines 6-8: It also seems unlikely that 25% of cross-bridges are attached at one time (Howard, 1997) even for supramaximal isometric stimulation. The number should be less than 20%. What would the ratio of load path stiffness be for low force movements such as changing the direction of a frictionless manipulandum or slow walking? The range of relative stiffnesses is of more interest than the upper limit.

      We have made the following updates to address this comment:

      • A 20% duty cycle now defines the upper bound stiffness of the actin-myosin load path.

      • We have also evaluated the lower bound actin-myosin stiffness when a single crossbridge is attached.

      • The stiffness of titin from Kellermayer et al. has been digitized at a length of 2 µm and 4 µm to more accurately capture the length dependence of titin’s stiffness.

      • We have added a new figure (Figure 14) to make it easier to compare the range of actin-myosin stiffness to titin-actin stiffness.

      • The text in the main body of the paper and the Appendix has been updated.

      • The script ’main ActinMyosinAndTitinStiffness.m’ used to perform the calculations and generate the figure is now a part of the code repository.

      Please see the following text:

      Revision

      • The paragraph beginning at page 2, column 2, line 45 ”The addition of a titin element ...”

      • Appendix A

      • Figure 14 (in Appendix A)

      Difference

      • The paragraph beginning at page 3, column 1, line 6: ”The addition of a titin element ...”

      • Appendix A

      • Figure 14 (in Appendix A)

      (7) Page 5, line 12: A word seems to be missing here, ”...together to further...”.

      Thank you for your attention to detail. The sentence has been corrected.

      Please see the following text:

      • Revision: page 4, column 2, line 40 ”... into a single ...”

      • Difference: page 5, column 1, line 18

      (8) Page 5, line 24-27: These ”theories” are not mutually exclusive, and it is misleading to suggest they are. There is evidence for binding of titin to actin at multiple locations and there is no reason why evidence supporting one binding location must detract from the evidence supporting other binding locations.

      The text has been modified to make it clear to readers that the different titin-actin binding locations are not mutually exclusive. Please see the following text:

      • Revision: page 5, column 1, lines 17-19, the sentence beginning ”As previously mentioned, ...”

      • Difference: page 5, column 1, lines 41-44

      (9) Page 5, lines 48-51: Should cite Kellermayer and Granzier (1996) not Kellermayer et al. (1997).

      The reference to ‘Kellermayer et al.’ has been changed to ‘Kellermayer and Granzier’. The comment that the year of the reference should be changed from (1997) to (1996) is confusing: the 1996 paper is being referenced.

      For further details please see:

      • Revision: page 5, column 1, 39-40

      • Difference: page 5, column 2, line 19-22

      (10) Also, Dutta et al. (2018) should be cited as further showing that N2A titin by itself slows actin motility on myosin.

      Thank you for the suggestion. The sentence has been modified to include Dutta et al.:

      For further details please see:

      • Revision: page 5, column 1, 40

      • Difference: page 5, column 2, line 19-22

      (11) Figure 2 legend and elsewhere: it is odd to say that experiments used ”a cat soleus” when more than one cat soleus was used. Change to ”cat soleus”. See also page 15, line 15.

      Thank you for your attention to detail. All occurrences of ‘a cat soleus’ have been changed, with some sentence revision, to ‘cat soleus’.

      (12) Page 6, line 10: It is not clear why an MTU was used to simulate single muscle fiber experiments. What is the justification for choosing this particular model? Also, the choice of model might explain why the version with stiff tendon performs better than the version with an elastic tendon, but this is never mentioned. Why not use a muscle model with no tendon (e.g., Wakeling et al., 2021 J. Biomech.)?

      Please see the response to comment 2.

      (13) Millard et al.’s activation dynamics model also fails to capture the length-dependence of activation dynamics (Shue and Crago, 1998; Sandercock and Heckman, 1997), which should be noted in the discussion along with other limitations.

      An additional limitations paragraph is in the revised manuscript that addresses this comment specifically. However, we have used Stephenson and Wendt as a reference for the shift in peak isometric force that comes with submaximal activation. In addition, we also reference Chow and Darling for the property that the maximum shortening velocity is reduced with submaximal activations.

      • Revision: page 22, column 1, line 41 ”Finally, the VEXAT model ...”

      • Difference: page 24, column 2, line 12 ”Finally, the VEXAT model ...”

      In addition, please see the response to comment 23.

      (14) Page 6, line 22: ”An underbar...”.

      Thank you for your attention to detail, this correction has been made.

      (14) Page 7, lines 27-32: This and other issues should be described in the Discussion under a heading of model limitations.

      Please see the response to comment 23.

      (15) Page 7, lines 43-44: Numerous papers from the last author’s laboratory contradict the claim that there is no force enhancement on the ascending limb by demonstrating that force enhancement does occur on the ascending limb (see e.g., Leonard & Herzog 2002, Peterson et al., 2004 and several papers from the Rassier laboratory).

      Thank you for your attention to detail. This statement is in error and has been removed. To improve this section of the paper, a paragraph has been added to briefly mention the experimental observations of residual force enhancement before proceeding to explain how this phenomenon is represented by the model.

      Please see the following text:

      Revision:

      • the paragraph starting on page 7, column 2, line 43 ”When active muscle is lengthened, ...”

      • and the following paragraph starting on page 8, column 1, line 3 “To develop RFE, ”

      Difference:

      • the paragraph starting on page 8, column 2, line 15

      • and the following paragraph starting on page 9, column 1, line 6

      (17) Figure 3 legend and elsewhere: The authors use Prado et al. (2005) to determine several titin parameters, however the simulations seem to focus on cat soleus, but Prado et al.’s paper is on rabbits. More clarity is needed about which specific results from which species and muscles were used to parameterize the model.

      The new parameter table includes coded entries to indicate the literature source for experimental data, the animal it came from, and how the data was used. For example, the ‘ECM fraction’ has a source of ‘R[57]’ to show that the data came from rabbits from reference 57. For further details, please see the response to comment #3.

      Please see the following text:

      • Revision: page 11, column 2, table section H: ‘ECM fraction’.

      • Difference: page 11, column 2, table section H: ‘ECM fraction’.

      To address this comment in a little more detail, we have had to use Prado et al. (2005) to give us estimates for only one parameter: P, the fraction of the passive force-length relation that is due to titin. Prado et al.’s measurements relating to P are unique to our knowledge: these are the only measurements we have to estimate P in any muscle, cat soleus or otherwise. Here we use the average of the values for P across the 5 muscles measured by Prado et al. as a plausible default value for all of our simulations.

      (18) Figure 4 seems unnecessary.

      Figure 4 has been removed.

      (19) Page 10, lines 17-18: provide the abbreviation (VAF) here with the definition (variance accounted for).

      Thank you for your attention to detail. The abbreviation has been added.

      Please see these parts of the manuscripts for details:

      • Revision: page 12, column 2, line 13

      • Difference: page 13, column 2, line 32

      (20) Page 11, lines 2-3: Here and elsewhere, it is clear that some model parameters have been optimized to fit the model. The main paper should include a table that lists all model parameters and how they were chosen or optimized, including but not limited to the information in Table 1 of the supplemental information section.

      See response to comment 3.

      (20) Page 17, lines 45 -49: Again, a substantial number of ad hoc adjustments to the model appear to be required. These should be described in the Discussion under limitations, and accounted for in the parameters table. See also legends to Fig. 12 and 13, page 19, lines 23-26.

      Please see the response to comment #3: a coded entry now appears to indicate the data source, the animal used in the experiment, and the method used to process the data. This includes entries for parameters which were estimated ‘E’ so that the model produced acceptable results in the simulations presented. In addition, the new discussion paragraph includes a number of sentences that use the adjustment to the active-titin-damping coefficient as an opening to discuss the limitations of the VEXAT’s titin-actin bond model and the circumstances under which the model’s parameters would need to be adjusted.

      Please see responses to comments 3 and 23 for additional details. In addition, please see the specific discussion text mentioning the change to βoPEVK:

      • Revision: page 22, column 1, line 30 ”In Sec. 3.3 we had ...”

      • Difference: page 24, column 1, line 49

      (22) Page 20, lines 50-11: It should be noted here that Tahir et al.’s (2018) model has both series and parallel elastic elements, provided by superposition of rotation (series) and translation (parallel) of a pulley.

      While it is true that Tahir et al.’s (2018) model has series and parallel elements, as do the other models mentioned, these models do not have the correct structure to yield a gain and phase response that mimics biological muscle. The text that I originally wrote attempted to explain this without going into the details. As you note, this explanation leaves something to be desired. The original text commenting on the models of Forcinito et al., Tahir et al., Haeufle et al., and Günther et al. has been updated to be more specific. Please see the following parts of the manuscripts for details:

      • Revision: page 22, column 2, line 20, the paragraph beginning ”The models of Forcinito ...”

      • Difference: page 24, column 2, line 44

      (23) Discussion: This section should include a description of model limitations, including the relatively large number of ad hoc modifications and how many parameters must be found by optimization in practice. The authors should discuss what types of data are most compatible for use with the model (ex vivo, in vivo, single fiber, whole muscle, MTU), requirements for applying the model to different types of data, and impediments to using the model on different types of data.

      An additional limitations paragraph has been added to the discussion.

      Please see the following text:

      • Revision: the paragraph beginning on page 22, column 1, line 11 ”Both the viscoelastic ...”

      • Difference: the paragraph beginning on page 24, column 1, line 27.

      Reviewer #2 (Recommendations For The Authors):

      (1) If it is possible to compare the output of this model to other more contemporary models which incorporate titin but are also simple enough to implement in whole-body simulation (such as the winding filament model), this would seem to greatly strengthen the paper.

      That’s an excellent idea, though beyond the scope of this already lengthy paper. Even though the Hill model we evaluated is a bit old it is widely used, and so, many readers will be interested in seeing the benchmark results. As benchmarking work is both difficult to fund and undertake, we do hope that others will evaluate their own models using the code and data we have provided.

      (2) I’m a little unclear on the basis for the transition between short- and midrange length changes, both in reality and in the model. And also about the range of strains that qualify as ”short”. It seems like there is potential for short range stiffness, although I would have thought more in the range of 1-2% strains than >3%, to be due to currently attached crossbridges. There is clear evidence that active titin is responsible for the low stiffness at very large strains that exceed actin-myosin overlap. But I am not clear on how a transitional stiffness on the descending limb of the force-length relationship is implemented in the model, and what aspect of physiology this is replicating. It may be helpful to clarify this further and indicate where in the model this stiffness arises.

      This question has several parts to it which I will paraphrase here:

      A Short-range stiffness acts over smaller strains than 3.8%. How is short-range defined?

      B Where is the transition made between short-range and mid-range force response, both in reality and in the model. Also how does this change on the descending limb?

      C What components in the model contribute to the stiffness of the CE?

      A. Short-range stiffness acts over smaller strains than 3.8%. How is short-range defined?

      The response to Reviewer 1’s comment # 5 directly addresses this question.

      B. Where is the transition made between short-range and mid-range force response, both in reality and in the model? Also, how does this change on the descending limb?

      We are going to rephrase the question because of changes in terminology that we have made in response to Reviewer 1’s comment #5.

      (i) What is the basis for the transition between the muscle behaving like an LTI system? Both in reality, and in the model. (ii) What happens outside the LTI range? (iii) Also how does this change on the descending limb?

      We will address this question one part at a time:

      (i) What is the basis for the transition between the muscle behaving like an LTI system? Both in reality, and in the model.

      A system’s response can be approximated as a linear-time-invariant (LTI) system as long as it is time-invariant, and its output can be expressed as a linear function of its input. In the context of Kirsch et al.’s experiment, the ‘system’ is the muscle, the ‘input’ is the time series of length data, and the ‘output’ is the time series of force data. Due to the requirement for time-invariance, two experimental conditions must be met to approximate muscle as an LTI system:

      • the nominal length of the muscle stays constant over long periods of time,

      • and the nominal activation of the muscle stays constant.

      These conditions were met by default in Kirsch et al.’s experiment, and also in our simulations of this experiment. The one remaining condition to assess is whether or not the muscle’s response is linear.

      To evaluate whether the muscle’s force is a linear function of the length change, Kirsch et al. evaluated (Cxy)², the coherence squared between the length and force time-series data. Even though the mathematical underpinnings of (Cxy)² are complicated, the interpretation of (Cxy)² is simple: muscle can be accurately approximated as a linear system if (Cxy)² is close to 1, but the accuracy of this approximation becomes poor as (Cxy)² approaches 0. Kirsch et al. used (Cxy)² to identify a bandwidth in which the response of the muscle to the 1−3.8% ℓoM length changes was sufficiently linear for analysis: a lower bound of 4 Hz was identified using (Cxy)² and the bandwidth of the input signal (15 Hz, 35 Hz, or 90 Hz) set the upper bound. In Fig. 3 of Kirsch et al. the (Cxy)² at 4 Hz has a value of at least 0.67 for the 15 Hz and 90 Hz signals. To minimize error in our analysis and yet be consistent with Kirsch et al., we analyze the bandwidth common to both (Cxy)² ≥ 0.67 and Kirsch et al.’s defined range. Though the bandwidth defined by the criteria (Cxy)² ≥ 0.67 is usually larger than the one defined by Kirsch et al., there are some exceptions where the lower frequency bound of the models is higher than 4 Hz (now reported in Tables 4D and 5D).
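
      For readers unfamiliar with coherence, the following is a minimal sketch of how coherence squared can be estimated at a single frequency by averaging over signal segments. It is not the authors' MATLAB implementation: the function names, the synthetic signals, and the choice of non-overlapping, unwindowed segments are all assumptions made to keep the example short. Practical estimators usually window and overlap the segments.

```python
import cmath
import math
import random


def dft_bin(segment, k):
    """Return the k-th DFT coefficient of a real-valued segment."""
    n = len(segment)
    return sum(v * cmath.exp(-2j * math.pi * k * i / n)
               for i, v in enumerate(segment))


def coherence_squared(x, y, seg_len, k):
    """Coherence squared at DFT bin k, averaged over non-overlapping
    segments: |Pxy|^2 / (Pxx * Pyy)."""
    pxx = pyy = 0.0
    pxy = 0j
    for start in range(0, len(x) - seg_len + 1, seg_len):
        xk = dft_bin(x[start:start + seg_len], k)
        yk = dft_bin(y[start:start + seg_len], k)
        pxx += abs(xk) ** 2           # input auto-spectrum
        pyy += abs(yk) ** 2           # output auto-spectrum
        pxy += xk.conjugate() * yk    # cross-spectrum
    return abs(pxy) ** 2 / (pxx * pyy)


rng = random.Random(0)                # fixed seed for repeatability
n_seg, seg_len, k = 32, 64, 4
# Hypothetical 'length' input: white noise. Hypothetical 'force' outputs:
# one exactly linear in length, one with variation length does not explain.
length = [rng.gauss(0.0, 1.0) for _ in range(n_seg * seg_len)]
force_linear = [3.0 * v for v in length]
force_noisy = [3.0 * v + rng.gauss(0.0, 3.0) for v in length]

c_linear = coherence_squared(length, force_linear, seg_len, k)
c_noisy = coherence_squared(length, force_noisy, seg_len, k)
print(f"linear: {c_linear:.3f}, with unexplained variation: {c_noisy:.3f}")
```

      The averaging over segments is essential: computed from a single segment, this ratio equals 1 regardless of the signals. In the sketch, the exactly linear output gives a value of 1, while the output containing variation that the input cannot explain (whether from noise or nonlinearity) gives a lower value.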

      (ii) What happens outside the LTI range?

      When a muscle’s output cannot be considered that of an LTI system, it means either that its length or activation is time-varying, or that the relationship between length and force is no longer linear. In short, the muscle is behaving as one would normally expect: time-varying and nonlinear. The wonderful part of Kirsch et al.’s work is that they found a surprisingly large region in the frequency domain where muscle behaves linearly and can be analyzed using the powerful tools of linear systems and signals.

      (iii) Also how does this change on the descending limb?

      Since the nominal length of Kirsch et al.’s experiments is ℓoM, it is not clear how the results of the perturbation experiments will change if the nominal length is moved firmly to the descending limb. However, we can see how the stiffness and damping values will change by examining Figures 9C and 9D, which show the calculated stiffness and damping of the VEXAT and Hill models as ℓM is lengthened from ℓoM down the descending limb: the stiffness and damping of the VEXAT model do not change much, while the Hill model’s stiffness changes sign and its damping coefficient changes a lot. What cannot be seen from Figures 9C and 9D is how the bandwidth over which the models are considered linear changes.

      We have made a number of updates to the text to more clearly communicate these details of our response to part (i):

      • Text has been edited so that it is clear that the terms ’short-range stiffness’ and ’small’ from Rack and Westbury’s work are not confused with ’stiffness’ and ’small’ from the LTI system’s analysis. Please see our response to comment # 5 for details.

      • We have added text to the main body of the paper to explain how the coherence squared metric was used to select a bandwidth in which the response of the system is approximately linear:

      – Revision: the paragraph that starts on page 11, column 1, line 3 ”Kirsch et al. used system identification ...”

      – Difference: page 13, column 2, line 1

      – Coherence is defined in Appendix D

      – Coherence is now also included in the example script ‘main SystemIdentificationExample.m’

      • The bandwidth over which model output can be considered linear (coherence squared > 0.67) has been added to Tables 4 and 5

      – Revision: see Table 4D, and Table 5D in Appendix E

      – Difference: see Table 4D, and Table 5D in Appendix E

      • Figures 6 and 16 are now annotated where the plotted signal does not meet the linearity requirement of (Cxy)² > 0.67.

      C. What components in the model contribute to the stiffness of the CE?

      There are three components that contribute to the stiffness of the CE which are pictured in Figure 1, appear in Eqn. 15, and are listed explicitly in Eqn. 76:

      (a) The XE, as represented by the afL(ℓ˜S+L˜M)k˜oX term in Eqn. 15.

      (b) The elasticity of the distal segment of titin, f2(ℓ˜2). Only f2(ℓ˜2) appears in Eqn. 15 because ℓ˜1 is a model state.

      (c) The extracellular matrix, as represented by the fECM(ℓ˜ECM)

      There is also a compressive element fKE, but it plays no role in the simulations presented in this work because it only begins to produce force at extremely short CE lengths (ℓ˜M < 0.1ℓoM).

      We have made the following changes to make these components clearer:

      Figure 1A has been updated:

      – The symbols for a spring and a damper are now defined in Figure 1A

      – The ECM now has a spring symbol. Now all springs and dampers have the correct symbol in Figure 1A.

      – The caption now explicitly lists the rigid, viscoelastic, and elastic elements in the model

      The equations for the VEXAT’s CE stiffness and damping are now compared and contrasted to the Hill model’s stiffness and damping in Sec. 3.1.

      – Revision: starting at page 14, column 2, line 1: Eqn. 28 and Eqn. 29 and surrounding text

      – Difference: page 17, column 1, line 22

      (3) This model appears to be an amalgamation of a phenomenological (force-length and force-velocity relationships) and a mechanistic (crossbridge and titin stiffness and damping) model. While this may improve predictions, and so potentially be useful, it also seems like it limits the interpretation of physiological underpinnings of any findings. It may be helpful to explore in greater detail the implications of this approach.

      We have added a limitations paragraph to the discussion which addresses this comment and can be found in:

      • Revision: the paragraph beginning on page 22, column 1, line 11 ”Both the viscoelastic ...”

      • Difference: the paragraph beginning on page 24, column 1, line 27

      (4) As a biologist, I found the interpretation of phase and gain a little difficult and it may help the reader to show in greater detail the time series data and model predictions to highlight conditions under which the models do not accurately capture the magnitude and timing of force production.

      It is important that the ideas of phase and gain are understood, especially because little information can be gleaned from the time series data directly. There is some time series data in the paper already that compares each model’s response to its spring-damper of best fit: plots of the force response of each model and its spring damper of best fit can be found in Figures 6A, 6D, 6G, 6J, 16A, 16D, 16G, and 16J in the revised manuscript. While it is clear that models with a higher VAF more closely match the spring-damper of best fit, there is not much more that can be taken from time series data: the systematic differences, particularly in phase, are just not visually apparent in the time-domain but are clear in gain and phase plots in the frequency-domain.

      To make the meaning of phase and gain plots clearer, Figure 4 (Figure 5 in the first submission) has been completely re-made and includes plots that illustrate the entire process of going from two length and force time-domain signals to gain and phase plots in the frequency-domain. Included in this figure is a visual representation of transforming a signal from the time to the frequency domain (Fig. 4B and 4C), and also an illustration of the terms gain and phase (Fig. 4D). In addition, a small example file ’main SystemIdentificationExample.m’ has been added to the MATLAB code repository in the elife2023 branch to accompany Appendix D, which goes through the mathematics used to transform input and output time domain signals into gain and phase plots of the input-output relation. Small updates have been made to Figures 6 and 16 in the revised paper (Figures 7 and 18 in the first submission) to make the time domain signals from the spring-damper of best fit and the model output clearer. Finally, I have re-calculated the gain and phase profiles using a more advanced numerical method that trades off some resolution in frequency for more accuracy in the magnitude. This has allowed me to make Figures 6 and 16 easier to follow because the gain and phase responses are now lines rather than a scattering of points. We hope that these additions make the interpretation of gain and phase clearer.
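
      As a complement to the description above, the following sketch shows what gain and phase mean for the simplest case discussed in this response, a spring-damper. It is an illustration only and is not the example script in the repository: the stiffness, damping, and frequency values are arbitrary, and all names are hypothetical. It assumes a force f = k·x + b·v, for which the frequency response from length to force is k + jωb, and compares that analytic gain and phase with values estimated from sampled time-domain signals.

```python
import cmath
import math


def spring_damper_gain_phase(k, b, freq_hz):
    """Analytic gain and phase (degrees) of f = k*x + b*v at freq_hz."""
    h = complex(k, 2.0 * math.pi * freq_hz * b)
    return abs(h), math.degrees(cmath.phase(h))


def estimate_gain_phase(x, y, n_cycles):
    """Estimate gain and phase (degrees) from time-domain signals that
    span exactly n_cycles periods, using a single DFT bin."""
    n = len(x)
    basis = [cmath.exp(-2j * math.pi * n_cycles * i / n) for i in range(n)]
    xk = sum(v * w for v, w in zip(x, basis))
    yk = sum(v * w for v, w in zip(y, basis))
    h = yk / xk                       # output over input at this frequency
    return abs(h), math.degrees(cmath.phase(h))


# Arbitrary illustrative values (not parameters from the paper).
k, b, freq_hz = 2.0, 0.02, 10.0
n, n_cycles = 1000, 5
dt = n_cycles / (freq_hz * n)         # sample time so n samples span 5 cycles
omega = 2.0 * math.pi * freq_hz

x = [math.sin(omega * i * dt) for i in range(n)]            # length change
v = [omega * math.cos(omega * i * dt) for i in range(n)]    # velocity
f = [k * xi + b * vi for xi, vi in zip(x, v)]               # force

gain_true, phase_true = spring_damper_gain_phase(k, b, freq_hz)
gain_est, phase_est = estimate_gain_phase(x, f, n_cycles)
print(f"gain {gain_est:.3f} (analytic {gain_true:.3f}), "
      f"phase {phase_est:.2f} deg (analytic {phase_true:.2f} deg)")
```

      Gain is the ratio of force amplitude to length amplitude at a given frequency, and a positive phase means that force leads length. For a pure spring (b = 0) the phase is zero, so any phase lead in this sketch comes from the damper.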

      Please see

      Revision:

      – Figure 4 and caption on page 12

      – The opening 2 paragraphs of Sec 3.1 starting on page 10, column 2, line 4 ”In Kirsch et al.’s ...”

      – Figure 6 & 16: spring damper and model annotation added, plotted the gain and phase as lines

      – Appendix D: Updated to include coherence and the more advanced method used to evaluate the system transfer function, gain, and phase.

      Difference:

      – Figure 4 and caption on page 12

      – The opening 2 paragraphs of Sec 3.1 starting on page 12, column 1, line 34 and ending on page 13, column 2, line 29

      – Figure 6 & 16: spring damper and model annotation added

      – Appendix D

      (5) The actin-myosin and actin-titin load pathways are depicted as distinct in the model. However, given titin’s position in the center of myosin and the crossbridge connections between actin and myosin, this would seem to be an oversimplification. It seems worth considering whether the separation of these pathways is justified if it has any effect on the conclusions or interpretation.

      We have reworked one of the discussion paragraphs to focus on how our simulations would be affected by two mechanisms (Nishikawa et al.’s winding filament theory and DuVall et al.’s titin entanglement hypothesis) that make it possible for crossbridges to do mechanical work on titin.

      • Revision: the paragraph beginning on page 21, column 2, line 42 “The active titin model ...”

      • Difference: the paragraph beginning on page 23, column 2, line 48

      References

      Nishikawa KC, Monroy JA, Uyeno TE, Yeo SH, Pai DK, Lindstedt SL. Is titin a ‘winding filament’? A new twist on muscle contraction. Proceedings of the royal society B: Biological sciences. 2012 Mar 7;279(1730):981-90.

      DuVall M, Jinha A, Schappacher-Tilp G, Leonard T, Herzog W. I-Band Titin Interaction with Myosin in the Muscle Sarcomere during Eccentric Contraction: The Titin Entanglement Hypothesis. Biophysical Journal. 2016 Feb 16;110(3):302a.

    1. eLife assessment

      This study is of potential interest to readers in human genetics and quantitative genetics, as it presents a new method for homozygosity mapping in population-scale datasets, based on an innovative computational algorithm that efficiently identifies runs-of-homozygosity (ROH) segments shared by many individuals. Although the method is innovative and has the potential to be broadly useful, its power and limitations have not yet been adequately evaluated. The application of this new method to the UK Biobank dataset identifies several interesting associations, but it remains currently unclear under what conditions the new approach can provide additional power over existing genome-wide association study methods.

    2. Reviewer #1 (Public Review):

      In this manuscript, Naseri et al. present a new strategy for identifying human genetic variants with recessive effects on disease risk by the genome-wide association of phenotype with long runs-of-homozygosity (ROH). The key step of this approach is the identification of long ROH segments shared by many individuals (termed "shared ROH diplotype clusters" by the authors), which is computationally intensive for large-scale genomic data. The authors circumvented this challenge by converting the original diploid genotype data to (pseudo-)haplotype data and modifying the existing positional Burrows-Wheeler transform (PBWT) algorithms to enable an efficient search for haplotype blocks shared by many individuals. With this method, the authors identified over 1.8 million ROH diplotype clusters (each shared by at least 100 individuals) and 61 significant associations with various non-cancer diseases in the UK Biobank dataset.

      Overall, the study is well-motivated, highly innovative, and potentially impactful. Previous biobank-based studies of recessive genetic effects primarily focused on genome-wide aggregated ROH content, but this metric is a poor proxy for homozygosity of the recessive alleles at causal loci. Therefore, searching for the association between phenotype and specific variants in the homozygous state is a key next step towards discovering and understanding disease genes/alleles with recessive effects. That said, I have some concerns regarding the power and error rate of the methods, for both identification of ROH diplotype clusters and subsequent association mapping. In addition, some of the newly identified associations need further validation and careful consideration of potential artifacts (such as cryptic relatedness and environment sharing).

      (1) Identification of ROH diplotype clusters.

      The practice of randomly assigning heterozygous sites to a homozygous state is expected to introduce errors, leading to both false positives and false negatives. An advantage that the authors claim for this practice is to reduce false negatives due to occasional mismatch (possibly due to genotyping error, or mutation), but it's unclear how much the false positive rate is reduced compared to a traditional ROH detection algorithm. The authors also justified the "random allele drawing" practice by arguing that "the rate of false positives should be low" for long ROH segments, which is likely true but is not backed up with quantitative analysis. As a result, it is unclear whether the trade-off between reducing FNs and introducing FPs makes the practice worthwhile (compared to calling ROHs in each individual with a standard approach first followed by scanning for shared diplotypes across individuals using BWT). I would like to see a combination of back-of-envelope calculation, simulation (with genotyping errors), and analysis of empirical data that characterize the performance of the proposed method.
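      The "random allele drawing" step discussed in this point can be illustrated with a minimal sketch. It assumes genotypes are coded as alternate-allele counts (0, 1, 2); the function name `to_pseudo_haplotype` is hypothetical, and the code is not taken from the authors' software.

```python
import random


def to_pseudo_haplotype(genotypes, rng=None):
    """Collapse diploid genotypes (alt-allele counts 0/1/2) to one allele per site.

    Homozygous sites keep their allele. Heterozygous sites receive a randomly
    drawn allele, which is the step that can create or break apparent matches.
    """
    rng = rng or random.Random()
    haplotype = []
    for g in genotypes:
        if g == 0:
            haplotype.append(0)          # homozygous reference
        elif g == 2:
            haplotype.append(1)          # homozygous alternate
        elif g == 1:
            haplotype.append(rng.randint(0, 1))  # heterozygous: random draw
        else:
            raise ValueError(f"unexpected genotype code: {g}")
    return haplotype
```

      Two individuals who are homozygous for the same alleles across a segment always yield identical pseudo-haplotypes, whereas each heterozygous site matches only with probability 1/2 per individual, which is why long spurious matches are expected to be rare.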

      In particular, I find the high number of ROH clusters in MHC alarming, and I am not convinced that this can be fully explained by a high density of SNPs and low recombination rate in this region. The authors may provide further support for their hypothesis by examining the genome-wide relationship between ROH cluster abundance and local recombination rate (or mutation rate).

      (2) Power of ROH association. Given that the authors focused on long segments only (which is a limitation of the current method), I am concerned about the power of the association mapping strategy, because only a small fraction of causal alleles are expected to be present in long, homozygous haplotypes shared by many individuals. It would be useful to perform a power analysis to estimate what fraction of true causal variants with a given effect size can be detected with the current method. To demonstrate the general utility of this method, the authors also need to characterize the condition(s) under which this method could pick up association signals missed by standard GWAS with recessive effects considered. I suspect some variants with truly additive effects can also be picked up by the ROH association, which should be discussed in the manuscript to guide the interpretation of results.

      (3) False positives of ROH association. GWAS is notoriously prone to confounding by population and environmental stratification. Including leading principal components in association testing alleviates this issue but is not sufficient to remove the effects of recent demographic structure and local environment (Zaidi and Mathieson 2020 eLife). Similar confounding likely applies to homozygosity mapping and should be carefully considered. For example, it is possible that individuals who share a lot of ROH diplotypes tend to be remotely related and live near each other, thus sharing similar environments. Such scenarios need to be excluded to further support the association signals.

      (4) Validation of significant associations. It is reassuring that some of the top associations are indirectly corroborated by significant GWAS associations between the same disease and individual SNPs present in the ROH region (Tables 1 and 2). However, more sanity checks should be done to confirm consistency in direction of effect size (e.g., risk alleles at individual SNPs should be commonly present in risk-increasing ROH segment, and vice versa) and the presence of dominance effect.

    3. Reviewer #2 (Public Review):

      The authors have proposed a computational algorithm to identify runs of homozygosity (ROH) segments in a generally outbred population and then study the association of ROH with self-reported disorders in the UK biobank. The algorithm certainly identifies such segments. However, more work is needed to justify the importance of ROH.

    4. Reviewer #3 (Public Review):

      A classic method to detect recessive disease variants is homozygosity mapping, where affected individuals in a pedigree are scanned for the presence of runs of homozygosity (ROH) intersecting in a given region. The method could in theory be extended to biobanks with large samples of unrelated individuals; however, no efficient method was available (to the best of my knowledge) for detecting overlapping clusters of ROH in such large samples. In this paper, the authors developed such a method based on the PBWT data structure. They applied the method to the UK biobank, finding a number of associations, some of them not discovered in single SNP associations.

      Major strengths:

      • The method is innovative and algorithmically elegant and interesting. It achieves its purpose of efficiently and accurately detecting ROH clusters overlapping in a given region. It is therefore a major methodological advance.

      • The method could be very useful for many other researchers interested in detecting recessive variants associated with any phenotype.

      • The statistical analysis of the UK biobank data is solid and the results that were highlighted are interesting and supported by the data.

      Major weaknesses:

      • The positions and IDs of the ROH clusters in the UK biobank are not available for other researchers. This means that other researchers will not be able to follow up on the results of the present paper.

      • The vast majority of the discoveries were in regions already known to be associated with their respective phenotypes based on standard GWAS.

      • The running time seems rather long (at least for the UK biobank), and therefore it will be difficult for other researchers to extensively experiment with the method in very large datasets. That being said, the method has a linear running time, so it is already faster than a naïve algorithm.

    5. Author response:

      Reviewer #1 (Public Review):

      In this manuscript, Naseri et al. present a new strategy for identifying human genetic variants with recessive effects on disease risk by the genome-wide association of phenotype with long runs-of-homozygosity (ROH). The key step of this approach is the identification of long ROH segments shared by many individuals (termed "shared ROH diplotype clusters" by the authors), which is computationally intensive for large-scale genomic data. The authors circumvented this challenge by converting the original diploid genotype data to (pseudo-)haplotype data and modifying the existing positional Burrows-Wheeler transform (PBWT) algorithms to enable an efficient search for haplotype blocks shared by many individuals. With this method, the authors identified over 1.8 million ROH diplotype clusters (each shared by at least 100 individuals) and 61 significant associations with various non-cancer diseases in the UK Biobank dataset.

      Overall, the study is well-motivated, highly innovative, and potentially impactful. Previous biobank-based studies of recessive genetic effects primarily focused on genome-wide aggregated ROH content, but this metric is a poor proxy for homozygosity of the recessive alleles at causal loci. Therefore, searching for the association between phenotype and specific variants in the homozygous state is a key next step towards discovering and understanding disease genes/alleles with recessive effects. That said, I have some concerns regarding the power and error rate of the methods, for both identification of ROH diplotype clusters and subsequent association mapping. In addition, some of the newly identified associations need further validation and careful consideration of potential artifacts (such as cryptic relatedness and environment sharing).

      1) Identification of ROH diplotype clusters.

      The practice of randomly assigning heterozygous sites to a homozygous state is expected to introduce errors, leading to both false positives and false negatives. An advantage that the authors claim for this practice is to reduce false negatives due to occasional mismatch (possibly due to genotyping error, or mutation), but it's unclear how much the false positive rate is reduced compared to a traditional ROH detection algorithm. The authors also justified the "random allele drawing" practice by arguing that "the rate of false positives should be low" for long ROH segments, which is likely true but is not backed up with quantitative analysis. As a result, it is unclear whether the trade-off between reducing FNs and introducing FPs makes the practice worthwhile (compared to calling ROHs in each individual with a standard approach first followed by scanning for shared diplotypes across individuals using BWT). I would like to see a combination of back-of-envelope calculation, simulation (with genotyping errors), and analysis of empirical data that characterize the performance of the proposed method.

      In particular, I find the high number of ROH clusters in MHC alarming, and I am not convinced that this can be fully explained by a high density of SNPs and low recombination rate in this region. The authors may provide further support for their hypothesis by examining the genome-wide relationship between ROH cluster abundance and local recombination rate (or mutation rate).

      Thanks for this insightful comment. Through additional experiments, we confirmed that the excessive number of ROH clusters in the MHC region is due to the higher density of markers per centimorgan. As discussed above at Essential Revision 2, we took this opportunity to modify our code to search for clusters with the minimum length in terms of cM instead of sites. We have also provided the genetic distance for reported clusters in the MHC region with significant association (genetic length (cM) column in Tables 1 and 2). We include the following in the main text:

      “We searched for ROH clusters using a minimum target length of 0.1 cM (Figure 3–figure supplement 1). As shown in the figure, there is no excessive number of ROH clusters in chromosome 6 as was spotted using a minimum number of variant sites.”

      Methods section, ROH algorithm subsection:

      “We implemented ROH-DICE to allow direct use of genetic distances in addition to variant sites for L. The program can take minimum target length L directly in cM and detect all ROH clusters greater than or equal to the target length in cM. The program holds a genetic mapping table for all the available sites, and cPBWT was modified to work directly with the genetic length instead of the number of sites.”
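      To illustrate why a minimum length in cM behaves differently from a minimum number of sites, the following is a minimal sketch that measures a segment's genetic length by linear interpolation on a genetic map. The function names, the toy map, and the interpolation scheme are assumptions for illustration; this is not the ROH-DICE or cPBWT implementation.

```python
from bisect import bisect_left


def genetic_position(bp, map_bp, map_cm):
    """Linearly interpolate the genetic position (cM) of a base-pair coordinate.

    map_bp must be sorted ascending; map_cm holds the matching cM values.
    Coordinates outside the map are clamped to the first or last map entry.
    """
    if bp <= map_bp[0]:
        return map_cm[0]
    if bp >= map_bp[-1]:
        return map_cm[-1]
    i = bisect_left(map_bp, bp)
    if map_bp[i] == bp:
        return map_cm[i]
    frac = (bp - map_bp[i - 1]) / (map_bp[i] - map_bp[i - 1])
    return map_cm[i - 1] + frac * (map_cm[i] - map_cm[i - 1])


def passes_min_length(start_bp, end_bp, map_bp, map_cm, min_cm=0.1):
    """Return True if the segment spans at least min_cm of genetic distance."""
    length_cm = (genetic_position(end_bp, map_bp, map_cm)
                 - genetic_position(start_bp, map_bp, map_cm))
    return length_cm >= min_cm
```

      With a toy map in which the first 1,000 bp span 0.05 cM and the next 1,000 bp span 0.2 cM, two segments of equal physical length are treated differently: only the second passes a 0.1 cM threshold. A region with many markers per cM can therefore satisfy a site-count threshold while remaining genetically short.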

      2) Power of ROH association. Given that the authors focused on long segments only (which is a limitation of the current method), I am concerned about the power of the association mapping strategy, because only a small fraction of causal alleles are expected to be present in long, homozygous haplotypes shared by many individuals. It would be useful to perform a power analysis to estimate what fraction of true causal variants with a given effect size can be detected with the current method. To demonstrate the general utility of this method, the authors also need to characterize the condition(s) under which this method could pick up association signals missed by standard GWAS with recessive effects considered. I suspect some variants with truly additive effects can also be picked up by the ROH association, which should be discussed in the manuscript to guide the interpretation of results.

      We added a new experiment in the Results section “Evaluation of ROH clusters in simulated data” under Power of ROH-DICE in association studies. We compared the power of the ROH cluster with additive, recessive, and dominant models. Our simulation shows that using ROH clusters outperforms standard GWAS when a phenotype is associated with a set of consecutive homozygous sites. We added the following text:

      “...We calculated the p-values for both ROH clusters and all variant sites. We used a p-value cut-off of 0.05 divided by the number of tests for each phenotype to determine whether the calculated p-value was smaller than the threshold, indicating an association. For GWAS, only one variant site within the ROH cluster, contributing to the phenotype, was required. We tested for all additive, dominant, and recessive effects (Figure 1–figure supplement 3). The figure demonstrates that ROH-DICE outperforms GWAS when a phenotype is associated with a set of consecutive homozygous sites. The maximum effect size of 0.3 resulted in ROH clusters achieving a power of 100%, whereas the additive model only achieved 11%, and the dominant and recessive models achieved 52% and 70%, respectively. The GWAS with recessive effect yields the best results among the GWAS tests; however, its power is still lower than that of ROH clusters.”
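      The cut-off described in the quoted text (0.05 divided by the number of tests) is a Bonferroni-style threshold. A minimal sketch follows; the function names are hypothetical and the p-values in the usage note are made up purely for illustration.

```python
def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold: alpha divided by the number of tests."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def significant(p_values, alpha=0.05):
    """Flag each p-value that falls below the Bonferroni threshold."""
    threshold = bonferroni_threshold(len(p_values), alpha)
    return [p < threshold for p in p_values]
```

      For four tests the threshold is 0.0125, so illustrative p-values of 1e-9 and 0.01 would be flagged while 0.04 and 0.2 would not.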

      3) False positives of ROH association. GWAS is notoriously prone to confounding by population and environmental stratification. Including leading principal components in association testing alleviates this issue but is not sufficient to remove the effects of recent demographic structure and local environment (Zaidi and Mathieson 2020 eLife). Similar confounding likely applies to homozygosity mapping and should be carefully considered. For example, it is possible that individuals who share a lot of ROH diplotypes tend to be remotely related and live near each other, thus sharing similar environments. Such scenarios need to be excluded to further support the association signals.

      We acknowledge that there could be confounding factors that may affect the association's results. To address this, we utilized principal component (PC) values and additional covariates while using PHESANT after our initial Chi-square tests. We also included your comments in our Discussion section:

      “We used age, gender, and genetic principal components as confounding variables in the association analysis. Genetic principal components can reduce the confounding effect brought on by population structure, but they may be insufficient to completely eliminate the effects of recent demographic structure and the local environment [45]. For example, individuals sharing excessive ROH diplotypes may share similar environments since they are closely related and reside close to one another. Since we did not rule out related individuals, some of the reported GWAS signals may not be attributable to ROH.”

      4) Validation of significant associations. It is reassuring that some of the top associations are indirectly corroborated by significant GWAS associations between the same disease and individual SNPs present in the ROH region (Tables 1 and 2). However, more sanity checks should be done to confirm consistency in direction of effect size (e.g., risk alleles at individual SNPs should be commonly present in risk-increasing ROH segment, and vice versa) and the presence of dominance effect.

      The beta values for effect size are now included in all reported tables. All beta values for ROH-DICE are positive, indicating that carrying these ROH diplotypes may increase the risk of certain non-cancerous diseases. Moreover, we conducted the suggested sanity check to confirm that the direction of effect is consistent between risk-increasing ROH diplotypes and risk alleles.

      We also computed D’ as a measure of linkage between the reported GWAS results and ROH clusters. We found that most of the GWAS results and ROH clusters are strongly correlated. However, in a few cases, D' is small or close to zero. In such cases, the reported p-value from GWAS was also insignificant, while the ROH cluster indicated a significant association. We included these points in the Results section.
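      For readers unfamiliar with D’, the standard normalized disequilibrium coefficient can be sketched as below. The response does not state how the two features were encoded, so treating "carries the ROH cluster" and "carries the GWAS allele" as two binary indicators tabulated in a 2x2 count table is an assumption, and `d_prime` is a hypothetical helper name.

```python
def d_prime(n11, n10, n01, n00):
    """Normalized disequilibrium D' from a 2x2 table of counts.

    n11: both features present; n10: only the first; n01: only the second;
    n00: neither. Returns D / D_max, where D = p11 - pA * pB.
    """
    n = n11 + n10 + n01 + n00
    if n == 0:
        raise ValueError("empty table")
    p_a = (n11 + n10) / n
    p_b = (n11 + n01) / n
    d = n11 / n - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        # One feature does not vary, so D' is undefined; 0.0 is returned by convention here.
        return 0.0
    return d / d_max
```

      A value near 1 means one feature almost never occurs without the other, even when their frequencies differ, whereas a value near 0 indicates that the two are close to independent.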

      Reviewer #3 (Public Review):

      A classic method to detect recessive disease variants is homozygosity mapping, where affected individuals in a pedigree are scanned for the presence of runs of homozygosity (ROH) intersecting in a given region. The method could in theory be extended to biobanks with large samples of unrelated individuals; however, no efficient method was available (to the best of my knowledge) for detecting overlapping clusters of ROH in such large samples. In this paper, the authors developed such a method based on the PBWT data structure. They applied the method to the UK biobank, finding a number of associations, some of them not discovered in single SNP associations.

      Major strengths:

      •           The method is innovative and algorithmically elegant and interesting. It achieves its purpose of efficiently and accurately detecting ROH clusters overlapping in a given region. It is therefore a major methodological advance.

      •           The method could be very useful for many other researchers interested in detecting recessive variants associated with any phenotype.

      •           The statistical analysis of the UK biobank data is solid and the results that were highlighted are interesting and supported by the data.

      Major weaknesses:

      •           The positions and IDs of the ROH clusters in the UK biobank are not available for other researchers. This means that other researchers will not be able to follow up on the results of the present paper.

      We included the SNP IDs, positions, and consensus alleles for all reported loci in the main tables. Moreover, additional information including beta and D’ values was added. The current information should allow researchers to follow up on the results. Supplementary File 2 contains beta and D’ values for all reported clusters.

      Supplementary File 3 contains the SNP IDs and consensus alleles for all reported clusters in Tables 1 and 2. The consensus allele denotes the allele with the highest occurrence in the reported clusters.

      •           The vast majority of the discoveries were in regions already known to be associated with their respective phenotypes based on standard GWAS.

      We agree that a majority of the ROH regions are indeed consistent with GWAS. However, some regions were missed by standard GWAS (e.g. chr6:25969631-26108168, hemochromatosis). Our message is that our method is a complementary approach to standard GWAS and will not replace standard GWAS analysis. See our response to Reviewer #2 Point Six.

      •           The running time seems rather long (at least for the UK biobank), and therefore it will be difficult for other researchers to extensively experiment with the method in very large datasets. That being said, the method has a linear running time, so it is already faster than a naïve algorithm.

      Thank you for your input. The algorithm used to locate matching blocks is efficient, and the run time we originally reported was the total CPU hours consumed. Since the program consumes very little memory and few other resources, it can be executed simultaneously for all chromosomes. We also noticed that a significant amount of time was being spent parsing the input file and slightly modified our script to improve the parsing. We then re-ran it for all chromosomes in parallel and now report the elapsed time, which was only 18 hours and 54 minutes.

      “This was achieved by running the ROH-DICE program, with a wall clock time of 18 hours and 54 minutes where the program was executed for all chromosomes in parallel (total CPU hours of ~ 242.5 hours). The maximum residence size for each chromosome was approximately 180 MB.”

    1. Author response:

      Reviewer #1 (Public Review):

      The authors investigated the role of OBOX4 in zygotic genome activation (ZGA) in mice. Obox4 genes form an array of duplicated genes and were identified as a candidate ZGA factor based on expression patterns during early development. The role of OBOX4 was subsequently studied in embryonic stem cells and early embryos. It was found that transcriptional activation mediated by OBOX4 has similar features to that of DUX, which was previously identified as a zygotic transcription factor involved in ZGA and a major activator of the zygotic expression program. It was, however, unexpected that Dux knock-out did not impair embryonic development. The work by Guo et al. provides several lines of evidence that OBOX4-mediated activation of gene expression considerably overlaps with that of DUX and that this redundancy might explain the loss of early developmental phenotype in Dux mutants. Consistent with this model, double mutants of Obox4 and Dux show impaired development. Given the difficulties with investigating details of the genetic model in double mutants at the preimplantation embryo stage, the authors not only crossed genetic mutants, but also used (1) nuclear transfer of mutated nuclei of ESCs, which could be characterized on their own in separate experiments, and (2) antisense oligonucleotide (ASO) microinjection, which included a rescue control demonstrating that reintroducing OBOX4 is sufficient to rescue the phenotype caused by blocking both Dux and Obox4.

      This work is important for the field because it reveals functional redundancy and plasticity of the zygotic genome activation in mammals, where the mouse model stands as a remarkable example of genome activation, which massively integrated long terminal repeat (LTR)-derived enhancers from retrotransposons and now two of the key activating zygotic factors appear to be encoded by tandemly duplicated clusters of different phylogenetic age. Identification of OBOX4 as a second factor partially redundant with DUX now allows us to decipher what constitutes the essential part of the ZGA program.

      We are grateful for the reviewer’s appreciation of our work, particularly the technical difficulty of knocking out two multicopy genes and the value of the rescue experiment.

      Reviewer #2 (Public Review):

      In this study, Guo et al. screened a few homeobox transcription factors and identified that Obox4 can induce the 2-cell-like state in mouse embryonic stem cells (mESCs) (Fig. 1 and 2). The authors also compared in detail how Obox4 and Dux differ in activating 2C repeats and genes in mESCs (Fig. 3). Compared to Dux, Obox4 activates fewer 2C genes (Fig. 2). In addition, although both Obox4 and Dux bind to MERVL elements, Obox4 additionally binds to ERVK (Fig. 3). The authors then used three different approaches (i.e., SCNT-mediated KO, ASO-mediated KD, and genetic KO) to study how Obox4 and Dux regulate zygotic genome activation in embryos. Although there are some inconsistencies among different approaches, the authors were able to show that loss of both Obox4 and Dux causes more severe consequences than loss of either single protein in embryonic development and zygotic genome activation (Fig. 4 and 5).

      Overall, this is a comprehensive study that addresses an important question that puzzles the community. However, some comparisons to the recent work by Ji et al. (PMID: 37459895) are highly recommended. Ji et al. knocked out the entire Obox cluster (including Obox4) in mice and found that Obox cluster KO causes 2-4 cell arrest without affecting Dux. That said, Obox proteins seem more critical than Dux in regulating ZGA, and Obox cluster KO cannot be compensated by Dux. Ji et al. also reported that maternal (Obox1, 2, 5, 7) and zygotic (Obox3, 4) Obox proteins redundantly regulate embryogenesis because loss of either is compatible with development. Consistent with Ji's work, Obox4 KO embryos generated in this study can develop to adulthood and are fertile. Since these two studies are highly relevant, some comparisons of Obox4 KO and Obox4/Dux DKO with the previous Obox cluster KO will greatly benefit the community.

      We thank the reviewer for appreciating the value of our study. We are aware of the work done to a high standard by Ji et al. and have included a comparison between our data and the data by Ji et al. in the revised manuscript. Despite repeated attempts, various crossing strategies failed to produce Obox4KO/DuxKO mating pairs that could be used to produce the large number of Obox4KO/DuxKO embryos required for in-depth transcriptome analysis. Based on the quality of the RNA-seq, we decided to perform comparative analysis using our ASO KD data and showed that Obox4 has distinct regulatory targets from those of other Obox family members, which is consistent with the phylogenetic distance within the family.

    1. Author response:

      A general comment was that this study left several key questions unanswered, in particular the causal mechanism for the reported ribosomal distributions. We have been interested in the evolution of asymmetric bacterial growth and aging for many years. However, a motivational difference is that we are more interested in the evolutionary process, and evolution by natural selection works on the phenotype. Thus, we wanted to start with the phenotype closest to fitness, appropriately defined for the conditions, and work downwards. We examined first the asymmetry of elongation rates in single cells, then gene products, and now ribosomes. As we have pointed out, our demonstration of ribosomal asymmetry shows that the phenomenon was not peculiar and unique to the gene products we examined. Rather, the asymmetry is acting higher up in the metabolic network and likely affecting all genes. We find such conceptual guidance to be important. In the ideal world, of course we would have liked to have worked out the causal mechanisms in one swoop. In a less than ideal situation, it is a subjective decision as to where to stop. We believe that the publication of this manuscript is more than appropriate at this juncture. We work at the interface of evolutionary theory and microbiology. Our results could appeal to both fields. If we attract new researchers, progress could be accelerated. Could the delay caused by publishing only completed stories slow the rate of discovery? These questions are likely as old as science (e.g., https://telliamedrevisited.wordpress.com/2021/01/28/how-not-to-write-a-response-to-reviewers/).

      We present below our response to specific comments by reviewers. We have not added a new discussion of papers suggested by Reviewer #1 because we feel that the speculations would have been too unfocused. We were already criticized for speculation in the Discussion about a link between aggregate size and ribosomal density.

      Response to major comments by Reviewer #1.

      (a) Fig. 1 only shows 2 divisions (rather than 3 as per Rev1) to avoid an overly elaborate figure. We have added text to the figure legend that the old and new poles and daughters in the subsequent 3, 4, 5, 6, and 7 generations can be determined by following the same notations and tracking we presented for generations 1 and 2 in Fig. 1. For example, if we know the old and new poles of any of the four daughters after 2 divisions (as in Fig. 1), and allow that daughter to elongate, become a mother, and divide to produce 2 “grand-daughters”, the polarity of the grand-daughters can also be determined.

      (b) Because division times were normalized and analyzed as quartiles, the raw values were never used. Rather than annotating unused values, we have provided the mean division times in the Material and Methods section on normalization to provide representative values.

      (c) We did not quantify in our study the changes over generations for three reasons. First, the sample sizes for the first generations (cohorts of 1, 2, 4, and 8 cells) are statistically small. Second, and most importantly, cells on an agar pad in a microscope slide, despite being inoculated as fresh exponentially growing cells, experience a growth lag, as do all cells transferred to a new physiological condition. Thus, to be safe, we do not collect data from cohorts 1, 2, 4, and 8 to ensure that our cells are as physiologically uniform as possible. Lastly, as we noted in the Material and Methods, cells also slow down after 7 generations (128 cells). Thus, we have collected ribosome and length measurements primarily from cohorts 16, 32, 64, and 128. Measurable cells from the 128 cohort are actually rare because a colony with that many cells often starts to form double layers, which are not measurable. Most of our measurements came from the 16, 32, and 64 cohorts, in which case a time series would not be meaningful. Some of these details were not included in our manuscript but have been added to the Material and Methods (Microscopy and time-lapse movies). For these reasons we have not added a time series as requested by the reviewer.

      (d) We have added the additional figure as requested, but as a supplement rather than in the main article (Supplemental Materials Fig. S1). This figure shows the normalized density of ribosomes along the normalized length of old and new daughters. The density is plotted as a continuous variable rather than as quartiles. This figure was included in the original manuscript, but readers recommended that it be removed because all the analyses had been done with quartiles. Readers felt misled and confused.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      We greatly appreciate the comments from the editor and the reviewers, based on which we have made the revisions. We have responded to all the questions and summarized the revisions below. The changes are also highlighted in the manuscript.

      Additionally, we’ve noticed a few typos in the manuscript presented on the eLife website, which were not there in our originally submitted file.

      (1) In both the “Full text” presented on the eLife website and the pdf file generated after clicking “Download”: the last FC1000 in the second paragraph of the “Extensive induction curves fitting of TetR mutants” section should be FC1000WT.

      (2) In the pdf file generated after clicking “Download”: the brackets are all incorrectly formatted in the captions of Figure 4 and Figure 3—figure supplement 6.

      eLife assessment

      The fundamental study presents a two-domain thermodynamic model for TetR which accurately predicts in vivo phenotype changes brought about as a result of various mutations. The evidence provided is solid and features the first innovative observations with a computational model that captures the structural behavior, much more than the current single-domain models.

      We appreciate the supportive comments by the editor and reviewers.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors’ earlier deep mutational scanning work observed that allosteric mutations in TetR (the tetracycline repressor) and its homologous transcriptional factors are distributed across the structure instead of along the presumed allosteric pathways as commonly expected. In addition, the loss of allosteric communication caused by those mutations was rescued by additional distributed mutations. Now the authors develop a two-domain thermodynamic model for TetR that explains these compelling data. The model is consistent with the in vivo phenotypes of the mutants with changes in parameters, which permits quantification. Taken together, their work connects intra- and inter-domain allosteric regulation that correlates with structural features. This leads the authors to suggest broader applicability to other multidomain allosteric proteins. Here the authors follow their first innovative observations with a computational model that captures the structural behavior, aiming to make it broadly applicable to multidomain proteins. Altogether, an innovative and potentially useful contribution.

      We thank the reviewer for the supportive comments.

      Weaknesses:

      None that I see, except that I hope that in the future, if possible, the authors would follow with additional proteins to further substantiate the model and show its broad applicability. I realize however the extensive work that this would entail.

      We thank the reviewer for the supportive comments and the suggestion to extend the model to other proteins, which we indeed plan to pursue in future studies.

      Reviewer #2 (Public Review):

      Summary:

      This combined experimental-theoretical paper introduces a novel two-domain statistical thermodynamic model (primarily Equation 1) to study allostery in generic systems but focusing here on the tetracycline repressor (TetR) family of transcription factors. This model, building on a function-centric approach, accurately captures induction data, maps mutants with precision, and reveals insights into epistasis between mutations.

      Strengths:

      The study contributes innovative modeling, successful data fitting, and valuable insights into the interconnectivity of allosteric networks, establishing a flexible and detailed framework for investigating TetR allostery. The manuscript is generally well-structured and communicates key findings effectively.

      We thank the reviewer for the supportive comments.

      Weaknesses:

      The only minor weakness I found was that I still don’t have a good sense of (a) the intuition behind and (b) the mathematical derivation of Equation 1, which is so central to the work. I would recommend that the authors provide this early on in the main text.

      We thank the reviewer for the suggestion. The full mathematical derivation of Equation 1 is given in the first section of the supplementary file. Given the length of the derivation, we think it’s better to keep it in the supplementary file rather than the main text. In the main text, the first subsection (overview of the two-domain thermodynamic model of allostery) of the Results section and the paragraph right before Equation 1 are meant for providing intuitive understandings of the two-domain model and the derivation of Equation 1, respectively.

      We would also like to point the reviewer to Figure 2-figure supplement 2 and Equations (12) to (18) in the supplementary file for an alternative derivation. They show that the equilibria among all molecular species containing the operator are dictated by the binding free energies, the ligand concentration, and the allosteric parameters. The probability of an unbound operator (proportional to the probability that the promoter is bound by an RNA polymerase, or the gene expression level) can thus be calculated using Equation (12), which then leads to main text Equation 1 following the derivation given there.

      Additionally, we’ve added a paragraph to the main text (lines 248-260) to aid an intuitive understanding of Equation 1.

      “The distinctive roles of the three biophysical parameters on the induction curve as stipulated in Equation 1 could be understood in an intuitive manner as well. First, the value of εD controls the intrinsic strength of binding of TetR to the operator, or the intrinsic difficulty for the ligand to induce their separation. Therefore, it controls how tightly the downstream gene is regulated by TetR without ligands (reflected in leakiness) and affects the performance limit of ligands (reflected in saturation). Second, the value of εL controls how favorable ligand binding is in free energy. When εL increases, the binding of ligand at low concentrations becomes unfavorable, so the ligands cannot effectively bind to TetR to induce its separation from the operator. Therefore, the fold-change as a function of ligand concentration only starts to noticeably increase at higher ligand concentrations, resulting in a larger EC50. Third, as discussed above, γ controls the level of anti-cooperativity between the ligand and operator binding of TetR, which is the basis of its allosteric regulation. In other words, γ controls how strongly ligand binding is incompatible with operator binding for TetR, hence it controls the performance limit of the ligand (reflected in saturation).”

      We hope that the reviewer will find this explanation helpful.
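The qualitative roles described above can be reproduced with a deliberately simplified toy model. The sketch below is not the paper's Equation 1 (whose derivation is in the supplementary file); it is a hypothetical four-state partition function in which the function name `fold_change`, the dimensionless concentration `c`, and the sign conventions for `eps_d`, `eps_l` and `gamma` (all in units of kBT) are assumptions made purely for illustration.

```python
import math


def fold_change(c, eps_d, eps_l, gamma):
    """Toy probability that the operator is free of repressor.

    Hypothetical four-state model, NOT the paper's Equation 1.
    Energies are in kBT; c is a dimensionless ligand concentration.
    """
    w_free = 1.0                                      # nothing bound (reference state)
    w_dna = math.exp(-eps_d)                          # operator bound only
    w_lig = c * math.exp(-eps_l)                      # ligand bound only
    w_both = c * math.exp(-(eps_l + eps_d + gamma))   # both bound, penalised by gamma
    z = w_free + w_dna + w_lig + w_both               # partition function
    return (w_free + w_lig) / z                       # states with a free operator


# Scan ligand concentration for one illustrative parameter set.
for c in (0.0, 1.0, 1e2, 1e4, 1e8):
    print(c, round(fold_change(c, eps_d=-4.0, eps_l=0.0, gamma=8.0), 4))
```

In this toy model the value at zero ligand (leakiness) depends only on `eps_d`, the plateau at very high ligand depends on `eps_d + gamma`, and `eps_l` only shifts the concentration at which the transition occurs, which mirrors the three roles described in the quoted paragraph.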

      Reviewer #3 (Public Review):

      Summary:

      Allosteric regulation is complicated in multi-domain proteins, and many large-scale mutational data cannot be explained by current theoretical models, especially for mutations that are neither in the functional/allosteric sites nor on the allosteric pathways. This work provides a statistical thermodynamic model for a two-domain protein, in which one domain contains an effector binding site and the other domain contains a functional site. The authors build the model to explain the mutational experimental data of TetR, a transcriptional repressor protein that contains a ligand-binding and a DNA-binding domain. They incorporate three basic parameters: the energy changes of the ligand- and DNA-binding domains before and after binding, and the coupling between the two domains, to explain the free energy landscape of TetR’s conformational and binding states. They go further to quantitatively explain the in vivo expression level of the TetR-regulated gene by fitting the induction curves of TetR mutants. The effects of most of the mutants studied could be well explained by the model. This approach can be extended to understand the allosteric regulation of other two-domain proteins, especially to explain the effects of widespread mutants not on the allosteric pathways.

      Strengths:

      The effects of mutations that are neither in the functional or allosteric sites nor in the allosteric pathways are difficult to explain and quantify. This work develops a statistical thermodynamic model to explain these complicated effects. For simple two-domain proteins, the model is quite clean and theoretically solid. For the real TetR protein that forms a dimeric structure containing two chains, each composed of two domains, the model can explain many of the experimental observations. The model separates intra- and inter-domain influences, which provides a novel angle to analyse allosteric effects in multi-domain proteins.

      We thank the reviewer for the supportive comments.

      Weaknesses:

      As mentioned above, the TetR protein is not a simple two-domain protein, but forms a dimeric structure in which the DNA-binding domain in each chain forms contacts with the ligand-binding domain in the other chain. In addition, the two ligand-binding domains have strong interactions. Without considering these interactions, especially for those mutants that are on these interfaces, the model may be oversimplified for TetR.

      We thank the reviewer for this valid concern and acknowledge that TetR is a homodimer. However, we’ve deliberately chosen to simplify this complexity in our model for the following reasons.

      (1) In this work, we aim to build a minimalist model for two-domain allostery with only the most essential parameters for capturing experimental data. The simplicity of the model helps promote its mechanistic clarity and potential transferability to other allosteric systems.

      (2) Fewer parameters are needed in a simpler model. Our two-domain model currently uses only three biophysical parameters, which are all demonstrated to have distinct influences on the induction curve (see the main text section “System-level ramifications of the two-domain model”). This enables the inference of parameters with high precision for the mutants, and the quantification of the most essential mechanistic effects of their mutations, provided that the model is shown to accurately recapitulate the comprehensive dataset. Thus, we found it was unnecessary to add another parameter for explicitly describing inter-chain coupling, which would likely incur uncertainty in the inference of parameters due to the redundancy of their effects on induction data, and prevent the model from making faithful predictions.

      (3) From a more biological point of view, TetR is an obligate dimer, meaning that the two chains must synchronize for function, supporting the two-domain simplification of TetR for binding concerns.

      Additionally, as shown in the subsection “Inclusion of single-ligand-bound state of repressor” of section 1 of the supplementary file, incorporating the dimeric nature of TetR in our model by allowing partial ligand binding does not change the functional form of main text equation 1 in any practical sense. Therefore, considering all the factors stated above, we think that increasing the complexity of the two-domain model will only be necessary if additional data emerge to suggest the limitation of our model.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      This is an excellent work. I have only one suggestion for the authors. Interestingly, the authors also note that the epistatic interactions that they obtain are consistent with the structural features of the protein, which is not surprising. Within this framework, have the authors considered rescue mutations? Please see for example PMID: 18195360 and PMID: 15683227. If I understand right, this might further extend the applicability of their model. If so, the authors may want to add a comment to that effect.

      We thank the reviewer for the supportive comments and for pointing us to the useful references. We have added some comments to the main text regarding this point in lines 332-336: “The diverse mechanistic origins of the rescuing mutations revealed here provide a rational basis for the broad distributions of such mutations. Integrating such thermodynamic analysis with structural and dynamic assessment of allosteric proteins for efficient and quantitative rescuing mutation design could present an interesting avenue for future research, particularly in the context of biomedical applications (PMID: 18195360, PMID: 15683227).”

      Reviewer #3 (Recommendations For The Authors):

      The authors should try to build a more realistic dimeric model for TetR to see if it could better explain experimental data. If it were too complicated for a revision, more discussions on the weakness of the current model should be given.

      We thank the reviewer for this valid concern and for the suggestion. The reasons for refraining from increasing the complexity of the model are fully discussed in our response to the reviewer’s public review given above. Primarily, we think that the value of a simple physical model (e.g., the paradigmatic Ising model in statistical physics and the classic MWC model) is two-fold. First, its mechanistic clarity and potential transferability make it a useful conceptual framework for understanding complex systems and establishing universal rules by comparing seemingly unrelated phenomena. Second, it provides useful insights and design principles for specific systems if it can quantitatively capture the corresponding experimental data. Thus, given the current experimental data set, we believe it is justified to keep the two-domain model in its current form, while additional experimental data could necessitate a more complex model for TetR allostery in the future. Relevant discussions are added to the main text (lines 443-446) and section 8 of the supplementary file.

      “It’s noted that the homodimeric nature of TetR is ignored in the current two-domain model to minimize the number of parameters, and additional experimental data could necessitate a more complex model for TetR allostery in the future (see supplementary file section 8 for more discussions).”

      Minor issues:

      (1) There is an error in Figure 3A: the 13th and 14th subgraphs are the same and should be corrected.

      We thank the reviewer for capturing this error, which has been corrected in the revised manuscript.

      (2) The criteria for the selection of mutants for analysis should be clearly given. Apart from deleting mutants that are in direct contact with the ligand or DNA, how many mutants are left, and how far are they from the two sites? In line 257, what are the criteria for selecting these 15 mutants? Similarly, in line 332, what are the criteria for selecting these 8 mutants?

      We thank the reviewer for this comment. The data selection criteria are now added in section 7 of the supplementary file. The distances to the DNA operator and ligand of the 21 residues under mutational study are now added in Table 1 (Figure 3-figure supplement 9). The added materials are referenced in the main text where relevant.

      “7. Mutation selection for two-domain model analysis

      In this work, there are 24 mutants studied in total including the WT, and they contain mutations at 21 WT residues. We did not perform model parameter inference for the mutant G102D because of its flat induction curve (see the second subsection of section 2 and main text Figure 2—figure Supplement 3). Therefore, there are 23 mutants analyzed in main text Figure 5.

      Measuring the induction curve of a mutant involves a significant amount of experimental effort and is therefore hard to extend to a large number of mutants. Nonetheless, we aim to compose a set of comprehensive induction data here for validating our two-domain model for TetR allostery. To this end, we picked 15 individual mutants in the first round of induction curve measurements, which contain mutations spanning different regions in the sequence and structure of TetR (main text Figure 3—figure Supplement 1). Such a broad distribution of mutations across the LBD, the DBD and the domain interface could potentially lead to diverse induction curve shapes and mutant phenotypes for validating the two-domain model. Indeed, as discussed in the main text section "Extensive induction curves fitting of TetR mutants", the diverse effects on the induction curve from mutations perturbing different allosteric parameters, as predicted by the model, are successfully observed in these 15 experimental induction curves. Additionally, 5 of the 15 mutants contain a dead-rescue mutation pair, which helps us validate the model prediction that a dead mutation could be rescued by rescuing mutations that perturb the allosteric parameters in various ways.

      Eight mutation combinations were chosen for the second round of induction curve measurement for studying epistasis, where we paired up C203V and Y132A with mutations from different regions of the TetR structure. This choice is largely based on two considerations. 1. As both C203V and Y132A greatly enhance the allosteric response of TetR, we want to probe why they cannot rescue a range of dead mutations as observed previously (PMID: 32999067). 2. C203V and Y132A are the only two mutants that show enhanced allosteric response in the first round of analysis. Combining mutations that are detrimental to allostery in a single mutant could potentially lead to a nearly flat induction curve, which is less useful for inference (see the second subsection of section 2).”

      Since the number of hotspots identified by DMS is not very large, why not analyze them all?

      We thank the reviewer for this comment. There are 41 hotspot residues in TetR (PMID: 36226916), which have 41 × 19 = 779 possible single mutations. It is infeasible to perform induction curve measurements for all 779 of these mutants in our current experiment. However, we agree that it would be helpful if we could obtain such a dataset in an efficient way.

      In line 257, there are 15 mutants mentioned, while in Figure 5, there are 23 mutants mentioned, in Figure 3-figure supplement 1, there are 21 mutants mentioned, and in line 226 of the supplementary file, there are 24 mutants mentioned, which is very confusing. Therefore, the data selection criteria used in this article should be given.

      We thank the reviewer for this comment. The data selection criteria are now given in section 7 of the supplementary file, which should clarify this confusion.

      (3) In Figure 4 of the Exploring epistasis between mutations section, the 6 weights of the additive models corresponding to each mutation combination are different. On one hand, it seems that there are no universal laws in these experimental data. On the other hand, unique parameters of a single mutation combination were not validated in other mutation combinations, which somewhat weakened the conclusions about the potential physical significance of these additive weights.

      We thank the reviewer for this comment. We admit that a quantitative universal law for tuning the 6 weights of the additive model does not manifest in our data, which indicates the mutation-specific nature of epistatic interactions in TetR as hinted in the different rescuing mutation distributions of different dead mutations (PMCID: PMC7568325). However, clear common trends in the weight tuning of combined mutants that contain common mutations do emerge, which comply with the structural features of the protein and provide explanations as to why C203V and Y132A don’t rescue a range of dead mutations (main text section “Exploring epistasis between mutations”). Additionally, the lack of a quantitative universal rule for tuning the 6 weights in our simple model doesn’t exclude the possibility of the existence of universal law for epistasis in TetR in another functional form, a point that could be explored in the future with more extensive joint experimental and computational investigations.

      In Eq. (27) of the supplementary file, the prior distribution of inter-domain coupling γ is given as a Gaussian distribution centered at 5 kBT. Since the absolute value of γ is important, can the authors explain why the prior distribution of γ is set to this value and what happens if other values are used?

      We thank the reviewer for the question. As explained in the corresponding discussions of Eq. (27) in the supplementary file, the prior of γ is chosen to serve as a soft constraint on its possible values based on two considerations: (1) inter-domain energetics for a TetR-like protein should be on the order of a few kBT; and (2) the prior distribution should reflect the experimental observation in the literature that γ has a small probability of adopting negative values upon mutations. Given our thorough validation of the statistical model and computational algorithm (see section 3 of the supplementary file), and the high precision in the parameter fitting results using experimental data (Figure 3 and Figure 4-figure supplement 2), we conclude that (1) the physical range of parameters encoded in their chosen prior distributions agrees well with the values reflected in the experimental data; and (2) the inference results are predominantly informed by the data. Thus, changing the mean of the prior distribution of γ should not affect the inference results significantly, provided that it remains in the physical range.

      This point is explicitly shown in the added Table 2 (Figure 3-figure supplement 10), where we compare the current Bayesian inference results with those obtained after increasing the standard deviation of the Gaussian prior of γ from 2.5 to 5 kBT. As shown in the table, most inference results stay virtually unchanged with this less informative prior, which confirms that they are predominantly informed by the data. The only exceptions are the slight increases of the inferred γ values for C203V, C203V-Y132A and C203V-G102D-L146A, reflecting the intrinsic difficulty of precise inference of large γ values with our model, as is already discussed in the second subsection of section 3 of the supplementary file. However, such observations comply with the common trend of epistatic interactions involving C203V presented in the main text and don’t compromise the ability of our model to accurately capture the induction curves of mutants. Relevant discussions are now added to the second subsection of section 3 of the supplementary file (lines 368-385).

      “In our experimental dataset, such inference difficulty is only observed in the case of C203V, Y132A-C203V and C203V-G102D-L146A due to their large γ and γ + εL values (see main text Figure 3, Figure 3—figure Supplement 10 and Figure 4). As shown in main text Figure 3—figure Supplement 10, the inference results for the other 20 mutants stay highly precise and virtually unchanged after increasing the standard deviation of the Gaussian prior of γ (gstdγ ) from 2.5 to 5 kBT. This demonstrates that the inference results for these mutants are strongly informed by the induction data and there is no difficulty in the precise inference of the parameter values. On the other hand, the inferred γ values (especially the upper bound of the 95% credible region) for C203V, Y132A-C203V and C203V-G102D-L146A increased with gstdγ . This is because the induction curves in these cases are not sensitive to the value of γ given that it’s large enough as discussed above. Hence, when unphysically large γ values are permitted by the prior distribution, they could enter the posterior distribution as well. Such difficulty in the precise inference of γ values for these three mutants, however, doesn’t compromise the ability of our model to accurately capture the comprehensive set of induction data (see part iv below). Additionally, the increase of the inferred γ value of C203V with larger gstdγ complies with the results presented in main text Figure 4, which show that the effect of C203V on γ tends to be compromised when combined with mutations closer to the domain interface.”
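The prior-sensitivity argument can be illustrated with a textbook conjugate normal-normal update. This is only a sketch of the general statistical point, not the inference procedure used in the paper: the function name `posterior_normal`, the "data" estimate of 7 kBT and its standard errors are hypothetical values chosen for illustration, while the prior mean of 5 kBT and the prior standard deviations of 2.5 and 5 kBT are taken from the discussion above.

```python
def posterior_normal(prior_mean, prior_sd, data_mean, data_se):
    """Posterior mean and sd for a Gaussian prior and Gaussian likelihood.

    Standard precision-weighted update; all quantities are in kBT.
    """
    prior_prec = 1.0 / prior_sd ** 2
    data_prec = 1.0 / data_se ** 2
    post_var = 1.0 / (prior_prec + data_prec)
    post_mean = post_var * (prior_prec * prior_mean + data_prec * data_mean)
    return post_mean, post_var ** 0.5


# Hypothetical case 1: data constrain the parameter tightly (small standard error).
tight_narrow = posterior_normal(5.0, 2.5, data_mean=7.0, data_se=0.2)
tight_wide = posterior_normal(5.0, 5.0, data_mean=7.0, data_se=0.2)

# Hypothetical case 2: data barely constrain the parameter (large standard error).
loose_narrow = posterior_normal(5.0, 2.5, data_mean=7.0, data_se=5.0)
loose_wide = posterior_normal(5.0, 5.0, data_mean=7.0, data_se=5.0)

print("well constrained:", tight_narrow, tight_wide)
print("poorly constrained:", loose_narrow, loose_wide)
```

When the data are informative, doubling the prior standard deviation barely moves the posterior; when they are not, the posterior follows the prior. This is the same qualitative pattern reported for the 20 well-constrained mutants versus the three mutants with large γ.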

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      The study provides potentially fundamental insight into the function and evolution of daily rhythms. The authors investigate the function of the putative core circadian clock gene Clock in the cnidarian Nematostella vectensis. While in parts still incomplete, the evidence suggests that, in contrast to mice and fruit flies, Clock in this species is important for daily rhythms under constant conditions, but not under a rhythmic light/dark cycle, suggesting that the major role of the circadian oscillator in this species could be a stabilizing function under non-rhythmic environmental conditions.

      Public Reviews:

      Reviewer #1 (Public Review):

      In this nice study, the authors set out to investigate the role of the canonical circadian gene Clock in the rhythmic biology of the basal metazoan Nematostella vectensis, a sea anemone, which might illuminate the evolution of the Clock gene functionality. To achieve their aims the team generated a Clock knockout mutant line (Clock-/- ) by CRISPR/Cas9 gene deletion and subsequent crossing. They then compared wild-type (WT) with Clock-/- animals for locomotor activity and transcriptomic changes over time in constant darkness (DD) and under light/dark cycles to establish these phenotypes under circadian control and those driven by light cycles. In addition, they used Hybridization Chain Reaction-In situ Hybridization (HCR-ISH) to demonstrate the spatial expression of Clock and a putative circadian clock-controlled gene Myh7 in whole-mounted juvenile anemones.

      The authors demonstrate that under LD both WT and Clock-/- animals were behaviourally rhythmic but under DD the mutants lost this rhythmicity, indicating that Clock is necessary for endogenous rhythms in activity. With altered LD regimes (LD6:6) they show also that Clock is light-dependent. RNAseq comparisons of rhythmic gene expression in WT and Clock-/- animals suggest that Clock KO has a profound effect on the rhythmic genome, with very little overlap in rhythmic transcripts between the two phenotypes; of the genes rhythmic in both LD and DD in WT animals (220, termed clock-controlled genes, CCGs), 85% were not rhythmic in Clock-/- animals in either light condition. In silico gene ontology (GO) analysis of CCGs reflected processes associated with circadian control. Correspondingly, those genes rhythmic in KO animals under DD (here termed neoCCGs) were not rhythmic in WT, lacked upstream E-box motifs associated with circadian regulation, and did not display any GO enrichment terms. 'Core' circadian genes (as identified in previous literature) in WT and Clock-/- animals were only rhythmic under entrainment (LD) conditions whilst Clock-/- displayed altered expression profiles under LD compared to WT. Comparing CCGs with previous studies of cycling genes in Nematostella, the authors selected a gene from 16 rhythmic transcripts. One of these, Myh7, was detectable by both RNAseq and HCR-ISH and considered a marker of the circadian clock by the authors.

      The authors claim that the study reveals insights into the evolutionary origin of circadian timing; Clock is conserved across distant groups of organisms, having a function as a positive regulator of the transcriptional translational feedback loop at the heart of daily timing, but is not a central element of the core feedback loop circadian system in this basal species. Their behavioural and transcriptomic data largely support the claims that Clock is necessary for endogenous daily activity but that the putative molecular circadian system is not self-sustained under constant darkness (this was known already for WT animals)- rather it is responsive to light cycles with altered dynamics in Clock-/- specimens in some core genes under LD. In the main, I think the authors achieved their aims and the manuscript is a solid piece of important work. The Clock-/- animal is a useful resource for examining time-keeping in a basal metazoan.

      The work described builds on other transcriptomic-based works on cnidaria, including Nematostella, and does probe into the molecular underpinnings with a loss-of-function in a gene known to be core in other circadian systems. The field of chronobiology will benefit from the evolutionary aspect of this work and the fact that it highlights the necessity to study a range of non-model species to get a fuller picture of timing systems to better appreciate the development and diversity of clocks.

      Strengths:

      The generation of a line of Clock mutant Nematostella is a very useful tool for the chronobiological community and coupled with a growing suite of tools in this species will be an asset. The experiments seem mostly well conceived and executed (NB see 'weaknesses'). The problem tackled is an interesting one and should be an important contribution to the field.

      Weaknesses:

      I think the claims about shedding light on the evolutionary origin of circadian time maintenance are a little bold. I agree that the data do point to an alternative role for Clock in this animal in light responsiveness, but this doesn't illuminate the evolution of time-keeping more broadly in my view. In addition, these are transcriptomic data and so should be caveated: they only demonstrate the expression of genes and not physiology beyond that. The time-course analysis is weakened by its low resolution, particularly for the RAIN algorithm when 4-hour intervals constrain the analysis. I accept that only 24h rhythms were selected in the analysis from this, but it might be that detail was lost. I think a preferred option would be 2 or 3-hour resolution or 2 full 24h cycles of analysis.

      The authors discount the possibility of the observed 12h rhythmicity in Clock-/- animals by exposing them to LD6:6 cycles before free-running them in DD. I suggest that LD cycles are not a particularly robust way to entrain tidal animals as far as we know. Recent papers show inundation/mechanical agitation are more reliable cues (Kwiatkowski ER, et al. Curr Biol. 2023, 2;33(10):1867-1882.e5. doi: 10.1016/j.cub.2023.03.015; Zhang L., et al Curr Biol. 2013, 23;19, 1863-1873 doi.org/10.1016/j.cub.2013.08.038.) and might be more effective in revealing endogenous 12h rhythms in the absence of 24h cues.

      Response: We removed the suggestion that we used 6:6h LD to perform tidal entrainment. We generated this ultradian light condition to address the 24h rhythmicity observed in the NvClk1-/- in 12:12h LD.

      Reviewer #2 (Public Review):

      This manuscript addresses an important question: what is the role of the gene Clock in the control of circadian rhythms in a very primitive group of animals: Cnidaria. Clock has been found to be essential for circadian rhythms in several animals, but its function outside of Bilaterian animals is unknown. The authors successfully generated a severe loss-of-function mutant in Nematostella. This is an important achievement that should help in understanding the early evolution of circadian clocks. Unfortunately, this study currently suffers from several important weaknesses. In particular, the authors do not present their work in a clear fashion, neither for a general audience nor for more expert readers, and there is a lack of attention to detail. There are also important methodological issues that weaken the study, and I have questions about the robustness of the data and their analysis. I am hoping that the authors will be able to address my concerns, as this work should prove important for the chronobiology field and beyond. I have highlighted below the most important issues, but the manuscript needs editing throughout to be accessible to a broad audience, and referencing could be improved.

      Major issues:

      (1) Why do the authors make the claim in the abstract that CLOCK function is conserved with other animals when their data suggest that it is not essential for circadian rhythms? dCLK is strictly required in Drosophila for circadian rhythms. In mammals, there are two paralogs, CLOCK and NPAS2, but without them, there are no circadian rhythms either. Note also that the recent claim of BMAL1-independent rhythms in mammals by Ray et al., quoted in the discussion to support the idea that rhythms can be observed in the absence of the positive elements of the circadian core clock, had to be corrected substantially, and its main conclusions have been disputed by both Abruzzi et al. and Ness-Cohn et al. This should be mentioned.

      Response: According to our behavioral and transcriptomic data, CLOCK function is conserved under constant light conditions. In the LD context, rhythmicity is probably maintained by the light-response pathway in Nematostella. We modified our rhythmic transcriptomic analysis, considered the context of the contested results by Ray et al., and discussed it in the revised manuscript.

      (2) The discussion of CIPC on line 222 is hard to follow as well. How does mRNA rhythm inform the function of CIPC, and why would it function as a "dampening factor"? Given that it is "the only core clock member included in the Clock-dependent CCGs," (220) more discussion seems warranted. Discussing work done on this protein in mammals and flies might provide more insight.

      Response: The initial sentence was unclear. Furthermore, since we restricted our rhythmic analysis to genes only found rhythmic with a p<0.01 with RAIN combined with JTK, NvCipc was no longer defined as rhythmic in free running.

      (3) The behavioral arrhythmicity seen with their Clock mutation is really interesting. However, what is shown is only an averaged behavior trace and a single periodogram for the entire population. This leaves open the possibility that individual animals are poorly synchronized with each other, rather than arrhythmic. I also note that in DD there seem to be some residual rhythms, though they do not reach significance. Thus, it is also possible that at least some individual animals retain weak rhythms. The authors should analyze behavioral rhythms in individual animals to determine whether behavioral rhythmicity is really lost. This is important for the solidity of their main conclusions.

      Response: Fig. 1 has been modified. We have separated the data for WT and NvClk1-/- animals to provide clarity on the average behavior pattern for each genotype. While the LSP analysis on the population average informs us about the synchronization of the population, it is true that it does not provide insight into individual rhythmicity. To address this, we analyzed individuals in all conditions using the Discorhythm website (Carlucci et al., 2019).

      In the revised figure, we have included a comparison plot of the acrophase of 24-hour rhythmic animals between genotypes using Cosinor analysis, which is most suitable for acrophase detection. This plot indicates the number of animals detected as significantly rhythmic, providing direct visual input to the reader regarding individual rhythmicity. Additionally, we have added Table 1, which contains the Cosinor period analysis (24 and 12 hours) of individuals for all genotypes and conditions, further enhancing the clarity of our findings.
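
      As an illustration of the fixed-period Cosinor fit mentioned above, the following is a minimal sketch, not the DiscoRhythm implementation. It assumes evenly or unevenly sampled activity values with known time stamps in hours, fits y = M + A cos(2π(t − φ)/T) by ordinary least squares, and returns the mesor M, amplitude A, and acrophase φ. The function names and the synthetic data are hypothetical, and the significance test that Cosinor normally reports is omitted.

```python
import math

def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def cosinor_fit(times, values, period=24.0):
    """Least-squares fit of a single cosine with a fixed period (hours).

    Returns (mesor, amplitude, acrophase), where acrophase is the time of
    the fitted peak, expressed in hours within [0, period).
    """
    omega = 2 * math.pi / period
    # Design matrix columns: intercept, cos(omega t), sin(omega t).
    rows = [(1.0, math.cos(omega * t), math.sin(omega * t)) for t in times]
    # Normal equations: (X^T X) beta = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(3)]
    mesor, beta, gamma = solve3(xtx, xty)
    amplitude = math.hypot(beta, gamma)
    acrophase = (math.atan2(gamma, beta) / omega) % period
    return mesor, amplitude, acrophase

# Synthetic example (hypothetical data): hourly samples over 48 h, peak at hour 6.
times = list(range(48))
values = [10 + 3 * math.cos(2 * math.pi * (t - 6) / 24) for t in times]
mesor, amplitude, acrophase = cosinor_fit(times, values)
```

      Because the period is fixed in advance, such a fit can only report whether an animal is rhythmic at that period; it does not estimate each animal's own period.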

      (4) There is no mention in the results section of the behavior of heterozygotes. Based on supplement figure 2A, there is a clear reduction in amplitude in the heterozygous animals. Perhaps this might be because there is only half a dose of Clock, but perhaps this could be because of a dominant-negative activity of the truncated protein. There is no direct functional evidence to support the claim that the mutant allele is nonfunctional, so it is important to discuss carefully studies in other species that would support this claim, and the heterozygous behavior since it raises the possibility that the mutant allele acts as a dominant negative.

      Response: Extended Data Fig. 1 has been modified. We show the normalized locomotion over time of the NvClk1+/- population in DD, a comparison of individual normalized behavior amplitudes, the LSP of the population average, and the individual acrophases of 24h-rhythmic individuals only. Indeed, we cannot discriminate a dominant-negative from a non-functional allele.

      (5) I do not understand what the bar graphs in Figure 2E and 3B represent - what does the y-axis label refer to?

      Response: Not relevant to the revised manuscript.

      (6a) I note that RAIN was used, with a p<0.05 cut-off. I believe RAIN is quite generous in calling genes rhythmic, and the p-value cut-off is also quite high. What happens if the stringency is increased, for example with a p<0.01.

      Response: We acknowledge your concern regarding the stringency of our statistical analysis. To address this, we opted to combine both RAIN and JTK methods and applied a more stringent p-value cut-off of p<0.01.
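
      The combined criterion described in this response can be sketched as follows. This is an illustrative sketch only: it assumes the per-gene p-values from RAIN and JTK have already been computed and exported as two mappings, and the gene identifiers, the p-values, and the function name are hypothetical.

```python
def rhythmic_genes(rain_p, jtk_p, alpha=0.01):
    """Return the genes whose p-value is below alpha in BOTH methods."""
    shared = rain_p.keys() & jtk_p.keys()  # genes tested by both methods
    return sorted(g for g in shared if rain_p[g] < alpha and jtk_p[g] < alpha)

# Hypothetical p-values, for illustration only.
rain_p = {"geneA": 0.001, "geneB": 0.02, "geneC": 0.004}
jtk_p = {"geneA": 0.005, "geneB": 0.001, "geneC": 0.03}

selected = rhythmic_genes(rain_p, jtk_p)
```

      Requiring agreement between the two methods is stricter than either method alone, since a gene passing only one test (geneB or geneC here) is discarded.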

      (6b) It would be worth choosing a few genes called rhythmic in different conditions (mutant or wild-type. LD or DD), and using qPCR to validate the RNAseq results. For example, in Figure 3D, Myh7 RNAseq data are shown, and they do not look convincing. I am surprised this would be called a circadian rhythm. In wild-type, the curve seems arrhythmic to me, with three peaks, and a rather large difference between the first and second ZT0 time point. In the Clock mutants, rhythms seem to have a 12hr period, so they should not be called rhythmic according to the material and methods, which says that only ca 24hr period mRNA rhythms were considered rhythmic. Also, the result section does not say anything about Myh7 rhythms. What do they tell us? Why were they presented at all?

      Response: Regarding the suggestion for independent verification of our RNAseq results, we agree that such validation would enhance the robustness of our findings. To address this, we chose to overlap our identified rhythmic genes under WT LD conditions with those from another transcriptomic study that shared similarities in experimental design. Notably, the majority of overlapping rhythmic genes between the studies are candidate pacemaker genes. We believe that this replication of biologically significant rhythmic genes strengthens the validity and reliability of our results (see Extended Data Fig. 2).

      Furthermore, we have decided to remove the NvMhc-st (mistakenly named Myh7, only rhythmic in WT DD in the new analysis) as it does not contribute substantively to the revised version of the manuscript.

      (7) The authors should explain better why only the genes that are both rhythmic in LD and DD are considered to be clock-controlled genes (CCGs). In theory, any gene rhythmic in DD could be a CCG. However, Leach and Reitzel actually found that most genes in DD1 do not cycle the next day (DD2)? This suggests that most "rhythmic" genes might show a transient change in expression due to prolonged obscurity and/or the stress induced by the absence of a light-dark cycle, rather than being clock controlled. Is this why the authors saw genes rhythmic under both LD and DD as actual CCGs? I would suggest verifying that in DD the phase of the oscillation for each CCG is similar to that in LD. If a gene is just responding to obscurity, it might show an elevated expression at the end of the dark period of LD, and then a high level in the first hours of DD. Such an expression pattern would be very unlikely to be controlled by the circadian clock.

      Response: As we modified our transcriptomic analysis, we no longer analyze LD+DD rhythmic genes, but any gene rhythmic (RAIN and JTK p<0.01) in each condition. We thus end up with four lists of genes, one for each experimental condition.

      (8) Since there are still rhythms in LD in Clock mutants, I wonder whether there is a paralog that could be taking Clock's place, similar to NPAS2 in mammals.

      Response: See the response to (1). NPAS3, the only NPAS2 ortholog identified in Nematostella, showed marginal significance (p=0.013) with RAIN in LD WT, suggesting a regulation similar to that of the candidate pacemaker genes. We therefore included it in our candidate pacemaker gene list.

      (9) I do not follow the point the authors try to make in lines 268-272. The absence of anticipatory behavior in Drosophila Clk mutants results from disruption of the circadian molecular clock, due to the loss of Clk's circadian function. Which light-dependent function of Clock are the authors referring to, then? Also, following this, it should be kept in mind that clock mutant mice have a weakened oscillator. The effect on entrainment is secondary to the weakening of the oscillator, rather than a direct effect on the light input pathway (weaker oscillators have increased response to environmental inputs). The authors thus need to more clearly explain why they think there is a conservation of circadian and photic clock function.

      Response: Following the changes in our statistical analysis, we reframed the discussion and directly address the circadian and the photic clock functions (the latter is called the light-response pathway in the manuscript).

      Recommendations for the authors:

      We suggest the following improvements:

      (1) Please undertake a serious effort to make this work more accessible to non-marine chronobiologists. This includes better explanations, and schemes of the animal when images of staining are shown (e.g. Fig.1b) which include the labeling of relevant morphological structures mentioned in the text (like "tentacle endodermis and mesenteries" (line 132)). Similar issues for mentioned life cycle stages like "late planula stage" (line 133), "bisected physa" (line 149).

      Response: In Fig. 1b, we outlined the animal's shape and added two arrows to locate the tentacle endodermis and mesenteries. We replaced the term "late planula stage" with "larvae" and rephrased "bisected physa" as "tissue sampling".

      Please attend to details. This includes:

      • Wrong referrals to figures (currently line 151 refers to EDF2- but should be EDF 1 instead, there is a Fig.3f mentioned in the text, but there is no such Fig.).

      Response: Fixed

      • Mentioning of ZTs when the HCR stainings were performed.

      Response: Fixed

      • Fig.1 a shows a rather incomplete and thus potentially confusing phylogenetic tree. Vertebrates have at least two Clk orthologs (NPAS2 and CLK), please include both, use an outgroup, and root the tree.

      Response: Identifying NPAS2 and CLK orthologs in all species added more confusion to the conclusion. However, we followed the suggestion to add an outgroup, using a CLK ortholog sequence identified in the sponge Amphimedon queenslandica, and to root the tree. Thank you for the suggestion.

      • What do the y-axis labels in Figure 2E and 3B refer to exactly? Y-axis label annotations in Fig.3a,d are entirely missing- what do the numbers refer to?

      Response: not relevant in the revised manuscript

      • Fig.2D- is the Go term enrichment referring to LD or DD?

      Response: It refers to DD. We made this clear in Figure 5.

      • Wording: "Clock regulates genetic pathways." What is meant by "genetic pathways"? There are no "non-genetic pathways". Could one simply say: "Clock regulates a variety of transcripts".

      Response: We modified our threshold to use only p.adj<0.01, which reduced the GO term numbers. We removed “genetic pathways” and now address the specific pathways: cell-cycle and neuronal.

      The use of the term "epistatic" is confusing (line 219), i.e. that light is epistatic to Clock. In genetics, epistasis is defined as the effect of gene interactions on phenotypes. To a geneticist, this implies that there is a second gene impacting on the phenotype of the Clock mutants. Please re-word.

      Response: “light is epistatic on Clock” has been re-phrased.

      The provided Supplementary tables are not well annotated. Several of them need guess-work about what is shown. For instance, for Supplementary Table 1, the Ns are unclear, which in total can go up to almost 200 per condition-genotype, but only about 30 animals for each were tested. Thus, where do the high totals in the LSP table come from? What do the numbers of each periodicity mean? Initially one might assume it was the number of animals that showed a periodogram peak at a given periodicity, but it seems that cannot be. Maybe it counted any period bin over statistical significance? Please clarify with better descriptions and labels.

      Response: Supplementary tables are now clearly annotated on their first Tabs. About Fig.1, we already addressed this point in the public review.

      Albeit not essential, it would be more reader-friendly to also add a summary table with average period and SD, power and SD, and percentage rhythmicity to the main figure.

      Response: Table 1 has been added; it contains the individual counts of rhythmic animals (24h and 12h) detected with Cosinor. However, Discorhythm required us to specify a period. We can therefore only provide the number of animals significantly rhythmic for a given period value, not an estimate of each animal's own period.

      (2) Some of the terminology is quite confusing, in particular the double meaning of the word "clock" (i.e the pacemaker and the transcription factor). This is not a specific problem to this manuscript, but it would be helpful for the readability to try to improve this.

      Could the gene/transcript/protein be spelled: clk and Clk?

      Alternatively, for clarity- how about talking about "core pacemaker genes," "CLOCK-dependent rhythmic genes" and "CLOCK-independent rhythmic genes"?

      Response:

      Clock/CLOCK > NvClk / NvCLK and the mutant is NvClk1-/-

      Core clock genes > candidate pacemaker genes.

      CLOCK-dependent CCG > this notion no longer exists in the revised manuscript.

      CLOCK-independent CCG > this notion no longer exists in the revised manuscript.

      (3) The dismissal of the 12h rhythmicity in Clock-/- animals is not really convincing and should be reconsidered. LD6:6 cycles (before free-running animals in DD) is likely a not particularly robust way to entrain tidal animals. Recent papers show inundation/mechanical agitation are more reliable cues (Kwiatkowski ER, et al. Curr Biol. 2023, 2;33(10):1867-1882.e5. doi: 10.1016/j.cub.2023.03.015; Zhang L., et al Curr Biol. 2013, 23;19, 1863-1873 doi.org/10.1016/j.cub.2013.08.038.) and might be more effective in revealing endogenous 12h rhythms in the absence of 24h cues.

      Response: We removed the proposition of using 6:6hLD as Tidal entrainment. Instead, the LD 6:6 experiment reveals the direct light-dependency of the NvClk1-/- mutant.

      (4) There are significant questions raised on the validity of BMAL1-independent rhythms in mammals as suggested by the Ray et al study. See DOI: 10.1126/science.abe9230 and DOI: 10.1126/science.abf0922

      These technical comments should also be taken into account and the discussion adjusted accordingly to better reflect the ongoing discussions in the chronobiology field.

      Response: We modified our rhythmic analysis. As we could not use BHQ or adjusted p-values, which left very few genes, we defined genes as 24h-rhythmic if p<0.01 with two different algorithms (RAIN and JTK). We propose this compromise to reduce the risk of false positives. Furthermore, we discussed our methodology in light of the significant questions raised by the papers you cited. We thank the reviewer for this important point.

      (5) The HCR stainings for clk are not very convincing. Normally, HCR should have more dots. In principle, the logic of HCR is such that it detects individual mRNA molecules in the cell. Thus, having only one strong dot/cell like in Fig.1b doesn't make much sense.

      Response: We were the first to be surprised by this single-dot signal. We are experienced users of HCRv.3 across different species. We decided to remove the close-up (pending further investigation) but to keep the full-animal signal. According to our approach, it is a convincing signal. However, given the dotted nature of the signal, it is not easy to make it clearly visible at the scale of the full animal in the picture. We did our best to make the mRNA signal visible without altering the pattern.

      Furthermore, the controls for the HCR in situ hybridization are unclear. In the methods, there are two Clock probes described (B3 & B5) and two control probes (B1 & B3), however, in the negative control image, a combination of one Clock (B1) and one control (B3) probes is used and is unclear what "redundant detection" means in the legend of figure S2.

      Response: Considering the nature of the signal (single or few dots), we decided to use two probes with two different fluorophores. Noise is random by nature. Our hypothesis was that only overlapping fluorescent dots are true NvClk mRNA signal.

      For Control probes we used two zebrafish probes labelling hypothalamic peptides.
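
      The overlap criterion stated in this response can be illustrated with a minimal sketch. It assumes that dots have already been detected in each fluorescence channel and are available as (x, y) pixel coordinates; the function name, the coordinates, and the 2-pixel tolerance are hypothetical and are not taken from the manuscript.

```python
import math

def colocalized_dots(channel_a, channel_b, max_dist=2.0):
    """Return the dots of channel_a that have a partner in channel_b
    within max_dist (same units as the coordinates)."""
    return [
        a for a in channel_a
        if any(math.dist(a, b) <= max_dist for b in channel_b)
    ]

# Hypothetical dot coordinates, for illustration only.
dots_fluor_1 = [(10.0, 10.0), (50.0, 50.0)]
dots_fluor_2 = [(11.0, 10.0), (80.0, 80.0)]

true_signal = colocalized_dots(dots_fluor_1, dots_fluor_2)
```

      Under the stated hypothesis, a dot seen in only one channel, such as (50.0, 50.0) here, would be treated as noise.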

      Based on the experience with non-Drosophila, non-mouse animal model systems the reviewers assume that non-sense mediated mRNA decay (NMD) is not strongly initiated upon Crispr-induced premature STOP-codons. If this assumption is correct it would be worth to mention it. Alternatively, it would be worth testing if Nematostella induces NMD, as this would be a great control for the HCR and the mutation itself. At which ZT was the HCR done?

      Response: We performed the HCR at ZT10 when NvClk is described to be at peak. It is now indicated in the Fig. 1b. The RNAseq detected a higher quantity of NvClk1 mRNA in the NvClk1-/- (see Fig. 4a). mRNA quantity regulation involves transcription, stabilization, and degradation. At this stage, we cannot identify which specific step is affected.

      For Fig.1c- please provide the binding site and sequence in the figure, simply include EDF 1 in the main figure.

      Response: We generated a clear indication in the new Fig.1c and EDF. 1b about the protein domains, the CRISPR binding site and the consequences on the DNA and AA sequences.

      (6) Please provide the individual trace data for the behavioral analyses either as supplementary files or as a link to an openly accessible database like DRYAD (see also comment 7 in the public review of reviewer 2). Maybe this is what is shown in Supplementary Table 1, but it is really not clear what is actually shown.

      Response: Fig. 1 is updated and Table 1 is added. Supplementary Table 1 contains the individual normalized locomotor data of each polyp for each genotype and light condition. Supplementary Table 2 contains the Cosinor analysis of individual rhythmic behavior based on Supplementary Table 1.

      (7) It is not really clear if the mutation is a true loss-of-function or could also be dominant negative. While this is raised in the discussion, it should be more carefully considered. The reason why a dominant negative would be unlikely is unclear. More specifically also see comment 8) in the public review of reviewer 2.

      Response: Indeed, the results cannot tell us if it is a true loss of function, a dominant negative or non-functional allele. We addressed it in the first part of the discussion.

      (8) The pretty small overlap of rhythmic transcripts in LD and DD could reflect the true biology of a more core clock driven-process under constant conditions and a more light-driven process under LD. But still- wouldn't one expect that similar processes should be rhythmic? If not, why not?

      It would certainly add strength to the data if for one or two transcripts these results were independently verified by qPCR from an independent sampling. This could even be done for just two time points with the most extreme differences.

      Response: We appreciate the reviewer's comments and concerns regarding the overlap of rhythmic transcripts in different conditions. In response to the reviewer's query, we revised our interpretation of the transcriptomic data, acknowledging the limited overlap between light and genotype conditions in our study. This prompted us to reconsider the underlying biological processes driving rhythmic gene expression under constant conditions versus light-dark cycles.

      Regarding the suggestion for independent verification of our RNAseq results, we agree that such validation would enhance the robustness of our findings. To address this, we chose to overlap our identified rhythmic genes under WT LD conditions with those from another transcriptomic study that shared similarities in experimental design. Notably, the majority of overlapping rhythmic genes between the studies are candidate pacemaker genes. We believe that this replication of biologically significant rhythmic genes strengthens the validity and reliability of our results (see Extended Data Fig. 2).

      (9) Expression of myh7 : Checking for co-expression should be pretty straightforward by HCR. This is what this type of staining technique is really good for. Please do clk and myh7 co-staining if you want to claim co-expression. Otherwise don't make such a claim.

      Response: We agree that checking for co-expression should be straightforward by HCR. However, due to time constraints during the revision period, we are unable to conduct the double in-situ experiment. Additionally, upon careful consideration, we recognize that including myhc-st (mistakenly named myh7) staining and co-expression analysis would not significantly contribute to the main conclusions of our study. Therefore, we have decided to remove this analysis from the revised manuscript.

      (10) Missing methodological details:

      • The false discovery rate for each analysis should be included (see Hughes et al.,: "Guidelines for Genome-Scale Analysis of Biological Rhythms," 2017).

      Response: The FDR is indicated for each gene in Supplementary Table 3.

      • Fig.1f- continuous light- please provide a spectrum (If there is no good spectrophotometer available, please provide at least manufacturer information.

      Response: Unfortunately, we did not have a good spectrophotometer available at the time of the revision. We added the reference of the lamp to the methods. We found the light spectrum provided by the supplier. However, we did not add it to the revised manuscript.

      Author response image 1.

      Spectrum of the Aquastar t8

      Also, it would be easier for the reader, if the measurements of light intensity are provided in photons, because this is what the light receptors ultimately measure.

      Response: Modified.

      • Fig.2E- please add the consensus sequence used for circadian E-box vs. E-box to the figure.

      Response: In Fig. 4c of the revised manuscript, we show which E-box motifs we extracted for our promoter analysis. We also changed our analysis and no longer use HOMER; instead, we directly extracted promoter sequences, searched for the canonical E-box CANNTG and the circadian E-box CACGTG, and generated a circadian E-box enrichment output per gene promoter.
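
      The motif search described in this response can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes promoter sequences are available as plain DNA strings, and the per-promoter ratio of circadian to canonical E-boxes is only one hypothetical way to express enrichment, since the exact metric is not specified here. Both motifs are their own reverse complement, so scanning one strand is sufficient.

```python
import re

# Lookaheads so that overlapping motif occurrences are all counted.
CANONICAL_EBOX = re.compile(r"(?=CA[ACGT]{2}TG)")  # CANNTG
CIRCADIAN_EBOX = re.compile(r"(?=CACGTG)")         # CACGTG

def ebox_counts(promoter):
    """Count canonical and circadian E-boxes in one promoter sequence.

    Returns (canonical, circadian, ratio); the ratio is a hypothetical
    enrichment measure and is 0.0 when no canonical E-box is present.
    """
    seq = promoter.upper()
    canonical = len(CANONICAL_EBOX.findall(seq))
    circadian = len(CIRCADIAN_EBOX.findall(seq))
    ratio = circadian / canonical if canonical else 0.0
    return canonical, circadian, ratio

# Hypothetical promoter fragment, for illustration only.
canonical, circadian, ratio = ebox_counts("ttCACGTGaaCAGCTGccCATATG")
```

      Every circadian E-box is also counted as a canonical E-box, since CACGTG is one instance of CANNTG.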

      (11) There has been some discussion about the evolutionary statement as stated by the authors. It appears that depending on the background of the reader, this can be misunderstood. We thus suggest to more clearly point out where the author thinks there is evolutionary conservation (a function for clk in the circadian oscillator under constant light or dark conditions) versus where there is no apparent evolutionary conservation (the situation under light-dark conditions).

      Response: In the revised manuscript we proposed a conserved function of NvCLK in constant darkness, and a light-response pathway compensating in LD conditions in the mutant.

      Please also consider the major comments 8 and 9 of the common review from reviewer 2.

      Reviewer #1 (Recommendations For The Authors):

      The hybridization chain-reaction ISH is OK but, I'm not sure I understand the control condition-this should be clarified. I would also welcome the use of Clock-/- animals in HCR as another, more direct level of control. In addition, the authors state that the Myh7 probes hybridise in anatomical regions resembling those for Clock (Fig 3e). It would be better to duplex these two probe sets with different fluors for a better representation of the relative spatial distributions of each transcript.

      Response: We agree that checking for co-expression should be straightforward by HCR. However, due to time constraints during the revision period, we are unable to conduct the double in-situ experiment. Additionally, upon careful consideration, we recognize that including myhc-st (mistakenly named myh7) staining and co-expression analysis would not significantly contribute to the main conclusions of our study. Therefore, we have decided to remove this analysis from the revised manuscript.

      We clarified in the methods the control probes design.

      Minor points:

      Figure legends do not all convey sufficient detail. For instance, Figure 1c needs a better explanation. Figure 3e- are these images both WT? Fig 3f doesn't exist and other figure text references do not align with figures and need an overhaul.

      Response: All errors have been fixed.

      Reviewer #2 (Recommendations For The Authors):

      Major issues:

      (1) The authors need to introduce their model system better for a broad audience. What are the tissues/cells that express Clock at a higher level? What is their function, does this provide a potential explanation for their specific Clock expression, and how CLOCK might regulate behavior? Terms such as "tentacle endodermis and mesenteries" (line 132), "late planula stage" (line 133), "bisected physa" (line 149) would need some explanation.

      Response: We modified terms such as planula to larvae, and bisected physa to tissue samples.

      2) Some of the terminology used is quite confusing, because of the double-meaning of the word "clock" (i.e the pacemaker and the transcription factor). The authors use terms such as "clock-controlled genes", "core clock genes", "CLOCK-dependent clock-controlled genes", "neo-clock-controlled genes". Is there any way to help the reader? Here are several suggestions: "core pacemaker genes," "CLOCK-dependent rhythmic genes" and "CLOCK-independent rhythmic genes".

      Response: all the terminology has been clarified, see previous comments

      3) Also in the abstract, there is mention of "hierarchal light- and Clock-signaling" (52-3) - is this related to the statement on line 219 that light is epistatic to Clock? I do not quite understand what epistatic would mean here. Who is upstream of whom? LD modifies rhythmicity in Clock mutant animals, but Clock mutations also impact rhythmicity in LD. Also, as epistasis is defined as the effect of gene interactions on phenotypes - what is the secondary gene impacting the phenotype of the Clock mutants? I am not sure the term epistatic is appropriate in the present context.

      Response: Indeed, Epistatic is a genetic term which might be unclear in this context. We removed it.

      4) The control for the in situ hybridization is unclear. In the methods, there are two Clock probes described (B3 & B5) and two control probes (B1 & B3), however, in the negative control image, a combination of one Clock (B1) and one control (B3) probe is used, I am not sure what "redundant detection" means in the legend of figure S2. Also, the sequences of each Clock probe should be provided. It might be worth testing the Clock mutant the authors generated. Clock mRNA could be reduced due to non-sense, mediated RNA decay, since the mutation causes a premature stop codon. This would be a great additional control for the in situ hybridization. Even better would be if, by chance, the probes target the mutated sequence. The signal should then be completely lost.

      Response: HCR uses tiling probes, meaning that the target transcript is covered by dozens of successive "primer-like" DNA sequences, as required by the HCRv.3 technology. We cannot design a mutant-specific probe with this technology.

      (5) I have concerns with rhythmic-expression calls, particularly as there is so little overlap between LD and DD, and that a completely different set of rhythmic genes is observed in Clock mutant and wild-type animals. I am not an expert in whole-genome expression studies, so I hope one of my colleague reviewers can weigh in.

      When describing rhythmicity analysis in the Methods, it states that Benjamini-Hochberg corrections were applied to account for multiple comparisons. However, the false discovery rate for each analysis should be included (see Hughes et al.,: "Guidelines for Genome-Scale Analysis of Biological Rhythms," 2017).

      Response: As explained before, we cannot use Benjamini-Hochberg corrections, as only a few genes (mostly oscillator genes) pass the threshold. We therefore combined two different algorithms (RAIN and JTK) with p<0.01 to confidently detect rhythmic genes while reducing the risk of false positives.

      Minor issues:

      (1) Environmental inputs are not "circadian", as written in the title.

      Response: Title modified

      (2) In the abstract, the description of the Clock mutant behavioral phenotypes is hard to follow, with no mention of whether or not Clock mutant animals are behaviorally rhythmic or arrhythmic in constant conditions.

      Response: corrected

      (3) Abstract: A 6/6 h LD cycle is not a compressed tidal cycle as written in the abstract. Light is not an input to tidal rhythms.

      Response: corrected

      (4) Line 101: timeout is not a core clock gene in animals.

      Response: we removed it from the candidate pacemaker genes.

      (5) What is the evidence for the role of PAR-Zip proteins in the Nematostella clock? The reference provided does not mention those.

      Response: There are no functional data in Nematostella yet to support their role within the pacemaker. However, based on their rhythmicity in LD and protein conservation, we included them in the candidate pacemaker gene list. The references have been corrected.

      (6) Line 125. should refer to Fig 1C when describing the Clock protein.

      Response: corrected

      (7) Line 143-4. based on the figure, the region targeted by gRNA was not "close to the 5' end" as stated, it is closer to the middle of the gene sequence as shown in Figure 1C. A more accurate description would be a region in between the PAS domains.

      Response: Indeed we modified the figure and the text.

      (8) Line 150. The mutant allele is described as Clock1 initially, then for the rest of the paper as Clock-. Since it is not clear that the allele is a null (see major comment #8), Clock1 should be used throughout the manuscript.

      Response: the allele is named NvClk1 in the revised manuscript

      (9) Figure 2A, the second CT/ZT0 is misplaced.

      Response: Fig. 2 modified in the revised manuscript

      (10) Figure legend for 2E and 3B. "The 1000bp upstream ATG" is unclear. I guess it means that 1000bp upstream of the putative initiation codon was used.

      Response: Right, and in the revised version we analyzed 5kb upstream of the putative ATG.

      (11) Line 164. The authors write "We discovered..." , but wasn't it already known that these animals are behaviorally rhythmic?

      Response: Fixed

      (12) It would be worth mentioning in the results section the reduced amplitude of rhythms in LL compared to DD (in WT and seemingly also in Clock mutants).

      Response: Indeed, we observed a significant reduction in the mean amplitude in NvClk1-/- in DD and LL compared to WT and NvClk1-/- in LD, DD and LL. However, as rhythmicity is lost by virtually all mutants in LL and DD, we do not think these results add to the current interpretation of the gene function.

      (13) Please correct the figure numbers in the main text, there are several mistakes.

      Response: Done

      (14) Line 196, most genes in the quoted study did not cycle on day 2, so whether they are truly clock controlled is questionable.

      Response: We agree; identifying free-running cycling genes in cnidarians remains a challenge. One of the limitations of this study was detecting genes rhythmic in LD that retained rhythmicity in DD. However, considering different transcriptomic studies (cited in the discussion), it seems that in the phylum Cnidaria genes rhythmic in LD are not necessarily those we identified as rhythmic in DD.

      (15) Line 204-206 needs to be rephrased. It is confusing.

      Response: rephrased

      (16) Line 216. Rephrase to something like: "A similar finding was made for."

      Response: rephrased

      (17) "Clock regulates genetic pathways" sounds quite odd. Do you mean it regulates preferentially specific genetic (or maybe better, molecular) pathways?

      Response: rephrased

      (18) Figure 4 and legend: Dashed lines indicating threshold are missing. Do the black and red dots represent WT and Clock-/-, as indicated in the legend, or up/down, as indicated in the figures?

      Response: Fig.5 modified accordingly. Colors in the Volcano plot indicate Up- (black) versus Down- (red) regulated. It is now coherent within the figure.

      (19) Legend for Extended figure 1. "Immature peptide sequence" is incorrect.

      Response: rephrased

      (20) Extended data Figure 4. What the asterisks labels is unclear.

      Response: EDF4 was modified and became EDF2, with different content. The * indicates NvClk mRNA.

      (21) Line 228. Gene "isoforms". I guess the authors mean "paralogs".

      Response: corrected.

      (22) Line 232-3/Figure 3e. Please include a comparable image of the Clk ISH to facilitate the comparison of the spatial expression pattern. In addition, where and what is the "analysis" referred to - "the spatial expression pattern of Myh7 closely resembled that of Clock, as evidenced by our analysis"?

      Response: The analysis has been removed from the revised manuscript because we currently cannot perform the double ISH.

      (23) Line 282-3. As mentioned above, it is difficult to be sure that circadian behavior is lost, if only looking at a population of animals.

      Response: Fig.1 corrected

      (24) Line 301-5. Rephrase.

      Response: Rephrased

      (25) Line 325. I am not convinced that the author can say that their mutant is amorphic. See Major comment 8.

      Response: corrected.

      (26) Line 351 "simplifying interactions with the environment". Please explain what is meant here.

      Response: this confusing sentence has been removed from the revised manuscript

    1. eLife assessment

      This important study provides previously unappreciated insights into the functions of protist eIF4E 5'mRNA cap-binding protein family members, thereby contributing to a better understanding of translation regulation in these organisms. The authors provide solid evidence to support the major conclusions of the article. However, the study may further benefit from establishing whether all of the eIF4E family members are indeed involved in translation and more direct evidence for the selectivity of their binding.

    2. Reviewer #1 (Public Review):

      Using A. carterae as a model system, this work investigates the properties of the trans-spliced SL leader sequences and the dinoflagellate eIF4E protein family members.

      Analysis was performed to identify the 5' cap type of the SL leader. Variation in the SL leader sequence and an abundance of modified bases was documented.

      Various aspects of the sequence and expression of the eIF4E family members were examined. This included phylogeny, mRNA, and protein expression levels in A. carterae, and the ability of eIF4E proteins to bind cap structures. Differences in expression levels and cap-binding capacity were characterized, leading to the proposition that eIF4E-1a serves as the major cap-binding protein in A. carterae.

      A major discussion point is the potential for differential eIF4E binding to specific SL leader sequences as a regulatory mechanism, which is an exciting prospect. However, despite indications of sequence variability and the presence of various nucleotide modifications in the SL, and the several eIF4E variants, direct evidence to support this hypothesis is lacking.

      It is an extensive and highly descriptive study. The work is presented clearly, although it is rather lengthy and contains repetition across the introduction, results, and discussion sections. Its style leans more towards a review format. As a non-expert in the field, I appreciated the extensive background; however, I do believe the paper would benefit from a more concise format.

    3. Reviewer #2 (Public Review):

      Summary:

      Jones et al. extend their previous work on the translation machinery in dinoflagellates. In particular, they study the species Amphidinium carterae. They characterize the type of cap structure mRNAs possess in this species, as well as the eight eIF4E family members A. carterae possesses and their affinity for the mRNA cap. They also establish the leader sequences of the trans-spliced mRNAs that A. carterae generates during gene expression.

      Strengths:

      The authors performed a solid phylogenetic and biochemical study to understand the structure of dinoflagellate mRNAs at the 5'-UTR as well as the divergence and biochemical features of eIF4Es across dinoflagellates. They also establish eIF4E-1a as the prototypical paralog of the eIF4E family of proteins. The scientific questions they ask are very relevant to the gene expression field across eukaryotes. The experiments and the phylogenetic analysis are of very high quality. They use a wide spectrum of experimental approaches and techniques to answer the questions.

      Weaknesses:

      The authors assume that all eIF4Es from dinoflagellates are involved in translation, i.e., mRNA recruitment to the ribosome. Indeed, they think that the diverse biochemical features of the eIF4Es in A. carterae relate to the possible recruitment of different subsets of mRNAs to the ribosome for translation. I think that the biochemical differences among the paralogs might also be due to the involvement of some of them in processes of RNA metabolism other than translation. For instance, some of them could be involved only in RNA processing in the nucleus or in mRNA storage in cytoplasmic foci.

    4. Reviewer #3 (Public Review):

      Summary:

      In this article, the authors provide an inventory of the 5' spliced leader sequences, cap structures, and eIF4E isoforms present in the model dinoflagellate species A. carterae. They provide evidence that the 5' cap structure is m7G, as it is in most characterized eukaryotes that do not employ trans-splicing for mRNA maturation, and that there are additional methylated nucleotides throughout the spliced leader RNAs. They then show that of the 8 different eIF4E species in A. carterae, only a subset of eIF4E1 and eIF4E2 proteins are detected and that the levels change according to time of day. Interestingly, while the eIF4E1 proteins bind a canonical cap nucleotide and are able to complement eIF4E-deficiency in yeast, an eIF4E2 paralog does not bind the traditional cap.

      Strengths:

      A strength of the article is that the authors have clearly presented the findings and, by straying away from traditional model organisms, have highlighted unique and interesting features of an understudied system for translational control. They provide complementary evidence for most findings using multiple techniques. For example, the evidence that eIF4E1A binds m7GTP is supported both by pulldowns using m7GTP sepharose and by SPR experiments that directly monitor binding of recombinant protein with affinity measurements. The methods are extremely detailed, noting the cell numbers, volumes, concentrations, etc. used, so that the experiments can be easily replicated.

      Weaknesses:

      While not necessary to support the authors' conclusions, the significance of the work would be further enhanced by additional experiments to gain insights into mechanisms for translational control and to link specific SLs to organismal functions or mechanisms of mRNA recruitment.

      -Monitoring diel expression of SLs and direct sequencing of mature mRNA would yield insights into whether there is regulated expression of RNAs with different SLs or the SLs themselves. This would also allow the authors to perform gene ontology to link SL expression at different points in the diel cycle to related functions, e.g. photosynthesis.

      -In addition, the work would be strengthened by polysome sequencing or ribosome profiling as a function of the diel cycle, with analyses of when various spliced leader sequences are recruited to ribosomes, in parallel with western blotting of polysome fractions to determine when various eIF4E isoforms are present on polysomes. This would, however, be a substantial expansion of what the authors focused on in this manuscript, and not having these experiments does not undermine the findings presented. Alternatively, they could attempt to make bioinformatic comparisons with existing ribosome profiling datasets from a related dinoflagellate, Lingulodinium polyedrum, discussed briefly, if there were sufficient overlap between SL RNAs in these organisms.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      (1) Figures 1B, S4, and S5, Tibia sections would be more informative and promising as the growth plate is flat. Otherwise, histology of the knee would be preferred.

      We have added the tibia section images in Figures 1B, S4, and S5 (New Figure 1B, Figure 2-figure supplement 3A, and Figure 3-figure supplement 1A).

      (2) Figure 1C: The authors performed immunostaining for vimentin, alpha-SMA, Col1a1 and Col1a2. The authors should use adjacent sections when immunostaining with different antibodies. This would avoid region-specific variations in the size and shape of sections, and the data would be more reliable. Please correct and revise.

      We have provided immunostaining results using consecutive sections at similar locations of the external ear (Figure 1C).

      (3) Figure 2A and throughout the manuscript: where the authors performed p-Smad1/5/9 fluorescent immunostaining, they should also show the levels of total (non-phosphorylated) Smad1/5/9. Please correct and revise.

      We tried different anti-Smad1/5/9 antibodies, but the signals had very high background and were not presentable. We instead performed a western blot on auricle samples, and the results are shown in Figure 2-figure supplement 1A, suggesting that ablation of Bmpr1a led to loss of activation of Smad1/5/9 without affecting their expression. For different segments of the external ear, we also provide WB results in Figure 2-figure supplement 4B. In addition, we added RNA-seq data on Smad1/5/9 mRNA levels, which were not affected by Bmpr1a ablation (Figure 4-figure supplement 1B). Overall, these results suggest that Bmpr1a ablation does not affect the expression of Smad1/5/9.

      (4) Result 2, lines 131-134, the authors mentioned in the text that they observed no ear phenotype of Prrx1CreERT or Bmpr1af/f mice compared with wild-type mice (Figures S2A and S2B). However, the figures did not show histology pictures of wild-type mice. Please correct and revise.

      We have provided histological pictures of wild-type mice (Figure 2-figure supplement 2C).

      (5) Result 5, lines 173-174 "We generated....Bmpr1a floxed mice". How did authors generate Col1a2-CreERT; Bmpr1af/f mice by crossing Prrx1Cre-ERT and Bmpr1af/f mice? Please correct and revise.

      This was a typo and has been corrected.

      (6) In the previous study by Soma Biswas et al., (Scientific Reports 2018, PMID 29855498) the authors mentioned in the result section that the mice with deletion of Bmpr1a using Prx1Cre looked morphologically normal. They did not mention the ear phenotype/microtia. Please explain how this study differs from current work and what are the limitations in the discussion.

      We did not observe an obvious ear phenotype in the adult transgenic Prrx1-CreERT; Bmpr1af/f mice. The reason could be that the transgene labels too few auricle chondrocytes, as has been reported for endosteal and periosteal bones in adult mice (Liu et al. Nat Genet 2022; Wilk, K. et al. Stem Cell Rep 2017; Julien A et al. J Bone Miner Res 2022). The difference is likely caused by the fact that the transgenic CreERT line was driven by a 2.3 kilobase promoter of Prrx1 that was inserted at an unknown location in the genome. Since we no longer carry the transgenic line, we cannot directly test its labelling efficiency in the auricle. We have discussed this point in the revised manuscript.

      Reviewer #2 (Recommendations For The Authors):

      Chondrocytes are present in many parts of the body; some are replaced by osteoblasts, but others retain their morphology. These cells are in different morphological and cellular conditions throughout the body. Is there any human variant study of Prrx1 and its association with auricle chondrocytes?

      We searched the literature and found no study on Prrx1 in auricle chondrocytes in humans.

      Are auricle chondrocytes Prrx1+ throughout their developmental stages, and what is the expression pattern of Prrx1 in articular cartilage and growth plates throughout development? Is only a small population positive throughout development, or do cells lose expression as they develop?

      We traced Prrx1 lineage cells in Prrx1-CreERT; R26tdTomato mice that received TAM at E8.5, E13.5, or p21. We found that auricle chondrocytes were Tomato+ under these conditions, even when only one dose of TAM (1/10 of the dose for adult mice) was given to the pregnant mice at E8.5 or E13.5 (Figure 1-figure supplement 1). However, while E8.5 mice showed Tomato+ chondrocytes at both articular cartilage and growth plate, E13.5 or p21 mice showed far fewer Tomato+ chondrocytes at articular cartilage and growth plate (Figure 1-figure supplement 1). These results indicate that Prrx1 expression differs in cartilages during development, growth, and maintenance.

      What's your rationale for studying Bmpr1a ablation at the adult stage?

      Organ development and maintenance are different processes, especially for slow-turnover tissues. Organ maintenance is also important since it accounts for 90% of the lifetime of mice. While previous studies have uncovered essential roles for BMP signaling in chondrogenic differentiation during development, it remains unclear whether BMP signaling plays a role in cartilage maintenance in adult mice.

      Line 128: Chondrocytes are shrunken but still have normal proliferation; what are the authors' thoughts on this?

      We apologize for not making this clear enough. In fact, very few cells were undergoing proliferation in auricle cartilage, and Bmpr1a ablation did not alter that. We have rephrased these sentences.

      Do chondrocytes have protein trafficking defects or ER/Golgi stress?

      We checked the expression of proteins involved in protein trafficking and found that some were up-regulated and some were down-regulated (Figure 4-figure supplement 1D), which may reflect the shift from chondrocytes to osteoblasts and warrants further investigation. However, the expression of ER or Golgi stress-related genes, which play critical roles in chondrocyte differentiation and survival (Wang et al. 2018; Horigome et al. 2020), was not altered by Bmpr1a ablation (Figure 4-figure supplement 1E and 1F).

      How many Prrx paralogs are there in the system? Are all associated with auricle chondrocytes and similar mechanisms?

      There is one Prrx1 paralog, Prrx2. While Prrx1-/- mice lived for up to 24 hours after birth with low-set ears (Martin JF et al. Genes Dev. 1995), Prrx2-/- mice are perfectly normal. Prx1-/-Prx2-/- double mutant mice died within an hour after birth, and the pups showed no external ears (ten Berge D. et al. Development. 1998). We have added this information to the revised manuscript.

      Extracellular matrix (ECM) provides cell-to-cell interaction and environment for cell growth. Does Bmpr1a ablation lead to any changes in ECM at the auricle or growth plate chondrocytes?

      Our analysis showed that the expression of many ECM proteins was down-regulated in auricle cartilage of Prrx1-CreERT; Bmpr1af/f mice (Figure 4-figure supplement 1A). This may reflect the shift from chondrocytes to osteoblasts and warrants further investigation. However, immunostaining revealed that the expression of Aggrecan and Col10 in the growth plates was unaltered in adult Prrx1-CreERT; Bmpr1af/f mice compared to control mice (Figure 4-figure supplement 1C), likely due to the lack of marking of chondrocytes in growth plates.

      Microtia usually develops during the first trimester of pregnancy in humans. What's your view about studying at the adult stage compared to intrauterine development?

      Congenital microtia is a problem with the formation of the external ear, whereas microtia development in adult mice is a problem with the maintenance of the auricle chondrocytes. Organ maintenance is also an important process, as it starts from 3 months of age and lasts for 90% of the lifetime of mice.

      In the RNA sequencing protocol, a Wikipedia page is cited. Wikipedia pages are continually updated, so citing one is inappropriate. Please cite a research article instead.

      We have replaced this reference.

      Why do the authors have a very low FDR value for this study? How does this value strengthen the study?

      It was a typo that has been corrected.

      It needs further validation to show that Prrx1 marked cells are a good model for auricular chondrocyte-related studies.

      We show that Prrx1 marks auricle chondrocytes but few growth plate or articular chondrocytes in adult mice, suggesting its specificity. However, the use of the Prrx1-CreERT line in auricle cartilage studies is complicated by the labelling of dermal cells in the external ear by Prrx1. We have discussed this point in the revised manuscript.

    1. eLife assessment

      This study uses ex vivo live imaging of the uterus, uterotubal junction, and oviduct post-mating to test the role of the sperm hook of the house mouse (Mus musculus) in sperm movement, which could be interesting to evolutionary biologists. The work is useful, as the live imaging revealed sperm behaviors in the female tract that have not been previously reported. However, the strength of evidence is incomplete: the quantification of the data is limited, and the extensive speculation on the functions of these sperm behaviors is not supported by sufficient experimental evidence.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors want to determine the role of the sperm hook of the house mouse sperm in movement through the uterus. The authors are trying to distinguish between two hypotheses put forward by others on the role of the sperm hook: (1) the sperm cooperation hypothesis (the sperm hook helps to form sperm trains) vs (2) the migration hypothesis (that the sperm hook is needed for sperm movement through the uterus). They use transgenic lines with fluorescent labels to sperm proteins, and they cross these males to C57BL/6 females in pathogen-free conditions. They use 2-photon microscopy on ex vivo uteri within 3 hours of mating and the appearance of a copulation plug. There are a total of 10 post-mating uteri that were imaged with 3 different males. They provide 10 supplementary movies that form the basis for some of the quantitative analysis in the main body figures. Their data suggest that the role of the sperm hook is to facilitate movement along the uterine wall.

      Strengths:

      Ex vivo live imaging of fluorescently labeled sperm with 2-photon microscopy is a powerful tool for studying the behavior of sperm.

      Weaknesses:

      The paper is descriptive and the data are correlations.

      The data are not properly described in the figure legends.

      When statistical analyses are performed, the authors do not comment on the trend that sperm from the three males behave differently from each other. This weakens confidence in the results. For example, in Figure 1 the sperm from male 3613 (blue squares) look different from male 838 (red circles), but all of these data are considered together. The authors should comment on why sperm across males are considered together when the individual data points appear to be different across males.

      Movies S8-S10 are single data points and no statistical analyses are performed. Therefore, it is unclear how penetrant the sperm movements are.

      Movie S1B - did the authors also track the movement of sperm located in the middle of the uterus (not close to the wall)? Without this measurement, they can't be certain that sperm close to the uterus wall travel faster.

      Movie S5A - is of lower magnification (200 µm scale bar), while the others have 50 and 20 µm scale bars. Individual sperm movement can be observed at the 20 µm scale (Movie S5C). If the authors want to prove that there is no upsucking movement of sperm by the uterine contractions, they need to provide a high-magnification image.

      Movie S8 - if the authors want to make the case that clustered sperm do not move faster than unclustered sperm, then they need to show Movie S8 at higher magnification. They also need to quantify these data.

      Movie S9C - what is the evidence that these sperm are dead or damaged?

      Movie S10 - both slow- and fast-moving sperm are seen throughout the course of the movie, which does not support the authors' conclusion that sperm tails beat faster over time.

    3. Reviewer #2 (Public Review):

      Summary:

      The specific objective of this study was to determine the role of the large apical hook on the head of mouse sperm (Mus musculus) in sperm migration through the female reproductive tract. The authors used a custom-built two-photon microscope system to obtain digital videos of sperm moving within the female reproductive tract. They used sperm from genetically modified male mice that produce fluorescence in the sperm head and flagellar midpiece to enable visualization of sperm moving within the tract. Based on various observations, the authors concluded that the hook serves to facilitate sperm migration by hooking sperm onto the lining of the female reproductive tract, rather than by hooking sperm together to form a sperm train that would move them more quickly through the tract. The images and videos are excellent and inspirational to researchers in the field of mammalian sperm migration, but interpretations of the behaviors are highly speculative and not supported by controlled experimentation.

      Strengths:

      The microscope system developed by the authors could be of interest to others investigating sperm migration.

      The new behaviors shown in the images and videos could be of interest to others in the field, in terms of stimulating the development of new hypotheses to investigate.

      Weaknesses:

      The authors stated several hypotheses about the functions of the sperm behaviors they saw, but the hypotheses were not clearly stated or tested experimentally.

      The hypothesis statements were weakened by the use of hedge words, such as "may".

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public Review):

      In this study, the authors address a fundamental unresolved question in cerebellar physiology: do synapses between granule cells (GCs) and Purkinje cells (PCs) made by the ascending part of the axon (AA) have different synaptic properties from those made by parallel fibers? This is an important question, as GCs integrate sensorimotor information from numerous brain areas with a precise and complex topography.

      Summary:

      The authors argue that GCs located close to PCs essentially contact PC dendrites via the ascending part of their axons. They demonstrate that joint high-frequency (100 Hz) stimulation of distant parallel fibers and local GCs potentiates AA-PC synapses, while parallel fiber-PC synapses are depressed. On the basis of paired-pulse ratio analysis, they concluded that the evoked plasticity was postsynaptic. When individual pathways were stimulated alone, no LTP was observed. This associative plasticity appears to be sensitive to timing, as stimulation of parallel fibers first results in depression, while stimulation of the AA pathway has no effect. NMDA, mGluR1 and GABAA receptors are involved in this plasticity.

      Strengths:

      Overall, the associative modulation of synaptic transmission is convincing, and the experiments carried out support this conclusion. However, weaknesses limit the scope of the results.

      Weaknesses:

      One of the main weaknesses of this study is the suggestion that high-frequency parallel-fiber stimulation cannot induce long-term potentiation unless combined with AA stimulation. Although we acknowledge that the stimulation and recording conditions were different from those of other studies, according to the literature (e.g. Bouvier et al 2016, Piochon et al 2016, Binda et al, 2016, Schonewille et al 2021 and others), high-frequency stimulation of parallel fibers leads to long-term postsynaptic potentiation under many different experimental conditions (blocked or unblocked inhibition, stimulation protocols, internal solution composition). Furthermore, in vivo experiments have confirmed that high-frequency parallel fiber stimulation is likely to induce long-term potentiation (Jorntell and Ekerot, 2002; Wang et al, 2009). This article provides further evidence that long-term plasticity (LTP and LTD) at this connection is a complex and subtle mechanism underpinned by many different transduction pathways. It would therefore have been interesting to test different protocols or conditions to explain the discrepancies observed in this dataset.

      Even though this is not the main result of this study, we acknowledge that the control experiments done on PF stimulation add a puzzling result to an already contradictory literature. High frequency parallel fibre stimulation (in isolation) has been shown to induce long term potentiation in vitro, but not always, and most importantly, this has been shown in vivo. This was in fact the reason for choosing that particular stimulation protocol. Examination of in vitro studies, however, shows that the results are variable and even contradictory. Most were done in the presence of GABAA receptor antagonists, including the SK channel blocker Bicuculline, whereas in the study by Binda (2016), LTP was blocked by GABAA receptor inhibition. In some studies, moreover, LTP was under the control of NMDAR activation only, whereas in Binda (2016), it was under the control of mGluR activation. Finally, most experiments were done in mice, whereas our study was done in rats. Our results reveal intricate mechanisms working together to produce plasticity, which are highly sensitive to in vitro conditions. We designed our experiments to be close to physiological conditions, with inhibition preserved and a physiological chloride gradient. It is likely that experimental differences have given rise to the variability of the results and our inability to reproduce PF-LTP, but it was not the aim of this study to dissect the subtleties of the different experimental protocols and models. We will modify the Discussion to describe that point fully, including differences in experimental conditions.

      Another important weakness is the lack of evidence that the AAs were stimulated. Indeed, without filling the PC with fluorescent dye or biocytin during the experiment, and without reconstructing the anatomical organization, it is difficult to assess whether the stimulating pipette is positioned in the GC cluster that is potentially in contact with the PC via the AAs. According to electron microscopy, AAs account for 3% of the total number of synapses in a PC, which could represent a significant number of synapses. Although the idea that AAs repeatedly contact the same Purkinje cell has been propagated, to the best of the reviewer's knowledge, no direct demonstration of this hypothesis has yet been published. In fact, what has been demonstrated (Walter et al 2009; Spaeth et al 2022) is that GCs have a higher probability of being connected to nearby PCs, but are not necessarily associated with AAs.

      We fully agree with the reviewer that we have not identified morphologically ascending axon synapses, and we stress this fact both in the first paragraph of the Results section, and again at the beginning of Discussion. Our point is mainly topographical, given the well documented geometrical organisation of the cerebellar cortex, and strictly speaking, inputs are local (including ascending axon) or distal (parallel fibre). Similarly, the studies by Isope and Barbour (2002) and Walter et al. (2009), just like Sims and Hartell (2005 and 2006), have coined the term ‘ascending axon’ when drawing conclusions about locally stimulated inputs. Moreover, our results do not rely on or assume multiple contacts, stronger connections, or higher probability of connections between ascending axons and Purkinje cells. Our results only demonstrate a different plasticity outcome for the two types of inputs. Therefore, our manuscript could be rephrased with the terms ‘local’ and ‘distal’ granule cell inputs, but this would have no further implications for the results or the computation performed in Purkinje cells. In our experience, however, this is more confusing to the reader, and as we already stress this point in the manuscript, we do not wish to make this modification. We will nonetheless modify the abstract of the manuscript to clarify that point.

      Reviewer #2 (Public Review):

      Summary:

      The authors describe a form of synaptic plasticity at synapses from granule cells onto Purkinje cells in the mouse cerebellum, which is specific to synapses proximal to the cell body but not to distal ones. This plasticity is induced by the paired or associative stimulation of the two types of synapses because it is not observed with stimulation of one type of synapse alone. In addition, this form of plasticity is dependent on the order in which the stimuli are presented, and is dependent on NMDA receptors, metabotropic glutamate receptors and to some degree on GABAA receptors. However, under all experimental conditions described, there is a progressive weakening or run-down of synaptic strength. Therefore, plasticity is not relative to a stable baseline, but relative to a process of continuous decline that occurs whether or not there is any plasticity-inducing stimulus.

      As highlighted by the reviewer, we observed a postsynaptic rundown of the EPSC amplitude for both input pathways. Rundown could be mistaken for a depression of synaptic currents, not for a potentiation, and the progressive decrease of the EPSC amplitude during the course of an experiment leads to an underestimate of the absolute potentiation. We chose to provide a strong set of control data rather than selecting experiments based on subjective criteria or applying a cosmetic compensation procedure. We have conducted control experiments with no induction (n = 17), which give a good indication of the speed and amplitude of the rundown. Comparison shows a highly significant potentiation of the ascending axon EPSC. Depression of the parallel fibre EPSC, on the other hand, was not significantly different from rundown, and we have not spoken of parallel fibre long term depression. The data thus show very clearly that ascending axon and parallel fibre synapses behave differently following the costimulation protocol.

      Strengths:

      The focus of the authors on the properties of two different synapse-types on cerebellar Purkinje cells is interesting and relevant, given previous results that ascending and parallel fiber synapses might be functionally different and undergo different forms of plasticity. In addition, the interaction between these two synapse types during plasticity is important for understanding cerebellar function. The demonstration of timing and order-dependent potentiation of only one pathway, and not another, after associative stimulation of both pathways, changes our understanding of potential plasticity mechanisms. In addition, this observation opens up many new questions on underlying intracellular mechanisms as well as on its relevance for cerebellar learning and adaptation.

      Weaknesses and suggested improvements:

      A concern with this study is that all recordings demonstrate "rundown", a progressive decrease in the amplitude of the EPSC, starting during the baseline period and continuing after the plasticity-induction stimulus. In the absence of a stable baseline, it is hard to know what changes in strength actually occur at any set of synapses. Moreover, the issues that are causing rundown are not known and may or may not be related to the cellular processes involved in synaptic plasticity. This concern applies in particular to all the experiments where there is a decrease in synaptic strength.

      We have provided an answer to that point directly below the summary paragraph. Moreover, if the phenomenon causing rundown was involved in plasticity, it should affect plasticity of both inputs, which was not the case, clearly distinguishing the ascending axon and parallel fibre inputs.

      The authors should consider changes in the shape of the EPSC after plasticity induction, as in Fig 1 (orange trace) as this could change the interpretation.

      Figure 1 shows an average response composed of evoked excitatory and inhibitory synaptic currents. The third section of Supplementary material (supplementary figure 3) shows that this complex shape is given by an EPSC followed by a delayed disynaptic IPSC. We would like to point out that while separating EPSC from IPSC might appear difficult from average traces due to the averaged jitter in the onset of the synaptic currents, boundaries are much clearer when analysing individual traces. In the same section we discuss the results of experiments in which transient applications of SR 95531 before and after the induction protocol allowed us to measure the EPSC, while maintaining the experimental conditions during induction. Analysis of the kinetics of the EPSCs during gabazine application at the beginning and end of experiments showed that there is no change in the time to peak of either the AA or the PF response. The decay times of the AA and PF EPSCs are slightly longer at the end of the experiment, even if the difference is not significant for AA inputs (we will add this analysis to the revised version of the paper). Our analysis, which uses the EPSC kinetics measured at the beginning and at the end of the experiments as a template, takes these changes directly into account. The results show clearly that the presence of disynaptic inhibition does not significantly affect the measure of the peak EPSC after the induction protocol nor the estimate of plasticity.

      In addition, the inconsistency with previous results is surprising and is not explained; specifically, that no PF-LTP was induced by PF-alone repeated stimulation.

      Under our experimental conditions, PF-LTP was not induced by stimulating PFs alone, which is the only condition that reproduces experiments in the literature. However, as discussed in our response to reviewer 1, a close look at the literature reveals variability and contradictions behind seemingly similar results. These point to intricate mechanisms working together to produce plasticity, which are sensitive to in vitro conditions. We designed our experiments to be close to physiological conditions, with inhibition preserved and a physiological chloride gradient. It is likely that experimental differences account for the variability of the results and for our inability to observe PF-LTP. We will modify the Discussion to address this point fully in the context of past results.

      The authors test the role of NMDARs, GABAARs and mGluRs in the phenotype they describe. The data suggest that the form of plasticity described here is dependent on any one of the three receptors. However, the location of these receptors varies between the Purkinje cells, granule cells and interneurons. The authors do not describe a convincing hypothetical model in which this dependence can be explained. They suggest that there is crosstalk between AA and PF synapses via endocannabinoids downstream of mGluR or NO downstream of NMDARs. However, it is not clear how this could lead to the long-term potentiation that they describe. Also, there is no long-lasting change in paired-pulse ratio, suggesting an absence of changes in presynaptic release.

      We suggest in the Results section that the transient change in paired-pulse ratio (PPR) reflects a transient presynaptic effect only, as has been reported by others. This implies that the long-lasting changes observed are postsynaptic, in line with other reports using similar trains of stimulation, and we will modify the manuscript to state this clearly.

      Concerning the involvement of multiple molecular pathways, investigators have often tested for the involvement of NMDARs or mGluRs in cerebellar plasticity, but rarely both. Here we show that both pathways are involved. The conjunctive requirement for NMDAR and mGluR activation can be explained by the dependence of cerebellar LTP and LTD on the concentrations of both NO and postsynaptic calcium (Coesmans et al., 2004; Safo and Regehr, 2005; Bouvier et al., 2016; Piochon et al., 2016). NO production has been linked to the activation of NMDARs in granule cell axons (Casado et al., 2002; Bidoret et al., 2009; Bouvier et al., 2016), and occasionally in molecular layer interneurones (Kono et al., 2019). NO diffuses to activate guanylate cyclase in the Purkinje cell. The literature also shows that several mechanisms can contribute to a calcium increase, including mGluR activation. NMDARs and mGluRs can therefore reasonably cooperate to control postsynaptic plasticity. The associative nature of AA-LTP, i.e. the requirement for co-activation of AA and PF inputs, is more complex to explain and indicates a necessary cross talk between synaptic sites. We propose either that one of the receptors is absent from AA synapses, so that a signal needs to propagate from PF to AA synapses, or that both receptors are present but a signal is required to activate one of them at AA synapses.

      We also observed an effect of GABAergic inhibition. GABAergic inhibition was elegantly shown by Binda (2016) to regulate calcium entry together with mGluRs, and control plasticity induction. A similar mechanism could contribute to our results, although inhibition might have additional effects. We will modify the discussion of the manuscript and add a diagram to highlight the links between the different molecular pathways and potential cross talk mechanisms, and the location of receptors.

      Is the synapse that undergoes plasticity correctly identified? In this study, since GABAergic inhibition is not blocked for most experiments, PF stimulation can result in both a direct EPSC onto the Purkinje cell and a disynaptic feedforward IPSC. The authors do address this issue with Supplementary Fig 3, where the impact of the IPSC on the EPSC within the EPSC/IPSC sequence is calculated. However, a change in waveform would complicate this analysis. An experiment with pharmacological blockade will make the interpretation more robust. The observed dependence of the plasticity on GABAA receptors is an added point in favor of the suggested additional experiments.

      We did consider that kinetic changes might occur over long recording times, which is why the experiments of Supplementary figure 3 were done with pharmacological blockade of GABAARs with gabazine, both before and after LTP induction. The estimate of the EPSC amplitude is based on the actual kinetics of the response at both times.

      A primary hypothesis of this study is that proximal, or AA, and distal, or PF, synapses are different and that their association is specifically what drives plasticity. The alternative hypothesis is that the two synapse-types are the same. Therefore, a good control for pairing AA with PF would be to pair AA with AA and PF with PF, thereby demonstrating that pairing with each other is different from pairing with self.

      Pairing AA with AA would be difficult because AAs can only be stimulated from a narrow band below the PC, and we would likely end up stimulating overlapping sets of synapses. However, Figure 5 shows the effect of pairing PF with PF stimulation, while also mimicking the sparse and dense configurations of the usual experiment. It shows that sparse PFs do not behave like AAs. Sims and Hartell (2006) also performed an experiment with sparse PF inputs and observed clear differences between sparse local (AA) and sparse distal (PF) synapses.

      It is hypothesized that the association of a PF input with an AA input is similar to the association of a PF input with a CF input. However, the two are very different in terms of cellular location, with the CF input being in a position to directly interact with PF-driven inputs. Therefore, there are two major issues with this hypothesis: 1) how can sub-threshold activity at one set of synapses affect another located hundreds of micrometers away on the same dendritic tree? 2) There is evidence that the CF encodes teaching/error or reward information, which is functionally meaningful as a driver of plasticity at PF synapses. The AA synapse on one set of Purkinje cells is carrying exactly the same information as the PF synapses on another set of Purkinje cells further up and down the parallel fiber beam. It is suggested that the two inputs carry sensory vs. motor information, which is why this form of plasticity was tested. However, the granule cells that lead to both the AA and PF synapses are receiving the same modalities of mossy fiber information. Therefore, one needs to presuppose different populations of granule cells for sensory and motor inputs or receptive field and contextual information. As a consequence, which granule cells lead to AA synapses and which to PF synapses will change depending on which Purkinje cell you're recording from. And that's inconsistent with there being a timing dependence of AA-PF pairing in only one direction. Overall, it would be helpful to discuss the functional implications of this form of plasticity.

      We do not hypothesise that association of the AA and PF inputs is similar to the association of PF and climbing fibre inputs. We compare them because the latter is the only other known configuration that triggers associative plasticity in Purkinje cells. We conclude that ‘The climbing fibre is not the only key to associative plasticity’, and it is indeed interesting that inputs which are very small compared to the powerful climbing fibre input can nonetheless be effective at inducing plasticity. Physiologically, the climbing fibre signal has been clearly linked to error and reward signals, but reward signals are also encoded by granule cell inputs (Wagner et al., 2017). We will modify the discussion to make sure that we do not suggest equivalence with CF-induced LTD.

      Moreover, we fully agree that the AA and PF synapses made by a given granule cell carry the same signal, and cannot encode sensory and motor information at the same time. Yet these synapses convey different information to their targets. AA synapses from a local granule cell deliver information about the local receptive field, whereas PF synapses from the same granule cell deliver contextual information about that receptive field to distant Purkinje cells. In sensorimotor learning, movement is learnt with respect to a global context, not in isolation, so learning a particular association must be relevant. The associative plasticity we describe here could help explain this functional association. A difference in the timing of the inputs should therefore represent a difference in the timing of activation of different granule cells, which receive either local information or information from different receptive fields. We will modify the discussion to make sure we do not suggest an association between sensory and motor inputs, and to clarify our view of the local receptive field and of context about ongoing activity.

      Reviewer #3 (Public Review):

      Granule cells' axons bifurcate to form parallel fibers (PFs) and ascending axons (AAs). While the significance of PFs for cerebellar plasticity is widely acknowledged, the importance of AAs remains unclear. In the current paper, Conti and Auger conducted electrophysiological experiments in rat cerebellar slices and identified a new form of synaptic plasticity at AA-Purkinje cell (PC) synapses. Upon simultaneous stimulation of AAs and PFs, AA-EPSCs increased, while PF-EPSCs decreased. This suggests that synaptic responses to AAs and PFs in PCs are jointly regulated, working as an additional mechanism to integrate motor/sensory input. This finding may offer new perspectives in studying and modeling cerebellum-dependent behavior. Overall, the experiments are performed well. However, there are two weaknesses. First, the baseline of electrophysiological recordings is influenced significantly by run-down, making it difficult to interpret the data quantitatively. The amplitude of AA-EPSCs is relatively small and the run-down may mask the change. The authors should carefully reexamine the data with appropriate controls and statistics. Second, while the authors show AA-LTP depends on mGluR, NMDA receptors, and GABA-A receptors, which cell types express these receptors and how they contribute to plasticity is not clarified. The recommended experiments may help to improve the quality of the manuscript.

      As highlighted by the reviewer and developed above in response to reviewer 2, we observed a postsynaptic rundown of the EPSC amplitude. Rundown could be mistaken for a depression of synaptic currents, but not for a potentiation. Moreover, we have conducted control experiments with no induction (n = 17), which give a good indication of the speed and amplitude of the rundown, and provide a baseline. Comparison shows a highly significant potentiation of the ascending axon EPSC, relative to baseline and relative to these control experiments. Depression of the parallel fibre EPSC, on the other hand, was not significantly different from rundown. For that reason we do not refer to parallel fibre long-term depression. The data, however, show that ascending axon and parallel fibre synapses behave very differently following the costimulation protocol.
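
      The comparison against no-induction controls described above can be sketched as follows. This is an illustration only: the amplitude ratios are invented numbers, not data from the study, and the permutation test is a stand-in for whatever statistical test the manuscript reports.

```python
import random
from statistics import mean

def normalised_change(baseline, late):
    """Late EPSC amplitude as a fraction of the baseline amplitude for one cell."""
    return mean(late) / mean(baseline)

def permutation_p_value(group_a, group_b, n_permutations=10000, seed=0):
    """Two-sided permutation test on the difference between group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed - 1e-12:  # small tolerance for rounding
            extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (extreme + 1) / (n_permutations + 1)

# Hypothetical late/baseline amplitude ratios, one value per cell.
control_ratios = [0.78, 0.85, 0.80, 0.74, 0.88, 0.82]  # no induction: rundown only
paired_ratios = [1.25, 1.40, 1.10, 1.32, 1.18, 1.45]   # after AA + PF costimulation

p_value = permutation_p_value(paired_ratios, control_ratios)
# Change expressed relative to the rundown seen in controls.
rundown_corrected = mean(paired_ratios) / mean(control_ratios)
```

      Expressing the paired group relative to the control group, rather than to its own baseline alone, makes explicit that rundown can only lead to an underestimate of potentiation.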

      We have discussed above in our response to reviewer 2 the potential involvement of mGluRs, NMDARs and GABAARs. We will modify the discussion of the manuscript and add a diagram to highlight the links between the different molecular pathways and potential cross talk mechanisms, and the location of receptors.