  1. Jun 2025
    1. Reviewer #2 (Public review):

      In this study, the authors show that alterations in the lipid composition of the inner mitochondrial membrane, particularly changes in cardiolipin (CL) content, lead to defects in electron transport, supercomplex formation, and oxidative stress. Using liver-specific CLS knockout mice, which are characterized by dysfunctional capacity for cardiolipin synthesis, the authors highlight an underappreciated role for CL in MASH pathology. Overall, this is an interesting study highlighting the importance of functional/physiological electron transport (and in this context, electron leakage) in MASH pathophysiology. Despite that, this manuscript has several weaknesses that require attention.

      (1) For all LKO studies, it is stated that the decrease in hepatic CL is causal for the observed phenotype. However, it is evident that many other lipids are impacted by CLS KO, including a marked increase in hepatic PG. In this respect, the authors show no evidence that the observed metabolic phenotype is indeed due to the reduction in CL and not to other accompanying changes.

      (2) In the results, the authors highlight that 'MASLD has been shown to alter the total cellular lipidome in liver.' Given that this study focused on CL, it would be useful to include specific studies that pointed to changes in hepatic CL content in MASLD/MASH/fibrosis.

      (3) The initial human mitochondrial lipidomics studies show a reduction in mitochondrial CL and PG content. What was the content/expression of CL synthase and PGP synthase in these samples? If this cannot be assessed, is there any association of CLS or PGPS expression and MASLD/fibrosis (etc.) in publicly available databases (e.g., GEP liver) that may explain the reduction in mitochondrial PG and CL content?

      (4) The validation of MASH in patients (Figure 1B) is not convincing (i.e., no quantification/scoring provided). NAS/fibrosis scoring (according to Kleiner) would help to define whether all patients indeed have MASH, and what subset has fibrosis. Could the reduction in CL/PG content be (also) associated with fibrosis? In addition, Masson's Trichrome should be added to Figure 1B.

      (5) In human lipidomics, the authors suggest that reductions are observed in tetralinoleoyl CL (Figure 1C). However, Figure 1C only shows the combined FA acyl chain length + unsaturation, therefore not allowing for FA-specific ID (unless such data are available from the LC/MS analysis).

      (6) Figures 1 J/K/I. It is obvious that the background in all murine immunoblotting analyses has been altered. The authors should provide unaltered images for these immunoblots.

      (7) For Figure 1, it is unclear what is meant by 'we performed all mitochondrial lipidomic analyses by quantifying lipids per mg of mitochondrial proteins'. Was the murine lipidomics carried out on fractionated mitochondria or whole liver? If whole liver, then how were the data corrected, particularly given that PG is not a mitochondria-specific lipid?

      (8) While total CL content seems indeed decreased across the different mouse models, this is mostly due to 1-2 CL species showing a pronounced reduction, with the remainder being unaltered. This should at least be acknowledged in the results. This is similarly the case in the LKO livers.

      (9) Figure 2. A secondary biochemical analysis of changes in lipid content should be provided, e.g., total triglyceride content, particularly given that the histology analysis does not show any major changes in hepatic lipid droplets/steatosis. In addition, the Masson's Trichrome staining shows almost no collagen deposition.

      (10) Figure 3. 'CLS deletion modestly reduced glucose handling' should be reworded. The LKO mice show improved glucose tolerance (despite the MASH phenotype), which is not evident from the above wording.

      (11) Looking at the mechanism behind the increase in hepatic steatosis, the authors state that lipid accumulation can occur due to increased lipogenesis, or dysfunctional VLDL secretion or beta oxidation, and subsequently assessed the relevant proteins/pathways. What about fatty acid uptake, which is also one of the four major pathways impacted in MASLD? This should be included in this assessment in Figure 3.

      (12) For Figure 5A, it is simply stated 'CLS deletion promotes liver fibrosis in standard chow-fed condition', and it is unclear what is highlighted within the selected EM images and what the arrows refer to. The authors should clarify this within the text.

    2. Reviewer #3 (Public review):

      Summary:

      Mitochondrial oxphos causes lipid accumulation, leading to MASH, although the mechanism has been poorly understood. In this study, Funai and colleagues identify that reductions in cardiolipin in the mitochondria cause disruptions in the electron transport chain. Knockout of cardiolipin synthase was sufficient to drive MASH phenotypes, increase respiratory capacity, and cause electron leak at complexes II and III. It is well established that loss of cardiolipin increases ROS. Studies to date have been performed on whole tissue lysates, but to rule out which changes in mitochondrial lipids are driven by changes in mitochondrial number versus lipid synthesis/turnover, the authors uniquely purified mitochondria from human and mouse livers in MASH and NASH models for this study. This study provides critical information to the field that will inevitably help us better understand the mechanisms underlying MASH and NASH onset. The evidence provided is both convincing and compelling. With further suggested revision experiments, this study has the potential to change our understanding of MASH and NASH pathogenesis.

      Strengths:

      The authors use a unique approach of lipidomics on purified mitochondria. They also analyze many distinct MASH models and provide a unique resource for the field of comprehensive lipidomics analysis of the different ways in which MASH can be induced. The use of human tissue elevates the impact/significance of the findings.

      Weaknesses:

      The data on the super complexes was the least compelling, and frankly, I do not think the authors needed those data to make a compelling argument! The authors should shift their focus more to the compelling electron leak data they have collected. If possible, it would also strengthen the work to include cardiolipin rescues on more of the experiments. Finally, expanding their explanations of the model systems would be very helpful for the readership.

    3. Reviewer #4 (Public review):

      Summary:

      Here, the authors wish to shed light on factors that contribute to the development of liver disease in what used to be called 'the metabolic syndrome'. This is a human-health problem of considerable significance, and the insights they provide, namely the implication of a defect in mitochondrial cardiolipin (CL) content to the progression from metabolic dysfunction-associated steatotic liver disease to steatohepatitis, are plausible.

      Strengths:

      The experimental evidence proffered is derived from the observation of lower levels of CL in mitochondria from the liver of patients undergoing liver transplant or resection due to end-stage steatohepatitis compared with mitochondria derived from livers of patients with other conditions. This correlation is buttressed by observations made in mice with liver-selective compromise in CL synthesis, which suggest a pathological environment associated with mitochondrial dysfunction and enhanced oxidative stress, features deemed to play a role in the progression from steatotic liver disease to steatohepatitis.

      The paper is well written, and the findings are well explained and superficially convincing.

      Weaknesses:

      It is unclear how much can be learned from compromising a key enzyme that produces a key mitochondrial lipid in a busy metabolic organ like the liver - isn't the discovery of a mitochondrial defect in such a context rather trivial? And how reliably can these findings be related to the human observations? Most importantly, the chain of causality implied by the title is unproven: the key question of whether or not (somehow) preventing the drop in cardiolipin content affects the course of steatohepatitis remains unanswered.

    1. eLife Assessment

      This important study identifies a novel Legionella effector, Lfat1, which binds F-actin via a coiled-coil domain and structurally resembles the RID toxin, with cryo-EM revealing key interactions mediated by a hydrophobic helical hairpin. While the study is mostly complete and has compelling data, a few minor changes are recommended.

    2. Reviewer #1 (Public review):

      The manuscript by Zeng et al. describes the discovery of an F-actin-binding Legionella pneumophila effector, which they term Lfat1. Lfat1 contains a putative fatty acyltransferase domain that structurally resembles the Rho-GTPase Inactivation (RID) domain toxin from Vibrio vulnificus, which targets small G-proteins. Additionally, Lfat1 contains a coiled-coil (CC) domain.

      The authors identified Lfat1 as an actin-associated protein by screening more than 300 Legionella effectors, expressed as GFP-fusion proteins, for their co-localization with actin in HeLa cells. Actin binding is mediated by the CC domain, which specifically binds to F-actin in a 1:1 stoichiometry. Using cryo-EM, the authors determined a high-quality structure of F-actin filaments bound to the actin-binding domain (ABD) of Lfat1. The structure reveals that actin binding is mediated through a hydrophobic helical hairpin within the ABD (residues 213-279). A Y240A mutation within this region increases the apparent dissociation constant by two orders of magnitude, indicating a critical role for this residue in actin interaction.

      The ABD alone was also shown to strongly associate with F-actin upon overexpression in cells. The authors used a truncated version of the Lfat1 ABD to engineer an F-actin-binding probe, which can be used in a split form. Finally, they demonstrate that full-length Lfat1, when overexpressed in cells, fatty acylates host small G-proteins, likely on lysine residues.

      While this is a solid study, the authors should consider the following points when preparing a revised manuscript:

      Major points:

      (1) Legionella effectors are often activated by binding to eukaryote-specific host factors, including actin. The authors should test the following: a) whether Lfat1 can fatty acylate small G-proteins in vitro; b) whether this activity is dependent on actin binding; and c) whether expression of the Y240A mutant in mammalian cells affects the fatty acylation of Rac3 (Figure 6B), or other small G-proteins.

      (2) It should be demonstrated that lysine residues on small G-proteins are indeed targeted by Lfat1. Ideally, the functional consequences of these modifications should also be investigated. For example, does fatty acylation of G-proteins affect GTPase activity or binding to downstream effectors?

      (3) Line 138: Can the authors clarify whether the Lfat1 ABD induces bundling of F-actin filaments or promotes actin oligomerization? Does the Lfat1 ABD form multimers that bring multiple filaments together? If Lfat1 induces actin oligomerization, this effect should be experimentally tested and reported. Additionally, the impact of Lfat1 binding on actin filament stability should be assessed. This is particularly important given the proposed use of the ABD as an actin probe.

      (4) Line 180: I think it's too premature to refer to the interaction as having "high specificity and affinity." We really don't know what else it's binding to.

      (5) The authors should reconsider the color scheme used in the structural figures, particularly in Figures 2D and S4.

      (6) In Figure 3E, the WT curve fits the data poorly, possibly because the actin concentration exceeds the Kd of the interaction. It might fit better to a quadratic.

      (7) The authors propose that the individual helices of the Lfat1 ABD could be expressed on separate proteins and used to target multi-component biological complexes to F-actin by genetically fusing each component to a split alpha-helix. This is an intriguing idea, but it should be tested as a proof of concept to support its feasibility and potential utility.
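For readers unfamiliar with the "quadratic" fit mentioned in point (6), the following is a minimal sketch of the standard 1:1 quadratic (ligand-depletion) binding equation compared with the simple hyperbolic isotherm. It is not the authors' analysis: the function names, the choice of which partner is held fixed, and the concentrations used are all hypothetical and chosen only to illustrate why the hyperbolic form can fit poorly when the fixed partner's concentration exceeds the Kd.

```python
import math


def bound_complex(p_total, l_total, kd):
    """Concentration of 1:1 complex from the quadratic binding equation.

    p_total: total concentration of the fixed partner (hypothetical, e.g. in uM)
    l_total: total concentration of the titrated partner (same units)
    kd:      dissociation constant (same units)
    """
    b = p_total + l_total + kd
    return (b - math.sqrt(b * b - 4.0 * p_total * l_total)) / 2.0


def fraction_bound_quadratic(p_total, l_total, kd):
    """Fraction of the fixed partner in complex, accounting for depletion."""
    return bound_complex(p_total, l_total, kd) / p_total


def fraction_bound_hyperbolic(l_total, kd):
    """Simple hyperbolic isotherm; assumes free titrant ~ total titrant."""
    return l_total / (l_total + kd)


# Illustrative comparison with the fixed partner at 5x the Kd (assumed values).
for l_total in (0.5, 1.0, 2.0, 5.0, 10.0, 20.0):
    quad = fraction_bound_quadratic(5.0, l_total, 1.0)
    hyp = fraction_bound_hyperbolic(l_total, 1.0)
    print(f"L_total={l_total:5.1f}  quadratic={quad:.3f}  hyperbolic={hyp:.3f}")
```

When the fixed partner is far below the Kd, the two models agree closely; as its concentration approaches or exceeds the Kd, the hyperbolic model overestimates the fraction bound at a given total titrant concentration, which is one plausible reason a hyperbolic fit could look poor.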

    3. Reviewer #2 (Public review):

      Summary:

      The manuscript by Zheng et al. reports the structural and biochemical study of a novel effector from the bacterial pathogen Legionella pneumophila. Building on results from their earlier screen for L. pneumophila proteins that affect host F-actin dynamics, the authors show that Lfat1 (Lpg1387) interacts with actin via a novel actin-binding domain (ABD). The authors also determined the structure of the Lfat1 ABD-F-actin complex, which allowed them to develop this ABD as a probe for F-actin. Finally, the authors demonstrated that Lfat1 is a lysine fatty acyltransferase that targets several small GTPases in host cells.

      Strengths:

      This is a very complete work that shows the structure of a novel bacterial actin-binding protein in complex with F-actin, and the biochemical activity of the protein was also revealed. Overall, this is a very exciting study and should be of great interest to scientists in both bacterial pathogenesis and the actin cytoskeleton of eukaryotic cells.

      Weaknesses:

      (1) The authors should use biochemical reactions to analyze the KFAT activity of Lfat1 on one or two small GTPases shown to be modified by this effector in cellulo. Such reactions may allow them to determine the role of actin binding in its biochemical activity. This notion is particularly relevant in light of recent studies showing that actin is a co-factor for the activity of LnaB and Ceg14 (PMID: 39009586; PMID: 38776962; PMID: 40394005). In addition, the study should be discussed in the context of these recent findings on the role of actin in the activity of L. pneumophila effectors.

      (2) The development of the ABD of Lfat1 as an F-actin probe is a nice extension of the biochemical and structural experiments. The authors need to compare the new probe to commonly used ones, such as Lifeact, in labeling of the actin cytoskeleton structure.

    1. eLife Assessment

      The work provides important insights into how this lncRNA regulates gene expression via complex mechanisms; however, the biological relevance awaits validation in other models. This paper provides extensive and carefully analysed data that is of value in efforts to understand the role of the lncRNA EPB41L4A-AS1 in a human cell line. The data is generally convincing and supported by clever integrative analysis; however, the known extensive artefacts from individual Gapmer oligonucleotides cast some doubt over the interpretation of those experiments where only one targeting and one control Gapmer are used.

    2. Reviewer #1 (Public review):

      Monziani and Ulitsky present a large and exhaustive study on the lncRNA EPB41L4A-AS1 using a variety of genomic methods. They uncover a rather complex picture of an RNA transcript that appears to act via diverse pathways to regulate the expression of large numbers of genes, including many snoRNAs. The activity of EPB41L4A-AS1 seems to be intimately linked with the protein SUB1, via both direct physical interactions and direct/indirect regulation of SUB1 mRNA expression.

      The study is characterised by thoughtful, innovative, integrative genomic analysis. It is shown that EPB41L4A-AS1 interacts with SUB1 protein and that this may lead to extensive changes in SUB1's other RNA partners. Disruption of EPB41L4A-AS1 leads to widespread changes in non-polyA RNA expression, as well as local cis changes. At the clinical level, it is possible that EPB41L4A-AS1 plays disease-relevant roles, although these seem to be somewhat contradictory with evidence supporting both oncogenic and tumour suppressive activities.

      A couple of issues could be better addressed here. Firstly, the copy number of EPB41L4A-AS1 is an important missing piece of the puzzle. It is apparently highly expressed in the FISH experiments. To get an understanding of how EPB41L4A-AS1 regulates SUB1, an abundant protein, we need to know the relative stoichiometry of these two factors. Secondly, while many of the experiments use two independent Gapmers for EPB41L4A-AS1 knockdown, the RNA-sequencing experiments apparently use just one, with one negative control (?). Evidence is emerging that Gapmers produce extensive off-target gene expression effects in cells, potentially exceeding the amount of on-target changes arising through the intended target gene. Therefore, it is important to estimate this through the use of multiple targeting and non-targeting ASOs, if one is to get a true picture of EPB41L4A-AS1 target genes. In this Reviewer's opinion, this casts some doubt over the interpretation of RNA-seq experiments until that work is done. Nonetheless, the Authors have designed thorough experiments, including overexpression rescue constructs, to quite confidently assess the role of EPB41L4A-AS1 in snoRNA expression.

      It is possible that EPB41L4A-AS1 plays roles in cancer, either as an oncogene or a tumour suppressor. However, it will in the future be important to extend these observations to a greater variety of cell contexts.

      This work is valuable in providing an extensive and thorough analysis of the global mechanisms of an important regulatory lncRNA and highlights the complexity of such mechanisms via cis and trans regulation and extensive protein interactions.

    3. Reviewer #2 (Public review):

      Summary:

      In this manuscript, Monziani et al. identified long noncoding RNAs (lncRNAs) that act in cis and are coregulated with their target genes located in close genomic proximity. The authors mined the GeneHancer database, and this analysis led to the identification of four lncRNA-target pairs. The authors decided to focus on lncRNA EPB41L4A-AS1.

      They thoroughly characterised this lncRNA, demonstrating that it is located in the cytoplasm and the nuclei, and that its expression is altered in response to different stimuli. Furthermore, the authors showed that EPB41L4A-AS1 regulates EPB41L4A transcription, leading to a mild reduction in EPB41L4A protein levels. This was not recapitulated with siRNA-mediated depletion of EPB41L4A-AS1. RNA-seq in EPB41L4A-AS1-depleted cells with a single LNA revealed 2364 DEGs linked to pathways including the cell cycle, cell adhesion, and inflammatory response. To understand the mechanism of action of EPB41L4A-AS1, the authors mined the ENCODE eCLIP data and identified SUB1 as an lncRNA interactor. The authors also found that the loss of EPB41L4A-AS1 and SUB1 leads to the accumulation of snoRNAs, and that SUB1 localisation changes upon the loss of EPB41L4A-AS1. Finally, the authors showed that EPB41L4A-AS1 deficiency did not change the steady-state levels of SNORA13 nor RNA modification driven by this RNA. The phenotype associated with the loss of EPB41L4A-AS1 is linked to increased invasion and EMT gene signature.

      Overall, this is an interesting and nicely done study on the versatile role of EPB41L4A-AS1 and the multifaceted interplay between SUB1 and this lncRNA, but some conclusions and claims need to be supported with additional experiments. My primary concerns are the use of a single LNA gapmer for critical experiments, and the claims of increased invasion and altered nucleolar distribution of SUB1 in EPB41L4A-AS1-depleted cells. These experiments need to be validated with orthogonal methods.

      Strengths:

      The authors used complementary tools to dissect the complex role of lncRNA EPB41L4A-AS1 in regulating EPB41L4A, which is highly commendable. There are few papers in the literature on lncRNAs at this standard. They employed LNA gapmers, siRNAs, CRISPRi/a, and exogenous overexpression of EPB41L4A-AS1 to demonstrate that the transcription of EPB41L4A-AS1 acts in cis to promote the expression of EPB41L4A by ensuring spatial proximity between the TAD boundary and the EPB41L4A promoter. At the same time, this lncRNA binds to SUB1 and regulates snoRNA expression and nucleolar biology. Overall, the manuscript is easy to read, and the figures are well presented. The methods are sound, and the expected standards are met.

      Weaknesses:

      The authors should clarify how many lncRNA-target pairs were included in the initial computational screen for cis-acting lncRNAs and why MCF7 was chosen as the cell line of choice. Most of the data uses a single LNA gapmer targeting EPB41L4A-AS1 lncRNA (eg, Fig. 2c, 3B, and RNA-seq), and the critical experiments should be using at least 2 LNA gapmers. The specificity of SUB1 CUT&RUN is lacking, as well as direct binding of SUB1 to lncRNA EPB41L4A-AS1, which should be confirmed by CLIP qPCR in MCF7 cells. Finally, the role of EPB41L4A-AS1 in SUB1 distribution (Figure 5) and cell invasion (Figure 8) needs to be complemented with additional experiments, which should finally demonstrate the role of this lncRNA in nucleolus and cancer-associated pathways. The use of MCF7 as a single cancer cell line is not ideal.

    4. Reviewer #3 (Public review):

      Summary:

      In this paper, the authors made some interesting observations that EPB41L4A-AS1 lncRNA can regulate the transcription of both the nearby coding gene and genes on other chromosomes. They started by computationally examining lncRNA-gene pairs by analyzing co-expression, chromatin features of enhancers, TF binding, HiC connectome, and eQTLs. They then zoomed in on four lncRNA-gene pairs and used LNA antisense oligonucleotides to knock down these lncRNAs. This revealed EPB41L4A-AS1 as the only one that can regulate the expression of its cis-gene target EPB41L4A. By RNA-FISH, the authors found this lncRNA to be located in all three parts of a cell: chromatin, nucleoplasm, and cytoplasm. RNA-seq after LNA knockdown of EPB41L4A-AS1 showed that this increased >1100 genes and decreased >1250 genes, including both nearby genes and genes on other chromosomes. They later found that EPB41L4A-AS1 may interact with SUB1 protein (an RNA-binding protein) to impact the target genes of SUB1. EPB41L4A-AS1 knockdown reduced the mRNA level of SUB1 and altered the nuclear location of SUB1. Later, the authors observed that EPB41L4A-AS1 knockdown caused an increase of snRNAs and snoRNAs, likely via disrupted SUB1 function. In the last part of the paper, the authors conducted rescue experiments that suggested that the full-length, intron- and SNORA13-containing EPB41L4A-AS1 is required to partially rescue snoRNA expression. They also conducted SLAM-Seq and showed that the increased abundance of snoRNAs is primarily due to their hosts' increased transcription and stability. They end with data showing that EPB41L4A-AS1 knockdown reduced MCF7 cell proliferation but increased its migration, suggesting a link to breast cancer progression and/or metastasis.

      Strengths:

      Overall, the paper is well-written, and the results are presented with good technical rigor and appropriate interpretation. The observation that a complex lncRNA EPB41L4A-AS1 regulates both cis and trans target genes, if fully proven, is interesting and important.

      Weaknesses:

      The paper is a bit disjointed as it started from cis and trans gene regulation, but later it switched to a partially relevant topic of snoRNA metabolism via SUB1. The paper did not follow up on the interesting observation that there are many potential trans target genes affected by EPB41L4A-AS1 knockdown and there was limited study of the mechanisms as to how these trans genes (including SUB1 or NPM1 genes themselves) are affected by EPB41L4A-AS1 knockdown. There are discrepancies in the results upon EPB41L4A-AS1 knockdown by LNA versus by CRISPR activation, or by plasmid overexpression of this lncRNA.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      Mollá-Albaladejo et al. investigate the neurons downstream of Gr64f and Gr66a, called G2Ns. They identify downstream neurons using trans-Tango labeling with RFP and then perform bulk RNA-seq on the RFP-sorted cells. Gene expression is up- or downregulated between the cell populations and between fed and starved states. They specifically identify Leucokinin as a neuropeptide that is upregulated in starved Gr66a cells. Leucokinin cells, identified by a GAL4 line, indeed show higher expression when starved, especially in the SEZ. Furthermore, Leucokinin cells colocalize with the trans-Tango signal from downstream neurons of both GRs. This connection is confirmed with GRASP. According to EM data, Leucokinin cells in the SEZ receive a lot of input and connect to many downstream neurons. In behavior experiments performed with flies lacking Leucokinin neurons, flies show reduced responsiveness to sugar and bitter mixtures when starved. The authors suggest that Leucokinin neurons integrate bitter and sugar tastes and that their output is modified by a hunger state.

      Strengths:

      The authors use a multitude of tools to identify SELK neurons as being downstream of taste sensory neurons and as starvation-sensitive cells. This study provides an example of how genetic labeling, RNA-seq, and EM analysis can be combined to investigate neural circuits.

      Weaknesses:

      The authors do not show a functional connection between sensory neurons and SELK neurons. Additionally, data from RNA-seq, anatomical studies, and EM analysis are sometimes contradictory in terms of connectivity. A GRASP signal is not foolproof evidence that cells are synaptically connected.

      We appreciate the reviewer’s comments. Unfortunately, we have not successfully demonstrated a functional response of SELK neurons using in vivo calcium imaging with UAS-GCaMP7 (we tried the f, m, and s versions), primarily due to challenges in obtaining stable signals. We stimulated GRNs using sucrose, caffeine, or a mixture of both, and it is possible that the concentrations, although high, were not sufficient to induce a response.

      Regarding GRASP, we acknowledge its limitations as a standalone technique for establishing genuine synaptic connections between neurons, as some signals may reflect false positives resulting from the mere proximity of the candidate neurons. To strengthen our findings, we complemented these results by demonstrating the colocalization of the Leucokinin antibody signal with the Gr66a-Gal4>trans-TANGO and Gr64f-Gal4>trans-TANGO signals (Figure 4), confirming that Leucokinin neurons are indeed postsynaptic to both sweet and bitter GRNs. Moreover, we incorporated BacTrace data to highlight the direct connectivity between sweet and bitter GRNs (now Figure 5E).

      In the revised manuscript, we have introduced the active-GRASP technique (Macpherson et al., 2015). In this version of GRASP, the presynaptic half of GFP (GFP 1-10) is fused to synaptobrevin, which becomes accessible in the membrane of the presynaptic neuron within the synaptic cleft upon presynaptic stimulation (in our case, by stimulating the sweet Gr64f<sup>GRNs</sup> with sucrose and the bitter Gr66a<sup>GRNs</sup> with caffeine). Utilizing this technique, we successfully demonstrated (see new Figure 5B and 5D) that when presented with water, no signal was detected in the Gr66a-LexA, Lk-Gal4 > active-GRASP, or Gr64f-LexA, Lk-Gal4 > active-GRASP transgene flies. However, in the presence of caffeine, Gr66a-LexA, Lk-Gal4 > active-GRASP transgene flies exhibited a clear signal in the SEZ, and similarly, sucrose presentation to Gr64f-LexA, Lk-Gal4 > active-GRASP transgene flies yielded a detectable signal. The results obtained from active-GRASP provide additional evidence supporting the connectivity between SELK neurons and both Gr64f<sup>GRNs</sup> and Gr66a<sup>GRNs</sup>, further indicating the functional connectivity of the GRNs and SELK neurons.

      The authors describe a behavioral phenotype when flies are starved, however, they do not use a specific driver for the described cell type, thus they should also tone down their claims.

      We agree with the reviewer that the Lk-Gal4 driver line used labels SELK, LHLK, and ABLK neurons. The behavior examined in this paper, the Proboscis Extension Response (PER), measures the initiation of feeding. Although the neural circuit involved in this behavior is primarily confined to the SEZ where SELK neurons are located, we cannot rule out the possibility that other Lk neurons may also play a role in the process. To restrict expression of the Tetanus Toxin, we have utilized the tsh-Gal80 (Clyne et al., 2008) transgene in combination with the Lk-Gal4>UAS-TNT and Lk-Gal4>UAS-TNT<sup>imp</sup> constructs to prevent the expression of the Tetanus Toxin in ABLK neurons, thereby restricting its expression to the SELK and LHLK neurons in the central brain. The new results (Sup Figure 7A) indicate that ABLK neurons do not play a role in integrating sweet and bitter information. However, we acknowledge the reviewer's point that we are still silencing LHLK neurons, so we have adjusted our claims to align more closely with our data.

      Generally, the authors do not provide a big advancement to the field and some of the results are contradictory with previous publications.

      We believe our work does not contradict previous findings, nor does it invalidate the role of ABLK neurons in water homeostasis or the role of LHLK neurons in regulating sleep via starvation. We provide additional information on the possible role of SELK neurons in integrating gustatory information. The location of SELK neurons in the SEZ suggests that they may play a role in feeding behavior, and we have demonstrated that these neurons are indeed involved in integrating gustatory information to influence feeding decisions. We consider that we have contributed by highlighting a new role for the Leucokinin neuropeptide in feeding behavior.

      Reviewer #2 (Public review):

      Summary:

      A core task of the brain is processing sensory cues from the environment. The neural mechanisms of how sensory information is transmitted from peripheral sense organs to subsequent processing in defined brain centers remain an important topic in neuroscience. The taste system hereby assesses the palatability of food by evaluating the chemical composition and nutrient content while integrating the current need for energy by assessing the satiation level of the organism. The current manuscript provides insights into the early circuits of gustatory coding using the fruit fly as a model. By combining trans-Tango and FACS-based bulk RNAseq to assess the target neurons of sweet sensing (using Gr64f-Gal4) and bitter sensing (using Gr66a-Gal4), in a first set of experiments the authors investigate genes that are differentially expressed or co-expressed in normal and starved conditions. With a focus on neuropeptides and neurotransmitters, differential expression in the different conditions was assessed, resulting in the identification of Leucokinin as a potentially interesting gene. The notion is further supported by RNAseq of Lk-Gal4>mCD8:GFP sorted cells and immunostainings. GRASP and BacTrace experiments further support that the two Lk-expressing cells in the SEZ should indeed be postsynaptic to both types of sensory neurons. Using EM-based connectomics data (based on a previous publication by Engert et al.), the authors also look for downstream targets of the bitter versus sweet gustatory neurons to identify the Lk-neurons. Based on the morphology they identify candidates and further depict the potential downstream neurons in the connectome, which appears largely in agreement with GRASP experiments. Finally, silencing the Lk-neurons shows an increased PER response in starved flies (when combined with bitter compounds) as well as increased feeding in a FlyPad assay.

      Strengths:

      Overall this is an intriguing manuscript, which provides insight into the organization of 2nd order gustatory neurons. It specifically provides strong evidence for the Lk-neurons as a target of sweet and bitter GRNs and provides evidence for their role in regulating sweet vs bitter-based behavioral responses. Particularly the integration of different techniques and datasets in an elegant fashion is a strong side of the manuscript. Moreover, placing the known Lk-neurons in the context of 2nd order gustatory signalling strengthens the knowledge about this pathway.

      Weaknesses:

      I do not see any major weakness in the current manuscript. Novelty is to some degree lessened by the fact that the RNAseq approach did not identify new neurons but rather put the known Lk-neurons as major findings. Similarly, the final behavioral section is not very deep and to some degree corroborates the previous publication by the Keene and Nässel labs - that said, the model they propose is indeed novel (but lacks depth in analyses; e.g. there is no physiology that would support the modulation of Lk neurons by either type of GRN). The connectomic section appears a bit out of place and, after reading it, it is not really clear what one should make of the potential downstream neurons (particularly since the Lk-receptor expression has been previously analyzed); here it might have been interesting to address if/how Lk-neurons may signal directly via a classical neurotransmitter (information that might be found easily in the adult brain single-cell data).

      We thank the reviewer for the comment. Indeed, we attempted in vivo Ca²⁺ imaging but were unsuccessful. We have rewritten the connectomic section to better integrate it with the rest of the text and have reanalyzed the data obtained. We considered gathering data from the single-cell adult dataset, but this dataset includes the entire adult fly brain, encompassing SELK and LHLK neurons, making it impossible to differentiate between the two types of Lk neurons. Any further analysis will require transcriptomic analysis of SELK neurons via scRNAseq under the different metabolic conditions tested in this study.

      Reviewer #3 (Public review):

      Summary:

      To make feeding decisions, animals need to process three types of information: positive cues like sweetness, negative cues like bitterness, and internal states such as hunger or satiety. This study aims to identify where the information is integrated into the fruit fly brain. The authors applied RNA sequencing on second-order gustatory neurons responsible for sweet and bitter processing, under fed and starved conditions. The sequencing data reveal significant changes in gene expression across sweet vs. bitter pathways and fed vs. starved states. The authors focus on the neuropeptide Leucokinin (Lk), whose expression is dependent on the starvation state. They identify a pair of neurons, named SELK neurons, which express Lk and receive direct input from both sweet and bitter gustatory neurons. These SELK neurons are ideal candidates to integrate gustatory and internal state information. Behavioral experiments show that blocking these neurons in starved flies alters their tolerance to bitter substances during feeding.

      Strengths:

      (1) The study employs a well-designed approach, targeting specific neuronal populations, which is more efficient and precise compared to traditional large-scale genetic screening methods.

      (2) The RNAseq results provide valuable data that can be utilized in future studies to explore other molecules beyond Lk.

      (3) The identification of SELK neurons offers a promising avenue for future research into how these neurons integrate conflicting gustatory signals and internal state information.

      Weaknesses:

      (1) Unfortunately, due to technical challenges, the authors were unable to directly image the functional activity of SELK neurons.

      (2) In the behavioral experiments, tetanus toxin was used to block SELK neurons. Since these neurons may release multiple neurotransmitters or neuropeptides, the results do not specifically demonstrate that Leucokinin (Lk) is the critical factor, as suggested in Figure 8. To address this, I recommend using RNAi to inhibit Lk expression in SELK neurons and comparing the outcomes to wild-type controls via the PER assay.

      We appreciate the reviewer's comments and suggestions. As noted, Tetanus Toxin silences the neuron’s activity, affecting the functioning of various neurotransmitters and neuropeptides released by the targeted neuron. In response to the reviewer's recommendation, we employed an RNAi line specifically designed to silence Leucokinin production in Lk-expressing neurons.

      The results presented in Supplementary Figure 7B demonstrate that knocking down Leucokinin in Lk neurons significantly reduces the flies' tolerance to caffeine in sweet food.

      It is crucial to highlight that the sucrose concentration used in Figure 7C was 50mM, whereas in Supplementary Figure 7B, it was increased to 100mM. This adjustment was necessary because the Lk-Gal4, UAS-RNAi, and Lk-Gal4>UAS-RNAi transgenic lines exhibited reduced sensitivity to sucrose compared to the Lk-Gal4>UAS-TNT or Lk-Gal4>UAS-TNT<sup>imp</sup> lines. We aimed to establish a sucrose concentration that would elicit a 50% Proboscis Extension Response (PER) without adding any other compound, thereby allowing us to evaluate the additional effect of caffeine in the food.

      However, according to the data derived from the connectome, SELK neurons might be cholinergic, and this neurotransmitter might also be involved in controlling the behavior of the flies.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      To get more evidence for connections between sensory cells and SELK neurons, could the authors also analyze a second available EM data set? Would setting a different threshold (>5 synapses) reveal connections to both sensories? Comparisons between SELK in- and outputs from EM data and Tango labeling also seem to differ quite a lot based on provided images - can the authors count cell bodies in the stainings? Further proof would be to provide functional imaging data that shows that SELK neurons respond to sugar and bitter compounds.

      In this study, we utilized the recently published EM dataset for the Drosophila central brain connectome (Dorkenwald et al., 2024; Flywire.ai). Changing the number of synapses affects the counts of pre- and postsynaptic neurons. We set a threshold of more than five synapses, as recommended by Flywire, to avoid false positives (Dorkenwald et al., 2024). This threshold has been widely used in recent papers (Engert et al., 2022; Shiu et al., 2022; Walker et al., 2025).

      The neuron counts in the connectomic data differ from those in the trans- and retro-TANGO experiments. In our initial trans-TANGO experiment, which labeled postsynaptic neurons in the Gr64f-Gal4 and Gr66a-Gal4 transgenic lines, we counted the labeled neurons (see Supplementary Figure 1C) and observed considerable variability between different brains. Given this anticipated variability, we did not count the neurons labeled by the trans-TANGO and retro-TANGO techniques in the Leucokinin neurons. Furthermore, neither technique labels all postsynaptic or presynaptic neurons, respectively. A recent study on the retro-TANGO technique (Sorkac et al., 2023) found a minimum threshold: the presynaptic neuron must form a certain number of synapses with the neuron of interest to be adequately labeled. According to this paper, the established threshold is 17 synapses. It is likely that the trans-TANGO technique also has a synapse-count threshold below which neurons are not labeled. This would explain the discrepancy between the two results.

      Unfortunately, we have not been able to provide functional data pointing to the activation of SELK neurons by sucrose or caffeine. However, our active-GRASP data indicate that the connectivity of Gr64f<sup>GRNs</sup> and Gr66a<sup>GRNs</sup> with SELK neurons is present and functional.

      How many Leucokinin-positive cells are in the SEZ? Does the RNA-seq data provide further information about the SELK neurons? Potential receptor candidates for how they integrate hunger signals? AMPKa was described to be required in LHLK neurons.

      There are two SELK neurons in the SEZ. Because our RNA sequencing (RNAseq) was performed in bulk, we cannot attribute any additional gene expression detected in our transcriptomic analysis specifically to the SELK neurons. Furthermore, the single-cell RNA sequencing (scRNAseq) data available from the Drosophila brain, as reported by Li et al. (2022), does not allow accurate differentiation between SELK and LHLK neurons. To understand how these neurons integrate both metabolic and sensory information, it is crucial to conduct a focused RNAseq study specifically on the SELK neurons. This targeted analysis would provide the insights needed to better elucidate their functional roles. However, according to the data derived from the connectome, SELK neurons might be cholinergic, and this neurotransmitter might also be involved in controlling the behavior of the flies.

      According to previous studies (Yurgel et al., 2019), the Lk-GAL4 line is also expressed in the VNC, thus the authors could make use of the tsh-GAL80 tool to clean up the line. This study also performed GCaMP imaging in fed and 24h starved animals in SELK and couldn't find a difference, can the authors explain this discrepancy?

      We thank the reviewer for this suggestion. We have now added a new piece of data using the tsh-Gal80 transgene in our PER experiments (Supplementary Figure 7A). Blocking the expression of TNT in the ABLK neurons does not affect the main conclusion of the behavioral results. As stated previously, we were unable to obtain in vivo Ca²⁺ imaging responses in SELK neurons upon exposure to sucrose, caffeine, or mixtures of sucrose and caffeine. We do not believe this is a discrepancy with previous works like Yurgel et al., 2019. It is likely that we faced technical issues regarding expression stability and that the stimulation was possibly too weak to detect changes in GFP levels.

      Reviewer #2 (Recommendations for the authors):

      As mentioned above I do not have any major comments on the manuscript, but there are a few points that I feel should be considered:

      (1) The identification of the Lk-candidate neurons in the connectome remains a bit mysterious. In the method sections, this reads as follows: "manual and visual criteria were applied to identify the neurons of interest". a) What precisely was done to get to the candidates? b) Are there alternative candidates that may be Lk-neurons? c) How would another neuron affect the conclusion of the downstream analysis?

      We thank the reviewer for this comment. We have now modified and added new information in the connectomic section, reinforcing our conclusions and correcting the results obtained.

      Our GRASP, BacTrace, and immunohistochemistry experiments pointed to SELK neurons as postsynaptic to both Gr64f<sup>GRNs</sup> (sweet) and Gr66a<sup>GRNs</sup> (bitter). To identify which neurons in the connectome could be the SELK neurons, we utilized a previously described set of GRNs already identified in the connectome (Shiu et al., 2022). We extracted all neurons postsynaptic to the identified sweet and bitter GRNs and intersected both datasets, retaining only those candidate hits receiving simultaneous input from sweet and bitter GRNs. This process yielded a total of 333 hits. Through visual inspection, we discarded all hits that were merely neuronal fragments or neurons that clearly were not our candidates. We narrowed the list down to a final set of 17 candidate neurons whose arborization was located in the SEZ. We reduced the candidates to two final entries from this list: ID 720575940623529610 (GNG.276) and ID 720575940630808827 (GNG.685). The GNG.276 neuron had a counterpart in the SEZ identified as GNG.246. Both of these neurons were annotated as DNg70 in the Flywire database. GNG.685 had a counterpart identified as GNG.595, and these two neurons were classified as DNg68. In both cases, the neuronal candidates, DNg70 and DNg68, were classified as descending neurons, a characteristic of previously described SELK neurons (Nässel et al., 2021). In our initial analysis published in bioRxiv and sent for revision, we identified DNg70 as potentially the SELK neurons based solely on the morphology of the neurons via visual inspection. However, we have since employed a more rigorous method to determine which candidate is more likely to be the SELK neurons, concluding that DNg68, rather than DNg70, represents the SELK neurons. Briefly, we performed immunohistochemistry for GFP in the Lk-Gal4>UAS-CD8:GFP flies. We aligned the resulting image to a Drosophila reference brain (JRC2018U) using the CMTK Registration plugin in ImageJ. The aligned image was skeletonized using the Simple Neurite Tracer plugin in ImageJ and later uploaded to the Flywire Gateway platform to compare the structure of the aligned and skeletonized SELK neurons to our candidates. This comparison clearly indicated that the DNg68 neurons, rather than DNg70, are the best candidates for representing the SELK neurons. We have updated the text, Figure 6, and Supplementary Figure 6 to reflect the new results. These new results do not alter the conclusions of the paper.
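
      The candidate-selection step described above (collect the partners downstream of sweet GRNs and of bitter GRNs, keep only connections with more than five synapses, then intersect the two sets) can be sketched as follows. This is a minimal illustration on a made-up edge list, not the authors' actual pipeline: the function name, the neuron labels, and the tuple layout are all hypothetical, and the threshold is assumed to apply per neuron pair.

```python
def downstream_partners(edges, sources, threshold=5):
    """Return postsynaptic neurons receiving MORE than `threshold` synapses
    from at least one neuron in `sources` (hypothetical helper)."""
    sources = set(sources)
    return {post for pre, post, n_syn in edges
            if pre in sources and n_syn > threshold}

# Hypothetical toy edge list: (presynaptic id, postsynaptic id, synapse count).
edges = [
    ("sweet_1", "A", 12),
    ("sweet_2", "B", 3),    # below threshold, ignored
    ("sweet_2", "C", 8),
    ("bitter_1", "A", 7),
    ("bitter_1", "C", 5),   # exactly five is not "more than five"
    ("bitter_2", "D", 20),
]
sweet_grns = ["sweet_1", "sweet_2"]
bitter_grns = ["bitter_1", "bitter_2"]

# Candidates must receive input from both modalities.
candidates = (downstream_partners(edges, sweet_grns)
              & downstream_partners(edges, bitter_grns))
print(sorted(candidates))  # ['A']
```

      In a real analysis the intersection would only be the starting point; as described above, the hits were then filtered by visual inspection and by morphological comparison with the registered Lk-Gal4 image.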

      (2) In the transcriptomic experiments, it seems that the raw transcripts are reported, rather than normalised data. Why?

      All transcriptomic data is normalized. In Figure 1 the differential expression was calculated using DESeq2 normalized counts. In Figure 2, Transcripts Per Million (TPM) were calculated using the Salmon package and normalized for gene length.
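
      For readers unfamiliar with the unit, the standard TPM definition can be sketched in a few lines. This is only the textbook formula applied to made-up numbers: Salmon itself estimates fractional read counts and uses effective transcript lengths, which this sketch does not reproduce, and the gene names below are hypothetical.

```python
def tpm(counts, lengths):
    """Transcripts Per Million from read counts and gene lengths (in bases).

    Each count is first divided by gene length (reads per base), and the
    resulting rates are rescaled so that they sum to one million.
    """
    rates = {gene: counts[gene] / lengths[gene] for gene in counts}
    total = sum(rates.values())
    return {gene: rate / total * 1e6 for gene, rate in rates.items()}

# Hypothetical toy data: counts scale with length, so all genes get equal TPM.
counts = {"geneA": 100, "geneB": 200, "geneC": 300}
lengths = {"geneA": 1000, "geneB": 2000, "geneC": 3000}
values = tpm(counts, lengths)
print(values)
```

      Because the values always sum to one million within a sample, TPM is convenient for comparing relative expression of genes within a sample, whereas the DESeq2 normalized counts mentioned above are designed for comparisons between samples.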

      (3) The expression of nAChRbeta1 in the transcriptomic data is rather striking. However, this remains currently not addressed: is this expression real?

      We have not confirmed the upregulation or downregulation in gene expression for genes other than Leucokinin, which is our main interest. We found the presence of nAChRbeta1 interesting, as GRNs are cholinergic (Jaeger et al., 2018), suggesting that it would make sense to find cholinergic receptors in G2Ns. However, it is possible that these receptors are expressed in all G2Ns and serve as a common means of communication.

      (4) The description of the behavioural experiments in the results section is rather brief. I had a hard time following it since the genotypes are not repeated nor is it stated what is different in the experimental group vs control (but instead simply what changes in the experimental group, in a rather discussion-like fashion).

      We thank the reviewer for the comment. We have rewritten this section to improve its clarity.

      (5) If I understand the genetics for the behavioural experiments correctly, it addresses the entire Lk-Gal4 expressing population; thus it is not possible to describe the role of the two SEZ neurons, but rather that of the Lk-Gal4 neurons. This should be clarified.

      We thank the reviewer for this comment. Indeed, the Lk-Gal4 driver we used drives expression in all Leucokinin neurons, making it impossible to distinguish between the SELK, LHLK, or ABLK neurons. We have added a new piece of behavioral data by using the tsh-Gal80 transgene to prevent the expression of TNT in the ABLK neurons (Supplementary Figure 7A), but still we cannot distinguish between SELK and LHLK. We have rewritten the text to clarify this fact.

      Reviewer #3 (Recommendations for the authors):

      Overall, the manuscript is well-written, I only have one minor suggestion for improvement. In Figure 8C, please clarify the use of TNT to block Lk release.

      We thank the reviewer for the comment. We have clarified the use of TNT in the text.

      References

      Clyne, J. D. & Miesenböck, G. Sex-Specific Control and Tuning of the Pattern Generator for Courtship Song in Drosophila. Cell 133, 354–363 (2008).

      Dorkenwald, S. et al. Neuronal wiring diagram of an adult brain. Nature 634, 124–138 (2024).

      Engert, S., Sterne, G. R., Bock, D. D. & Scott, K. Drosophila gustatory projections are segregated by taste modality and connectivity. eLife 11, e78110 (2022).

      Jaeger, A. H. et al. A complex peripheral code for salt taste in Drosophila. eLife 7, e37167 (2018).

      Macpherson, L. J. et al. Dynamic labelling of neural connections in multiple colours by trans-synaptic fluorescence complementation. Nat Commun 6, 10024 (2015).

      Nässel, D. R. Leucokinin and Associated Neuropeptides Regulate Multiple Aspects of Physiology and Behavior in Drosophila. Int J Mol Sci 22, 1940 (2021).

      Shiu, P. K., Sterne, G. R., Engert, S., Dickson, B. J. & Scott, K. Taste quality and hunger interactions in a feeding sensorimotor circuit. eLife 11, e79887 (2022).

      Walker, S. R., Peña-Garcia, M. & Devineni, A. V. Connectomic analysis of taste circuits in Drosophila. Sci. Rep. 15, 5278 (2025).

    2. eLife Assessment

      This study provides valuable insights into the organization of second-order circuits of gustatory neurons, particularly in how these circuits integrate opposing taste inputs and are modulated by metabolic state to regulate feeding behavior. Through an elegant combination of complementary techniques, the authors identify the target neurons involved in gustatory integration. The evidence supporting their conclusions is convincing.

    3. Reviewer #1 (Public review):

      Summary:

      Mollá-Albaladejo et al. investigate the neurons downstream of Gr64f and Gr66a, called G2Ns. They identify downstream neurons using trans-Tango labeling with RFP and then perform bulk RNA-seq on the RFP-sorted cells. Gene expression is up- or downregulated between the cell populations and between fed and starved states. They specifically identify Leucokinin as a neuropeptide that is upregulated in starved Gr66a cells. Leucokinin cells, identified by a GAL4 line, indeed show higher expression when starved, especially in the SEZ. Furthermore, Leucokinin cells colocalize with the trans-Tango signal from downstream neurons of both GRs. This connection is confirmed with GRASP and active GRASP. According to EM data, Leucokinin cells in the SEZ receive a lot of input and connect to many downstream neurons. In behavior experiments performed with flies lacking Leucokinin neurons, flies show reduced responsiveness to sugar and bitter mixtures when starved. The authors suggest that Leucokinin neurons integrate bitter and sugar tastes and that their output is modified by a hunger state.

      Strengths:

      The authors use a multitude of tools to identify SELK neurons downstream of taste sensory neurons and as starvation-sensitive cells. This study provides an example of how combining genetic labeling, RNA-seq, and EM analysis can be used to investigate the function of specific neural circuits.

      Weaknesses:

      The authors now provide more evidence to show a functional connection between sensory neurons and SELK neurons, for example by using active GRASP; however, different staining methods reveal different connectivity patterns. The authors describe a behavioral phenotype when flies are starved; however, the phenotype still cannot be clearly assigned to the SELK neurons.

    4. Reviewer #2 (Public review):

      Summary:

      A core task of the brain is processing sensory cues from the environment. The neural mechanisms of how sensory information is transmitted from peripheral sense organs to subsequent processing in defined brain centers remain an important topic in neuroscience. The taste system hereby assesses the palatability of food by evaluating the chemical composition and nutrient content while integrating the current need for energy by assessing the satiation level of the organism. The current manuscript provides insights into the early circuits of gustatory coding using the fruit fly as a model. By combining trans-tango and FACS-based bulk RNAseq to assess the target neurons of sweet sensing (using Gr64f-Gal4) and bitter sensing (using Gr66a-Gal4), in a first set of experiments the authors investigate genes that are differentially expressed or co-expressed in normal and starved conditions. With a focus on neuropeptides and neurotransmitters, differential expression in the different conditions was assessed, resulting in the identification of Leucokinin as a potentially interesting gene. The notion is further supported by RNAseq of Lk-Gal4>mCD8:GFP sorted cells and immunostainings. GRASP and BacTrace experiments further support that the two Lk-expressing cells in the SEZ should indeed be postsynaptic to both types of sensory neurons. Using EM-based connectomics data (based on a previous publication by Engert et al.), the authors also look for downstream targets of the bitter versus sweet gustatory neurons to identify the Lk-neurons. Based on morphology they identify candidates and further depict the potential downstream neurons in the connectome, which appears largely in agreement with GRASP experiments. Finally, silencing the Lk-neurons shows an increased PER response in starved flies (when combined with bitter compounds) as well as increased feeding in a FlyPad assay.

      Strengths:

      Overall this is an intriguing manuscript, which provides insight into the organization of 2nd order gustatory neurons. It specifically provides strong evidence for the Lk-neurons as a target of sweet and bitter GRNs and provides evidence for their role in regulating sweet vs bitter-based behavioral responses. Particularly the integration of different techniques and datasets in an elegant fashion is a strong side of the manuscript. Moreover, placing the known Lk-neurons in the context of 2nd order gustatory signalling strengthens the knowledge about this pathway.

      Weaknesses:

      I do not see any major weakness in the current manuscript. Novelty is to some degree lessened by the fact that the RNAseq approach did not identify new neurons but rather put the known Lk-neurons as a major finding. Similarly, the final behavioral section is not very deep and to some degree corroborates the previous publication by the Keene and Nässel labs - that said, the model they propose is indeed novel (but lacks depth in analyses, e.g. there is no physiology that would support the modulation of Lk neurons by either type of GRN). The connectomic section appears a bit out of place and, after reading it, it is not really clear what one should make of the potential downstream neurons (particularly since the Lk-receptor expression has been previously analyzed); here it might have been interesting to address if/how Lk-neurons may signal directly via a classical neurotransmitter (information that might be found easily in the adult brain single-cell data).

      Comments on the latest version:

      I feel all points have been included to a satisfactory degree.

    5. Reviewer #3 (Public review):

      Summary:

      To make feeding decisions, animals need to process three types of information: positive cues like sweetness, negative cues like bitterness, and internal states such as hunger or satiety. This study aims to identify where the information is integrated in the fruit fly brain. The authors applied RNA sequencing on second-order gustatory neurons responsible for sweet and bitter processing, under fed and starved conditions. The sequencing data reveal significant changes in gene expression across sweet vs. bitter pathways and fed vs. starved states. The authors focus on the neuropeptide Leucokinin (Lk), whose expression is dependent on the starvation state. They identify a pair of neurons, named SELK neurons, which express Lk and receive direct input from both sweet and bitter gustatory neurons. These SELK neurons are ideal candidates to integrate gustatory and internal state information. Behavioral experiments show that blocking these neurons in starved flies alters their tolerance to bitter substances during feeding.

      Strengths:

      (1) The study employs a well-designed approach, targeting specific neuronal populations, which is more efficient and precise compared to traditional large-scale genetic screening methods.

      (2) The RNAseq results provide valuable data that can be utilized in future studies to explore other molecules beyond Lk.

      (3) The identification of SELK neurons offers a promising avenue for future research into how these neurons integrate conflicting gustatory signals and internal state information.

      Weaknesses:

      Unfortunately, due to technical challenges, the authors were unable to directly image the functional activity of SELK neurons.

    1. Author response:

      Reviewer #1:

      As this code was developed for use with a 4096 electrode array, it is important to be aware of double-counting neurons across the many electrodes. I understand that there are ways within the code to ensure that this does not happen, but care must be taken in two key areas. Firstly, action potentials traveling down axons will exhibit a triphasic waveform that is different from the biphasic waveform that appears near the cell body, but these two signals will still be from the same neuron (for example, see Litke et al., 2004 "What does the eye tell the brain: Development of a System for the Large-Scale Recording of Retinal Output Activity"; figure 14). I did not see anything that would directly address this situation, so it might be something for you to consider in updated versions of the code.

      We thank the reviewer for this insightful comment. We agree that signals from the same neuron may be collected by adjacent channels. To address this concern in our software, we plan to add a routine to SpikeMAP that allows users to discard nearby channels where spike count correlations exceed a pre-determined threshold. Because there is no ground truth to map individual cells to specific channels on the hd-MEA, a statistical approach is warranted.
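
      The planned routine could look roughly like the sketch below, which bins spike times per channel, computes the Pearson correlation of the binned counts for nearby channel pairs, and flags one channel of any pair whose correlation exceeds a threshold. Everything here is an assumption made for illustration: the function names, the bin size, the distance cutoff, the threshold value, and the rule of discarding the channel with fewer spikes are not taken from SpikeMAP.

```python
import math

def binned_counts(spike_times_ms, duration_ms, bin_ms):
    """Histogram of spike times (ms) in consecutive bins of width `bin_ms`."""
    n_bins = math.ceil(duration_ms / bin_ms)
    counts = [0] * n_bins
    for t in spike_times_ms:
        counts[min(int(t // bin_ms), n_bins - 1)] += 1
    return counts

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def channels_to_discard(spikes, positions, duration_ms,
                        bin_ms=10, max_dist=1.5, r_threshold=0.8):
    """Flag one channel of each nearby pair with spike-count correlation
    above `r_threshold` (all parameter values are illustrative assumptions)."""
    counts = {ch: binned_counts(t, duration_ms, bin_ms) for ch, t in spikes.items()}
    discard = set()
    channels = sorted(spikes)
    for i, a in enumerate(channels):
        for b in channels[i + 1:]:
            if a in discard or b in discard:
                continue
            if math.dist(positions[a], positions[b]) > max_dist:
                continue  # only compare neighbouring electrodes
            if pearson(counts[a], counts[b]) > r_threshold:
                # Assumed rule: keep the channel with more detected spikes.
                discard.add(a if len(spikes[a]) < len(spikes[b]) else b)
    return discard

# Hypothetical toy recording: channel 1 duplicates channel 0, channel 2 is
# far away, channel 3 is adjacent to channel 0 but fires at other times.
spikes = {0: [5, 105, 205, 305], 1: [6, 106, 206],
          2: [5, 105, 205, 305], 3: [55, 155, 255, 355]}
positions = {0: (0, 0), 1: (1, 0), 2: (10, 0), 3: (0, 1)}
flagged = channels_to_discard(spikes, positions, duration_ms=400)
print(flagged)  # {1}
```

      A correlation criterion of this kind cannot prove that two channels record the same neuron, since nearby cells may be genuinely synchronized, which is consistent with the statistical framing in the response above.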

      Secondly, spike shapes are known to change when firing rates are high, like in bursting neurons (Harris, K.D., Hirase, H., Leinekugel, X., Henze, D.A. & Buzsáki, G. Temporal interaction between single spikes and complex spike bursts in hippocampal pyramidal cells. Neuron 32, 141-149 (2001)). I did not see this addressed in the present version of the manuscript.

      This is a valid concern. To ensure that firing rates are relatively constant over the duration of a recording, we will plot average spike rates using rolling windows of a fixed duration. We expect that population firing rates will remain relatively stable across the duration of recordings.
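
      A rolling-window rate estimate of the kind described can be sketched as below. The function name and the window and step values are hypothetical choices for illustration, not parameters taken from SpikeMAP.

```python
def rolling_rate(spike_times, duration, window, step):
    """Mean firing rate (Hz) in windows of `window` seconds, advanced by
    `step` seconds, over a recording lasting `duration` seconds."""
    rates = []
    k = 0
    while k * step + window <= duration:
        start = k * step
        n = sum(1 for t in spike_times if start <= t < start + window)
        rates.append(n / window)
        k += 1
    return rates

# Hypothetical spike times (seconds) from a steadily firing unit.
spike_times = [0.1, 0.5, 1.2, 1.9, 2.3, 2.8, 3.4, 3.6]
rates = rolling_rate(spike_times, duration=4.0, window=2.0, step=1.0)
print(rates)  # [2.0, 2.0, 2.0]
```

      A flat trace supports the stability assumption, while pronounced peaks would point to bursting epochs in which spike shapes may change and waveform-based classification deserves extra scrutiny.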

      Another area for possible improvement would be to build on the excellent validation experiments you have already conducted with parvalbumin interneurons. Although it would take more work, similar experiments could be conducted for somatostatin and vasoactive intestinal peptide neurons against a background of excitatory neurons. These may have different spike profiles, but your success in distinguishing them can only be known if you validate against ground truth, like you did for the PV interneurons.

      We agree that further cycles of experiments could be performed with SOM, VIP, and other neuronal subtypes, and we hope that researchers will take advantage of SpikeMAP to do so. We will clarify this possibility in the Discussion section of the manuscript.

      Reviewer #2:

      Summary:

      While I find that the paper is nicely written and easy to follow, I find that the algorithmic part of the paper is not really new and should have been more carefully compared to existing solutions. While the GT recordings to assess the possibilities of a spike sorting tool to distinguish properly between excitatory and inhibitory neurons are interesting, spikeMAP does not seem to bring anything new to state-of-the-art solutions, and/or, at least, it would deserve to be properly benchmarked. I would suggest that the authors perform a more intensive comparison with existing spike sorters.

      We thank the reviewer for this comment. As detailed in Table 1, SpikeMAP is the only method that performs E/I sorting on large-scale multielectrode arrays; hence, a comparison to competing methods is not currently possible. That being said, many of the pre-processing steps of SpikeMAP (Figure 1) involve methods that are already well established in the literature and available in other packages. To highlight the contribution of our work more clearly and facilitate the adoption of SpikeMAP, we plan to provide a “modular” portion of SpikeMAP that is specialized in performing E/I sorting and can be added to the pipeline of other packages such as KiloSort. This modularized version of the code will be shared freely along with the more complete version already available.

      Weaknesses:

      (1) The global workflow of spikeMAP, described in Figure 1, seems to be very similar to that of Hilgen et al. 2017 (10.1016/j.celrep.2017.02.038). Therefore, the first question is what is the rationale of reinventing the wheel, and not using tools that are doing something very similar (as mentioned by the authors themselves). I have a hard time, in general, believing that spikeMAP has something particularly special, given its Methods, compared to state-of-the-art spike sorters.

      We agree with the reviewer that there are indeed similarities between our work and the Hilgen et al. paper. However, while the latter employs optogenetics to stimulate neurons on a large-scale array, their technique does not specifically target inhibitory (e.g., PV) neurons as described in our work. We will clarify our paper accordingly.

      This is why, at the very least, the title of the paper is misleading, because it lets the reader think that the core of the paper will be about a new spike sorting pipeline. If this is the main message the authors want to convey, then I think that numerous validations/benchmarks are missing to assess first how good spikeMAP is, with reference to spike sorting in general, before deciding if this is indeed the right tool to discriminate excitatory vs inhibitory cells. The GT validation, while interesting, is not enough to entirely validate the paper. The details are a bit too scarce for me, or would deserve to be better explained (see other comments after).

      The title of our work will be edited to make it clear that while elements of the pipeline are well-established and available from other packages, we are the first to extend this pipeline to E/I sorting on large-scale arrays.

      (2) Regarding the putative location of the spikes, it has been shown that the center of mass, while easy to compute, is not the most accurate solution [Scopin et al, 2024, 10.1016/j.jneumeth.2024.110297]. For example, it has an intrinsic bias for finding positions within the boundaries of the electrodes, while some other methods, such as monopolar triangulation or grid-based convolution, might have better performances. Can the authors comment on the choice of the Center of Mass as a unique way to triangulate the sources?

      We agree with the reviewer and will point out limits of the center-of-mass algorithm based on the article of Scopin et al (2024). Further, we will augment the existing code library to include monopolar triangulation or grid-based convolution as options available to end-users.
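
      For context, the center-of-mass estimate under discussion is an amplitude-weighted average of electrode positions, as sketched below. This is a generic illustration with hypothetical names and data, not SpikeMAP's code.

```python
def center_of_mass(positions, amplitudes):
    """Amplitude-weighted mean of electrode positions.

    positions  : list of (x, y) electrode coordinates
    amplitudes : spike amplitude on each electrode (sign is ignored)
    """
    weights = [abs(a) for a in amplitudes]
    total = sum(weights)
    if total == 0:
        raise ValueError("all amplitudes are zero")
    x = sum(w * p[0] for w, p in zip(weights, positions)) / total
    y = sum(w * p[1] for w, p in zip(weights, positions)) / total
    return x, y

# Hypothetical 2x2 patch of electrodes; the spike is larger on the right column.
positions = [(0, 0), (1, 0), (0, 1), (1, 1)]
amplitudes = [-10, -30, -10, -30]
estimate = center_of_mass(positions, amplitudes)
print(estimate)  # (0.75, 0.5)
```

      Because the weights are non-negative, the estimate is a convex combination of the electrode positions and can never fall outside the area spanned by the electrodes used, which is the bias the reviewer refers to.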

      (3) Still in Figure 1, I am not sure I really see the point of Spline Interpolation. I see the point of such a smoothing, but the authors should demonstrate that it has a key impact on the distinction of Excitatory vs. Inhibitory cells. What is special about the value of 90kHz for a signal recorded at 18kHz? What is the gain with spline enhancement compared to without? Does such a value depend on the sampling rate, or is it a global optimum found by the authors?

      We will clarify these points. Specifically, the value of 90kHz was chosen because it provided a reasonable temporal characterization of spikes; this value, however, can be adjusted within the software based on user preference.
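
      As an illustration of what the interpolation step does, here is a minimal sketch of cubic-spline upsampling from 18 kHz to 90 kHz using SciPy's `CubicSpline`. The function name and the toy waveform are assumptions made for this example and do not reproduce the SpikeMAP implementation.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def upsample_waveform(waveform, fs_in=18_000.0, fs_out=90_000.0):
    """Resample one spike waveform from fs_in to fs_out with a cubic spline."""
    waveform = np.asarray(waveform, dtype=float)
    t_in = np.arange(waveform.size) / fs_in
    # Cover the same time window as the recorded samples.
    n_out = int(round((waveform.size - 1) * fs_out / fs_in)) + 1
    t_out = np.arange(n_out) / fs_out
    return t_out, CubicSpline(t_in, waveform)(t_out)

# Toy biphasic waveform, 2 ms at 18 kHz (37 samples); the shape is illustrative only.
t = np.arange(37) / 18_000.0
spike = (-np.exp(-((t - 0.0005) / 0.0001) ** 2)
         + 0.4 * np.exp(-((t - 0.0009) / 0.0002) ** 2))

t_up, spike_up = upsample_waveform(spike)
```

      With a factor of 5, a trough-to-peak width read off the upsampled trace is quantized in steps of about 11 us rather than about 56 us. The interpolation does not add information beyond what the 18 kHz samples contain, so whether it changes the E/I classification would still need to be shown empirically.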

      (4) Figure 2 is not really clear, especially panel B. The choice of the time scale for the B panel might not be the most appropriate, and the legend filtered/unfiltered with a dot is not clear to me in Bii.

      We will re-check Fig. 2B, which seems to have a rendering error, likely due to conversion from its original format.

      In panel E, the authors are making two clusters with PCA projections on single waveforms. Does this mean that the PCA is only applied to the main waveforms, i.e. the ones obtained where the amplitudes are peaking the most? This is not really clear from the methods, but if this is the case, then this approach is a bit simplistic and does not really match state-of-the-art solutions. Spike waveforms are quite often, especially with such high-density arrays, covering multiple channels at once, and thus the extracellular patterns triggered by the single units on the MEA are spatio-temporal motifs occurring on several channels. This is why, in modern spike sorters, the information in a local neighbourhood is often kept to be projected, via PCA, on the lower-dimensional space before clustering. Information on a single channel only might not be informative enough to disambiguate sources. Can the authors comment on that, and what is the exact spatial resolution of the 3Brain device? The way the authors are performing the SVD should be clarified in the methods section. Is it on a single channel, and/or on multiple channels in a local neighbourhood?

      Here, the reviewer is suggesting that it may be better to perform PCA on several channels at once, since spikes can occur at several channels at the same time. To address this concern, a small routine will be written that allows users to choose how many nearby channels are included in the PCA.
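
      One way such a routine could work is sketched below: the snippets from the peak channel and its selected neighbours are concatenated into a single spatio-temporal vector per spike before the SVD. The function name, array layout, and random test data are assumptions for illustration, not the planned SpikeMAP interface.

```python
import numpy as np

def neighbourhood_pca(snippets, n_components=3):
    """Project multi-channel spike snippets onto their leading principal components.

    snippets: (n_spikes, n_channels, n_samples) array; the channel axis holds
              the peak channel plus however many neighbours the user selected.
    """
    snippets = np.asarray(snippets, dtype=float)
    # Concatenate channels so each spike becomes one spatio-temporal vector.
    flat = snippets.reshape(snippets.shape[0], -1)
    flat = flat - flat.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(flat, full_matrices=False)
    return flat @ vt[:n_components].T

rng = np.random.default_rng(0)
snippets = rng.normal(size=(200, 9, 40))  # 200 spikes, 3x3 channel patch, 40 samples
features = neighbourhood_pca(snippets)
```

      Setting the channel axis to length 1 recovers the single-channel case, so the two approaches can be compared on the same data.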

      (5) About the isolation of the single units, here again, I think the manuscript lacks some technical details. The authors are saying that they are using a k-means cluster analysis with k=2. This means that the authors are explicitly looking for 2 clusters per electrode? If so, this is a really strong assumption that should not be held in the context of spike sorting, because, since it is a blind source separation technique, one cannot pre-determine in advance how many sources are present in the vicinity of a given electrode. While the illustration in Figure 2E is ok, there is no guarantee that one cannot find more clusters, so why this choice of k=2? Again, this is why most modern spike sorting pipelines do not rely on k-means, to avoid any hard-coded number of clusters. Can the authors comment on that?

      It is true that k=2 is a pre-determined choice in our software. In practice, we found that k>2 leads to poorly defined clusters. However, we will ensure that this parameter can be adjusted in the software. Furthermore, if the user chooses not to pre-define this value, we will provide the option to use a Calinski-Harabasz criterion to select k.
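
      A minimal sketch of how k could be selected with the Calinski-Harabasz criterion is given below, using scikit-learn's `KMeans` and `calinski_harabasz_score`. The helper `select_k`, the candidate range of k, and the synthetic features are assumptions for this example only.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import calinski_harabasz_score

def select_k(features, k_values=range(2, 7), seed=0):
    """Return the k with the highest Calinski-Harabasz score, plus all scores."""
    scores = {}
    for k in k_values:
        labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(features)
        scores[k] = calinski_harabasz_score(features, labels)
    return max(scores, key=scores.get), scores

# Synthetic PCA features: three well-separated units on one electrode (illustrative).
rng = np.random.default_rng(0)
centres = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
features = np.vstack([c + 0.5 * rng.normal(size=(100, 2)) for c in centres])

best_k, scores = select_k(features)
```

      Note that the index is undefined for k = 1, so electrodes that see a single unit would need a separate check.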

      (6) I'm surprised by the linear decay of the maximal amplitude as a function of the distance from the soma, as shown in Figure 2H. Is it really what should be expected? Based on the properties of the extracellular media, shouldn't we expect a power law for the decay of the amplitude? This is strange that up to 100um away from the soma, the max amplitude only dropped from 260 to 240 uV. Can the authors comment on that? It would be interesting to plot that for all neurons recorded, in a normed manner V/max(V) as function of distances, to see what the curve looks like.

      We share the reviewer’s concern and will add results that include a population of neurons to assess the robustness of this phenomenon.
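
      The normalized analysis suggested by the reviewer could be sketched as follows: amplitudes are divided by their maximum and a power law is fitted in log-log space. The data here are synthetic and generated with a known exponent; the function name and all numbers are assumptions for illustration.

```python
import numpy as np

def fit_power_law(distance_um, amplitude_uv):
    """Fit V/max(V) = c * r**(-alpha) by least squares in log-log space.

    Distances must be strictly positive for the logarithm to be defined.
    """
    r = np.asarray(distance_um, dtype=float)
    v = np.asarray(amplitude_uv, dtype=float)
    v_norm = v / v.max()
    slope, intercept = np.polyfit(np.log(r), np.log(v_norm), 1)
    return -slope, np.exp(intercept), v_norm

# Synthetic example only: amplitudes generated with an exponent of 1.5.
distance = np.array([20.0, 40.0, 60.0, 80.0, 100.0])
amplitude = 260.0 * (distance / 20.0) ** -1.5

alpha, scale, v_norm = fit_power_law(distance, amplitude)
```

      For comparison, the values the reviewer reads off Figure 2H correspond to V/max(V) = 240/260, or roughly 0.92, at 100 um.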

      (7) In Figure 3A, it seems that the total number of cells is rather low for such a large number of electrodes. What are the quality criteria that are used to keep these cells? Did the authors exclude some cells from the analysis, and if yes, what are the quality criteria that are used to keep cells? If no criteria are used (because none are mentioned in the Methods), then how come so few cells are detected, and can the authors convince us that these neurons are indeed "clean" units (RPVs, SNRs, ...)?

      We applied stringent criteria to exclude cells, and we will revise the main text to be clear about these criteria, which include a minimum spike rate and the use of LDA to separate out PCA clusters. For the cells that were retained, we will include SNR estimates.
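
      Two of the unit-quality measures the reviewer names (RPVs and SNR) can be sketched as below. Definitions vary between packages; the 2 ms refractory period, the peak-to-peak SNR definition, and the function names are assumptions chosen for this example, not the criteria used in the manuscript.

```python
import numpy as np

def refractory_violation_fraction(spike_times_s, refractory_s=0.002):
    """Fraction of inter-spike intervals shorter than the refractory period."""
    isi = np.diff(np.sort(np.asarray(spike_times_s, dtype=float)))
    return float(np.mean(isi < refractory_s)) if isi.size else 0.0

def waveform_snr(waveforms, noise_std):
    """Peak-to-peak amplitude of the mean waveform divided by the noise s.d."""
    mean_waveform = np.asarray(waveforms, dtype=float).mean(axis=0)
    return float(np.ptp(mean_waveform) / noise_std)

# Toy spike train (seconds): one of the four intervals is shorter than 2 ms.
spike_times = np.array([0.010, 0.011, 0.050, 0.120, 0.300])
rpv = refractory_violation_fraction(spike_times)

# Two toy waveforms (uV) and an assumed noise s.d. of 10 uV.
waveforms = np.array([[0.0, -50.0, 20.0, 0.0], [0.0, -70.0, 40.0, 0.0]])
snr = waveform_snr(waveforms, noise_std=10.0)
```

      In this toy case the violation fraction is 0.25 and the SNR is 9.0; reporting such values per retained unit would let readers judge how clean the units are.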

      (8) Still in Figure 3A, it looks like there is a bias to find inhibitory cells at the borders, since they do not appear to be uniformly distributed over the MEA. Can the authors comment on that? What would be the explanation for such a behaviour? It would be interesting to see some macroscopic quantities on Excitatory/Inhibitory cells, such as mean firing rates, averaged SNRs... Because again, in Figure 3C, it is not clear to me that the firing rates of inhibitory cells are higher than Excitatory ones, whilst they should be in theory.

      We will include a comparison of firing rates for E and I neurons. It is possible that I cells are located at the border of the MEA due to the site of injections of the viral vector, and not because of an anatomical clustering of I cells per se. We will clarify the text accordingly.

      (9) For Figure 3 in general, I would have performed an exhaustive comparison of putative cells found by spikeMAP and other sorters. More precisely, I think that to prove the point that spikeMAP is indeed bringing something new to the field of spike sorting, the authors should have compared the performances of various spike sorters to discriminate Exc vs Inh cells based on their ground truth recordings. For example, either using Kilosort [Pachitariu et al, 2024, 10.1038/s41592-024-02232-7], or some other sorters that might be working with such large high-density data [Yger et al, 2018, 10.7554/eLife.34518].

      As mentioned previously, Kilosort and related approaches do not address the problem of E/I identification (see Table 1). However, they do have pre-processing steps in common with SpikeMAP. We will add some specific comparison points – for instance, the use of k-means and PCA (which is more common across packages) and the use of cubic spline interpolation (which is less common). Further, we will provide a stand-alone E/I sorting module that can be added to the pipeline of other packages, so that users can use this functionality without having to migrate their entire analysis.

      (10) Figure 4 has a big issue, and I guess the panels A and B should be redrawn. I don't understand what the red rectangle is displaying.

      We apologize for this issue. It seems there was a rendering problem when converting the figure from its original format. We will address this issue in the revised version of the manuscript.

      (11) I understand that Figure 4 is only one example, but I have a hard time understanding from the manuscript how many slices/mice were used to obtain the GT data? I guess the manuscript could be enhanced by turning the data into an open-access dataset, but then some clarification is needed. How many flashes/animals/slices are we talking about? Maybe this should be illustrated in Figure 4, if this figure is devoted to the introduction of the GT data.

      We will mention how many flashes/animals/slices were employed in the GT data and provide open access to these data.

      (12) While there is no doubt that GT data as the ones recorded here by the authors are the most interesting data from a validation point of view, the pretty low yield of such experiments should not discourage the use of artificially generated recordings such as the ones made in [Buccino et al, 2020, 10.1007/s12021-020-09467-7] or even recently in [Laquitaine et al, 2024, 10.1101/2024.12.04.626805v1]. In these papers, the authors have putative waveforms/firing rate patterns for excitatory and inhibitory cells, and thus, the authors could test how good they are in discriminating the two subtypes.

      We thank the reviewer for the suggestion that SpikeMAP could be tested on artificially generated spike trains and will add the citation of the two papers mentioned. We hope future efforts will employ SpikeMAP on both synthetic and experimental data to explore the neural dynamics of E and I neurons in healthy and pathological circuits of the brain.

    2. eLife Assessment

      In this manuscript, the authors describe a software package for automatic differentiation of action potentials generated by excitatory and inhibitory neurons, acquired using high-density microelectrode arrays. The work is valuable as it offers a tool with the potential to automatically identify these neuron types in vitro. However, it is incomplete due to limited comparison with ground truth data from optogenetically identified interneuron subtypes and with existing spike sorting pipelines available to users.

    3. Reviewer #1 (Public review):

      Summary:

      The authors note that while many software packages exist for spike sorting, these do not automatically differentiate with known accuracy between excitatory and inhibitory neurons. Moreover, most existing spike sorting packages are for in vivo use, where the majority of electrodes are separated from each other by several hundred microns or more. There is a need for spike sorting packages that can take advantage of high-density electrode arrays where all electrodes are within a few tens of microns of other electrodes. Here, the authors offer such a software package with SpikeMAP, and they validate its performance in identifying parvalbumin interneurons that were optogenetically stimulated.

      Strengths:

      The main strength of this work is that the authors use ground truth measures to show that SpikeMAP can take features of spike shapes to correctly identify known parvalbumin interneurons against a background of other neuron types. They use spike width and peak to peak distance as the key features for distinguishing between neuron types, a method that has been around for many years (Barthó, Peter, et al. "Characterization of neocortical principal cells and interneurons by network interactions and extracellular features." Journal of neurophysiology 92.1 (2004): 600-608.), but whose performance has not been validated in the context of high density electrode arrays.

      Another strength of this approach is that it is automated - a necessity if your electrode array has 4096 electrodes. Hand-sorting or even checking such a large number of channels is something even the cruelest advisor would not wish upon a graduate student. With such large channel counts, it is essential to have automated methods that are known to work accurately. Hence, the combination of validation and automation is an important advance.

      A nice feature of this work is that with high-density electrode arrays, the spike waveforms appear on multiple nearby electrodes simultaneously. And since spike amplitudes fall off with distance, this allows triangulation of neuron locations within the regular electrode array. Thus, spike correlations between neuron types, or within neuron types, can be plotted as a function of distance. While SpikeMAP is not the first to do this (Peyrache, Adrien, et al. "Spatiotemporal dynamics of neocortical excitation and inhibition during human sleep." Proceedings of the National Academy of Sciences 109.5 (2012): 1731-1736.), it is a welcome capability of this package.

      It is also good that the code for this package is open-source, allowing a community of people (I expect in vitro labs will especially want to use this) to use the code and further improve it.

      Weaknesses:

      As this code was developed for use with a 4096 electrode array, it is important to be aware of double-counting neurons across the many electrodes. I understand that there are ways within the code to ensure that this does not happen, but care must be taken in two key areas. Firstly, action potentials traveling down axons will exhibit a triphasic waveform that is different from the biphasic waveform that appears near the cell body, but these two signals will still be from the same neuron (for example, see Litke et al., 2004 "What does the eye tell the brain: Development of a System for the Large-Scale Recording of Retinal Output Activity"; figure 14). I did not see anything that would directly address this situation, so it might be something for you to consider in updated versions of the code. Secondly, spike shapes are known to change when firing rates are high, like in bursting neurons (Harris, K.D., Hirase, H., Leinekugel, X., Henze, D.A. & Buzsáki, G. Temporal interaction between single spikes and complex spike bursts in hippocampal pyramidal cells. Neuron 32, 141-149 (2001)). I did not see this addressed in the present version of the manuscript.

      Another area for possible improvement would be to build on the excellent validation experiments you have already conducted with parvalbumin interneurons. Although it would take more work, similar experiments could be conducted for somatostatin and vasoactive intestinal peptide neurons against a background of excitatory neurons. These may have different spike profiles, but your success in distinguishing them can only be known if you validate against ground truth, like you did for the PV interneurons.

      Appraisal:

      This work addresses the need for an automated spike sorting software package for high-density electrode arrays. Although no spike sorting software is flawless, the package presented here, SpikeMAP, has been validated on PV interneurons, inspiring a degree of confidence. This is a good start, and further validation on other neuron types could increase that confidence. Groups doing in vitro experiments, where 4096 electrode arrays are more common, could find this system particularly helpful.

    4. Reviewer #2 (Public review):

      Summary:

      In this paper, entitled "SpikeMAP: An unsupervised spike sorting pipeline for cortical excitatory and inhibitory neurons in high-density multielectrode arrays with ground-truth validation", the authors present spikeMAP, a pipeline for the analysis of large-scale recordings of in vitro cortical activity. According to the authors, spikeMAP not only allows for the detection of spikes produced by single neurons (spike sorting), but also allows for the reliable distinction between genetically determined cell types by utilizing viral and optogenetic strategies as ground-truth validation. While I find that the paper is nicely written and easy to follow, I find that the algorithmic part of the paper is not really new and should have been more carefully compared to existing solutions. While the GT recordings to assess the possibilities of a spike sorting tool to distinguish properly between excitatory and inhibitory neurons are interesting, spikeMAP does not seem to bring anything new to state-of-the-art solutions, and/or, at least, it would deserve to be properly benchmarked. I would suggest that the authors perform a more intensive comparison with existing spike sorters.

      Strengths:

      The GT recordings with optogenetic activation of the cells, based on the opsins, are interesting and might provide useful data to quantify how good spike sorting pipelines are, in vitro, at discriminating between excitatory and inhibitory neurons. Such an approach can be quite complementary to artificially generated ground truth.

      Weaknesses:

      (1) The global workflow of spikeMAP, described in Figure 1, seems to be very similar to that of Hilgen et al. 2017 (10.1016/j.celrep.2017.02.038). Therefore, the first question is what is the rationale of reinventing the wheel, and not using tools that are doing something very similar (as mentioned by the authors themselves). I have a hard time, in general, believing that spikeMAP has something particularly special, given its Methods, compared to state-of-the-art spike sorters. This is why, at the very least, the title of the paper is misleading, because it lets the reader think that the core of the paper will be about a new spike sorting pipeline. If this is the main message the authors want to convey, then I think that numerous validations/benchmarks are missing to assess first how good spikeMAP is, with reference to spike sorting in general, before deciding if this is indeed the right tool to discriminate excitatory vs inhibitory cells. The GT validation, while interesting, is not enough to entirely validate the paper. The details are a bit too scarce for me, or would deserve to be better explained (see other comments after).

      (2) Regarding the putative location of the spikes, it has been shown that the center of mass, while easy to compute, is not the most accurate solution [Scopin et al, 2024, 10.1016/j.jneumeth.2024.110297]. For example, it has an intrinsic bias for finding positions within the boundaries of the electrodes, while some other methods, such as monopolar triangulation or grid-based convolution, might have better performances. Can the authors comment on the choice of the Center of Mass as a unique way to triangulate the sources?

      (3) Still in Figure 1, I am not sure I really see the point of Spline Interpolation. I see the point of such a smoothing, but the authors should demonstrate that it has a key impact on the distinction of Excitatory vs. Inhibitory cells. What is special about the value of 90kHz for a signal recorded at 18kHz? What is the gain with spline enhancement compared to without? Does such a value depend on the sampling rate, or is it a global optimum found by the authors?

      (4) Figure 2 is not really clear, especially panel B. The choice of the time scale for the B panel might not be the most appropriate, and the legend filtered/unfiltered with a dot is not clear to me in Bii. In panel E, the authors are making two clusters with PCA projections on single waveforms. Does this mean that the PCA is only applied to the main waveforms, i.e. the ones obtained where the amplitudes are peaking the most? This is not really clear from the methods, but if this is the case, then this approach is a bit simplistic and does not really match state-of-the-art solutions. Spike waveforms are quite often, especially with such high-density arrays, covering multiple channels at once, and thus the extracellular patterns triggered by the single units on the MEA are spatio-temporal motifs occurring on several channels. This is why, in modern spike sorters, the information in a local neighbourhood is often kept to be projected, via PCA, on the lower-dimensional space before clustering. Information on a single channel only might not be informative enough to disambiguate sources. Can the authors comment on that, and what is the exact spatial resolution of the 3Brain device? The way the authors are performing the SVD should be clarified in the methods section. Is it on a single channel, and/or on multiple channels in a local neighbourhood?

      (5) About the isolation of the single units, here again, I think the manuscript lacks some technical details. The authors are saying that they are using a k-means cluster analysis with k=2. This means that the authors are explicitly looking for 2 clusters per electrode? If so, this is a really strong assumption that should not be held in the context of spike sorting, because, since it is a blind source separation technique, one cannot pre-determine in advance how many sources are present in the vicinity of a given electrode. While the illustration in Figure 2E is ok, there is no guarantee that one cannot find more clusters, so why this choice of k=2? Again, this is why most modern spike sorting pipelines do not rely on k-means, to avoid any hard-coded number of clusters. Can the authors comment on that?

      (6) I'm surprised by the linear decay of the maximal amplitude as a function of the distance from the soma, as shown in Figure 2H. Is it really what should be expected? Based on the properties of the extracellular media, shouldn't we expect a power law for the decay of the amplitude? This is strange that up to 100um away from the soma, the max amplitude only dropped from 260 to 240 uV. Can the authors comment on that? It would be interesting to plot that for all neurons recorded, in a normed manner V/max(V) as function of distances, to see what the curve looks like.

      (7) In Figure 3A, it seems that the total number of cells is rather low for such a large number of electrodes. What are the quality criteria that are used to keep these cells? Did the authors exclude some cells from the analysis, and if yes, what are the quality criteria that are used to keep cells? If no criteria are used (because none are mentioned in the Methods), then how come so few cells are detected, and can the authors convince us that these neurons are indeed "clean" units (RPVs, SNRs, ...)?

      (8) Still in Figure 3A, it looks like there is a bias to find inhibitory cells at the borders, since they do not appear to be uniformly distributed over the MEA. Can the authors comment on that? What would be the explanation for such a behaviour? It would be interesting to see some macroscopic quantities on Excitatory/Inhibitory cells, such as mean firing rates, averaged SNRs... Because again, in Figure 3C, it is not clear to me that the firing rates of inhibitory cells are higher than Excitatory ones, whilst they should be in theory.

      (9) For Figure 3 in general, I would have performed an exhaustive comparison of putative cells found by spikeMAP and other sorters. More precisely, I think that to prove the point that spikeMAP is indeed bringing something new to the field of spike sorting, the authors should have compared the performances of various spike sorters to discriminate Exc vs Inh cells based on their ground truth recordings. For example, either using Kilosort [Pachitariu et al, 2024, 10.1038/s41592-024-02232-7], or some other sorters that might be working with such large high-density data [Yger et al, 2018, 10.7554/eLife.34518].

      (10) Figure 4 has a big issue, and I guess the panels A and B should be redrawn. I don't understand what the red rectangle is displaying.

      (11) I understand that Figure 4 is only one example, but I have a hard time understanding from the manuscript how many slices/mice were used to obtain the GT data? I guess the manuscript could be enhanced by turning the data into an open-access dataset, but then some clarification is needed. How many flashes/animals/slices are we talking about? Maybe this should be illustrated in Figure 4, if this figure is devoted to the introduction of the GT data.

      (12) While there is no doubt that GT data as the ones recorded here by the authors are the most interesting data from a validation point of view, the pretty low yield of such experiments should not discourage the use of artificially generated recordings such as the ones made in [Buccino et al, 2020, 10.1007/s12021-020-09467-7] or even recently in [Laquitaine et al, 2024, 10.1101/2024.12.04.626805v1]. In these papers, the authors have putative waveforms/firing rate patterns for excitatory and inhibitory cells, and thus, the authors could test how good they are in discriminating the two subtypes.

    1. eLife Assessment

      This study provides important information on the ultrastructural organization of layer 1 of the human neocortex. The quantitative assessment of various synaptic parameters, astrocytic coverage and mitochondrial morphology is based on convincing experimental approaches. These data provide new information on the detailed morphology of human neocortical tissue that will be of interest to neuroscientists working on different network functions.

    2. Reviewer #1:

      Summary:

      The Authors investigated the anatomical features of the excitatory synaptic boutons in layer 1 of the human temporal neocortex. They examined the size of the synapse, the macular or the perforated appearance and the size of the synaptic active zone, the number and volume of the mitochondria, the number of the synaptic and the dense core vesicles, also differentiating between the readily releasable, the recycling and the resting pool of synaptic vesicles. The coverage of the synapse by astrocytic processes was also assessed, and all the above parameters were compared to other layers of the human temporal neocortex. The Authors conclude that the subcellular morphology of the layer 1 synapses is suitable for the functions of the neocortical layer, i.e. the synaptic integration within the cortical column. The low glial coverage of the synapses might allow the glutamate spillover from the synapses enhancing synaptic crosstalk within this cortical layer.

      Strengths:

      The strengths of this paper are the abundant and very precious data about the fine structure of the human neocortical layer 1. Quantitative electron microscopy data (especially those derived from the human brain) are very valuable, since this is highly time- and energy-consuming work. The techniques used to obtain the data, as well as the analyses and the statistics performed by the Authors, are all solid, strengthen this manuscript, and support the conclusions drawn in the discussion.

    3. Reviewer #2:

      The study of Rollenhagen et al examines the ultrastructural features of Layer 1 of human temporal cortex. The tissue was derived from drug-resistant epileptic patients undergoing surgery, and was selected as being further from the epilepsy focus, and as such considered to be non-epileptic. The analysis included 4 patients differing in age, sex, medication and onset of epilepsy. The manuscript is a follow-on study to 3 previous publications from the same authors on different layers of the temporal cortex:

      Layer 4 - Yakoubi et al 2019 eLife

      Layer 5 - Yakoubi et al 2019 Cerebral Cortex,

      Layer 6 - Schmuhl-Giesen et al 2022 Cerebral Cortex

      They find that the L1 synaptic boutons mainly have a single active zone and a very large pool of synaptic vesicles, and are mostly devoid of astrocytic coverage.

      Strengths:

      The MS is well written and easy to read. The Results section gives a detailed set of figures showing many morphological parameters of synaptic boutons and surrounding glial elements. The authors provide comparative data of all the layers examined by them so far in the Discussion. Given that anatomical data in human brain are still very limited, the current MS has substantial relevance. The work appears to be generally well done, the EM and EM tomography images are of very good quality. The analysis is clear and precise.

      Weaknesses:

      The authors made all the corrections required and answered all of my concerns, included additional data sets, and clarified statements where needed.

    4. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The Authors investigated the anatomical features of the excitatory synaptic boutons in layer 1 of the human temporal neocortex. They examined the size of the synapse, the macular or the perforated appearance and the size of the synaptic active zone, the number and volume of the mitochondria, the number of the synaptic and the dense core vesicles, also differentiating between the readily releasable, the recycling and the resting pool of synaptic vesicles. The coverage of the synapse by astrocytic processes was also assessed, and all the above parameters were compared to other layers of the human temporal neocortex. The Authors conclude that the subcellular morphology of the layer 1 synapses is suitable for the functions of the neocortical layer, i.e. the synaptic integration within the cortical column. The low glial coverage of the synapses might allow the glutamate spillover from the synapses enhancing synaptic crosstalk within this cortical layer.

      Strengths:

      The strengths of this paper are the abundant and very precious data about the fine structure of the human neocortical layer 1. Quantitative electron microscopy data (especially those derived from the human brain) are very valuable, since this is highly time- and energy-consuming work. The techniques used to obtain the data, as well as the analyses and the statistics performed by the Authors, are all solid, strengthen this manuscript, and support the conclusions drawn in the discussion.

      Comments on latest version:

      The third version of this paper has been substantially improved. The English is significantly better, there are only a few paragraphs and sentences which are hard to understand (see my comments and suggestions below). Almost all of my suggestions were incorporated.

      We would like to thank the reviewer for the comments and have incorporated the suggestions within the latest version of the manuscript.

      Remaining minor concerns:

      About epileptic and non-epileptic (non-affected) tissue. I am aware that temporal lobe neocortical tissue derived from epileptic patients is regarded as non-affected by many groups, and that it is quite similar to the cortex of non-epileptic (tumour) patients in its electrophysiological properties and synaptic physiology. But please note that one paper you cited did not use samples from epileptic patients, but only tissue from non-epileptic tumor patients (Molnár et al. PLOS 2008).

      When you look deeper and make a thorough comparison of tissues derived from epileptic and non-epileptic patients, there are differences in the fine structure, as well as in several electrophysiological features. See for example Tóth et al., J Physiol, 2018, where a higher density of excitatory synapses was found in L2 of neocortical samples derived from epileptic patients compared to non-epileptic (tumor) patients. Furthermore, the appearance of population bursts is similar, but their occurrence is more frequent and their amplitude is higher in tissue from epileptic compared to non-epileptic patients. So, I still cannot agree that temporal neocortex of epileptic patients with the seizure focus in the hippocampus would be non-affected. Therefore, I suggested using the term biopsy tissue.

      We are thankful for this comment on using non-epileptic tissue also by others. We are also aware that Molnár et al. 2008 worked with tumor tissue.

      It is still not emphasized in the first paragraph of the Discussion, that only excitatory axon terminals were investigated.

      We now mention in the first paragraph of the Discussion that only excitatory synaptic boutons were investigated.

      The text in the Results and the Discussion are somewhat inconsistent.

      The last two paragraphs of the Results section end with several sentences which should be part of the discussion, such as line 328: This finding strongly supports multivesicular release... or line 344: --- pointing towards a layer-specific regulation of the putative RRP. Moreover, the results suggest that... and line 370: ... it is most likely... Please, correct this.

      We disagree with the reviewer on these points because these sentences summarize the findings.

      The first paragraph of the Discussion summarizes the quantitative EM work and gives one conclusion about the astrocytic coverage. This last sentence is inconsistent with the other parts of the paragraph. I would either write that "astrocytic coverage was also investigated" (or something similar), or move this sentence to the paragraph which discusses the astrocytic coverage.

      Results line 180-183. "Special connections" between astrocytic processes and synaptic boutons are mentioned, but not shown. Either show these (but then prove with staining!), or leave out this paragraph.

      We deleted this paragraph as suggested.

      Reviewer #2 (Public review):

      Summary:

      The study by Rollenhagen et al. examines the ultrastructural features of Layer 1 of human temporal cortex. The tissue was derived from drug-resistant epileptic patients undergoing surgery and was taken from regions distant from the epileptic focus, and as such was considered to be non-epileptic. The analysis included 4 patients of different age, sex, medication and onset of epilepsy. The manuscript is a follow-on study to 3 previous publications from the same authors on different layers of the temporal cortex:

      Layer 4 - Yakoubi et al 2019 eLife

      Layer 5 - Yakoubi et al 2019 Cerebral Cortex,

      Layer 6 - Schmuhl-Giesen et al 2022 Cerebral Cortex

      They find that L1 synaptic boutons mainly have a single active zone and a very large pool of synaptic vesicles, and are mostly devoid of astrocytic coverage.

      Strengths:

      The MS is well written and easy to read. The Results section gives a detailed set of figures showing many morphological parameters of synaptic boutons and surrounding glial elements. The authors provide comparative data of all the layers examined by them so far in the Discussion. Given that anatomical data in human brain are still very limited, the current MS has substantial relevance. The work appears to be generally well done, and the EM and EM tomography images are of very good quality. The analysis is clear and precise.

      Weaknesses:

      The authors made all the corrections required and answered all of my concerns, included additional data sets, and clarified statements where needed.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Minor suggestions:

      Synaptic density, lines 189-193. If you say "comparatively" high, then compare to something (cite your own work for the other layers, and tell the approximative values for the other layers). Same in line 194 comparably high to what? Other option: say "relatively high".

      We corrected the sentences as suggested by the reviewer.

      Line 206: When present, mitochondria (comma missing)

      Corrected as suggested by the reviewer.

      Line 265: Dot is missing at the end of the sentence (after Shapira et al. 2003)

      Corrected as suggested by the reviewer.

      Lines 300-301: Check the English for this sentence: significant difference BETWEEN TWO sublaminae and not significant difference for both sublaminae.

      Corrected as suggested by the reviewer.

      Lines 304-305: Check the sentence, please, it is not understandable without the text in parenthesis.

      Corrected as suggested by the reviewer.

      Line 354 Dot missing at the end of the sentence (after Figure 6A, B)

      Corrected as suggested by the reviewer.

      Line 354-358: Please rephrase this sentence (too complicated, not understandable). I do not understand why results of the L4, L5, L6 are described here. What does it mean "Astrocytes and their fine processes formed a relatively dense, but a comparably loose network within the neuropil in L1"? Dense or loose?

      In the experiment measuring the volume fraction of astrocytic processes (Figure 6C), all six cortical layers were analyzed, thus we compared the values obtained for L1 with the results for L4, L5 and L6. For more clarity, we rephrased the sentence: “Astrocytes and their fine processes formed a relatively dense network in L4 and L5, but a comparably loose one within the neuropil in L1…” We also rephrased other sentences in this paragraph (as also suggested below).

      Lines 359-369: Please rephrase this paragraph. The sentences are too complicated, have too many parentheses, and are not understandable. I suggest to write first how many synapses were examined in L1 and L4, then how many of them were on spine and on dendrites (either n or %). Then give the values how many (n or %) of them were "tripartite synapses", out of spine synapses and of dendritic synapses in both layers. How many of them were partially covered in both layers. Please, write the data in a systematic way. The best would be to give the values in a table as well. This way it will be more understandable (now, it is chaotic, hard to follow).

      We rephrased the paragraph and added a new table (3).

      Line 383: Dot missing from the end of the sentence.

      Corrected as suggested by the reviewer.

      Line 436: Reconsider "comparably low compared to". The comparably means what in this case? The whole paragraph is hard to understand, please, check and review for improvements to the use of English or use chatGPT to check it.

      We corrected the sentence according to the reviewer’s suggestion.

      Line 487: Same thing again: "The comparably largest size of the RP in L1 when compared..." What would you like to say with "comparably"? Check the meaning of this word in a dictionary, please. I have the feeling that you are using this word instead of "relatively".

      Corrected as suggested by the reviewer.

      Line 488 "and TO that found for L4 and L5 in rodents..."

      Corrected as suggested by the reviewer.

      Line 493-495: Same again, comparably when compared, correct, please.

      Corrected as suggested by the reviewer.

      Supplemental figures: Now I do understand why Hu-01 and Hu-02 are twice, and I think, 3 patients were examined for L1a and three for L1b. But which side is which on the subfigures? Left side (Hu-01, 02 03) was used for L1a, or L1b? Could you write this in the legend, or mark on the figure (at least at one subfigure), please?

      We implemented a comment for clarity.

    1. eLife Assessment

      This theoretical study makes a useful contribution to our understanding of a subtype of type 2 diabetes - ketosis-prone diabetes mellitus (KPD) - with a potential impact on our broader understanding of diabetes and glucose regulation. The article presents an ordinary differential equation-based model for KPD that incorporates a number of distinct timescales - fast, slow, as well as intermediate, incorporating a key hypothesis of reversible beta cell deactivation. The presented evidence is solid and shows that observed clinical disease trajectories may be explained by a simple mathematical model in a particular parameter regime.

    2. Reviewer #1 (Public review):

      The goal of this work is to understand the clinical observation of a subgroup of diabetics who experience extremely high blood glucose levels after a period of high carbohydrate intake. These symptoms are similar to the onset of Type 1 diabetes but, crucially, have been observed to be fully reversible in some cases.

      The authors interpret these observations by analyzing a simple yet insightful mathematical model in which β-cells temporarily stop producing insulin when exposed to high levels of glucose. For a specific model realization of such dynamics (and for specific parameter values) they show that such dynamics lead to two distinct stable states. One is the relatively normal/healthy state in which β-cells respond appropriately to glucose by releasing insulin. In contrast, when enough β-cells "refuse" to produce insulin in a high-glucose environment, there is not enough insulin to reduce glucose levels, and the high-glucose state remains locked in because the high-glucose levels keep β-cells in their inactive state. The presented mathematical analysis shows that in their model the high-glucose state can be entered through an episode of high glucose levels and that subsequently the low-glucose state can be re-entered through prolonged insulin intake.

      The strength of this work is twofold. First, the intellectual sharpness of translating clinical observations of ketosis-prone type 2 diabetes (KPD) into the need for β-cell responses on intermediate timescales. Second, the analysis of a specific model clearly establishes that the clinical observations can be reproduced with a model in which β-cell dynamics reversibly enter a non-insulin-producing state in a glucose-dependent fashion.

      The likely impact of this work is a shift in attention in the field from a focus on the short and long-term dynamics in glucose regulation and diabetes progression to the intermediate timescales of β-cell dynamics. I expect this to lead to much interest in probing the assumptions behind the model to establish what exactly the process is by which patients enter a 'KPD state'. Furthermore, I expect this work to trigger much research on how KPD relates to "regular" type 2 diabetes and to lead to experimental efforts to find/characterize previously overlooked β-cell phenotypes.

      In summary, the authors claim that observed clinical dynamics and possible remission of KPD can be explained through introducing a temporarily inactive β-cell state into a "standard model" of diabetes. The evidence for this claim comes from analyzing a mathematical model and is clearly presented.

    3. Reviewer #2 (Public review):

      In this manuscript, Ridout et al. present an intriguing extension of beta cell mass-focused models for diabetes. Their model incorporates reversible glucose-dependent inactivation of beta cell mass, which can trigger sudden-onset hyperglycemia due to bistability in beta cell mass dynamics. Notably, this hyperglycemia can be reversed with insulin treatment. The model is simple, elegant, and thought-provoking.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public reviews:

      Concerning the grounding in experimental phenomenology, it would be beneficial to identify specific experiments to strengthen the model. In particular, what evidence supports reversible beta cell inactivation? This could potentially be tested in mice, for instance, by using an inducible beta cell reporter, treating the animals with high glucose levels, and then measuring the phenotype of the marked cells. Such experiments, if they exist, would make the motivation for the model more compelling.

      There is some direct evidence of reversible beta cell inactivation in rodent / in vitro models. We had already mentioned this in the discussion, but we have added some text emphasizing / clarifying the role of this evidence (lines 359–362).

      Others have also argued that some analyses of insulin treatment in conventional T2D, which has a stronger effect in patients with higher glucose before treatment, provide indirect evidence of reversal of glucotoxicity. We have also mentioned this in the revised paper (lines 284–285).

      For quantitative experiments, the authors should be more specific about the features of beta cell dysfunction in KPD. Does the dysfunction manifest in fasting glucose, glycemic responses, or both? Is there a “pre-KPD” condition? What is known about the disease’s timescale?

      The answers to some of these questions are not entirely clear—patients present with very high glucose, and thus must be treated immediately. Due to a lack of antecedent data it is not entirely clear what the pre-KPD condition is, but there is some evidence that KPD is at least not preceded by diabetes symptoms. This point is already noted in the introduction of the paper and Table 1. However, we have added a small note clarifying that this does not rule out mild hyperglycemia, as in prediabetes (and indeed, as our model might predict) (lines 76–77). Similarly, due to the necessity of immediate insulin treatment, it is not clear from existing data whether the disorder manifests more strongly in fasting glucose or glucose response, although it is likely in both. (We might infer this since continuous insulin treatment does not produce fasting hypoglycemia, and the complete lack of insulin response to glucose shortly after presentation should produce a strong effect in glycemic response.) We believe our existing description of KPD lists all of the relevant timescales, however we have also slightly clarified this description in response to the first referee’s comments (lines 66–73, 83)

      The authors should also consider whether their model could apply to other conditions besides KPD. For example, the phenomenology seems similar to the “honeymoon” phase of T1D. Making a strong case for the model in this scenario would be fascinating.

      This is an excellent idea, which had not occurred to us. We have briefly discussed this possibility in the revision (lines 281–291), but plan to analyze it in more detail in a future manuscript.

      Reviewer #1 (Recommendations for the author):

      Whenever simulation results are presented, parameter values should be specified right there in the figure captions.

      We have added the values of glucotoxicity parameters to the caption of Figure 2. In other figures, we have explicitly mentioned which panel of Figure 2 the parameters are taken from. Description of the non-glucotoxicity parameters is a bit cumbersome (there are a lot of them, but our model of fast dynamics is slightly different from Topp et al. so it does not suffice to simply say we took their parameters) so we have referred the reader to the Materials and Methods for those.

      I was confused by the language in Figure 4. Could the authors clarify whether they argue that: (1) the observed KPD behaviour is the result of the system switching from one stable state to another when perturbed with high glucose intake? (2) the observed KPD behaviour is the result of one of the steady states disappearing with high glucose intake?

      What we mean to say is that during a period of high sugar intake or exogenous insulin treatment, one of the fixed points is temporarily removed: it is still a fixed point of the “normal” dynamics, but not a fixed point of the dynamics with the external condition added. Because only the low (high)-β fixed point is present when glucose (insulin) intake is high enough, the dynamics flow toward that fixed point under the corresponding condition. When the external influx of glucose/insulin is turned off, both fixed points are present again, but if the dynamics have moved sufficiently far during the external forcing, the fixed point they end up in will have switched from one to the other. We have edited the text to make this clearer (lines 153–185). Do note, however, that in response to both referees’ comments (see below), Figures 3 and 4 have been replaced with more illuminating ones. This specific point is now addressed by the new Figure 3.
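
      The switching mechanism described in this response can be illustrated with a generic toy system. The sketch below is not the authors' model: it uses the textbook bistable equation dx/dt = x - x^3 + u(t), where x is a stand-in state variable and u(t) is a hypothetical external forcing. With u = 0 there are two stable fixed points (x = -1 and x = +1); with a large enough constant u only one remains. The function names, pulse amplitudes, and durations are all assumptions chosen for illustration.

```python
def simulate(x0, forcing, t_end=40.0, dt=0.01):
    """Euler-integrate dx/dt = x - x**3 + forcing(t), starting from x(0) = x0."""
    x, t = x0, 0.0
    while t < t_end:
        x += dt * (x - x**3 + forcing(t))
        t += dt
    return x


def pulse(amplitude, t_on, t_off):
    """Return a forcing that equals `amplitude` on [t_on, t_off) and 0 elsewhere."""
    return lambda t: amplitude if t_on <= t < t_off else 0.0


# No forcing: the state stays at the fixed point it started in.
no_forcing = simulate(-1.0, pulse(0.0, 0.0, 0.0))

# Brief forcing: the state is pushed, but not past the unstable point at x = 0,
# so it relaxes back to x = -1 once the forcing is switched off.
brief = simulate(-1.0, pulse(1.0, 5.0, 5.2))

# Sustained forcing: while it is on, x = -1 is no longer a fixed point and the
# state flows across x = 0; after switch-off it settles at x = +1 instead.
sustained = simulate(-1.0, pulse(1.0, 5.0, 15.0))

print(no_forcing, brief, sustained)
```

      In this toy picture, the forcing plays the role of the external glucose or insulin influx: both fixed points exist again once it is removed, and which one the system ends up in depends on how far the state moved while the forcing was on.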

      The adaptation of the prefactor ‘c’ was confusing to me. I think I understood it in the end, but it sounded like, “here’s a complication, but we don’t explain it because it doesn’t really matter”. I think the authors can explain this better (or potentially leave out the complication with ‘c’ altogether?).

      Indeed, the existence of an adaptation mechanism is important for our overall picture of diabetes pathogenesis, but not for many of our analyses, which assume prediabetes. Nonetheless, we agree that the current explanation of its role is confusing because of its vagueness. We have elaborated the explanation of the type of dynamics we assume for c, adding an equation for its dynamics to the “Model” section of the Materials and methods, explained in lines 456–465. We have also amended Figure 1 to note this compensation.

      I expect the main impact of this work will be to get clinical practitioners and biomedical researchers interested in the intermediate timescale dynamics of β-cells and take seriously the possibility that reversible inactive states might exist. But this impact will only be achieved when the results are clearly and easily understandable by an audience that is not familiar with mathematical modelling. I personally found it difficult to understand what I was supposed to see in the figures at first glance. Yes, the subtle points are indeed explained in the figure captions, but it might be advantageous to make the points visually so clear that a caption is barely needed. For example, when claiming that a change in parameters leads to bistability, why not plot the steady state values as a function of that parameter instead of showing curves from which one has to infer a steady state?

      I would advise the authors to reconsider their visual presentation by, e.g., presenting the figures to clinical practitioners or biomedical researchers with just a caption title to test whether such an audience can decipher the point of the figure! This is of course merely a personal suggestion that the authors may decide to ignore. I am making this suggestion only because I believe in the quality of this work and that improving the clarity of the figures and the ease with which one can understand the main points would potentially lead to a much larger impact on the presented results.

      This is a very good point. We have made several changes. Firstly, we have added smaller panels showing the dynamics of β to Figure 2; previously, the reader had to infer what was happening to β from G(t). Secondly, we have completely replaced the two figures showing dβ/dt, and requiring the reader to infer the fixed points of β, with bifurcation diagrams that simply show the fixed points of G and β. The new figures show through bifurcation diagrams how there are multiple fixed points in KPD, how glucose or insulin treatment force the switching of fixed points, and how the presence of bistability depends on the rate of glucotoxicity. (These new figures are Fig. 3–5 in the revised manuscript.)

      Could the authors explicitly point out what could be learned from their work for the clinic? At the moment treatment consists of giving insulin to patients. If I understand correctly, nothing about the current treatment would change if the model is correct. Is there maybe something more subtle that could be relevant to devising an optimal treatment for KPD patients?

      This is another very good point. We have added a new figure (Fig. 7) in our results section showing how this model, or one like it, can be analyzed to suggest an insulin treatment schedule (once parameters for an individual patient can be measured), and added some discussion of this point (lines 224–240) as well as lifestyle changes our model might suggest for KPD patients to the discussion (lines 413–425).

      Similarly, could the authors explicitly point out how their model could be experimentally tested? For example, are the functions f(G) and g(G) experimentally accessible? Related to that, presumably the shape of those functions matters to reproduce the observed behaviour. Could the authors comment on that / analyze how reproducing the observed behaviour puts constraints on the shape of the used functions and chosen parameter values?

      g(G) has not been carefully measured in cellular data; however, it could be in more quantitative versions of existing experiments. Further, our model indeed requires some general features for the forms of f(G) and g(G) to produce KPD-like phenomena. We have added some comment on this to the discussion section of the revised manuscript (lines 367–372).

      Could the authors explicitly spell out which parameters they think differ between individual KPD patients, and which parameters differ between KPD patients and ’regular’ type 2 diabetics?

      In general we expect that all parameters should vary both among KPD patients and between KPD and “conventional” T2D. The primary parameter determining whether KPD or conventional T2D is seen, however, is the ratio kIN/kRE. We have elaborated on both these points in the revised manuscript (lines 186–192, 250–257).

      I was confused about the timescale of remission. At one point the authors write “KPD patients can often achieve partial remission: after a few weeks or months of treatment with insulin” but later the authors state that “the duration of the remission varies from 6 months to 10 years”.

      The former timescale is the typical timescale to achieve remission. After remission is reached, however, it may or may not last: patients may experience a relapse, where their condition worsens and they again require insulin. We have edited the text to clarify this distinction (lines 66–73).

      When the authors talk about intermediate timescales in the main text could they specify an actual unit of time, such as days, weeks, or months as it would relate to the rate constants in their model for those transitions?

      We have done so (lines 86–87, figure 1 caption, figure 2 caption). Getting KPD-like behavior requires (at high glucose) the deactivation process to be somewhat faster than the reactivation process, so the relevant scales are between weeks (reactivation) and days (deactivation at high G).

      The authors state “Our simple model of β-cell adaptation also neglects the known hyperglycemia-induced leftward shift in the insulin secretion curve (f(G) in Eq. (2))”. This seems an important consideration. Could the authors comment on why they did not model this shift, and/or explicitly discuss how including it is expected to change the model dynamics?

      We agree that this process seems potentially relevant, as it seems to happen on a relatively fast timescale compared to glucose-induced β-cell death. It is, however, not so well characterized quantitatively that including it is a simple matter of putting in known values—we would be making assumptions that would complicate the interpretation of our results.

      It is clear that this effect will need to be considered when quantitatively modelling real patient data. However, it is also straightforward to argue that this effect by itself cannot produce KPD-like symptoms, and will only tend to reduce the rate of glucotoxicity necessary to produce bistability. We have added a discussion of this in the revisions (lines 307–315). We have also, in general, expanded the discussion of the effects that each neglected detail we have mentioned is expected to have (lines 292–315).

      The authors end with a statement that their results may “contribute to explanation of other observations that involve rapid onset or remission of diabetes-like phenomena, such as during pregnancy or for patients on very low calorie diets.” Could the authors spell out exactly how their model potentially relates to these phenomena?

      Our thinking is that, even when another direct cause, such as loss of insulin resistance, is implicated in reversal of diabetes, some portion of the effect may be explained by reversal of glucotoxicity. This is indeed at this point just a hypothesis, but we have expanded on it briefly in the revision. (Lines 281–291.)

      Minor typos:

      In Figure 2.D the last zero of 200 on the axis was cut off.

      Line 359 - there is a missing word “in the analysis”.

      We have fixed these typos, thanks.

      Reviewer #2 (Recommendations for the author):

      The manuscript could be significantly improved in two key areas: the presentation of the analysis, and the relation with experimental phenomenology.

      Regarding the analysis presentation, the figures could be substantially enhanced with minimal effort from the authors. At present, they are sparse, lack legends, and offer only basic analysis. The authors should consider presenting, for example, a bifurcation diagram for beta cell mass and fasting glucose levels as a function of kIN, and how insulin sensitivity and average meal intake modulate this relationship. The goal should be to present clear, testable predictions in an intuitive manner. Currently, the specific testable predictions of the model are unclear.

      The response to this question is copied from the responses to related questions from the first referee.

      This is a very good point. We have made several changes. Firstly, we have added smaller panels showing the dynamics of β to Figure 2; previously, the reader had to infer what was happening to β from G(t). Secondly, we have completely replaced the two figures showing dβ/dt, and requiring the reader to infer the fixed points of β, with bifurcation diagrams that simply show the fixed points of G and β. The new figures show through bifurcation diagrams how there are multiple fixed points in KPD, how glucose or insulin treatment force the switching of fixed points, and how the presence of bistability depends on the rate of glucotoxicity. We have also supplemented our phase diagram that shows the effects of SI and the total beta cell population with bifurcation diagrams showing β as SI and βTOT are varied. (These new figures are Fig. 3–5 in the present manuscript.) Finally, we have added another figure analyzing the model’s predictions for the optimal insulin treatment and the resulting time needed to achieve remission (Fig. 7).

    1. eLife Assessment

      The findings are important and intriguing, with theoretical or practical implications beyond a single subfield. The computational methods employed are clever and sophisticated and the strength of evidence is convincing. Both the hypotheses and the exploratory nature of additional analyses are clearly stated.

    2. Reviewer #1 (Public review):

      Summary:

      The authors use a sophisticated and novel task design and Bayesian computational modeling to test their hypothesis that information generalization (operationalized as a combination of self-insertion and social contagion) in social situations is disrupted in Borderline Personality Disorder. Their main finding relates to the observation that two different models best fit the two tested groups: While the model assuming both self-insertion and social contagion to be present when estimating others' social value preferences fit the control group best, a model assuming neither of these processes provided the best fit to BPD participants.

      Strengths:

      The two revisions have substantially strengthened the paper and the manuscript is much clearer and easier to follow now. The introduction now precisely states the authors' hypotheses, and the connections to the theoretical framework are presented with much greater clarity. I appreciate that the authors now clearly label exploratory analyses where applicable.

      The strengths of the presented work lie in the sophisticated task design and the thorough investigation of their theory by use of mechanistic computational models to elucidate social decision-making and learning processes in BPD. Although at present it is not clear whether the differing strategies in impression formation observed in BPD are in any way causal to negative outcomes in the condition, the study represents an important step towards better understanding cognitive processes in BPD. The paradigm and models are also potentially relevant for the investigation of other psychiatric conditions.

    3. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer 1:

      The authors frequently refer to their predictions and theory as being causal, both in the manuscript and in their response to reviewers. However, causal inference requires careful experimental design, not just statistical prediction. For example, the claim that "algorithmic differences between those with BPD and matched healthy controls" are "causal" in my opinion is not warranted by the data, as the study does not employ experimental manipulations or interventions which might predictably affect parameter values. Even if model parameters can be seen as valid proxies to latent mechanisms, this does not automatically mean that such mechanisms cause the clinical distinction between BPD and CON, they could plausibly also refer to the effects of therapy or medication. I recommend that such causal language, also implicit to expressions like "parameter influences on explicit intentional attributions", is toned down throughout the manuscript.

      Thank you for this chance to be clearer in the language. Our models and paradigm introduce a form of temporal causality, given that latent parameter distributions are directly influenced by latent parameter estimates at a previous point in time (self-uncertainty and other-uncertainty directly govern social contagion). Nevertheless, we appreciate the reviewer's perspective and have now toned down the language to reflect this.

      Abstract:

      ‘Our model makes clear predictions about the mechanisms of social information generalisation concerning both joint and individual reward.’

      Discussion:

      ‘We can simulate this by modelling a framework that incorporates priors based on both self and a strong memory impression of a notional other (Figure S3).’

      ‘We note a strength of this work is the use of model comparison to understand algorithmic differences between those with BPD and matched healthy controls.’

      Although the authors have now outlined the study's aims much more clearly, there still is a lack of clarity with respect to the authors' specific hypotheses. I understand that their primary predictions about disruptions to self-other generalisation processes underlying BPD are embedded in the four main models that are tested, but it is still unclear what specific hypotheses the authors had about group differences with respect to the tested models. I recommend the authors specify this in the introduction rather than referring to prior work where the same hypotheses may have been mentioned.

      Thank you for this further critique, which has enabled us to more clearly refine our introduction. We have now edited our introduction to be more direct about our hypotheses, that these hypotheses are instantiated into formal models, and what our predictions were. We have also included a small section on how previous predictions from other computational assessments of BPD link to our exploratory work, and highlighted this throughout the manuscript.

      ‘This paper seeks to address this gap by testing explicitly how disruptions in self-other generalization processes may underpin interpersonal disruptions observed in BPD. Specifically, our hypotheses were: (i) healthy controls will demonstrate evidence for both self-insertion and social contagion, integrating self and other information during interpersonal learning; and (ii) individuals with BPD will exhibit diminished self-other integration, reflected in stronger evidence for observations that assume distinct self-other representations.

      We tested these hypotheses by designing a dynamic, sequential, three-phase Social Value Orientation (Murphy & Ackerman, 2014) paradigm—the Intentions Game—that would provide behavioural signatures assessing whether BPD differed from healthy controls in these generalization processes (Figure 1A). We coupled this paradigm with a lattice of models (M1-M4) that distinguish between self-insertion and social contagion (Figure 1B), and performed model comparison:

      M1. Both self-to-other (self-insertion) and other-to-self (social contagion) occur before and after learning

      M2. Self-to-other transfer only occurs

      M3. Other-to-self transfer only occurs

      M4. Neither transfer process, suggesting distinct self-other representations

      We additionally ran exploratory analysis of parameter differences and model predictions between groups following from prior work demonstrating changes in prosociality (Hula et al., 2018), social concern (Henco et al., 2020), belief stability (Story et al., 2024a), and belief updating (Story, 2024b) in BPD to understand whether discrepancies in self-other generalisation influence observational learning. By clearly articulating our hypotheses, we aim to clarify the theoretical contribution of our findings to existing literature on social learning, BPD, and computational psychiatry.’

      Caveats should also be added about the exploratory nature of the many parameter group comparisons. If there are any predictions about group differences that can be made based on prior literature, the authors should make such links clear.

      Thank you for this. We have now included caveats in the text to highlight the exploratory nature of these group comparisons, and added direct links to relevant literature where able:

      Introduction

      ‘We additionally ran exploratory analysis of parameter differences and model predictions between groups following from prior work demonstrating changes in prosociality (Hula et al., 2018), social concern (Henco et al., 2020), belief stability (Story et al., 2024a), and belief updating (Story, 2024b) in BPD to understand whether discrepancies in self-other generalisation influence observational learning. By clearly articulating our hypotheses, we aim to clarify the theoretical contribution of our findings to existing literature on social learning, BPD, and computational psychiatry.’

      Model Comparison

      ‘We found that CON participants were best fit at the group level by M1 (Frequency = 0.59, Exceedance Probability = 0.98), whereas BPD participants were best fit by M4 (Frequency = 0.54, Exceedance Probability = 0.86; Figure 2A). This suggests CON participants are best fit by a model that fully integrates self and other when learning, whereas those with BPD are best explained as holding disintegrated and separate representations of self and other that do not transfer information back and forth.

      We first explore parameters between separate fits (see Methods). Later, in order to assuage concerns about drawing inferences from different models, we examined the relationships between the relevant parameters when we forced all participants to be fit to each of the models (in a hierarchical manner, separated by group). In sum, our model comparison is supported by convergence in parameter values when comparisons are meaningful (see Supplementary Materials). We refer to both types of analysis below.’
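
      Frequency and exceedance probability are the usual outputs of random-effects Bayesian model selection. Assuming that is the procedure behind the values quoted above, the following minimal sketch shows how an exceedance probability can be estimated by sampling from a Dirichlet posterior over model frequencies. The Dirichlet counts are hypothetical and are not the study's posterior; the function name is illustrative only.

```python
import numpy as np

def exceedance_probabilities(alpha, n_samples=100_000, seed=0):
    """Monte Carlo estimate of P(model k has the highest population frequency)."""
    rng = np.random.default_rng(seed)
    # Each row is one draw of model frequencies from Dirichlet(alpha).
    draws = rng.dirichlet(alpha, size=n_samples)
    # Count how often each model is the most frequent one in a draw.
    winners = draws.argmax(axis=1)
    counts = np.bincount(winners, minlength=len(alpha))
    return counts / n_samples

# Hypothetical Dirichlet counts for M1-M4 (assumption, for illustration only).
alpha = np.array([30.0, 8.0, 7.0, 6.0])
expected_frequency = alpha / alpha.sum()  # posterior mean frequency per model
xp = exceedance_probabilities(alpha)
```

      In this reading, the frequency is the estimated share of participants best described by a model, whereas the exceedance probability is the belief that this model is more frequent than every other model in the comparison.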

      Phase 2 analysis

      ‘Prior work predicts those with BPD should focus more intently on public social information, rather than private information that only concerns one party (Henco et al., 2020). In BPD participants, only new beliefs about the relative reward preferences – mutual outcomes for both players - of partners differed (see Fig 2E): new median priors were larger than median preferences in phase 1 (mean = -0.47; = -6.10, 95%HDI: -7.60, -4.60).’

      ‘Models of moral preference learning (Story et al., 2024) predict that BPD vs non-BPD participants have more rigid beliefs about their partners. We found that BPD participants were equally flexible around their prior beliefs about a partner’s relative reward preferences (= -1.60, 95%HDI: -3.42, 0.23), and were less flexible around their beliefs about a partner’s absolute reward preferences (=-4.09, 95%HDI: -5.37, -2.80), versus CON (Figure 2B).’

      Phase 3 analysis

      ‘Prior work predicts that human economic preferences are shaped by observation (Panizza, et al., 2021; Suzuki et al. 2016; Yu et al, 2021), although little-to-no work has examined whether contagion differs for relative vs. absolute preferences. Associative models predict that social contagion may be exaggerated in BPD (Ereira et al., 2018).… As a whole, humans are more susceptible to changing relative preferences more than selfish, absolute reward preferences, and this is disrupted in BPD.’

      Psychometric and Intentional Attribution analysis

      ‘Childhood trauma, persecution, and poor mentalising in BPD are all predicted to disrupt one’s ability to change (Fonagy & Luyten, 2009).’

      ‘Prior work has also predicted that partner-participant preference disparity influences mental state attributions (Barnby et al., 2022; Panizza et al., 2021).’

      I'm not sure I understand why the authors, after adding multiple comparison correction, now list two kinds of p-values. To me, this is misleading and defeats the purpose of multiple comparison corrections; I therefore recommend they report the FDR-adjusted p-values only. Likewise, if a corrected p-value is greater than 0.05 this should not be interpreted as a result.

      We have now adjusted the exploratory results to include only the FDR corrected values in the text.

      ‘We assessed conditional psychometric associations with social contagion under the assumption of M3 for all participants. We conducted partial correlation analyses to estimate relationships conditional on all other associations and retained all that survived bootstrapping (5000 reps), permutation testing (5000 reps), and subsequent FDR correction. When not controlled for group status, RGPTSB and CTQ scores were both moderately associated with MZQ scores (RGPTSB r = 0.41, 95%CI: 0.23, 0.60, p[fdr]=0.043; CTQ r = 0.354 95%CI: 0.13, 0.56, p[fdr]=0.02). This was not affected by group correction. CTQ scores were moderately and negatively associated with shifts in individualistic reward preferences (; r = -0.25, 95%CI: -0.46, -0.04, p[fdr]=0.03). This was not affected by group correction. MZQ scores were in turn moderately and negatively associated with shifts in prosocial-competitive preferences () between phase 1 and 3 (r = -0.26, 95%CI: -0.46, -0.06, p[fdr]=0.03). This was diminished when controlled for group status (r = 0.13, 95%CI: -0.34, 0.08, p[fdr]=0.20). Together this provides some evidence that self-reported trauma and self-reported mentalising influence social contagion (Fig S11). Social contagion under M3 was highly correlated with contagion under M1 demonstrating parsimony of outcomes across models (Fig S12).

      ‘Prior work has predicted that partner-participant preference disparity influences mental state attributions (Barnby et al., 2022; Panizza et al., 2021). We tested parameter influences on explicit intentional attributions in Phase 2 while controlling for group status. Attributions included the degree to which they believed their partner was motivated by harmful intent (HI) and self-interest (SI). In accordance with prior work (Barnby et al., 2022), greater disparity of absolute preferences before learning was associated on a trend level with reduced attributions of SI (<= -0.23, p[fdr]=0.08), and greater disparity of relative preferences before learning exaggerated attributions of HI (= 0.21, p[fdr]=0.08), but did not survive correction (Figure S4B). This is likely due to partners being significantly less individualistic and prosocial on average compared to participants (= -5.50, 95%HDI: -7.60, -3.60; = 12, 95%HDI: 9.70, 14.00); partners are recognised as less selfish and more competitive.’
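
      The response does not state which FDR procedure produced the p[fdr] values. Assuming the widely used Benjamini-Hochberg step-up method, the adjustment can be sketched as follows. The function name and the example p-values are hypothetical and are not taken from the manuscript.

```python
import numpy as np

def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values (assumed procedure, see lead-in)."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Scale the i-th smallest p-value by m / i.
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest p-value.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)
    return adjusted

# Hypothetical raw p-values from four exploratory comparisons.
raw = [0.01, 0.04, 0.03, 0.20]
adjusted = bh_adjust(raw)
```

      Reporting only the adjusted values, as the reviewer requests, means a comparison is interpreted only when its entry in `adjusted` falls below the chosen threshold.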

      Can the authors please elaborate why the algorithm proposed to be employed by BPD is more 'entropic', especially given both their self-priors and posteriors about partners' preferences tended to be more precise than the ones used by CON? As far as I understand, there's nothing in the data to suggest BPD predictions should be more uncertain. In fact, this leads me to wonder, similarly to what another reviewer has already suggested, whether BPD participants generate self-referential priors over others in the same way CON participants do, they are just less favourable (i.e., in relation to oneself, but always less prosocial) - I think there is currently no model that would incorporate this possibility? It should at least be possible to explore this by checking if there is any statistical relationship between the estimated θ_ppt^m and 〖p(θ〗_par |D^0).

      Thank you for this opportunity to be clearer in our wording. We believe the reviewer is referring to this line in the discussion: ‘In either case, the algorithm underlying the computational goal for BPD participants is far higher in entropy and emphasises a less stable or reliable process of inference.’

      We note in the revised Figure 2 panel E and in the results that those with BPD under M4 show insertion along absolute reward (they still expect diminished selfishness in others), but neutral priors over relative reward (around 0, suggesting expectations of neither prosocial nor competitive tendencies of others). Thus, θ_ppt^m (self preference) and θ_par^m (other preference) are tightly associated for absolute, but not relative reward.

      In our wording, we meant that whether under model M4 or M1, those with BPD either show a neutral prior over relative reward (M4) or a prior with large variance over relative reward (M1), showing expectations of difference between themselves and their partner. In both cases, expectation about a partner’s absolute reward preferences is diminished vs. CON participants. We have strengthened our language in the discussion to clarify this:

      ‘In either case, the algorithm underlying the computational goal for BPD participants is far higher in uncertainty, whether through a neutral central tendency (M4) or large variance (M1) prior over relative reward in phase 2, and emphasises a less certain and reliable expectation about others.’

      "To note, social contagion under M3 was highly correlated with contagion under M1 (see Fig S11). This provides some preliminary evidence that trauma impacts beliefs about individualism directly, whereas trauma and persecutory beliefs impact beliefs about prosociality through impaired trait mentalising" - I don't understand what the authors mean by this; can they please elaborate and add some explanation to the main text?

      We have now clarified this in the text:

      ‘Together this provides some evidence that self-reported trauma and self-reported mentalising influence social contagion (Fig S11). Social contagion under M3 was highly correlated with contagion under M1 demonstrating parsimony of outcomes across models (Fig S12).’

      I noted that at least some of the newly added references have not been added to the bibliography (e.g., Hitchcock et al. 2022).

      Thank you for noticing this omission. We have now ensured all cited works are in the reference list.

      Reviewer 2:

      ‘The paper is not based on specific empirical hypotheses formulated at the outset, but, rather, it uses an exploratory approach. Indeed, the task is not chosen in order to tackle specific empirical hypotheses. This, in my view, is a limitation since the introduction reads a bit vague and it is not always clear which gaps in the literature the paper aims to fill. As a further consequence, it is not always clear how the findings speak to previous theories on the topic.’

      As I wrote in the public review, however, I believe that an important limitation of this work is that it was not based on testing specific empirical hypotheses formulated at the outset, and on selecting the experimental paradigm accordingly. This is a limitation because it is not always clear which gaps in the literature the paper aims to fill. As a consequence, although it has improved substantially compared to the previous version, the introduction remains a bit vague. As a further consequence, it is not always clear how the findings speak to previous theories on the topic. Still, despite this limitation, the paper has many strengths, and I believe it is now ready for publication.

      Thank you for this further critique. We appreciate your appraisal that the work has improved substantially and is ready for publication. We have nevertheless opted to clarify our introduction and a priori predictions throughout the manuscript (please see response to Reviewer 1).

      Reviewer 3:

      Although the authors note that their approach makes "clear and transparent a priori predictions," the paper could be improved by providing a clear and consolidated statement of these predictions so that the results could be interpreted vis-a-vis any a priori hypotheses.

      In line with comments from both Reviewer 1 and 2, we have clarified our introduction to make it clear what our a priori predictions and hypotheses are about our core aims and exploratory analyses (see response to Reviewer 1).

      The approach of using a partial correlation network with bootstrapping (and permutation) was interesting, but the logic of the analysis was not clearly stated. In particular, there are large group (Table 1: CON vs. BPD) differences in the measures introduced into this network. As a result, it is hard to understand whether any partial correlations are driven primarily by mean differences in severity (correlations tend to be inflated in extreme groups designs due to the absence of observation in middle of scales forming each bivariate distribution). I would have found these exploratory analyses more revealing if group membership was controlled for.

      Thank you for this chance to be clearer in our methods. We have now written a more direct exposition of this exploratory method:

      ‘Exploratory Network Analysis

      To understand the individual differences of trait attributes (MZQ, RGPTSB, CTQ) with other-to-self information transfer () across the entire sample we performed a network analysis (Borsboom, 2021). Network analysis allows for conditional associations between variables to be estimated; each association is controlled for by all other associations in the network. It also allows for visual inspection of the conditional relationships to get an intuition for how variables are interrelated as a whole (see Fig S11). We implemented network analysis with the bootnet package in R using the ‘estimateNetwork’ function with partial correlations (Epskamp, Borsboom & Fried, 2018). To assess the stability of the partial correlations we further implemented bootstrap resampling with 5000 repetitions using the ‘bootnet’ function. We then additionally shuffled the data and refitted the network 5000 times to determine a p_permuted value; this indicates the probability that a conditional relationship in the original network was within the null distribution of each conditional relationship. We then performed False Discovery Rate correction on the resulting p-values. We additionally controlled for group status for all variables in a supplementary analysis (Table S4).’
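
      The logic of this analysis can be illustrated with a minimal NumPy sketch. It is not the bootnet pipeline used in the manuscript: it computes partial correlations from the inverse covariance matrix and derives a permutation p-value for each edge by shuffling every column independently, which is one assumed way of "shuffling the data". The simulated variables, sample size, and number of permutations are hypothetical.

```python
import numpy as np

def partial_correlations(data):
    """Partial correlation matrix from the inverse covariance (precision) matrix."""
    precision = np.linalg.inv(np.cov(data, rowvar=False))
    scale = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcor, 1.0)
    return pcor

def permutation_p_values(data, n_perm=5000, seed=0):
    """Observed partial correlations and two-sided permutation p-values per edge."""
    rng = np.random.default_rng(seed)
    observed = partial_correlations(data)
    exceed = np.zeros_like(observed)
    for _ in range(n_perm):
        # Shuffle each column independently to break all associations (assumption).
        shuffled = np.column_stack([rng.permutation(col) for col in data.T])
        exceed += np.abs(partial_correlations(shuffled)) >= np.abs(observed)
    # Add-one correction keeps p-values strictly above zero.
    return observed, (exceed + 1) / (n_perm + 1)

# Simulated data: x and y are related, z is independent of both.
rng = np.random.default_rng(1)
n = 200
x = rng.normal(size=n)
y = 0.6 * x + rng.normal(size=n)
z = rng.normal(size=n)
data = np.column_stack([x, y, z])
pcor, p_perm = permutation_p_values(data, n_perm=500)
```

      Group status could be handled in the same framework by adding a group indicator as a further column, so that every remaining edge is conditional on it; whether this matches the supplementary analysis in Table S4 is not stated in the response.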

      We have also further corrected for group status and reported these results as a supplementary table, and also within the main text alongside the main results. We have opted to relegate Figure 4 into a supplementary figure to make the text clearer.

      ‘We explored conditional psychometric associations with social contagion under the assumption of M3 for all participants (where everyone is able to be influenced by their partner). We conducted partial correlation analyses to estimate relationships conditional on all other associations and retained all that survived bootstrapping (5000 reps), permutation testing (5000 reps), and subsequent FDR correction. When not controlled for group status, RGPTSB and CTQ scores were both moderately associated with MZQ scores (RGPTSB r = 0.41, 95%CI: 0.23, 0.60, p[fdr]=0.043; CTQ r = 0.354 95%CI: 0.13, 0.56, p[fdr]=0.02). This was not affected by group correction. CTQ scores were moderately and negatively associated with shifts in individualistic reward preferences (; r = -0.25, 95%CI: -0.46, -0.04, p[fdr]=0.03). This was not affected by group correction. MZQ scores were in turn moderately and negatively associated with shifts in prosocial-competitive preferences () between phase 1 and 3 (r = -0.26, 95%CI: -0.46, -0.06, p[fdr]=0.03). This was diminished when controlled for group status (r = 0.13, 95%CI: -0.34, 0.08, p[fdr]=0.20). Together this provides some evidence that self-reported trauma and self-reported mentalising influence social contagion (Fig S11). Social contagion under M3 was highly correlated with contagion under M1 demonstrating parsimony of outcomes across models (Fig S12).’

      Discussion first para: "effected -> affected"

      Thanks for spotting this. We have now changed it.

      Add "s" to "participant": "Notably, despite differing strategies, those with BPD achieved similar accuracy to CON participant."

      We have now changed this.

    1. Author response:

      Reviewer #1 (Public review):

      Summary:

      Argunşah et al. describe and investigate the mechanisms underlying the differential response dynamics of barrel vs septa domains of the whisker-related primary somatosensory cortex (S1). Upon repeated stimulation, the authors report that the response ratio between multi- and single-whisker stimulation increases in layer (L) 4 neurons of the septal domain, while remaining constant in barrel L4 neurons. This difference is attributed to the short-term plasticity properties of interneurons, particularly somatostatin-expressing (SST+) neurons. This claim is supported by the increased density of SST+ neurons found in L4 of the septa compared to barrels, along with a stronger response of (L2/3) SST+ neurons to repeated multi- vs single-whisker stimulation. The role of the synaptic protein Elfn1 is then examined. Elfn1 KO mice exhibited little to no functional domain separation between barrel and septa, with no significant difference in single- versus multi-whisker response ratios across barrel and septal domains. Consistently, a decoder trained on WT data fails to generalize to Elfn1 KO responses. Finally, the authors report a relative enrichment of S2- and M1-projecting cell densities in L4 of the septal domain compared to the barrel domain.

      Strengths:

      This paper describes and aims to study a circuit underlying differential response between barrel columns and septal domains of the primary somatosensory cortex. This work supports the view that barrel and septal domains contribute differently to processing single versus multi-whisker inputs, suggesting that the barrel cortex multiplexes sensory information coming from the whiskers in different domains.

      We thank the reviewer for the very neat summary of our findings that barrel cortex multiplexes converging information in separate domains.

      Weaknesses:

      While the observed divergence in responses to repeated SWS vs MWS between the barrel and septal domains is intriguing, the presented evidence falls short of demonstrating that short-term plasticity in SST+ neurons critically underpins this difference. The absence of a mechanistic explanation for this observation limits the work's significance. The measurement of SST neurons' response is not specific to a particular domain, and the Elfn1 manipulation does not seem to be specific to either stimulus type or a particular domain.

      We appreciate the reviewer’s perspective. Although further research is needed to understand the circuit mechanisms underlying the observed phenomenon, we believe our data suggest that altering the short-term dynamics of excitatory inputs onto SST neurons reduces the divergent spiking dynamics in barrels versus septa during repetitive single- and multi-whisker stimulation. Future work could examine how SST neurons, whose somata reside in barrels and septa, respond to different whisker stimuli and the circuits in which they are embedded. At this time, however, the authors believe there is no alternative way to test how the short-term dynamics of excitatory inputs onto SST neurons, as a whole, contribute to the temporal aspects of barrel versus septa spiking.

      The study's reach is further constrained by the fact that results were obtained in anesthetized animals, which may not generalize to awake states.

      We appreciate the reviewer’s concern regarding the generalizability of our findings from anesthetized animals to awake states. Anesthesia was employed to ensure precise individual whisker stimulation (and multi-whisker stimulation in the same animal), which is challenging in awake rodents due to active whisking. While anesthesia may alter higher-order processing, core mechanisms, such as short- and long-term plasticity in the barrel cortex, are preserved under anesthesia (Martin-Cortecero et al., 2014; Mégevand et al., 2009).

      The statistical analysis appears inappropriate, with the use of repeated independent tests, dramatically boosting the false positive error rate.

      Thank you for your feedback on our analysis using independent rank-based tests for each time point in wild-type (WT) animals. To address concerns regarding multiple comparisons and temporal dependencies (so far for Figures 1F and 4D; we will extend this to further analyses in our revision), we performed a repeated measures ANOVA for WT animals (13 Barrel, 8 Septa, 20 time points), which revealed a significant main effect of Condition (F(1,19) = 16.33, p < 0.001) and a significant Condition-Time interaction (F(19,361) = 2.37, p = 0.001). Post hoc tests confirmed significant differences between Barrel and Septa at multiple time points (e.g., p < 0.0025 at times 3, 4, 6, 7, 8, 10, 11, 12, 16, 19 after Bonferroni post hoc correction), supporting a differential multi-whisker vs. single-whisker ratio response in WT animals. In contrast, a repeated measures ANOVA for knock-out (KO) animals (11 Barrel, 7 Septa, 20 time points) showed no significant main effect of Condition (F(1,14) = 0.17, p = 0.684) or Condition-Time interaction (F(19,266) = 0.73, p = 0.791), indicating that the Barrel-Septa difference observed in WT animals is absent in KO animals.
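
      The cutoff of p < 0.0025 quoted above is what a Bonferroni correction yields for 20 time points if the family-wise alpha is 0.05; the alpha level is an assumption, since the response does not state it. The short sketch below makes the arithmetic explicit. The p-values and function names are hypothetical and are not the study's data.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests

def significant_time_points(p_values, alpha=0.05):
    """1-based indices of time points whose p-value beats the corrected threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [t for t, p in enumerate(p_values, start=1) if p < threshold]

# 20 time points at an assumed family-wise alpha of 0.05.
threshold = bonferroni_threshold(0.05, 20)

# Hypothetical per-time-point p-values for illustration only.
hypothetical_p = [0.2] * 20
hypothetical_p[2] = 0.001   # time point 3
hypothetical_p[3] = 0.002   # time point 4
hypothetical_p[4] = 0.004   # time point 5, above the corrected threshold
hits = significant_time_points(hypothetical_p)
```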

      Furthermore, the manuscript suffers from imprecision; its conclusions are occasionally vague or overstated. The authors suggest a role for SST+ neurons in the observed divergence in SWS/MWS responses between barrel and septal domains. However, this remains speculative, and some findings appear inconsistent. For instance, the increased response of SST+ neurons to MWS versus SWS is not confined to a specific domain. Why, then, would preferential recruitment of SST+ neurons lead to divergent dynamics between barrel and septal regions? The higher density of SST+ neurons in septal versus barrel L4 is not a sufficient explanation, particularly since the SWS/MWS response divergence is also observed in layers 2/3, where no difference in SST+ neuron density is found.

      Moreover, SST+ neuron-mediated inhibition is not necessarily restricted to the layer in which the cell body resides. It remains unclear through which differential microcircuits (barrel vs septum) the enhanced recruitment of SST+ neurons could account for the divergent responses to repeated SWS versus MWS stimulation.

      We fully appreciate the reviewer’s comment. We currently do not provide any evidence on the contribution of SST neurons in the barrels versus septa in layer 4 to the response divergence of spiking observed in SWS versus MWS. We only show that these neurons are differentially distributed in the two domains in this layer. It is certainly known that there is molecular and circuit-based diversity of SST-positive neurons in different layers of the cortex, so it is plausible that this includes cells located in the two domains of vS1, something which has not been examined so far. Our data on their distribution are one piece of evidence that SST neurons may have a differential role in inhibiting barrel stellate cells versus septal ones. Morphological reconstructions of SST neurons in L4 of the somatosensory barrel cortex have shown that their dendrites and axons project locally and may be confined to individual domains, even though this was not specifically examined (Fig. 3 of Scala F et al., 2019). The same study also showed that L4 SST cells receive excitatory input from local stellate cells, and it is known that they are also directly excited by thalamocortical fibers (Beierlein et al., 2003; Tan et al., 2008); both of these inputs facilitate.

      As shown in our supplementary figure, the divergence is also observed in L2/3 where, as the reviewer also points out, we do not have a differential distribution of SST cells, at least based on a columnar analysis extending from L4. There are multiple scenarios that could explain this “discrepancy” that one would need to examine further in future studies. One straightforward one is that the divergence in spiking in L2/3 domains may be inherited from L4 domains, on which L4 SST neurons act. Another is that even though L2/3 SST neurons are not biased in their distribution, their input-output function is, something which one would need to examine by detailed in vitro electrophysiological and perhaps optogenetic approaches in S1. Despite the distinctive differences that have been found between the L4 circuitry in S1 and V1 (Scala F et al., 2019), recent observations indicate that small but regular patches of V1 marked by the absence of muscarinic receptor 2 (M2) have high temporal acuity (Ji et al., 2015), and selectively receive input from SST interneurons (Meier et al., 2025). Regions lacking M2 have distinct input and output connectivity patterns from those that express M2 (Meier et al., 2021; Burkhalter et al., 2023). These findings, together with ours, suggest that SST cells preferentially innervate and regulate specific domains -columns- in sensory cortices.

      Regardless of the mechanism, the Elfn1 knock-out mouse line almost exclusively affects the incoming excitability onto SST neurons (see also reply to comment below), hence what can be supported by our data is that changing the incoming short-term synaptic plasticity onto these neurons brings the spiking dynamics between barrels and septa closer together.

      The Elfn1 KO mouse model seems too unspecific to suggest the role of the short-term plasticity in SST+ neurons in the differential response to repeated SWS vs MWS stimulation across domains. Why would Elfn1-dependent short-term plasticity in SST+ neurons be specific to a pathway, or a stimulation type (SWS vs MWS)? Moreover, the authors report that Elfn1 knockout alters synapses onto VIP+ as well as SST+ neurons (Stachniak et al., 2021; previous version of this paper)-so why attribute the phenotype solely to SST+ circuitry? In fact, the functional distinctions between barrel and septal domains appear largely abolished in the Elfn1 KO.

      Previous work by others and us has shown that globally removing Elfn1 selectively removes a synaptic process from the brain without altering brain anatomy or structure. This allows us to study how the temporal dynamics of inhibition shape activity, as opposed to inhibition from particular cell types. We will nevertheless update the text to discuss more global implications for SST interneuron dynamics and include a reference to VIP interneurons that contain Elfn1.

      When comparing SWS to MWS, we find that MWS replaces the neighboring excitation which would normally be preferentially removed by short-term plasticity in SST interneurons, thus providing a stable control comparison across animals and genotypes. On average, VIP interneurons failed to show modulation by MWS. We were unable to measure a substantial contribution of VIP cells to this process and also note that the Elfn1 expressing multipolar neurons comprise only ~5% of VIP neurons (Connor and Peters, 1984; Stachniak et al., 2021), a fraction that may be lost when averaging from 138 VIP cells. Moreover, the effect of Elfn1 loss on VIP neurons is quite different and marginal compared to that of SST cells, suggesting that the primary impact of Elfn1 knockout is mediated through SST+ interneuron circuitry. Therefore, even if we cannot rule out that these 5% of VIP neurons contribute to barrel domain segregation, we are of the opinion that their influence would be very limited if any.

      Reviewer #2 (Public review):

      Summary:

      Argunsah and colleagues demonstrate that SST-expressing interneurons are concentrated in the mouse septa and differentially respond to repetitive multi-whisker inputs. Identifying how a specific neuronal phenotype impacts responses is an advance.

      Strengths:

      (1) Careful physiological and imaging studies.

      (2) Novel result showing the role of SST+ neurons in shaping responses.

      (3) Good use of a knockout animal to further the main hypothesis.

      (4) Clear analytical techniques.

      We thank the reviewer for their appreciation of the study.

      Weaknesses:

      No major weaknesses were identified by this reviewer. Overall, I appreciated the paper but feel it overlooked a few issues and had some recommendations on how additional clarifications could strengthen the paper. These include:

      (1) Significant work from Jerry Chen on how S1 neurons that project to M1 versus S2 respond in a variety of behavioral tasks should be included (e.g. PMID: 26098757). Similarly, work from Barry Connor's lab on intracortical versus thalamocortical inputs to SST neurons, as well as excitatory inputs onto these neurons (e.g. PMID: 12815025) should be included.

      We thank the reviewer for these valuable resources that we overlooked. We will include Chen et al. (2015), Cruikshank et al. (2007) and Gibson et al. (1999) to contextualize S1 projections and SST+ inputs, strengthening the study’s foundation, as well as Beierlein et al. (2003), which nicely shows both local and thalamocortical facilitation of excitatory inputs onto L4 SST neurons, in contrast to PV cells. The paper also shows the gradual recruitment of SST neurons by thalamocortical inputs to provide feed-forward inhibition onto stellate cells (regular spiking) of the barrel cortex L4 in rat.

      (2) Using Layer 2/3 as a proxy to what is happening in layer 4 (~line 234). Given that layer 2/3 cells integrate information from multiple barrels, as well as receiving direct VPm thalamocortical input, and given the time window that is being looked at can receive input from other cortical locations, it is not clear that layer 2/3 is a proxy for what is happening in layer 4.

      We agree with the reviewer that what we observe in L2/3 is not necessarily what is taking place in L4 SST-positive cells. The data on L2/3 were included to show that these cells, as a population, can show divergent responses when it comes to SWS vs MWS, which is not seen in L2/3 VIP neurons. Regardless of the mechanisms underlying it, our overall data support that SST-positive neurons can change their activation based on the type of whisker stimulus, and that when the excitatory input dynamics onto these neurons change due to the removal of Elfn1, the recruitment of barrel vs septa spiking changes in the temporal domain. Having said that, the data shown in Supplementary Figure 3 on the response properties of L2/3 neurons above the septa vs above the barrels (one would say in the respective columns) do show the same divergence as in L4. This suggests that a circuit motif may exist that is common to both layers, involving SST neurons that sit in L4, L5 or even L2/3. This implies that despite the differences in the distribution of SST neurons in septa vs barrels of L4 there is an unidentified input-output spatial connectivity motif that is engaged in both L2/3 and L4. Please also see our response to a similar point raised by reviewer 1.

      (3) Line 267, when discussing distinct temporal response, it is not well defined what this is referring to. Are the neurons no longer showing peaks to whisker stimulation, or are the responses lasting a longer time? It is unclear why PV+ interneurons which may not be impacted by the Elfn1 KO and receive strong thalamocortical inputs, are not constraining activity.

      We thank the reviewer for their comment and will clarify the statement.

      This convergence of response profiles was also evident in stimulus-aligned stacked images, where the emergent differences between barrels and septa under SWS were largely abolished in the KO (Figure 4B). A distinction between directly stimulated barrels and neighboring barrels persisted in the KO. In addition, the initial response continued to differ between barrel and septa, and also between septa and neighbor (Figure 4B). This initial stimulus selectivity potentially represents distinct feedforward thalamocortical activity, which includes PV+ interneuron recruitment that is not directly impacted by the Elfn1 KO (Sun et al., 2006; Tan et al., 2008). PV+ cells are strongly excited by thalamocortical inputs, but these exhibit short-term depression, as does their output, contrasting with the sustained facilitation observed in SST+ neurons. These findings suggest that in WT animals, activity spillover from principal barrels is normally constrained by the progressive engagement of SST+ interneurons in septal regions, driven by Elfn1-dependent facilitation at their excitatory synapses. In the absence of Elfn1, this local inhibitory mechanism is disrupted, leading to longer responses in barrels, delayed but stronger responses in septa, and persistently stronger responses in unstimulated neighbors, resulting in a loss of distinction between the responses of barrel and septa domains that normally diverge over time (see Author response image 1 below).

      Author response image 1.

      A) Barrel responses are longer following whisker stimulation in KO. B) Septal responses are slightly delayed but stronger in KO. C) Unstimulated neighbors show longer persistent responses in KO.

      (4) Line 585 "the earliest CSD sink was identified as layer 4..." were post-hoc measurements made to determine where the different shank leads were based on the post-hoc histology?

      Post hoc histology was performed on plane-aligned brain sections, which allowed us to detect barrels and septa and so confirm the insertion domain of each recorded shank. The layer specificity of each electrode could therefore not be confirmed by histology, as we did not have coronal sections in which to measure electrode depth.

      (5) For the retrograde tracing studies, how were the M1 and S2 injections targeted (stereotaxically or physiologically)? How was it determined that the injections were in the whisker region (or not)?

      During the retrograde virus injection, the location of M1 and S2 injections was determined by stereotaxic coordinates (Yamashita et al., 2018). After acquiring the light-sheet images, we were able to post hoc examine the injection site in 3D and confirm that the injections were successful in targeting the regions intended. Although it would have been informative to do so, we did not functionally determine the whisker-related M1 and whisker-related S2 region in this experiment.

      (6) Were there any baseline differences in spontaneous activity in the septa versus barrel regions, and did this change in the KO animals?

      Thank you for this interesting question. Our previous study found that there was a reduction in baseline activity in L4 barrel cortex of KO animals at postnatal day (P)12, but no differences were found at P21 (Stachniak et al., 2023).

      Reviewer #3 (Public review):

      Summary:

      This study investigates the functional differences between barrel and septal columns in the mouse somatosensory cortex, focusing on how local inhibitory dynamics, particularly involving Elfn1-expressing SST⁺ interneurons, may mediate temporal integration of multi-whisker (MW) stimuli in septa. Using a combination of in vivo multi-unit recordings, calcium imaging, and anatomical tracing, the authors propose that septa integrate MW input in an Elfn1-dependent manner, enabling functional segregation from barrel columns.

      Strengths:

      The core hypothesis is interesting and potentially impactful. While barrels have been extensively characterized, septa remain less understood, especially in mice, and this study's focus on septal integration of MW stimuli offers valuable insights into this underexplored area. If septa indeed act as selective integrators of distributed sensory input, this would add a novel computational role to cortical microcircuits beyond what is currently attributed to barrels alone. The narrative of this paper is intellectually stimulating.

      We thank the reviewer for finding the study intellectually stimulating.

      Weaknesses:

      The methods used in the current study lack the spatial and cellular resolution needed to conclusively support the central claims. The main physiological findings are based on unsorted multi-unit activity (MUA) recorded via low-channel-count silicon probes. MUA inherently pools signals from multiple neurons across different distances and cell types, making it difficult to assign activity to specific columns (barrel vs. septa) or neuron classes (e.g., SST⁺ vs. excitatory).

      The recording radius (~50-100 µm or more) and the narrow width of septa (~50-100 µm or less) make it likely that MUA from "septal" electrodes includes spikes from adjacent barrel neurons.

      The authors do not provide spike sorting, unit isolation, or anatomical validation that would strengthen spatial attribution. Calcium imaging is restricted to SST⁺ and VIP⁺ interneurons in superficial layers (L2/3), while the main MUA recordings are from layer 4, creating a mismatch in laminar relevance.

      We thank the reviewer for pointing out the possibility of contamination in septal electrodes. Although it is reported in the Methods (line 583), we may not have highlighted sufficiently that we used a very high threshold (7.5 std) for spike detection in order to limit such spatial contamination. Since spike amplitude decays rapidly with distance, at high thresholds only nearby neurons, potentially one or two, contribute to our analysis. We believe that this approach provides a close approximation of single-unit activity (SUA) in our reported data. We will include a sentence earlier in the manuscript to make this explicit and prevent further confusion.
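
      As a rough illustration of the high-threshold detection described above, the following Python sketch flags negative-going threshold crossings at 7.5 standard deviations. It is not the authors' pipeline: the function name `detect_spikes`, the use of the plain standard deviation of the trace as the noise estimate, the negative spike polarity, and the 1 ms refractory window are all assumptions made for illustration.

```python
import numpy as np


def detect_spikes(signal, fs, n_std=7.5, refractory_ms=1.0):
    """Return (spike_indices, threshold) for negative threshold crossings.

    Hypothetical helper, not taken from the manuscript.
    signal: 1-D band-pass-filtered extracellular trace (assumed).
    fs: sampling rate in Hz.
    n_std: threshold in units of the trace's standard deviation.
    refractory_ms: minimum spacing between accepted crossings (assumed).
    """
    signal = np.asarray(signal, dtype=float)
    # Assumption: noise level estimated as the plain std of the whole trace.
    threshold = -n_std * np.std(signal)
    below = signal < threshold
    # A crossing onset is a sample below threshold whose predecessor is not.
    onsets = np.flatnonzero(below[1:] & ~below[:-1]) + 1
    refractory = int(round(refractory_ms * 1e-3 * fs))
    kept = []
    last = -np.inf
    for i in onsets:
        # Drop crossings that fall within the refractory window of the last one.
        if i - last > refractory:
            kept.append(int(i))
            last = i
    return np.array(kept, dtype=int), threshold
```

      With a threshold this high, only large-amplitude events survive, which is the intuition behind the argument that few, nearby units contribute; the sketch does not by itself establish how many neurons are pooled.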

      Regarding the point that calcium imaging was performed on L2/3 SST and VIP cells instead of L4, reviewers 1 and 2 brought up the same issue, and we responded as follows. As shown in our supplementary figure, the divergence is also observed in L2/3, where we do not have a differential distribution of SST cells, at least based on a columnar analysis extending from L4. There are multiple scenarios that could explain this “discrepancy”, which would need to be examined in future studies. One straightforward scenario is that the divergence in spiking in L2/3 domains may be inherited from the L4 domains on which L4 SST cells act. Another is that even though L2/3 SST neurons are not biased in their distribution, their input-output function is, something that would need to be examined by detailed in vitro electrophysiological and perhaps optogenetic approaches in S1. Despite the distinctive differences that have been found between the L4 circuitry in S1 and V1 (Scala et al., 2019), recent observations indicate that small but regular patches of V1 marked by the absence of muscarinic receptor 2 (M2) have high temporal acuity (Ji et al., 2015), and selectively receive input from SST interneurons (Meier et al., 2025). Regions lacking M2 have distinct input and output connectivity patterns from those that express M2 (Meier et al., 2021; Burkhalter et al., 2023). These findings, together with ours, suggest that SST cells preferentially innervate and regulate specific domains (columns) in sensory cortices.

      Furthermore, while the role of Elfn1 in mediating short-term facilitation is supported by prior studies, no new evidence is presented in this paper to confirm that this synaptic mechanism is indeed disrupted in the knockout mice used here.

      We thank Reviewer #3 for noting the absence of new evidence confirming that Elfn1-dependent short-term facilitation is disrupted in our knockout mice. We acknowledge that our study relies on strong previously published data demonstrating that Elfn1 mediates short-term synaptic facilitation of excitatory inputs onto SST+ interneurons (Sylwestrak and Ghosh, 2012; Tomioka et al., 2014; Stachniak et al., 2019, 2023). These studies consistently show that Elfn1 knockout abolishes facilitation at SST+ synapses, leading to altered temporal dynamics, which we hypothesize underlies the observed loss of barrel-septa response divergence in our Elfn1 KO mice (Figure 4). Nevertheless, to address the point raised, we will clarify in the revised manuscript (around lines 245-247 and 271-272) that our conclusions are based on these established findings, stating: “Building on prior evidence that Elfn1 knockout disrupts short-term facilitation in SST+ interneurons (Sylwestrak and Ghosh, 2012; Tomioka et al., 2014; Stachniak et al., 2019, 2023), we attribute the abolished barrel-septa divergence in Elfn1 KO mice to altered SST+ synaptic dynamics, though direct synaptic measurements were not performed here.”

      Additionally, since Elfn1 is constitutively knocked out from development, the possibility of altered circuit formation, including changes in barrel structure and interneuron distribution, cannot be excluded and is not addressed.

      We thank Reviewer #3 for raising the valid concern that constitutive Elfn1 knockout could potentially alter circuit formation, including barrel structure and interneuron distribution. To address this, we will clarify in the revised manuscript (around line 271 and in the Discussion) that in our previous studies, which included both whole-cell patch-clamp in acute brain slices ranging from postnatal day 11 to 22 (P11 - P21) and in vivo recordings from barrel cortex at P12 and P21, we saw no gross abnormalities in barrel structure, with Layer 4 barrels maintaining their characteristic size and organization, consistent with wild-type (WT) mice (Stachniak et al., 2019, 2023). While we cannot fully exclude subtle developmental changes, prior studies indicate that Elfn1 primarily modulates synaptic function rather than cortical cytoarchitecture (Tomioka et al., 2014). Elfn1 KO mice show no gross morphological or connectivity differences, and the pattern and abundance of Elfn1-expressing cells (assessed by LacZ knock-in) appear normal (Dolan and Mitchell, 2013).

      We will add the following to the Discussion: “Although Elfn1 is constitutively knocked out, we find here and in previous studies that barrel structure is preserved (Stachniak et al., 2019, 2023). Further, the distribution of Elfn1-expressing interneurons is not different in KO mice, suggesting minimal developmental disruption (Dolan and Mitchell, 2013). Nonetheless, we acknowledge that subtle circuit changes cannot be ruled out without the use of a time-dependent conditional knockout of the gene.”

      References

      (1) Beierlein, M., Gibson, J. R. & Connors, B. W. (2003). Two dynamically distinct inhibitory networks in layer 4 of the neocortex. J. Neurophysiol. 90, 2987–3000.

      (2) Burkhalter, A., D’Souza, R. D. & Ji, W. (2023). Integration of feedforward and feedback information streams in the modular architecture of mouse visual cortex. Annu. Rev. Neurosci. 46, 259–280.

      (3) Chen, J. L., Margolis, D. J., Stankov, A., Sumanovski, L. T., Schneider, B. L. & Helmchen, F. (2015). Pathway-specific reorganization of projection neurons in somatosensory cortex during learning. Nat. Neurosci. 18, 1101–1108.

      (4) Connor, J. R. & Peters, A. (1984). Vasoactive intestinal polypeptide-immunoreactive neurons in rat visual cortex. Neuroscience 12, 1027–1044.

      (5) Cruikshank, S. J., Lewis, T. J. & Connors, B. W. (2007). Synaptic basis for intense thalamocortical activation of feedforward inhibitory cells in neocortex. Nat. Neurosci. 10, 462–468.

      (6) Dolan, J. & Mitchell, K. J. (2013). Mutation of Elfn1 in mice causes seizures and hyperactivity. PLoS One 8, e80491.

      (7) Gibson, J. R., Beierlein, M. & Connors, B. W. (1999). Two networks of electrically coupled inhibitory neurons in neocortex. Nature 402, 75–79.

      (8) Ji, W., Gămănuţ, R., Bista, P., D’Souza, R. D., Wang, Q. & Burkhalter, A. (2015). Modularity in the organization of mouse primary visual cortex. Neuron 87, 632–643.

      (9) Martin-Cortecero, J. & Nuñez, A. (2014). Tactile response adaptation to whisker stimulation in the lemniscal somatosensory pathway of rats. Brain Res. 1591, 27–37.

      (10) Mégevand, P., Troncoso, E., Quairiaux, C., Muller, D., Michel, C. M. & Kiss, J. Z. (2009). Long-term plasticity in mouse sensorimotor circuits after rhythmic whisker stimulation. J. Neurosci. 29, 5326–5335.

      (11) Meier, A. M., Wang, Q., Ji, W., Ganachaud, J. & Burkhalter, A. (2021). Modular network between postrhinal visual cortex, amygdala, and entorhinal cortex. J. Neurosci. 41, 4809–4825.

      (12) Meier, A. M., D’Souza, R. D., Ji, W., Han, E. B. & Burkhalter, A. (2025). Interdigitating modules for visual processing during locomotion and rest in mouse V1. bioRxiv 2025.02.21.639505.

      (13) Scala, F., Kobak, D., Shan, S., Bernaerts, Y., Laturnus, S., Cadwell, C. R., Hartmanis, L., Froudarakis, E., Castro, J. R., Tan, Z. H., et al. (2019). Layer 4 of mouse neocortex differs in cell types and circuit organization between sensory areas. Nat. Commun. 10, 4174.

      (14) Stachniak, T. J., Sylwestrak, E. L., Scheiffele, P., Hall, B. J. & Ghosh, A. (2019). Elfn1-induced constitutive activation of mGluR7 determines frequency-dependent recruitment of somatostatin interneurons. J. Neurosci. 39, 4461–4475.

      (15) Stachniak, T. J., Kastli, R., Hanley, O., Argunsah, A. Ö., van der Valk, E. G. T., Kanatouris, G. & Karayannis, T. (2021). Postmitotic Prox1 expression controls the final specification of cortical VIP interneuron subtypes. J. Neurosci. 41, 8150–8166.

      (16) Stachniak, T. J., Argunsah, A. Ö., Yang, J. W., Cai, L. & Karayannis, T. (2023). Presynaptic kainate receptors onto somatostatin interneurons are recruited by activity throughout development and contribute to cortical sensory adaptation. J. Neurosci. 43, 7101–7118.

      (17) Sun, Q.-Q., Huguenard, J. R. & Prince, D. A. (2006). Barrel cortex microcircuits: Thalamocortical feedforward inhibition in spiny stellate cells is mediated by a small number of fast-spiking interneurons. J. Neurosci. 26, 1219–1230.

      (18) Sylwestrak, E. L. & Ghosh, A. (2012). Elfn1 regulates target-specific release probability at CA1-interneuron synapses. Science 338, 536–540.

      (19) Tan, Z., Hu, H., Huang, Z. J. & Agmon, A. (2008). Robust but delayed thalamocortical activation of dendritic-targeting inhibitory interneurons. Proc. Natl. Acad. Sci. USA 105, 2187–2192.

      (20) Tomioka, N. H., Yasuda, H., Miyamoto, H., Hatayama, M., Morimura, N., Matsumoto, Y., Suzuki, T., Odagawa, M., Odaka, Y. S., Iwayama, Y., et al. (2014). Elfn1 recruits presynaptic mGluR7 in trans and its loss results in seizures. Nat. Commun. 5, 4501.

      (21) Yamashita, T., Vavladeli, A., Pala, A., Galan, K., Crochet, S., Petersen, S. S. & Petersen, C. C. (2018). Diverse long-range axonal projections of excitatory layer 2/3 neurons in mouse barrel cortex. Front. Neuroanat. 12, 33.

    2. eLife Assessment

      Argunşah et al. investigate the mechanisms underlying the differential response dynamics of barrel vs septa domains in shaping the responses to single vs multiple whiskers. Based on the observation of a higher density of SST+ interneurons in the septa, the authors investigate the hypothesis that Elfn1-dependent short-term plasticity shapes these responses. This important study is, however, supported by incomplete evidence; factors restricting the strength of evidence are the limited spatial resolution of the multi-unit activity, as well as the lack of a mechanistic explanation. This provocative and intellectually stimulating hypothesis provides a contribution to work on how different cell types shape cortical representation.

    3. Reviewer #1 (Public review):

      Summary:

      Argunşah et al. describe and investigate the mechanisms underlying the differential response dynamics of barrel vs septa domains of the whisker-related primary somatosensory cortex (S1). Upon repeated stimulation, the authors report that the response ratio between multi- and single-whisker stimulation increases in layer (L) 4 neurons of the septal domain, while remaining constant in barrel L4 neurons. This difference is attributed to the short-term plasticity properties of interneurons, particularly somatostatin-expressing (SST+) neurons. This claim is supported by the increased density of SST+ neurons found in L4 of the septa compared to barrels, along with a stronger response of (L2/3) SST+ neurons to repeated multi- vs single-whisker stimulation. The role of the synaptic protein Elfn1 is then examined. Elfn1 KO mice exhibited little to no functional domain separation between barrel and septa, with no significant difference in single- versus multi-whisker response ratios across barrel and septal domains. Consistently, a decoder trained on WT data fails to generalize to Elfn1 KO responses. Finally, the authors report a relative enrichment of S2- and M1-projecting cell densities in L4 of the septal domain compared to the barrel domain.

      Strengths:

      This paper describes and aims to study a circuit underlying differential response between barrel columns and septal domains of the primary somatosensory cortex. This work supports the view that barrel and septal domains contribute differently to processing single versus multi-whisker inputs, suggesting that the barrel cortex multiplexes sensory information coming from the whiskers in different domains.

      Weaknesses:

      While the observed divergence in responses to repeated SWS vs MWS between the barrel and septal domains is intriguing, the presented evidence falls short of demonstrating that short-term plasticity in SST+ neurons critically underpins this difference. The absence of a mechanistic explanation for this observation limits the work's significance. The measurement of SST neurons' response is not specific to a particular domain, and the Elfn1 manipulation does not seem to be specific to either stimulus type or a particular domain. The study's reach is further constrained by the fact that results were obtained in anesthetized animals, which may not generalize to awake states. The statistical analysis appears inappropriate, with the use of repeated independent tests, dramatically boosting the false positive error rate. Furthermore, the manuscript suffers from imprecision; its conclusions are occasionally vague or overstated.
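
      To make the concern about repeated independent tests concrete, the short Python sketch below computes how the family-wise false-positive rate grows with the number of uncorrected tests, and applies a Holm-Bonferroni step-down correction as one standard remedy. It is an illustration only: the function names are hypothetical, the growth formula assumes the tests are independent, and Holm's procedure is not necessarily the correction the reviewer or the authors have in mind.

```python
def familywise_error_rate(alpha, m):
    """Probability of at least one false positive across m independent tests,
    each run at significance level alpha, when all null hypotheses are true."""
    return 1.0 - (1.0 - alpha) ** m


def holm_bonferroni(p_values, alpha=0.05):
    """Holm's step-down procedure.

    Returns a list of booleans (True = reject the null) in the same order
    as p_values, controlling the family-wise error rate at alpha.
    """
    m = len(p_values)
    # Visit the tests from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        # The k-th smallest p-value is compared against alpha / (m - k + 1).
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # once one test fails, all larger p-values also fail
    return reject
```

      For example, ten uncorrected independent tests at alpha = 0.05 give a family-wise error rate of about 0.40, which is why a correction across the family of comparisons matters.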

      The authors suggest a role for SST+ neurons in the observed divergence in SWS/MWS responses between barrel and septal domains. However, this remains speculative, and some findings appear inconsistent. For instance, the increased response of SST+ neurons to MWS versus SWS is not confined to a specific domain. Why, then, would preferential recruitment of SST+ neurons lead to divergent dynamics between barrel and septal regions? The higher density of SST+ neurons in septal versus barrel L4 is not a sufficient explanation, particularly since the SWS/MWS response divergence is also observed in layers 2/3, where no difference in SST+ neuron density is found. Moreover, SST+ neuron-mediated inhibition is not necessarily restricted to the layer in which the cell body resides. It remains unclear through which differential microcircuits (barrel vs septum) the enhanced recruitment of SST+ neurons could account for the divergent responses to repeated SWS versus MWS stimulation.

      The Elfn1 KO mouse model seems too nonspecific to establish a role for short-term plasticity in SST+ neurons in the differential response to repeated SWS vs MWS stimulation across domains. Why would Elfn1-dependent short-term plasticity in SST+ neurons be specific to a pathway, or a stimulation type (SWS vs MWS)? Moreover, the authors report that Elfn1 knockout alters synapses onto VIP+ as well as SST+ neurons (Stachniak et al., 2021; previous version of this paper), so why attribute the phenotype solely to SST+ circuitry? In fact, the functional distinctions between barrel and septal domains appear largely abolished in the Elfn1 KO.

    4. Reviewer #2 (Public review):

      Summary:

      Argunşah and colleagues demonstrate that SST-expressing interneurons are concentrated in the mouse septa and differentially respond to repetitive multi-whisker inputs. Identifying how a specific neuronal phenotype impacts responses is an advance.

      Strengths:

      (1) Careful physiological and imaging studies.

      (2) Novel result showing the role of SST+ neurons in shaping responses.

      (3) Good use of a knockout animal to further the main hypothesis.

      (4) Clear analytical techniques.

      Weaknesses:

      No major weaknesses were identified by this reviewer. Overall, I appreciated the paper but feel it overlooked a few issues, and I have some recommendations on how additional clarifications could strengthen it. These include:

      (1) Significant work from Jerry Chen on how S1 neurons that project to M1 versus S2 respond in a variety of behavioral tasks should be included (e.g. PMID: 26098757). Similarly, work from Barry Connors's lab on intracortical versus thalamocortical inputs to SST neurons, as well as excitatory inputs onto these neurons (e.g. PMID: 12815025), should be included.

      (2) Using Layer 2/3 as a proxy to what is happening in layer 4 (~line 234). Given that layer 2/3 cells integrate information from multiple barrels, as well as receiving direct VPm thalamocortical input, and given the time window that is being looked at can receive input from other cortical locations, it is not clear that layer 2/3 is a proxy for what is happening in layer 4.

      (3) Line 267, when discussing distinct temporal response, it is not well defined what this is referring to. Are the neurons no longer showing peaks to whisker stimulation, or are the responses lasting a longer time? It is unclear why PV+ interneurons which may not be impacted by the Elfn1 KO and receive strong thalamocortical inputs, are not constraining activity.

      (4) Line 585 "the earliest CSD sink was identified as layer 4..." were post-hoc measurements made to determine where the different shank leads were based on the post-hoc histology?

      (5) For the retrograde tracing studies, how were the M1 and S2 injections targeted (stereotaxically or physiologically)? How was it determined that the injections were in the whisker region (or not)?

      (6) Were there any baseline differences in spontaneous activity in the septa versus barrel regions, and did this change in the KO animals?

    5. Reviewer #3 (Public review):

      Summary:

      This study investigates the functional differences between barrel and septal columns in the mouse somatosensory cortex, focusing on how local inhibitory dynamics, particularly involving Elfn1-expressing SST⁺ interneurons, may mediate temporal integration of multi-whisker (MW) stimuli in septa. Using a combination of in vivo multi-unit recordings, calcium imaging, and anatomical tracing, the authors propose that septa integrate MW input in an Elfn1-dependent manner, enabling functional segregation from barrel columns.

      Strengths:

      The core hypothesis is interesting and potentially impactful. While barrels have been extensively characterized, septa remain less understood, especially in mice, and this study's focus on septal integration of MW stimuli offers valuable insights into this underexplored area. If septa indeed act as selective integrators of distributed sensory input, this would add a novel computational role to cortical microcircuits beyond what is currently attributed to barrels alone. The narrative of this paper is intellectually stimulating.

      Weaknesses:

      The methods used in the current study lack the spatial and cellular resolution needed to conclusively support the central claims. The main physiological findings are based on unsorted multi-unit activity (MUA) recorded via low-channel-count silicon probes. MUA inherently pools signals from multiple neurons across different distances and cell types, making it difficult to assign activity to specific columns (barrel vs. septa) or neuron classes (e.g., SST⁺ vs. excitatory). The recording radius (~50-100 µm or more) and the narrow width of septa (~50-100 µm or less) make it likely that MUA from "septal" electrodes includes spikes from adjacent barrel neurons. The authors do not provide spike sorting, unit isolation, or anatomical validation that would strengthen spatial attribution. Calcium imaging is restricted to SST⁺ and VIP⁺ interneurons in superficial layers (L2/3), while the main MUA recordings are from layer 4, creating a mismatch in laminar relevance.

      Furthermore, while the role of Elfn1 in mediating short-term facilitation is supported by prior studies, no new evidence is presented in this paper to confirm that this synaptic mechanism is indeed disrupted in the knockout mice used here. Additionally, since Elfn1 is constitutively knocked out from development, the possibility of altered circuit formation, including changes in barrel structure and interneuron distribution, cannot be excluded and is not addressed.

    1. eLife Assessment

      This important study reports that the human posterior inferotemporal cortex (hPIT) functions as an attentional priority map, integrating both top-down and bottom-up attentional signals rather than serving solely as an object-processing region. The experiments and analyses are well conducted and provide convincing evidence that hPIT bridges dorsal and ventral attention networks and is robustly modulated by attention across diverse visual tasks. The study will be relevant for researchers investigating visual attention, high-level visual cortex, and the neural mechanisms that integrate endogenous and exogenous attentional control.

    2. Reviewer #1 (Public review):

      The manuscript titled "The distinct role of human PIT in attention control" by Huang et al. investigates the role of the human posterior inferotemporal cortex (hPIT) in spatial attention. Using fMRI experiments and resting-state connectivity analyses, the authors present compelling evidence that hPIT is not merely an object-processing area, but also functions as an attentional priority map, integrating both top-down and bottom-up attentional processes. This challenges the traditional view that attentional control is localized primarily in frontoparietal networks.

      The manuscript is strong and of high potential interest to the cognitive neuroscience community. Below, I raise questions and suggestions to help with the reliability, methodology, and interpretation of the findings.

      (1) The authors argue that hPIT satisfies the criteria for a priority map, but a clearer justification would strengthen this claim. For example, how does hPIT meet all four widely recognized criteria, such as spatial selectivity, attentional modulation, feature invariance, and input integration, when compared to classical regions such as LIP or FEF? A more systematic summary of how hPIT meets these benchmarks would be helpful. Additionally, to what extent are the observed attentional modulations in hPIT independent of general task difficulty or behavioral performance?

      (2) The authors report that hPIT modulation is invariant to stimulus category, but there appear to be subtle category-related effects in the data. Were the face, scene, and scrambled images matched not only in terms of luminance and spatial frequency, but also in terms of factors such as semantic familiarity and emotional salience? This may influence attentional engagement and bias interpretation.

      (3) The result that attentional load modulates hPIT is important and adds depth to the main conclusions. However, some clarifications would help with the interpretation. For example, were there observable individual differences in the strength of attentional modulation? How consistent were these effects across participants?

      (4) The resting-state data reveal strong connections between hPIT and both dorsal and ventral attention networks. However, the analysis is correlational. Are there any complementary insights from task-based functional connectivity or latency analyses that support a directional flow of information involving hPIT? In addition, do the authors interpret hPIT primarily as a convergence hub receiving input from both DAN and VAN, or as a potential control node capable of influencing activity in these networks? Also, were there any notable differences between hemispheres in either the connectivity patterns or attentional modulation?

      (5) A few additional questions arise regarding the anatomical characteristics of hPIT: How consistent were its location and size across participants? Were there any cases where hPIT could not be reliably defined? Given the proximity of hPIT to FFA and LOp, how was overlap avoided in ROI definition? Were the functional boundaries confirmed using independent contrasts?

    3. Reviewer #2 (Public review):

      Summary

      This study investigates the role of the human posterior inferotemporal cortex (hPIT) in attentional control, proposing that hPIT serves as an attentional priority map that integrates both top-down (endogenous) and bottom-up (exogenous) attentional processes. The authors conducted three types of fMRI experiments and collected resting-state data from 15 participants. In Experiment 1, using three different spatial attention tasks, they identified the hPIT region and demonstrated that this area is modulated by attention across tasks. In Experiment 2, by manipulating the presence or absence of visual stimuli, they showed that hPIT exhibits strong attentional modulation in both conditions, suggesting its involvement in both bottom-up and top-down attention. Experiment 3 examined the sensitivity of hPIT to stimulus features and attentional load, revealing that hPIT is insensitive to stimulus category but responsive to task load - further supporting its role as an attentional priority map. Finally, resting-state functional connectivity analyses showed that hPIT is connected to both dorsal and ventral attention networks, suggesting its potential role as a bridge between the two systems. These findings extend prior work on monkey PITd and provide new insights into the integration of endogenous and exogenous attention.

      Strengths

      (1) The study is innovative in its use of specially designed spatial attention tasks to localize and validate hPIT, and in exploring the region's role in integrating both endogenous and exogenous attention, as prior works focus primarily on its involvement in endogenous attention.

      (2) The authors provided very comprehensive experiment designs with clear figures and detailed descriptions.

      (3) A broad range of analyses was conducted to support the hypothesis that hPIT functions as an attentional priority map -- including experiments of attentional modulation under both top-down and bottom-up conditions, sensitivity to stimulus features and task load, and resting-state functional connectivity. These analyses showed consistent results.

      (4) Multiple appropriate statistical analyses - including t-tests, ANOVAs, and post-hoc tests - were conducted, and the results are clearly reported.

      Weaknesses

      (1) The sample size is relatively small (n = 15), and inter-subject variability is large in Figures 5 and 6, as seen in the spread of individual data points and error bars. The analysis of attention-modulated voxel map intersections appears to be influenced by multiple outliers.

      (2) The authors acknowledge important limitations, including the lack of exploration of feature-based attention and the temporal constraints inherent to fMRI.

      (3) Prior research has established that regions such as the prefrontal cortex (PFC) and posterior parietal cortex (PPC) are involved in both endogenous and exogenous attention and have been proposed as attentional priority maps. It remains unclear what is uniquely contributed by hPIT, how it functionally interacts with these classical attentional hubs, and whether its role is complementary or redundant. The study would benefit from more direct comparisons with these regions.

      (4) The functional connectivity analysis is only performed on resting-state data, and this approach does not capture context-dependent interactions. Task-based data analysis can provide stronger evidence.

      (5) The study does not report whether attentional modulation in hPIT is consistent across the two hemispheres. A comparison of hemispheric effects could provide important insight into lateralization and inter-individual variability, especially given the bilateral localization of hPIT.

    4. Author response:

      Reviewer #1 (Public review):

      The manuscript titled "The distinct role of human PIT in attention control" by Huang et al. investigates the role of the human posterior inferotemporal cortex (hPIT) in spatial attention. Using fMRI experiments and resting-state connectivity analyses, the authors present compelling evidence that hPIT is not merely an object-processing area, but also functions as an attentional priority map, integrating both top-down and bottom-up attentional processes. This challenges the traditional view that attentional control is localized primarily in frontoparietal networks.

      The manuscript is strong and of high potential interest to the cognitive neuroscience community. Below, I raise questions and suggestions to help with the reliability, methodology, and interpretation of the findings.

      Thank you for a nice summary of the key points of our study. Below you will find our responses to your questions.

      (1) The authors argue that hPIT satisfies the criteria for a priority map, but a clearer justification would strengthen this claim. For example, how does hPIT meet all four widely recognized criteria, such as spatial selectivity, attentional modulation, feature invariance, and input integration, when compared to classical regions such as LIP or FEF? A more systematic summary of how hPIT meets these benchmarks would be helpful. Additionally, to what extent are the observed attentional modulations in hPIT independent of general task difficulty or behavioral performance?

      Great suggestions! For the first suggestion, we will include a clearer justification in the revised manuscript. For the second one, all participants received task practice prior to scanning, and task accuracy exceeded 90% (we will explicitly report the accuracy rate in the revision), suggesting the tasks were not overly demanding. Although ceiling effects limit the interpretability of behavioral-performance correlations, we argue that higher task demands would likely require greater attentional effort, leading to stronger modulation in hPIT, which aligns with our findings when we manipulated the attentional load.

      (2) The authors report that hPIT modulation is invariant to stimulus category, but there appear to be subtle category-related effects in the data. Were the face, scene, and scrambled images matched not only in terms of luminance and spatial frequency, but also in terms of factors such as semantic familiarity and emotional salience? This may influence attentional engagement and bias interpretation.

      The response of hPIT is generally insensitive to stimulus category; however, the reviewer is correct in noticing that attentional modulation in hPIT is slightly stronger for faces than for scenes and scrambled images. Although the faces used in the task had neutral expressions and the scene pictures were also neutral, it is indeed possible that semantic familiarity or emotional salience contributes to the subtle category-related effects in the results of Experiment 3. This point will be noted in the revised manuscript.

      (3) The result that attentional load modulates hPIT is important and adds depth to the main conclusions. However, some clarifications would help with the interpretation. For example, were there observable individual differences in the strength of attentional modulation? How consistent were these effects across participants?

      Yes, individual differences exist. In the revised manuscript, we will include individual subject data points in Figure 6B.

      (4) The resting-state data reveal strong connections between hPIT and both dorsal and ventral attention networks. However, the analysis is correlational. Are there any complementary insights from task-based functional connectivity or latency analyses that support a directional flow of information involving hPIT? In addition, do the authors interpret hPIT primarily as a convergence hub receiving input from both DAN and VAN, or as a potential control node capable of influencing activity in these networks? Also, were there any notable differences between hemispheres in either the connectivity patterns or attentional modulation?

      We agree that, besides resting-state connectivity, task-based functional connectivity analyses could provide additional information about whether hPIT serves as a convergence node or a control hub. Although fMRI data are not ideal for inferring the directional flow of information because of their low temporal resolution, we will conduct task-based functional connectivity analyses.

      We also observed modest hemispheric asymmetries in connectivity—for instance, both left and right hPIT showed stronger connectivity with right-hemisphere attention nodes. This will be described in the revised supplement.

      (5) A few additional questions arise regarding the anatomical characteristics of hPIT: How consistent were its location and size across participants? Were there any cases where hPIT could not be reliably defined? Given the proximity of hPIT to FFA and LOp, how was overlap avoided in ROI definition? Were the functional boundaries confirmed using independent contrasts?

      The size and location of hPIT are generally consistent across subjects, as shown in Supplementary Figure 1. The consistency is also supported by Figure 4C. The hPIT is defined by conjunction maps across three tasks and then manually delineated to avoid voxels overlapping with FFA and LOp. The FFA was defined using an independent contrast (Exp3 contrast [face-scene]) and the LOp location was defined by anatomical parcellation (Glasser et al., 2016).

      Reviewer #2 (Public review):

      Summary

      This study investigates the role of the human posterior inferotemporal cortex (hPIT) in attentional control, proposing that hPIT serves as an attentional priority map that integrates both top-down (endogenous) and bottom-up (exogenous) attentional processes. The authors conducted three types of fMRI experiments and collected resting-state data from 15 participants. In Experiment 1, using three different spatial attention tasks, they identified the hPIT region and demonstrated that this area is modulated by attention across tasks. In Experiment 2, by manipulating the presence or absence of visual stimuli, they showed that hPIT exhibits strong attentional modulation in both conditions, suggesting its involvement in both bottom-up and top-down attention. Experiment 3 examined the sensitivity of hPIT to stimulus features and attentional load, revealing that hPIT is insensitive to stimulus category but responsive to task load - further supporting its role as an attentional priority map. Finally, resting-state functional connectivity analyses showed that hPIT is connected to both dorsal and ventral attention networks, suggesting its potential role as a bridge between the two systems. These findings extend prior work on monkey PITd and provide new insights into the integration of endogenous and exogenous attention.

      Strengths

      (1) The study is innovative in its use of specially designed spatial attention tasks to localize and validate hPIT, and in exploring the region's role in integrating both endogenous and exogenous attention, as prior works focus primarily on its involvement in endogenous attention.

      (2) The authors provided very comprehensive experiment designs with clear figures and detailed descriptions.

      (3) A broad range of analyses was conducted to support the hypothesis that hPIT functions as an attentional priority map -- including experiments of attentional modulation under both top-down and bottom-up conditions, sensitivity to stimulus features and task load, and resting-state functional connectivity. These analyses showed consistent results.

      (4) Multiple appropriate statistical analyses - including t-tests, ANOVAs, and post-hoc tests - were conducted, and the results are clearly reported.

      Thank you for a nice summary of the key points and strengths of our study.

      Weaknesses

      (1) The sample size is relatively small (n = 15), and inter-subject variability is large in Figures 5 and 6, as seen in the spread of individual data points and error bars. The analysis of attention-modulated voxel map intersections appears to be influenced by multiple outliers.

      We agree that the sample size (n = 15) is not ideal, and we acknowledge that some data points in Figures 5 and 6 appear to be potential outliers. However, according to conventional outlier detection criteria, all data points are within three standard deviations of the group mean and were therefore retained for analysis. Moreover, the attention-modulated voxel intersection map shown in Figure 4C is insensitive to outliers, because the intersection map plotted is based on the number of subjects.
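      The two checks described in this response can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' analysis code: the function names, the use of the sample standard deviation, and the binary per-subject masks are all hypothetical choices made here for clarity.

```python
from statistics import mean, stdev


def within_three_sd(values, k=3.0):
    """Flag each value as retained (True) if it lies within k sample
    standard deviations of the group mean. Hypothetical helper; the
    response does not say whether sample or population SD was used."""
    m = mean(values)
    s = stdev(values)
    return [abs(v - m) <= k * s for v in values]


def subject_overlap(masks):
    """Count, for each voxel, how many subjects show attentional modulation.

    `masks` is assumed to be a list of per-subject lists holding 0 or 1 for
    each voxel. Because each subject contributes at most 1 to a voxel's
    count, one extreme subject cannot dominate the resulting map."""
    return [sum(column) for column in zip(*masks)]
```

      Under these assumptions, a count-based map depends only on whether each subject passes the threshold at a voxel, not on how large that subject's effect is, which is the sense in which it is insensitive to outliers.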

      (2) The authors acknowledge important limitations, including the lack of exploration of feature-based attention and the temporal constraints inherent to fMRI.

      Yes, we hope to address these limitations in future studies.

      (3) Prior research has established that regions such as the prefrontal cortex (PFC) and posterior parietal cortex (PPC) are involved in both endogenous and exogenous attention and have been proposed as attentional priority maps. It remains unclear what is uniquely contributed by hPIT, how it functionally interacts with these classical attentional hubs, and whether its role is complementary or redundant. The study would benefit from more direct comparisons with these regions.

      In this study, we define the ROI based on the intersection across three different types of spatial attention tasks, and hPIT stands out in showing spatial attentional modulation across tasks. This could be due to the weak lateralized responses in PFC/PPC. To evaluate whether a region qualifies as a priority map, we applied four criteria (as mentioned in the introduction). While dorsal and ventral attention network (DAN and VAN) regions can be considered important components of the priority map system, our findings suggest that among the regions tested, hPIT meets all four criteria. In Experiment 2, we included regions such as VFC (as part of PFC) and IPS (as part of PPC), and our findings suggest these areas are more involved in top-down attention. We agree with the reviewer’s suggestion and will perform additional analysis on PPC and PFC.

      (4) The functional connectivity analysis is only performed on resting-state data, and this approach does not capture context-dependent interactions. Task-based data analysis can provide stronger evidence.

      We acknowledge that resting-state FC is limited in assessing task-specific communication. To further investigate the role of hPIT, we plan to conduct task-based functional connectivity analyses.

      (5) The study does not report whether attentional modulation in hPIT is consistent across the two hemispheres. A comparison of hemispheric effects could provide important insight into lateralization and inter-individual variability, especially given the bilateral localization of hPIT.

      We thank the reviewer for this suggestion. hPIT was localized bilaterally using the same intersection-based method in Experiment 1. We have now performed additional analysis and found that, in Experiment 3, the difference in attentional modulation between high and low load conditions was significant in the right hPIT but not in the left. This result will be reported in the revised manuscript.

    1. eLife Assessment

      This study provides important findings on the neural circuits underlying dishabituation of the olfactory avoidance response in Drosophila. The data as presented provide solid evidence that the dishabituation involves distinct pathways from habituation. They show that reward-activated dopaminergic neurons provide input for within-modal dishabituation, while punishment-activated dopaminergic neurons provide input for cross-modal dishabituation. The work will interest neuroscientists, particularly behavioral neuroscientists working on habituation, neural circuits, and the dopaminergic system.

    2. Reviewer #1 (Public review):

      Summary:

      Charonitakis and co-authors characterize dishabituation in adult flies, where they use olfactory habituation to octanol, then dishabituate the flies with disruptions of electric shock or yeast odors. They systematically investigate the neurotransmitters and neural circuits involved in dishabituation and figure out a lot about how this process works in the brain, as an independent circuit. I like the paper, and I like the very structured approach to figuring out the problem.

      Strengths:

      The introduction nicely sets the stage for the work presented, bringing in knowledge from other organisms and motivating the study.

      The results section lays out a logical set of experiments, using a common set of behavioral assays in many flies exposed to thermogenetic or optogenetic manipulation. The paper systematically figures out the necessity and/or sufficiency of specific brain regions and neurotransmitters, culminating in a new understanding of how the important process of dishabituation works.

      I like the bar graph representation for the data throughout, with the helpful icons - if a paper's figures are going to be 90% bar graphs, it helps when they are super clear like this! And I like how all the parts build up to the conclusion in the last figure, nicely summarizing the thorough characterization of dishabituation.

      Weaknesses:

      There are no major concerns, but some material could be added for clarity and to make the work more accessible to a more general scientific audience. A figure clearly showing the habituation protocol and the use of the dishabituators would be a good addition, even if the procedure has been done before and is cited. There can always be readers who are seeing this for the first time.

      It would also be nice to comment on other ways dishabituation can happen (for example, when the stimulus is removed for a short time and returns) and what their time scales are.

      And more generally, the paper could perhaps improve by making a stronger case for why the results are important not just for flies but for neuroscience in general.

    3. Reviewer #2 (Public review):

      This is an interesting study in Drosophila comparing potentially differential requirements for subsets of Kenyon Cells (KCs) and Dopaminergic neurons (DANs) in olfactory dishabituation driven by either a novel odor ("homosensory") or footshock ("heterosensory"). The authors measure olfactory aversion to Octanol (OCT) in a T-maze, induce olfactory habituation with a 4-minute prior exposure to OCT, and use either brief yeast odor (YO) or footshock (FS) to achieve dishabituation. The major observation that YO-mediated dishabituation is mediated by reward-activated DANs (PAM cluster), while FS-mediated dishabituation is mediated by punishment-activated PPL-DANs, is generally solid and convincing. Also convincing are experiments showing the involvement of KCs in the pathway for YO- and FS-induced dishabituation, and the argument that KCs drive DAN activation that causes dishabituation, though not experimentally shown, is more than reasonable. The work is significant because, as the authors take pains to point out, circuits and pathways for dishabituation have been very lightly studied, and clear identification of dopaminergic neuron subsets in dishabituation achieved by different means represents unique and interesting progress.

      However, the claim that this represents a fundamental difference between homosensory and heterosensory pathways for dishabituation is overstated. The introductory section does not adequately present current broad models for habituation and dishabituation. There are many different time scales, even for Drosophila olfactory habituation. These, as well as potential underlying mechanistic differences, need to be acknowledged; any claim should be specifically qualified for the time scales being studied here. Additionally, there are several unclear, vague, and inaccurate sections and statements. A more careful, precise, and considered presentation of current views, as well as more measured claims of the impact of the findings, would substantially enhance my enthusiasm.

    4. Reviewer #3 (Public review):

      Summary:

      In this manuscript, Charonitakis, Pasadaki et al. investigated the neural circuits underlying homosensory/within-modal and heterosensory/cross-modal dishabituation of the olfactory avoidance response in Drosophila. Taking advantage of the accessible and sophisticated gene expression manipulation tools in the flies, this study traced neural pathways underlying response facilitation caused by different types of sensory stimuli and revealed both distinct and convergent neural components underlying these different forms of behavioral plasticity. The study first demonstrated that olfactory habituation of the octanol avoidance response can be facilitated by either a different odor (homosensory stimulus) or a foot shock (heterosensory stimulus). Then, the flies' nervous system was manipulated with gene expression tools to identify key neural components involved in mediating the behavioral facilitation caused by different types of sensory stimuli. It was found that different sensory stimuli are input into different parts of the nervous system, and signals converge in the mushroom bodies to generate response facilitation. It was also found that these facilitatory pathways are different from the olfactory habituation pathway in the lateral horns.

      Strengths:

      The authors took full advantage of the advanced genetic tools in flies and performed a series of experiments to pinpoint neural components in each pathway.

      Weaknesses:

      The key issue is that the main concepts of this manuscript appear to be based on a misunderstanding/misinterpretation of the literature. As the authors set out to settle the debate "whether the novel dishabituating stimulus elicits sensitization of the habituated circuits, or it engages distinct neuronal routes to bypass habituation reinstating the naïve response", it seems that the authors based their investigation on the premise that "sensitization" is mediated by a facilitatory process within the S-R pathway, and "dishabituation" by a facilitatory process outside the S-R pathway. This is not the status quo in the field, particularly with the prevailing theory like the Dual-Process Theory.

      The original version of Dual-Process Theory (Groves and Thompson 1970, but also see Thompson 2008, Neurobiol Learn Mem) already hypothesized that habituation happens within the specific S-R pathway, and sensitization occurs separately in an "organism-wide" state system that modulates the output of all S-R pathways. Dishabituation is recognized by the Dual-Process Theory as sensitization (organism-wide facilitation) manifested on top of existing habituation (depressed S-R pathway). This notion has been supported by a wide range of studies, including cat spinal cord reflex (e.g. Spencer et al. 1966) and work in Aplysia on heterosynaptic facilitation for both sensitization and dishabituation. Therefore, simply showing that the newly identified facilitatory pathways are outside the S-R habituation pathway is insufficient to demonstrate dishabituation.

      As behavioral facilitation of a habituated response can be achieved by dishabituating (specific recovery of the S-R pathway) and/or superimposed sensitizing (organism-wide) processes, dishabituation and sensitization of this olfactory response must be first dissociated; however, the study provided no evidence for the dissociation. Without this piece of evidence, the claim of this paper that the newly identified pathways mediate dishabituation is not fully supported.

      The literature review of this manuscript has some discrepancies. In the introduction, the authors wrote "initial studies in Aplysia were consistent with the "dual-process theory" (Groves and Thompson 1979), where response recovery due to dishabituation appeared to result from sensitization superimposed on habituation, thus driving reversal of the attenuated response (Carew, Castellucci et al. 1971, Hochner, Klein et al. 1986, Marcus, Nolen et al. 1988, Ghirardi, Braha et al. 1992, Cohen, Kaplan et al. 1997, Antonov, Kandel et al. 1999, Hawkins, Cohen et al. 2006)." Hochner 1986 and Marcus 1988 in fact indicated otherwise. Hochner 1986 suggests that dishabituation and sensitization involve different molecular processes, while Marcus 1988 showed that dishabituation and sensitization have different behavioral characteristics. Therefore, the authors' statement is not supported by the cited literature.

    5. Author response:

      Below, we will address point by point any and all concerns of the reviewers.

      Reviewer #1:

      There are no major concerns, but some material could be added for clarity and to make the work more accessible to a more general scientific audience.

      We will add text for clarity and to make the work more accessible to a general audience per this comment and similar suggestions of the other reviewers.

      (1.1) A figure clearly showing the habituation protocol and the use of the dishabituators would be a good addition, even if the procedure has been done before and is cited. There can always be readers who are seeing this for the first time.

      We agree that this is a good idea, as the time scales of the experiment will then be clearly marked as well, and we plan to include such a figure in the revised manuscript.

      (1.2) It would also be nice to comment on other ways dishabituation can happen (for example, when the stimulus is removed for a short time and returns) and what their time scales are.

      If the stimulus is withheld, spontaneous recovery occurs, a process distinct from dishabituation and worth exploring on its own. In a previous publication (Semelidou et al. eLife 2018;7:e39569), we have shown that in this habituation paradigm, with 4 min exposure to either the aversive Octanol or the attractive Ethyl Acetate, spontaneous recovery occurs 6 minutes or more after the habituated stimulus is withheld. This contrasts with the immediate effect of the single dishabituating stimulus, delivered for a few seconds at the end of exposure to the habituator. Given that, per Thompson (Neurobiol Learn Mem. 2009), spontaneous recovery is a characteristic of habituation, we will work this point into the text.

      (1.3) And more generally, the paper could perhaps improve by making a stronger case for why the results are important not just for flies but for neuroscience in general.

      Thank you for the encouragement. We will try to rationally generalize our findings.

      Reviewer #2:

      (2.1) However, the claim that this represents a fundamental difference between homosensory and heterosensory pathways for dishabituation is overstated.

      We had no intention of stating more than the fact that footshock and yeast odor dishabituators relay these stimuli to the mushroom bodies via distinct dopaminergic neurons, hence differentiating distinct dishabituating stimuli via the mechanosensory (footshock) and olfactory (yeast odor) modalities as they engage the mushroom bodies. As the reviewer suggests we will use more measured and specific language to state the above.

      (2.2) The introductory section does not adequately present current broad models for habituation and dishabituation.

      This was not done intentionally, but rather because we aimed for a shorter introductory section, which evidently resulted in a brief and possibly inadequate presentation of current habituation models. We will present a much more detailed introduction to habituation and dishabituation models in the revised manuscript (also see the reply to point 3.5 below).

      (2.3) There are many different time scales, even for Drosophila olfactory habituation. These, as well as potential underlying mechanistic differences, need to be acknowledged; any claim should be specifically qualified for the time scales being studied here.

      We understand and appreciate the reviewer's point and its significance, and we will address it both in the revised text and in the paradigm figure we will add as stated above (point 1.1), where the time scales will be explicitly included and emphasized.

      (2.4) Additionally, there are several unclear, vague, and inaccurate sections and statements. A more careful, precise, and considered presentation of current views, as well as more measured claims of the impact of the findings, would substantially enhance my enthusiasm.

      We will of course address these concerns, though having the specific offending parts pointed out would help ensure that we address them thoroughly. As stated above, we will incorporate current views in the introduction and when discussing our results and their impact.

      Reviewer #3:

      (3.1) The key issue is that the main concepts of this manuscript appear to be based on a misunderstanding/misinterpretation of the literature. As the authors set out to settle the debate "whether the novel dishabituating stimulus elicits sensitization of the habituated circuits, or it engages distinct neuronal routes to bypass habituation reinstating the naïve response", it seems that the authors based their investigation on the premise that "sensitization" is mediated by a facilitatory process within the S-R pathway, and "dishabituation" by a facilitatory process outside the S-R pathway. This is not the status quo in the field, particularly with the prevailing theory like the Dual-Process Theory.

      We appreciate the reviewer’s comment and the opportunity to clarify the conceptual framework of our work. Our intention was in fact to test the Groves and Thompson hypothesis (Neurobiol Learn Mem. 2009) in our olfactory habituation system. As such, dishabituation could have resulted from a facilitatory process within the S-R pathway or from mechanisms outside of it. Our experimental design allowed us to distinguish between these possibilities, and our results clearly show that dishabituation involves circuitry outside the S-R pathway. We thank the reviewer for pointing out that we have not clearly articulated this intention, and we will take care to communicate it effectively in the revised manuscript.

      (3.2) The original version of Dual-Process Theory (Groves and Thompson 1970, but also see Thompson 2008, Neurobiol Learn Mem) already hypothesized that habituation happens within the specific S-R pathway, and sensitization occurs separately in an "organism-wide" state system that modulates the output of all S-R pathways.

      As mentioned above, we are aware of the Dual-Process hypothesis. In fact, our data demonstrate that activity outside the olfactory S-R pathway, engaging novel neuronal circuits, mediates dishabituation. Unlike those for habituation, the circuits mediating dishabituation include, at minimum, the mushroom bodies, the dopaminergic system and the APL neurons. In our view this does not support the “organism-wide state” system, but rather particular circuits that, in agreement with the Groves and Thompson hypothesis, are outside the S-R pathway and modulate its behavioral output. We will work these concepts into the discussion section of the revised manuscript.

      (3.3) Dishabituation is recognized by the Dual-Process Theory as sensitization (organism-wide facilitation) manifested on top of existing habituation (depressed S-R pathway). This notion has been supported by a wide range of studies, including cat spinal cord reflex (e.g. Spencer et al. 1966) and work in Aplysia on heterosynaptic facilitation for both sensitization and dishabituation. Therefore, simply showing that the newly identified facilitatory pathways are outside the S-R habituation pathway is insufficient to demonstrate dishabituation.

      We respectfully disagree with the concluding sentence here. In all of our experiments, we observe a clear recovery of olfactory avoidance after exposure to the footshock, or yeast odor dishabituators. Moreover, the dishabituators are emulated by (photo)activation of particular neuronal circuits and the recovery of olfactory avoidance is blocked when these circuits are silenced. Regardless of whether this recovery is classified as dishabituation via sensitization or another facilitatory process, the key point is that the habituated response is reliably reinstated contingent upon the dishabituating stimulus. We believe this meets the established criteria for dishabituation.

      (3.4) As behavioral facilitation of a habituated response can be achieved by dishabituating (specific recovery of the S-R pathway) and/or superimposed sensitizing (organism-wide) processes, dishabituation and sensitization of this olfactory response must be first dissociated; however, the study provided no evidence for the dissociation. Without this piece of evidence, the claim of this paper that the newly identified pathways mediate dishabituation is not fully supported.

      We agree with the reviewer that we have not provided specific evidence dissociating dishabituation and sensitization of the particular olfactory response beyond the evidence implicating particular circuitry in the outcome of facilitation of the olfactory response.

      It should be noted that when we photoactivate the implicated circuits in naïve flies, we do not observe enhanced octanol avoidance, suggesting that activation of these circuits alone does not induce sensitization. Moreover, our results show that neither footshock nor yeast odor drives an organism-wide sensitization, as silencing specific circuits was sufficient to block dishabituation, which would not be expected if a global sensitization process were responsible for reinstating the olfactory response.

      Nonetheless, we will also attempt to dissociate sensitization from dishabituation using mutants previously reported to be deficient in sensitization (Duerr and Quinn, PNAS 1982), assuming these mutants retain normal olfactory habituation. We will also try sensitization protocols in the case of within-modal dishabituation to further clarify the underlying mechanisms. In principle, this includes using diluted octanol as the habituating stimulus and attempting dishabituation with concentrated octanol.

      (3.5) The literature review of this manuscript has some discrepancies. In the introduction, the authors wrote "initial studies in Aplysia were consistent with the "dual-process theory" (Groves and Thompson 1979), where response recovery due to dishabituation appeared to result from sensitization superimposed on habituation, thus driving reversal of the attenuated response (Carew, Castellucci et al. 1971, Hochner, Klein et al. 1986, Marcus, Nolen et al. 1988, Ghirardi, Braha et al. 1992, Cohen, Kaplan et al. 1997, Antonov, Kandel et al. 1999, Hawkins, Cohen et al. 2006)." Hochner 1986 and Marcus 1988 in fact indicated otherwise. Hochner 1986 suggests that dishabituation and sensitization involve different molecular processes, while Marcus 1988 showed that dishabituation and sensitization have different behavioral characteristics. Therefore, the authors' statement is not supported by the cited literature.

      We are grateful to the reviewer for pointing out these significant discrepancies, which resulted from multiple rounds of edits and our own oversight. These publications, which are important for this manuscript, will be referenced properly in the revised version.

    1. eLife Assessment

      This manuscript presents a valuable computational tool for identifying 3-5 gene regulatory network topologies capable of generating oscillatory dynamics. The application of Monte Carlo Tree Search to circuit design is novel and effectively expands the scale at which non-linear behaviours can be explored in silico. The efficiency of the proposed algorithm is convincing, and the work will be of interest to the systems and synthetic biology communities. While the evolutionary implications remain unclear, the methodological contribution represents a significant advance in the field.

    2. Joint Public Review:

      This manuscript presents an algorithm for identifying network topologies that exhibit a desired qualitative behaviour, with a particular focus on oscillations. The approach is first demonstrated on 3-node networks, where results can be validated through exhaustive search, and then extended to 5-node networks, where the search space becomes intractable. Network topologies are represented as directed graphs, and their dynamical behaviour is classified using stochastic simulations based on the Gillespie algorithm. To efficiently explore the large design space, the authors employ reinforcement learning via Monte Carlo Tree Search (MCTS), framing circuit design as a sequential decision-making process.
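      For readers unfamiliar with the simulation step mentioned above, the following is a minimal sketch of the Gillespie algorithm for a single species with constant production and first-order degradation. It is not the authors' implementation: the function name, the rate parameters `k_prod` and `k_deg`, and the one-species model are illustrative assumptions, chosen only to show how reaction times and reaction choices are sampled.

```python
import random


def gillespie_birth_death(k_prod, k_deg, x0, t_end, seed=0):
    """Simulate one species with constant production and first-order decay.

    Hypothetical toy model for illustration; not the reviewed manuscript's code.
    """
    rng = random.Random(seed)
    t, x = 0.0, x0
    times, counts = [t], [x]
    while True:
        # Propensities of the two reactions in the current state.
        a_prod = k_prod
        a_deg = k_deg * x
        a_total = a_prod + a_deg
        if a_total == 0.0:
            break  # no reaction can fire
        # The waiting time to the next reaction is exponentially distributed.
        t += rng.expovariate(a_total)
        if t > t_end:
            break
        # Pick a reaction with probability proportional to its propensity.
        if rng.random() * a_total < a_prod:
            x += 1
        else:
            x -= 1
        times.append(t)
        counts.append(x)
    return times, counts


times, counts = gillespie_birth_death(k_prod=10.0, k_deg=1.0, x0=0, t_end=5.0)
```

      In a network-level simulation the same two sampling steps apply, with one propensity per reaction in the circuit.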

      This work meaningfully extends the range of systems that can be explored in silico to uncover non-linear dynamics and represents a valuable methodological advance for the fields of systems and synthetic biology.

      Strengths

      The evidence presented is strong and compelling. The authors validate their results for 3-node networks through exhaustive search, and the findings for 5-node networks are consistent with previously reported motifs, lending credibility to the approach. The use of reinforcement learning to navigate the vast space of possible topologies is both original and effective, and represents a novel contribution to the field. The algorithm demonstrates convincing efficiency, and the ability to identify robust oscillatory topologies is particularly valuable. Expanding the scale of systems that can be systematically explored in silico marks a significant advance for the study of complex gene regulatory networks.

      Weaknesses

      The principal weakness of the manuscript lies in the interpretation of biological robustness. The authors identify network topologies that sustain oscillatory behaviour despite perturbations to the system or parameters. However, in many cases, this persistence is due to the presence of partially redundant oscillatory motifs within the network. While this observation is interesting and of clear value for circuit design, framing it as evidence of evolutionary robustness may be misleading. The "mutant" systems frequently exhibit altered oscillatory properties, such as changes in frequency or amplitude. From a functional cellular perspective, mere oscillation is insufficient - preservation of specific oscillation characteristics is often essential. This is particularly true in systems like circadian clocks, where misalignment with environmental cycles can have deleterious effects. Robustness, from an evolutionary standpoint, should therefore be framed as the capacity to maintain the functional phenotype, not merely the qualitative behaviour.

      A secondary limitation is that, despite the methodological advances, the scale of the systems explored remains modest. While moving from 3- to 5-node systems is non-trivial, five elements still represent a relatively small network. It is somewhat surprising that the algorithm does not scale further, particularly when considering the performance of MCTS in other domains - for instance, modern chess engines routinely explore far larger decision trees. A discussion on current performance bottlenecks and potential avenues for improving scalability would be valuable.

      Finally, it is worth noting that the emergence of oscillations in a model often depends not only on the topology but also critically on parameter choices and the nature of the nonlinearities. The use of Hill functions and high Hill coefficients is a common strategy to induce oscillatory dynamics. Thus, the reported results should be interpreted within the context of the modelling assumptions and parameter regimes employed in the simulations.
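      The dependence on the Hill coefficient can be made concrete with a small sketch. The function below is the standard repressive Hill function; the name `hill_repression` and the parameter values are assumptions for illustration and are not taken from the manuscript.

```python
def hill_repression(x, k, n):
    """Fraction of maximal production at repressor level x.

    k is the half-repression threshold and n the Hill coefficient
    (illustrative parameters, not values from the reviewed work).
    """
    return 1.0 / (1.0 + (x / k) ** n)


# Response just below and just above the threshold for a shallow and a steep curve.
shallow = (hill_repression(0.5, 1.0, 1), hill_repression(2.0, 1.0, 1))
steep = (hill_repression(0.5, 1.0, 4), hill_repression(2.0, 1.0, 4))
```

      With n = 1 the response changes gradually around the threshold, whereas with n = 4 it approaches a switch, which is the kind of nonlinearity that tends to favour oscillations in negative-feedback models.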

    1. eLife Assessment

      This useful study reports findings that support the use of the Open Field Test in Drosophila as a model to study "emotion-like states", which are behavioral responses to several stressful or aversive treatments, and resilience upon their subsequent removal. Behavioral data, by employing established stress-causing treatments and genetic manipulations, are solid. While the results and conceptual framework of this work will be of interest to behaviorists regardless of animal models, the novelty of this work over previous studies could have been clearer.

    2. Reviewer #1 (Public review):

      Summary:

      Animal behavior is continuously influenced by the internal state moment-by-moment, including emotion primitives, as the authors pointed out. Although emotion is a more human-related state, evolutionary conservation is undeniable, which can be inferred by the behavioral manifestation. To further elaborate on the neuronal mechanisms of emotion primitives, the simplest behavioral parameter related to emotional primitives should be well-characterized. In this study, the authors described in detail wall-following behavior (WAFO) and the total walking distance (TOWA) using flies after subjecting them to various conditions or flies being genetically manipulated according to the previous reports that could affect emotion primitives. Overall, the study is well designed and structured. In addition, the discussion on emotion primitives will be of value to the field.

      Strengths:

      The strength of this study is its use of a simple behavioral parameter, TOWA, and also a simple design of behavior, WAFO. The importance of the behavioral assay is reproducibility and comparability. In fact, the authors demonstrated a summary of comparisons where different treatments result in scalable behavioral changes in WAFO and TOWA.

      Weaknesses:

      The weakness of the study is the lack of further experiments to support their assumption related to TOWA.

      The authors suggested that TOWA can be interpreted as a behavioral proxy for exogenously induced arousal. However, it could be interpreted as higher activity, although the authors argued that the circadian clock increasing locomotor activity around ZT0 and ZT12 does not affect TOWA, and therefore TOWA is not related to the locomotor activity per se. As the authors cited, flies lose locomotor activity in the circular arena of 6.6 cm in diameter, whereas they continuously move during a 1-h recording in the authors' arena of 1 cm in diameter.

      I would agree that the arena of 1 cm in diameter, but not 6.6 cm in diameter, serves as an exogenous stimulus inducing arousal, and TOWA is manifested by arousal. However, TOWA would also be affected by other behavioral parameters, including the activity, motivation for exploration, or perception of the space. Therefore, it could be reasonable to re-examine some of the flies tested in this study in the circular arena of 6.6 cm in diameter. If arousal is biased by the components presented in Figure 6 and TOWA can assess mainly exogenously induced arousal, the treatment altering TOWA in the arena of 1 cm in diameter would not affect their behavior in the arena of 6.6 cm in diameter. My concern is that Figure 6 may demonstrate too simplistic a diagram to interpret the results. I would suggest adding the experiments using the arena of 6.6 cm diameter or softening the argument.

    3. Reviewer #2 (Public review):

      Summary:

      This work seeks to establish the Open Field Test (OFT) as a paradigm to measure emotion-like states in the fruit fly Drosophila. To do this, the authors first applied various stressors and aversive stimuli to wild-type flies and tracked their locomotion. By measuring wall-following (WAFO) and total walking (TOWA), they showed that these behaviors are generally increased by stressors, but return to baseline levels after their removal. Then, they used the same approach to analyze the effects of pharmacological, genetic, and neuronal activity manipulations, showing that diazepam, serotonin, dopamine, and neuropeptide F affect locomotion in the OFT in largely expected ways that are consistent with their functions in rodents. Finally, the authors demonstrate that wild-type fly strains from the laboratory or caught in the wild differ significantly in their OFT behavior, with wild-caught flies generally behaving as if more 'stressed'. Given the numerous advantages of Drosophila, this study can form the foundation for using the OFT in conjunction with this animal model to elucidate the molecular and neuronal mechanisms that underlie emotion primitives.

      Strengths:

      The main strength of the paper is the rigorous use of several stressful or aversive treatments and their subsequent removal to show that WAFO is a robust proxy for stress-like emotional primitives across multiple stimuli. The pharmacological, molecular, and neuronal activity manipulations, although more limited in scope, lend further credence to the authors' central claim.

      Weaknesses:

      The conceptual advance of this research is unclear, as previous work (Mohammad et al., 2016, Curr Biol.) carried out similar treatments and manipulations and reached largely similar conclusions. Moreover, while WAFO is a good proxy for 'stress', I am not convinced that TOWA necessarily represents an emotional state in all cases. Indeed, as the authors themselves acknowledge, changes in total walking may be associated with other factors, such as starvation-induced hyperactivity, physical exhaustion after sleep deprivation, increased sex drive after mating, alcohol sedation, etc. Another unclear point is the interpretation of some unexpected results, such as the finding that both serotonin transporter overexpression and its knockdown give the same phenotype. Finally, there are some issues with the use of the OFT in rodent research (e.g., inconsistent effects of anxiolytic drugs; see Rosso et al., 2022, Neurosci Biobehav Rev., for a meta-analysis). These should be explained to place the Drosophila findings in their appropriate context.

    1. Reviewer #3 (Public review):

      Summary:

      This is a valuable study providing solid evidence that the putative non-canonical initiation factor eIF2A has little or no role in the translation of any expressed mRNAs in cultured human (primarily HeLa) cells. Previous studies have implicated eIF2A in GTP-independent recruitment of initiator tRNA to the small (40S) ribosomal subunit, a function analogous to canonical initiation factor eIF2, and in supporting initiation on mRNAs that do not require scanning to select the AUG codon or that contain near-cognate start codons, especially upstream ORFs with non-AUG start codons, and may use the cognate elongator tRNA for initiation. Moreover, the detected functions for eIF2A were limited to, or enhanced by, stress conditions where canonical eIF2 is phosphorylated and inactivated, suggesting that eIF2A provides a back-up function for eIF2 in such stress conditions. CRISPR gene editing was used to construct two different knock-out cell lines that were compared to the parental cell line in a large battery of assays for bulk or gene-specific translation in both unstressed conditions and when cells were treated with inhibitors that induce eIF2 phosphorylation. None of these assays identified any effects of eIF2A KO on translation in unstressed or stressed cells, indicating little or no role for eIF2A as a back-up to eIF2 and in translation initiation at near-cognate start codons, in these cultured cells.

      The study is very thorough and generally well executed, examining bulk translation by puromycin labeling and polysome analysis and translational efficiencies of all expressed mRNAs by ribosome profiling, with extensive utilization of reporters equipped with the 5'UTRs of many different native transcripts to follow up on the limited number of genes whose transcripts showed significant differences in translational efficiencies (TEs) in the profiling experiments. They also looked for differences in translation of uORFs in the profiling data and examined reporters of uORF-containing mRNAs known to be translationally regulated by their uORFs in response to stress, going so far as to monitor peptide production from a uORF itself. The high precision and reproducibility of the replicate measurements instil strong confidence that the myriad of negative results they obtained reflects the lack of eIF2A function in these cells rather than data that would be too noisy to detect small effects of the eIF2A mutations. They also tested and found no evidence for a recent claim that eIF2A localizes to the cytoplasm in stress and exerts a global inhibition of translation. Given the numerous papers that have been published reporting functions of eIF2A in specific and general translational control, this study is important in providing abundant, high-quality data to the contrary, at least in these cultured cells.

      Strengths:

      The paper employed two CRISPR knock-out cell lines and subjected them to a combination of high-quality ribosome profiling experiments, interrogating both main coding sequences and uORFs throughout the translatome, which was complemented by extensive reporter analysis, and cell imaging in cells both unstressed and subjected to conditions of eIF2 phosphorylation, all in an effort to test previous conclusions about eIF2A functioning as an alternative to eIF2.

      Weaknesses:

      No major issues were observed as the authors have provided additional evidence of the extent of ISR induction by tunicamycin. The discussion was also expanded to address concerns stemming from the previous version of the manuscript.

      [Editors' note: Reviewers and editors concluded that the authors revised the article in a satisfactory manner, and no further concerns were raised.]

    2. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Summary: 

      Beyond what is stated in the title of this paper, not much needs to be summarized. eIF2A in HeLa cells promotes translation initiation of neither the main ORFs nor short uORFs under any of the conditions tested. 

      Strengths: 

      Very comprehensive, in fact, given the huge amount of purely negative data, an admirably comprehensive and well-executed analysis of the factor of interest. 

      Weaknesses: 

      The study is limited to the HeLa cell line, focusing primarily on KO of eIF2A and neglecting the opposite scenario, higher eIF2A expression which could potentially result in an increase in non-canonical initiation events. 

      We thank the reviewer for the positive evaluation. As suggested by the reviewer in the detailed recommendations, we will clarify in the title, abstract and text that our conclusions are limited to HeLa cells. Furthermore, as suggested we will test the effect of eIF2A overexpression on the luciferase reporter constructs, and will upload a revised manuscript.

      Reviewer #2 (Public review):

      Summary 

      Roiuk et al describe a work in which they have investigated the role of eIF2A in translation initiation in mammals without much success. Thus, the manuscript focuses on negative results. Further, the results, while original, are generally not novel but confirmatory, since related claims have been made before independently in different systems, with the Gaikwad et al. study recently published in eLife being the most relevant.

      Despite this, we find this work highly important. This is because of the massive wealth of unreliable information and speculation regarding the role of eIF2A in translation, arising from a series of artifacts that began at the moment of eIF2A discovery. This, in combination with its unfortunate naming (eIF2A is often mixed up with the alpha subunit of eIF2, eIF2S1), has generated widespread confusion among researchers who are not experts in eukaryotic translation initiation. Given this, it is not only justifiable but critical to make independent efforts to clear up this confusion, and we very much appreciate the authors' efforts in this regard.

      Strengths 

      The experimental investigation described in this manuscript is thorough, appropriate and convincing. 

      Weaknesses 

      However, we are not entirely satisfied with the presentation of this work which we think should be improved. 

      We thank the reviewer for the positive evaluation. We will revise the manuscript according to the reviewer's suggestions made in the detailed recommendations.

      Reviewer #3 (Public review):

      Summary: 

      This is a valuable study providing solid evidence that the putative non-canonical initiation factor eIF2A has little or no role in the translation of any expressed mRNAs in cultured human (primarily HeLa) cells. Previous studies have implicated eIF2A in GTP-independent recruitment of initiator tRNA to the small (40S) ribosomal subunit, a function analogous to canonical initiation factor eIF2, and in supporting initiation on mRNAs that do not require scanning to select the AUG codon or that contain near-cognate start codons, especially upstream ORFs with non-AUG start codons, and may use the cognate elongator tRNA for initiation. Moreover, the detected functions for eIF2A were limited to, or enhanced by, stress conditions where canonical eIF2 is phosphorylated and inactivated, suggesting that eIF2A provides a back-up function for eIF2 in such stress conditions. CRISPR gene editing was used to construct two different knockout cell lines that were compared to the parental cell line in a large battery of assays for bulk or gene-specific translation in both unstressed conditions and when cells were treated with inhibitors that induce eIF2 phosphorylation. None of these assays identified any effects of eIF2A KO on translation in unstressed or stressed cells, indicating little or no role for eIF2A as a back-up to eIF2 and in translation initiation at near-cognate start codons, in these cultured cells. 

      The study is very thorough and generally well executed, examining bulk translation by puromycin labeling and polysome analysis and translational efficiencies of all expressed mRNAs by ribosome profiling, with extensive utilization of reporters equipped with the 5'UTRs of many different native transcripts to follow up on the limited number of genes whose transcripts showed significant differences in translational efficiencies (TEs) in the profiling experiments. They also looked for differences in translation of uORFs in the profiling data and examined reporters of uORF-containing mRNAs known to be translationally regulated by their uORFs in response to stress, going so far as to monitor peptide production from a uORF itself. The high precision and reproducibility of the replicate measurements instil strong confidence that the myriad of negative results they obtained reflects the lack of eIF2A function in these cells rather than data that would be too noisy to detect small effects of the eIF2A mutations. They also tested and found no evidence for a recent claim that eIF2A localizes to the cytoplasm in stress and exerts a global inhibition of translation. Given the numerous papers that have been published reporting functions of eIF2A in specific and general translational control, this study is important in providing abundant, high-quality data to the contrary, at least in these cultured cells.

      Strengths: 

      The paper employed two CRISPR knock-out cell lines and subjected them to a combination of high-quality ribosome profiling experiments, interrogating both main coding sequences and uORFs throughout the translatome, which was complemented by extensive reporter analysis, and cell imaging in cells both unstressed and subjected to conditions of eIF2 phosphorylation, all in an effort to test previous conclusions about eIF2A functioning as an alternative to eIF2. 

      Weaknesses: 

      There is some question about whether their induction of eIF2 phosphorylation using tunicamycin was extensive enough to state forcefully that eIF2A has little or no role in the translatome when eIF2 function is strongly impaired. Also, similar conclusions regarding the minimal role of eIF2A were reached previously for a different human cell line from a study that also enlisted ribosome profiling under conditions of extensive eIF2 phosphorylation; although that study lacked the extensive use of reporters to confirm or refute the identification by ribosome profiling of a small group of mRNAs regulated by eIF2A during stress. 

      We thank the reviewer for the positive evaluation. We will revise the manuscript according to the detailed recommendations. Regarding the two points mentioned here:

      (1) The reason eIF2alpha phosphorylation does not appear to increase appreciably is that, unfortunately, the antibody is very poor. That the Integrated Stress Response (ISR) is induced by our treatment can be seen, for instance, from the strong increase in ATF4 protein levels (in the very same samples where eIF2alpha phosphorylation does not increase much, in Suppl. Fig. 5E). We will strengthen the conclusion that the ISR is indeed activated with additional experiments/data as suggested by the reviewer.

      (2) We agree that our results are in line with results from the previous study mentioned by the reviewer, so we will revise the manuscript to mention this other study more extensively in the discussion.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) I suggest stating (already in the abstract, but perhaps also even in the title, definitely in the rest of the paper) that this analysis is limited to the HeLa cell line.

      As suggested, we have now specified in both the title and the abstract that the work is done in HeLa cells.

      (2) In my view, it is a pity that the authors - given the tools are available - did not check the impact of high eIF2A levels on expression of individual mRNAs under normal and stress conditions. I am not suggesting to repeat ribo-seq in this setup, it would be too much to ask for, but re-examining some of the many reporters the authors generated with eIF2A overexpressed may point to some function, e.g. increased number of non-canonical initiation events (non-AUG-initiated)? If anything, the use of HeLa and the primary focus on eIF2A KO neglecting the prospective impact of eIF2A overexpression should be mentioned as two main limitations of this study. 

      We thank the reviewer for the good suggestion to test our synthetic reporters with eIF2A overexpression. New Suppl. Fig. 4G now shows that overexpression of eIF2A does not affect translation of synthetic reporters carrying an ATG start codon in different initiation contexts, or carrying near-cognate start codons, in agreement with a lack of effect on translation which we previously observed with loss of eIF2A.

      (3) Ribo-seq with eIF2A. Did the authors focus on ORFs that are known, or whose isoforms are known, to be non-AUG initiated? Would the loss of eIF2A decrease FPs in their CDSes under at least some conditions?

      We have now assessed the read distribution on the eIF4G2 transcript in both the control and tunicamycin conditions (Author response image 1). In our hands, eIF4G2 is one of the best examples of non-AUG initiation in human cells, since the main coding sequence starts with GTG and the CDS is well translated. Nonetheless, we do not observe any significant changes in read distribution (panels A-B) or overall translation efficiency of eIF4G2 upon eIF2A loss (panels C-D).

      Author response image 1.

      (A-B) Average read occupancy on the eIF4G2 (ENST0000339995) transcript in DMSO-treated (panel A, n=3) or tunicamycin-treated samples (panel B, n=2) derived from either control (black) or eIF2A-KO (red) HeLa cells. Read counts were normalized to sequencing depth and averaged across either 3 (DMSO-treated) or 2 (tunicamycin-treated) replicates. Graphs were then smoothed with a sliding window of 3 nt. (C-D) The total number of reads mapping to the eIF4G2 CDS, normalized to library sequencing depth per replicate, was quantified. No significant difference between control and eIF2A-KO cells was observed in either DMSO-treated (panel C) or tunicamycin-treated (panel D) cells. Significance by unpaired, two-sided t-test. ns = not significant.
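      The processing described in this legend (depth normalization, replicate averaging, 3-nt smoothing) can be sketched as follows. This is an illustrative reconstruction, not the authors' pipeline: the function names, the reads-per-million scaling, and the centered window that is truncated at the transcript ends are all assumptions.

```python
def normalize_to_depth(counts, library_size, scale=1_000_000):
    """Scale per-nucleotide read counts to reads per `scale` mapped reads (assumed convention)."""
    return [c * scale / library_size for c in counts]


def average_replicates(profiles):
    """Average equal-length per-nucleotide profiles position by position."""
    n = len(profiles)
    return [sum(values) / n for values in zip(*profiles)]


def smooth(profile, window=3):
    """Centered sliding-window mean; the window is truncated at the ends (assumption)."""
    half = window // 2
    smoothed = []
    for i in range(len(profile)):
        lo = max(0, i - half)
        hi = min(len(profile), i + half + 1)
        smoothed.append(sum(profile[lo:hi]) / (hi - lo))
    return smoothed


# Hypothetical toy data: two replicates over a 5-nt stretch.
rep1 = normalize_to_depth([0, 4, 0, 2, 0], library_size=2_000_000)
rep2 = normalize_to_depth([0, 2, 0, 1, 0], library_size=1_000_000)
profile = smooth(average_replicates([rep1, rep2]), window=3)
```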

      Thank you for giving me the opportunity to review this article.

      Reviewer #2 (Recommendations for the authors):

      While some of our suggestions below may be considered subtle, in our opinion they are important, and it would be good if the authors considered them for their revision. We also have a couple of technical suggestions.

      (1) Abstract. 

      The authors failed to identify a role for eIF2A in translation initiation and have provided compelling evidence that eIF2A is involved neither in the recognition of non-AUG codons as start codons nor in the recruitment of initiator tRNA during stress conditions, the two activities most commonly misattributed to eIF2A. However, they have not exhausted all potential functions of eIF2A (see below). It is also possible that eIF2A has a role not yet suggested by anyone, or that it functions in translation initiation in special circumstances that have not yet been tested. The authors indeed discuss this possibility in the Discussion section. Given that there is genetic evidence (which is unaffected by biochemical impurities) linking eIF2A to other initiation factors (5B and 4E), we are not yet convinced that eIF2A has no role in translation initiation, and we therefore find the last sentence of the abstract premature. We suggest softening this statement to something like: whether eIF2A has any role in translation remains unknown; it may even have a role in a different aspect of RNA biology.

      We agree with the reviewer. We changed the last sentence of the abstract to read as follows:

      “It is possible that eIF2A plays a role in translation regulation in specific conditions that we have not tested here, or that it plays a role in a different aspect of RNA biology.”

      (2) Recently eIF2A has been implicated in ribosomal frameshifting, see Wei et al 2023 DOI: 10.1016/j.celrep.2023.112987 

      Could the authors look at the PEG10 mRNA ribosome profile to see if there are detectable, statistically significant changes in footprint density downstream of the frameshift site between WT and eIF2A KOs? It is likely that the coverage will be insufficient to give a definitive answer, but it is worth checking; it would be a pity to miss it.

      We thank the reviewer for this suggestion. We have now looked at the distribution of ribosome footprints on the PEG10 transcript variant that is expressed in HeLa cells (ENST00000482108) and indeed observe coverage downstream of the annotated stop codon, consistent with a frameshifting event that results in an extended protein isoform being translated. Visual assessment of the read distribution between the main ORF and the "ORF extension" does not show a substantial difference between control and eIF2A knock-out cells (Author response image 2A-B). Additionally, we quantified the ratio of reads mapping to the PEG10 ORF upstream of the slippery site versus those mapping downstream, extending into the predicted longer protein. Nonetheless, we could not detect significant changes between control and eIF2A-KO cells in either tested condition (Author response image 2C-D).

      Author response image 2.

      (A-B) Average read occupancy on the PEG10 (ENST00000482108) transcript in DMSO-treated (panel A, n=3) or tunicamycin-treated samples (panel B, n=2) derived from either control (black) or eIF2A-KO (red) HeLa cells is shown. Read counts were normalized to sequencing depth and averaged across either 3 (DMSO-treated) or 2 (tunicamycin-treated) replicates. Graphs were then smoothed with a sliding window of 3 nt. (C-D) The ratio of reads mapping to the ORF upstream of the slippery site to reads mapping to the predicted extended protein downstream of the slippery site is shown. Read counts were normalized to the sequencing depth. Neither DMSO-treated samples (panel C) nor tunicamycin-treated samples (panel D) showed a significant difference between control and eIF2A-KO cells. Significance by unpaired, two-sided t-test. ns = not significant.

      (3) Introduction 

      Given the volume of unreliable claims regarding eIF2A in the literature and the overall confusion, it is very difficult (perhaps even impossible) to write a clear, coherent introduction to the topic. Nonetheless, there are a few points that need to be taken into account.

      The authors state that eIF2A is capable of recruiting initiator tRNA, citing Zoll et al 2002. This activity was later shown to be a biochemical artefact (which was most likely reproduced by Kim et al 2018): the eIF2A fraction was contaminated with eIF2D, which does bind tRNAs in a GTP-independent manner. eIF2A purified from RRL separates from the initiator tRNA binding activity; see Dmitriev et al 2010 DOI: 10.1074/jbc.M110.119693. This point is also relevant to the second paragraph of the Discussion, where it should be acknowledged that eIF2A has previously been shown not to bind the initiator tRNA.

      We appreciate the advice provided by the reviewer. We have modified both the introduction and the 2nd paragraph of the discussion to reflect that the tRNA-binding activity is due to contaminating eIF2D rather than eIF2A.

      In many cases the authors describe certain claims as facts even though they refute them themselves. For example:

      "Such eIF2A-driven non-AUG initiation events were shown to play a crucial role in different aspects of cell physiology and disease progression: cellular adaptation during the integrated stress response (Chen et al., 2019; Starck et al., 2016)" While non-AUG initiation events do play crucial roles in different aspects of cell physiology (reviewed in Andreev et al 2023 doi: 10.1186/s13059-022-02674-2), eIF2A has nothing to do with them, as the authors show themselves. Therefore, different language should be used, e.g., "eIF2A has been suggested (or proposed or reported) to be responsible for non-AUG initiation events that were shown to play ..."

      The word "shown" is used in many other instances for the claims that the authors refute. "Shown" is only appropriate for strong evidence that leaves little doubt. 

      We agree with the reviewer and made the suggested changes in the text.

      (4) Supplementary Fig. 1. 

      Panel C is used to argue that eIF2A has a higher concentration than in the nucleus, perhaps it is worth explaining how this conclusion was drawn. If levels in cytoplasm are comparable to GAPDH and Tubulin but less than c-Myc in nucleus does it really mean that there is less eIF2A in the nucleus than in cytoplasm? This is not obvious to us. Also, presumably WCL stands for Whole Cell Lysate, it would be nice to introduce this abbreviation somewhere. 

      To compare levels of eIF2A in the nuclear and cytosolic fractions, we lysed the two fractions in equal volumes of buffer (i.e. the cytosolic fraction was extracted in 200 µl of hypotonic buffer, and the nuclear fraction was extracted in 200 µl of cell extraction buffer). This assures that per microliter of lysate we have the same number of "cytosols" or nuclei. Hence, equal intensity bands in the cytosolic and nuclear fractions would mean that half of the protein is in the nucleus and half is in the cytosol. We originally described this in the Methods section, but now also mention it in the Results and in the figure legend.
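
      The equal-volume argument above can be illustrated with a small calculation. This is only a sketch: the band intensities are hypothetical numbers, and it assumes that equal volumes of both lysates were loaded and that the blot signal is linear in protein amount.

```python
def nuclear_fraction(cyto_band: float, nuc_band: float) -> float:
    """Fraction of a protein found in the nucleus.

    Assumes both fractions were extracted in equal buffer volumes and
    loaded at equal volume, so band intensities are directly comparable.
    """
    total = cyto_band + nuc_band
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return nuc_band / total


# Hypothetical band intensities (arbitrary densitometry units).
equal_bands = nuclear_fraction(cyto_band=1.0, nuc_band=1.0)
mostly_cytosolic = nuclear_fraction(cyto_band=9.0, nuc_band=1.0)

print(equal_bands)       # equal bands: half of the protein is nuclear
print(mostly_cytosolic)  # a 9:1 ratio: one tenth of the protein is nuclear
```

      Under these assumptions, comparison with loading markers such as GAPDH or c-Myc is not needed to estimate the split; the markers only confirm that the fractionation was clean.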

      We replaced WCL with "whole cell" in the figure. 

      (5) The differential translation analysis is described very briefly: "To obtain values of translation efficiency, log2 fold changes, and adjusted p values the DESeq2 software package was used". Was TE calculated based on ribosome footprint to RNA-seq ratios? How exactly was DESeq2 used here? TE measured in this way spuriously correlates with RNA-seq values, see Larsson et al 2010 DOI: 10.1073/pnas.1006821107; perhaps it would be worth assessing differential translation with anota2seq (Oertlin et al 2019 doi: 10.1093/nar/gkz223)? Anota2seq avoids calculating the ratios and enables comprehensive analysis of differential translation, including detection of buffered translation, which might be the case here, while avoiding artefacts that may arise from varying RNA levels.

      We have now specified in more detail in the Methods section how we analyzed the data. Indeed, DESeq2 was used on translation efficiency values, which we calculated as the ratio of ribosome footprints to RNA-seq reads.

      As suggested, we have now also performed the analysis using anota2seq (Suppl. Fig. 3C) and this analysis identified zero transcripts that are translationally regulated, in agreement with our analysis.
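
      For readers unfamiliar with the quantity being discussed, the ratio-based definition of translation efficiency can be sketched as follows. This is a minimal illustration and not the authors' pipeline: it does not call DESeq2 or anota2seq, and the function name, the counts-per-million normalization, the pseudocount and the example counts are all assumptions made here.

```python
import numpy as np


def log2_te(ribo_counts, rna_counts, pseudocount=0.5):
    """log2 translation efficiency per gene for one sample.

    TE is taken as depth-normalized ribosome footprints divided by
    depth-normalized RNA-seq reads. The pseudocount (an assumption of
    this sketch) avoids division by zero for unexpressed genes.
    """
    ribo = np.asarray(ribo_counts, dtype=float)
    rna = np.asarray(rna_counts, dtype=float)
    ribo_cpm = (ribo + pseudocount) / ribo.sum() * 1e6
    rna_cpm = (rna + pseudocount) / rna.sum() * 1e6
    return np.log2(ribo_cpm / rna_cpm)


# Hypothetical counts for three genes.
te_proportional = log2_te([100, 200, 700], [100, 200, 700])
te_shifted = log2_te([400, 200, 400], [100, 200, 700])

print(te_proportional)  # footprints track mRNA, so log2 TE is 0 for every gene
print(te_shifted)       # gene 0 gains footprints relative to mRNA, gene 2 loses them
```

      Because the RNA-seq value sits in the denominator, noise in mRNA counts propagates directly into the ratio, which is the source of the spurious correlation raised by the reviewer.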

      (6) Section "eIF2a-inactivating stresses do not redirect tRNA delivery function to eIF2A." 

      The description of the ISR mechanism is a bit inaccurate. Strictly speaking, eIF2alpha phosphorylation does not inactivate eIF2alpha. It results in formation of a very stable eIF2*GDP*eIF2B complex, thus severely depleting eIF2B, which serves as a GEF for eIF2. This in turn reduces the ternary complex (eIF2*GTP*tRNAi) concentration, since there is no free eIF2B to exchange GDP for GTP. Without getting into much detail, we think it would be more accurate to say that eIF2alpha phosphorylation leads to ternary complex depletion instead of saying that stress inactivates eIF2alpha.

      We agree with the reviewer - we were trying to use simple, compact wording. We have now reworded the section title to "No detectable role for eIF2A in translation when eIF2 is inhibited" and rephrased the subsequent text to be correct.

      Also the subtitle uses eIF2a with small a that stands for alpha which potentially could lead to substantial confusion since in this case the difference between eIF2alpha and eIF2A is only in capitalisation of the last letter, many text-mining engines such as modern LLMs may not be able to pick the differences. Perhaps it would be better to refer to eIF2alpha by the HGNC approved name of its gene - eIF2S1 to avoid further confusions. For clarity it may be stated at the beginning that eIF2S1 is commonly known as eIF2alpha. 

      We thank the reviewer for this point. We have removed all instances of eIF2a (with lowercase a) from the manuscript to avoid this source of confusion. In the first instance of eIF2α we also added the official HGNC gene name. However, we prefer to use eIF2α instead of eIF2S1 because people outside the translation field tend to know the subunit as eIF2α, and we think it is important that people outside the translation field also read this manuscript, since some of the questionable papers on eIF2A come from labs working at the interface between translation and other fields.

      Minor 

      Introduction 

      (7) "uses the CAT anticodon" change CAT to CAU 

      We corrected CAT to CAU

      (8) "In the canonical initiation pathway", change "canonical" to "most common", canonical is somewhat a judgemental statement that originates in theology. Same applies to numerous occurrences of "canonical AUG", simply using "AUG" would be simpler and more accurate as you will avoid giving impression that there are "non-canonical AUGs".  

      Done.

      (9) "eIF2A was initially considered to be a functional analogue of prokaryotic IF2 (Merrick and Anderson, 1975), however later this role was reassigned to the above-mentioned heterotrimeric factor eIF2 (a,b,g) (Levin et al., 1973)." - there is a chronological contradiction within this sentence, the initial consideration is attributed to 1975 while its later reassignment to 1973. 

      We are grateful to the reviewer for spotting this mistake. There was a citation problem; we fixed it and now cite the correct paper for the initial discovery of eIF2A to PMID 5472357 (Shafritz et al 1970).

      (10) "On the other hand, studies on the role of eIF2A on viral IRES translation have arrived at conflicting results." Remove "On the other hand" since conflicting results have been mentioned above. In fact the entire sentence is somewhat redundant given prior "For example, eIF2A has been studied in the context of internal ribosome entry sites (IRES), where it was found to act both as a suppressor and an activator of IRES-mediated initiation."

      We have rewritten the paragraph to make it more coherent.

      (11) Fig. 1C-D uses the abbreviation CHX for cycloheximide; this needs to be mentioned in the legend or elsewhere in the text. Otherwise CHX may not be clear to a reader uninitiated in ribosome profiling.

      We now mention in the figure legend that CHX stands for cycloheximide and indicate that it was used as a negative control to block translation. 

      (12) Page 7, section "Ribosome profiling reveals a few eIF2A-dependent transcripts"

      In this section you describe ribosome profiling experiments and identify a few transcripts whose translation seems to change based on the ribosome profiling data. Then you attempt to verify them using gene expression reporters and reasonably suggest that these are false positives. In essence this section argues that there are no eIF2A-dependent transcripts, so the title of this subsection is misleading; it makes sense to rename it so that it better reflects the content of the section.

      We agree and have renamed the section to "Ribosome profiling identifies no eIF2A-dependent transcripts".

      (13) Page 8, top. Rephrase "To do this, we performed ribosome profiling on control and eIF2A-KO cells, which sequences the mRNA footprints protected by ribosomes."

      Fixed.

      (14) Page 10, bottom. "Several studies have reported that eIF2A can delivery alternative initiator tRNAs to uORFs with near-cognate start codons". Change "delivery" to "deliver".

      Thanks for spotting it. We corrected it to “deliver”.

      (15) Page 13 "This suggests that, as in non-stressed conditions, eIF2A has a minimal effect on global translation also when eIF2a activity is low." - rephrase to avoid impression that eIF2alpha activity is low in normal conditions, also please see comment #6 above. 

      We fixed this sentence to read: “This suggests that, as in non-stressed conditions, eIF2A has a minimal effect on global translation also when the integrated stress response is active.”

      Reviewer #3 (Recommendations for the authors):

      - The experimental data in Fig. S5E do not support the claim of increased eIF2 phosphorylation on TM treatment; although, comparing Fig. S5A with Fig. 1B supports a marked reduction in bulk translation and the reporter data in Fig. 4A show the expected induction of the uORF-containing reporters by TM. Because these are the conditions employed for ribosome profiling in stress conditions shown in Fig. 4B, it would be reassuring to document TM-induced translational efficiencies of ATF4 and the other known mRNAs resistant to eIF2 phosphorylation in the ribosome profiling data, including gene browser images of the replicate experiments. If the induction of TEs by TM for such mRNAs was not robust, it would be valuable to repeat the analysis using arsenite (SA) treatment, which produces a greater inhibition of bulk translation. 

      Unfortunately, the eIF2alpha antibody is not very good and also detects the non-phosphorylated protein, causing high background and poor apparent induction in response to tunicamycin. That the ISR was activated is visible from the induction of ATF4, which was assessed by western blot in Suppl. Fig. 5E. To ensure that our ribosome profiling libraries also recorded the activation of the ISR, we built single-gene plots for ATF4 in both control and eIF2A-KO HeLa cells. As shown in Author response image 3A&B, tunicamycin treatment led to the induction of ATF4 in both cell lines. This can also be seen from the 4-fold induction in ATF4 translation efficiency in response to tunicamycin in both WT and eIF2A-KO cells (Author response image 3C). Additionally, we checked that another marker induced by tunicamycin, HSPA5, is also translationally upregulated in both cell lines, as is the downstream target of ATF4, PPP1R15B (Author response image 3C).

      Author response image 3.

      (A-B) Average read occupancy on the ATF4 (ENST00000674920) transcript in DMSO-treated (n=3) or tunicamycin-treated (n=2) samples derived from either control (panel A) or eIF2A-KO (panel B) HeLa cells. Read counts were normalized to sequencing depth and averaged across the 3 (DMSO-treated) or 2 (tunicamycin-treated) replicates. Graphs were then smoothed with a sliding window of 3 nt. (C) Scatter plot of log2(fold change) of translation efficiency, TM/DMSO, for control cells on the x-axis versus eIF2A-KO cells on the y-axis. The induction of ATF4 and of its downstream target PPP1R15B is shown, as is the upregulation of HSPA5 translation, the other hallmark of ER stress induced by tunicamycin treatment.
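
      The processing described in this legend (depth normalization, averaging over replicates, 3-nt sliding window) can be sketched as below. This is an illustrative reconstruction rather than the authors' code: the function name, the reads-per-million scaling, the zero-padded edges and the example counts are assumptions.

```python
import numpy as np


def average_smoothed_occupancy(replicate_counts, library_sizes, window=3):
    """Depth-normalize per-nucleotide read counts, average replicates, smooth.

    replicate_counts: array of shape (replicates, transcript_positions)
    library_sizes:    total mapped reads per replicate
    window:           width of the moving-average window in nucleotides
    """
    counts = np.asarray(replicate_counts, dtype=float)
    sizes = np.asarray(library_sizes, dtype=float)
    if counts.shape[0] != sizes.shape[0]:
        raise ValueError("one library size is needed per replicate")
    per_million = counts / sizes[:, None] * 1e6   # normalize to sequencing depth
    mean_profile = per_million.mean(axis=0)       # average across replicates
    kernel = np.ones(window) / window             # uniform sliding window
    # mode="same" keeps the transcript length; edges are implicitly zero-padded.
    return np.convolve(mean_profile, kernel, mode="same")


# Two hypothetical replicates with different depths but the same true profile.
profile = average_smoothed_occupancy(
    replicate_counts=[[0, 3, 0, 0, 0], [0, 6, 0, 0, 0]],
    library_sizes=[1e6, 2e6],
)
print(profile)  # the single peak is spread evenly over three positions
```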

      - It should be pointed out in the text that in both published studies being cited here of cells lacking eIF2A, that by Gaikwad et al. on a yeast eIF2A deletion mutant, and that by Ichihara et al. on human HEK293 CRISPR KO cells, the analyses included stress conditions in which eIF2 phosphorylation is induced (amino acid starvation or SA treatment, respectively), as was conducted here.  

      Good point - we added this information into the introduction: 

      "Furthermore, loss of eIF2A in several systems did not recapitulate these effects on non-AUG initiation in either non-stressed or stress conditions (caused either by amino acid depletion or sodium arsenate treatment) (Gaikwad et al., 2024; Ichihara et al., 2021)."

      - The Ichihara et al. (2021) study just mentioned reached some of the same conclusions for HEK cells obtained here by conducting ribosome profiling in untreated and SA-treated cells, finding only 1 mRNA (untreated) or four mRNAs (SA-treated cells) that showed significantly reduced TEs in the eIF2A knockout vs. parental cells. It seems appropriate for the authors to expand their treatment of this prior work by summarizing its findings in some detail and also noting how their study goes beyond this previous one. 

      We have added a paragraph to the discussion pointing out that our data agree fully with Ichihara et al. (2021), and that Ichihara et al. (2021) also found only very few mRNAs that change in TE upon loss of eIF2A in either non-stressed or stressed conditions.

    3. Reviewer #2 (Public review):

      Summary

      Roiuk et al describe work in which they have investigated the role of eIF2A in translation initiation in mammals without much success. Thus, the manuscript focuses on negative results. Further, the results, while original, are generally not novel but confirmatory, since related claims have been made before independently in different systems, with the Gaikwad et al study recently published in eLife being the most relevant.

      Despite this, we find this work highly important. This is because of a massive wealth of unreliable information and speculation regarding the role of eIF2A in translation, arising from a series of artifacts that began at the moment of eIF2A discovery. This, in combination with its unfortunate naming (eIF2A is often mixed up with the alpha subunit of eIF2, eIF2S1), has generated widespread confusion among researchers who are not experts in eukaryotic translation initiation. Given this, it is not only justifiable but critical to make independent efforts to clear up this confusion and I very much appreciate the authors' efforts in this regard.

      Strengths

      The experimental investigation described in this manuscript is thorough, appropriate and convincing.

      Weaknesses

      No major weaknesses as the authors have improved their presentation.

    4. eLife Assessment

      In this valuable study, Roiuk et al combined ribosome profiling and reporter assays to provide compelling evidence that eIF2A does not have a major impact on mRNA translation in HeLa cells. These findings are consistent with several recent publications that disaffirm the previously proposed role of eIF2A in directing protein synthesis under stress. Considering that stress-dependent perturbations in translation play a major role in homeostasis and several pathological states (e.g., cancer and neurological disorders), this work should be of broad interest to researchers studying regulation of gene expression, stress-adaptation, cancer and neurobiology.

    5. Reviewer #1 (Public review):

      Summary:

      Beyond what is stated in the title of this paper, not much needs to be summarized. eIF2A in HeLa cells promotes translation initiation of neither the main ORFs nor short uORFs under any of the conditions tested.

      Strengths:

      Very comprehensive, in fact, given the huge amount of purely negative data, an admirably comprehensive and well-executed analysis of the factor of interest.

      Weaknesses:

      The study is limited to the HeLa cell line, which is now addressed and clearly stated by the authors.

    1. eLife Assessment

      This important study demonstrates the significance of incorporating biological constraints in training neural networks to develop models that make accurate predictions under novel conditions. By comparing standard sigmoid recurrent neural networks (RNNs) with biologically constrained RNNs, the manuscript offers compelling evidence that biologically grounded inductive biases enhance generalization to perturbed conditions. This manuscript will appeal to a wide audience in systems and computational neuroscience.

    2. Reviewer #1 (Public review):

      I congratulate the authors on this beautiful work.

      This manuscript introduces a biologically informed RNN (bioRNN) that predicts the effects of optogenetic perturbations in both synthetic and in vivo datasets. By comparing standard sigmoid RNNs (σRNNs) and bioRNNs, the authors make a compelling case that biologically grounded inductive biases improve generalization to perturbed conditions. This work is innovative, technically strong, and grounded in relevant neuroscience, particularly the pressing need for data-constrained models that generalize causally.

      I have some suggestions for improvement, which I present in the order of re-reading the paper.

      Major

      (1) In line 76, the authors make a very powerful statement: 'σRNN simulation achieves higher similarity with unseen recorded trials before perturbation, but lower than the bioRNN on perturbed trials.' I couldn't find a figure showing this. This might be buried somewhere and, in my opinion, deserves some spotlight - maybe a figure or even inclusion in the abstract.

      (2) It's mentioned in the introduction (line 84) and elsewhere (e.g., line 259) that spiking has some advantage, but I don't see any figure supporting this claim. In fact, spiking seems not to matter (Figure 2C, E). Please clarify how spiking improves performance, and if it does not, acknowledge that. Relatedly, in line 246, the authors state that 'spiking is a better metric but not significant' when discussing simulations. Either remove this statement and assume spiking is not relevant, or increase the number of simulations.

      (3) The authors prefer the metric of predicting hits over MSE, especially when looking at real data (Figure 3). I would bring the supplementary results into the main figures, as both metrics are very nicely complementary. Relatedly, why not add Pearson correlation or R2, and not just focus on MSE Loss?

      (4) I really like the 'forward-looking' experiment in closed loop! But I felt that the relevance of micro perturbations is very unclear in the intro and results. This could be better motivated: why should an experimentalist care about this forward-looking experiment? Why exactly do we care about micro perturbation (e.g., in contrast to non-micro perturbation)? Relatedly, I would try to explain this in the intro without resorting to technical jargon like 'gradients'.
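
      The point in Major comment (3), that MSE, Pearson correlation and R2 are complementary, can be made concrete with a toy calculation. The sketch below is not taken from the manuscript; the function name and the numbers are hypothetical, and it uses a prediction that has the right shape but a constant offset.

```python
import numpy as np


def fit_metrics(observed, predicted):
    """Return MSE, Pearson r and R^2 for a predicted versus observed trace."""
    y = np.asarray(observed, dtype=float)
    y_hat = np.asarray(predicted, dtype=float)
    mse = float(np.mean((y - y_hat) ** 2))
    pearson_r = float(np.corrcoef(y, y_hat)[0, 1])
    ss_res = float(np.sum((y - y_hat) ** 2))
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    return {"mse": mse, "pearson_r": pearson_r, "r2": r2}


# A prediction with the correct time course but a constant offset of 2.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [3.0, 4.0, 5.0, 6.0]
metrics = fit_metrics(observed, predicted)
print(metrics)
```

      Here the correlation is perfect while MSE and R2 are poor, so reporting only one of the three would give a misleading picture of the fit.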

      Minor

      (1) In the intro, the authors refer to 'the field' twice. Personally, I find this term odd. I would opt for something like 'in neuroscience'.

      (2) Line 45: When referring to previous work using data-constrained RNN models, Valente et al. is missing (though it is well cited later when discussing regularization through low-rank constraints).

      (3) Line 11: Method should be methods (missing an 's').

      (4) In line 250, starting with 'So far', is a strange choice of presentation order. After interpreting the results for other biological ingredients, the authors introduce a new one. I would first introduce all ingredients and then interpret. It's telling that the authors jump back to 2B after discussing 2C.

      (5) The black dots in Figure 3E are not explained, or at least I couldn't find an explanation.

    3. Reviewer #2 (Public review):

      Sourmpis et al. present a study in which the importance of including certain inductive biases in the fitting of recurrent networks is evaluated with respect to the generalization ability of the networks when exposed to untrained perturbations.

      The work proceeds in three stages:
      (1) A simple illustration of the problem is made. Two reference (ground-truth) networks with qualitatively different connectivity, but similar observable network dynamics, are constructed, and recurrent networks with varying aspects of design similarity to the reference networks are trained to reproduce the reference dynamics. The activity of these trained networks during untrained perturbations is then compared to the activity of the perturbed reference networks. It is shown that, of the design characteristics that were varied, the enforced sign (Dale's law) and locality (spatial extent) of efference were especially important.
      (2) The intuition from the constructed example is then extended to networks that have been trained to reproduce certain aspects of multi-region neural activity recorded from mice during a detection task with a working-memory component. A similar pattern is demonstrated, in which enforcing the sign and locality of efference in the fitted networks has an influence on the ability of the trained networks to predict aspects of neural activity during unseen (untrained) perturbations.
      (3) The authors then illustrate the relationship between the gradient of the motor readout of trained networks with respect to the net inputs to the network units, and the sensitivity of the motor readout to small perturbations of the input currents to the units, which (in vivo) could be controlled optogenetically. The paper is concluded with a proposed use for trained networks, in which the models could be analyzed to determine the most sensitive directions of the network and, during online monitoring, inform a targeted optogenetic perturbation to bias behavior.
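
      The relationship described in stage (3), between the gradient of the readout and its sensitivity to small input perturbations, can be illustrated with a toy example. The sketch below is not the authors' bioRNN: it is a hypothetical one-step tanh rate network with random weights, and all names and sizes are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_units = 5
W = rng.normal(scale=0.5, size=(n_units, n_units))  # hypothetical recurrent weights
w_out = rng.normal(size=n_units)                    # hypothetical motor readout weights
h = rng.normal(size=n_units)                        # hypothetical current network state


def readout(input_current):
    """Motor readout after one update step with an extra input current."""
    return w_out @ np.tanh(W @ h + input_current)


def readout_gradient(input_current):
    """Analytical gradient of the readout with respect to the input current."""
    pre_activation = W @ h + input_current
    return w_out * (1.0 - np.tanh(pre_activation) ** 2)


base = np.zeros(n_units)
grad = readout_gradient(base)

# A micro-perturbation along the gradient is the most effective direction
# per unit of injected current (to first order).
eps = 1e-4
direction = grad / np.linalg.norm(grad)
predicted_change = eps * np.linalg.norm(grad)
actual_change = readout(base + eps * direction) - readout(base)

print(predicted_change, actual_change)
```

      For a perturbation this small, the first-order prediction from the gradient and the simulated change agree closely, which is why a trained model's gradient could in principle be used to choose which units to stimulate.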

      The authors do not overstate their claims, and in general, I find that I agree with their conclusions. A couple of points to be made:

      (1) Some aspects of the methods are unclear. For comparisons between recurrent networks trained from randomly initialized weights, I would expect that many initializations were made for each model variant to be compared, and that the performance characteristics are constructed by aggregating over networks trained from multiple random initializations. I could not tell from the methods whether this was done or how many models were aggregated.

      (2) It is possible that including perturbation trials in the training sets would improve model performance across conditions, including held-out (untrained) perturbations (for instance, to units that had not been perturbed during training). It could be noted that if perturbations are available, their use may alleviate some of the design decisions that are evaluated here.

    1. eLife Assessment

      This useful study attempts to place an ancient maize sample from Bolivia, dated to the end of the Incan empire, in genetic and geographical context. The analyses show that this sample is most closely related to ancient Peruvian maize, but the data are inadequate to determine the direction of dispersal. There are additional deficiencies in the statistical analyses and selection inferences. The topic of the study would appeal to researchers studying maize dispersal and adaptation.

    2. Reviewer #1 (Public review):

      Summary:

      In this manuscript, the authors describe a good-quality ancient maize genome from 15th-century Bolivia and try to link the genome characteristics to Inca influence. Overall, the manuscript is below the standard in the field. In particular, the geographic origin of the sample and its archaeological context is not well evidenced. While dating of the sample and the authentication of ancient DNA have been evidenced robustly, the downstream genetic analyses do not support the conclusion that genomic changes can be attributed to Inca influence. Furthermore, sections of the manuscript are written incoherently and with logical mistakes. In its current form, this paper is not robust and possibly of very narrow interest.

      Strengths:

      Technical data related to the maize sample are robust. Radiocarbon dating strongly evidenced the sample age, estimated to be around 1474 AD. Authentication of ancient DNA has been done robustly. Spontaneous C-to-T substitutions, which are present in all ancient DNA, are visible in the reported sample with the expected pattern. Despite a low fraction of C-to-T at the 1st base, this number could be consistent with the cool and dry climate in which the sample was preserved. The distribution of DNA fragment sizes is consistent with expectations for a sample of this age.

      Weaknesses:

      (1) Archaeological context for the maize sample is weakly supported by speculation about the origin and has unreasonable claims weighing on it. Perhaps those findings would be more convincing if the authors were to present evidence that supports their conclusions: i) a map of all known tombs near La Paz, ii) evidence supporting the stone tomb origins of this assemblage, and iii) evidence supporting non-Inca provenance of the tomb.

      (2) Dismissal of the admixture in the reported samples is not evidenced correctly. Population f3 statistic with an outgroup is indeed one of the most robust metrics for sample relatedness; however, it should not be used as a test of admixture. For an admixture test, the population f3 statistic should be used in the form: i) target population, ii) one possible parental population, iii) another possible parental population. This is typically done iteratively with all combinations of possible parental populations. Even in such a form, the population f3 statistic is not very sensitive to admixture in cases of strong genetic drift, and instead population f4 statistic (with an outgroup) is a recommended test for admixture.

      (3) The geographic placement of the sample based on genetic data is not robust. To make use of the method correctly, it would be necessary to validate that genetic samples in this region follow the assumption of the 'isolation-by-distance' with dense sampling, which has not been done. Additionally, the authors posit that "This suggests that aBM might not only be genetically related to the archaeological maize from ancient Peru, but also in the possible geographic location." The method used to infer the location is based on pure genetic estimation. The above conclusion is not supported by this method, and it directly contradicts the authors' suggestion that the sample comes from Bolivia.

      (4) The conclusion that Ancient Andean maize is genetically similar to European varieties and hence shares a similar evolutionary history is not well supported. The PCA plot in Figure 4 merely represents sample similarity based on two components (jointly responsible for about 20% of the variation explained), and European samples could be very distant based on other components. Indeed, the direct test using the outgroup f3 statistic does not support that European varieties are particularly closely related to ancient Andean maize. Perhaps these are more closely related to Brazil? We do not know, as this has not been measured.

      (5) The conclusion that long branches in the phylogenetic tree are due to selection under local adaptation has no evidence. Long branches could be the result of missing data, nucleotide misincorporations, genetic drift, or simply the inability of phylogenetic trees to model complex population-level relationships such as admixture or incomplete lineage sorting. Additionally, the caption to Figure S3 does not explain the colour-coding.

      (6) The conclusion that the selection detected in the aBM sample is due to Inca influence has no support. Firstly, a selection signature can be due to environmental or other factors. To disentangle those, the authors would need to generate data for a large number of samples from similar cultural contexts and from a wide range of environmental contexts, followed by a formal statistical test. Secondly, an allele frequency increase can be attributed to selection or to demographic processes, and alone is not sufficient evidence for selection. The presented XP-EHH method seems more suitable. Overall, the methods used in this paper raise some concerns: i) how accurate are allele-frequency tests of selection when only a single individual is used as a proxy for a whole population, ii) the significance threshold has been arbitrarily fixed to an absolute number based on other studies, but the standard is to use, for example, the top fifth percentile. Finally, linking selection to particular GO terms is not strong evidence, as correlation does not imply causation, and the links are unclear anyway.
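
      The admixture form of the f3 statistic described in weakness (2), with the target population first and two candidate parental populations after it, can be sketched as follows. This is a simplified illustration: it works on hypothetical population allele frequencies, omits the finite-sample correction and the block-jackknife standard errors used in practice, and ignores drift after admixture.

```python
import numpy as np


def f3(target, source1, source2):
    """Naive f3(target; source1, source2) from population allele frequencies.

    A clearly negative value is evidence that the target is admixed
    between populations related to source1 and source2.
    """
    c, a, b = (np.asarray(x, dtype=float) for x in (target, source1, source2))
    return float(np.mean((c - a) * (c - b)))


# Hypothetical allele frequencies at four SNPs in two parental populations.
freq_a = np.array([0.1, 0.8, 0.3, 0.9])
freq_b = np.array([0.7, 0.2, 0.6, 0.1])
admixed = 0.5 * freq_a + 0.5 * freq_b  # an equal mixture of the two

f3_admixed = f3(admixed, freq_a, freq_b)    # mixture as target
f3_unadmixed = f3(freq_a, admixed, freq_b)  # a parental population as target

print(f3_admixed, f3_unadmixed)
```

      In a real analysis this would be repeated over all pairs of candidate parental populations, as the reviewer describes, and strong drift in the target lineage can mask the negative signal.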

      In sum, this manuscript presents new data that seems to be of high quality, but the analyses are frequently inappropriate and/or over-interpreted.

    3. Reviewer #2 (Public review):

      Summary:

      The manuscript presents valuable new datasets from two ancient maize seeds that contribute to our growing understanding of the maize evolution and biodiversity landscape in pre-colonial South America. Some of the analyses are robust, but the selection elements are not supported.

      Strengths:

      The data collection is robust, and the data appear to be of sufficiently high quality to carry out some interesting analytical procedures. The central finding that aBM maize is closely related to maize from the core Inca region is well supported, although the directionality of dispersal is not supported.

      Weaknesses:

      The selection results are not justified, see examples in the detailed comments below.

      (1) The manuscript mentions cultural and natural selection (line 76), but then only gives a couple of examples of selecting for culinary/use traits. There are many examples of selection to tolerate diverse environments that could be relevant for this discussion, if desired.

      (2) I would be extremely cautious about interpreting the observations of a Spanish colonizer (lines 95-99) without very significant caveats. Indigenous agriculture and foodways would have been far more nuanced than what could be captured in this context, and the genocidal activities of the Europeans would have impacted food production activities to a degree, and any contemporaneous accounts need to be understood through that lens.

      (3) The f3 stats presented in Figure 2 are not set up to test any specific admixture scenarios, so it is unsupported to conclude that the aBM maize is not admixed on this basis (lines 201-202). The original f3 publication (Patterson et al, 2012) describes some scenarios where f3 characteristics associate with admixture, but in general, there are many caveats to this approach, and it's not the ideal tool for admixture testing, compared with e.g., f4 and D (abba-baba) statistics.

      (4) I'm a little bit skeptical that the Locator method adds value here, given the small training sample size and the wide geographic spread and genetic diversity of the ancient samples that include Central America. The paper describing that method (Battey et al 2020 eLife) uses much larger datasets, and while the authors do not specifically advise on sample sizes, they caution about small sample size issues. We have already seen that the ancient Peruvian maize has the most shared drift with aBM maize on the basis of the f3 stats, and the Locator analysis seems to just be reiterating that. I would advise against putting any additional weight on the Locator results as far as geographic origins, and personally I would skip this analysis in this case.

      (5) The overlap in PCA should not be used to confirm that aBM is authentically ancient, because with proper data handling, PCA placement should be agnostic to modern/ancient status (see lines 224-226). It is somewhat unexpected that the ancient Tehuacan maize (with a major teosinte genomic component) falls near the ancient South American maize, but this could be an artifact of sampling throughout the PCA and the lack of teosinte samples that might attract that individual.

      (6) What has been established (lines 250-251) is genetic similarity to the Inca core area, not necessarily the directionality. Might aBM have been part of a cultural region supplying maize to the Inca core region, for example? Without a specific test of dispersal directionality, which I don't think is possible with the data at hand, this is somewhat speculative.

      (7) Singleton SNPs are not a typical criterion for identifying selection; this method needs some citations supporting the exact approach and validation against neutral expectations (line 278). Without Datasets S2 and S3, which are not included with this submission, it is difficult to assess this result further. However, it is very unexpected that ~18,000 out of ~49,000 SNPs would be unique to the aBM lineage. This most likely reflects some data artifact (unaccounted damage, paralogs not treated for high coverage, which are extremely prevalent in maize, etc). I'm confused about unique SNPs in this context. How can they be unique to the aBM lineage if the SNPs used overlap the Grzybowski set? The GO results do not include any details of the exact method used or a statistical assessment of the results. It is not clear if the GO terms noted are statistically enriched.

      (8) The use of XP-EHH with pseudohaplotype variant calls is not viable (line 293). It is not clear what exact implementation of XP-EHH was used, but this method generally relies on phased or sometimes unphased diploid genotype calls to observe shared haplotypes, and some minimum population size to derive statistical power. No implementation of XP-EHH to my knowledge is appropriate for application to this kind of dataset.
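
      The D (ABBA-BABA) statistic mentioned in comment (3) as a more suitable admixture test can be sketched from allele frequencies as follows. This is a minimal illustration on hypothetical data, assuming derived-allele frequencies for populations P1, P2, P3 and an outgroup; a real analysis would add a block jackknife to obtain a Z-score.

```python
import numpy as np


def d_statistic(p1, p2, p3, outgroup):
    """Patterson's D for the tree (((P1, P2), P3), Outgroup).

    Inputs are per-SNP derived-allele frequencies. D near 0 is expected
    without gene flow; D > 0 indicates excess allele sharing between P2
    and P3, and D < 0 between P1 and P3.
    """
    p1, p2, p3, o = (np.asarray(x, dtype=float) for x in (p1, p2, p3, outgroup))
    abba = (1 - p1) * p2 * p3 * (1 - o)
    baba = p1 * (1 - p2) * p3 * (1 - o)
    denominator = np.sum(abba + baba)
    if denominator == 0:
        raise ValueError("no informative ABBA or BABA sites")
    return float(np.sum(abba - baba) / denominator)


# Hypothetical frequencies at two SNPs; the outgroup carries the ancestral allele.
p3 = [0.9, 0.8]
outgroup = [0.0, 0.0]
d_symmetric = d_statistic([0.3, 0.4], [0.3, 0.4], p3, outgroup)
d_excess = d_statistic([0.1, 0.2], [0.5, 0.6], p3, outgroup)

print(d_symmetric, d_excess)
```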

    4. Reviewer #3 (Public review):

      Summary:

      The authors seek to place archaeological maize samples (2 kernels) from Bolivia into genetic and geographical context and to assess signatures of selection. The kernels were dated to the end of the Incan empire, just prior to European colonization. Genetic data and analyses were used to characterize the distance from other ancient and modern maize samples and to predict the origin of the sample, which was discovered in a tomb near La Paz, Bolivia. Given the conquest of this region by the Incan empire, it is possible that the sample could be genetically similar to populations of maize in Peru, the center of the Incan empire. Signatures of selection in the sample could help reveal various environmental variables and cultural preferences that shaped maize genetic diversity in this region at that time.

      Strengths:

      The authors have generated substantial genetic data from these archaeological samples and have assembled a data set of published archaeological and modern maize samples that should help to place these samples in context. The samples are dated to an interesting time in the history of South America during a period of expansion of the Incan empire and just prior to European colonization. Much could be learned from even this small set of samples.

      Weaknesses:

      (1) Sample preparation and sequencing: Details of the quality of the samples, including the percentage of endogenous DNA, are missing from the methods. The low percentage of mapped reads suggests endogenous DNA was low, and this would be useful to characterize more fully. Morphological assessment of the samples and comparison to morphological data from other maize varieties is also missing. It appears that the two kernels were ground separately and that DNA was isolated separately, but data were ultimately pooled across these genetically distinct individuals for analysis. Pooling would violate assumptions of downstream analysis, which included genetic comparison to single archaeological and modern individuals.

      (2) Genetic comparison to other samples: The authors did not meaningfully address the varying ages of the other archaeological samples and modern maize when comparing the genetic distance of their samples. The archaeological samples ranged from as old as >5000 BP to as young as 70 BP and therefore have experienced varying extents of genetic drift from ancestral allele frequencies. For this reason, age should explicitly be included in their analysis of genetic relatedness.

      (3) Assessment of selection in their ancient Bolivian sample: This analysis relied on the identification of alleles that were unique to the ancient sample and inferred selection based on a large number of unique SNPs in two genes related to internode length. This could be a technical artifact due to poor alignment of sequence data, evidence supporting pseudogenization, or within an expected range of genetic differentiation based on population structure and the age of the samples. More rigor is needed to indicate that these genetic patterns are consistent with selection. This analysis may also be affected by the pooling of the Bolivian archaeological samples.

      (4) Evidence of selection in modern vs. ancient maize: In this analysis, samples were pooled into modern and ancient samples and compared using the XP-EHH statistic. One gene related to ovule development was identified as being targeted by selection, likely during modern improvement. Once again, ancient samples span many millennia and South, Central, and North America. These, and the modern samples included, do not represent meaningfully cohesive populations, likely explaining the extremely small number of loci differentiating the groups. This analysis is also complicated by the pooling of the Bolivian archaeological samples.

    1. eLife Assessment

      This important study provides a novel approach for delineating subcortical-cortical white matter bundles. The authors provide convincing evidence by harnessing state-of-the-art methods and cross-species data. Together, this effort will be of interest to scientists across multiple subfields.

    2. Reviewer #1 (Public review):

      Summary:

      The authors note that it is challenging to perform diffusion MRI tractography consistently in both humans and macaques, particularly when deep subcortical structures are involved. The scientific advance described in this paper is effectively an update to the tracts that the XTRACT software supports. The claims of robustness are based on a very small selection of subjects from a very atypical dMRI acquisition (n=50 from HCP-Adult) and an even smaller selection of subjects from a more typical study (n=10 from ON-Harmony).

      Strengths:

      The changes to XTRACT are soundly motivated in theory (based on anatomical tracer studies) and practice (changes in seeding/masking for tractography), and I think the value added by these changes to XTRACT should be shared with the field. While other bundle segmentation software typically includes these types of changes in release notes, I think papers are more appropriate.

      Weaknesses:

      The demonstration of the new tracts does not include a large number of carefully selected scans and is only compared to the prior methods in XTRACT. The small n and limited statistical comparisons are insufficient to claim that they are better than an alternative. Qualitatively, this method looks sound.

      Subject selection at each stage is unclear in this manuscript. On page 5 the data are described as "Using dMRI data from the macaque (𝑁 = 6) and human brain (𝑁 = 50)". Were the 50 HCP subjects selected to cover a range of noise levels or subject head motion? Figure 4 describes 72 pairs for each of monozygotic, dizygotic, non-twin siblings, and unrelated pairs - are these treated separately? Similarly, NH had 10 subjects, but each was scanned 5 times. How was this represented in the sample construction?

      In the paper, the authors state "the mean agreement between HCP and NH reconstructions was lower for the new tracts, compared to the original protocols (𝑝 < 10^−10). This was due to occasionally reconstructing a sparser path distribution, i.e., slightly higher false negative rate," - how can we know this is a false negative rate without knowing the ground truth?

    3. Reviewer #2 (Public review):

      Summary:

      In this article, Assimopoulos et al. expand the FSL-XTRACT software to include new protocols for identifying cortical-subcortical tracts with diffusion MRI, with a focus on tracts connecting to the amygdala and striatum. They show that the amygdalofugal pathway and divisions of the striatal bundle/external capsule can be successfully reconstructed in both macaques and humans while preserving large-scale topographic features previously defined in tract tracing studies. The authors set out to create an automated subcortical tractography protocol, and they accomplished this for a subset of specific subcortical connections for users of the FSL ecosystem.

      Strengths:

      A main strength of the current study is the translation of established anatomical knowledge to a tractography protocol for delineating cortical-subcortical tracts that are difficult to reconstruct. Diffusion MRI-based tractography is highly prone to false positives; thus, constraining tractography outputs by known anatomical priors is important. Key additional strengths include 1) the creation of a protocol that can be applied to both macaque and human data; 2) demonstration that the protocol can be applied to both high-quality data (3 shells, > 250 directions, 1.25 mm isotropic, 55 minutes) and lower-quality data (2 shells, 100 directions, 2 mm isotropic, 6.5 minutes); and 3) validation that the anatomy of cortical-subcortical tracts derived from the new method is more similar in monozygotic twins than in siblings and unrelated individuals.

      Weaknesses:

      Although this work validates the general organizational location and topographic organization of tractography-derived cortical-subcortical tracts against prior tract tracing studies (a clear strength), the validation is purely visual and thus only qualitative. Furthermore, it is difficult to assess how the current XTRACT method may compare to currently available tractography approaches to delineating similar cortical-subcortical connections. Finally, it appears that the cortical-subcortical tractography protocols developed here can only be used via FSL-XTRACT (yet not with other dMRI software), somewhat limiting the overall accessibility of the method.

      Overall Appraisal:

      This new method will accelerate research on anatomically validated cortical-subcortical white matter pathways. The work has utility for diffusion MRI researchers across fields.

    1. eLife Assessment

      This valuable work presents how PRDM16 plays a critical role during choroid plexus development, through regulating BMP signaling. Solid evidence supports the context-dependent gene regulatory mechanisms both in vivo and in vitro. The work will be of broad interest to researchers working on growth factor signaling mechanisms and vertebrate development.

    2. Reviewer #1 (Public review):

      Summary:

      This manuscript describes the role of PRDM16 in modulating BMP response during choroid plexus (ChP) development. The authors combine PRDM16 knockout mice and cultured PRDM16 KO primary neural stem cells (NSCs) to determine the interactions between BMP signaling and PRDM16 in ChP differentiation.

      They show PRDM16 KO affects ChP development in vivo and BMP4 response in vitro. They determine genes regulated by BMP and PRDM16 by ChIP-seq or CUT&TAG for PRDM16, pSMAD1/5/8, and SMAD4. They then measure gene activity in primary NSCs through H3K4me3 and find more genes are co-repressed than co-activated by BMP signaling and PRDM16 and focus on the 31 genes found to be co-repressed by BMP and PRDM16. Wnt7b is in this set and the authors then provide evidence that PRDM16 and BMP signaling together repress Wnt activity in the developing choroid plexus.

      Strengths:

      Understanding context-dependent response to cell signals during development is an important problem. The authors use a powerful combination of in vivo and in vitro systems to dissect how PRDM16 may modulate BMP response in early brain development.

      Main weakness of the experimental setup:

      (1) Because the authors state that primary NSCs cultured in vitro lose endogenous Prdm16 expression, they drive expression by a constitutive promoter. However, this means the expression levels are very different from endogenous levels (as explicitly shown in Supp. Fig. 2B) and the effect of many transcription factors is strongly dose-dependent, likely creating differences between the PRDM16-dependent transcriptional response in the in vitro system and in vivo. Although the authors combine in vitro and in vivo evidence on the role of PRDM16 as a co-factor for BMP signaling and verified that BMP induces quiescence in their NSC model in a PRDM16-dependent manner, this experimental setup remains a weakness and likely affects the results of the various genomics experiments.

      Other experimental weaknesses that make the evidence less convincing:

      (1) It seems that the authors compare Prdm16_KO cells to Prdm16 WT cells overexpressing flag_Prdm16. Aside from the possible expression of endogenous Prdm16, other cell differences may have arisen between these cell lines. A properly controlled experiment would compare Prdm16_KO ctrl (possibly infected with a control vector without Prdm16) to Prdm16_KO_E (i.e. the Prdm16_KO cells with and without Prdm16 overexpression.) The authors acknowledged this problem in their rebuttal, stating that they were unable to overexpress PRDM16 in KO cells.

      (2) The authors show in Fig.2E that Ttr is not upregulated by BMP4 in PRDM16_KO NSCs. This appears inconsistent with the presence of Ttr expression in the PRDM16_KO brain in Fig.1C. The authors explained in their rebuttal that the Ttr protein levels are not detectable in the NSCs with antibodies but the effect is still visible at the level of mRNA. The dramatic difference in protein expression is curious.

    3. Reviewer #2 (Public review):

      The authors have revised their manuscript in response to reviewer feedback, incorporating several modifications to improve clarity and provide additional supporting information. To address concerns about confusing terminology, they have standardized the reference to PRDM16 overexpressing cells as Prdm16_OE, clarifying its expression from a constitutive promoter. They also revised the text to resolve seemingly contradictory statements about ChP development in the mutant. New bioinformatic analysis comparing PRDM16 binding in E12.5 ChP cells to co-repressed versus BMP-only-repressed genes has been performed and included in Supplementary Figure 5C, providing a statistical assessment of PRDM16's regulatory role on co-repressed genes. Several figures were updated, including adding an illustration of the Prdm16 cGT allele to Figure 1B, providing a zoomed-in inset for Figure 1E, and including individual channels for Wnt2b and marking boundaries in Figure 7A. Full-view images and examples of spot segmentation for SCRINSHOT analysis are now available in a new supplementary figure, and the presentation of RT-qPCR data in Supplementary Figure 2B was improved by using separate graphs for overexpression samples to avoid a broken Y-axis. Furthermore, the authors have added more references to introductory statements, annotated structures like the ChP, CH, and fourth ventricle in figures, and clarified that the beta-Gal signal was used as a marker for mutant ChP cells in Figure 1D. Finally, the manuscript now includes a discussion of the recently published, related study by Hurwitz et al. (2023) in the discussion section, highlighting similarities and differences. Overall, the authors have satisfactorily addressed the reviewers' comments.

    4. Reviewer #3 (Public review):

      Summary:

      Bone morphogenetic protein (BMP) signaling instructs multiple processes during development including cell proliferation and differentiation. The authors set out to understand the role of PRDM16 in these various functions of BMP signaling. They find that PRDM16 and BMP co-operate to repress stem cell proliferation by regulating the genomic distribution of BMP pathway transcription factors. They additionally show that PRDM16 impacts choroid plexus epithelial cell specification. The authors provide evidence for a regulatory circuit (consisting of BMP, PRDM16 and Wnt) that influences stem cell proliferation/differentiation.

      Strengths:

      I find the topics studied by the authors in this study of general interest to the field, the experiments well-controlled and the analysis in the paper sound. I have no major scientific concerns.

      Weaknesses:

      I have some minor recommendations which will help improve the paper (regarding the discussion).

      Comments on revised version:

      The authors have addressed my concerns in the revised version of the manuscript.

    5. Author response:

      The following is the authors’ response to the original reviews

      Reviewer #1 (Public review):

      Summary:

      This manuscript describes the role of PRDM16 in modulating BMP response during choroid plexus (ChP) development. The authors combine PRDM16 knockout mice and cultured PRDM16 KO primary neural stem cells (NSCs) to determine the interactions between BMP signaling and PRDM16 in ChP differentiation.

      They show PRDM16 KO affects ChP development in vivo and BMP4 response in vitro. They determine genes regulated by BMP and PRDM16 by ChIP-seq or CUT&TAG for PRDM16, pSMAD1/5/8, and SMAD4. They then measure gene activity in primary NSCs through H3K4me3 and find more genes are co-repressed than co-activated by BMP signaling and PRDM16. They focus on the 31 genes found to be co-repressed by BMP and PRDM16. Wnt7b is in this set and the authors then provide evidence that PRDM16 and BMP signaling together repress Wnt activity in the developing choroid plexus.

      Strengths:

      Understanding context-dependent responses to cell signals during development is an important problem. The authors use a powerful combination of in vivo and in vitro systems to dissect how PRDM16 may modulate BMP response in early brain development.

      We thank the reviewer for the thoughtful summary and positive feedback. We appreciate the recognition of our integrative in vivo and in vitro approach. We're glad the reviewer found our findings on context-dependent gene regulation and developmental signaling valuable.

      Main weaknesses of the experimental setup:

      (1) Because the authors state that primary NSCs cultured in vitro lose endogenous Prdm16 expression, they drive expression by a constitutive promoter. However, this means the expression levels are very different from endogenous levels (as explicitly shown in Supplementary Figure 2B) and the effect of many transcription factors is strongly dose-dependent, likely creating differences between the PRDM16-dependent transcriptional response in the in vitro system and in vivo.

      We acknowledge that our in vitro experiments may not ideally replicate the in vivo situation, a common limitation of such experiments. Our primary aim was to explore the molecular relationship between PRDM16 and BMP signaling in gene regulation. Such molecular investigations are challenging to conduct using in vivo tissues. In vitro NSCs treated with BMP4 have been used as a model to investigate NSC proliferation and quiescence, drawing on previous studies (e.g., Helena Mira, 2010; Marlen Knobloch, 2017). Crucially, to ensure the relevance of our in vitro findings to the in vivo context, we confirmed that cultured cells could indeed be induced into quiescence by BMP4, and this induction necessitated the presence of PRDM16. Furthermore, upon identifying target genes co-regulated by PRDM16 and SMADs, we validated PRDM16's regulatory role on a subset of these genes in the developing Choroid Plexus (ChP) (Fig. 7 and Suppl.Fig7-8). Only by combining evidence from both in vitro and in vivo experiments could we confidently conclude that PRDM16 serves as an essential co-factor for BMP signaling in restricting NSC proliferation.

      (2) It seems that the authors compare Prdm16_KO cells to Prdm16 WT cells overexpressing flag_Prdm16. Aside from the possible expression of endogenous Prdm16, other cell differences may have arisen between these cell lines. A properly controlled experiment would compare Prdm16_KO ctrl (possibly infected with a control vector without Prdm16) to Prdm16_KO_E (i.e. the Prdm16_KO cells with and without Prdm16 overexpression.)

      We agree that Prdm16 KO cells carrying the Prdm16-expressing vector would be a good comparison with those with KO_vector. However, despite more than 10 attempts with various optimization conditions, we were unable to establish a viable cell line after infecting Prdm16 KO cells with the Prdm16-expressing vector. The overall survival rate for primary NSCs after viral infection is low, and we observed that KO cells were particularly sensitive to infection treatment when the viral vector was large (the Prdm16 ORF is more than 3kb).

      As an alternative to assess vector effects, we instead included two other control cell lines, wt and KO cells infected with the 3xNLS_Flag-tag viral vector, and presented the results in supplementary Fig 2. When we compared the responses of the four lines (wt, KO, wt infected with the Flag vector, KO infected with the Flag vector) to the addition and removal of BMP4, we confirmed that the viral infection itself has no significant impact on the responses of these cells to these treatments regarding changes in cell proliferation and Ttr induction.

      Given that wt cells and KO cells, with or without viral backbone infection, behave quite similarly in terms of cell proliferation, we speculate that even if we were successful in obtaining a cell line with the Prdm16-expressing vector in the KO cells, it may not exhibit substantial differences compared to wt cells infected with the Prdm16-expressing vector.

      Other experimental weaknesses that make the evidence less convincing:

      (1) The authors show in Figure 2E that Ttr is not upregulated by BMP4 in PRDM16_KO NSCs. Does this appear inconsistent with the presence of Ttr expression in the PRDM16_KO brain in Figure1C?

      The reviewer’s point is that there was no significant increase in Ttr expression in Prdm16_KO cells after BMP4 treatment (Fig. 2E), but there remained residual Ttr mRNA signals in the Prdm16 mutant ChP (Fig. 1C). We think the difference lies in the measurable level of Ttr expression between that induced by BMP4 in NSC culture and that in the ChP. This is based on our immunostaining experiment, in which we tried to detect Ttr using a Ttr antibody. This antibody could not detect the Ttr protein in BMP4-treated Prdm16_expressing NSCs but clearly showed Ttr signal in the wt ChP. This means that although Ttr expression can be significantly increased by BMP4 in vitro to a level measurable by RT-qPCR, its absolute quantity even in the Prdm16_expressing condition is much lower compared to that in vivo. Our results in Fig 1C and Fig 2E, as well as Fig 7B, all consistently showed that Prdm16 depletion significantly reduced Ttr expression both in vitro and in vivo.

      (2) Figure 3: The authors use H3K4me3 to measure gene activity. This is however, very indirect, with bulk RNA-seq providing the most direct readout and polymerase binding (ChIP-seq) another more direct readout. Transcription can be regulated without expected changes in histone methylation, see e.g. papers from Josh Brickman. They verify their H3K4me3 predictions with qPCR for a select number of genes, all related to the kinetochore, but it is not clear why these genes were picked, and one could worry whether these are representative.

      H3K4me3 has widely been used as an indicator of active transcription and is a mark for cell identity genes. It has also been demonstrated that H3K4me3 has a direct function in regulating transcription at the step of RNA Pol II pause release. As stated in the text, there are advantages and disadvantages of using H3K4me3 compared to using RNA-seq. RNA-seq profiles all gene products, which are affected by transcription and RNA stability and turnover. In contrast, H3K4me3 levels at gene promoters reflect transcriptional activity. In our case, we aimed to identify differential gene expression between proliferation and quiescence states. The transition between these two states is fast and dynamic. RNA-seq may not be able to identify functionally relevant genes and is more likely to produce false-positive and false-negative results. Therefore, we chose H3K4me3 profiling.

      We agree that transcription may change without histone methylation changes. This may cause an under-estimation of the number of changed genes between the conditions. 

      We validated 7 out of 31 genes (Wnt7b, Id3, Mybl2, Spc24, Spc25, Ndc80 and Nuf2). We chose these genes based on two criteria: 1) their function is implicated in cell proliferation and cell-cycle regulation based on gene ontology analysis; 2) their gene products are detectable in the developing ChP based on the scRNA-seq data. Three of these genes (Wnt7b, Id3, Mybl2) are not related to the kinetochore. We now clarify this description in the revised text.

      (3) Line 256: The overlap of 31 genes between 184 BMP-repressed genes and 240 PRDM16-repressed genes seems quite small.

      This result indicates that in addition to co-repressing cell-cycle genes, BMP and PRDM16 have independent functions. For example, it was reported that BMP regulates neuronal and astrocyte differentiation (Katada, S. 2021), while our previous work demonstrated that Prdm16 controls temporal identity of NSCs (He, L. 2021).

      (4) The Wnt7b H3K4me3 track in Fig. 3G is not discussed in the text but it shows H3K4me3 high in _KO and low in _E regardless of BMP4. This seems to contradict the heatmap of H3K4me3 in Figure 3E which shows H3K4me3 high in _E no BMP4 and low in _E BMP4 while omitting _KO no BMP4. Meanwhile CDKN1A, the other gene shown in 3G, is missing from 3E.

      The track in Fig 3G shows the absolute signal of H3K4me3 after mapping the sequencing reads to the genome and normalizing them to library size. Comparing the signal in Prdm16_E with BMP4 and that in Prdm16_E without BMP4, the one with BMP4 has a lower peak. The same trend can be seen for the pair of Prdm16_KO cells with or without BMP4. The heatmap in Fig. 3E shows the relative level of H3K4me3 in three conditions. The Prdm16_E cells with BMP4 have the lowest level, while the other two conditions (Prdm16_KO with BMP4 and Prdm16_E without BMP4) display higher levels. These two graphs show a consistent trend of H3K4me3 changes at the Wnt7b promoter across these conditions. Figure 3E only includes genes that are co-repressed by PRDM16 and BMP. CDKN1A’s H3K4me3 signals are consistent between the conditions, and thus it is not a PRDM16- or BMP-regulated gene. We use it as a negative control.

      (5) The authors use PRDM16 CUT&TAG on dissected dorsal midline tissues to determine if their 31 identified PRDM16-BMP4 co-repressed genes are regulated directly by PRDM16 in vivo. By manual inspection, they find that "most" of these show a PRDM16 peak. How many is most? If using the same parameters for determining peaks, how many genes in an appropriately chosen negative control set of genes would show peaks? Can the authors rigorously establish the statistical significance of this observation? And why wasn't the same experiment performed on the NSCs in which the other experiments are done so one can directly compare the results? Instead, as far as I could tell, there is only ChIP-qPCR for two genes in NSCs in Supplementary Figure 4D.

      In our text, we indicated the genes containing PRDM16 binding peaks in the figures and described them as “Text in black in Fig. 6A and Supplementary Fig. 5A”. We will add the precise number “25 of these genes” in the main text to clarify it. We used the BMP-only repressed 184-31 = 153 genes (excluding PRDM16-BMP4 co-repressed genes) as a negative control set. By computationally determining the nearest TSS to a PRDM16 peak, we identified 24/31 co-repressed genes and 84/153 BMP-only-repressed genes containing PRDM16 peaks in the E12.5 ChP data. Fisher’s Exact Test comparing the proportions yields the P-value = 0.015.
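
      The comparison described above can be laid out as a 2x2 table (co-repressed vs. BMP-only-repressed genes, with vs. without a PRDM16 peak). The following is a minimal sketch of Fisher's exact test on those counts using only the Python standard library. It is not the authors' code: the function name `fisher_exact_2x2` is hypothetical, and because the response does not state whether a one-sided or two-sided alternative was used, the sketch reports both.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns (p_greater, p_two_sided), where p_greater is the one-sided
    probability of seeing at least `a` in the top-left cell given fixed
    margins, and p_two_sided sums all tables no more likely than the
    observed one.
    """
    row1 = a + b          # size of the first group (co-repressed genes)
    col1 = a + c          # total genes with a PRDM16 peak
    n = a + b + c + d     # all genes in the comparison
    denom = comb(n, row1)

    def pmf(x):
        # Hypergeometric probability of x peak-containing genes in group 1
        return comb(col1, x) * comb(n - col1, row1 - x) / denom

    lo = max(0, row1 - (n - col1))
    hi = min(row1, col1)
    p_obs = pmf(a)
    p_greater = sum(pmf(x) for x in range(a, hi + 1))
    # Small relative tolerance guards against floating-point ties
    p_two_sided = sum(
        pmf(x) for x in range(lo, hi + 1) if pmf(x) <= p_obs * (1 + 1e-7)
    )
    return p_greater, p_two_sided


# Counts taken from the response: 24 of 31 co-repressed genes and
# 84 of 153 BMP-only-repressed genes contain PRDM16 peaks.
p_greater, p_two_sided = fisher_exact_2x2(24, 31 - 24, 84, 153 - 84)
print(f"one-sided p = {p_greater:.4f}, two-sided p = {p_two_sided:.4f}")
```

      Fixing the margins and summing hypergeometric probabilities is the standard construction of the test; the exact value obtained depends on which alternative is chosen.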

      We are confused by the second part of the comment “And why wasn't the same experiment performed on the NSCs in which the other experiments are done so one can directly compare the results? Instead, as far as I could tell, there is only ChIP-qPCR for two genes in NSCs in Supplementary Figure 4D.” If the reviewer meant why we didn’t sequence the material from sequential ChIP or validate more target genes, the reason is the limitation of the material. Sequential ChIP requires a large quantity of antibody and yields little material, barely sufficient for a few qPCR reactions after the second round of IP. This yield was far below the minimum required for library construction. The PRDM16 antibody was a gift, and the quantity we had was very limited. We made considerable efforts to optimize all available commercial antibodies in ChIP and Cut&Tag, but none of them worked in these assays.

      (6) In comparing RNA in situ between WT and PRDM16 KO in Figure 7, the authors state they use the Wnt2b signal to identify the border between CH and neocortex. However, the Wnt2b signal is shown in grey and it is impossible for this reviewer to see clear Wnt2b expression or where the boundaries are in Figure 7A. The authors also do not show where they placed the boundaries in their analysis. Furthermore, Figure 7B only shows insets for one of the regions being compared making it difficult to see differences from the other region. Finally, the authors do not show an example of their spot segmentation to judge whether their spot counting is reliable. Overall, this makes it difficult to judge whether the quantification in Figure 7C can be trusted.

      In the revised manuscript we have included an individual channel of Wnt2b and marked the boundaries. We also provide full-view images and examples of spot segmentation in the new supplementary figure 8.

      (7) The correlation between mKi67 and Axin2 in Figure 7 is interesting but does not convincingly show that Wnt downstream of PRDM16 and BMP is responsible for the increased proliferation in PRDM16 mutants.

      We agree that this result (the correlation between mKi67 and Axin2) alone only suggests that Wnt signaling is related to the proliferation defect in the Prdm16 mutant, and does not necessarily mean that Wnt is downstream of PRDM16 and BMP. Our conclusion is backed up by two additional lines of evidence: the Cut&Tag data, in which PRDM16 binds to regulatory regions of Wnt7b and Wnt3a, and the finding that BMP and PRDM16 co-repress Wnt7b in vitro.

      An ideal result would be that down-regulating Wnt signaling in the Prdm16 mutant rescues the Prdm16 mutant phenotype. Such an experiment is technically challenging. Wnt plays diverse and essential roles in NSC regulation, and one would need to use a cell type- and stage-specific tool to down-regulate Wnt in the background of the Prdm16 mutation. Moreover, Wnt genes are not the only targets regulated by PRDM16 in these cells, and downregulating Wnt may not be sufficient to rescue the phenotype.

      Weaknesses of the presentation:

      Overall, the manuscript is not easy to read. This can cause confusion.

      We have revised the text to improve clarity.

      Reviewer #1 (Recommendations for the authors):

      (1) Overall, the manuscript is not easy to read. Here are some causes of confusion for which the presentation could be cleaned up:

      We are grateful for the reviewer’s suggestion. In the revised manuscript, we have made efforts to improve the clarity of the text.

      (a) Part of the first section is confusing in that some statements seem contradictory, in particular:

      "there is no overall patterning defect of ChP and CH in the Prdm16 mutant" (line 125)

      "Prdm16 depletion disrupted the transition from neural progenitors into ChP epithelia" (line 144)

      It would be helpful if the authors could reformulate this more clearly.

      We modified the text to clarify that while the BMP-patterned domain is not affected, the transition of NSCs into ChP epithelial cells is compromised in the Prdm16 mutant.

      (b) Flag_PRDM16, PRDM16_expressing, PRDM16_E, PRDM16 OE all seem to refer to the same PRDM16 overexpressing cells, which is very confusing. The authors should use consistent naming. Moreover, it would be good if they renamed these all to PRDM16_OE to indicate expression is not endogenous but driven by a constitutive promoter.

      We appreciate the comment and agree that the use of multiple terms to refer to the same PRDM16-overexpressing condition was confusing. Our original intention in using Prdm16_E was to distinguish cells expressing PRDM16 from the two other groups: wild-type cells and Prdm16_KO cells, which both lack PRDM16 protein expression. However, we acknowledge that Prdm16_E could be misinterpreted as indicating expression from the endogenous Prdm16 promoter. To avoid this confusion and ensure consistency, we have now standardized the terminology and refer to this condition as Prdm16_OE, indicating Flag-tagged PRDM16 expression driven by a constitutive promoter.

      (c) Line 179 states "generated a cell line by infecting Prdm16_KO cells with the same viral vector, expressing 3xNSL_Flag". Do the authors mean 3xNLS_Flag_Prdm16, so these are the Prdm16_KO_E cells by the notation suggested above? Or is this a control vector with Flag only? The following paragraph refers to Supplementary Figure 2C-F where the same construct is called KO_CDH, suggesting this was an empty CDH vector, without Flag, or Prdm16. This is confusing.

      We appreciate the reviewer’s careful reading and helpful comment. We acknowledge the confusion caused by the inconsistent terminology. To clarify: in line 179, we intended to describe an attempt to generate a Prdm16_KO cell line expressing 3xNLS_Flag_Prdm16, not a control vector with Flag only. However, despite repeated attempts, we were unable to establish this line due to low viral efficiency and the vulnerability of Prdm16_KO cells to infection with the large construct. Therefore, these cells were not included in the subsequent analyses.

      The term KO_CDH refers to Prdm16_KO cells infected with the empty CDH control vector, which lacks both Flag and Prdm16. This is the line used in the experiments shown in Supplementary Fig. 2C–F. We have revised the text throughout the manuscript to ensure consistent use of terminology and to avoid this confusion.

      (2) The introductory statements on lines 53-54 could use more references.

      Thanks for the suggestion. We have now included more references.

      (3) It would be helpful if all structures described in the introduction and first section were annotated in Figure 1, or otherwise, if a cartoon were included. For example, the cortical hem, and fourth ventricle.

      Thanks for the suggestion. We have now indicated the structures, ChP, CH and the fourth ventricle, in the images in Figure 1 and Supplementary Figure 1.

      (4) In line 115, "as previously shown.." - to keep the paper self-contained a figure illustrating the genetics of the KO allele would be helpful.

      Thanks for the suggestion. We have now included an illustration of the Prdm16 cGT allele in Figure 1B.

      (5) In Figure 1D as costain for a ChP marker would be helpful because it is hard to identify morphologically in the Prdm16 KO.

      We apologize for the lack of clarity. The KO allele contains a β-geo reporter driven by the endogenous Prdm16 promoter. The samples were co-stained for EdU, β-Gal and DAPI. To distinguish the ChP domain from the CH, we used the presence of β-Gal as a marker. We indicated this in the figure legend, and have now also clarified it in the revised text.

      (6) The details in Figure 1E are hard to see, a zoomed-in inset would help.

      A zoomed-in inset is now included in the figure.

      (7) Supplementary Figure 2A does not convincingly show that PRDM16 protein is undetectable since endogenous expression may be very low compared to the overexpression PRDM16_E cells so if the contrast is scaled together it could appear black like the KO.

      We appreciate the reviewer’s point and have carefully considered this concern. We concluded that PRDM16 protein is effectively undetectable in cultured wild-type NSCs based on direct comparison with brain tissue. Both cultured NSCs and brain sections were processed under similar immunostaining and imaging conditions. While PRDM16 showed robust and specific nuclear localization in embryonic brain sections (Fig. 1B and Supplementary Fig. 1A), only a small subset of cultured NSCs exhibited PRDM16 signal, primarily in the cytoplasm (middle panel of Fig. 2A). This stark contrast supports our conclusion that endogenous PRDM16 protein is either absent or significantly downregulated in vitro. Because of this limitation, we turned to over-expressing Prdm16 in NSC culture using a constitutive promoter. 

      (9) Line 182 "Following the washout step" - no such step had been described, maybe replace by "After washout of BMP".

      Yes, we have revised the text.

      (8) Line 214: "indicating a modest level" - what defines modest? Compared to what? Why is a few thousand moderate rather than low? Does it go to zero with inhibitors for pathways?

      Here, a modest level means a level lower than that observed after adding BMP4. To clarify this, we revised the description to “indicating endogenous levels of …”

      (9) The way qPCR data are displayed makes it difficult to appreciate the magnitude of changes, e.g. in Supplementary Figure 2B where a gap is introduced on the scale. Displaying log fold change / relative CT values would be more informative.

      We used a segmented Y-axis in Supplementary Figure 2B because the Prdm16 overexpression samples exhibited much higher expression levels than the other conditions. In response to this suggestion, we explored alternative ways to present the result, including plotting log-transformed values and log fold changes. However, these methods did not make the differences clearer; in fact, log scaling made the magnitude of change less apparent. To address this, we now present the overexpression samples in a separate graph, thereby eliminating the need for a broken Y-axis and improving the overall readability of the data.
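      For readers unfamiliar with the log fold change representation the reviewer suggests, the following is a minimal sketch of the standard 2^-ΔΔCt calculation. It is not the authors' analysis code: the function names and the Ct values in the usage example are hypothetical, and the calculation assumes roughly 100% primer efficiency.

```python
def log2_fold_change(ct_target_sample, ct_ref_sample,
                     ct_target_control, ct_ref_control):
    """Log2 fold change of a target gene by the 2^-ddCt method.

    Assumes ~100% primer efficiency, so one Ct cycle corresponds to
    a two-fold change in template amount.
    """
    # Normalize the target gene to the reference gene in each condition.
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_control = ct_target_control - ct_ref_control
    # A lower Ct means more template, hence the sign flip.
    return -(delta_ct_sample - delta_ct_control)


def fold_change(ct_target_sample, ct_ref_sample,
                ct_target_control, ct_ref_control):
    """Linear-scale fold change corresponding to log2_fold_change."""
    return 2 ** log2_fold_change(ct_target_sample, ct_ref_sample,
                                 ct_target_control, ct_ref_control)


# Hypothetical Ct values: target gene at Ct 20 in the overexpression
# sample and Ct 25 in the control, reference gene at Ct 18 in both.
print(log2_fold_change(20.0, 18.0, 25.0, 18.0))
print(fold_change(20.0, 18.0, 25.0, 18.0))
```

      On a log2 scale, a very large overexpression value and a modest change in another condition fit on one unbroken axis, which is the trade-off discussed in the response above.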

      (10) Writing out "3 days" instead of 3D in Figure 2A would improve clarity. It would be good if the used time interval is repeated in other figures throughout the paper so it is still clear the comparison is between 0 and 3 days.

      We have changed “3D” to “3 days”. All BMP4 treatments in this study were 3 days.

      (11) Line 290: "we found that over 50% of SMAD4 and pSMAD1/5/8 binding peaks were consistent in Prdm16_E and Prdm16_KO cells, indicating that deletion of Prdm16 does not affect the general genomic binding ability of these proteins" - this only makes sense to state with appropriate controls because 50% seems like a big difference, what is the sample to sample variability for the same condition? Moreover, the next paragraph seems to contradict this, ending with "This result suggests that SMAD binding to these sites depends on PRDM16". The authors should probably clarify the writing.

      We appreciate the reviewer’s comment and agree that clarification was needed. Our point was that SMAD4 and pSMAD1/5/8 retain the ability to bind DNA broadly in the Prdm16 KO cells, with more than half of the original binding sites still occupied. This suggests that deletion of Prdm16 does not globally impair SMAD genomic binding. However, our primary interest lies in the subset of sites that show differential SMAD binding between wt and Prdm16 KO conditions, as these are likely to be PRDM16-dependent.

      In the following paragraph, we focused specifically on describing SMAD and PRDM16 co-bound sites. At these loci, SMAD4 and pSMAD1/5/8 showed reduced enrichment in the absence of PRDM16, suggesting PRDM16 facilitates SMAD binding at these particular regions. We have revised the text in the manuscript to more clearly distinguish between global SMAD binding and PRDM16-dependent sites.
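      The kind of control the reviewer asks for can be illustrated with a minimal sketch: compute the fraction of peaks shared between two conditions and compare it with the fraction shared between replicates of the same condition. This is an illustration only, not the authors' pipeline; the function name and the peak coordinates are hypothetical, and peaks are assumed to be half-open (chromosome, start, end) intervals.

```python
from collections import defaultdict


def fraction_overlapping(peaks_a, peaks_b):
    """Fraction of peaks in peaks_a that overlap at least one peak in peaks_b.

    Peaks are (chrom, start, end) tuples with half-open coordinates.
    """
    if not peaks_a:
        return 0.0
    # Group the second peak set by chromosome for lookup.
    by_chrom = defaultdict(list)
    for chrom, start, end in peaks_b:
        by_chrom[chrom].append((start, end))
    hits = 0
    for chrom, start, end in peaks_a:
        # Two half-open intervals overlap when each starts before the other ends.
        if any(start < b_end and b_start < end
               for b_start, b_end in by_chrom.get(chrom, ())):
            hits += 1
    return hits / len(peaks_a)


# Hypothetical peak sets for two conditions.
condition_1 = [("chr1", 100, 200), ("chr1", 500, 600),
               ("chr2", 50, 150), ("chr3", 10, 20)]
condition_2 = [("chr1", 150, 250), ("chr2", 149, 300), ("chr1", 600, 700)]
print(fraction_overlapping(condition_1, condition_2))
```

      Running the same function on two replicates of one condition gives a baseline: an overlap of about 50% between conditions is informative only relative to that replicate-to-replicate value.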

      (12) Much more convincing than ChIP-qPCR for c-FOS for two loci in Figures 5F-G would be a global analysis of c-FOS ChIP-seq data.

      We agree that a global c-FOS ChIP-seq analysis would provide a more comprehensive view of c-FOS binding patterns. However, the primary focus of this study is the interaction between BMP signaling and PRDM16. The enrichment of AP-1 motifs at ectopic SMAD4 binding sites was an unexpected finding, which we validated using c-FOS ChIP-qPCR at selected loci. While a genome-wide analysis would be valuable, it falls beyond the current scope. We agree that future studies exploring the interplay among SMAD4/pSMAD, PRDM16, and AP-1 will be important and informative.

      (13) Figure 6A is hard to read. A heatmap would make it much easier to see differences in expression. Furthermore, if the point is to see the difference between ChP and CH, why not combine the different subclusters belonging to those structures? Finally, why are there 28 genes total when it is said the authors are evaluating a list of 31 genes and also displaying 6 genes that are not expressed (so the difference isn't that unexpressed genes are omitted)?

      For the scRNA-seq data, we chose violin plots because they display both gene expression levels and the number of cells that express each gene. However, we agree that the labels in Figure 6A were too small and difficult to read. We have revised the figure by increasing the font size and moving genes with low expression to Supplementary Figure 5A. Figure 6A now includes the 17 more highly expressed genes together with three markers, and Supplementary Figure 5A contains the 13 lowly expressed genes. One gene, Mrtfb, is missing from the scRNA-seq data and is thus not included. We have revised the description of the result in the main text and figure legends.

      Reviewer #2 (Public review):

      Summary:

      This article investigates the role of PRDM16 in regulating cell proliferation and differentiation during choroid plexus (ChP) development in mice. The study finds that PRDM16 acts as a corepressor in the BMP signaling pathway, which is crucial for ChP formation.

      The key findings of the study are:

      (1) PRDM16 promotes cell cycle exit in neural epithelial cells at the ChP primordium.

      (2) PRDM16 and BMP signaling work together to induce neural stem cell (NSC) quiescence in vitro.

      (3) BMP signaling and PRDM16 cooperatively repress proliferation genes.

      (4) PRDM16 assists genomic binding of SMAD4 and pSMAD1/5/8.

      (5) Genes co-regulated by SMADs and PRDM16 in NSCs are repressed in the developing ChP.

      (6) PRDM16 represses Wnt7b and Wnt activity in the developing ChP.

      (7) Levels of Wnt activity correlate with cell proliferation in the developing ChP and CH.

      In summary, this study identifies PRDM16 as a key regulator of the balance between BMP and Wnt signaling during ChP development. PRDM16 facilitates the repressive function of BMP signaling on cell proliferation while simultaneously suppressing Wnt signaling. This interplay between signaling pathways and PRDM16 is essential for the proper specification and differentiation of ChP epithelial cells. This study provides new insights into the molecular mechanisms governing ChP development and may have implications for understanding the pathogenesis of ChP tumors and other related diseases.

      Strengths:

      (1) Combining in vitro and in vivo experiments to provide a comprehensive understanding of PRDM16 function in ChP development.

      (2) Using a variety of techniques, including immunostaining, RNA in situ hybridization, RT-qPCR, CUT&Tag, ChIP-seq, and SCRINSHOT.

      (3) Identifying a novel role for PRDM16 in regulating the balance between BMP and Wnt signaling.

      (4) Providing a mechanistic explanation for how PRDM16 enhances the repressive function of BMP signaling. The identification of SMAD palindromic motifs as preferred binding sites for the SMAD/PRDM16 complex suggests a specific mechanism for PRDM16-mediated gene repression.

      (5) Highlighting the potential clinical relevance of PRDM16 in the context of ChP tumors and other related diseases. By demonstrating the crucial role of PRDM16 in controlling ChP development, the study suggests that dysregulation of PRDM16 may contribute to the pathogenesis of these conditions.

      We thank the reviewer for the thorough and thoughtful summary of our study. We’re glad the key findings and significance of our work were clearly conveyed, particularly regarding the role of PRDM16 in coordinating BMP and Wnt signaling during ChP development. We also appreciate the recognition of our integrated approach and the potential implications for understanding ChP-related diseases.

      Weaknesses:

      (1) Limited investigation of the mechanism controlling PRDM16 protein stability and nuclear localization in vivo. The study observed that PRDM16 protein became nearly undetectable in NSCs cultured in vitro, despite high mRNA levels. While the authors speculate that post-translational modifications might regulate PRDM16 in NSCs similar to brown adipocytes, further investigation is needed to confirm this and understand the precise mechanism controlling PRDM16 protein levels in vivo.

      While the mechanisms controlling PRDM16 protein stability and nuclear localization in the developing brain are interesting, the scope of this paper is to reveal the function of PRDM16 in the choroid plexus and its interaction with BMP signaling. We will be happy to pursue this direction in our next study.

      (2) Reliance on overexpression of PRDM16 in NSC cultures. To study PRDM16 function in vitro, the authors used a lentiviral construct to constitutively express PRDM16 in NSCs. While this approach allowed them to overcome the issue of low PRDM16 protein levels in vitro, it is important to consider that overexpressing PRDM16 may not fully recapitulate its physiological role in regulating gene expression and cell behavior.

      As stated above, we acknowledge that findings from cultured NSCs may not directly apply to ChP cells in vivo, and we are cautious with our statements. The cell culture work aimed to identify potential mechanisms by which PRDM16 and SMADs interact to regulate gene expression, and to identify target genes co-regulated by these factors. We expect that not all targets from cell culture are regulated by PRDM16 and SMADs in the ChP, so we validated expression changes of several target genes in the developing ChP and have now included the new data in Fig. 7 and Supplementary Fig. 7. Out of the 31 genes identified from cultured cells, four cell cycle regulators, including Wnt7b, Id3, Spc24/25/nuf2 and Mybl2, showed de-repression in the Prdm16 mutant ChP. These genes may be relevant downstream genes in the ChP, while other target genes may be cortical NSC-specific or less dependent on Prdm16 in vivo.

      (3) Lack of direct evidence for AP1 as the co-factor responsible for SMAD relocation in the absence of PRDM16. While the study identified the AP1 motif as enriched in SMAD binding sites in Prdm16 knockout cells, they only provided ChIP-qPCR validation for c-FOS binding at two specific loci (Wnt7b and Id3). Further investigation is needed to confirm the direct interaction between AP1 and SMAD proteins in the absence of PRDM16 and to rule out other potential co-factors.

      We agree that the enrichment of the AP1 motif at the PRDM16 and SMAD co-binding regions in Prdm16 KO cells only indirectly suggests AP1 as a co-factor for SMAD relocation. That is why we used ChIP-qPCR to examine the presence of c-FOS at these sites. Although we validated only two targets, the result confirms that c-FOS binds to these sites in the Prdm16 KO cells but not in Prdm16_expressing cells, suggesting that AP1 is a co-factor. Our results cannot rule out the presence of other co-factors.

      Reviewer #2 (Recommendations for the authors):

      Minor typo: [7, page 3] "sicne" should be "since".

      We appreciate the reviewer’s careful reading. We have now corrected the typo and revised parts of the text to improve clarity.

      Reviewer #3 (Public review):

      Summary:

      Bone morphogenetic protein (BMP) signaling instructs multiple processes during development including cell proliferation and differentiation. The authors set out to understand the role of PRDM16 in these various functions of BMP signaling. They find that PRDM16 and BMP co-operate to repress stem cell proliferation by regulating the genomic distribution of BMP pathway transcription factors. They additionally show that PRDM16 impacts choroid plexus epithelial cell specification. The authors provide evidence for a regulatory circuit (consisting of BMP, PRDM16, and Wnt) that influences stem cell proliferation/differentiation.

      Strengths:

      I find the topics studied by the authors in this study of general interest to the field, the experiments well-controlled and the analysis in the paper sound.

      We thank the reviewer for their positive feedback and thoughtful summary. We appreciate the recognition of our efforts to define the role of PRDM16 in BMP signaling and stem cell regulation, as well as the soundness of our experimental design and analysis.

      Weaknesses:

      I have no major scientific concerns. I have some minor recommendations that will help improve the paper (regarding the discussion).

      We have revised the discussion according to the suggestions.

      Reviewer #3 (Recommendations for the authors):

      Specific minor recommendations:

      Page 18. Line 526: In a footnote, the authors point out a recent report which in parallel was investigating the link between PRDM16 and SMAD4. There is substantial non-overlap between these two papers. To aid the reader, I would encourage the authors to discuss that paper in the discussion section of the manuscript itself, highlighting any similarities/differences in the topic/results.

      Thanks for the suggestion. We have now included the comparison in the discussion. One conclusion is consistent between our study and this publication: PRDM16 functions as a co-repressor of SMAD4. However, the mechanisms are different. Our data suggest a model in which PRDM16 facilitates SMAD4/pSMAD binding to repress proliferation genes under high BMP conditions. The other report, by contrast, suggests that SMAD4 steadily binds to the Prdm16 promoter and switches regulatory functions depending on the co-factors. Together with PRDM16, SMAD4 represses gene expression, while with SMAD3, in response to high levels of TGF-β1, it activates gene expression. These differences could be due to different signaling pathways (BMP versus TGF-β), contexts (NSCs versus pancreatic cancers), etc.

      Page 3. Line 65: typo 'since'

      We appreciate the reviewer’s careful reading. We have now corrected the typo and revised the text to improve clarity.

    1. eLife Assessment

      This important manuscript by Genzoni et al. reports the striking discovery of a regulatory role for trophic eggs in ant caste determination. Prior to this study, trophic eggs were widely assumed to play only a nutritional role in the colony, but this compelling study shows that trophic eggs can suppress queen development, and therefore regulate caste determination in specific social contexts.

    2. Reviewer #1 (Public Review):

      This manuscript describes a series of experiments documenting trophic egg production in a species of harvester ant, Pogonomyrmex rugosus. In brief, queens are the primary trophic egg producers, there is seasonality and periodicity to trophic egg production, trophic eggs differ in many basic dimensions and contents relative to reproductive eggs, and diets supplemented with trophic eggs had an effect on the queen/worker ratio produced (increasing worker production).

      The manuscript is very well prepared and the methods are sufficient. The outcomes are interesting and help fill gaps in knowledge, both on ants as well as insects, more generally.

    3. Reviewer #2 (Public review):

      The revised manuscript by Genzoni et al. reports the striking discovery of a regulatory role for trophic eggs. Prior to this study, trophic eggs were widely assumed to play a nutritional role in the colony, but this study shows that trophic eggs can suppress queen development, and therefore, can play a role in regulating caste determination in specific social contexts. In this revised version of the manuscript, the authors have addressed many of the concerns raised in the first version regarding the lack of sufficient information and context in the Introduction and Discussion.

    4. Author response:

      The following is the authors’ response to the previous reviews

      Public Reviews:

      Reviewer #1 (Public Review):

      This manuscript describes a series of experiments documenting trophic egg production in a species of harvester ant, Pogonomyrmex rugosus. In brief, queens are the primary trophic egg producers, there is seasonality and periodicity to trophic egg production, trophic eggs differ in many basic dimensions and contents relative to reproductive eggs, and diets supplemented with trophic eggs had an effect on the queen/worker ratio produced (increasing worker production).

      The manuscript is very well prepared and the methods are sufficient. The outcomes are interesting and help fill gaps in knowledge, both on ants as well as insects, more generally. More context could enrich the study and flow could be improved.

      We thank the reviewer for these comments. We agree that the paper would benefit from more context. We have therefore greatly extended the introduction.

      Reviewer #2 (Public Review):

      The manuscript by Genzoni et al. provides evidence that trophic eggs laid by the queen in the ant Pogonomyrmex rugosus have an inhibitory effect on queen development. The authors also compare a number of features of trophic eggs, including protein, DNA, RNA, and miRNA content, to reproductive eggs. To support their argument that trophic eggs have an inhibitory effect on queen development, the authors show that trophic eggs have a lower content of protein, triglycerides, glycogen, and glucose than reproductive eggs, and that their miRNA distributions are different relative to reproductive eggs. Although the finding of an inhibitory influence of trophic eggs on queen development is indeed arresting, the egg cross-fostering experiment that supports this finding can be effectively boiled down to a single figure (Figure 6). The rest of the data are supplementary and correlative in nature (and can be combined), especially the miRNA differences shown between trophic and reproductive eggs. This means that the authors have not yet identified the mechanism through which the inhibitory effect on queen development is occurring. To this reviewer, this finding is more appropriate as a short report and not a research article. A full research article would be warranted if the authors had identified the mechanism underlying the inhibitory effect on queen development. Furthermore, the article is written poorly and lacks much background information necessary for the general reader to properly evaluate the robustness of the conclusions and to appreciate the significance of the findings.

      We thank the reviewer for these comments. We agree that the paper would benefit by having more background information and more discussion. We have followed this advice in the revision.

      Reviewer #3 (Public Review):

      In "Trophic eggs affect caste determination in the ant Pogonomyrmex rugosus" Genzoni et al. probe a fundamental question in sociobiology, what are the molecular and developmental processes governing caste determination? In many social insect lineages, caste determination is a major ontogenetic milestone that establishes the discrete queen and worker life histories that make up the fundamental units of their colonies. Over the last century, mechanisms of caste determination, particularly regulators of caste during development, have remained relatively elusive. Here, Genzoni et al. discovered an unexpected role for trophic eggs in suppressing queen development - where bi-potential larvae fed trophic eggs become significantly more likely to develop into workers instead of gynes (new queens). These results are unexpected, and potentially paradigm-shifting, given that previously trophic eggs have been hypothesized to evolve to act as an additional intracolony resource for colonies in potentially competitive environments or during specific times in colony ontogeny (colony foundation), where additional food sources independent of foraging would be beneficial. While the evidence and methods used are compelling (e.g., the sequence of reproductive vs. trophic egg deposition by single queens, which highlights that the production of trophic eggs is tightly regulated), the connective tissue linking many experiments is missing and the downstream mechanism is speculative (e.g., whether miRNA, proteins, triglycerides, glycogen levels in trophic eggs is what suppresses queen development). Overall, this research elevates the importance of trophic eggs in regulating queen and worker development but how this is achieved remains unknown.

      We thank the reviewer for these comments and agree that future work should focus on identifying the substances in trophic eggs that are responsible for caste determination.  

      Recommendations For The Authors:

      Reviewer #1 (Recommendations For The Authors):

      Introduction:

      The context for this study is insufficiently developed in the introduction - it would be nice to have a more detailed survey of what is known about trophic eggs in insects, especially social insects. The end of the introduction nicely sets up the hypothesis through the prior work described by Helms Cahan et al. (2011) where they found JH supplementation increased trophic egg production and also increased worker size. I think that the introduction could give more context about egg production in Pogonomyrmex and other ants, including what is known about worker reproduction. For example, Suni et al. 2007 and Smith et al. 2007 both describe the absence of male production by workers in two different harvester ants. Workers tend to have underdeveloped ovaries when in the presence of the queen. Other species of ants are known to have worker reproduction seemingly for the purpose of nutrition (see Heinze and Hölldobler 1995 and subsequent studies on Crematogaster smithi). Because some ants, including Pogonomyrmex, lack trophallaxis, it has been hypothesized that they distribute nutrients throughout the nest via trophic eggs as is seen in at least one other ant (Gobin and Ito 2000). Interestingly, Smith and Suarez (2009) speculated that the difference in nutrition of developing sexual versus worker larvae (as seen in their pupal stable isotope values) was due to trophic egg provisioning - they predicted the opposite of what was found in this study, but their prediction was in line with that of Helms Cahan et al. (2011). This is all to say that there is a lot of context that could go into developing the ideas tested in this paper that is completely overlooked. The inclusion of more of what is known already would greatly enrich the introduction.

      We agree that it would be useful to provide a larger context for the study. We now provide more information on the life history of ants and explain under what situations queens and workers may produce trophic eggs. We also mention that some ants such as Crematogaster smithi have a special caste of “large workers”, which are morphologically intermediate between winged queens and small workers and appear to be specialized in the production of unfertilized eggs. We now also mention the study of Gobin and Ito (2000), in which the authors show that trophic eggs may play an important role in food distribution within the colony, in particular in species where trophallaxis is rare or absent.

      Methods:

      L49: What lineage is represented in the colonies used? The collection location is near where both dependent-lineage (genetic caste determining) P. rugosus and "H" lineage exist. This is important to know. Further, depending on what these are, the authors should note whether this has relevance to the study. Not mentioning genetic caste determination in a paper that examines caste determination is problematic.

      This is a good point. We now state at the very beginning of the Material and Method section that the queens were collected in populations known not to have dependent-lineage (genetic caste determining) mechanisms of caste determination.

      L63 and throughout: It would be more efficient to have a paragraph that cites R (must be done) and RStudio once as the tool for all analyses. It also seems that most model construction and testing was done using lme4 - so just lay this out once instead of over and over.

      We agree and have updated the manuscript accordingly.

      L95: 'lenght' needs to be 'length' in the formula.

      Thanks, corrected.

      L151: A PCA was used but not described in the methods. This should be covered here. And while a Mantel test is used, I might consider a permANOVA as this more intuitively (for me, at least) goes along with the PCA.

      We added the PCA description in the Material and Method section.

      Results:

      I love Fig. 3! Super cool.

      Thanks for this positive comment.

      Discussion:

      It would be good to have more on egg cannibalism. This is reasonably well-studied and could be good extra context.

      We have added a paragraph in the discussion to mention that egg cannibalism is ubiquitous in ants.

      Supp Table 1: P. badius is missing and citations are incorrectly attributed to P. barbatus.

      P. badius was present in the Table but not with the other Pogonomyrmex species. For some genera the species were also not listed in alphabetic order. This has been corrected.

      Reviewer #2 (Recommendations For The Authors):

      Comments on introduction:

      The introduction is missing information about caste determination in ants generally and Pogonomyrmex rugosus specifically. This is important because some colonies of Pogonomyrmex rugosus have been shown to undergo genetic caste determination, in which case the main result would be rendered insignificant. What is the evidence that caste determination in the lineages/colonies used is largely environmentally influenced and in what contexts/environmental factors? All of this should be made clear.

      This is a good point. We have expanded the introduction to discuss previous work on caste determination in Pogonomyrmex species with environmental caste determination and now also provide evidence at the beginning of the Material and Method section that the two populations studied do not have a system of genetic caste determination.

      Line 32 and throughout the paper: What is meant exactly by 'reproductive eggs'? Are these eggs that develop specifically into reproductives (i.e., queens/males) or all eggs that are non-trophic? If the latter, then it is best to refer to these eggs as 'viable' in order to prevent confusion.

      We agree and have updated the manuscript accordingly.

      Figure 1/Supp Table 1: It is surprising how few species are known to lay trophic eggs. Do the authors think this is an informative representation of the distribution of trophic egg production across subfamilies, or due to lack of study? Furthermore, the branches show ant subfamilies, not families. What does the question mark indicate? Also, the information in the table next to the phylogeny is not easy to understand. Having in the branches that information, in categories, shown in color for example, could be better and more informative. Finally, having the 'none' column with only one entry is confusing - discuss that only one species has been shown to definitely not lay trophic eggs in the text, but it does not add much to the figure.

      Trophic eggs are probably very common in ants, but this has not been very well studied. We added a sentence in the manuscript to make this clear.

      Thanks for noticing the family/subfamily error. This has been corrected in Figure 1 and Supplementary Table 1.

      The question mark indicates uncertainty about whether queens also contribute to the production of trophic eggs in one species (Lasius niger). We have now added information on that in the Figure legend.

      We agree with the reviewer that it would be easier to have the information on whether queens and workers produce trophic eggs on the branches of the tree. However, having the information on the branches would suggest that the “trait” evolved on this part of the tree. As we do not know exactly when worker or queen production of trophic eggs evolved, we prefer to keep the figure as it is.

      Finally, we have also removed the 'none' column from the figure as suggested by the reviewer, and we discuss in the manuscript the fact that the absence of trophic eggs has been reported in only one ant species (Amblyopone silvestrii: Masuko 2003).

      Comments on materials and methods:

      Why did they settle on three trophic eggs per larva for their experimental setup?

      We used three trophic eggs because under natural conditions 50-65% of the eggs are trophic. The ratio of trophic eggs to viable eggs (larvae) was thus similar to that under natural conditions.

      Line 50: In what kind of setup were the ants kept? Plaster nests? Plastic boxes? Tubes? Was the setup dry or moist? I think this information is important to know in the context of trophic eggs.

      We now explain that colonies were maintained in plastic boxes with water tubes.

      Line 60: Were all the 43 queens isolated only once, or multiple times?

      Each of the 43 queens was isolated for 8 hours every day for 2 weeks, once before and once after hibernation (so they were isolated multiple times). We have changed the text to make clear that this was done for each of the 43 queens.

      Could isolating the queen away from workers/brood have had an effect on the type of eggs laid?

      This cannot be completely ruled out. However, the proportion of viable and trophic eggs can be reliably determined only by isolating queens. Importantly, the main aim of these experiments was not to precisely determine the proportion of viable and trophic eggs, but to show that this proportion differs before and after hibernation and that queens do not lay viable and trophic eggs in a random sequence.

      Since it was established that only queens lay trophic eggs why was the isolation necessary?

      Yes, this was necessary because eggs are fragile and very difficult to collect in colonies with workers (as soon as eggs are laid they are piled up, and as soon as we disturb the nest, a worker takes them all and runs away with them). Moreover, it is possible that workers preferentially eat one type of egg, which would have required removing eggs as soon as queens laid them. This would have been a huge disturbance for the colonies.

      Line 61: Is this hibernation natural or lab induced? What is the purpose of it? How long was the hibernation and at what temperature? Where are the references for the requirement of a diapause and its length?

      The hibernation was lab induced. We hibernated the queens because we previously showed that hibernation is important to trigger the production of gynes in P. rugosus colonies in the laboratory (Schwander et al 2008; Libbrecht et al 2013). Hibernation conditions were as described in Libbrecht et al (2013).  

      Line 73: If the queen is disturbed several times for three weeks, which effect does it have on its egg-laying rate and on the eggs laid? Were the eggs equally distributed in time in the recipient colonies with and without trophic eggs to avoid possible effects?

      It is difficult to say what effect the disturbance had on the number and type of eggs laid. But again, our aim was not to determine these values precisely but to determine whether there was an effect of hibernation on the proportion of trophic eggs. The recipient colonies with and without trophic eggs were formed in exactly the same way. No viable eggs were introduced in these colonies, but all first instar larvae were introduced in the same way, at the same time, and with random assignment. We have clarified this in the Materials and Methods section.

      Line 77: Before placing the freshly hatched larvae in recipient colonies, how long were the recipient colonies kept without eggs and how long were they fed before giving the eggs? Were they kept long enough without the queen to avoid possible effects of trophic eggs, or too long so that their behavior changed?

      The recipient colonies were created 7 to 10 days before receiving the first larvae and were fed ad libitum with grass seeds, flies and honey water from the beginning. Trophic eggs that would have been left over from the source colony should have been eaten within the first few days after creating the recipient colonies. However, even if some trophic eggs would have remained, this would not influence our conclusion that trophic eggs influence caste fate, given the fully randomized nature of our treatments and the considerable number of independent replicates. The same applies to potential changes in worker behavior following their isolation from the queen.

      Line 77: Is it known at what stage caste determination occurs in this species? Here first instar larvae were given trophic eggs or not. Does caste-determination occur at the first instar stage? If not, what effect could providing trophic eggs at other stages have on caste-determination?

      A previous study showed that there is a maternal effect on caste determination in the focal species (Schwander et al 2008). The mechanism underlying this maternal effect was hypothesized to be differential maternal provisioning of viable eggs. However, as we detail in the discussion, the new data presented in our study suggest that the mechanism is in fact a different abundance of trophic eggs laid by queens. There is currently no information on when exactly caste determination occurs during development.

      Comments on results:

      Line 65: How does investigating the order of eggs laid help to "inform on the mechanisms of oogenesis"?

      We agree that the aim was not to study the mechanism of oogenesis. We have changed this sentence accordingly: “To assess whether viable and trophic eggs were laid in a random order, or whether eggs of a given type were laid in clusters, we isolated 11 queens for 10 hours, eight times over three weeks, and collected every hour the eggs laid”

      Figure 2: There is no description/discussion of data shown in panels B, C, E, and F in the main text.

      We have added information in the main text that while viable eggs showed embryonic development at 25 and 65 hours (Fig. 2 B, C), there was no such development for trophic eggs (Fig. 2 E, F).

      Line 172: Please explain hibernation details and its significance on colony development/life cycle.

      We have added this information in the Materials and Methods section.

      Figure 6: How is B plotted? How could 0% of gynes have 100% survival?

      The survival is given for the larvae without considering caste. We have changed the X axis of panel B and reworded the Figure legend to clarify this.

      Is reduced DNA content just an outcome of reduced cell number within trophic eggs, i.e., was this a difference in cell type or cell number? Or is it some other adaptive reason?

      It is likely to be due to a reduction in cell number (trophic eggs have maternal DNA in the chorion, while viable eggs have in addition the cells from the developing zygote) but we do not have data to make this point.

      Is there a logical sequence to the sequence of egg production? The authors showed that the sequence is non-random, but can they identify in what way? What would the biological significance be?

      We could not identify a logical sequence. Plausibly, the production of the two types of eggs implies some changes in the metabolic processes during egg production resulting in queens producing batches of either viable or trophic eggs. This would be an interesting question to study, but this is beyond the scope of this paper.

      Figure 6b is difficult to follow, and more generally, legends for all figures can be made clearer and more easy to follow.

      We agree. We have now improved the legends of Fig 6B and the other figures.

      Lines 172-174: "The percentage of eggs that were trophic was higher before hibernation...than after. This higher percentage was due to a reduced number of reproductive eggs, the number of trophic eggs laid remained stable" - are these data shown? It would be nice to see how the total egglaying rate changes after hibernation. Also, is the proportion of trophic eggs laid similar between individual queens?

      No, the data were not shown, and we do not have good enough data to make this point. We have therefore removed the sentence “This higher percentage was due to a reduced number of reproductive eggs, the number of trophic eggs laid remained stable” from the manuscript.

      Figure 6B: Do several colonies produce 100% gynes despite receiving trophic eggs? It would be interesting if the authors discussed why this might occur (e.g., the larvae are already fully determined to be queens and not responsive to whatever signal is in the trophic eggs).

      The reviewer is correct that 4 colonies produced 100% gynes despite receiving trophic eggs. However, the number of individuals produced in these four colonies was small (2,1,2,1, see supplementary Table 2). So, it is likely that it is just by chance that these colonies produced only gynes.

      Figure 5: Why a separation by "size distribution variation of miRNA"? What is the relevance of looking at size distributions as opposed to levels?

      We did that because there are many different miRNA species, reflected by the fact that there is not just one size peak but several. This is why we looked at the size distribution.

      Figure 2: The image of the viable embryo is not clear. If possible, redo the viable to show better quality images.

      Unfortunately, we no longer have colonies in the laboratory, so this is not possible.

      Comments on discussion:

      Lines 236-247: Can an explanation be provided as to why the effect of trophic eggs in P. rugosus is the opposite of those observed by studies referenced in this section? Could P. rugosus have any life history traits that might explain this observation?

      In the two mentioned studies there were other factors that co-varied with variation in the quantity of trophic eggs. We mentioned that and suggested that it would be useful to conduct experimental manipulation of the quantity of trophic eggs in the Argentine ant and P. barbatus (the two species where an effect of trophic eggs had been suggested).

      The discussion should include implications and future research of the discovery.

      We made some suggestions for experiments that should be performed in the future.

      The conclusion paragraph is too short and does not represent what was discussed.

      We added two sentences at the end of the paragraph to make suggestions of future studies that could be performed.

      Lines 231 to 247: Drastically reduce and move this whole part to the introduction to substantiate the assumption that trophic eggs play a nutritional role.

      We moved most of this paragraph to the introduction, as suggested by the reviewer.

      Reviewer #3 (Recommendations For The Authors):

      I would like to commend the authors on their study. The main findings of the paper are individually solid and provide novel insight into caste determination and the nature of trophic eggs. However, the inferences made from much of the data and connections between independent lines of evidence often extend too far and are unsubstantiated.

      We thank the reviewer for the positive comment. We made many changes in the manuscript to improve the discussion of our results.

    1. eLife Assessment

      This study reports useful information on the mechanisms by which a high-fat diet induces arrhythmias in the model organism Drosophila. Specifically, the authors propose that adipokinetic hormone (Akh) secretion is increased with this diet, and through binding of Akh to its receptor on cardiac neurons, arrhythmia is induced. The authors have revised their manuscript but the evidence remains incomplete. Nonetheless, the data presented will be helpful to those who wish to extend the research to a more complex model system, such as the mouse.

    2. Reviewer #1 (Public review):

      Summary:

      In the manuscript submission by Zhao et al. entitled, "Cardiac neurons expressing a glucagon-like receptor mediate cardiac arrhythmia induced by high-fat diet in Drosophila" the authors assert that cardiac arrhythmias in Drosophila on a high fat diet are due in part to adipokinetic hormone (Akh) signaling activation. High fat diet induces Akh secretion from activated endocrine neurons, which activate AkhR in posterior cardiac neurons. Silencing or deletion of Akh or AkhR blocks arrhythmia in Drosophila on high fat diet. Elimination of one of two AkhR expressing cardiac neurons results in arrhythmia similar to high fat diet.

      Strengths:

      The authors propose a novel mechanism for high fat diet induced arrhythmia utilizing the Akh signaling pathway that signals to cardiac neurons.

      Comments on revisions:

      The authors have addressed my other concerns. The only outstanding issue is in regard to the following comment:

      The authors state that "HFD led to increased heartbeat and an irregular rhythm." In representative examples shown, HFD resulted in pauses, slower heart rate, and increased irregularity in rhythm but not consistently increased heart rate (Figures 1B, 3A, and 4C). Based on the cited work by Ocorr et al (https://doi.org/10.1073/pnas.0609278104), Drosophila heart rate is highly variable with periods of fast and slow rates, which the authors attributed to neuronal and hormonal inputs. Ocorr et al then describe the use of "semi-intact" flies to remove autonomic input to normalize heart rate. Were semi-intact flies used? If not, how was heart rate variability controlled? And how was heart rate "increase" quantified in high fat diet compared to normal fat diet? Lastly, how does one measure "arrhythmia" when there is so much heart rate variability in normal intact flies?

      - The authors state that 8 sec time windows were selected at the discretion of the imager for analysis. I don't know how to avoid bias unless the person acquiring the imaging is blinded to the condition and the analysis is also done blind. Can you comment whether data acquisition and analysis was done in a blinded fashion? If not, this should be stated as a limitation of the study.

    3. Reviewer #3 (Public review):

      Zhao et al. provide new insights into the mechanism by which a high-fat diet (HFD) induces cardiac arrhythmia employing Drosophila as a model. HFD induces cardiac arrhythmia in both mammals and Drosophila. Both glucagon and its functional equivalent in Drosophila Akh are known to induce arrhythmia. The study demonstrates that Akh mRNA levels are increased by HFD and both Akh and its receptor are necessary for high-fat diet-induced cardiac arrhythmia, elucidating a novel link. Notably, Zhao et al. identify a pair of AKH receptor-expressing neurons located at the posterior of the heart tube. Interestingly, these neurons innervate the heart muscle and form synaptic connections, implying their roles in controlling the heart muscle. The study presented by Zhao et al. is intriguing, and the rigorous characterization of the AKH receptor-expressing neurons would significantly enhance our understanding of the molecular mechanism underlying HFD-induced cardiac arrhythmia.

      Many experiments presented in the manuscript are appropriate for supporting the conclusions while additional controls and precise quantifications should help strengthen the authors' arguments. The key results obtained by loss of Akh (or AkhR) and genetic elimination of the identified AkhR-expressing cardiac neurons do not reconcile, complicating the overall interpretation.

      The most exciting result is the identification of AkhR-expressing neurons located at the posterior part of the heart tube (ACNs). The authors attempted to determine the function of ACNs by expressing rpr with AkhR-GAL4, which would induce cell death in all AkhR-expressing cells, including ACNs. The experiments presented in Figure 6 are not straightforward to interpret. Moreover, the conclusion contradicts the main hypothesis that elevated Akh is the basis of HFD-induced arrhythmia. The results suggest the importance of AkhR-expressing cells for normal heartbeat. However, elimination of Akh or AkhR restores normal rhythm in HFD-fed animals, suggesting that Akh and AkhR are not important for maintaining normal rhythms. If Akh signaling in ACNs is key for HFD-induced arrhythmia, genetic elimination of ACNs should not alter normal rhythm and should rescue the HFD-induced arrhythmia. An important caveat is that the experiments do not test the specific role of ACNs. ACNs should be just a small part of the cells expressing AkhR. Specific manipulation of ACNs will significantly improve the study. Moreover, the main hypothesis suggests that HFD may alter the activity of ACNs in a manner dependent on Akh and AkhR. Testing how HFD changes calcium, possibly by CaLexA (Figure 2) and/or GCaMP, in wild-type and AkhR mutant could be a way to connect ACNs to HFD-induced arrhythmia. Moreover, optogenetic manipulation of ACNs may allow for specific manipulation of ACNs.

      Interestingly, expressing rpr with AkhR-GAL4 was insufficient to eliminate both ACNs. It is not clear why it didn't eliminate both ACNs. Given the incomplete penetrance, appropriate quantifications should be helpful. Additionally, the impact on other AkhR-expressing cells should be assessed. Adding more copies of UAS-rpr, AkhR-GAL4, or both may eliminate all ACNs and other AkhR-expressing cells. The authors could also try UAS-hid instead of UAS-rpr.

    4. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In the manuscript submission by Zhao et al. entitled, "Cardiac neurons expressing a glucagon-like receptor mediate cardiac arrhythmia induced by high-fat diet in Drosophila" the authors assert that cardiac arrhythmias in Drosophila on a high-fat diet are due in part to adipokinetic hormone (Akh) signaling activation. High-fat diet induces Akh secretion from activated endocrine neurons, which activate AkhR in posterior cardiac neurons. Silencing or deletion of Akh or AkhR blocks arrhythmia in Drosophila on a high-fat diet. Elimination of one of two AkhR-expressing cardiac neurons results in arrhythmia similar to a high-fat diet.

      Strengths:

      The authors propose a novel mechanism for high-fat diet-induced arrhythmia utilizing the Akh signaling pathway that signals to cardiac neurons.

      Weaknesses:

      Major comments:

      (1) The authors state, "Arrhythmic pathology is rooted in the cardiac conduction system." This assertion is incorrect as a blanket statement on arrhythmias. There are certain arrhythmias that have been attributed to the conduction system, such as bradycardic rhythms, heart block, sinus node reentry, inappropriate sinus tachycardia, AV nodal reentrant tachycardia, bundle branch reentry, fascicular ventricular tachycardia, or idiopathic ventricular fibrillation, to name a few. However, the etiological mechanisms of many atrial and ventricular arrhythmias, such as atrial fibrillation or substrate-based ventricular tachycardia, are not rooted in the conduction system. The introduction should be revised to reflect a clear focus (away from?) on atrial fibrillation (AF). In addition, AF susceptibility is known to be modulated by autonomic tone, which is topically relevant (irrelevant?) to this manuscript.

      Thank you for the helpful comment. We rephrased the sentence as “Arrhythmic pathology is often rooted in the cardiac conduction system”.

      (2) The authors state that "HFD led to increased heartbeat and an irregular rhythm." In representative examples shown, HFD resulted in pauses, slower heart rate, and increased irregularity in rhythm but not consistently increased heart rate (Figures 1B, 3A, and 4C). Based on the cited work by Ocorr et al (https://doi.org/10.1073/pnas.0609278104), Drosophila heart rate is highly variable with periods of fast and slow rates, which the authors attributed to neuronal and hormonal inputs. Ocorr et al then describe the use of "semi-intact" flies to remove autonomic input to normalize heart rate. Were semi-intact flies used? If not, how was heart rate variability controlled? And how was heart rate "increase" quantified in high-fat diet compared to normal-fat diet? Lastly, how does one measure "arrhythmia" when there is so much heart rate variability in normal intact flies?

      We also observed that fly heart rate is highly variable with periods of fast and slow rates. To control heart rate variability, Ocorr et al. used semi-intact flies to record the heartbeat (https://doi.org/10.1073/pnas.0609278104). We consider it a rigorous method to get highly consistent results with high-quality videos/images. Since our work focuses on the neuronal inputs to the heart, we did not use the semi-intact method. Our concern is that the dissection is likely to disrupt the neuronal processes. Using OCT, we recorded the heartbeat of intact flies in an 8 s time window, when the heartbeat was relatively stable. The different groups of flies, which were fed on a high-fat diet or a normal-fat diet, were recorded using the same method. Thus, we could compare the differences in heart rate.

      (3) The authors state, "to test whether the HFD-induced increase in Akh in the APC affects APC neuron activity, we used CaLexA (https://doi.org/10.3109/01677063.2011.642910)." According to the reference, CaLexA is a tool to map active neurons and would not indicate, as the authors state, whether Akh affects APC neuron activity specifically. It is equally possible that APC neurons may be activated by HFD and produce more Akh. Please clarify this language.

      Thank you for clarifying the calcium reporter, CaLexA. We rephrased this sentence to “to test whether HFD affects APC neuron activity, we used CaLexA”.

      (4) Are the AkhR+ neurons parasympathetic or sympathetic? Please provide additional experimentation that characterizes these neurons. The AkhR+ neurons appear to be anti-arrhythmic. Please expand the discussion to include a working hypothesis of the overall findings on Akh, AkhR, and AkhR+ neurons.

      Noyes et al. showed that Akh treatment increases heartbeat (Noyes, B. E., F. N. Katz, and M. H. Schaffer. 1995. “Identification and Expression of the Drosophila Adipokinetic Hormone Gene.” Molecular and Cellular Endocrinology 109 (2): 133–41.), suggesting that AkhR+ neurons are sympathetic. We showed that a high-fat diet induced Akh expression and secretion, which led to stimulation of AkhR+ neurons and increased heart rate, supporting the sympathetic role of the AkhR+ neurons. Additional explanation of the sympathetic and anti-arrhythmic roles of Akh, AkhR, and AkhR+ neurons was added to the discussion.

      (5) The authors state, "Heart function is dependent on glucose as an energy source." However, the heart's main energy source is fatty acids with minimal use of glucose (doi: 10.1016/j.cbpa.2006.09.014). Glucose becomes more utilized by cardiomyocytes under heart failure conditions. Please amend/revise this statement.

      Thank you for pointing this out and providing the reference. We rephrased this sentence as: “Heart function is dependent on continuous ATP production. Cardiac ATP in Drosophila might come from fatty acids, glucose, and lactate (Kodde et al., 2007), as well as trehalose.”

      Reviewer #2 (Public Review):

      This manuscript explores mechanisms underlying heart contractility problems in metabolic disease using Drosophila as a model. They confirm, as others have demonstrated, that a high-fat diet (HFD) induces cardiac problems in flies. They showed that a high-fat diet increased Akh mRNA levels and calcium levels in the Akh-producing cells (APC), suggesting there is increased production and release of this hormone in a HFD context. When they knock down Akh production in the APCs using RNAi they see that cardiac contractility problems are abolished. They similarly show that levels of the Akh receptor (Akhr) are increased on a HFD and that loss of Akhr also rescues contractility problems on a HFD.

      One highlight of the paper was the identification of a pair of neurons that express a receptor for the metabolic hormone Akh, and showing initial data that these neurons innervate the cardiac muscle. They then overexpress cell death gene reaper (rpr) in all Akhr-positive cells with Akhr-GAL4 and see that cardiac contractility becomes abnormal.

      However, this paper contains several findings that have been reported elsewhere and it contains key flaws in both experimental design and data interpretation. There is some rationale for doing the experiments, and the data and images are of good quality. However, others have shown that HFD induces cardiac contractility problems (Birse 2010), that Akh mRNA levels are changed with HFD (Liao 2021) that Akh modulates cardiac rhythms (Noyes 1995), so Figures 1-4 are largely a confirmation of what is already known. This limits the overall magnitude of the advances presented in these figures. Overall, the stated concerns limit the impact of the manuscript in advancing our understanding of heart contractility.

      We thank the reviewer for the positive comments and the instructive suggestions. Birse 2010 (PMID: 21035763) was cited in our manuscript. Liao 2021 showed that Akh mRNA levels are changed with HFD. We added the reference to the revised manuscript and modified the text as follows: “In consistent with a previous work (Liao et al., 2020), we showed that the expression of Akh was significantly up-regulated in the flies fed a HFD, compared to NFD-fed flies (Figure 2B)”. Our qPCR confirmed Liao’s results. In addition, we investigated calcium levels in the Akh-producing cells (APCs) and showed elevated calcium levels in the APCs of HFD-fed flies. In the revised version, we added more data to show that Akh protein levels were increased with HFD (Figure 2E-F). In line with Noyes' discovery, which showed that Akh injection caused cardioacceleration in prepupae, we showed that genetic manipulation of Akh expression affected heartbeat in adults.

      Reviewer #3 (Public Review):

      Zhao et al. provide new insights into the mechanism by which a high-fat diet (HFD) induces cardiac arrhythmia employing Drosophila as a model. HFD induces cardiac arrhythmia in both mammals and Drosophila. Both glucagon and its functional equivalent in Drosophila Akh are known to induce arrhythmia. The study demonstrates that Akh mRNA levels are increased by HFD and both Akh and its receptor are necessary for high-fat diet-induced cardiac arrhythmia, elucidating a novel link. Notably, Zhao et al. identify a pair of AKH receptor-expressing neurons located at the posterior of the heart tube. Interestingly, these neurons innervate the heart muscle and form synaptic connections, implying their roles in controlling the heart muscle. The study presented by Zhao et al. is intriguing, and the rigorous characterization of the AKH receptor-expressing neurons would significantly enhance our understanding of the molecular mechanism underlying HFD-induced cardiac arrhythmia.

      Many experiments presented in the manuscript are appropriate for supporting the conclusions while additional controls and precise quantifications should help strengthen the authors' arguments. The key results obtained by loss of Akh (or AkhR) and genetic elimination of the identified AkhR-expressing cardiac neurons do not reconcile, complicating the overall interpretation.

      It is intriguing to see an increase in Akh mRNA levels in HFD-fed animals. This is a key result for linking HFD-induced arrhythmia to Akh. Thus, demonstrating that HFD also increases the Akh protein levels and Akh is secreted more should significantly strengthen the manuscript.

      Thank you for the positive comments and the instructive suggestions. We performed immunostaining to show that Akh protein levels increased, which is consistent with elevated Akh mRNA expression in HFD-fed flies. The data was added to Figure 2, panels E and F. Akh secretion from the APCs is regulated by APC activity (https://doi.org/10.1038/s41586-019-1675-4). We used a calcium reporter CaLexA (https://doi.org/10.3109/01677063.2011.642910) to monitor APC activity and showed that HFD increased APC activity (Figure 2, C-D).

      The experiments employing an AkhR null allele nicely demonstrate its requirement for HFD-induced cardiac arrhythmia. Depletion of Akh in Akh-expressing cells recapitulates the consequence of AkhR knockout, supporting that both Akh and its receptor are required for HFD-induced cardiac arrhythmia. Given that RNAi is associated with off-target effects and some RNAi reagents do not work, testing multiple independent RNAi lines is the standard procedure. It is also important to show the on-target effect of the RNAi reagents used in the study.

      Indeed, RNAi approaches can suffer from off-target effects. For Akh experiments, we used the RNAi line BL_34960, which was generated using artificial microRNAs (shRNA) (DOI: 10.1038/nmeth.1592). In comparison to long-hairpin constructs, shRNA constructs are expected to be advantageous, e.g., more efficient and with fewer off-target effects. We performed immunostaining to determine Akh-Gal4>UAS-Akh-RNAi efficiency. We showed that anti-Akh fluorescence diminished in Akh-Gal4>UAS-Akh-RNAi APCs. The data was added to Figure 3-figure supplement 1.

      The most exciting result is the identification of AkhR-expressing neurons located at the posterior part of the heart tube (ACNs). The authors attempted to determine the function of ACNs by expressing rpr with AkhR-GAL4, which would induce cell death in all AkhR-expressing cells, including ACNs. The experiments presented in Figure 6 are not straightforward to interpret. Moreover, the conclusion contradicts the main hypothesis that elevated Akh is the basis of HFD-induced arrhythmia. The results suggest the importance of AkhR-expressing cells for normal heartbeat. However, elimination of Akh or AkhR restores normal rhythm in HFD-fed animals, suggesting that Akh and AkhR are not important for maintaining normal rhythms. If Akh signaling in ACNs is key for HFD-induced arrhythmia, genetic elimination of ACNs should not alter normal rhythm and should rescue the HFD-induced arrhythmia. An important caveat is that the experiments do not test the specific role of ACNs. ACNs should be just a small part of the cells expressing AkhR. The experiments presented in Figure 6 cannot justify the authors' conclusion. Specific manipulation of ACNs will significantly improve the study. Moreover, the main hypothesis suggests that HFD may alter the activity of ACNs in a manner dependent on Akh and AkhR. Testing how HFD changes calcium, possibly by CaLexA (Figure 2) and/or GCaMP, in wild-type and AkhR mutants could be a way to connect ACNs to HFD-induced arrhythmia. Moreover, optogenetic manipulation of ACNs will allow for specific manipulation of ACNs, which is crucial for studying the specific role of ACNs in controlling cardiac rhythms.

      Thank you for the insightful comments. We have been trying to find a way to target only the AkhR neurons using split-Gal4. So far, this has not been successful. Akh/AkhR signaling should play a key role in the ACNs; however, we cannot rule out the possibility that ACNs also receive signals other than Akh in the modulation of heartbeat.

      Interestingly, expressing rpr with AkhR-GAL4 was insufficient to eliminate both ACNs. It is not clear why it didn't eliminate both ACNs. Given the incomplete penetrance, appropriate quantifications should be helpful. Additionally, the impact on other AkhR-expressing cells should be assessed. Adding more copies of UAS-rpr, AkhR-GAL4, or both may eliminate all ACNs and other AkhR-expressing cells. The authors could also try UAS-hid instead of UAS-rpr.

      We added more data to show that AkhR+ neurons are positive in anti-Akh staining, indicating the AkhR+ neurons indeed receive Akh.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Typo in line 765: "increased Akh section into the circulation." Section should be secretion.

      Thank you for finding the typo. We changed section to secretion.

      Reviewer #2 (Recommendations For The Authors):

      One interesting extension to our knowledge in Figures 3 & 4 is that loss of Akhr and loss of Akh both block the cardiac contractility defects that accompany a HFD. The main concern I have with the Akh finding is that the authors use only a GAL4 control and no UAS alone control. Metabolic phenotypes often show strain-specific effects, so to make conclusions it is essential that the authors include a UAS alone control alongside the other genotypes to be sure it does not rescue the cardiac contractility defects that accompany a HFD by itself.

      I am interested in the authors' identification of a pair of Akhr-positive neurons that innervate the cardiac muscle. I am not aware of any other studies identifying these neurons, or revealing their function. The contents of Figure 5 therefore represent the largest advance in the study. However, the characterization of these neurons is very superficial, and a lot more work to understand their regulation and function in a HFD context is needed to make conclusions about their role in any HFD-induced cardiac contractility problems. Or to determine how Akh influences the function of these specific neurons in an HFD context.

      The reason I say this is that the authors ablate all Akhr-positive cells in Figure 6 and show that this disturbs normal cardiac contractility. While studies on the one pair of Akhr-positive neurons would be really interesting, ablating all Akhr-positive cells, which includes the fat and many other cell types in the fly, is not a scientifically rigorous approach to answering this question. As a result, the authors are only able to make the claim that ablating many cell types throughout the animal disrupts cardiac contractility, which does not advance our understanding of mechanisms underlying heart contractility problems. In addition, because the experiments they designed did not test whether it was Akh binding to Akhr on those neurons that regulate cardiac contractility problems in a HFD context, their experiments do not support their model in Figure 7.

      The authors also make conclusions that are fairly speculative around Line 231 when describing their model in Figure 7. These claims are simply not supported by the data they present and must be removed. For example, the authors have not identified an endocrine-heart axis, they simply showed that changes in Akh can influence the heart, but this is not necessarily a direct effect on a specific cell type. They do not show data that Akh binds the newly identified Akhr-positive neuron pair to mediate the effects of HFD-induced contractility defects - they just ablate all Akhr-positive cells (fat, neurons, and other types) and show cardiac defects. If those neurons did mediate the abnormal cardiac rhythm promoted by Akh, then ablating those neurons (and not a large number of additional tissues) should rescue HFD-induced heart defects just like reducing Akhr or Akh did (but this is the opposite of what they see). Overall, concerns with experimental design, data interpretation, and relatively few findings that aren't reported elsewhere reduce the impact of this paper.

      We appreciate the positive comments and helpful suggestions. Indeed, it is important to get clean genetic access to the cardiac neurons. We intended to use the split-Gal4 system to target the AkhR cardiac neurons and tried to build a split-Gal4 driver, AkhR-p65.AD. Two rounds of injection were carried out. However, we did not recover a transgenic line.

      In the revised version, we performed immunostaining using Akh antibodies to show that anti-Akh fluorescence was observed in AkhR neurons (Figure 5-figure supplement 1), indicating an endocrine-heart axis.

    1. eLife Assessment

      This study provides fundamental information on how Arg-II participates in cardiac aging. The phenotypic data provide convincing evidence of non-cell-autonomous contributions to aging-related pathologies. Overall, the study highlights the importance of intercellular signaling in maintaining cardiac health during aging.

    2. Reviewer #1 (Public review):

      Summary:

      The manuscript by Duilio M. Potenza et al. explores the role of Arginase II in cardiac aging, majorly using whole-body arg-ii knock-out mice. In this work, the authors have found that Arg-II exerts non-cell-autonomous effects on aging cardiomyocytes, fibroblasts, and endothelial cells mediated by IL-1b from aging macrophages. The authors have used arg II KO mice and an in vitro culture system to study the role of Arg II. Authors have also reported the cell-autonomous effect of Arg-II through mitochondrial ROS in fibroblasts that contribute to cardiac aging. These findings are sufficiently novel in cardiac aging and provide interesting insights. While the phenotypic data seem strong, the mechanistic details are unclear. How Arg II regulates the IL-1b and modulates cardiac aging is still being determined.

      Strengths:

      This study provides interesting information on the role of Arg II in cardiac aging.

      The phenotypic data in the Arg II KO mice is convincing, and the authors have assessed most of the aging-related changes.

      The data is supported by an in vitro cell culture system.

      Weaknesses:

      The manuscript needs more mechanistic details on how Arg II regulates IL-1b and modulates cardiac aging.

    3. Reviewer #2 (Public review):

      This study investigates the role of arginase-II (Arg-II) in cardiac aging. The authors challenge previous assumptions by demonstrating that Arg-II is not expressed in aged cardiomyocytes, but is upregulated in non-myocyte cells, specifically macrophages, fibroblasts, and endothelial cells. Using Arg-II knockout mice, they show protection against age-associated cardiac inflammation, fibrosis, apoptosis, endothelial-to-mesenchymal transition (EndMT), and ischemic injury. Mechanistically, Arg-II promotes IL-1β release from macrophages and increases mitochondrial ROS in fibroblasts, contributing to cardiac aging through both cell-autonomous and non-cell-autonomous mechanisms.

      The study is well-structured and combines genetic models, molecular assays, and histological analyses to support its conclusions. Including both human and mouse samples strengthens the translational relevance of the findings. The authors have addressed most of the reviewers' comments and have made efforts to improve the manuscript by adding experimental data, explanations, and further discussion.

      The data convincingly support their conclusions. This work provides valuable insights into the mechanisms of cardiac aging, aligns with growing evidence of non-cell-autonomous contributions to aging-related pathologies, and highlights the importance of intercellular signaling in maintaining cardiac health during aging.

      Although the use of cell-specific knockout mouse models would enhance the depth and translational potential of the findings, it is understandable that such an approach would be beyond the scope of a single study. This work lays the groundwork for future investigations into conditional Arg-II knockouts in specific cell types to elucidate the cell-specific roles of Arg-II in cardiac aging.

      Overall, this is a solid and impactful study with strong experimental support.

    4. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The manuscript by Duilio M. Potenza et al. explores the role of Arginase II in cardiac aging, majorly using whole-body arg-ii knock-out mice. In this work, the authors have found that Arg-II exerts non-cell-autonomous effects on aging cardiomyocytes, fibroblasts, and endothelial cells mediated by IL-1b from aging macrophages. The authors have used arg II KO mice and an in vitro culture system to study the role of Arg II. The authors have also reported the cell-autonomous effect of Arg-II through mitochondrial ROS in fibroblasts that contribute to cardiac aging. These findings are sufficiently novel in cardiac aging and provide interesting insights. While the phenotypic data seems strong, the mechanistic details are unclear. How Arg II regulates the IL-1b and modulates cardiac aging is still being determined. The authors still need to determine whether Arg II in fibroblasts and endothelial contributes to cardiac fibrosis and cell death. This study also lacks a comprehensive understanding of the pathways modulated by Arg II to regulate cardiac aging.

      We sincerely appreciate the valuable feedback provided by the reviewer. It's gratifying to hear that our work provided novel information on the role of arginase-II in cardiac aging, which is a complex process involving various cell types and mechanisms. We have devoted considerable effort to performing new experiments to address the reviewer's comments and to delineate more detailed mechanisms of Arg-II in cardiac aging. Please see below our specific answers to each of the reviewers' points.

      Strengths:

      This study provides interesting information on the role of Arg II in cardiac aging.

      The phenotypic data in the arg II KO mice is convincing, and the authors have assessed most of the aging-related changes.

      The data is supported by an in vitro cell culture system.

      We appreciate this reviewer’s positive assessment on the strength of our study.

      Weaknesses:

      The manuscript needs more mechanistic details on how Arg II regulates IL-1b and modulates cardiac aging.

      We made a great effort and performed new experiments in a human monocyte cell line (THP1), in which iNOS is not expressed and not inducible by LPS, and in which the arg-ii gene was knocked out by CRISPR technology. Moreover, murine bone-marrow-derived macrophages in which the inos gene was ablated were also used for this purpose. We found that in the human THP1 monocytes, in which Arg-II but not iNOS is induced by LPS (100 ng/mL for 24 hours) (Suppl. Fig. 6A), mRNA and protein levels of IL-1b precursor are markedly reduced in arg-ii knockout THP1<sup>arg-ii<sup>-/-</sup></sup> as compared to the THP1<sup>wt</sup> cells (Suppl. Fig. 6B and 6C), further confirming that Arg-II promotes IL-1b production as also shown in RAW264.7 macrophages (Suppl. Fig. 5A and 5C). Moreover, in the mouse bone-marrow-derived macrophages, LPS-induced IL-1b production is inhibited by inos deficiency (BMDM<sup>inos-/-</sup> vs BMDM<sup>wt</sup>) (Suppl. Fig. 6D and 6E), while Arg-II levels are slightly enhanced in the BMDM<sup>inos-/-</sup> cells (Suppl. Fig. 6D and 6F). Altogether, these results suggest that iNOS slightly reduces Arg-II expression. Arg-II and iNOS can be upregulated by LPS independently. Both Arg-II and iNOS are required for IL-1b production upon LPS stimulation, as illustrated in Suppl. Fig. 6G. For detailed results and discussion, please see our answers to comment points 2 and 6 raised by this reviewer.

      The authors used whole-body KO mice, and the role of macrophages in cardiac aging is not studied in this model. A macrophage-specific arg II Ko would be a better model.

      We fully agree with this comment of the reviewer. Unfortunately, a macrophage-specific arg-ii knockout animal model is not yet available. Future research should develop a macrophage-specific arg-ii<sup>-/-</sup> mouse model to confirm this conclusion in aging animals. Since Arg-II is also expressed in fibroblasts and endothelial cells and exerts cell-autonomous and paracrine functions, aging mouse models with conditional arg-ii knockout in the specific cell types would be the next step to elucidate the cell-specific function of Arg-II in cardiac aging. We have pointed out this aspect for future research on page 19, lines 2 to 6.

      Experiments need to validate the deficiency of Arg II in cardiomyocytes.

      As pointed out by this reviewer in comment point 10, Arg-II was previously reported to be expressed in isolated cardiomyocytes from rats (PMID: 16537391). Unfortunately, negative controls, i.e., arg-ii<sup>-/-</sup> samples, were not included in that study to exclude possible background signals. We made a great effort to investigate whether Arg-II is present in the cardiomyocytes from different species including mice, rats and humans and have included old arg-ii<sup>-/-</sup> mouse samples as a negative control. This allows validation of the antibody specificity and exclusion of background noise beyond any reasonable doubt. The new experiments in Suppl. Fig. 4 confirm the specificity of the antibody against Arg-II in old mouse kidney, which is known to express Arg-II in the S3 proximal tubular cells (Huang J, et al. 2021). To exclude possible species-specific differences in the expression of Arg-II in the cardiomyocytes, aged mouse and rat heart tissues were used for cellular localization of Arg-II by confocal immunofluorescence staining. As shown in Suppl. Fig. 4B and 4C, both species show Arg-II expression only in non-cardiomyocytes (cells between striated cardiomyocytes) (red arrows) but not in striated cardiomyocytes. Even in the rat myocardial infarction tissues, Arg-II was not found in cardiomyocytes but in endocardium cells (Suppl. Fig. 4B). In isolated cardiomyocytes exposed to hypoxia, a well-known strong stimulus for Arg-II protein levels, no Arg-II signals could be detected, while in fibroblasts from the same animals, elevated Arg-II levels under hypoxia are demonstrated (Fig. 5B). Furthermore, even RT-qPCR could not detect arg-ii mRNA in cardiomyocytes but did detect it in non-cardiomyocytes (Fig. 5C). Altogether, these results demonstrate that Arg-II is not expressed, or is expressed at negligible levels, in cardiomyocytes but is expressed in non-cardiomyocytes. These new experiments with rat heart are included in the method section on page 20, the 1st paragraph. The results are described on page 7, the 1st paragraph, and discussed on page 12, the 2nd paragraph. The legend to Suppl. Fig. 4 is included in the file “Suppl. figure legend_R”.

      The authors have never investigated the possibility of NO involvement in this mice model.

      As mentioned above, we made a great effort and performed new experiments in a human monocyte cell line (THP1), in which iNOS is not expressed and not inducible by LPS, and in which the arg-ii gene was knocked out by CRISPR technology. Moreover, murine bone-marrow-derived macrophages in which the inos gene was ablated were also used for this purpose. The results show that Arg-II and iNOS can be upregulated by LPS independently of each other and that iNOS slightly reduces Arg-II expression. However, both Arg-II and iNOS are required for IL-1b production upon LPS stimulation. For detailed results and discussion, please see our answers to comment points 2 and 6 raised by this reviewer.

      A co-culture system would be appropriate to understand the non-cell-autonomous functions of macrophages.

      We appreciate the suggestion by this reviewer regarding the co-culture system to test the non-cell-autonomous role of Arg-II. We think that our current model, which involves treating cells with conditioned media, is a well-established and effective method for demonstrating the non-cell-autonomous role of Arg-II. This approach allows us to observe the effects of Arg-II on surrounding cells through the factors present in the conditioned media released from macrophages. A co-culture system could be considered if the released factor in the conditioned medium were not stable. This is, however, not the case. Therefore, we are confident that our experimental model with conditioned medium is sufficient to demonstrate a paracrine effect of cell-cell interaction (please also see our answer to comment point 16).

      The Myocardial infarction data shown in the mice model may not be directly linked to cardiac aging.

      As we have introduced and discussed in the manuscript, aging is a predominant risk factor for cardiovascular disease (CVD). Studies in experimental animal models and in humans provide evidence that the aging heart is more vulnerable to stressors such as ischemia/reperfusion injury and myocardial infarction than the heart of young individuals. Even in the heart of apparently healthy individuals of old age, chronic inflammation, cardiomyocyte senescence, cell apoptosis, interstitial/perivascular tissue fibrosis, endothelial dysfunction and endothelial-mesenchymal transition (EndMT), and cardiac dysfunction with either preserved or reduced ejection fraction are observed. Our study aims to investigate the role of Arg-II in the cardiac aging phenotype and in age-associated cardiac vulnerability to stressors. Therefore, cardiac functional changes and myocardial infarction in response to ischemia/reperfusion injury are suitable surrogate parameters for this purpose.

      Reviewer #2 (Public Review):

      Summary:

      The results from this study demonstrated a cell-specific role of mitochondrial enzyme arginase-II (Arg-II) in heart aging and revealed a non-cell-autonomous effect of Arg-II on cardiomyocytes, fibroblasts, and endothelial cells through the crosstalk with macrophages via inflammatory factors, such as by IL-1b, as well as a cell-autonomous effect of Arg-II through mtROS in fibroblasts contributing to cardiac aging phenotype. These findings highlight the significance of non-cardiomyocytes in the heart and bring new insights into the understanding of pathologies of cardiac aging. It also provides new evidence for the development of therapeutic strategies, such as targeting the ArgII activation in macrophages.

      We're grateful for the reviewer's positive feedback, acknowledging the significant findings of our study on the role of arginase-II (Arg-II) in cardiac aging. We appreciate this reviewer’s insight into the therapeutic potential of targeting Arg-II activation in macrophages and are excited about the implications for future interventions in age-related cardiac pathologies. Thank you for recognizing the importance of our work in advancing our understanding of cardiac aging and potential therapeutic strategies.

      Strengths:

      This study targets an important clinical challenge, and the results are interesting and innovative. The experimental design is rigorous, the results are solid, and the representation is clear. The conclusion is logical and justified.

      We thank this reviewer for the positive comment.

      Weaknesses:

      The discussion could be extended a little bit to improve the realm of the knowledge related to this study.

      We appreciate this comment and have added and revised our discussion on this aspect accordingly at the end of the discussion section on page 19.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      I have several critical concerns, specifically about the mechanism of how Arg-II plays a role in cardiac aging.

      My major concerns are:

      (1) The authors have shown non-cell-autonomous effects on aging cardiomyocytes, fibroblasts, and endothelial cells mediated by IL-1b from aging macrophages. A macrophage-specific Arg-II knock-out mouse model is a suitable and necessary control to establish claims.

      We fully agree with this comment of the reviewer. Unfortunately, a macrophage-specific arg-ii knockout animal model is not yet available. Future research should develop a macrophage-specific arg-ii<sup>-/-</sup> mouse model to confirm this conclusion in aging animals. Since Arg-II is also expressed in fibroblasts and endothelial cells and exerts cell-autonomous and paracrine functions, aging mouse models with conditional arg-ii knockout in the specific cell types would be the next step to elucidate the cell-specific function of Arg-II in cardiac aging. We have pointed out this aspect for future research on page 19, lines 2 to 6.

      (2) This study suggests that Arg-II exerts its effect through IL-1b in cardiac ageing. However, all experiments performed to demonstrate the link between ArgII and IL-1β are correlative at best. The underlying molecular mechanism, including transcription factors involved in the regulation of IL-1β by arg-ii, has not been demonstrated.

      We sincerely appreciate this reviewer’s comment on the aspect! To make it clear, a causal role of Arg-II in promoting IL-1β production in macrophages is evidenced by the experimental results showing that old arg-ii<sup>-/-</sup> mouse heart has lower IL-1β levels than the age-matched wt mouse heart (Fig. 6A to 6D). We further showed that the cellular IL-1β protein levels and release are reduced in old arg-ii<sup>-/-</sup> mouse splenic macrophages as compared to the wt cells (Fig. 7A, 7C, and 7D). This result is further confirmed with the mouse macrophage cell line RAW264.7 (Suppl. Fig. 5A and suppl. Fig. 5C), in which we demonstrate that silencing arg-ii reduces IL-1β levels stimulated with LPS.

      According to this reviewer’s comment (see comment point 6), we made further effort to investigate possible involvement of iNOS in Arg-II-regulated IL-1β production in macrophages stimulated with LPS. We performed new experiments in human monocyte cell line (THP1) in which iNOS is not expressed and not inducible by LPS and arg-ii gene was knocked out by CRISPR technology in the cells.

      Moreover, murine bone-marrow-derived macrophages in which the inos gene was ablated were also used for this purpose. We found that in the human THP1 monocytes, in which Arg-II but not iNOS is induced by LPS (100 ng/mL for 24 hours) (Suppl. Fig. 6A), mRNA and protein levels of IL-1b are markedly reduced in arg-ii knockout THP1<sup>arg-ii<sup>-/-</sup></sup> as compared to the THP1<sup>wt</sup> cells (Suppl. Fig. 6B and 6C), further confirming that Arg-II promotes IL-1b production as also shown in RAW264.7 macrophages (Suppl. Fig. 5A and 5C). The results suggest that Arg-II promotes IL-1b production independently of iNOS. Moreover, the role of iNOS in IL-1b production was also studied in the mouse bone-marrow-derived macrophages in which the inos gene is ablated. The results demonstrate that LPS-induced IL-1b production is inhibited by inos deficiency (BMDM<sup>inos-/-</sup> vs BMDM<sup>wt</sup>) (Suppl. Fig. 6D and 6E), while Arg-II levels are slightly enhanced in the BMDM<sup>inos-/-</sup> cells (Suppl. Fig. 6D and 6F). Since arginase and iNOS share the same metabolic substrate L-arginine, <sup>inos-/-</sup> is expected to increase IL-1b production. This is, however, not the case. A strong inhibition of IL-1β production in <sup>inos-/-</sup> macrophages is observed. These results indicate that iNOS promotes IL-1β production independently of Arg-II and that the inhibitory effect of inos deficiency on IL-1β is dominant and able to counteract Arg-II’s stimulating effect on IL-1β production. Hence, our results demonstrate that Arg-II promotes IL-1β production in macrophages independently of iNOS. Altogether, these results suggest that iNOS slightly reduces Arg-II expression. Arg-II and iNOS can be upregulated by LPS independently. Both Arg-II and iNOS are required for IL-1b production upon LPS stimulation (this concept is illustrated in Suppl. Fig. 6G). The new results are described on page 8, the last paragraph, and page 9, the 1st paragraph, and presented in Suppl. Fig. 6. The legend to Suppl. Fig. 6 is described in the file “Supplementary figure legend-R”. The related experimental methods are updated on page 23, the last two paragraphs, and page 26, the last paragraph. The results are discussed on page 14, the last paragraph, and page 15, the first two paragraphs.

      (3) Figure 2: The authors have not validated the whole-body Arg-II knock-out mice for arg-ii ablation.

      Thanks for pointing out this missing information! We have added the information regarding genotyping of the mice in the method section on page 20, first paragraph. Moreover, Fig. 5C also confirms the genotyping of the non-cardiomyocyte cells isolated from wt and arg-ii<sup>-/-</sup> animals.

      (4) It is unclear why the authors have chosen to focus on IL-1β specifically, among other pro-inflammatory cytokines that were also downregulated in Arg-II-/- mice as demonstrated in Fig. 2A-D.

      We appreciate the reviewer's question, which provides an opportunity to delve deeper into our findings. In our investigation, we observed that aging is accompanied by elevated levels of various proinflammatory markers. Intriguingly, our data revealed that tnf-α remained unaffected by the ablation of arg-ii during aging in the heart tissues, while Il-1β showed a significant reduction in arg-ii<sup>-/-</sup> animals compared to age-matched wild-type (wt) mice (Fig. 2). Mcp1 is, however, a chemoattractant for macrophages, and F4-80 serves as a pan marker for macrophages. Moreover, our previous studies demonstrate a relationship between Arg-II and IL-1β in vascular disease and obesity and in age-associated renal and pulmonary fibrosis. Finally, IL-1β has been shown to play a causal role in patients with coronary atherosclerotic heart disease, as shown by the CANTOS trial. Therefore, we have focused on IL-1β in this study. We have now explained and strengthened this aspect in the manuscript on page 7, the last two lines, and page 8, the 1st paragraph, as follows:

      “Taking into account that our previous studies demonstrated a relationship of Arg-II and IL-1β in vascular disease and obesity (Ming et al., 2012) and in age-associated organ fibrosis such as renal and pulmonary fibrosis (Huang et al., 2021; Zhu et al., 2023), and IL-1β has been shown to play a causal role in patients with coronary atherosclerotic heart disease as shown by CANTOS trials (Ridker et al., 2017), we therefore focused on the role of IL-1β in crosstalk between macrophages and cardiac cells such as cardiomyocytes, fibroblasts and endothelial cells”.

      (5) Although macrophages are shown to be involved in cardiac ageing in the arg-ii mouse model, the authors have not estimated macrophage infiltration and expression of inflammatory or senescence markers in the hearts of these mice.

      Thank you very much for raising this important point! Taking the comments of the reviewer into account, we have performed new experiments, i.e., multiple immunofluorescent staining to analyze the infiltrated (CCR2<sup>+</sup>/F4-80<sup>+</sup>) and resident (LYVE1<sup>+</sup>/F4-80<sup>+</sup>) macrophage populations and to investigate to what extent Arg-II affects the infiltrated and resident macrophage populations in the aging heart and whether this is regulated by arg-ii<sup>-/-</sup>. The results show an age-associated increase in the numbers of F4/80<sup>+</sup> cells in the wt mouse heart, which is reduced in the age-matched arg-ii<sup>-/-</sup> animals (Fig. 2G). This result is in accordance with the result of f4/80 gene expression shown in Fig. 2A, demonstrating that arg-ii gene ablation reduces macrophage accumulation in the aging heart. Interestingly, resident macrophages as characterized by LYVE1<sup>+</sup>/F4-80<sup>+</sup> cells (Fig. 2E and 2H) are predominant in the aging heart as compared to the infiltrated CCR2<sup>+</sup>/F4-80<sup>+</sup> cells (Fig. 2F and 2I). The increase in both LYVE1<sup>+</sup>/F4-80<sup>+</sup> and CCR2<sup>+</sup>/F4-80<sup>+</sup> macrophages in the aging heart is reduced in arg-ii<sup>-/-</sup> mice (Fig. 2E, 2F, 2H, and 2I). These new results are described on page 6, the 1st paragraph, presented in Fig. 2E to 2I, and discussed on page 13, the 2nd paragraph. The legend to Fig. 2 is revised. The method for this additional experiment is included on page 22, the 1st paragraph.

      Moreover, the age-associated accumulation of senescent cells, as demonstrated by p16<sup>ink4</sup>-positive cells, is significantly reduced in arg-ii<sup>-/-</sup> animals. This new result is incorporated in Fig. 1 as Fig. 1G and 1H and described and discussed on page 5, the 2nd paragraph, and page 14, the second-to-last sentence of the 1st paragraph. The method of p16<sup>ink4</sup> staining is included in the method section on page 22, the 1st paragraph, line 7. The legend to Fig. 1 is revised accordingly.

      (6) Previously, Arg-II has been reported to serve a crucial role in ageing associated with reduced contractile function in rat hearts by regulating Nitric Oxide Synthase (PMID: 22160208). Elevated NO and superoxide have been shown to play crucial roles in the etiology of cardiovascular diseases (PMID: 24180388). Therefore, it is important to assess whether Nitric Oxide (NO) is involved in the aging-related phenotype in this mouse model.

      Following the reviewer's suggestion, we conducted new experiments to investigate the role of nitric oxide (NO) in the context of the effect of Arg-II-induced IL-1b production in macrophages. We have addressed this question in the response to the comment point 2.

      (7) Based on the results demonstrated in the study, ablation of Arg-II can be expected to cause a reduction in inflammation-associated phenotypes throughout the body at the multi-organ level. The observed improved cardiac phenotype could be an outcome of whole-body Arg-II ablation. It would be fruitful to develop a cardiac-specific Arg-II knockout mouse model to establish the role of Arg-II in the heart, independent of other organ systems.

      We agree with the comment of the reviewer on this point. Unfortunately, as explained above (see point 1), it is currently not possible for us to perform the requested experiments, due to the lack of a cardiac-specific arg-ii knockout mouse model. Moreover, such an approach is complicated by the absence of Arg-II in cardiomyocytes and the expression of Arg-II in multiple cell types, including endothelial cells, fibroblasts and macrophages of different origin (resident and monocyte-derived infiltrating cells). It is thus difficult to generate a cardiac-specific gene knockout mouse. Future work should investigate the roles of cell-specific Arg-II in cardiac aging by generating cell-specific arg-ii<sup>-/-</sup> mice. We very much appreciate this important aspect and have discussed the issue on page 19, lines 2 to 6.

      (8) Contrary to the findings in this paper, Arg-II has previously been reported to be essential for IL-10-mediated downregulation of pro-inflammatory cytokines, including IL-1β (PMID: 33674584).

      Thank you very much for mentioning this study! We have now thoroughly discussed the controversy on page 15, the last paragraph, and page 16, the 1st paragraph, as follows:

      “It is of note that a study reported that Arg-II is required for IL-10 mediated-inhibition of IL-1b in mouse BMDM upon LPS stimulation (Dowling et al., 2021), which suggests an anti-inflammatory function of Arg-II. The results of our present study, however, demonstrate that LPS enhances Arg-II and IL-1b levels in macrophages and knockout or silencing Arg-II reduces IL-1b production and release, demonstrating a pro-inflammatory effect of Arg-II. Our findings are supported by the study from another group, which shows decreased pro-inflammatory cytokine production including IL-6 and IL-1b in arg-ii<sup>-/-</sup> BMDM most likely through suppression of NFkB pathway, since arg-ii<sup>-/-</sup> BMDM reveals decreased activation of NFkB and IL-1b levels upon LPS stimulation (Uchida et al., 2023). Most importantly, our previous study also showed that re-introducing arg-ii gene back to the arg-ii<sup>-/-</sup> macrophages markedly enhances LPS-stimulated pro-inflammatory cytokine production (Ming et al., 2012), providing further evidence for a pro-inflammatory role of arg-ii under LPS stimulation. In support of this conclusion, chronic inflammatory diseases such as atherosclerosis and type 2 diabetes (Ming et al., 2012), inflammaging in lung (Zhu et al., 2023), kidney (Huang et al., 2021) and pancreas (Xiong, Yepuri, Necetin, et al., 2017) of aged animals or acute organ injury such as acute ischemic/reperfusion or cisplatin-induced renal injury are reduced in the arg-ii<sup>-/-</sup> mice (Uchida et al., 2023). The discrepant findings between these studies and that with IL-10 may implicate dichotomous functions of Arg-II in macrophages, depending on the experimental context or conditions. Nevertheless, our results strongly implicate a pro-inflammatory role of Arg-II in macrophages in the inflammaging in aging heart”.

      (9) The authors have only performed immunofluorescence-based experiments to show fibrotic and apoptotic phenotypes throughout this study. To verify these findings, we suggest that they additionally perform RT-PCR or western blotting analysis for fibrotic markers and apoptotic markers.

      The fibrotic aspect was analyzed not only by microscopy but also by a quantitative biochemical assay, namely hydroxyproline content assessment. Hydroxyproline is a major component of collagen and is largely restricted to collagen. Therefore, hydroxyproline levels can be used as an indicator of collagen content, as previously investigated in the lung (Zhu et al., 2023). We have also measured collagen gene expression by RT-qPCR as suggested by the reviewer and found an age-related decline of collagen mRNA expression levels in both wt and arg-ii<sup>-/-</sup> mice, suggesting that the age-associated cardiac fibrosis and its prevention in arg-ii<sup>-/-</sup> mice are due to alterations of translational and/or post-translational regulation, including collagen synthesis and/or degradation. The results are in accordance with those reported by other studies in the literature. We have pointed out this aspect on page 5, the 2nd paragraph:

      “The increased cardiac fibrosis in aging is however, associated with decreased mRNA levels of collagen-Ia (col-Ia) and collagen-IIIa (col-IIIa), the major isoforms of pre-collagen in the heart (Suppl. Fig. 2A and 2B), which is a well-known phenomenon in cardiac fibrotic remodelling (Besse et al., 1994; Horn et al., 2016). The results demonstrate that age-associated cardiac fibrosis and prevention in arg-ii<sup>-/-</sup> mice is due to alterations of translational and/or post-translational regulations including collagen synthesis and/or degradation”.

      The results are presented in Suppl. Fig. 2, legend to Suppl. Fig. 2 is included in the file “Suppl. figure legend_R”. Suppl. table 2 for primers is revised accordingly.

      We did not use additional markers to perform apoptosis assays with whole heart, since Fig. 3 provides good evidence that aging is associated with increased numbers of apoptotic cells in the heart, which are significantly reduced in the arg-ii<sup>-/-</sup> mice. The reduction of TUNEL-positive (apoptotic) cells in aged arg-ii<sup>-/-</sup> mice is mainly due to a decrease in apoptotic cardiomyocytes. Histological analysis allows the apoptotic cell types to be identified. In contrast, biochemical assays for apoptosis, such as caspase-3 cleavage in whole heart tissue, cannot distinguish apoptotic cell types and may not be sensitive enough for the aging heart, owing to the relatively low numbers of apoptotic cells compared with the myocardial infarction model.

      (10) Figure 4: arg-ii has previously been reported to be expressed in rat cardiomyocytes (PMID: 16537391). We strongly suggest the authors verify the expression of Arg-II via immunostaining in isolated cardiomyocytes (using published protocols), and by using multiple different cardiomyocyte-specific markers for colocalization studies to prove the lack of arg-ii expression beyond a reasonable doubt.

      As pointed out by this reviewer, Arg-II was previously reported to be expressed in isolated rat cardiomyocytes (PMID: 16537391). Unfortunately, negative controls, i.e., arg-ii<sup>-/-</sup> samples, were not included in that study to exclude possible background signals. We made a great effort to investigate whether Arg-II is present in cardiomyocytes from different species, including mice, rats and humans, and have included old arg-ii<sup>-/-</sup> mouse samples as a negative control. This allows us to validate the antibody specificity and to rule out background noise beyond reasonable doubt. The new experiments in Suppl. Fig. 4 confirm the specificity of the antibody against Arg-II in old mouse kidney, which is known to express Arg-II in the S3 proximal tubular cells (Huang J, et al. 2021). To exclude possible species-specific differences in Arg-II expression in cardiomyocytes, aged mouse and rat heart tissues were used for cellular localization of Arg-II by confocal immunofluorescence staining. As shown in Suppl. Fig. 4B and 4C, both species show Arg-II expression only in non-cardiomyocytes (cells between striated cardiomyocytes) (red arrows) but not in striated cardiomyocytes. Even in rat myocardial infarction tissue, Arg-II was not found in cardiomyocytes but in endocardial cells (Suppl. Fig. 4B). In isolated cardiomyocytes exposed to hypoxia, a well-known strong stimulus for Arg-II protein levels, no Arg-II signal could be detected, while in fibroblasts from the same animals, elevated Arg-II levels under hypoxia are demonstrated (Fig. 5B). Furthermore, RT-qPCR detected arg-ii mRNA in non-cardiomyocytes but not in cardiomyocytes (Fig. 5C). Altogether, these results demonstrate that Arg-II is not expressed, or is expressed only at negligible levels, in cardiomyocytes but is expressed in non-cardiomyocytes. These new experiments with rat heart are included in the method section on page 20, the 1st paragraph. The results are described on page 7, the 1st paragraph, and discussed on page 12, the 2nd paragraph. Legend to Suppl. Fig. 4 is included in the file “Suppl. figure legend_R”.

      (11) Figure 6G: It may be worthwhile to supplement arg-ii<sup>-/-</sup> old cells with IL-1beta to see if there is an increase in TUNEL-positive cells.

      IL-1b is a well-known pro-inflammatory cytokine that causes apoptosis in various cell types including cardiomyocytes (Shen Y., et al., Tex Heart Inst J. 2015;42:109–116. doi: 10.14503/THIJ-14-4254; Liu Z. et. al., Cardiovasc Diabetol 2015;14,125. doi: 10.1186/s12933-015-0288-y; Li. Z., et al., Sci Adv 2020;6:eaay0589. doi: 10.1126/sciadv.aay0589). We very much appreciate this reviewer's interesting idea to investigate the apoptotic responses of cardiomyocytes from arg-ii<sup>-/-</sup> mice to IL-1b. We agree that cardiomyocytes from wt and arg-ii<sup>-/-</sup> mice may react differently to IL-1b, although cardiomyocytes do not express Arg-II, as demonstrated in our present study. If this is true, it must be due to non-cell-autonomous effects of the different aging microenvironment in the heart or to epigenetic modulation of the myocytes. We find this a very interesting aspect that requires further extensive investigation. Since our current study focused on the effect of wt and arg-ii<sup>-/-</sup> macrophages on cardiomyocytes and non-cardiomyocytes, we prefer not to include this suggested aspect in our manuscript and would like to explore it in a follow-up study.

      (12) Figures 4-9: It would be interesting to see if the effect of ArgII in cardiac ageing is gender-specific. It is recommended to include experimental data with male mice in addition to the results demonstrated in female mice.

      As pointed out in the manuscript, we have focused on female mice, because the age-associated increase in arg-ii expression is more pronounced in females than in males (Fig. 1A). As suggested by this reviewer, we performed additional experiments investigating the effects of arg-ii deficiency in male mice during aging, focusing on pathophysiological outcomes of ischemia/reperfusion (I/R) injury in ex vivo experiments. The ex vivo functional experiments with the Langendorff system were performed in aged male mice (see Suppl. Fig. 9). Following I/R injury, wt male mice display reduced left ventricular developed pressure (LVDP), as well as reduced inotropic and lusitropic states (expressed as dP/dt max and dP/dt min, respectively). As previously reported (Murphy et al., 2007), we also found that old male mice are more prone to I/R injury than age-matched female animals. Specifically, 15 minutes of ischemia are enough to significantly affect left ventricular contractile function in the male mice (Suppl. Fig. 9). In contrast, age-matched old female mice are relatively resistant to I/R injury, and at least 20 min of ischemia are necessary to induce a significant impairment of contractile function (Fig. 10). Similar to females, the post-I/R recovery of cardiac function is also significantly improved in the male arg-ii<sup>-/-</sup> mice as compared to age-matched wt animals. In addition to functional recovery, the infarct size upon I/R injury, assessed by triphenyl tetrazolium chloride (TTC) staining, is significantly reduced in the age-matched male arg-ii<sup>-/-</sup> animals (Suppl. Fig. 9C and 9D). Altogether, these results reveal a role for Arg-II in heart function impairment during aging in both sexes, with a higher vulnerability to stress in males. These new results are presented in Suppl. Fig. 9 and described on page 10, the last paragraph, and page 11. The results are discussed on page 18, the 2nd paragraph, as follows:

      “The fact that aged females have higher Arg-II but are more resistant to I/R injury seems contradictory to the detrimental effect of Arg-II in I/R injury. It is presumable that cardiac vulnerability to injuries stressors depends on multiple factors/mechanisms in aging. Other factors/mechanisms associated with sex may prevail and determine the higher sensitivity of male heart to I/R injury, which requires further investigation. Nevertheless, the results of our study show that Arg-II plays a role in cardiac I/R injury also in males”.

      The information on the experimental methods in the male animals is included on page 20, the last paragraph and page 21, the 1st paragraph. Legend to Suppl. Fig. 9 is included in the file “Suppl. figure legend_R”.

      (13) Figure 6G: cardiomyocytes from wild-type mice, when treated with macrophages, show 0% TUNEL-positive cells. Since it is unlikely to obtain no TUNEL staining in a cell population, there may be an experimental or analytical error.

      This is now Fig. 7F and 7G. The result is due to our specific experimental procedure. After tissue digestion, cardiomyocytes were plated on laminin-coated dishes. Laminin promotes the adhesion of surviving cells. Following plating, we conducted a thorough washing process to remove damaged and partially adherent cells. This step ensures that only well-shaped, viable, and strongly adherent cells remain as bioassay cells. These “healthy” cells are then selected for the experiments. The apoptotic cells are removed by the washing, which explains the high viability of the bioassay cells. We have added this detailed information in the method section on page 24, the 2nd paragraph.

      (14) Figure 7J: Please assess whether arg-ii depletion also affects the mtROS phenotype.

      Following the suggestion of this reviewer, we performed new experiments which show that human cardiac fibroblasts (HCFs) exposed to hypoxia (1% O<sub>2</sub>, 48 hours), a known physiological trigger of Arg-II up-regulation, exhibit increased mtROS generation, which involves Arg-II (new Fig. 8M to 8P). We found that Arg-II protein levels and mtROS (assessed by mitoSOX staining) were both enhanced, accompanied by increased levels of HIF1α (Fig 8M). Moreover, mito-TEMPO pre-incubation reduces mtROS, confirming the mitochondrial origin of the ROS. Silencing of arg-ii with rAd-mediated shRNA significantly reduces mtROS levels, demonstrating a role of Arg-II in the production of mitochondrial ROS in cardiac fibroblasts (Fig 8M to 8P). We have included these results on page 9, the last paragraph and discussed the results on page 17, the 1st paragraph. The related method is described on page 26, the 2nd paragraph. Legend to Fig. 8 is updated on page 32.

      (15) Figure 8A-E: The authors have treated human-origin endothelial cells with mice-origin macrophage-conditioned media. It would be more suitable to treat the endothelial cells with human-origin macrophage-conditioned media.

      We acknowledge the concern regarding the use of mouse-origin macrophage-conditioned media on human-origin endothelial cells. Of note, the biological cross-reactivity of cytokines from one species on cells from a different species has been reported in the literature. It was observed that there is a fairly strict threshold of 60% amino acid identity, above which cytokines tend to cross-react, and that, statistically, cytokines cross-react more often as their % amino acid identity increases (Scheerlinck JPY. Functional and structural comparison of cytokines in different species. Vet Immunol Immunopathol. 1999; 72:39-44. https://doi.org/10.1016/S0165-2427(99)00115-4). Taking IL-1b as an example, the 17.5 kDa mature mouse and human IL-1b share 92% aa sequence identity, suggesting a high cross-reactivity. Indeed, human IL-1b has shown biological cross-reactivity in mouse cells (Ledesma E., et al. Interleukin-1 beta (IL-1β) induces tumor necrosis factor alpha (TNF-α) expression on mouse myeloid multipotent cell line 32D cl3 and inhibits their proliferation. Cytokine. 2004; 26:66-72. https://doi.org/10.1016/j.cyto.2003.12.009). Moreover, our results also support the reported cross-reactivity between human and mouse IL-1b. The CM from mouse macrophages indeed showed biological activity in human endothelial cells. The observed effects of the conditioned media from aged wild-type macrophages on endothelial cells were specifically mediated through IL-1β. This conclusion is supported by our data showing that the upregulation induced by the conditioned media was significantly reduced by the addition of an IL-1β receptor blocker.

      (16) The co-culture system would be more interesting to test the non-cell autonomous role of Arg II.

      We appreciate the suggestion by this reviewer regarding the co-culture system to test the non-cell autonomous role of Arg-II. We believe that our current model, which involves treating cells with conditioned media, is a well-established and effective method for demonstrating the non-cell autonomous role of Arg-II. This approach allows us to observe the effects of Arg-II on surrounding cells through the factors present in the conditioned media. The co-culture system could be considered, if the released factor in the conditioned medium is not stable. This is however not the case. So we are confident that our experimental model with conditioned medium is good enough to demonstrate a paracrine effect of cell-cell interaction.

      Reviewer #2 (Recommendations For The Authors):

      Some minor comments may be considered to improve the realm of the knowledge related to this study.

      We appreciate this comment and have added and revised our discussion on this aspect accordingly at the end of the discussion section on page 19, the last 6 lines.

      (1) The current study showed strong evidence demonstrating the key role of cardiac macrophages in pathologies of cardiac aging, particularly, the macrophages (MФ) from the circulating blood (hematogenous). It is known that the heart is among the minority of organs in which substantial numbers of yolk-sac MФ persist in adulthood and play a crucial role in maintaining cardiac function. Thus, the adult mammalian heart contains two separate and discrete cardiac MФ subgroups, i.e., the resident MФs originated from yolk sac-derived progenitors and the hematogenous MФs recruited from circulating blood monocytes. These two subtypes of MФs may play distinctive roles in the aging heart and the response to cardiac injury. The author could extend the discussion on the possibility of the resident MФs in aging hearts, which could be further investigated in the future.

      We appreciate the suggestion and agree that it provides valuable insight into the study. Taking the comments of reviewer 1 into account, we have performed new experiments, i.e., co-immunostaining to analyze the infiltrated (CCR2<sup>+</sup>/F4/80<sup>+</sup>) and resident (LYVE1<sup>+</sup>/F4/80<sup>+</sup>) macrophage populations and to investigate to what extent Arg-II affects infiltrated and resident macrophage populations in the aging heart. We found that, in line with the gene expression of f4/80, immunofluorescence staining reveals an age-associated increase in the numbers of F4/80<sup>+</sup> cells in the wt mouse heart, which is reduced in the age-matched arg-ii<sup>-/-</sup> animals (Fig. 2E, F, G), demonstrating that arg-ii gene ablation reduces macrophage accumulation in the aging heart. Interestingly, resident macrophages, as characterized by LYVE1<sup>+</sup>/F4/80<sup>+</sup> cells (Fig. 2E and 2H), are predominant in the aging heart as compared to the infiltrated CCR2<sup>+</sup>/F4/80<sup>+</sup> cells (Fig. 2F and 2I). The increase in both LYVE1<sup>+</sup>/F4/80<sup>+</sup> and CCR2<sup>+</sup>/F4/80<sup>+</sup> macrophages in the aging heart is reduced in arg-ii<sup>-/-</sup> mice (Fig. 2E, 2F, 2H, and 2I). These new results are described on page 6, the 1st paragraph, presented in Fig. 2E to 2I, and discussed on page 13, the 2nd paragraph. The legend to Fig. 2 is revised. The method for this additional experiment is included on page 22, the 1st paragraph.

      (2) It would be beneficial to the readers if the author could provide some explanation about why ArgII could not be detected in VSMCs in the mouse heart and the species difference between humans and mice. In addition, the author may provide an assumption on the possibility that there may also be a cross-talk between macrophages and VSMCs in the aging heart. A little bit more explanation in the Discussion will be helpful.

      We acknowledge and appreciate the suggestion and have discussed these points on page 19 as the following:

      “In this context, another interesting aspect is the cross-talk between macrophages and vascular SMC in the aging heart. In our present study, we could not detect Arg-II in vascular SMC of mouse heart but in that of human heart. This could be due to the difference in species-specific Arg-II expression in the heart or related to the disease conditions in human heart which is harvested from patients with cardiovascular diseases. Indeed, in the apoe<sup>-/-</sup> mouse atherosclerosis model, aortic SMCs do express Arg-II (Xiong et al., 2013). It is interesting to note that rodents hardly develop atherosclerosis as compared to humans. Whether this could be partly contributed by the different expression of Arg-II in vascular SMC between rodents and humans requires further investigation. In our present study, the aspect of the cross-talk between macrophages and vascular SMC is not studied. Since the crosstalk between macrophages and vascular SMC has been implicated in the context of atherogenesis as reviewed (Gong et al., 2025), further work shall investigate whether Arg-II expressing macrophages could interact with vascular SMC in the coronary arteries in the heart and contribute to the development of coronary artery disease and/or vascular remodelling and the underlying mechanisms”.

      (3) Please clarify the arrows in Figure 9C that indicate the infarct area in each splicing section from one heart.

      The arrows in Figure 9C (now Fig. 10C) are indeed utilized to indicate the sections displaying the infarcted area within each splicing section from one heart. We have explained the arrow in the figure legend (now Fig. 10 and also new Suppl. Fig. 9).

    1. Joint Public Review:

      Summary:

      The authors have conducted the largest to date Mendelian Randomization (MR) analysis of the association between genetically predicted measures of adiposity and risk of head and neck cancer (HNC) overall and by subsites within HNC. MR uses genetic predictors of an exposure, such as gene variants associated with high BMI or tobacco use, rather than data from individual physical exams or questionnaires, and if it can be done in its idealized state, there should be no problems with confounding. Traditional epidemiologic studies have reported a variety of associations between BMI (and a few other measures of adiposity) and risk of HNC that typically differ by the smoking status of the subjects. Those findings are controversial given the complex relationship between tobacco and both BMI and HNC risk. Tobacco smokers are often thinner than non-smokers, so this could create an artificial ('confounded') association that may not be fully adjusted away in risk models. The findings of a BMI-HNC association are often attributed to residual confounding, and this seems ripe for an MR approach if suitable genetic instrumental variables can be created. Here, the authors built a variety of genetic instrumental variables for BMI and other measures of adiposity, as well as two instrumental variables for smoking habits, and then tested their hypotheses in a large case-control set of HNC and controls with genetic data.

      The authors found that the genetic model for BMI was associated with HNC risk in simple models, but this association disappeared when using models that better accounted for pleiotropy, the condition when genetic variants are associated with more than one trait, such as both BMI and tobacco use. When they used both adiposity and tobacco use genetic instruments in a single model, there was a strong association with genetically predicted tobacco use (as is expected), but there was no remaining association with genetic predictors of adiposity. They conclude that high BMI/adiposity is not a risk factor for HNC.

      Strengths:

      The primary strength was the expansive use of a variety of different genetic instruments for BMI/adiposity/body size, along with employing a variety of MR model types, several of which are known to be less sensitive to pleiotropy. They also used the largest case-control sample size to date.

      Weaknesses:

      The lack of pleiotropy is an unconfirmable assumption of MR, and the addition of those models is therefore quite important, as this is a primary weakness of the MR approach. Given that concern, I read the sensitivity analyses using pleiotropy-robust models as the main result, and in that case, they can't test their hypotheses as these models do not show a BMI instrumental variable association. The other weakness, which might be remedied, is that the power of the tests here is not described. When a hypothesis is tested with an under-powered model, the apparent lack of association could be due to inadequate sample size rather than a true null. Typically, when a statistically significant association is reported, power concerns are discounted as long as the study is not so small as to create spurious findings. That is the case with their primary BMI instrumental variable model - they find an association so we can presume it was adequately powered. But the primary models they share are not the pleiotropy-robust methods MR-Egger, weighted median, and weighted mode. The tests for these models are null, and that could mean a couple of things: (1) the original primary significant association between the BMI genetic instrument was due to pleiotropy, and they therefore don't have a robust model to explore the effects of the tobacco genetic instrument. (2) The power for the sensitivity analysis models (the pleiotropy-robust methods) is inadequate, and the authors share no discussion about the relative power of the different MR approaches. If they do have adequate power, then again, there is no need to explore the tobacco instrument.

      Reviewing Editor Comments:

      We suggest that the authors add power estimates to assess whether the sample size is sufficient, given the strength and variability of the genetic instruments. It would also be helpful to present effect estimates for the tobacco instruments alone, to clarify their independent contribution and improve the interpretation of the joint models. In addition, the role of pleiotropy should be addressed more clearly, including which model is considered primary. Stratified analyses by smoking status are encouraged, as prior studies indicate that BMI-HNC associations may differ between smokers and non-smokers. Finally, the comparison with previous studies should be revised, as most reported null findings without accounting for tobacco instruments. If this study finds an association, it should not be framed as a replication.

    2. Author response:

      Our response aims to address the following:

      The lack of pleiotropy is an unconfirmable assumption of MR, and the addition of those models is therefore quite important, as this is a primary weakness of the MR approach. Given that concern, I read the sensitivity analyses using pleiotropy-robust models as the main result, and in that case, they can't test their hypotheses as these models do not show a BMI instrumental variable association. The other weakness, which might be remedied, is that the power of the tests here is not described. When a hypothesis is tested with an under-powered model, the apparent lack of association could be due to inadequate sample size rather than a true null. Typically, when a statistically significant association is reported, power concerns are discounted as long as the study is not so small as to create spurious findings. That is the case with their primary BMI instrumental variable model - they find an association so we can presume it was adequately powered. But the primary models they share are not the pleiotropy-robust methods MR-Egger, weighted median, and weighted mode. The tests for these models are null, and that could mean a couple of things: (1) the original primary significant association between the BMI genetic instrument was due to pleiotropy, and they therefore don't have a robust model to explore the effects of the tobacco genetic instrument. (2) The power for the sensitivity analysis models (the pleiotropy-robust methods) is inadequate, and the authors share no discussion about the relative power of the different MR approaches. If they do have adequate power, then again, there is no need to explore the tobacco instrument.

      We would like to highlight that post-hoc power calculations are often considered redundant since the statistical power estimated for an observed association is directly related to its p-value[1]. In other words, the uncertainty of the association is already reflected in its 95% confidence interval. However, we understand power calculations may still be of interest to the reader, so we will incorporate them in the revised manuscript.
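      As a minimal illustration of why post-hoc power adds no information beyond the p-value, the sketch below computes "observed" power for a two-sided z-test from the p-value alone. It is not part of the analysis in the manuscript; the function name `observed_power` and the z-test approximation are assumptions made purely for illustration.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution


def observed_power(p_value: float, alpha: float = 0.05) -> float:
    """Post hoc ("observed") power of a two-sided z-test.

    Hypothetical helper: it treats the observed effect as if it were the
    true effect, so the result depends only on the p-value and alpha.
    Requires 0 < p_value <= 1.
    """
    z_obs = _STD_NORMAL.inv_cdf(1 - p_value / 2)   # |z| implied by the p-value
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)    # rejection threshold
    # Probability of landing in either rejection region when the test
    # statistic is centred on z_obs.
    upper = 1 - _STD_NORMAL.cdf(z_crit - z_obs)
    lower = _STD_NORMAL.cdf(-z_crit - z_obs)
    return upper + lower


# Observed power is a monotone function of the p-value.
for p in (0.5, 0.05, 0.01):
    print(p, round(observed_power(p), 3))
```

      Under these assumptions, a result with p equal to alpha always has an observed power of roughly 50%, and smaller p-values always map to higher observed power, which is the redundancy referred to above.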

      The reason we use inverse variance weighted (IVW) Mendelian randomization (MR) to obtain our main results rather than the pleiotropy-robust methods mentioned by the reviewer/editors (i.e., MR-Egger, weighted median and weighted mode) is that the former has greater statistical power than the latter[2]. Hence, instead of focussing on the statistical significance of the pleiotropy-robust analyses, we consider it is of more value to compare the consistency of the effect sizes and direction of the effect estimates across methods. Any evidence of such consistency increases our confidence in our main findings, since each method relies on different assumptions. As we cannot be sure about the presence and nature of horizontal pleiotropy, it is useful to compare results across methods even though they are not equally powered. It is true that our results for the genetically predicted effects of body mass index (BMI) on the risk of head and neck cancer (HNC) differ across methods. This is precisely what led us to question the validity of our main finding (suggesting a positive effect of BMI on HNC risk). We will clarify this in the discussion section of the revised manuscript as advised.
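      To make the comparison between methods concrete, the following sketch contrasts a fixed-effect IVW estimate with the MR-Egger slope and intercept, computed from per-variant summary statistics. It is an illustration only: the function names and the toy numbers are hypothetical, standard errors for MR-Egger are omitted, and it is not the code used for the analyses in the manuscript.

```python
import math


def ivw(beta_x, beta_y, se_y):
    """Fixed-effect IVW: weighted regression of beta_y on beta_x through the origin."""
    w = [1.0 / s ** 2 for s in se_y]
    num = sum(wi * bx * by for wi, bx, by in zip(w, beta_x, beta_y))
    den = sum(wi * bx * bx for wi, bx in zip(w, beta_x))
    return num / den, math.sqrt(1.0 / den)  # (estimate, standard error)


def mr_egger(beta_x, beta_y, se_y):
    """MR-Egger: the same weighted regression, but with an intercept term."""
    # Orient each variant so that its exposure association is non-negative.
    x = [abs(bx) for bx in beta_x]
    y = [by if bx >= 0 else -by for bx, by in zip(beta_x, beta_y)]
    w = [1.0 / s ** 2 for s in se_y]
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return slope, my - slope * mx  # (causal estimate, pleiotropy intercept)


# Toy data (hypothetical): true causal effect 0.5, plus a constant
# pleiotropic effect of 0.03 on the outcome for every variant.
BETA_X = [0.1, 0.2, 0.3, 0.4]
BETA_Y = [0.5 * bx + 0.03 for bx in BETA_X]
SE_Y = [0.02] * 4

print("IVW:", ivw(BETA_X, BETA_Y, SE_Y))
print("MR-Egger:", mr_egger(BETA_X, BETA_Y, SE_Y))
```

      In this toy example the IVW estimate is pulled away from 0.5 by the directional pleiotropy, whereas the MR-Egger slope recovers it and the intercept absorbs the pleiotropic effect. The price is an extra estimated parameter, which is one reason the pleiotropy-robust estimate is less precise than IVW when the IVW assumptions hold.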

      We understand that the reviewer/editors are concerned that we do not have a robust model to explore the role of tobacco consumption in the link between BMI and HNC. However, we have a different perspective on the matter. If indeed, the main IVW finding for BMI and HNC is due to pleiotropy (since some of the pleiotropy-robust methods suggest conflicting results), then the IVW multivariable MR method is a way to explore the potential source of this bias[3]. We were particularly interested in exploring the role of smoking in the observed association because smoking and adiposity are known to influence each other [4-9] and share a genetic basis[10, 11].
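      The logic of the multivariable approach can be sketched as a weighted regression of the variant-outcome associations on the variant associations with both exposures at once. The example below is a simplified illustration with hypothetical function names and toy numbers, without standard errors or instrument-strength diagnostics; it is not the implementation used in the manuscript.

```python
def mvmr_ivw(beta_x1, beta_x2, beta_y, se_y):
    """Two-exposure multivariable IVW via the weighted normal equations (no intercept)."""
    w = [1.0 / s ** 2 for s in se_y]
    a11 = sum(wi * a * a for wi, a in zip(w, beta_x1))
    a22 = sum(wi * b * b for wi, b in zip(w, beta_x2))
    a12 = sum(wi * a * b for wi, a, b in zip(w, beta_x1, beta_x2))
    b1 = sum(wi * a * y for wi, a, y in zip(w, beta_x1, beta_y))
    b2 = sum(wi * b * y for wi, b, y in zip(w, beta_x2, beta_y))
    det = a11 * a22 - a12 ** 2
    if det == 0:
        raise ValueError("exposure associations are collinear")
    theta1 = (a22 * b1 - a12 * b2) / det  # direct effect of exposure 1
    theta2 = (a11 * b2 - a12 * b1) / det  # direct effect of exposure 2
    return theta1, theta2


# Toy data (hypothetical): variants selected for exposure 1 (e.g. BMI) also
# affect exposure 2 (e.g. smoking); only exposure 2 affects the outcome.
BETA_BMI = [0.1, 0.2, 0.3, 0.4]
BETA_SMK = [0.05, 0.0, 0.10, 0.02]
BETA_OUT = [0.8 * s for s in BETA_SMK]
SE_OUT = [0.02] * 4

print(mvmr_ivw(BETA_BMI, BETA_SMK, BETA_OUT, SE_OUT))
```

      With these toy values, a univariable IVW analysis of the first exposure alone would give a non-zero estimate, while the multivariable model attributes the effect to the second exposure. This is the sense in which multivariable MR can help identify a pleiotropic pathway as the source of a univariable association.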

      References:

      (1) Heinsberg LW, Weeks DE: Post hoc power is not informative. Genet Epidemiol 2022, 46(7):390-394.

      (2) Burgess S, Butterworth A, Thompson SG: Mendelian randomization analysis with multiple genetic variants using summarized data. Genet Epidemiol 2013, 37(7):658-665.

      (3) Burgess S, Davey Smith G, Davies NM, Dudbridge F, Gill D, Glymour MM, Hartwig FP, Kutalik Z, Holmes MV, Minelli C et al: Guidelines for performing Mendelian randomization investigations: update for summer 2023. Wellcome Open Res 2019, 4:186.

      (4) Morris RW, Taylor AE, Fluharty ME, Bjorngaard JH, Asvold BO, Elvestad Gabrielsen M, Campbell A, Marioni R, Kumari M, Korhonen T et al: Heavier smoking may lead to a relative increase in waist circumference: evidence for a causal relationship from a Mendelian randomisation meta-analysis. The CARTA consortium. BMJ Open 2015, 5(8):e008808.

      (5) Taylor AE, Morris RW, Fluharty ME, Bjorngaard JH, Asvold BO, Gabrielsen ME, Campbell A, Marioni R, Kumari M, Hallfors J et al: Stratification by smoking status reveals an association of CHRNA5-A3-B4 genotype with body mass index in never smokers. PLoS Genet 2014, 10(12):e1004799.

      (6) Taylor AE, Richmond RC, Palviainen T, Loukola A, Wootton RE, Kaprio J, Relton CL, Davey Smith G, Munafo MR: The effect of body mass index on smoking behaviour and nicotine metabolism: a Mendelian randomization study. Hum Mol Genet 2019, 28(8):1322-1330.

      (7) Asvold BO, Bjorngaard JH, Carslake D, Gabrielsen ME, Skorpen F, Smith GD, Romundstad PR: Causal associations of tobacco smoking with cardiovascular risk factors: a Mendelian randomization analysis of the HUNT Study in Norway. Int J Epidemiol 2014, 43(5):1458-1470.

      (8) Carreras-Torres R, Johansson M, Haycock PC, Relton CL, Davey Smith G, Brennan P, Martin RM: Role of obesity in smoking behaviour: Mendelian randomisation study in UK Biobank. BMJ 2018, 361:k1767.

      (9) Freathy RM, Kazeem GR, Morris RW, Johnson PC, Paternoster L, Ebrahim S, Hattersley AT, Hill A, Hingorani AD, Holst C et al: Genetic variation at CHRNA5-CHRNA3-CHRNB4 interacts with smoking status to influence body mass index. Int J Epidemiol 2011, 40(6):1617-1628.

      (10) Thorgeirsson TE, Gudbjartsson DF, Sulem P, Besenbacher S, Styrkarsdottir U, Thorleifsson G, Walters GB, Consortium TAG, Oxford GSKC, consortium E et al: A common biological basis of obesity and nicotine addiction. Transl Psychiatry 2013, 3(10):e308.

      (11) Wills AG, Hopfer C: Phenotypic and genetic relationship between BMI and cigarette smoking in a sample of UK adults. Addict Behav 2019, 89:98-103.

    3. eLife Assessment

      The findings represent an important contribution to understanding whether BMI influences head and neck cancer (HNC) risk after accounting for tobacco use. Within the context of the Mendelian Randomization (MR) field, the strength of evidence appears convincing, supported by rigorous methods and a thorough exploration of multiple genetic models of adiposity using diverse MR approaches. Limitations include the absence of associations in sensitivity models designed to better account for pleiotropy, which prevents evaluation of whether incorporating an instrumental variable for tobacco use would alter the findings. Additionally, the lack of a formal power assessment for detecting associations with the instrumental variables employed limits the interpretability and reach of the results.

    1. eLife Assessment

      This study identifies novel approaches to improving transgene expression in the injured mammalian myocardium through a combination of a tissue regeneration enhancer element and engineered AAVs - specifically, a liver-detargeting capsid, AAV.cc84, and an in vivo library screen-selected AAV-IR41. The evidence is convincing, and the AAV vectors are of fundamental value to the field of cardiac gene therapy. Future research exploring how to combine the features of AAV.cc84 and AAV-IR41 could yield an even more promising vector for therapeutic use.

    2. Reviewer #1 (Public review):

      In this manuscript, Wolfson and co-authors demonstrate a combination of an injury-specific enhancer and engineered AAV that enhances transgene expression in injured myocardium. The authors characterize spatiotemporal dynamics of TREE-directed AAV expression in the injured heart using a non-invasive longitudinal monitoring system. They show that transgene expression is drastically increased 3 days post-injury, driven by 2ankrd1a. They report a liver-detargeted capsid, AAV.cc84, with decreased viral entry into the liver while maintaining TREE transgene specificity. They further identify the IR41 serotype with enhanced transgene expression in injured myocardium from AAV library screening. This is an interesting study that optimizes the potential application of TREE delivery for cardiac repair. However, several concerns were raised prior to publication:

      Major Concerns:

      (1) In Figure 1, the authors demonstrated that 2ankrd1aEN is not responsive to sham injury after AAV delivery, but Figure 3 shows a strong response to sham when AAV is delivered after injury. The authors do not provide an explanation for this observation.

      (2) In Figure 4, a higher GFP signal is observed in all areas of the heart of the IR41-treated mouse compared to AAV9. The authors should compare GFP expression between AAV9 and IR41 in uninjured hearts and provide insights into enhanced cardiac tropism to confirm that IR41 enrichment is specific to MI injury and does not also occur in sham hearts.

      (3) The authors should clarify which model, myocardial infarction (MI) or ischemia-reperfusion (IR), is being used throughout the figures, as the experimental schemes and figure legends do not match each other (MI or IR in Figure 1A, 1D, 3A, and 3E). The two models cause different types of injury. The authors should explain the difference in TREE expression between the two models.

      (4) In Figure 2, the authors use REN instead of 2ankrd1aEN to demonstrate liver-detargeting using AAV.cc84. Is there a specific reason?

    3. Reviewer #2 (Public review):

      In this manuscript by Wolfson et al., various adeno-associated viruses (AAVs) were delivered to mice to assess the cardiac-specificity, injury border-zone cardiomyocyte transduction rate, and temporal dynamics, with the goal of finding better AAVs for gene therapies targeting the heart. The authors delivered tissue regeneration enhancer elements (TREEs) controlling luciferase expression and used IVIS imaging to examine transduction in the heart and other organs. They found that luciferase expression increased in the first week after injury when using AAV9-TREE-Hsp68 promoter, waning to baseline levels by 7 weeks. However, AAV9 vectors transduced the liver, which was significantly reduced by using an AAV.cc84 liver de-targeting capsid. The authors then performed in vivo screening of AAV9 capsids and found AAV-IR41 to preferentially transduce injured myocardium when compared to AAV9. Finally, the authors combined TREEs with AAV-IR41 to show improved luciferase expression compared to AAV9-TREE at 7, 14, and 21 days after injury.

      Overall, this manuscript provides insights into TREE expression dynamics when paired with various heart-targeting capsids, which can be useful for researchers studying ischemic injury of murine hearts. While the authors have shown the success of using AAV9-TREEs in porcine hearts, it is unknown whether the expression dynamics would be similar in pigs or humans, as mentioned in the limitations.

      The following questions and concerns can be addressed to improve the manuscript:

      (1) From the IVIS data, it seems that the Hsp68 promoter might not be "normally silent in mouse tissues," specifically in the liver (Figure S1B). Are there any other promoters that can be combined with TREEs to induce cardiac-injury specific expression while minimizing liver expression? This could simplify capsid design to focus on delivery to injured areas.

      (2) Why does AAV9-TREE-Hsp68-Luc expression wane (Figure 1C and 1D), whereas AAV.cc84-TREE-Hsp68-Luc expresses stably for over 2 months (Figure 3E)? This has important implications for the goal of transience in gene delivery.

      (3) AAV-IR41 was found to transduce cardiomyocytes in the injured zone. However, this capsid also shows very strong off-target liver expression. From a capsid design perspective, is it possible to combine AAV.cc84 and AAV-IR41?

      (4) It would be helpful to see immunostaining for the various time points in Figure 5. Is it possible to use an anti-luciferase antibody (or AAV-TREE-Hsp68-eGFP) to compare the two TREE capsids?

    4. Reviewer #3 (Public review):

      Summary:

      The tissue regeneration enhancer elements (TREEs) identified in zebrafish have been shown to drive injury-activated temporal-spatial gene expression in mice and large animals. These findings increase the translational potential of findings in zebrafish to mammals. In this manuscript, the authors tested TREEs in combination with different adeno-associated viral (AAV) vectors using in vivo luciferase bioluminescent imaging that allows for longitudinal tracking. The TREE-driven luciferase delivered by a liver de-targeted AAV.cc84 decreased off-target transduction in the liver. They further screened an AAV library to identify capsid variants that display enhanced transduction for myocardium post-myocardial infarction. A new capsid variant, AAV.IR41, was found to show increased transduction at the infarct border zones.

      Strengths:

      The authors injected AAV-cargo several days after ischemia/reperfusion (I/R) injury as a clinically relevant approach. Overall, this study is significant in that it identifies new AAV vectors for potential new gene therapies in the future. The manuscript is well-written, and their data are also of high quality.

      Weaknesses:

      The authors might be using MI (myocardial infarction) and I/R injury interchangeably in their text and labels. For instance, "We systemically transduced mice at 4 days after permanent left coronary artery ligation with either AAV9 or IR41 harboring a 2ankrd1aEN-Hsp68::fLuc transgene. IVIS imaging revealed higher expression levels in animals transduced with IR41 compared to AAV9, in both sham and I/R groups (Fig. 5A)". They should keep it consistent. There is also no description for the MI model.

    1. eLife Assessment

      This valuable study concerns a model for transgenerational epigenetic inheritance, the learned avoidance by C. elegans of the PA14 pathogenic strain of Pseudomonas aeruginosa. A recent study questioned the robustness of transgenerational inheritance in this paradigm. The authors of this study have worked independently of the group that reported the original phenomenon and also independently of the group that challenged the original report. With solid data, this study independently validates findings previously reported by the Murphy group, confirming that the paradigm is reproducible elsewhere. The present study is therefore of broad interest to anyone studying genetics, epigenetics, or learned behavior.

    2. Reviewer #1 (Public review):

      Summary:

      The manuscript addresses the discordant reports of the Murphy (Moore et al., 2019; Kaletsky et al., 2020; Sengupta et al., 2024) and Hunter (Gainey et al., 2025) groups on the existence (or robustness) of transgenerational epigenetic inheritance (TEI) controlling learned avoidance of C. elegans to Pseudomonas aeruginosa. Several papers from Colleen Murphy's group describe and characterize C. elegans transgenerational inheritance of avoidance behaviour. In the hands of the Murphy group, the learned avoidance is maintained for up to four generations; however, Gainey et al. (2025) reported an inability to observe inheritance of learned avoidance beyond the F1 generation. Of note, Gainey et al. used a modified assay to measure avoidance, rather than the standard assay used by the Murphy lab. A response from the Murphy group suggested that procedural differences explained the inability of Gainey et al. (2025) to observe TEI. They found two sources of variability that could explain the discrepancy between studies: the modified avoidance assay and bacterial growth conditions (Kaletsky et al., 2025). The standard avoidance assay uses azide as a paralytic to capture worms in their initial decision, while the assay used by the Hunter group does not capture the worm's initial decision but rather uses cold to capture the location of the population at one point in time.

      In this short report, Akinosho, Alexander, and colleagues provide independent validation of transgenerational epigenetic inheritance (TEI) of learned avoidance to P. aeruginosa as described by the Murphy group by demonstrating learned avoidance in the F2 generation. These experiments used the protocol described by the Murphy group, demonstrating reproducibility and robustness.

      Strengths:

      Despite the extensive analyses carried out by the Murphy lab, doubt may remain for those who have not read the publications or for those who are unfamiliar with the data, which is why this report from the Vidal-Gadea group is so important. The observation that learned avoidance was maintained in the F2 generation provides independent confirmation of transgenerational inheritance that is consistent with reports from the Murphy group. It is of note that Akinosho, Alexander et al. used the standard avoidance assay that incorporates azide, and followed the protocol described by the Murphy lab, demonstrating that the data from the Moore and Kaletsky publications are reproducible, in contrast to what has been asserted by the Hunter group.

      Weaknesses:

      While I would have liked to see a confirmation of the daf-7::GFP data in F2, and perhaps inheritance of avoidance beyond F2, the premise of the manuscript is that they have independently verified the transgenerational inheritance of learned avoidance as described by the Murphy lab, and this bar has been met.

    3. Reviewer #2 (Public review):

      Summary:

      The manuscript "Independent validation of transgenerational inheritance of learned pathogen avoidance in C. elegans" by Akinosho and Vidal-Gadea offers evidence that learned avoidance of the pathogen PA14 can be inherited for at least two generations. In spite of initial preference for the pathogen when exposed in a 'training session', 24 hours of feeding on this pathogen evoked avoidance. The data are robust, replicated in 4 trials, and the authors note that diminished avoidance is inherited in generations F1 and F2.

      Strengths:

      These results contrast with those reported by Gainey et al., who only observed intergenerational inheritance for a single generation. Although the authors' study does not explain why Gainey et al. failed to reproduce the Murphy lab results, one possibility is that a difference in a media ingredient could be responsible.

      Weaknesses:

      The authors do not list the sources of their media ingredients, which might be important with regard to reproducibility.

    4. Reviewer #3 (Public review):

      Summary

      This short paper aims to provide an independent validation of the transgenerational inheritance of learned behaviour (avoidance) that has been published by the Murphy lab. The robustness of the phenotype has been questioned by the Hunter lab. In this paper, the authors present one figure showing that transgenerational inheritance can be replicated in their hands. Overall, it helps to shed some light on a controversial topic.

      Strengths

      The authors clearly outline their methods, particularly regarding the choice of assay, so that attempting to reproduce the results should be straightforward. It is nice to see these results repeated in an independent laboratory.

      Weaknesses

      Previous reports on this topic have provided raw data, which is helpful when assessing sample sizes. The authors provided a spreadsheet containing the choice assay results for individual assays, but not the raw data. In the methods, it is stated that F2 animals were produced from F1 animals by bleaching, but there are many more F2 assays than F1. Were multiple F2 assays performed on the offspring from one F1 plate? If so, they do not represent independent assays.

      I think that the introduction somewhat overstates their findings - do they really "address potential methodological variations that might influence results"? This makes it sound as though they test different conditions, whereas they only use one assay setup throughout.

    1. eLife Assessment

      This study presents a useful finding showing that the high susceptibility to sepsis of Kit-mutant mice is due to dysbiosis. However, the data provided is incomplete and would benefit from more rigorous approaches. With the mechanism part strengthened, this paper would be of interest to researchers on mast cell biology and mucosal immunology.

    2. Reviewer #1 (Public review):

      Summary:

      Mast cells have previously been reported to play an important role in bacterial immune defense and act protectively in sepsis. However, many of these findings were based on studies using Kit mutant mice. In this study, the authors conducted a detailed investigation using mast cell-deficient Cpa3 Cre-Master mice. As a result, the authors found that the Cpa3 Cre-Master mice exhibited responses similar to wild-type mice in terms of bacterial immune defense. This suggests that the observed phenotype is not due to mast cell-dependent bacterial immune defense, but rather is associated with dysbiosis of the gut microbiota.

      Strengths:

      Mast cells have long been reported to play an important role in the protective response against sepsis, and their function in infection defense has been demonstrated. However, Kit mutant mice have been reported to exhibit impaired peristalsis, and several mast cell-specific genetically modified mouse lines have since been developed and examined in detail. This study presents an important finding by logically demonstrating that the exacerbation of sepsis in Kit mice is due to alterations in the gut microbiota, and that the phenotype previously thought to be mast cell-dependent was, in fact, not.

      In addition, the experiments were carefully designed using mice with matched genetic backgrounds. These findings underscore the importance of microbiota composition in interpreting immune phenotypes and highlight the need for co-housing controls in mutant mouse studies.

      A major strength of this work is the robustness of the CLP data, generated over eight years by three independent researchers across two institutions with large sample sizes, lending strong support to the conclusions.

      Weaknesses:

      The study assesses only a limited subset of gut bacterial species, leaving the extent to which E. coli expansion contributes to the observed phenotype unclear. Moreover, in the cohousing experiments, there is no evidence provided to confirm successful microbiota normalization between groups. A more detailed analysis of the microbial composition would be necessary to strengthen the reliability of the findings.

      It is also important to note that Cpa3-deficient mice exhibit not only mast cell depletion but also defects in basophils and T cells. These additional immunological alterations may counterbalance one another, potentially masking phenotypic changes and complicating interpretation.

      Furthermore, it remains to be determined whether the altered gut microbiota observed in KitW/Wv mice is a consequence of impaired intestinal motility, whether a similar phenotype is observed in KitW-sh/W-sh mice, and whether comparable results occur in SCF-deficient models. Addressing these questions would provide greater clarity on the contribution of mast cells versus secondary factors in the observed phenotypes.

      Given that KitW/Wv mice exhibit impaired peristalsis, is the observed increase in E. coli a consequence of this dysfunction?

      Previous studies with BMMC reconstitution experiments have indicated that mast cells are a source of TNF - how does this align with the current findings?

    3. Reviewer #2 (Public review):

      Summary:

      This study presents a useful finding that the high susceptibility to CLP sepsis of Kit-mutant mice is not due to mast cell deficiency, but to dysbiosis.

      However, the present data are insufficient and incomplete to support the conclusion, and would benefit from more rigorous approaches. With the mechanism part strengthened, this paper would be of interest to researchers on mast cell biology and mucosal immunology.

      Recommendations:

      (1) The authors showed that E. coli increases in the cecum of Kit-mutant mice, which causes high CLP susceptibility. However, they did not provide any evidence E. coli is responsible for the high susceptibility. In the Figure 3 experiments, the authors administered the same number of cecal bacteria and did not show the number of E. coli after the administration. The authors should provide evidence showing that depletion of E. coli decreases susceptibility.

      (2) The author should provide direct evidence of dysbiosis by, for example, shotgun sequencing of cecal and fecal contents.

      (3) In case the authors find dysbiosis, they should analyze the mechanisms by which Kit mutation causes dysbiosis.

    4. Author response:

      Reviewer #1 (Public review):

      Summary:

      Mast cells have previously been reported to play an important role in bacterial immune defense and act protectively in sepsis. However, many of these findings were based on studies using Kit mutant mice. In this study, the authors conducted a detailed investigation using mast cell-deficient Cpa3 Cre-Master mice. As a result, the authors found that the Cpa3 Cre-Master mice exhibited responses similar to wild-type mice in terms of bacterial immune defense. This suggests that the observed phenotype is not due to mast cell-dependent bacterial immune defense, but rather is associated with dysbiosis of the gut microbiota.

      Strengths:

      Mast cells have long been reported to play an important role in the protective response against sepsis, and their function in infection defense has been demonstrated. However, Kit mutant mice have been reported to exhibit impaired peristalsis, and several mast cell-specific genetically modified mouse lines have since been developed and examined in detail. This study presents an important finding by logically demonstrating that the exacerbation of sepsis in Kit mice is due to alterations in the gut microbiota, and that the phenotype previously thought to be mast cell-dependent was, in fact, not.

      In addition, the experiments were carefully designed using mice with matched genetic backgrounds. These findings underscore the importance of microbiota composition in interpreting immune phenotypes and highlight the need for co-housing controls in mutant mouse studies.

      A major strength of this work is the robustness of the CLP data, generated over eight years by three independent researchers across two institutions with large sample sizes, lending strong support to the conclusions.

      Weaknesses:

      The study assesses only a limited subset of gut bacterial species, leaving the extent to which E. coli expansion contributes to the observed phenotype unclear.

      We will add new data based on 16S rRNA sequencing to the revised version.

      Moreover, in the cohousing experiments, there is no evidence provided to confirm successful microbiota normalization between groups.

      We note that co-housing is a generally accepted method for microbiota equalization or conversion (Caruso et al., Cell Rep. 2019, Ridaura et al., Science 2013, and reviewed in Moore et al., Clin. Transl. Immunol. 2016). In any case, Kit<sup>W/Wv</sup> mutants were made resistant to CLP by co-housing. Similar microbiota sequencing results between groups, while useful, would again only be correlative.

      A more detailed analysis of the microbial composition would be necessary to strengthen the reliability of the findings.

      See above

      It is also important to note that Cpa3-deficient mice exhibit not only mast cell depletion but also defects in basophils and T cells. These additional immunological alterations may counterbalance one another, potentially masking phenotypic changes and complicating interpretation.

      Regarding basophils in Cpa3<sup>Cre</sup> mice, compared to wild-type mice, basophils are reduced to about 39% of normal (Feyerabend et al., Immunity 2011). In Kit<sup>W/Wv</sup> mice, compared to wild-type mice, basophils are reduced to about 11% of normal. To our knowledge, there has been no phenotype reported in which a reduction in basophils compensates for the loss of mast cells. Given that Kit<sup>W/Wv</sup> mice have about threefold lower numbers of basophils, and are highly susceptible to sepsis, there is no evidence that a reduction in basophils is protective in mast cell-deficient mice. On the contrary, mice that were normal for mast cells but had their basophils depleted were more susceptible to sepsis (Piliponsky et al., Nat. Immunol. 2019). Hence, basophils appear to be protective, and their reduction increases susceptibility. In light of these data and considerations, there is no evidence for a reduction in basophils to counterbalance the loss of mast cells in Cpa3<sup>Cre</sup> mice.

      Regarding T cells, there is no evidence, and there are no reports, that Cpa3<sup>Cre</sup> mice have defects in T cells (Feyerabend et al., Immunity 2011, Feyerabend et al., Cell Metabolism 2016). Cpa3 is weakly and transiently expressed early in the T cell lineage (Feyerabend et al., Immunity 2009; for expression levels in T cells versus mast cells, see the figure below from the ImmGen database). In summary, in contrast to the reviewer's claim, there are no known defects in T cell development or functions in Cpa3<sup>Cre</sup> mice.

      Author response image 1.

      Generated from the ImmGen database. Shown are RNAseq gene expression levels of diverse T-cell and mast cell populations.

      Furthermore, it remains to be determined whether the altered gut microbiota observed in Kit<sup>W/Wv</sup> mice is a consequence of impaired intestinal motility, whether a similar phenotype is observed in KitW-sh/W-sh mice, and whether comparable results occur in SCF-deficient models. Addressing these questions would provide greater clarity on the contribution of mast cells versus secondary factors in the observed phenotypes.

      Mice without mast cells (Cpa3<sup>Cre</sup> mice) are as resistant to sepsis as wild-type mice. Hence, mast cells are not involved in the immunity against sepsis, and 'secondary factors' are not involved in this simple experiment (both groups of mice, wild type and Cpa3<sup>Cre</sup> mice, were on the identical genetic background). Second, Kit<sup>W/Wv</sup> mice are also as resistant to sepsis as wild-type mice when confronted with the identical intestinal slurry. Therefore, Kit<sup>W/Wv</sup> mice have no immune deficit in response to sepsis. Hence, in our view, the underlying immunological question regarding the role of mast cells in sepsis has been conclusively addressed by our data. Future studies may address the mechanism that causes dysbiosis in Kit<sup>W/Wv</sup> mice, and other Kit mutants and steel mutants could be examined as well. These questions are, however, unrelated to the role of mast cells in sepsis, or the response of Kit<sup>W/Wv</sup> mice to sepsis, and would therefore not affect the central conclusion of our manuscript ("Susceptibility of Kit-mutant mice to sepsis caused by enteral dysbiosis, not mast cell deficiency").

      Given that Kit<sup>W/Wv</sup> mice exhibit impaired peristalsis, is the observed increase in E. coli a consequence of this dysfunction?

      See above

      Previous studies with BMMC reconstitution experiments have indicated that mast cells are a source of TNF - how does this align with the current findings?

      It is possible that cultured and transplanted mast cells (BMMC) produce TNF. Given that we did not find a reduction in TNF levels in the peritoneal lavage or serum in mice without mast cells undergoing sepsis, under physiological conditions mast cell-derived TNF does not seem to have a measurable impact on total TNF levels.

      Reviewer #2 (Public review):

      Summary:

      This study presents a useful finding that the high susceptibility to CLP sepsis of Kit-mutant mice is not due to mast cell deficiency, but to dysbiosis.

      However, the present data are insufficient and incomplete to support the conclusion, and would benefit from more rigorous approaches. With the mechanism part strengthened, this paper would be of interest to researchers on mast cell biology and mucosal immunology.

      We disagree with the view that our data are insufficient and incomplete. Our results demonstrate that mice lacking mast cells (Cpa3<sup>Cre</sup> mice) are as resistant to sepsis as wild-type mice, indicating that mast cells do not play a detectable role in immunity against sepsis. Additionally, we show that Kit<sup>W/Wv</sup> mice exhibit the same resistance to sepsis as wild-type mice when confronted with the identical intestinal slurry. This finding demonstrates that Kit<sup>W/Wv</sup> mice have no immune deficit in response to sepsis. These central data are both sufficient and complete, given that our data fully address the immunological questions regarding the role of mast cells in sepsis. Our study aimed to investigate the role of mast cells in sepsis, not to examine the mechanisms of dysbiosis or associated pathological phenotypes in Kit mutant controls.

      Recommendations:

      (1) The authors showed that E. coli increases in the cecum of Kit-mutant mice, which causes high CLP susceptibility. However, they did not provide any evidence E. coli is responsible for the high susceptibility.

      We showed that E. coli CFUs were increased in the cecum of Kit-mutant mice, but we did not state that this causes CLP susceptibility. We wrote: 'Hence, Kit<sup>W/Wv</sup> microbiota contains high levels of E. coli, which may underlie the observed pathogenicity'. We demonstrated that intestinal slurry from Kit<sup>W/Wv</sup> mice is more pathogenic compared to intestinal slurry from wild-type mice. However, we did not search for, or identify, the bacterial species that causes this increased pathogenicity because we were addressing the role of mast cells in sepsis.

      In the Figure 3 experiments, the authors administered the same number of cecal bacteria and did not show the number of E. coli after the administration.

      The samples were split and one aliquot was analysed by microbiology and the other aliquot was injected intraperitoneally. Fig. 3d shows the colony forming units (for Lactobacilli and E. coli) from aliquots of cecal slurry used in the intraperitoneal injection experiments shown in Fig. 3a-c. Hence, our data show the colony forming units that were injected into the mice. It is unclear to us why this is not the key information rather than 'the number of E. coli after the administration'.

      The authors should provide evidence showing that depletion of E. coli decreases susceptibility.

      See response to point 1 above.

      (2) The author should provide direct evidence of dysbiosis by, for example, shotgun sequencing of cecal and fecal contents.

      The large increase in E. coli counts in Kit<sup>W/Wv</sup> mice is evidence of dysbiosis. To obtain data beyond classical microbiology, we also performed 16S rRNA sequencing, which will be included in the revision.

      (3) In case the authors find dysbiosis, they should analyze the mechanisms by which Kit mutation causes dysbiosis.

      The mechanism that causes dysbiosis in Kit<sup>W/Wv</sup> mice (which emerged from our work) belongs to other research areas that address the role of Kit in intestinal pathophysiology. These questions are unrelated to the role of mast cells in sepsis, or the response of Kit<sup>W/Wv</sup> mice to sepsis. Regardless of the results of such experiments, the conclusion ("Susceptibility of Kit-mutant mice to sepsis caused by enteral dysbiosis, not mast cell deficiency") remains unaffected. In brief, further explorations of pathological phenotypes of a control mutant will not add to the core message. Along these lines, the review process and the revision shall center on making the core of a paper as conclusive as possible, and not widen a paper by requests 'tangential to the main conclusion' (Kaelin Jr. Nature 2017).

      References

      Caruso, R., Ono, M., Bunker, M. E., Núñez, G. & Inohara, N. Dynamic and Asymmetric Changes of the Microbial Communities after Cohousing in Laboratory Mice. Cell Rep. 27, 3401-3412.e3 (2019).

      Feyerabend, T. B. et al. Deletion of Notch1 Converts Pro-T Cells to Dendritic Cells and Promotes Thymic B Cells by Cell-Extrinsic and Cell-Intrinsic Mechanisms. Immunity 30, 67–79 (2009).

      Feyerabend, T. B. et al. Cre-Mediated Cell Ablation Contests Mast Cell Contribution in Models of Antibody- and T Cell-Mediated Autoimmunity. Immunity 35, 832–844 (2011).

      Feyerabend, T. B., Gutierrez, D. A. & Rodewald, H.-R. Of Mouse Models of Mast Cell Deficiency and Metabolic Syndrome. Cell Metab. 24, 1–2 (2016).

      Kaelin Jr, W. G. Publish houses of brick, not mansions of straw. Nature 545, 387–387 (2017).

      Moore, R. J. & Stanley, D. Experimental design considerations in microbiota/inflammation studies. Clin. Transl. Immunol. 5, e92 (2016).

      Piliponsky, A. M. et al. Basophil-derived tumor necrosis factor can enhance survival in a sepsis model in mice. Nat. Immunol. 20, 129–140 (2019).

      Ridaura, V. K. et al. Gut Microbiota from Twins Discordant for Obesity Modulate Metabolism in Mice. Science 341, 1241214 (2013).

    1. eLife Assessment

      This study provides important insights into the role of polyUbiquitination in neurodegenerative diseases, elucidating how pUb promotes neurodegeneration by affecting proteasomal function. The findings not only offer a new perspective on the pathophysiology of neurodegenerative diseases but also provide potential targets for developing new therapeutic strategies. The results provide solid evidence to support the conclusions.

    2. Reviewer #1 (Public review):

      Summary:

      The manuscript discusses the role of ubiquitin phosphorylated by PINK1 kinase (pUb) in neurodegenerative diseases. It reveals that elevated levels of pUb are observed in aged human brains and those affected by Parkinson's disease (PD), as well as in Alzheimer's disease (AD), aging, and ischemic injury. The study shows that increased pUb impairs proteasomal degradation, leading to protein aggregation and neurodegeneration. The authors also demonstrate that PINK1 knockout can mitigate protein aggregation in aging and ischemic mouse brains, as well as in cells treated with a proteasome inhibitor. While this study provides some interesting data, several important points should be addressed before further consideration.

      Strengths:

      (1) Reveals a novel pathological mechanism of neurodegeneration mediated by pUb, providing a new perspective on understanding neurodegenerative diseases.

      (2) The study covers not only a single disease model but also various neurodegenerative diseases such as Alzheimer's disease, aging, and ischemic injury, enhancing the breadth and applicability of the research findings.

      Comments on revisions:

      This study, through a systematic experimental design, reveals the crucial role of pUb in forming a positive feedback loop by inhibiting proteasome activity in neurodegenerative diseases. The data are comprehensive and highly innovative. However, some of the results are not entirely convincing, particularly the staining results in Figure 1.

      In Figure 1A, the density of DAPI staining differs significantly between the control patient and the AD patient, making it difficult to conclusively demonstrate a clear increase in PINK1 in AD patients. Quantitative analysis is needed. In Fig 1C, the PINK1 staining in the mouse brain appears to resemble non-specific staining.

    3. Author response:

      The following is the authors’ response to the previous reviews

      In response to Reviewer #1, we have replaced the original images in Figure 1A with new immunofluorescence data showing matched DAPI staining density between control and AD patient samples. We also have updated the PINK1 staining images of mouse brain sections in Figure 1C to eliminate potential non-specific signals. These revisions provide clearer evidence supporting our conclusions about PINK1/pUb’s role in neurodegeneration.

    1. eLife Assessment

      This important study, which has been improved further upon revision, reveals a critical role of the transcription factor NR2F2 in mouse fetal Leydig cell (FLC) differentiation. With elegantly carried out experiments, the authors provide compelling evidence that NR2F2 helps to initiate the differentiation of certain interstitial cells into FLC until these cells mature into functional secretory cells that produce androgen and insulin-like peptide 3 (INSL3). The particular importance of the work comes from the fact that NR2F2 affects FLCs without altering paracrine signals known to be involved in FLC differentiation. The work will be of interest to colleagues studying reproductive development in mammals including humans or the biological functions of the nuclear receptor family.

    2. Reviewer #1 (Public review):

      Summary:

      In this beautiful paper the authors examined the role and function of NR2F2 in testis development and more specifically on fetal Leydig cells development. It is well known by now that FLC are developed from an interstitial steroidogenic progenitor at around E12.5 and are crucial for testosterone and INSL3 production during embryonic development, which in turn shapes the internal and external genitalia of the male. Indeed, lack of testosterone or INSL3 are known to cause DSD as well as undescended testis, also termed as cryptorchidism.

      The authors first characterized the expression pattern of the NR2F2 protein during testis development and then used two cKO systems of NR2F2, namely the Wt1-creERT2 and the Nr5a1-cre to explore the phenotype of loss of NR2F2. They found in both cases that mice are presenting with undescended testis and major reduction in FLC numbers. They show that NR2F2 has no effect on the amount and expression of the progenitor cells but in its absence, there are fewer FLC and they are immature.

      The effect of NR2F2 is cell autonomous and does not seem to affect other signalling pathways implicated in Leydig cell development, such as the DHH, PDGFRA and NOTCH pathways.

      Overall, this paper is excellent, very well written, fluent and clear. The data is well presented, and all the controls and statistics are in place. I think this paper will be of great interest to the field and paves the way for several interesting follow up studies as stated in the discussion.

      Comments on revised version:

      The authors have fully addressed my concerns and the manuscript is looking excellent.

    3. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Summary

      In this beautiful paper the authors examined the role and function of NR2F2 in testis development and more specifically on fetal Leydig cells development. It is well known by now that FLC are developed from an interstitial steroidogenic progenitor at around E12.5 and are crucial for testosterone and INSL3 production during embryonic development, which in turn shapes the internal and external genitalia of the male. Indeed, lack of testosterone or INSL3 are known to cause DSD as well as undescended testis, also termed as cryptorchidism. The authors first characterized the expression pattern of the NR2F2 protein during testis development and then used two cKO systems of NR2F2, namely the Wt1-creERT2 and the Nr5a1-cre to explore the phenotype of loss of NR2F2. They found in both cases that mice are presenting with undescended testis and major reduction in FLC numbers. They show that NR2F2 has no effect on the amount and expression of the progenitor cells but in its absence, there are fewer FLC and they are immature.

      The effect of NR2F2 is cell autonomous and does not seem to affect other signalling pathways implicated in Leydig cell development, such as the DHH, PDGFRA and NOTCH pathways.

      Overall, this paper is excellent, very well written, fluent and clear. The data is well presented, and all the controls and statistics are in place. I think this paper will be of great interest to the field and paves the way for several interesting follow up studies as stated in the discussion.

      Reviewer #2 (Public review):

      The major conclusion of the manuscript is expressed in the title: "NR2F2 is required in the embryonic testis for Fetal Leydig Cell development" and also at the end of the introduction and all along the result part. All the authors' assertions are supported by very clear and statistically validated results from ISH, IHC, precise cell counting and gene expression levels by qPCR. The authors used two different conditional Nr2f2 gene ablation systems that demonstrate the same effects at the FLC level. They also showed that the haplo-insufficiency of Wt1 in the first system (knock-in Wt1-cre-ERT2) aggravated the situation in FLC differentiation by disturbing the differentiation of Sertoli cells and their secretion of pro-FLC factors, which had a confounding effect and encouraged them to use the second system. This demonstrates the great rigor with which the authors interpreted the results. In conclusion, all authors' claims and conclusions are justified by their high-quality results.

      Recommendations for the authors:

      We thank the reviewers for their comments which have improved and strengthened our manuscript. Please see our responses to specific comments below in blue.

      Reviewer #1 (Recommendations for the authors):

      I have several small comments:

      (1) There has recently been a preprint from the Yao lab about the role of NR2F2 in steroidogenic cells (https://www.biorxiv.org/content/10.1101/2024.09.16.613312v1). They performed cKO of NR2F2 using the Wt1creERT2 and found similar results. You should present and discuss this paper in light of your results.

      Estermann et al. report a very similar phenotype of FLC hypoplasia in an independent mouse model of Nr2f2 conditional mutation. We have now referred to this article in the discussion of our manuscript as suggested.

      (2) In the introduction I think it is important to mention that the steroidogenic progenitors are derived from Wnt5a positive cells (https://pubmed.ncbi.nlm.nih.gov/35705036/).

      We have mentioned this point in the introduction as suggested.

      (3) In both models you show a decrease in the number of FLC (60% or 40%) and yet they both present with undescended testis. It is important to discuss the fact that there is no need for a complete ablation of testosterone and INSL3 in order to get cryptorchidism.

      We have mentioned this point in the discussion as suggested.

      The fact that you get only partial reduction in FLC is likely due to redundancy with additional factors, possibly the ARX like you stated in the discussion and it will be interesting to explore that in the future but is beyond the scope of the current paper.

      We agree with the reviewer, this question could be addressed by analyzing Arx,Nr2f2 double mutants.

      (4) On page 8, line 11, you mention data not shown; I am not sure if this is allowed in the journal.

      The data is now shown in Figure S5A as suggested.

      (5) In Figure 2- it will be good if you add a schematic model of the mouse strains used as well as the experimental and control mice next to the Tam scheme. Similar scheme should be in figure 3 for Nr5a1-cre.

      We have modified Figures 2 and 3 as suggested.

      (6) There is a clear and pronounced effect on testis cord number and size. It will be good if you could quantify testis cord numbers/diameter in the mutants even if you do not follow in detail the effect on Sertoli cells.

      We have quantified testis cords numbers and area in E14.5 Control and Wt1<sup>CreERT2/+</sup>; Nr2f2<sup>flox/flox</sup> testes. This data is now shown in Figure S2M.

      (7) It will be good to present the undescended testis in the Wt1-cre model in figure 2 and not in the supp figure

      The data is now shown in Figure 2H-I as suggested.

      (8) Please add labelling of the testis, kidney, bladder, vas deferens in figure 3 N+O and in the Wt1-cre model

      We have added the labels in Figures 2 and 3 as suggested.

      (9) In figure 5 which present both models- it will be good to use the scheme I suggested before to highlight which results refer to which ko model.

      We have modified Figure 5 as suggested.

      Reviewer #2 (Recommendations for the authors):  

      The work presented in this manuscript gave me food for thought. I have always been intrigued by the fact that of the large number of interstitial cells in the testis, a minority differentiate into mature androgen-producing Leydig cells. In other words, how is the number of functional steroidogenic cells defined from a large pool of progenitor cells (ARX and NR2F2 positive ones)? This may have a link with the levels of androgens produced (a kind of feedback control) or the effectiveness of these androgens on the target tissues (i.e.: as spermatogenesis efficiency in adults). In addition, there must be specific signals (probably linked to gonadotropins) that induce the recruitment of Leydig cells from the progenitor pool. Perhaps the genetic models generated in this study could help to address these questions. I leave it to the authors to judge.

      We agree with the reviewer. How NR2F2 (and other factors) integrate extrinsic cues to regulate the recruitment of a subset of interstitial steroidogenic progenitors along the Leydig cell differentiation pathway is a fascinating question beyond the scope of this work.

      In addition to this reflection, I propose a few minor modifications likely to improve the quality of the manuscript:

      (1) Page 3, line 3: I suggest replacing "growth" with "differentiation"

      We have modified the text as suggested.

      (2) Page 3, line 4: the "scrotum" is missing in the parenthesis. Please add it before "and penis"

      We have modified the text as suggested.

      (3) Page 5, lines 21-24: kidney hypoplasia is also evident on Fig S2H (stated in the figure legend). It could be also mentioned in this sentence and it implies "...that NR2F2 function is required for testicular and kidney development."

      We have modified the text as suggested.

      (4) Page 5, lines 28-30. In addition to the reduction in the number of HSD3B-positive cells, HSD3B staining seems clearly fainter in mutant FLC (Fig 2M) compared to adrenal cells on the same section or FLC in control gonads. This fits well with other results on the level of steroidogenic enzymes (Fig 2O) and those presented thereafter (Fig S4 I-J and Fig 5). Perhaps the authors could mention this fact.

      We have modified the text as suggested in the results section “NR2F2 is required for FLC maturation” (Page 8).

      (5) Page 5, lines 31-34: testicular descent is highly sensitive to INSL3 in the mouse (in contrast with other species where androgens seem to be more critical). I was wondering if you can check a better phenotypic marker for the absence (or reduction) of androgens like the differentiation of epididymides by HE staining or the anogenital distance at birth.

      We have measured the anogenital distance at P0 and P1 as suggested and have included the corresponding graph in Fig. S3P.

      (6) Page 8, lines 21-22: "HSD3B positive FLC were smaller and more elongated". It is clear on Fig 5F but not evident on Fig 5D. Could the authors propose another image?

      We have modified Figure 5 as suggested and provide now another example of HSD3B positive FLCs in a Nr5a1Cre; Nr2f2<sup>flox/flox</sup> mutant gonad (Fig. 5D) and the corresponding control littermate (Fig. 5C).

      (7) Page 14, line 12: "(arrow in I)" should be "(arrow in H)"

      We have modified the text as suggested. Please note that ACTA2 expression is now shown in Figure S2 G-H.

      (8) Page 15, line 6: "Arrows indicate NR5A1 positive FLC". There is no arrow on Fig4 C,D; but a kind of scale bar on the enlargement shown in C.

      We have modified Figure 4 as suggested.

    4. Reviewer #2 (Public review):

      The major conclusion of the manuscript is expressed in the title: "NR2F2 is required in the embryonic testis for Fetal Leydig Cell development" and also at the end of the introduction and all along the result part. All the authors' assertions are supported by very clear and statistically validated results from ISH, IHC, precise cell counting and gene expression levels by qPCR. The authors used two different conditional Nr2f2 gene ablation systems that demonstrate the same effects at the FLC level. They also showed that the haplo-insufficiency of Wt1 in the first system (knock-in Wt1-cre-ERT2) aggravated the situation in FLC differentiation by disturbing the differentiation of Sertoli cells and their secretion of pro-FLC factors, which had a confounding effect and encouraged them to use the second system. This demonstrates the great rigor with which the authors interpreted the results. In conclusion, all authors' claims and conclusions are justified by their high-quality results.

      Comments on revised version:

      In their revised version, the authors have taken full account of all my suggestions, and I congratulate them on this. I have no further comments to make on this new version.

    1. Reviewer #2 (Public review):

      Summary:

      In this manuscript, Azlan et al. identified a novel maternal factor called Sakura that is required for proper oogenesis in Drosophila. They showed that Sakura is specifically expressed in the female germline cells. Consistent with its expression pattern, Sakura functioned autonomously in germline cells to ensure proper oogenesis. In sakura KO flies, germline cells were lost during early oogenesis and often became tumorous before degenerating by apoptosis. In these tumorous germ cells, piRNA production was defective and many transposons were derepressed. Interestingly, Smad signaling, a critical signaling pathway for the GSC maintenance, was abolished in sakura KO germline stem cells, resulting in ectopic expression of Bam in whole germline cells in the tumorous germline. A recent study reported that Bam acts together with the deubiquitinase Otu to stabilize Cyc A. In the absence of sakura, Cyc A was upregulated in tumorous germline cells in the germarium. Furthermore, the authors showed that Sakura co-immunoprecipitated Otu in ovarian extracts. A series of in vitro assays suggested that the Otu (1-339 aa) and Sakura (1-49 aa) are sufficient for their direct interaction. Finally, the authors demonstrated that the loss of otu phenocopies the loss of sakura, supporting their idea that Sakura plays a role in germ cell maintenance and differentiation through interaction with Otu during oogenesis.

      Strengths:

      To my knowledge, this is the first characterization of the role of CG14545 genes. Each experiment seems to be well-designed and adequately controlled.

      Weaknesses:

      However, the conclusions from each experiment are somewhat separate, and the functional relationships between Sakura's functions are not well established. In other words, although the loss of Sakura in the germline causes pleiotropic effects, the cause-and-effect relationships between the individual defects remain unclear.

      Comments on latest version:

      The authors have attempted to address my initial concerns with additional experiments and refutations. Unfortunately, my concerns, especially my specific comments 1-3, remain unaddressed. The present manuscript is descriptive and fails to describe the molecular mechanism by which Sakura exerts its function in the germline. Nevertheless, this reviewer acknowledges that the observed defects in sakura mutant ovaries and the possible physiological significance of the Sakura-Otu interaction are worth sharing with the research community, as they may lay the groundwork for future research in functional analysis.

    2. eLife Assessment

      This valuable study reports the first characterization of the CG14545 gene in Drosophila melanogaster, which the authors name "Sakura." Acting during germline stem cell fate and differentiation, Sakura is required for both oogenesis and female fertility, although some mechanistic details require further investigation. This solid study presents a wide-ranging and well-controlled characterization of Sakura, and accordingly the findings and associated reagents described will be of use to scientists interested in oogenesis and early development.

    3. Reviewer #1 (Public review):

      Summary:

      In this manuscript, Azlan et al. identified a novel maternal factor called Sakura that is required for proper oogenesis in Drosophila. They showed that Sakura is specifically expressed in the female germline cells. Consistent with its expression pattern, Sakura functioned autonomously in germline cells to ensure proper oogenesis. In sakura KO flies, germline cells were lost during early oogenesis and often became tumorous before degenerating by apoptosis. In these tumorous germ cells, piRNA production was defective and many transposons were derepressed. Interestingly, Smad signaling, a critical signaling pathway for the GSC maintenance, was abolished in sakura KO germline stem cells, resulting in ectopic expression of Bam in whole germline cells in the tumorous germline. A recent study reported that Bam acts together with the deubiquitinase Otu to stabilize Cyc A. In the absence of sakura, Cyc A was upregulated in tumorous germline cells in the germarium. Furthermore, the authors showed that Sakura co-immunoprecipitated Otu in ovarian extracts. A series of in vitro assays suggested that the Otu (1-339 aa) and Sakura (1-49 aa) are sufficient for their direct interaction. Finally, the authors demonstrated that the loss of otu phenocopies the loss of sakura, supporting their idea that Sakura plays a role in germ cell maintenance and differentiation through interaction with Otu during oogenesis.

      Strengths:

      To my knowledge, this is the first characterization of the role of CG14545 genes. Each experiment seems to be well-designed and adequately controlled.

      Weaknesses:

      However, the conclusions from each experiment are somewhat separate, and the functional relationships between Sakura's functions are not well established. In other words, although the loss of Sakura in the germline causes pleiotropic effects, the cause-and-effect relationships between the individual defects remain unclear.

      Comments on latest version:

      The authors have attempted to address my initial concerns with additional experiments and refutations. Unfortunately, my concerns, especially my specific comments 1-3, remain unaddressed. The present manuscript is descriptive and fails to describe the molecular mechanism by which Sakura exerts its function in the germline. Nevertheless, this reviewer acknowledges that the observed defects in sakura mutant ovaries and the possible physiological significance of the Sakura-Otu interaction are worth sharing with the research community, as they may lay the groundwork for future research in functional analysis.

    4. Reviewer #3 (Public review):

      In this very thorough study, the authors characterize the function of a novel Drosophila gene, which they name Sakura. They start with the observation that sakura expression is predicted to be highly enriched in the ovary and they generate an anti-sakura antibody, a line with a GFP-tagged sakura transgene, and a sakura null allele to investigate sakura localization and function directly. They confirm the prediction that it is primarily expressed in the ovary and, specifically, that it is expressed in germ cells, and find that about 2/3 of the mutants lack germ cells completely and the remaining have tumorous ovaries. Further investigation reveals that Sakura is required for piRNA-mediated repression of transposons in germ cells. They also find evidence that sakura is important for germ cell specification during development and germline stem cell maintenance during adulthood. However, despite the role of sakura in maintaining germline stem cells, they find that sakura mutant germ cells also fail to differentiate properly such that mutant germline stem cell clones have an increased number of "GSC-like" cells. They attribute this phenotype to a failure in the repression of Bam by dpp signaling. Lastly, they demonstrate that sakura physically interacts with otu and that sakura and otu mutants have similar germ cell phenotypes. Overall, this study helps to advance the field by providing a characterization of a novel gene that is required for oogenesis. The data are generally high-quality and the new lines and reagents they generated will be useful for the field.

      Comments on latest version:

      With these revisions, the authors have addressed my main concerns.

    5. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      In this manuscript, Azlan et al. identified a novel maternal factor called Sakura that is required for proper oogenesis in Drosophila. They showed that Sakura is specifically expressed in the female germline cells. Consistent with its expression pattern, Sakura functioned autonomously in germline cells to ensure proper oogenesis. In Sakura KO flies, germline cells were lost during early oogenesis and often became tumorous before degenerating by apoptosis. In these tumorous germ cells, piRNA production was defective and many transposons were derepressed. Interestingly, Smad signaling, a critical signaling pathway for GSC maintenance, was abolished in sakura KO germline stem cells, resulting in ectopic expression of Bam in whole germline cells in the tumorous germline. A recent study reported that Bam acts together with the deubiquitinase Otu to stabilize Cyc A. In the absence of sakura, Cyc A was upregulated in tumorous germline cells in the germarium. Furthermore, the authors showed that Sakura co-immunoprecipitated Otu in ovarian extracts. A series of in vitro assays suggested that the Otu (1-339 aa) and Sakura (1-49 aa) are sufficient for their direct interaction. Finally, the authors demonstrated that the loss of otu phenocopies the loss of sakura, supporting their idea that Sakura plays a role in germ cell maintenance and differentiation through interaction with Otu during oogenesis.

      Strengths:

      To my knowledge, this is the first characterization of the role of CG14545 genes. Each experiment seems to be well-designed and adequately controlled.

      Weaknesses:

      However, the conclusions from each experiment are somewhat separate, and the functional relationships between Sakura's functions are not well established. In other words, although the loss of Sakura in the germline causes pleiotropic effects, the cause-and-effect relationships between the individual defects remain unclear.

      Reviewer #2 (Public review):

      In this study, the authors identified CG14545 (and named it Sakura), as a key gene essential for Drosophila oogenesis. Genetic analyses revealed that Sakura is vital for both oogenesis progression and ultimate female fertility, playing a central role in the renewal and differentiation of germ stem cells (GSC).

      The absence of Sakura disrupts the Dpp/BMP signaling pathway, resulting in abnormal bam gene expression, which impairs GSC differentiation and leads to GSC loss. Additionally, Sakura is critical for maintaining normal levels of piRNAs. Also, the authors convincingly demonstrate that Sakura physically interacts with Otu, identifying the specific domains necessary for this interaction, suggesting a cooperative role in germline regulation. Importantly, the loss of otu produces similar defects to those observed in Sakura mutants, highlighting their functional collaboration.

      The authors provide compelling evidence that Sakura is a critical regulator of germ cell fate, maintenance, and differentiation in Drosophila. This regulatory role is mediated through the modulation of pMad and Bam expression. However, the phenotypes observed in the germarium appear to stem from reduced pMad levels, which subsequently trigger premature and ectopic expression of Bam. This aberrant Bam expression could lead to increased CycA levels and altered transcriptional regulation, impacting piRNA expression. Given Sakura's role in pMad expression, it would be insightful to investigate whether overexpression of Mad or pMad could mitigate these phenotypic defects (UAS-Mad line is available at Bloomington Drosophila Stock Center).

      As suggested by reviewer 1, we tested whether overexpression of Mad could rescue or mitigate the phenotypic defects caused by loss of sakura, by using nos-Gal4-VP16 > UASp-Mad-GFP in the background of sakura<sup>null</sup>. As shown in Fig S11, we did not observe any mitigation of defects.

      Then, we also tested whether expressing a constitutively active form of Tkv, by using UAS-Dcr2, NGT-Gal4 > UASp-tkv.Q235D in the background of sakura<sup>RNAi</sup>, could mitigate the defects. As shown in Fig S12, we did not observe any mitigation of defects by this approach either.

      A major concern is the overstated role of Sakura in regulating Orb. The data does not reveal mislocalized Orb; rather, a mislocalized oocyte and cytoskeletal breakdown, which may be secondary consequences of defects in oocyte polarity and structure rather than direct misregulation of Orb. The conclusion that Sakura is necessary for Orb localization is not supported by the data. Orb still localizes to the oocyte until about stage 6. In the later stage, it looks like the cytoskeleton is broken down and the oocyte is not positioned properly, however, there is still Orb localization in the ~8-stage egg chamber in the oocyte. This phenotype points towards a defect in the transport of Orb and possibly all other factors that need to localize to the oocyte due to cytoskeletal breakdown, not Orb regulation directly. While this result is very interesting it needs further evaluation on the underlying mechanism. For example, the decrease in E-cadherin levels leads to a similar phenotype and Bam is known to regulate E-cadherin expression. Is Bam expressed in these later knockdowns?

      We examined Bam and DE-Cadherin expression in later RNAi knockdowns driven by ToskGal4. As shown in Fig S9, Bam was not expressed in these later knockdowns compared with controls. DE-Cadherin staining suggested a disorganized structure in late-stage egg chambers.

      We agree that we overstated a role of Sakura in regulating Orb in the initial manuscript. We changed the text to avoid overstating.

      The manuscript would benefit from a more balanced interpretation of the data concerning Sakura's role in Orb regulation. Furthermore, a more expanded discussion on Sakura's potential role in pMad regulation is needed. For example, since Otu and Bam are involved in translational regulation, do the authors think that Mad is not translated and therefore it is the reason for less pMad? Currently the discussion presents just a summary of the results and not an extension of possible interpretation discussed in context of present literature.

      We changed the text to avoid overstating a role of Sakura in regulating Orb localization.

      Based on our newly added results showing that transgenic overexpression of Mad could not rescue or mitigate the phenotypic defects of sakura<sup>null</sup> mutant (Fig S11), we do not think the reason for less pMad is less translation of Mad.

      Reviewer #3 (Public review):

      In this very thorough study, the authors characterize the function of a novel Drosophila gene, which they name Sakura. They start with the observation that sakura expression is predicted to be highly enriched in the ovary and they generate an anti-sakura antibody, a line with a GFP-tagged sakura transgene, and a sakura null allele to investigate sakura localization and function directly. They confirm the prediction that it is primarily expressed in the ovary and, specifically, that it is expressed in germ cells, and find that about 2/3 of the mutants lack germ cells completely and the remaining have tumorous ovaries. Further investigation reveals that Sakura is required for piRNA-mediated repression of transposons in germ cells. They also find evidence that sakura is important for germ cell specification during development and germline stem cell maintenance during adulthood. However, despite the role of sakura in maintaining germline stem cells, they find that sakura mutant germ cells also fail to differentiate properly such that mutant germline stem cell clones have an increased number of "GSC-like" cells. They attribute this phenotype to a failure in the repression of Bam by dpp signaling. Lastly, they demonstrate that sakura physically interacts with otu and that sakura and otu mutants have similar germ cell phenotypes. Overall, this study helps to advance the field by providing a characterization of a novel gene that is required for oogenesis. The data are generally high-quality and the new lines and reagents they generated will be useful for the field. However, there are some weaknesses and I would recommend that they address the comments in the Recommendations for the authors section below.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      General Comments:

      (1) The gene nomenclature: As mentioned in the text, Sakura means cherry blossom and is one of the national flowers of Japan. I am not sure whether the phenotype of the CG14545 mutant is related to Sakura or not. I would like to suggest the authors reconsider the naming.

      The striking phenotype of the sakura mutant is tumorous and germless ovarioles. The tumorous phenotype, exhibiting many round fusomes in the germarium visualized by anti-Hts staining, looks to us like cherry blossoms in bloom. Also, the germless phenotype reminds us of falling cherry blossoms, especially considering that the ratio of tumorous phenotype decreases and that of germless decreases over fly age. Furthermore, “Sakura” symbolizes birth and renewal in Japanese culture (the last author of this manuscript is Japanese). Our findings indicated that the gene sakura is involved in regulation of renewal and differentiation of GSCs (which leads to birth). These are the reasons for the naming, which we would like to keep.

      (2) In many of the microscopic photographs in the figures, especially for the merged confocal images, the resolution looks low, and the images appear blurred, making it difficult to judge the authors' claims. Also, the AlphaFold structure in Figure 10A requires higher contrast images. The magnification of the images is often inadequate (e.g. Figures 3A, 3B, 5E, 7A, etc). The authors should take high-magnification images separately for the germarium and several different stages of the egg chambers and lay out the figures.

      We are very sorry for the low-resolution images. This was caused when the original PDF file with high-resolution images was compressed in order to meet the small file size limit in the eLife submission portal. In the revised submission, we used high-resolution images.

      Specific Comments

      (1) How Sakura can cooperate with Otu remains unanswered. Sakura does not regulate deubiquitinase activity in vitro. Both sakura and otu appear to be involved in the Dpp-Smad signaling pathway and in the spatial control of Bam expression in the germarium, whereas Otu has been reported to act in concert with Bam to deubiquitinate and stabilize Cyc A for proper cystoblast differentiation. Therefore, it is plausible that the stabilization of Cyc A in the Sakura mutant is an indirect consequence of Bam misexpression and independent of the Sakura-Otu interaction. The authors may need to provide much deeper insight into the mechanism by which Sakura plays roles in these seemingly separable steps to orchestrate germ cell maintenance and differentiation during early oogenesis.

      Yes, it is possible that the stabilization of CycA in the sakura mutant is an indirect consequence of Bam misexpression and independent of the Sakura-Otu interaction. To test the significance and role of the Sakura-Otu interaction, we attempted to identify Sakura point mutants that lose the interaction with Otu. Had such point mutants been obtained, we planned to test whether their transgene expression could rescue the phenotypes of the sakura mutant as the wild-type transgene did. However, after designing more than 30 point mutants and testing their interaction with Otu, we have not yet obtained such a mutant version of Sakura. We will continue these efforts, but this is beyond the scope of the current study. We hope to address this important point in future studies.

      (2) Figure 3A and Figure 4: The authors show that piRNA production is abolished in Sakura KO ovaries. It is known that piRNA amplification (the ping-pong cycle) occurs in the Vasa-positive perinuclear nuage in nurse cells. Is the nuage normally formed in the absence of Sakura? The authors provide high-magnification images in the germarium expressing Vas-GFP. How does Sakura, and possibly Otu, contribute to piRNA production? Are the defects a direct or indirect consequence of the loss of Sakura?

      We provided higher-magnification images of germaria expressing Vasa-EGFP in the sakura mutant background (Fig 3A and 3B). Nuage formation does not seem to be dysregulated in the sakura mutant. Currently, we do not know whether the piRNA defects are a direct or indirect consequence of the loss of Sakura. This question cannot be answered easily. We hope to address this in future studies.

      (3) Figure 7 and Figure 12: The authors showed that Dpp-Smad signaling was abolished in Sakura KO germline cells. The same defects were also observed in otu mutant ovaries (Figure 12B). How does the Sakura-Otu axis contribute to the Dpp-Smad pathway in the germline?

      As we mentioned in the response to comment (1), we attempted to test the significance and role of the Sakura-Otu interaction, including in the Dpp-Smad pathway in the germline, but we have not yet been able to obtain loss-of-interaction mutant(s) of Sakura. We hope to address this in future studies.

      (4) Figure 9 and Fig 10: The authors raised antibodies against both Sakura and Otu, but their specificities were not provided. For Western blot data, the authors should provide whole gel images as source data files. Also, the authors argue that the Otu band they observed corresponds to the 98-kDa isoform (lines 302-304). The molecular weight on the Western blot alone would be insufficient to support this argument.

      When we submitted the initial manuscript, we also submitted original, uncropped, and unmodified whole Western blot images for all gel images to the eLife journal, as requested. We did the same for this revised submission. I believe eLife makes all those files available for downloading to readers.

      In the newly added Fig S13B, we used very young (2-5 hours) ovaries and 3-7 days ovaries. The 2-5 hours ovaries contain mostly pre-differentiated germ cells. Older ovaries (3-7 days in our case) contain all 14 stages of oogenesis, and later stages predominate in whole-ovary lysates.

      As reported in previous literature (Sass et al. 1995), we detected a higher abundance of the 104 kDa Otu isoform than the 98 kDa isoform in 2-5 hours ovaries, and predominantly the 98 kDa isoform in 3-7 days ovaries (Fig S13B). These results confirmed that the major Otu isoform we detected in our Western blots, all of which used old ovaries except for the 2-5 hours ovaries in Fig S13B, is the 98 kDa isoform.

      (5) Otu has been reported to regulate ovo and Sxl in the female germline. Is Sakura involved in their regulation?

      We examined sxl alternative splicing pattern in sakura mutant ovaries. As shown in Fig S6, we detected the male-specific isoform of sxl RNA and a reduced level of the female-specific sxl isoform in sakura mutant ovaries. Thus Sakura seems to be involved in sxl splicing in the female germline, while further studies will be needed to understand whether Sakura has a direct or indirect role here.

      (6) Lines 443-447: The GSC loss phenotype in piwi mutant ovaries is thought to occur in a somatic cell-autonomous manner: both piwi-mutant germline clones and germline-specific piwi knockdown do not show the GSC-loss phenotype. In contrast, the authors provide compelling evidence that Sakura functions in the germline. Therefore, the Piwi-mediated GSC maintenance pathway is likely to be independent of the Sakura-Otu axis.

      We changed the text accordingly.

      Reviewer #2 (Recommendations for the authors):

      Overall, this is a cleanly written manuscript, with some sentences/sections that are confusing the way they are constructed (i.e. Line 37-38, 334, section on Flp/FRT experiments).

      We rewrote those sections to avoid confusion.

      Comment for all merged image data: the quality of the merged images is very poor - the individual channels are better but should also be reprocessed for more resolved image data sets. Also, it would be helpful to have boundaries drawn in an individual panel to identify the regions of the germarium, as cartooned in Figure S1A (which should be brought into Figure 1). F-actin or Vsg staining would have helped throughout the manuscript to enhance the visualization of described phenotypes.

      We are very sorry for the low-resolution images. This was caused when the original PDF file with high-resolution images was compressed in order to meet the small file size limit in the eLife submission portal. In the revised submission, we used high-resolution images.

      We outlined the germarium in Fig 1E.

      We brought the former FigS1 into Fig 1A.

      We provided Phalloidin (F-Actin) staining images in Fig S7.

      All p-values seem off. I recommend running the data through the student t-test again.

      We used the student t-test to calculate p-values and confirmed that they are correct. We don’t understand why the reviewer thinks all p-values seem off.

      In the original manuscript, as we mentioned in each figure legend, we used an asterisk (*) to indicate a p-value <0.05, without distinguishing whether it was <0.001, <0.01, or <0.05.

      Reviewer 2 is probably suggesting that we use ***, **, and * to indicate p-values of <0.001, <0.01, and <0.05, respectively. If so, we have now followed reviewer 2’s suggestion.
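
      The three-level asterisk convention described in this response can be sketched as a small helper. This is only an illustration: the function name and the "ns" label for non-significant values are assumptions and do not come from the manuscript.

```python
def significance_stars(p_value: float) -> str:
    """Map a p-value to the asterisk label quoted in the response above."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p-value must lie in [0, 1]")
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    # "ns" (not significant) is an assumed label, not taken from the manuscript.
    return "ns"
```

      Checking the narrowest threshold first means each p-value receives only its most stringent label.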

      Figure 1

      (1) Within the text, C is mentioned before A.

      We updated the text so that Fig 1A is now mentioned before Fig 1C.

      (2) B should be the supplemental figure.

      We moved the former Fig 1B to Supplemental Figure 1.

      (3) C - How were the different egg chamber stages selected in the WB? Naming them 'oocytes' is deceiving. Recommend labeling them as 'egg chambers', since an oocyte is claimed to be just the one-cell of that cyst.

      We changed the labeling to egg chambers.

      (4) Is the antibody not detecting Sakura in IF? There is no mention of this anywhere in the manuscript.

      While our Sakura antibody detects Sakura in IF, it seems to detect some other proteins as well. Since we have Sakura-EGFP fly strain (which fully rescues sakura<sup>null</sup> phenotypes) to examine Sakura expression and localization without such non-specific signal issues, we relied on Sakura-EGFP rather than anti-Sakura antibodies for IF.

      (5) Expand on the reliance of the sakura-EGFP fly line. Does this overexpression cause any phenotypes?

      sakura-EGFP does not cause any phenotypes in the background of sakura[+/+] and sakura[+/-].

      (6) Line 95 "as shown below" is not clear that it's referencing panel D.

      We now referenced Fig 1D.

      (7) Re: Figures 1 E and F. There is no mention of Hts or Vasa proteins in the text. "Sakura-EGFP was not expressed in somatic cells such as terminal filament, cap cells, escort cells, or follicle cells (Figure 1E). In the egg chamber, Sakura-EGFP was detected in the cytoplasm of nurse cells and was enriched in developing oocytes (Figure 1F)". Outline these areas or label these structures/sites in the images. The color of Merge labels is confusing as the blue is not easily seen.

      We mentioned Hts and Vasa in the text. We labeled the structures/sites in the images and updated the color labeling.

      Figure 2

      (1) Entire figure is not essential to be a main figure, but rather supplemental.

      We do not agree with the reviewer. The female fertility assay data, in which the sakura null mutant exhibits a strikingly strong phenotype that was completely rescued by our Sakura-EGFP transgene, are very important, and we would like to present them in a main figure.

      (2) 2A- one star (*) significance does not seem correct for the presented values between 0 and 100+.

      In the original manuscript, as we mentioned in each figure legend, we used an asterisk (*) to indicate a p-value <0.05, without distinguishing whether it was <0.001, <0.01, or <0.05.

      Reviewer 2 is probably suggesting that we use ***, **, and * to indicate p-values of <0.001, <0.01, and <0.05, respectively. If so, we have now followed reviewer 2’s suggestion.

      (3) 2C images are extremely low quality. Should be presented as bigger panels.

      We are very sorry for the low-resolution images. This was caused when the original PDF file with high-resolution images was compressed in order to meet the small file size limit in the eLife submission portal. In the revised submission, we used high-resolution images. We also presented as bigger panels.

      Figure 3

      (1) "We observed that some sakura<sup>null</sup> /null ovarioles were devoid of germ cells ("germless"), while others retained germ cells (Fig 3A)" What is described is, that it is hard to see. Must have a zoomed-in panel.

      We provided zoomed-in panels in Fig 3B.

      (2) C - The control doesn't seem to match. Must zoom in.

      We provided matched control and also zoomed in.

      (3) For clarity, separate the tumorous and germless images.

      In the new image, only one tumorous and one germless ovariole are shown, with clear labeling and outlines, for clarity.

      (4) Use arrows to help clearly indicate the changes that occur. As they are presented, they are difficult to see.

      We updated all the panels to enhance clarity.

      (5) Line 158 seems like a strong statement since it could be indirect.

      We softened the statement.

      Figure 4

      (1) Line 188-189 - Conclusion is an overstatement.

      We softened the statement.

      (2) Is the piRNA reduction due to a change in transcription? Or a direct effect by Sakura?

      We do not know the answers to these questions. We hope to address these in future studies.

      Figure 5

      (1) D - It might make more sense if this graph showed % instead of the numbers.

      We did not understand the reviewer’s point. We think using numbers, not %, makes more sense.

      (2) Line 213 - explain why RNAi 2 was chosen when RNAi 1 looks stronger.

      For unknown reasons, the fly stock of RNAi line 2 is much healthier than that of RNAi line 1 (without being driven by Gal4). We were concerned that RNAi line 1 might carry an unwanted genetic background, so we chose RNAi line 2 to avoid this issue.

      (3) In Line 218 there's an extra parenthesis after the PGC acronym.

      We corrected the error.

      (4) TOsk-Gal4 fly is not in the Methods section.

      We mentioned TOsk-Gal4 in the Methods.

      Figure 6:

      (1) The FLP-FRT section must be rewritten.

      We rewrote the FLP-FRT section.

      (2) A - include statistics.

      We included statistics using the chi-square test.

      (3) B - is not recalled in the Results text.

      We referred to Fig 6B in the text.

      (4) Line 232 references Figure 3, but not a specific panel.

      We referred to Fig 3A, 3C, 3D, and 3E in the text.

      Figure 7/8 - can go to Supplemental.

      We moved Fig 8 to supplemental. However, we think Fig 7 data is important and therefore we would like to present them as a main figure.

      (1) There should be CycA expression in the control during the first 4 divisions.

      Yes, there is CycA expression observed in the control during the first 4 divisions, while it’s much weaker than in sakura<sup>null</sup> clone.

      (2) Helpful to add the dotted lines to delineate (A) as well.

      We added a dotted outline for germarium in Fig 7A.

      (3) Line 263 CycA is miswritten as CyA.

      We corrected the typo.

      Figure 9

      (1) Otu antibody control?

      We validated Otu antibody in newly added Fig 10C and Fig S13A.

      (2) Which Sakura-EGFP line was used? sakura het. or null background? This isn't mentioned in the text, nor legend.

      We used Sakura-EGFP in the background of sakura[+/+]. We added this information in the methods and figure legend.

      (3) C - Why the switch to S2 cells? Not able to use the Otu antibody in the IP of ovaries?

      We can use the Otu antibody in the IP of ovaries. However, in anti-Sakura Western after anti-Otu IP, antibody light chain bands of the Otu antibodies overlap with the Sakura band. Therefore, we switched to S2 cells to avoid this issue by using an epitope tag.

      Figure 10

      (1) A- The resolution of images of the ribbon protein structure is poor.

      We are very sorry for the low-resolution images. This was caused when the original PDF file with high-resolution images was compressed in order to meet the small file size limit in the eLife submission portal. In the revised submission, we used high-resolution images.

      (2) A table summarizing the interactions between domains would help bring clarity to the data presented.

      We added a table summarizing the fragment interaction results.

      (3) Some images would be nice here to show that the truncations no longer colocalize.

      We did not understand the reviewer’s point. In our study, even for the full-length proteins, we have not shown any colocalization of Sakura and Otu in S2 cells or in ovaries, except that both are enriched in developing oocytes in egg chambers.

      Figure 12

      (1) A - control and RNAi lines do not match.

      We provided matched images.

      (2) In general, since for Sakura, only its binding to Otu was identified and since they phenocopy each other, doesn't most of the characterization of Sakura just look at Otu phenotypes? Does Sakura knockdown affect Otu localization or expression level (and vice versa)?

      We tested this by Western (Fig S15) and IF (Fig 12). Sakura knockdown did not decrease Otu protein level, and Otu knockdown did not decrease Sakura protein level (Fig S15). In sakura<sup>null</sup> clone, Otu level was not notably affected (Fig 12). In sakura<sup>null</sup> clone, Otu lost its localization to the posterior position within egg chambers.

      Figure S6

      (1) It is Luciferase, not Lucifarase.

      We corrected the typo.

      Reviewer #3 (Recommendations for the authors):

      (1) It is interesting that germless and tumorous phenotypes coexist in the same population of flies. Additional consideration of these essentially opposite phenotypes would significantly strengthen the study. For example, do they co-exist within the same fly and are the tumorous ovarioles present in newly eclosed flies or do they develop with age? The data in Figure 8 show that bam knockdown partially suppresses the germless phenotype. What effect does it have on the tumorous phenotype? Is transposon expression involved in either phenotype? Do Sakura mutant germline stem cell clones overgrow relative to wild-type cells in the same ovariole? Does sakura RNAi driven by NGT-Gal4 only cause germless ovaries or does it also cause tumorous phenotypes? What happens if the knockdown of Sakura is restricted to adulthood with a Gal80ts? It may not be necessary to answer all of these questions, but more insight into how these two phenotypes can be caused by loss of sakura would be helpful.

      We performed new experiments to answer these questions.

      do they co-exist within the same fly and are the tumorous ovarioles present in newly eclosed flies or do they develop with age?

      Tumorous and germless ovarioles coexist in the same fly (in the same ovary). Tumorous ovarioles are present in very young (0-1 day old) flies, including newly eclosed (Fig S5). The ratio of germless ovarioles increases and that of tumorous ovarioles decreases with age (Fig S5).

      The data in Figure 8 show that bam knockdown partially suppresses the germless phenotype. What effect does it have on the tumorous phenotype?

      bam knockdown effect on tumorous phenotype is shown in Fig S10. bam knockdown increased the ratio of tumorous ovarioles and the number of GSC-like cells.

      Is transposon expression involved in either phenotype?

      Since our transposon-piRNA reporter uses the germline-specific nos promoter, it is expressed only in germline cells, so we cannot examine transposon expression in germless ovarioles.

      Do Sakura mutant germline stem cell clones overgrow relative to wild-type cells in the same ovariole?

      Yes, Sakura mutant GSC clones overgrow. Please compare Fig 6C and Fig S8.

      Does sakura RNAi driven by NGT-Gal4 only cause germless ovaries or does it also cause tumorous phenotypes?

      Fig S10 and Fig S12 show the ovariole phenotypes of sakura RNAi driven by NGT-Gal4. It causes both germless and tumorous phenotypes.

      What happens if the knockdown of Sakura is restricted to adulthood with a Gal80ts?

      Our mosaic clone was induced at the adult stage, so we already have data of adulthood-specific loss of function. Gal80ts does not work well with nos-Gal4.

      (2) The idea that the excessive bam expression in tumorous ovaries is due to a failure of bam repression by dpp signaling is not well-supported by the data. Dpp signaling is activated in a very narrow region immediately adjacent to the niche but the images in Figure 7A show bam expression in cells that are very far away from the niche. Thus, it seems more likely to be due to a failure to turn bam expression off at the 16-cell stage than to a failure to keep it off in the niche region. To determine whether bam repression in the niche region is impaired, it would be important to examine cells adjacent to the niche directly at a higher magnification than is shown in Figure 7A.

      We provided higher magnification images of cells adjacent to the niche in new Fig 7A.

      We found that cells adjacent to the niche also express Bam-GFP.

      That said, we agree with the reviewer. A failure to turn bam expression off at the 16-cell stage may be an additional or even a main cause of bam misexpression in sakura mutant. We added this in the Discussion.

      (3) In addition, several minor comments should be addressed:

      a. Does anti-Sakura work for immunofluorescence?

      While our Sakura antibody detects Sakura in IF, it seems to detect some other proteins as well. Since we have Sakura-EGFP fly strain to examine Sakura expression and localization without such non-specific signal issues, we relied on Sakura-EGFP rather than anti-Sakura antibodies.

      b. Please provide insets to show the phenotypes indicated by the different color stars in Figure 3C more clearly.

      We provided new, higher-magnification images to show the phenotypes more clearly.

      c. Please indicate the frequency of the expression patterns shown in Figure 4D (do all ovarioles in each genotype show those patterns or is there variable penetrance?).

      We indicated the frequency.

      d. An image showing TOskGal4 driving a fluorophore should be provided so that readers can see which cells express Gal4 with this driver combination.

      It has been already done in the paper ElMaghraby et al, GENETICS, 2022, 220(1), iyab179, so we did not repeat the same experiment.

    1. eLife Assessment

      This important work describes results from a set of simulation and empirical studies of a set-up assessing exploratory behavior in a potentially rewarding environment that contains danger. The core idea is that an instrumental agent can be helped to be both effective and safe, thus avoiding excessive danger, during exploratory behavior, if the influence of an independent Pavlovian fear is flexibly gated based on uncertainty. This work is grounded in previous foundational work on Pavlovian control of instrumental choice, and significantly extends prior work showing that the impact of Pavlovian reward biases can be flexibly gated. The conclusion that safe but effective exploration can be achieved based on a flexibly weighted combination of a Pavlovian and an instrumental agent is convincing.

    2. Reviewer #1 (Public review):

      Summary:

      This paper provides a computational model of a synthetic task in which an agent needs to find a trajectory to a rewarding goal in a 2D-grid world, in which certain grid blocks incur a punishment. In a completely unrelated setup without explicit rewards, they then provide a model that explains data from an approach-avoidance experiment in which an agent needs to decide whether to approach, or withdraw from, a jellyfish, in order to avoid a pain stimulus, with no explicit rewards. Both models include components that are labelled as "Pavlovian"; hence the authors argue that their data show that the brain uses a "Pavlovian" fear system in complex navigational and approach-avoid decisions.

      In the first setup, they simulate a model in which a "Pavlovian" component learns about punishment in each grid block, whereas a Q-learner learns about the optimal path to the goal, using a scalar loss function for rewards and punishments. "Pavlovian" and Q-learning components are then weighted at each step to produce an action. Unsurprisingly, the authors find that including the "Pavlovian" component into the model reduces the cumulative punishment incurred, and this increases as the weight of the "Pavlovian" system increases. The paper does not explore to what extent increasing the punishment loss (while keeping reward loss constant) would lead to the same outcomes with a simpler model architecture.
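
      The per-step mixing of the two components described in this paragraph can be illustrated with a minimal sketch. The two-action set, the softmax choice rule, the temperature parameter, and the assumption that the Pavlovian term only promotes withdrawal are illustrative choices made here; they are not taken from the manuscript and may differ from the authors' implementation.

```python
import math

# Hypothetical two-action set used only for illustration.
ACTIONS = ["approach", "withdraw"]


def action_probabilities(q_values, pavlovian_value, omega, temperature=1.0):
    """Softmax over propensities that mix instrumental values with a Pavlovian bias.

    q_values: dict mapping each action to its instrumental (Q-learning) value.
    pavlovian_value: learned punishment expectation for the state (zero or negative).
    omega: weight of the Pavlovian component, between 0 and 1.
    """
    propensities = {}
    for action in ACTIONS:
        # Assumption: expected punishment biases the agent toward withdrawal only.
        bias = -pavlovian_value if action == "withdraw" else 0.0
        propensities[action] = (1.0 - omega) * q_values[action] + omega * bias
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(propensities.values())
    exps = {a: math.exp((p - peak) / temperature) for a, p in propensities.items()}
    total = sum(exps.values())
    return {a: e / total for a, e in exps.items()}
```

      With omega set to 0 the choice depends on the Q-values alone; raising omega shifts probability toward withdrawal in states with a strongly negative Pavlovian value.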

      In the second setup, an agent learns about punishments alone. So-called "Pavlovian biases" have previously been demonstrated in this task (i.e. an over avoidance when the correct decision is to approach). The authors explore several models to account for the Pavlovian biases.

      Strengths:

      Overall, the modelling exercises are interesting and relevant and incrementally expand the space of existing models.

      Weaknesses:

      For the first task, the simulation results are not compared to a simple Q-learning model. The second task is somewhat artificial, a problem compounded by the virtual reality setup. According to the cover story, participants get "stung by a jellyfish" on average 88 times during the experiment. In one condition, withdrawal from a jellyfish led to a sting.

    3. Reviewer #2 (Public review):

      Summary:

      The authors tested the efficiency of a model combining Pavlovian fear valuation and instrumental valuation. This model is amenable to many behavioral decision and learning setups - some of which have been or will be designed to test differences in patients with mental disorders (e.g., anxiety disorder, OCD, etc.).

      Strengths:

      (1) Simplicity of the model which can at the same time model rather complex environments.

      (2) Introduction of a flexible omega parameter.

      (3) Direct application to a rather advanced VR task.

      (4) The paper is extremely well written. It was a joy to read.

      Weaknesses:

      Almost none! In very few cases, the explanations could be a bit better.

      Comments on revised version:

      No further comments.

    4. Reviewer #3 (Public review):

      Summary:

      This paper aims to address the problem of exploring potentially rewarding environments that contain danger, based on the assumption that an independent Pavlovian fear learning system can help guide an agent during exploratory behaviour such that it avoids severe danger. This is important given that otherwise later gains seem to outweigh early threats, and agents may end up putting themselves in danger when it is advisable not to do so.

      The authors develop a computational model of exploratory behaviour that accounts for both instrumental and Pavlovian influences, combining the two according to uncertainty in the rewards. The result is that Pavlovian avoidance has a greater influence when the agent is uncertain about rewards.
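
      One way to picture a Pavlovian weight that follows uncertainty is the sketch below, in which the weight tracks a running average of the absolute prediction error. This update rule, its parameter names, and the clipping to the range 0 to 1 are assumptions made for illustration and should not be read as the authors' exact formulation.

```python
def update_omega(omega, prediction_error, learning_rate=0.1):
    """Move omega toward the (capped) absolute prediction error."""
    target = min(abs(prediction_error), 1.0)
    new_omega = omega + learning_rate * (target - omega)
    # Keep the weight in [0, 1] so it can be used as a mixing coefficient.
    return max(0.0, min(1.0, new_omega))


def run_omega(prediction_errors, omega=0.5, learning_rate=0.1):
    """Apply update_omega over a sequence of prediction errors."""
    for error in prediction_errors:
        omega = update_omega(omega, error, learning_rate)
    return omega
```

      Under this sketch, a run of large surprises drives the weight toward 1 (strong Pavlovian influence), while a run of accurate predictions lets it decay toward 0.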

      Strengths:

      The study does a thorough job of testing this model using both simulations and data from human participants performing an avoidance task. Simulations demonstrate that the model can produce "safe" behaviour, where the agent may not necessarily achieve the highest possible reward but ensures that losses are limited. Interestingly, the model appears to describe human avoidance behaviour in a task that tests for Pavlovian avoidance influences better than a model that doesn't adapt the balance between Pavlovian and instrumental based on uncertainty. The methods are robust, and generally there is little to criticise about the study.

      Weaknesses:

      The methods are robust, and generally there is little to criticise about the study. The extent of the testing in human participants is fairly limited, but goes far enough to demonstrate that the model can account for human behaviour in an exemplar task. There are, however, some elements of the model that are unrealistic (for example, the fact that pre-training is required to select actions with a Pavlovian bias would require the agent to explore the environment initially and encounter a vast amount of danger in order to learn how to avoid the danger later), although this could simply reflect a lengthy evolutionary process.

    5. Author response:

      The following is the authors’ response to the original reviews

      Reviewer #1 (Public review):

      Summary:

      This paper provides a computational model of a synthetic task in which an agent needs to find a trajectory to a rewarding goal in a 2D-grid world, in which certain grid blocks incur a punishment. In a completely unrelated setup without explicit rewards, they then provide a model that explains data from an approach-avoidance experiment in which an agent needs to decide whether to approach or withdraw from, a jellyfish, in order to avoid a pain stimulus, with no explicit rewards. Both models include components that are labelled as Pavlovian; hence the authors argue that their data show that the brain uses a Pavlovian fear system in complex navigational and approach-avoid decisions.

      Thanks to the reviewer’s comments, we have now added the following text to our Discussion section (Lines 290-302):

      “When it comes to our experiments, both the simulation and VR experiment models are related and derived from the same theoretical framework maintaining an algebraic mapping. They differ only in task-specific adaptations i.e. differ in action sets and differ in temporal difference learning rules - multi-step decisions in the grid world vs. Rescorla-Wagner rule for single-step decisions in the VR task. This is also true for Dayan et al. [2006] who bridge Pavlovian bias in a Go-No Go task (negative auto-maintenance pecking task) and a grid world task. A further minor difference between the simulation and VR experiment models is the use of a baseline bias in the human experiment's RL and the RLDDM model, where we also model reaction times with drift rates which is not a behaviour often simulated in the grid world simulations. As mentioned previously, we use the grid world tasks for didactic purposes, similar to Dayan et al. [2006] and common to test-beds for algorithms in reinforcement learning [Sutton et al., 1998]. The main focus of our work is on Pavlovian fear bias in safe exploration and learning, rather than on its role in complex navigational decisions. Future work can focus on capturing more sophisticated safe behaviours, such as escapes [Evans et al., 2019, Sporrer et al., 2023] and model-based planning, which span different aspects of the threat-imminence continuum [Mobbs et al., 2020].”
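
      The relationship between the two learning rules named in the quoted text can be sketched in their standard textbook forms. The function and parameter names below are illustrative and are not taken from the manuscript; the sketch only shows that the single-step Rescorla-Wagner rule is what the temporal difference rule reduces to when there is no successor state to bootstrap from.

```python
def rescorla_wagner_update(value, outcome, learning_rate):
    """Single-step update: move the value toward the observed outcome."""
    return value + learning_rate * (outcome - value)


def td_update(value, outcome, next_value, learning_rate, discount):
    """Multi-step temporal difference update that bootstraps from the next state."""
    td_error = outcome + discount * next_value - value
    return value + learning_rate * td_error
```

      Setting the discount to 0 (or the next-state value to 0, as at the end of a single-step trial) makes the two updates identical.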

      In the first setup, they simulate a model in which a component they label as Pavlovian learns about punishment in each grid block, whereas a Q-learner learns about the optimal path to the goal, using a scalar loss function for rewards and punishments. Pavlovian and Q-learning components are then weighed at each step to produce an action. Unsurprisingly, the authors find that including the Pavlovian component in the model reduces the cumulative punishment incurred, and this increases as the weight of the Pavlovian system increases. The paper does not explore to what extent increasing the punishment loss (while keeping reward loss constant) would lead to the same outcomes with a simpler model architecture, so any claim that the Pavlovian component is required for such a result is not justified by the modelling. 

      Thanks to the reviewer’s comments, we have now added the following text to our Discussion section (Line 303-313):

      “In our simulation experiments, we assume the coexistence of the Pavlovian fear system and the instrumental system to demonstrate the emergent safety-efficiency trade-off from their interaction. It is possible that similar behaviours could be modelled using an instrumental system alone, with higher punishment sensitivity, therefore we do not argue for the necessity for the Pavlovian fear system here. Instead, the Pavlovian fear system itself could be a potential biologically plausible implementation of punishment sensitivity. Unlike punishment sensitivity (scaling of the punishments), which has not been robustly mapped to neural substrates in fMRI studies; the neural substrates for the Pavlovian fear system are well known (e.g., the limbic loop and amygdala, further see Supplementary Fig. 16). Additionally, Pavlovian fear system provides a separate punishment memory that cannot be erased by greater rewards like [Elfwing and Seymour, 2017, Wang et al., 2018]. This fundamental point can be observed in our simple T-maze simulations, where the Pavlovian fear system encourages avoidance behaviour and the agent chooses the smaller reward instead of the greater reward.”

      In the second setup, an agent learns about punishments alone. "Pavlovian biases" have previously been demonstrated in this task (i.e. an overavoidance when the correct decision is to approach). The authors explore several models (all of which are dissimilar to the ones used in the first setup) to account for the Pavlovian biases. 

      Thanks to the reviewer’s comments, we have now added a paragraph in our Discussion section (Line 290-302) explaining the similarity of our models and their integrated interpretation. We hope this addresses the reviewer’s concerns.

      Strengths: 

      Overall, the modelling exercises are interesting and relevant and incrementally expand the space of existing models. 

      Weaknesses: 

      I find the conclusions misleading, as they are not supported by the data. 

      First, the similarity between the models used in the two setups appears to be more semantic than computational or biological. So it is unclear to me how the results can be integrated. 

      Thanks to the reviewer’s comments, we have now added a paragraph in our Discussion section (Line 290-302 onwards) explaining the similarity of our models and their integrated interpretation. We hope this addresses the reviewer’s concerns.

      Secondly, the authors do not show "a computational advantage to maintaining a specific fear memory during exploratory decision-making" (as they claim in the abstract). Making such a claim would require showing an advantage in the first place. For the first setup, the simulation results will likely be replicated by a simple Q-learning model when scaling up the loss incurred for punishments, in which case the more complex model architecture would not confer an advantage. The second setup, in contrast, is so excessively artificial that even if a particular model conferred an advantage here, this is highly unlikely to translate into any real-world advantage for a biological agent. The experimental setup was developed to demonstrate the existence of Pavlovian biases, but it is not designed to conclusively investigate how they come about. In a nutshell, who in their right mind would touch a stinging jellyfish 88 times in a short period of time, as the subjects do on average in this task? Furthermore, in which real-life environment does withdrawal from a jellyfish lead to a sting, as in this task? 

      Crucially, simplistic models such as the present ones can easily solve specifically designed lab tasks with low dimensionality but they will fail in higher-dimensional settings. Biological behaviour in the face of threat is utterly complex and goes far beyond simplistic fight-flight-freeze distinctions (Evans et al., 2019). It would take a leap of faith to assume that human decision-making can be broken down into oversimplified sub-tasks of this sort (and if that were the case, this would require a meta-controller arbitrating the systems for all the sub-tasks, and this meta-controller would then struggle with the dimensionality).

      Thanks to the reviewer’s comments, we have now mentioned this point in Lines 299-302.

      On the face of it, the VR task provides higher "ecological validity" than previous screen-based tasks. However, in fact, it is only the visual stimulation that differs from a standard screen-based task, whereas the action space is exactly the same. As such, the benefit of VR does not become apparent, and its full potential is foregone. 

      If the authors are convinced that their model can - then data from naturalistic approach-avoidance VR tasks is publicly available, e.g. (Sporrer et al., 2023), so this should be rather easy to prove or disprove. In summary, I am doubtful that the models have any relevance for real-life human decision-making. 

      Finally, the authors seem to make much broader claims that their models can solve safety-efficiency dilemmas. However, a combination of a Pavlovian bias and an instrumental learner (study 1) via a fixed linear weighting does not seem to be "safe" in any strict sense. This will lead to the agent making decisions leading to death when the promised reward is large enough (outside perhaps a very specific region of the parameter space). Would it not be more helpful to prune the decision tree according to a fixed threshold (Huys et al., 2012)? So, in a way, the model is useful for avoiding cumulatively excessive pain but not instantaneous destruction. As such, it is not clear what real-life situation is modelled here. 

      We hope our additions to the Discussion section, from Line 290 to Line 313 address the reviewer’s concerns.  

      A final caveat regarding Study 1 is the use of a PH associability term as a surrogate for uncertainty. The authors argue that this term provides a good fit to fear-conditioned SCR but that is only true in comparison to simpler RW-type models. Literature using a broader model space suggests that a formal account of uncertainty could fit this conditioned response even better (Tzovara et al., 2018). 

      We have now added a line discussing this. (Line 356-358)

      “Future work could also use a formal account of uncertainty which could fit the fear-conditioned skin-conductance response better than Pearce-Hall associability [Tzovara et al., 2018].”

      Reviewer #2 (Public review): 

      Summary: 

      The authors tested the efficiency of a model combining Pavlovian fear valuation and instrumental valuation. This model is amenable to many behavioral decision and learning setups - some of which have been or will be designed to test differences in patients with mental disorders (e.g., anxiety disorder, OCD, etc.). 

      Strengths: 

      (1) Simplicity of the model which can at the same time model rather complex environments. 

      (2) Introduction of a flexible omega parameter. 

      (3) Direct application to a rather advanced VR task. 

      (4) The paper is extremely well written. It was a joy to read. 

      Weaknesses: 

      Almost none! In very few cases, the explanations could be a bit better. 

      Thank you, we have added further explanations in the discussion section. We have further improved the writing in abstract, introduction and Methods section taking into account recommendations from reviewer #2 and #3.

      Reviewer #2 (Recommendations for the authors): 

      (1) Why is there no flexible omega in Figures 3B and 3C? Did I miss this? 

      Thank you. We have now added additional text to explain our motivation in Experiment 2, which only varies the fixed omega and omits the flexible omega (Lines 136-140).

      “In this set of results, we wish to qualitatively tease apart the role of a Pavlovian bias in shaping and sculpting the instrumental value and also provide more insight into the resulting safety-efficiency trade-off. Having shown the benefits of a flexible ω in the previous section, here we only vary the fixed ω to illustrate the effect of a constant bias and are not concerned with the flexible bias in this experiment.”

      We encourage the reader to consider this akin to an additional study that will explain how Pavlovian bias to withdraw can play a role in avoiding punishments similar to that of punishment sensitivity. This is particularly important as we do have neural correlates for Pavlovian biases but lack a clear neural correlation for punishment sensitivity so far, as mentioned in our new additions to the Discussion section (Lines 303-313).

      (2) The introduction of the flexible omega and the PAL agent in the results is a bit sudden. Some more details are needed to understand this during the first read of this passage. 

      We thank reviewer #2 for bringing this to our notice. We have attempted to refine our passage by including sentences like - 

      “The standard (rational) reinforcement learning system is modelled as the instrumental learning system. The additional Pavlovian fear system biases the withdrawal actions to aid in safe exploration, in line with our hypothesis.”

      “Both systems learn using a basic temporal difference updating rule (or in instances, its special case, the Rescorla-Wagner rule)”

      “We implement the flexible ω using Pearce-Hall associability (see equation 15 in Methods). The Pearce-Hall associability maintains a running average of absolute temporal difference errors (δ) as per equation 14. This acts as a crude but easy-to-compute metric for outcome uncertainty which gates the influence of the Pavlovian fear system, in line with our hypothesis. This implies that the higher the outcome uncertainty, as is the case in early exploration, the more cautious our agent will be, resulting in safer exploration.”
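      The gating described in the passage above can be illustrated with a minimal sketch. Equations 14 and 15 are not reproduced in this response, so the exact update form, the linear scaling and clipping of ω, and the names eta, kappa, and assoc0 are assumptions made for illustration only, not the authors' implementation.

```python
def rw_update(value, outcome, lr):
    """Rescorla-Wagner update; returns (new_value, prediction_error)."""
    delta = outcome - value
    return value + lr * delta, delta


def update_associability(assoc, delta, eta):
    """Running average of absolute prediction errors (assumed Pearce-Hall form)."""
    return (1.0 - eta) * assoc + eta * abs(delta)


def flexible_omega(assoc, kappa=1.0):
    """Map associability to a Pavlovian weight in [0, 1] (assumed scaling and clipping)."""
    return max(0.0, min(1.0, kappa * assoc))


def simulate(outcomes, lr=0.3, eta=0.3, assoc0=1.0):
    """Return the omega used on each trial for a sequence of outcomes."""
    value, assoc = 0.0, assoc0  # assoc0 = 1.0 assumes maximal initial uncertainty
    omegas = []
    for outcome in outcomes:
        omegas.append(flexible_omega(assoc))
        value, delta = rw_update(value, outcome, lr)
        assoc = update_associability(assoc, delta, eta)
    return omegas


# A fully predictable punishment: uncertainty, and with it omega, should decay.
omegas = simulate([-1.0] * 30)
```

      Under these assumptions, ω starts at its maximum and shrinks as the punishment becomes predictable, so the Pavlovian influence is strongest during early exploration.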

      (3) In my view, the possibility of modeling moving predators is extremely interesting. I would include Figure 8D and the corresponding explanation in the main text. 

      Response with revision: We thank the reviewer for finding our simulation on moving predators extremely interesting. Unfortunately, since our instrumental system is not model-based, and especially is not explicitly modelling the predator dynamics, our simulation might not be a very accurate representation of real moving predator environments. As pointed out by Reviewer #1, perhaps several other systems other than Pavlovian fear responses are necessary for safe behaviour in such environments and we hope to address these in future studies. Thanks again for taking an interest in our simulations.

      (4) The VR experiment should be mentioned more clearly in the abstract and the introduction. It should be mentioned a bit more clearly why VR was helpful and why the authors did not use a simple bird's eye grid world task. 

      I cannot assess the RLDDM and I did not check the code. 

      Thank you, we have now mentioned the VR experiment more clearly in the abstract and the introduction. We also now further mention that the VR experiment “builds upon previous Go-No Go studies studying Pavlovian-Instrumental transfer (Guitart-Masip et al, 2012; Cavanagh et al, 2013). The virtual-reality approach confers a greater ecological validity and the immersive nature may contribute better fear conditioning, making it easier to distinguish the aversive components.”

      A bird’s eye grid world may not invoke a strong withdrawal response, as seen in these immersive approach-withdrawal tasks where we can clearly distinguish a Pavlovian fear-based withdrawal response. We did include immersive VR maze results in the supplementary materials, but future work is needed to isolate the different systems at play in such a complex behaviour.

      Reviewer #3 (Public review): 

      Summary: 

      This paper aims to address the problem of exploring potentially rewarding environments that contain danger, based on the assumption that an independent Pavlovian fear learning system can help guide an agent during exploratory behaviour such that it avoids severe danger. This is important given that otherwise later gains seem to outweigh early threats, and agents may end up putting themselves in danger when it is advisable not to do so.

      The authors develop a computational model of exploratory behaviour that accounts for both instrumental and Pavlovian influences, combining the two according to uncertainty in the rewards. The result is that Pavlovian avoidance has a greater influence when the agent is uncertain about rewards. 

      Strengths: 

      The study does a thorough job of testing this model using both simulations and data from human participants performing an avoidance task. Simulations demonstrate that the model can produce "safe" behaviour, where the agent may not necessarily achieve the highest possible reward but ensures that losses are limited. Interestingly, the model appears to describe human avoidance behaviour in a task that tests for Pavlovian avoidance influences better than a model that doesn't adapt the balance between Pavlovian and instrumental based on uncertainty. The methods are robust, and generally, there is little to criticise about the study. 

      Weaknesses: 

      The extent of the testing in human participants is fairly limited but goes far enough to demonstrate that the model can account for human behaviour in an exemplar task. There are, however, some elements of the model that are unrealistic (for example, the fact that pre-training is required to select actions with a Pavlovian bias would require the agent to explore the environment initially and encounter a vast amount of danger in order to learn how to avoid the danger later). The description of the models is also a little difficult to parse. 

      Thank you, we have now attempted to clarify these points in the Discussion section by adding the following text (Lines 313-321):

      “We next discuss the plausibility of pre-training to select the hardwired actions. In the human experiment, the withdrawal action is straightforwardly biased, as noted, while in the grid world, we assume a hardwired encoding of withdrawal actions for each state/grid. This innate encoding of withdrawal actions could be represented in the dPAG [Kim et al., 2013]. We implement this bias using pre-training, which we assume would be a product of evolution. Alternatively, this could be interpreted as deriving from an appropriate value initialization where the gradient over initialized values determines the action bias. Such aversive value initialization, driving avoidance of novel and threatening stimuli, has been observed in the tail of the striatum in mice, which is hypothesised to function as a Pavlovian fear/threat learning system [Menegas et al., 2018].”

      Reviewer #3 (Recommendations for the authors): 

      I have relatively little to suggest, as in my view the paper is robust, thorough, and creative, and does enough to support the primary argument being made at the most fundamental level. My suggestions for improvement are as follows: 

      (1) Some aspects of the model are potentially unrealistic (as described in the public review), and the paper may benefit from some discussion of these issues or attempts to make the model more realistic - i.e., to what extent is this plausible in explaining more complex avoidance behaviour? Primarily, the fact that pre-training is required to identify actions subject to Pavlovian bias seems unlikely to be effective in real-world situations - is there a better way to achieve this in cases where there isn't necessarily an instinctual Pavlovian response? 

      Thank you, we agree that the advantage of Pavlovian bias is restricted to the bias/instinctual Pavlovian response conferred by evolution. Future work is needed to model more complex avoidance behaviour such as escapes. We hope to have made this more clear with our edits to the Discussion (Lines 299-302) in our response to Reviewer #1’s comments, specifically:

      “The main focus of our work is on Pavlovian fear bias in safe exploration and learning, rather than on its role in complex navigational decisions. Future work can focus on capturing more sophisticated safe behaviours, such as escapes [Evans et al., 2019, Sporrer et al., 2023] and model-based planning which span different aspects of the threat-imminence continuum [Mobbs et al., 2020]”

      (2) The description of the model in the method can be a little hard to follow and would benefit from further explanation of certain parameters. In general, it would be good to ensure that all terms mentioned in equations are described clearly in the text (for example, in Equation 1 it isn't clear what k refers to).

      Thank you, we have now added further information on all of the parameters in Equation 1 and overall improved the Methods section writing, for instance using time subscript for less confusion while introducing the parameters. We use the standard notation used in Sutton and Barto textbook. k refers to the timesteps into the future, and is now explained better in the Methods section.

      (3) Another point of clarification in Equation 1 - does the policy account for the Pavlovian influence or is this purely instrumental? 

      Thank you, Equation 1 is purely instrumental. We have now specifically mentioned this. The Pavlovian influence follows later. They are combined into propensities for action as per equations 11-13.
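      As a rough illustration of how instrumental values and a Pavlovian withdrawal bias might be merged into action propensities, consider the sketch below. Equations 11-13 are not shown in this response, so the convex combination with weight omega, the softmax choice rule, and all names and numbers here are hypothetical.

```python
import math


def softmax(prefs, beta=1.0):
    """Convert propensities into choice probabilities (numerically stable)."""
    top = max(prefs)
    exps = [math.exp(beta * (p - top)) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def action_propensities(q_values, pavlovian_bias, omega):
    """Assumed linear mix: (1 - omega) * instrumental + omega * Pavlovian."""
    return [(1.0 - omega) * q + omega * b
            for q, b in zip(q_values, pavlovian_bias)]


# Two actions: index 0 = approach, index 1 = withdraw (illustrative values).
q_values = [1.0, 0.2]        # the instrumental system prefers approach
pavlovian_bias = [0.0, 1.5]  # the fear system promotes withdrawal only

p_low = softmax(action_propensities(q_values, pavlovian_bias, omega=0.1))
p_high = softmax(action_propensities(q_values, pavlovian_bias, omega=0.9))
```

      With these illustrative numbers, a small omega leaves the agent mostly approaching, whereas a large omega shifts choice towards withdrawal, which is the safety-efficiency trade-off discussed in the reviews.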

      (4) I was curious whether similar outcomes could be achieved by more complex instrumental models without the need for Pavlovian influences. For example, could different risk-sensitive decision rules (e.g., conditional value at risk) that rely only on the instrumental system afford safe behaviour without the need for an additional Pavlovian system? 

      Thank you for your comment. Yes, CVaR can achieve safe exploration/cautious behaviour in choices similar to Pavlovian avoidance learning. But we think both differ in the following ways:

      (1) CVaR provides the correct solution to the wrong problem (objective that only maximises the lower tail of the distribution of outcomes)

      (2) Pavlovian bias provides the wrong solution to the right problem (normative objective, but a Pavlovian bias which may be a vestige of evolution)

      Here we use the “wrong problem, wrong solution, wrong environment” categorisation terminology from Huys et al. 2015.

      Huys, Q. J., Guitart-Masip, M., Dolan, R. J., & Dayan, P. (2015). Decision-theoretic psychiatry. Clinical Psychological Science, 3(3), 400-421.

      Secondly, we find an effect of Pavlovian bias on reaction times - slowing down of approach responses and faster withdrawal responses. We do not think this can be best explained in a CVaR-type model and is a direction for future work. We think such model-based methods are slower to compute, whereas the Pavlovian withdrawal bias is a quicker response.

      We have now included this in brief in Lines 280-288.
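      To make the contrast with CVaR concrete, the sketch below computes an empirical lower-tail CVaR (the mean of the worst fraction of outcomes) for two hypothetical options. The outcome values and the tail level alpha are invented for illustration and are not taken from the study.

```python
import math


def cvar_lower_tail(outcomes, alpha):
    """Mean of the worst alpha-fraction of outcomes (higher outcome = better)."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    ordered = sorted(outcomes)
    k = max(1, math.ceil(alpha * len(ordered)))  # size of the lower tail
    tail = ordered[:k]
    return sum(tail) / len(tail)


def mean(outcomes):
    return sum(outcomes) / len(outcomes)


# Hypothetical options: a risky one with a rare large loss, and a safe one.
risky = [10.0] * 9 + [-50.0]
safe = [3.0] * 10

mean_prefers_risky = mean(risky) > mean(safe)
cvar_prefers_safe = cvar_lower_tail(safe, 0.2) > cvar_lower_tail(risky, 0.2)
```

      An expected-value maximiser would pick the risky option here, while a CVaR objective at alpha = 0.2 picks the safe one, because it scores each option only by its worst outcomes.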

      (5) Figure 5 would benefit from a clearer caption as it is not necessarily clear from the current one that the left panels refer to choices and the right panels to reaction times. 

      Thank you, we have improved the caption for Fig. 5.

      (6) It would be good to include some indication of the quality of the model fits for the human behavioural study (i.e., diagnostics such as R-hat) to ensure that differences in model fit between models are not due to convergence issues with different models. This would be especially helpful for the RLDDM models as these can be difficult to fit successfully.

      Thank you, we observed that all R-hat values were strictly less than 1.05 (most parameters were less than 1.01 and generally close to 1), indicating that the models converged. We have now added this line to the results (Line 246-248).

      Thanks to the reviewer’s comments, we have now added the following text to our Discussion section (Lines 290-302):

      “When it comes to our experiments, both the simulation and VR experiment models are related and derived from the same theoretical framework maintaining an algebraic mapping. They differ only in task-specific adaptations i.e. differ in action sets and differ in temporal difference learning rules - multi-step decisions in the grid world vs. Rescorla-Wagner rule for single-step decisions in the VR task. This is also true for Dayan et al. [2006] who bridge Pavlovian bias in a Go-No Go task (negative auto-maintenance pecking task) and a grid world task. A further minor difference between the simulation and VR experiment models is the use of a baseline bias in the human experiment's RL and the RLDDM model, where we also model reaction times with drift rates which is not a behaviour often simulated in the grid world simulations. As mentioned previously, we use the grid world tasks for didactic purposes, similar to Dayan et al. [2006] and common to test-beds for algorithms in reinforcement learning [Sutton et al., 1998]. The main focus of our work is on Pavlovian fear bias in safe exploration and learning, rather than on its role in complex navigational decisions. Future work can focus on capturing more sophisticated safe behaviours, such as escapes [Evans et al., 2019, Sporrer et al., 2023] and model-based planning, which span different aspects of the threat-imminence continuum [Mobbs et al., 2020].”
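      The R-hat convergence check cited in the response to point (6) can be sketched as follows. This is the basic Gelman-Rubin statistic written in plain Python for illustration; it is not the authors' fitting code, and current tools such as Stan and ArviZ report refined (split, rank-normalised) variants. The simulated chains are synthetic.

```python
import random
import statistics


def gelman_rubin_rhat(chains):
    """Basic potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    chain_means = [statistics.fmean(chain) for chain in chains]
    # W: average within-chain variance.
    within = statistics.fmean([statistics.variance(chain) for chain in chains])
    # B: between-chain variance, scaled by the chain length.
    between = n * statistics.variance(chain_means)
    var_hat = (n - 1) / n * within + between / n
    return (var_hat / within) ** 0.5


random.seed(0)
# Four synthetic chains sampling the same distribution: R-hat should be near 1.
mixed = [[random.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(4)]
# Chains stuck around different values: R-hat should be far above 1.05.
stuck = [[random.gauss(mu, 1.0) for _ in range(1000)]
         for mu in (0.0, 0.0, 5.0, 5.0)]

rhat_mixed = gelman_rubin_rhat(mixed)
rhat_stuck = gelman_rubin_rhat(stuck)
```

      Values close to 1 indicate that the chains agree with one another; the 1.05 and 1.01 cut-offs quoted in the response are thresholds applied to a statistic of this kind.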

    1. eLife Assessment

      This study provides important insights into how cryptic pockets play a role in shaping binding preferences of protein-nucleic acid interactions. By combining biochemical assays and state-of-the-art molecular dynamics simulations, the study presents a mechanism by which viral protein 35 (VP35) homologs bind the backbone of double-stranded RNA. The evidence is compelling for molecular determinants that suggest two different dsRNA binding modes for VP35 and also underscores the evolutionary importance of these pockets.

    2. Reviewer #1 (Public review):

      Summary:

      Mallimadugula et al. combined Molecular Dynamics (MD) simulations, thiol-labeling experiments, and RNA-binding assays to study and compare the RNA-binding behavior of the Interferon Inhibitory Domain (IID) from Viral Protein 35 (VP35) of Zaire ebolavirus, Reston ebolavirus, and Marburg marburgvirus. Although the structures and sequences of these viruses are similar, the authors suggest that differences in RNA binding stem from variations in their intrinsic dynamics, particularly the opening of a cryptic pocket. More precisely, the dynamics of this pocket may influence whether the IID binds to RNA blunt ends or the RNA backbone.

      Overall, the authors present important findings to reveal how the intrinsic dynamics of proteins can influence their binding to molecules and, hence, their functions. They have used extensive biased simulations to characterize the opening of a pocket which was not clearly seen in experimental results - at least when the proteins were in their unbound forms. Biochemical assays further validated theoretical results and linked them to RNA binding modes. Thus, with the combination of biochemical assays and state-of-the-art Molecular Dynamics simulations, these results are clearly compelling.

      Strengths:

      The use of extensive Adaptive Sampling combined with biochemical assays clearly points to the opening of the Interferon Inhibitory Domain (IID) as a factor for RNA binding. This type of approach is especially useful to assess how protein dynamics can affect its function.

      Weaknesses:

      Although a connection between the cryptic pocket dynamics and RNA binding mode is proposed, the precise molecular mechanism linking pocket opening to RNA binding still remains unclear.

    3. Reviewer #2 (Public review):

      Summary:

      The authors aimed to determine whether a cryptic pocket in the VP35 protein of Zaire ebolavirus has a functional role in RNA binding and, by extension, in immune evasion. They sought to address whether this pocket could be an effective therapeutic target resistant to evolutionary evasion by studying its role in dsRNA binding among different filovirus VP35 homologs. Through simulations and experiments, they demonstrated that cryptic pocket dynamics modulate the RNA binding modes, directly influencing how VP35 variants block RIG-I and MDA5-mediated immune responses.

      The authors successfully achieved their aim, showing that the cryptic pocket is not a random structural feature but rather an allosteric regulator of dsRNA binding. Their results not only explain functional differences in VP35 homologs despite their structural similarity but also suggest that targeting this cryptic pocket may offer a viable strategy for drug development with reduced risk of resistance.

      This work represents a significant advance in the field of viral immunoevasion and therapeutic targeting of traditionally "undruggable" protein features. By demonstrating the functional relevance of cryptic pockets, the study challenges long-standing assumptions and provides a compelling basis for exploring new drug discovery strategies targeting these previously overlooked regions.

      Strengths:

      The combination of molecular simulations and experimental approaches is a major strength, enabling the authors to connect structural dynamics with functional outcomes. The use of homologous VP35 proteins from different filoviruses strengthens the study's generality, and the incorporation of point mutations adds mechanistic depth. Furthermore, the ability to reconcile functional differences that could not be explained by crystal structures alone highlights the utility of dynamic studies in uncovering hidden allosteric features.

      Weaknesses:

      While the methodology is robust, certain limitations should be acknowledged. For example, the study would benefit from a more detailed quantitative analysis of how specific mutations impact RNA binding and cryptic pocket dynamics, as this could provide greater mechanistic insight. This study would also benefit from providing a clear rationale for the selection of the amber03 force field and considering the inclusion of volume-based approaches for pocket analysis. Such revisions will strengthen the robustness and impact of the study.

      Comments on revisions:

      The authors addressed the concerns raised.

    4. Reviewer #3 (Public review):

      Summary:

      The authors suggest a mechanism that explains the preference of viral protein 35 (VP35) homologs to bind the backbone of double-stranded RNA versus blunt ends. These preferences have a biological impact in terms of the ability of different viruses to escape the immune response of the host.

      The proposed mechanism involves the existence of a cryptic pocket, where VP35 binds the blunt ends of dsRNA when the cryptic pocket is closed and preferentially binds the RNA double-stranded backbone when the pocket is open.

      The authors performed MD simulations, thiol labelling experiments, fluorescence polarization assays, as well as point mutations to support their hypothesis.

      Strengths:

      This is a genuinely interesting scientific question, which is approached through multiple complementary experiments as well as extensive MD simulations. Moreover, structural biology studies focused on RNA-protein interactions are particularly rare, highlighting the importance of further research in this area.

      Weaknesses:

      - Sequence similarity between Ebola-Zaire (94% similarity) explains their similar behaviour in simulations and experimental assays. Marburg instead is a more distant homolog (~80% similarity relative to Ebola/Zaire). This difference in sequence and structure can explain the propensities, without the need to involve the existence of a cryptic pocket.

      - No real evidence for the presence of a cryptic pocket is presented, but rather a distance probability distribution between two residues obtained from extensive MD simulations. It would be interesting to characterise the modelled RNA-protein interface in more detail.

      Comments on revisions:

      - I still think that the term cryptic pocket is misleading here, unless the cryptic pocket is more thoroughly characterised. I would find it more appropriate to use the term open/closed state.

      - Mg ions are known to be crucial in stabilising RNA structure both in vitro and in MD simulations (see e.g. Draper BJ 2008 and many others). While I understand that the authors cannot repeat simulations in presence of ions, I believe that this detail should be more clearly detailed in the manuscript.

    5. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      Mallimadugula et al. combined Molecular Dynamics (MD) simulations, thiol-labeling experiments, and RNA-binding assays to study and compare the RNA-binding behavior of the Interferon Inhibitory Domain (IID) from Viral Protein 35 (VP35) of Zaire ebolavirus, Reston ebolavirus, and Marburg marburgvirus. Although the structures and sequences of these viruses are similar, the authors suggest that differences in RNA binding stem from variations in their intrinsic dynamics, particularly the opening of a cryptic pocket. More precisely, the dynamics of this pocket may influence whether the IID binds to RNA blunt ends or the RNA backbone.

      Overall, the authors present important findings to reveal how the intrinsic dynamics of proteins can influence their binding to molecules and, hence, their functions. They have used extensive biased simulations to characterize the opening of a pocket which was not clearly seen in experimental results - at least when the proteins were in their unbound forms. Biochemical assays further validated theoretical results and linked them to RNA binding modes. Thus, with the combination of biochemical assays and state-of-the-art Molecular Dynamics simulations, these results are clearly compelling.

      Strengths:

      The use of extensive Adaptive Sampling combined with biochemical assays clearly points to the opening of the Interferon Inhibitory Domain (IID) as a factor for RNA binding. This type of approach is especially useful to assess how protein dynamics can affect its function.

      Weaknesses:

      Although a connection between the cryptic pocket dynamics and RNA binding mode is proposed, the precise molecular mechanism linking pocket opening to RNA binding still remains unclear.

      Reviewer #2 (Public review):

      Summary:

      The authors aimed to determine whether a cryptic pocket in the VP35 protein of Zaire ebolavirus has a functional role in RNA binding and, by extension, in immune evasion. They sought to address whether this pocket could be an effective therapeutic target resistant to evolutionary evasion by studying its role in dsRNA binding among different filovirus VP35 homologs. Through simulations and experiments, they demonstrated that cryptic pocket dynamics modulate the RNA binding modes, directly influencing how VP35 variants block RIG-I and MDA5-mediated immune responses.

      The authors successfully achieved their aim, showing that the cryptic pocket is not a random structural feature but rather an allosteric regulator of dsRNA binding. Their results not only explain functional differences in VP35 homologs despite their structural similarity but also suggest that targeting this cryptic pocket may offer a viable strategy for drug development with reduced risk of resistance.

      This work represents a significant advance in the field of viral immunoevasion and therapeutic targeting of traditionally "undruggable" protein features. By demonstrating the functional relevance of cryptic pockets, the study challenges long-standing assumptions and provides a compelling basis for exploring new drug discovery strategies targeting these previously overlooked regions.

      Strengths:

      The combination of molecular simulations and experimental approaches is a major strength, enabling the authors to connect structural dynamics with functional outcomes. The use of homologous VP35 proteins from different filoviruses strengthens the study's generality, and the incorporation of point mutations adds mechanistic depth. Furthermore, the ability to reconcile functional differences that could not be explained by crystal structures alone highlights the utility of dynamic studies in uncovering hidden allosteric features.

      Weaknesses:

      While the methodology is robust, certain limitations should be acknowledged. For example, the study would benefit from a more detailed quantitative analysis of how specific mutations impact RNA binding and cryptic pocket dynamics, as this could provide greater mechanistic insight. This study would also benefit from providing a clear rationale for the selection of the amber03 force field and considering the inclusion of volume-based approaches for pocket analysis. Such revisions will strengthen the robustness and impact of the study.

      Reviewer #3 (Public review):

      Summary:

      The authors suggest a mechanism that explains the preference of viral protein 35 (VP35) homologs to bind the backbone of double-stranded RNA versus blunt ends. These preferences have a biological impact in terms of the ability of different viruses to escape the immune response of the host.

      The proposed mechanism involves the existence of a cryptic pocket, where VP35 binds the blunt ends of dsRNA when the cryptic pocket is closed and preferentially binds the RNA double-stranded backbone when the pocket is open.

      The authors used MD simulations, thiol labelling experiments, fluorescence polarization assays, and point mutations to support their hypothesis.

      Strengths:

      This is a genuinely interesting scientific question, which is approached through multiple complementary experiments as well as extensive MD simulations. Moreover, structural biology studies focused on RNA-protein interactions are particularly rare, highlighting the importance of further research in this area.

      Weaknesses:

      - Sequence similarity between Reston and Ebola-Zaire (94% similarity) explains their similar behaviour in simulations and experimental assays. Marburg instead is a more distant homolog (~80% similarity relative to Ebola/Zaire). This difference in sequence and structure can explain the propensities, without the need to invoke the existence of a cryptic pocket.

      - No real evidence for the presence of a cryptic pocket is presented, but rather a distance probability distribution between two residues obtained from extensive MD simulations. It would be interesting to characterise the modelled RNA-protein interface in more detail.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Before assessing the overall quality and significance of this work, this reviewer needs to specify the context of this review. This reviewer's expertise lies in biased and unbiased molecular dynamics simulations and structural biology. Hence, while this reviewer can overall understand the results for thiol-labeling and RNA-binding assays, this review will not assess the quality of these biochemical assays and will mainly focus on the modelling results.

      Overall, the authors present important findings to reveal how the intrinsic dynamics of proteins can influence their binding to molecules and, hence, their functions. They have used extensive biased simulations to characterize the opening of a pocket which was not clearly seen in experimental results - at least when the proteins were in their unbound forms. Biochemical assays further validated theoretical results and linked them to RNA binding modes. Thus, with the combination of biochemical assays and state-of-the-art Molecular Dynamics simulations, these results are clearly compelling.

      Beyond the clear qualities of this work, I would like to mention a few points that may help to better contextualize and rationalize the results presented here.

      - First, both the introduction and discussion sections seem relatively condensed. Extending them to, for example, better describe the methodological context and discuss the methodological limitations and potential future developments related to biased simulations may help the reader get a better idea of the significance of this work.

      - The authors presented 3 homologs in this study: IIDs of Reston, Zaire, and Marburg viruses. While Zaire and Reston are relatively similar in terms of sequence (Figure S1), the sequences clearly differ between Marburg and the two other viruses. Can the authors indicate a similarity/identity score for each sequence alignment and extend Figure S1 to properly compare the Marburg sequence with Reston and Zaire? Can they also discuss how these differences may impact the comparison of the three IIDs? This may also help the reader to understand why the authors sometimes compare the three viruses and sometimes focus only on comparing Zaire and Reston.

      We would like to thank the reviewer for raising this point, and we agree that additional details about the sequence comparison provide more context for the choices of substitutions we made. Therefore, we have updated Fig S1 to include a detailed pairwise comparison of all the IID sequences, including the percentage sequence similarity and identity. We have also added the following sentences to the results section where we first introduce the substitutions between Zaire and Reston IIDs:

      “While the sequence of Marburg IID differs significantly from Reston and Zaire IIDs with a sequence identity of 42% and 45% respectively (Fig S1), the sequences of Reston and Zaire IID are 88% identical and 94% similar. Particularly, substitutions between these homologs are all distal to the RNA-binding interfaces and all the residues known to make contacts with dsRNA from structural studies are identical. Therefore, we reasoned that comparing these two homologs would help us identify minimal substitutions that control pocket opening probability and allow us to study its effect on dsRNA binding with minimal perturbation of other factors.”

      - In this work, the authors mentioned the cryptic pocket but only illustrated the opening of this pocket by using a simple distance between residues (Figure 2) and the SASA of one cysteine (Figure 3). In previous work by the authors (Cruz et al., Nature Communications, 2022), they better characterized the residues involved in RNA binding and in forming the cryptic pocket. Thus, would it be possible to better describe this cryptic pocket (residues involved, volume, etc.) and better explain how, structurally speaking, it can affect the RNA binding mode (blunt ends vs backbone)?

      We thank the reviewer for pointing out the need for clarification on the residues involved in RNA binding and pocket opening and the mechanism linking them. We have performed the CARDS analysis on Reston and Marburg IID simulations as we had done on Zaire IID simulations in Cruz et al, 2022. The results are shown in Fig S3 and discussed in the main text in the first results section.

      - As a counter-example, the authors used C315 for SASA calculation and thiol labeling (Figure 3). This cysteine is mainly buried, as seen by SASA for Reston and Marburg and by thiol labeling (Figure 3 E,G,H). Would it be possible to also get thiol labeling rates for Cysteine 264 in Reston and its equivalents, to see a case where the residue is solvent exposed?

      We have shown the SASA for C264 from the simulations in Fig S4 and the thiol labeling rates for all 4 cysteines in Reston IID in Fig S6. Comparing these rates to the rates of all 4 cysteines obtained for Zaire IID (Fig 4 in Cruz et al., 2022), we observe that the rates for C264, which is expected to be exposed, are significantly faster than those of C315, which is largely buried in all variants.

      - I strongly support the authors' willingness to share their data by depositing them in an OSF repository. These data helped this reviewer to assess some of the results produced by the authors and to better understand the dynamics of their respective systems. I have just a few comments that need to be addressed regarding these data:
        o While there are data for WT Reston and Marburg, there are no data for Zaire. Is this because these data correspond to the previous work (Cruz et al. 2022) (in this case, it would be good to make this clear in the main text) or is it an omission?
        o There is no center.xtc file in the Marburg-MSM directory.
        o There is no protmasses.pdb in the Reston-MSM directory.

      - In general, if possible, it would be good to use the same name for each type of file presented in each directory to help a potential user understand a bit more how to use these data.

      - If possible, adding a bit more metadata and explanation on the OSF webpage would be very beneficial to help others find these data. To help in this direction, the authors may have a look at the guidelines presented at the end of this article: https://elifesciences.org/articles/90061

      We thank the reviewer for pointing out the omissions from the OSF repository. We have added the missing files and followed a uniform naming convention. We have also added documentation in the metadata section of the OSF repository to help others use the data.  

      Indeed, the simulation data used for Zaire IID is available on the OSF repository corresponding to Cruz et al. 2022 at https://osf.io/5pg2a. We have also clarified this in the data availability section of the main text.  

      Minor point:

      In Figure 2, there is a slight bump for the 225-295 distance around 1 nm for Reston. Can the authors comment on it? As these results are based on long AS, do the authors think this population, even if very small, is significant?

      Comparing the probability distributions obtained from bootstrapping the frames used to calculate the MSM equilibrium probabilities (Revised Fig1), we observe that the bump in the Reston IID distribution persists in all bootstraps, indicating that it might indeed be significant. This is also consistent with our observation that cysteine 296 does get fully labeled in our thiol labeling experiments, albeit significantly more slowly than in the other homologs.

      Reviewer #2 (Recommendations for the authors):

      I recommend that the authors implement moderate revisions prior to the publication of this research article, addressing the identified weaknesses (see below).

      The authors should provide a rationale for their selection of the amber03 force field (Duan et al., JCTC 24, 1999-2012, 2003) for molecular dynamics simulations, particularly given the availability of more recent and optimized versions of the AMBER force fields. These newer force fields may offer improved parameterization for biomolecular systems, potentially enhancing the accuracy and reliability of the simulation results.

      We chose the Amber03 force field because it has performed well in much of our past work, including the original prediction of the cryptic pocket that we study in this manuscript. The results presented in this manuscript also demonstrate the predictive power of Amber03.

      Additionally, while the authors utilized solvent-accessible surface area (SASA) for cryptic pocket analysis, volume-based approaches may be more suitable for this purpose. Several studies (e.g., Sztain et al. J. Chem. Inf. Model. 2021, 61, 7, 3495-3501) have demonstrated the utility of volume analysis in identifying and characterizing cryptic pockets. The authors could consider incorporating such methodologies to provide a more comprehensive assessment of pocket dynamics.

      The authors propose that the cryptic pocket is not merely a random structural feature but functions as an allosteric regulator of dsRNA binding. To further substantiate this claim, an in-depth analysis of this allosteric effect using for instance network analysis could significantly enhance the study. Such an approach could identify key residues and interaction networks within the protein that mediate the allosteric regulation. This type of mechanistic insight would not only provide a stronger theoretical framework but also offer valuable information for the rational design of therapeutic interventions targeting the cryptic pocket.  

      We thank the reviewer for pointing out the need for clarification on the molecular mechanism linking the opening of the cryptic pocket to RNA binding. We have performed the CARDS analysis on Reston and Marburg IID simulations as was done on Zaire IID simulations in Cruz et al, 2022. The results are shown in Fig S3 and discussed in the main text in the first results section. Briefly, we do find a community (blue) comprising the pocket residues in Reston and Marburg IIDs as we did in Zaire. Similarly, we find that many of the RNA binding residues fall into the orange and green communities as in Zaire. However, there are differences in exactly which residues are clustered into which of these two communities. There are also differences in how strongly connected these communities are in the three homologs. Therefore, while we can conclude that pocket residues likely have varying influence on the RNA binding residues in the homologs, it is hard to say exactly what that variation is from this analysis alone.  

      Reviewer #3 (Recommendations for the authors):

      - MD simulations: All simulations were initialised from the 3 crystal structures, is that correct? In all cases, dsRNA was not included in the simulations, right? Were crystallographic Mg ions in the vicinity of the binding site included? These are known to influence structural dynamics to a large extent.

      All simulations were indeed initialized using only protein atoms from the crystal structures 3FKE, 4GHL, and 3L2A. Therefore, crystallographic Mg ions were not included in the simulations. However, we do agree with the reviewer and think that the effect of parameters such as salt concentration, specifically Mg ions which are known to be important for the stability of dsRNA, on the pocket opening equilibrium merits detailed study in future work.

      - Figure 2: Would it be possible to perform e.g. a block error analysis and show the statistical errors of the distributions?

      We agree that showing the statistical variation in the MSM equilibrium probabilities is important for comparing the different distributions. Therefore, we have updated Figs 2 and 5 to show the distributions obtained from MSMs constructed using 100 and 10 random samples of the data respectively to indicate the extent of the statistical variability in the MSM construction.  
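
      As an illustration of the kind of resampling discussed above, the following is a minimal sketch only. It does not build an MSM and is not the authors' pipeline: it bootstraps whole trajectories and histograms a pooled distance, using synthetic data. The function names, bin edges, and the Gaussian stand-in data are all hypothetical.

```python
import random
import statistics


def histogram_density(values, edges):
    """Normalized histogram over equally spaced edges (integrates to 1)."""
    counts = [0] * (len(edges) - 1)
    lo, hi = edges[0], edges[-1]
    width = edges[1] - edges[0]  # assumption: equally spaced bin edges
    for v in values:
        if lo <= v < hi:
            idx = min(int((v - lo) / width), len(counts) - 1)
            counts[idx] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no values fall inside the bin range")
    return [c / (total * width) for c in counts]


def bootstrap_distribution(trajs, edges, n_boot=100, seed=0):
    """Resample whole trajectories with replacement; return per-bin mean and std."""
    rng = random.Random(seed)
    hists = []
    for _ in range(n_boot):
        picks = [rng.choice(trajs) for _ in trajs]
        pooled = [v for traj in picks for v in traj]
        hists.append(histogram_density(pooled, edges))
    per_bin = list(zip(*hists))
    mean = [statistics.fmean(b) for b in per_bin]
    err = [statistics.stdev(b) for b in per_bin]
    return mean, err


# Synthetic stand-in for per-trajectory inter-residue distances (nm).
data_rng = random.Random(1)
trajs = [[data_rng.gauss(0.6, 0.1) for _ in range(200)] for _ in range(20)]
edges = [0.05 * i for i in range(31)]  # 0.0 to 1.5 nm in 0.05 nm bins
mean, err = bootstrap_distribution(trajs, edges)
```

      Resampling whole trajectories rather than individual frames is a deliberate choice in this sketch, since frames within a trajectory are correlated; the per-bin standard deviation across resamples then serves as a rough error bar.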

      - More detailed structural biology experiments (such as NMR or HDX-MS) could potentially shed more light on the differential behaviour of the three different homologs, providing more evidence for the presence of the cryptic pocket.

      We agree that NMR and HDX-MS are powerful means to study dynamics and are actively exploring these approaches for our future work.

    1. eLife Assessment

      This important study offers convincing evidence that intra-Golgi transport slows from cis to trans and varies between cargos even within the same cisternae, supporting a more stable compartment model. Using nocodazole-induced ministacks, the authors show cargo-specific transport kinetics with distinct velocities and residence times. These findings refine the cisternal progression model and prompt further investigation into alternative mechanisms, such as rapid partitioning or rim progression. This study will be of interest to cell biologists studying membrane trafficking, Golgi organization, and protein secretion, as well as researchers investigating the mechanisms of organelle dynamics and the molecular basis of intracellular transport.

    2. Reviewer #1 (Public review):

      Summary:

      In the manuscript by Tie et al., the authors couple the methodology that they developed to measure the LQ (localization quotient) of proteins within the Golgi apparatus with RUSH-based cargo release to quantify the speed of different cargos traveling through Golgi stacks in nocodazole-induced Golgi ministacks, in order to differentiate between the cisternal progression and stable compartment models of the Golgi apparatus. The debate between the cisternal progression model and the stable compartment model has been intense and ongoing for decades, and resolving it is important for understanding the basic function and organization of the Golgi apparatus. In the stable compartment model, cisternae are stable structures, and cargo moves along the Golgi apparatus in vesicular carriers. In the cisternal progression model, Golgi cisternae themselves mature, acquiring new identity from the cis face to the trans face, and act as transport carriers themselves. In this work, the authors provide a missing piece regarding the intra-Golgi transport speed of different cargoes as well as the speed of TGN exit and, based on the differences in transport velocities among the cargoes tested, favor a stable compartment model. The authors argue that if there is cisternal progression, all cargoes should have a similar intra-Golgi transport speed, which is essentially the rate at which the Golgi cisternae mature. Furthermore, using a combination of BFA and nocodazole treatments, the authors show that the compartments remain stable in cells for at least 30-60 minutes after BFA treatment.

      Strengths:

      The method to accurately measure the localization of a protein within the Golgi stack was rigorously tested in previous publications from the same authors and, in combination with pulse-chase approaches, has been used to quantify transport velocities of cargoes through the Golgi. This is a novel aspect of this paper, and the differences in intra-Golgi velocities among the cargoes tested make a case for a stable compartment model.

      Weaknesses:

      None noted in the revised version of the manuscript.

    3. Reviewer #2 (Public review):

      Summary:

      This manuscript describes the use of quantitative imaging approaches, which have been a key element of the lab's work over the past years, to address one of the major unresolved discussions in trafficking: intra-Golgi transport. The approach has been clearly described in the lab's previous papers, and is thus clearly described here. The authors clearly address the weaknesses in this manuscript and do not overstate the conclusions drawn from the data. The only weakness not addressed is the concept of blocking COPI transport with BFA, which is a strong inhibitor and causes general disruption of the system. This is an interesting element of the paper, which I think could be improved upon by using more specific COPI inhibitors instead, although I understand that this is not necessarily straightforward.

      I commend the authors on their clear and precise presentation of this body of work, incorporating mathematical modelling with a fundamental question in cell biology. In all, I think that this is a very robust body of work, that provides a sound conclusion in support of the stable compartment model for the Golgi.

      General points:

      The manuscript contains a lot of background in its results sections, and the authors may wish to consider rebalancing the text: The section beginning at Line 175 is about 90% background and 10% data. Could some data currently in supplementary be included here to redress this balance, or this part combined with another?

      Minor points:

      Equation 2: A should be in front of the ln2. It's already resolved in equation 3, so likely only needs changing in the text.

      Line 152: Why is there a lack of experimental data? High ER background and low Golgi signal make it difficult to select ministacks: it would be good to see examples of these images. Is 0 a relevant timepoint, as cargo is still at the ER? Instead, would a timepoint <5' better demonstrate initial arrival of fast cargo, with 0' discarded?

      Table 1 Line 474: 1-3 independent replicates: is there a better way of incorporating this into the table to make it more streamlined? It would be useful to see each cargo as a mean with error. Is there a more demonstrative way to present the table, for example (but does not have to be) fastest cargo first (Tintra) as in Table 2?

      Line 264 / Fig 3B: It's unclear to me why the VHH-anti-GFP-mCherry internalisation approach was used, when the cells were expressing GFP, that could be used for imaging. Also, this introduces a question over trafficking of the VHH itself, to access the same compartments as the GFP-proteins are localised. It would be useful to describe the choice of this approach briefly in the text.

      Line 446: Typo "internalization"

      Post-Revision

      I thank the authors for their work revising the paper in light of our comments. I am satisfied with their response, and I have no other comments.

    4. Reviewer #3 (Public review):

      The manuscript by Tie et al. provides a quantitative assessment of intra-Golgi transport of diverse cargos. Quantitative approaches using fluorescence microscopy of RUSH-synchronized cargos, namely GLIM and measurement of Golgi residence time, previously developed by the authors' team (publications from 2016 to 2022), are used here.

      Most of the results have been already published by the same team in 2016, 2017, 2020 and 2021. In this manuscript, the authors have put together measurement of intra-Golgi transport kinetics and Golgi residence time of many cargos. The quantitative results are supported by a large number of Golgi mini-stacks/cells analyzed. They are discussed with regard to the intra-Golgi transport models being debated in the field, namely the cisternal maturation/progression model and the stable compartments model.

      The authors show that different cargos have distinct intra-Golgi transport kinetics and that the Golgi residence time of glycosyltransferases is high. From this and experiment using brefeldinA, the authors suggest that the rim progression model, adapted from the stable compartments model, fits with their experimental data.

      Strengths:

      The major strength of this manuscript is that it brings together many quantitative results that the authors previously obtained and discusses them to advance our understanding of intra-Golgi transport mechanisms.

      The analysis of intra-Golgi transport by fluorescence microscopy is tough, and this is a tour de force by the authors, even though their approach has limitations, which are clearly stated. Their work is remarkable with regard to the number of Golgi markers and secretory cargos that have been analyzed.

      Weaknesses:

      Most of the data provided here were already published and thus accessible for the community. The tubular connections between cisternae and the diffusion/biochemical properties of cargos are not taken into account to interpret the results. Indeed, tubular connections and biochemical properties of the cargos may affect their transit through the Golgi and the kinetics with which they reach the TGN for Golgi exit.

      The use of nocodazole might affect cellular homeostasis but this is clearly stated by the authors and is acceptable as we need to perturb the system to conduct this analysis.

      The manual selection of the Golgi mini-stack being analyzed (where the cargo and the Golgi reference markers are clearly detectable ) might introduce a bias in the analysis.

    5. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In the manuscript by Tie et al., the authors couple the methodology that they developed to measure the LQ (localization quotient) of proteins within the Golgi apparatus with RUSH-based cargo release to quantify the speed of different cargos traveling through Golgi stacks in nocodazole-induced Golgi ministacks, in order to differentiate between the cisternal progression and stable compartment models of the Golgi apparatus. The debate between the cisternal progression model and the stable compartment model has been intense and ongoing for decades, and resolving it is important for understanding the basic function and organization of the Golgi apparatus. In the stable compartment model, cisternae are stable structures, and cargo moves along the Golgi apparatus in vesicular carriers. In the cisternal progression model, Golgi cisternae themselves mature, acquiring new identity from the cis face to the trans face, and act as transport carriers themselves. In this work, the authors provide a missing piece regarding the intra-Golgi transport speed of different cargoes as well as the speed of TGN exit and, based on the differences in transport velocities among the cargoes tested, favor a stable compartment model. The authors argue that if there is cisternal progression, all cargoes should have a similar intra-Golgi transport speed, which is essentially the rate at which the Golgi cisternae mature. Furthermore, using a combination of BFA and nocodazole treatments, the authors show that the compartments remain stable in cells for at least 30-60 minutes after BFA treatment.

      Strengths:

      The method to accurately measure the localization of a protein within the Golgi stack was rigorously tested in previous publications from the same authors and, in combination with pulse-chase approaches, has been used to quantify transport velocities of cargoes through the Golgi. This is a novel aspect of this paper, and the differences in intra-Golgi velocities among the cargoes tested make a case for a stable compartment model.

      Weaknesses:

      Experiments were performed in only one cell line (HeLa cells), and the data are predominantly derived from an experimental paradigm using RUSH assays, in which a secretory cargo is released in a wave (not the most physiological condition); additional approaches would therefore make a more compelling case for the model.

      We have added datasets from 293T cells in the revamped manuscript.

      Reviewer #2 (Public Review):

      Summary:

      This manuscript describes the use of quantitative imaging approaches, which have been a key element of the lab's work over the past years, to address one of the major unresolved discussions in trafficking: intra-Golgi transport. The approach has been clearly described in the lab's previous papers, and is thus clearly described here. The authors clearly address the weaknesses in this manuscript and do not overstate the conclusions drawn from the data. The only weakness not addressed is the concept of blocking COPI transport with BFA, which is a strong inhibitor and causes general disruption of the system. This is an interesting element of the paper, which I think could be improved upon by using more specific COPI inhibitors instead, although I understand that this is not necessarily straightforward.

      I commend the authors on their clear and precise presentation of this body of work, incorporating mathematical modelling with a fundamental question in cell biology. In all, I think that this is a very robust body of work, that provides a sound conclusion in support of the stable compartment model for the Golgi.

      General points:

      The manuscript contains a lot of background in its results sections, and the authors may wish to consider rebalancing the text: The section beginning at Line 175 is about 90% background and 10% data. Could some data currently in supplementary be included here to redress this balance, or this part combined with another?

      In the revamped manuscript, we have moved the background information on rapid partitioning and rim progression models to the Introduction.

      Reviewer #3 (Public Review):

      The manuscript by Tie et al. provides a quantitative assessment of intra-Golgi transport of diverse cargos. Quantitative approaches using fluorescence microscopy of RUSH-synchronized cargos, namely GLIM and measurement of Golgi residence time, previously developed by the authors' team (publications from 2016 to 2022), are used here.

      Most of the results have been already published by the same team in 2016, 2017, 2020 and 2021. In this manuscript, very few new data have been added. The authors have put together measurements of intra-Golgi transport kinetics and Golgi residence time of many cargos. The quantitative results are supported by a large number of Golgi mini-stacks/cells analyzed. They are discussed with regard to the intra-Golgi transport models being debated in the field, namely the cisternal maturation/progression model and the stable compartments model. However, over the past decades, the cisternal progression model has been mostly accepted thanks to many experimental data.

      The authors show that different cargos have distinct intra-Golgi transport kinetics and that the Golgi residence time of glycosyltransferases is high. From this and the experiment using brefeldinA, the authors suggest that the rim progression model, adapted from the stable compartments model, fits with their experimental data.

      Strengths:

      The major strength of this manuscript is that it brings together many quantitative results that the authors previously obtained and discusses them to give food for thought about the intra-Golgi transport mechanism.

      The analysis of intra-Golgi transport by fluorescence microscopy is tough, and this is a tour de force by the authors, even if their approach shows limitations, which are clearly stated. Their work is remarkable with regard to the number of Golgi markers and secretory cargos that have been analyzed.

      Weaknesses:

      As previously mentioned, most of the data provided here were already published and thus accessible for the community. Is there a need to publish them again?

      The authors' discussion of the intra-Golgi transport model is rather simplistic. In the introduction, there is no mention of the most recent models, namely the rapid partitioning and the rim progression models. In my opinion, the tubular connections between cisternae and the diffusion/biochemical properties of cargos are not sufficiently taken into account to interpret the results. Indeed, tubular connections and biochemical properties of the cargos may affect their transit through the Golgi and the kinetics with which they reach the TGN for Golgi exit.

      Nocodazole is being used to form Golgi mini-stacks, which are necessary to allow intra-Golgi measurement. The use of nocodazole might affect cellular homeostasis but this is clearly stated by the authors and is acceptable as we need to perturb the system to conduct this analysis. However, the manual selection of the Golgi mini-stack being analyzed raises a major concern. As far as I understood, the authors select the mini-stacks where the cargo and the Golgi reference markers are clearly detectable and separated, which might introduce a bias in the analysis.

      The term 'Golgi residence time' is used, but it corresponds to the residence time in the trans-cisterna only, as the cargo has been accumulated in the trans-Golgi by a 20°C block. The kinetics of disappearance of the protein of interest is then monitored after a 20°C to 37°C switch.

      Another concern also lies in the differences that would be introduced by different expression levels of the cargo on the kinetics of their intra-Golgi transport and of their packaging into post-Golgi carriers.

      Please see below for our replies to intra-Golgi transport models, the Golgi residence time, and different expression levels of cargos.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      The data shown by the authors to measure differential intra Golgi velocities based on previously established methodology make a case for a stable compartment model, however more data is needed to make a complete story and the clarity of presentation can be improved.

      We sincerely appreciate the reviewer's insightful, detailed, and constructive feedback. Your thoughtful comments have helped us refine our analyses, clarify key points, and strengthen the overall quality of our manuscript. We are grateful for the time and effort you have dedicated to reviewing our work and providing valuable suggestions. Your input has been instrumental in improving both the scientific rigor and presentation of our findings. Thank you for your thorough and thoughtful review.

      Main points:

      (1) Along with the studies in yeast, which the authors describe in this paper, the main evidence for the cisternal maturation model in mammalian cells comes from Bonfanti et al. (https://doi.org/10.1016/S0092-8674(00)81723-7), which used EM to visualize a wave of collagen through Golgi stacks. It is therefore important that this work include collagen as one of the cargos tested. Can the authors use the RUSH-Col1AGFP (see: https://doi.org/10.1083/jcb.202005166) as a cargo to monitor intra-Golgi velocities?

      I understand that HeLa cells are not professional collagen-secreting cells, but the authors can use U2OS cells to measure collagen export and two other extreme (slow and fast) cargos to validate that the same trend in intra-Golgi transport velocities is seen in other cell lines. This will address three concerns: a. This is not a HeLa-specific phenomenon; b. Transport of large cargoes like collagen agrees with their proposal; c. To see if the same cargo has the same (similar) intra-Golgi speed and the trend between different cargoes is conserved across cell lines.

      Due to the difficulty of manipulating and imaging the procollagen-I RUSH reporter, we selected the collagenX-RUSH reporter (SBP-GFP-collagenX) instead. Our previous study (Tie et al., eLife, 2018) demonstrated that SBP-GFP-collagenX assembles as a large molecular weight particle, each having ~ 190 copies of SBP-GFP-collagenX. With an estimated mean size of ~ 40 nm, these aggregates are not as large as FM4 aggregates and procollagen-I (> 300 nm) and, therefore, are not excluded from conventional transport vesicles, which typically have a size of 50 – 100 nm. However, collagenX has distinct intra-Golgi transport behaviour from conventional secretory cargos: while conventional secretory cargos localize to the cisternal interior, collagenX partitions to the cisternal rim (Tie et al., eLife, 2018).

      We studied the intra-Golgi transport of SBP-GFP-collagenX in HeLa cells via GLIM and side averaging. The new results are included in Figure 3 of the revamped manuscript. CollagenX has similar intra-Golgi transport kinetics as conventional secretory cargos, displaying the first-order exponential function in LQ vs. time and velocity vs. time plots.

      The side-averaging images are consistent with previous and current results. CollagenX displays a double-punctum during the intra-Golgi transport, indicating a cisternal rim localization, as expected for large secretory cargos. Therefore, our new data demonstrate that cisternal rim-partitioned large secretory cargos might follow intra-Golgi transport kinetics similar to those of cisternal interior-partitioned conventional secretory cargos.

      We tried SBP-GFP-CD59 and SBP-GFP-Tac-TC, cargos with fast and slow intra-Golgi transport velocities, respectively, in 293T cells. Results are included in Figure 2, Supplementary Figure 2, and Table 1 of the revamped manuscript. We found that SBP-GFP-Tac-TC showed similar t<sub>intra</sub>s, 17 and 14 min, respectively, in HeLa and 293T cells. Considering our previous finding that glycosylation has an essential role in the Golgi exit (Sun et al., JBC, 2020), the distinct intra-Golgi transport kinetics of SBP-GFP-CD59 (t<sub>intra</sub>s, 13 and 5 min, respectively, in HeLa and 293T cells) might be due to its distinct luminal glycosylation between HeLa and 293T cells. Supporting this hypothesis, SBP-GFP-Tac-TC does not have any glycosylation sites due to the truncation of the Tac luminal domain.

      (2) RUSH assay has its own caveats which authors also refer to in the manuscript. Authors should test their model by using pulse chase approaches by SNAP tagged constructs which will allow them to do pulse chase assays without the requirement to release cargo as a wave (see: doi: 10.1242/jcs.231373). It is not necessary to test all the cargoes but the two on the ends of the spectrum (slow and fast). To avoid massive overexpression, authors could express the proteins using weaker promoters. Authors could also use this approach to simultaneously measure the two cargoes by tagging them with CLIP and SNAP tags and doing the pulse chase simultaneously (see: DOI: 10.1083/jcb.202206132). In this case it may be difficult to stain both GM130 and TGN, but authors could monitor the rate of segregation from the GM130 signal.

      During the RUSH assay, the sudden release of a large amount of secretory reporters does not occur under native secretory conditions and, consequently, might introduce artifacts. The reviewer suggests using pulse-chase labeling of SNAP (or CLIP)-tagged secretory cargos, which occurs in a steady state and hence more closely resembles native secretory transport. This is an excellent suggestion. However, we have not yet tested this method due to the following concerns.

      The standard protocol involves blocking existing reporters, pulse-labeling newly synthesized reporters, and chasing their movement along the secretory pathway. However, the typical 20-minute pulse labeling period used in the two references would be too long, as a substantial portion of the reporters would already reach the trans-Golgi or exit the Golgi before the chase begins. Conversely, reducing the pulse labeling time would significantly weaken the GLIM signal.

      (3) While the intra-Golgi velocities are different for different cargoes tested, authors should show a control that the arrival of the cargoes from ER to the cis-Golgi follows similar kinetics or if there are differences there is no correlation with the intra-Golgi velocities. In other words, do cargoes which show slow intra-Golgi velocities also take more time to reach the cis-Golgi and vice versa.

      In nocodazole-induced Golgi ministacks, the ER exit site, ERGIC, and cis-Golgi are spatially closely associated. At the earliest measurable time point—5 minutes after biotin treatment— we observed that the secretory cargo had already reached the cis-Golgi (Figure 2 and Supplementary Figure 2). The rapid ER-to-cis-Golgi transport exceeds the temporal resolution of our current protocol, making it difficult to address the reviewer’s question (see our reply to Minor Points (2) of Reviewer #2 for more detailed discussion on this).

      (4) Were the different cargos traveling (at different speeds) through Golgi at the rims, or in the middle of ministack, or by vesicles?

      Please also refer to our reply to Question 1 of Reviewer #1. For the nocodazole-induced Golgi ministack, we previously investigated the lateral cisternal localization of RUSH secretory reporters using our en face average imaging (Tie et al., eLife, 2018). We found that small or conventional cargos (such as CD59 and E-cadherin) partition to the cisternal interior while large cargos (collagenX and FM4-CD8a) partition to the cisternal rim during their intra-Golgi transport. Using GLIM, we showed that the intra-Golgi transport kinetics of collagenX is similar to that of small cargos as both follow the first-order exponential function (Figure 3A-C). Therefore, cisternal rim partitioned large size secretory cargos might have intra-Golgi transport kinetics similar to those of cisternal interior partitioned conventional secretory cargos.

      (5) Figure 4, under both nocodazole and BFA treatment for 30 min, would the stacks have the same number (274 nm per LQ) as thickness? Or does it shrink a little, considering that extended BFA treatment reduced intact Golgi ministacks? This is important to understand the LQ numbers of those Golgi proteins. Besides, can they include one ERGIC marker in this assay, would it be approaching cis-Golgi? Images used for quantification in Figure 4 should be shown in the main figure.

      We define the axial size of the Golgi ministack as the axial distance from the GM130 to the GalT-mCherry, d<sub>(GM130-GalT-mCherry)</sub>, measured using the Gaussian centers of their line intensity profiles. As the reviewer suggested, we measured the axial size of the ministack during the nocodazole and BFA treatment. Indeed, we found a decrease in the ministack axial size from 300 ± 10 nm at 0 min to 190 ± 30 nm at 30 min of BFA treatment. This observation is further confirmed by our side average imaging. The new data is presented in Fig. 6G.

      Our study focuses on changes in the organization of the Golgi ministack. So, we didn’t include ERGIC53 in the current analysis. Instead, we quantified the axial distance between GalT-mCherry and CD8a-furin, d<sub>(GalT-mCherry-CD8a-furin)</sub>, and found that it decreased from 200 ± 20 nm at 0 min to 100 ± 30 nm at 30 min of BFA treatment, suggesting the collapse of the TGN. The collapse of the TGN is further visualized by our side average imaging. The new data is presented in Fig. 6H.

      Therefore, our new data demonstrates that the Golgi ministack shrinks, and the TGN collapses under BFA treatment.

      Minor points:

      (1) The LQ data come from confocal/airy scan images, but no such images were shown in this paper. The authors can't assume every reader to have prior knowledge of their previous work. It will be beneficial to have one example image and how the LQ was measured.

      As advised by the reviewer, we have prepared Supplementary Figure 1 to provide a brief illustration of the principle behind GLIM and image processing steps involved.

      (2) The cargos used in this paper need to be introduced: what are they, how were they used in previous literature. Especially the furin constructs come out of the blue (also see point 7).

      As suggested by the reviewer, we have included a schematic diagram in Fig. 1 of the revised manuscript to illustrate all RUSH reporters and their corresponding ER hooks. In this diagram, we also highlight the key sequence differences in the cytosolic tails of different furin mutants.

      Additionally, we have added references for each RUSH reporter at the beginning of the Results and Discussion section.

      (3) There are two categories of exocytosis, constitutive and regulated. It is important to state that the phenomenon observed is in cells predominantly showing only constitutive secretion.

      As the reviewer advised, we have added the following sentences in the section titled “Limitations of the study”.

      “Third, all RUSH reporters used in this study are constitutive secretory cargos. As a result, the intra-Golgi transport dynamics observed here might not reflect those of regulated secretion, which involves the synchronized release of a large quantity of cargo in response to a specific signal.”

      (4) All the cargoes show a progressive reduction in instantaneous velocities from cis to medial to trans. Authors should discuss how do they mechanistically explain this. Is the rate of vesicle production progressively decreasing from cis to trans and if so, why?

      As our imaging methods cannot differentiate vesicles from the cisternal rim, we could not tell if the vesicle production rate had changed during the intra-Golgi transport. We have provided an explanation of the progressive reduction of the intra-Golgi transport velocity in the Results and Discussion section. Please see the text below.

      “The progressive reduction in intra-Golgi transport of secretory cargo might result from the enzyme matrix's retention at the trans-Golgi. As the secretory cargos progress along the Golgi stack from the cis to the trans-side, more and more cargos become temporarily retained in the trans-Golgi region, gradually reducing their overall intra-Golgi transport velocity. If the release or Golgi exit of these cargos from the enzyme matrix follows a constant probability per unit time, i.e., a first-order kinetics process, the rate of cargo exiting from the Golgi should follow the first-order exponential function. Since the mechanism underlying intra-Golgi transport kinetics reflects fundamental molecular and cellular processes of the Golgi, further experimental data are essential to rigorously test this hypothesis.”
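
      The first-order argument in the quoted passage can be written out as a short worked derivation. This is only a sketch: the symbols N, k, A and the exact functional form are introduced here for illustration and are assumptions, not the manuscript's Equations 2 and 3.

```latex
% Assumption: each retained cargo molecule leaves with a constant
% probability k per unit time (first-order kinetics).
\begin{align}
  \frac{dN}{dt} &= -k\,N(t)
    && \text{constant exit probability per unit time} \\
  N(t) &= N_0\, e^{-k t}
    && \text{cargo still retained at time } t \\
  t_{1/2} &= \frac{\ln 2}{k}
    && \text{time for half of the cargo to leave} \\
  \mathrm{LQ}(t) &= A\left(1 - e^{-k t}\right)
    && \text{assumed first-order rise to a plateau } A \\
  v(t) = \frac{d\,\mathrm{LQ}}{dt} &= A\,k\,e^{-k t}
    = \frac{A \ln 2}{t_{1/2}}\; 2^{-t/t_{1/2}}
    && \text{velocity decays with the same rate constant}
\end{align}
```

      Under this assumed form, the velocity is largest at the cis-side and decays exponentially toward the trans-side, which is the qualitative trend described above.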

      (5) The supp file 1 nicely listed the raw data for plotting, and n for numbers of ministacks. Could the authors also show number of cells or experiment repeats?

      In the revamped version of the Supplementary File 1, we have added the cell number for each LQ measurement.

      (6) This recent work used novel multiplexing methods to show that nocodazole-treated cells had similar protein organization as in control may be cited. It also showed the effect of BFA. https://www.cell.com/cell/abstract/S0092-8674(24)00236-8.

      We have added this reference to the Introduction section to support that nocodazole-induced Golgi ministacks have a similar organization as the native Golgi. However, our BFA treatment was combined with the nocodazole treatment, while this paper’s BFA treatment does not contain nocodazole.

      (7) Figure 1G-J, authors should show a schematic to show the difference between different furin constructs. Also, LQ values in Fig 1I start from 1. Authors may need to include even earlier timepoints.

      As suggested by the reviewer, we have shown the domain organization of wild type and mutant furin RUSH reporters in Figure 1, highlighting key amino acids in the cytosolic tail. Please also see our reply to Minor Points (2) of Reviewer #1.

      In the revised manuscript, Fig. 1I (SBP-GFP-CD8a-furin-AC #1) has been updated to become Fig. 2J. In this dataset, the first time point was selected at a relatively late stage (20 min), resulting in an initial LQ value of 0.92. However, this should not pose an issue, as SBP-GFP-CD8a-furin-AC reaches a plateau of ~ 1.6. The number of data points is sufficient to capture the rising phase and fit the first-order exponential function curve with an adjusted R<sup>2</sup> = 0.99. Furthermore, we have four independent datasets in total on the intra-Golgi transport of SBP-GFP-CD8a-furin-AC (#1-4), demonstrating the consistency of our measurements.
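
      To illustrate why a late first time point need not prevent a first-order fit, the following is a minimal sketch, not the authors' fitting procedure. It assumes a model of the form LQ(t) = offset + amplitude × (1 − exp(−ln2 · t / t_half)); the function name `fit_first_order`, the grid-search approach, and all numbers are hypothetical, and the data are synthetic rather than taken from the manuscript.

```python
import math

def fit_first_order(times, lq, t_half_grid):
    """Fit LQ(t) = offset + amplitude * (1 - exp(-ln2 * t / t_half)).

    t_half is found by grid search; for each candidate t_half the offset
    and amplitude follow from ordinary linear regression of LQ on g(t).
    """
    n = len(times)
    y_mean = sum(lq) / n
    best = None
    for t_half in t_half_grid:
        k = math.log(2) / t_half
        g = [1.0 - math.exp(-k * t) for t in times]
        g_mean = sum(g) / n
        var_g = sum((gi - g_mean) ** 2 for gi in g)
        if var_g == 0:
            continue  # degenerate candidate, regression undefined
        amplitude = sum((gi - g_mean) * (yi - y_mean)
                        for gi, yi in zip(g, lq)) / var_g
        offset = y_mean - amplitude * g_mean
        sse = sum((offset + amplitude * gi - yi) ** 2
                  for gi, yi in zip(g, lq))
        if best is None or sse < best["sse"]:
            best = {"t_half": t_half, "amplitude": amplitude,
                    "offset": offset, "sse": sse}
    return best

# Synthetic, noise-free data (hypothetical parameters, NOT manuscript values).
# The first time point is deliberately late (20 min), as in the reply above.
TRUE_T_HALF, TRUE_AMPLITUDE, TRUE_OFFSET = 15.0, 1.5, 0.1
times = [20, 30, 45, 60, 90, 120, 150]
lq = [TRUE_OFFSET + TRUE_AMPLITUDE * (1 - math.exp(-math.log(2) * t / TRUE_T_HALF))
      for t in times]

grid = [x / 10 for x in range(10, 1001)]  # candidate t_half: 1.0 to 100.0 min
fit = fit_first_order(times, lq, grid)
print(fit["t_half"], round(fit["amplitude"], 3), round(fit["offset"], 3))
```

      With real, noisy data, a late first time point would widen the uncertainty on the fitted parameters; a dedicated nonlinear least-squares routine with confidence intervals would be the more usual choice.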

      (8) Figure 2A needs to show the data points, not just the lines.

      In the revamped manuscript, Fig. 2A has been updated to become Fig. 4A. The plot in Fig. 4A is calculated from Equation 3, so it does not have data points. However, t<sub>intra</sub> is calculated from the experimental LQ vs. t kinetic data.

      (9) Imaging and camera settings like exposure time, pixel size, etc should be reported in Methods.

      As suggested by the reviewer, we have supplied this information in the Materials and Methods section of the revised manuscript.

      (1) The exposure time and pixel size for the wide-field microscopy:

      “The image pixel size is 65 nm. The range of exposure time is 400 – 5000 ms for each channel.”

      (2) The exposure time and pixel size for the spinning disk confocal microscopy: “The image pixel size is 89 nm. The range of exposure time is 200 – 500 ms for each channel.”

      (3) The pixel dwelling time and pixel size for the Airyscan microscopy:

      “For side averaging, images were acquired under 63× objective (NA 1.40), zoomed in 3.5× to achieve 45 nm pixel size using the SR mode. The pixel dwelling time is 1.16 µs.”

      Reviewer #2 (Recommendations For The Authors):

      We sincerely appreciate the reviewer's insightful, detailed, and constructive feedback. Your thoughtful comments have helped us refine our analyses, clarify key points, and strengthen the overall quality of our manuscript. We are grateful for the time and effort you have dedicated to reviewing our work and providing valuable suggestions. Your input has been instrumental in improving both the scientific rigor and presentation of our findings. Thank you for your thorough and thoughtful review.

      Minor points:

      (1) Equation 2: A should be in front of the ln2. It's already resolved in equation 3, so likely only needs changing in the text

      As suggested by the reviewer, we have changed it accordingly.

      (2) Line 152: Why is there a lack of experimental data? High ER background and low Golgi signal make it difficult to select ministacks: it would be good to see examples of these images. Is 0 a relevant timepoint, as cargo is still at the ER? Instead, would a timepoint <5' better demonstrate the initial arrival of fast cargo, with 0' discarded?

      We observed that RUSH reporters typically do not exit the ER in < 5 min of biotin treatment, resulting in a high ER background and low Golgi signal. Example images of SBP-GFP-CD59 are shown below (scale bar: 10 µm). Possible reasons include: 1) the time required for biotin diffusion into the ER, 2) the time needed to displace the RUSH hook from the RUSH reporter, and 3) the time for recruitment of RUSH reporters to ER exit sites. As a result, we could not obtain LQs for time points earlier than 5 min during the biotin chase.

      Author response image 1.

      Despite the challenge in measuring LQs at early time points, 0 is still a relevant time point. At t = 0 min, RUSH reporters should be at the ER membrane near the ER exit site, a definitive pre-Golgi location along the Golgi axis, although we still don’t have a good method to determine its LQ.

      (3) Table 1 Line 474: 1-3 independent replicates: is there a better way of incorporating this into the table to make it more streamlined? It would be useful to see each cargo as a mean with error. Is there a more demonstrative way to present the table, for example (but does not have to be) fastest cargo first (Tintra) as in Table 2?

      As suggested by the reviewer, we revised Table 1. We calculated the mean and SD of t<sub>intra</sub> and arranged our RUSH reporters in ascending order based on their t<sub>intra</sub> values.
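
      The table revision described above (mean and SD per reporter, sorted in ascending order) can be sketched as follows. The reporter names and replicate values are hypothetical placeholders, not data from Table 1; since the reviewer notes 1-3 independent replicates, the sketch leaves the SD undefined when only one replicate exists.

```python
from statistics import mean, stdev

# Hypothetical replicate t_intra values in minutes (NOT manuscript data).
replicates = {
    "reporter_A": [12.0, 14.0, 13.0],
    "reporter_B": [5.0, 6.0],
    "reporter_C": [17.0],
}

def summarize(values):
    """Return (mean, sample SD); SD is None when there is a single replicate."""
    sd = stdev(values) if len(values) > 1 else None
    return mean(values), sd

summary = {name: summarize(values) for name, values in replicates.items()}

# Arrange reporters in ascending order of mean t_intra (fastest first).
ordered = sorted(summary.items(), key=lambda item: item[1][0])

for name, (m, sd) in ordered:
    sd_text = f"{sd:.1f}" if sd is not None else "n/a"
    print(f"{name}: mean = {m:.1f} min, SD = {sd_text} (n = {len(replicates[name])})")
```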

      (4) Line 264 / Fig 3B: It's unclear to me why the VHH-anti-GFP-mCherry internalisation approach was used, when the cells were expressing GFP, that could be used for imaging. Also, this introduces a question over trafficking of the VHH itself, to access the same compartments as the GFP-proteins are localised. It would be useful to describe the choice of this approach briefly in the text.

      Here, the surface-labeling approach is used to investigate if GFP-Tac-TC possesses a Golgi retrieval pathway after its exocytosis to the plasma membrane. When VHH-anti-GFP-mCherry is added to the tissue culture medium, it binds to the cell surface-exposed GFP-fused MGAT1, MGAT2, Tac, Tac-TC, CD8a, and CD8a-TC. Next, VHH-anti-GFP-mCherry traces the internalized GFP-fused transmembrane proteins. The surface-labeling approach has two advantages in this case. 1) It is much more sensitive in revealing the minor number of GFP-transmembrane proteins at the plasma membrane and endosomes, which are usually drowned in the strong Golgi and ER background fluorescence in the GFP channel. 2) While the GFP fluorescence distribution has reached a dynamic equilibrium, the surface labeling approach can reveal the endocytic trafficking route and dynamics.

      As the reviewer suggested, we added the following sentence to describe the choice of the cell-surface labeling – “By binding to the cell surface-exposed GFP, VHH-anti-GFP-mCherry serves as a sensitive probe to track the endocytic trafficking itinerary of the above GFP-fused transmembrane proteins”.

      Regarding the trafficking of VHH-anti-GFP-mCherry itself, in HeLa cells that do not express GFP-fused transmembrane proteins, VHH-anti-GFP-mCherry can be internalized by fluid-phase endocytosis. However, the fluid-phase endocytosis is negligible under our experimental condition, as we previously demonstrated (Sun et al., JCS, 2021; PMID: 34533190).

      (5) 446 Typo "internalization"

      It has been corrected.

      Reviewer #3 (Recommendations For The Authors):

      Below are my recommendations for the authors to improve their manuscript:

      We sincerely appreciate the reviewer's insightful, detailed, and constructive feedback. Your thoughtful comments have helped us refine our analyses, clarify key points, and strengthen the overall quality of our manuscript. We are grateful for the time and effort you have dedicated to reviewing our work and providing valuable suggestions. Your input has been instrumental in improving both the scientific rigor and presentation of our findings. Thank you for your thorough and thoughtful review.

      (1) Line 48: Tie et al. 2016 is cited. Please add references to original work showing that cargos transit from cis to trans Golgi cisternae.

      After reviewing the literature, we identified two references that provide some of the earliest morphological evidence of secretory cargo transit from the cis- to the trans-Golgi:

      (1) Castle et al, JCB, 1972; PMID: 5025103

      (2) Bergmann and Singer, JCB, 1983; PMID: 6315743

      The first study utilized pulse-chase autoradiographic EM imaging to track secretory protein movement, while the second employed immuno-EM imaging to observe the synchronized release of VSVGtsO45. Accordingly, we have removed Tie et al., 2016 and replaced it with these newly identified references.

      (2) I would suggest to cite earlier (in the Introduction) the rapid partitioning and rim progression models.

      As suggested, we have moved the rapid partitioning and rim progression models to the Introduction section.

      (3) Figure 1: LQ vs. time plot for SBP-GFP-CD8a-furinAC (panel I, 0.9 to 1.75 in 150 min) is different from Fig 7G of Tie et al. 2016 (LQ 0-1.5 in 100 min). Please comment on why those 2 sets of data are different.

      We appreciate the reviewer for pointing out this error. In our previous publication (Tie et al., MBoC, 2016), we presented a total of four datasets on SBP-GFP-CD8a-furin-AC. However, in the earlier version of our manuscript, we mistakenly listed only three datasets, inadvertently omitting Fig. 7G from Tie et al., MBoC, 2016.

      In the revised version, we have now included Fig. S2T (SBP-GFP-CD8a-furin-AC #4), which corresponds to Fig. 7G from Tie et al., MBoC, 2016.

      (4) As mentioned in the public review, I think measurement of the expression level of the cargos is necessary to compare their transport kinetics.

      The reviewer raises a valid concern that is challenging to address. All our data were obtained by imaging overexpressed reporters, and we assume that their overexpression does not significantly impact the Golgi or the secretory pathway. Our previous studies have demonstrated that overexpression does not substantially affect LQs (Figure S2 of Tie et al., MBoC, 2016, and Figure S1 of Tie et al., JCB, 2022).

      We acknowledge this concern as one of the limitations in our study at the end of our manuscript:

      “First, our approach relied on the overexpression of fluorescence protein-tagged cargos. The synchronized release of a large amount of cargo could significantly saturate and skew the intra-Golgi transport.” 

      (5) To my opinion, cisternal continuities would also affect retrograde transport (accelerate) (by diffusion for instance) and not only retrograde transport. Please comment on how this would affect intra-Golgi transport kinetics.

      We believe the reviewer is suggesting “cisternal continuities would also affect retrograde transport (accelerate) (by diffusion for instance) and not only anterograde transport.”

      Transient cisternal continuities have been reported to facilitate the anterograde transport of large quantities of secretory cargos (Beznoussenko et al., 2014; PMID: 24867214) (Marsh et al., 2004; PMID: 15064406) (Trucco et al., 2004; PMID: 15502824). However, we are not aware of any reports demonstrating that such continuities facilitate the retrograde transport of secretory cargo, although Trucco et al. (2004) speculated that Golgi enzymes might use these connections to diffuse bidirectionally (anterograde and retrograde direction). For this reason, we did not discuss this scenario in our manuscript.

      (6) Lines 188-190: I don't understand why the rapid partitioning model is excluded. Please detail more the arguments used for this statement.

      Below is the section from the Introduction that addresses the reviewer's question.

      “This model (rapid partitioning model) suggests that cargos rapidly diffuse throughout the Golgi stack, segregating into multiple post-translational processing and export domains, where cargos are packed into carriers bound for the plasma membrane. Nonetheless, synchronized traffic waves have been observed through various techniques, including EM (Trucco et al., 2004) and advanced light microscopy methods we developed, such as GLIM and side-averaging (Tie et al., 2016; Tie et al., 2022). These findings suggest that the rapid partitioning model might not accurately represent the true nature of the intra-Golgi transport.”

      (7) I would suggest replacing the 'Golgi residence time' by another name as it reflects mainly the time of Golgi exit if I am not mistaken.

      We believe the term “Golgi residence time” more accurately reflects the underlying mechanism – retention. The same approach to measure the Golgi residence time can also be applied to Golgi enzymes such as ST6GAL1. Its slow Golgi exit kinetics (t<sub>1/2</sub> = 5.3 hours) (Sun et al., JCS, 2021) should be primarily due to a strong Golgi retention at its steady state Golgi localization.

      In contrast, the conventional secretory cargos’ Golgi exit times are usually much shorter (t<sub>1/2</sub> < 20 min) (Table 2) due to weaker Golgi retention. In a broader sense, the Golgi exit kinetics of a secretory cargo should be influenced by its Golgi retention. Furthermore, we have consistently used the term “Golgi residence time” in our previous publications. So, we propose maintaining this terminology in the current manuscript.

      (8) Lines 300-306: I would suggest that the authors remove this part as it is highly speculative and not supported by data.

      We have relocated this discussion to the section titled "Our data supports the rim progression model, a modified version of the stable compartment model."

      Our enzyme matrix hypothesis offers a potential explanation for key observations, including the differential cisternal localization of small and large cargos and the interior localization of Golgi enzymes. Cryo-FIB-ET has shown that the interior of Golgi cisternae is enriched with densely packed Golgi enzymes (Engel et al., PNAS, 2015; PMID: 26311849), supporting this hypothesis.

      Additionally, this hypothesis helps explain the gradual reduction in intra-Golgi transport velocities of secretory cargos, as requested by Reviewer #1 (Minor Points 4). For these reasons, we propose retaining this discussion in the manuscript.

      (9) In Figure 3B, the percentage of MGAT2-GFP cells with anti-GFP signal at the Golgi is 41%, while Sun et al. 2021 reported 25%. Please comment on this difference.

      We included more cells for the quantification. The percentage of cells showing Golgi localization of VHH-anti-GFP-mCherry is now 32% (n = 266 cells). The observed difference, 32% vs. 25% (Sun et al., JCS, 2021), is likely due to uncontrollable variations in experimental conditions, which might have influenced the endocytic Golgi targeting efficiency.

      (10) The effects of brefeldinA are pleiotropic as it disassembles COPI and clathrin coats but also induces tubulation of endosomes. I would recommend using Golgicide A, which is more specific.

      We agree with the reviewer that Golgicide A might be more specific as an inhibitor of Arf1. We will certainly consider using this inhibitor next time.

    1. eLife Assessment

      In this important study, the authors conducted atomistic molecular dynamics simulations to probe the interactions between IRE and unfolded peptides. The results help reconcile contradicting experimental findings in the literature and offer mechanistic insights into the activation of the unfolded protein response. The level of evidence is considered solid, although the use of enhanced sampling and more quantitative analysis would further strengthen the conclusions.

    2. Reviewer #1 (Public review):

      Summary:

      This work provides structural and mechanistic insights into the disordered protein recognition process inside the endoplasmic reticulum by the inositol-requiring enzyme 1. Using state-of-the-art molecular dynamics simulation tools, the authors propose a mechanism of disordered protein recognition that reconciles contradictory findings of biochemical and structural biology experiments.

      Strengths:

      (1) All MD simulations have been carried out in triplicate, and several different folded conformations were generated using AlphaFold2. This provides adequate statistics to draw meaningful conclusions from the simulations.

      (2) Potential limitations of the disordered protein force fields and water models have been taken into consideration. Particularly, performing the simulation in both TIP3P and TIP4PD water models ensures that the conclusions drawn are not influenced by the force field choice.

      (3) The binding of a large number of disordered peptides was investigated, ensuring that the conclusions drawn about disordered peptide recognition are sufficiently general.

      Weaknesses:

      (1) The timescales of the peptide recognition and unbinding process are much longer than what can be sampled from unbiased simulations. Therefore, the proposed mechanism of recognition should only be considered a hypothesis based on the results presented here. For example, peptides that do not dissociate within one one-microsecond MD simulation are considered to be stable binders. However, they may not have a viable way to bind to the narrow protein cleft in the first place.

      (2) Oftentimes, representative structures sampled from MD simulation are used to draw conclusions (e.g., Figure 4 about the role of R161 mutation in binding affinity). This is not appropriate, as one unbinding event being observed or not observed in a microsecond-long trajectory does not provide sufficient information about the binding strength or the free energy difference.

    3. Reviewer #2 (Public review):

      Summary:

      In this manuscript, the authors investigated the interactions between IRE and unfolded peptides using all-atom molecular dynamics simulations. The interactions between a couple of unfolded peptides and IRE might shed light on the activation of the UPR.

      Strengths:

      (1) Well-written manuscript tailored for a biology audience.

      (2) State-of-the-art structural predictions and all-atom simulations.

      (3) Validation with existing experimental data

      (4) Clear schematic diagram summarizing the mechanisms learned from simulations.

      (5) Shared simulation data and code in a public repository.

      Weaknesses:

      (1) Improving presentation to include more computational details.

      (2) More quantitative analysis in addition to visual structures.

    4. Reviewer #3 (Public review):

      Summary:

      In this important work, the authors use extensive MD simulations to study how the IRE1 protein can detect unfolded peptides. Their study consolidates contradicting experimental results and offers a unique view of the different sensing models that have been proposed in the literature. Overall, it is an excellent study that is quite extensive. The research is solid, meticulous, and carefully performed, leading to convincing conclusions.

      Strengths:

      The strength of this work is the extensive and meticulous molecular dynamics simulations. The authors use and investigate different structural models, for example, carefully comparing a model based on a PDB structure with reconstructed loops with an AlphaFold 2 Multimer model. The authors also investigate a wide range of different protein structural models that probe different aspects of the peptide sensing process. These solid and meticulous MD simulations allow the authors to obtain convincing conclusions concerning the peptide sensing process of the IRE1 protein.

      Weaknesses:

      A potential weakness of the study is the usage of equilibrium (unbiased) molecular dynamics simulations, so that only processes and conformational changes on the microsecond time scale can be probed. Furthermore, there can be inaccuracies and biases in the description of unfolded peptides and protein segments due to the protein force fields. Here, it should be noted that the authors do acknowledge these possible limitations of their study in the conclusions.

    5. Author response:

      Reviewer #1:

      We appreciate the Reviewer's positive feedback on the strengths of our study.

      The timescales of the peptide recognition and unbinding process are much longer than what can be sampled from unbiased simulations. Therefore, the proposed mechanism of recognition should only be considered a hypothesis based on the results presented here. For example, peptides that do not dissociate within a single one-microsecond MD simulation are considered to be stable binders. However, they may not have a viable way to bind to the narrow protein cleft in the first place.

      We thank the Reviewer for this valuable feedback. We agree with the Reviewer. Our work on the IRE1 cLD activation mechanism is focused on generating hypotheses of the binding mechanism driven by MD simulations. We recognize the limitations in defining a stable binder due to the time scales sampled. However, our primary focus was to sample and characterize a possible binding pose in the center of the cLD dimer. We will contextualize our statements about stable binders and limit our claims to stating that the protein-peptide complex is stable within 1 μs-long simulations. However, we believe that our finding that the cLD dimer groove is not able to accommodate peptides is solid, as the steric impediment described is present in all our replicas, both with and without peptides, in a cumulative sampling time of 72 μs. Additionally, we will include a plot showing the distribution of groove width across all replicas.

      Oftentimes, representative structures sampled from MD simulations are used to draw conclusions (e.g., Figure 4 about the role of R161 mutation in binding affinity). This is not appropriate, as one unbinding event being observed or not observed in a microsecond-long trajectory does not provide sufficient information about the binding strength or the free energy difference.

      We thank the Reviewer for the insightful comment. As explained in the previous point, we believe that our simulations provide useful hypotheses, and we agree that we do not currently have data to comment on binding affinity. We will, therefore, remove all references to this term. We are aware of the limitations due to the timescale and agree that these limitations cannot be overcome with standard equilibrium simulations. To address these limitations, we plan to use orthogonal methods, namely MM/PB(GB)SA calculations for calculating binding free energies from existing trajectories (as performed by https://doi.org/10.1021/acs.jcim.4c00975). We will add predictions of all the peptides using AlphaFold 3, to confirm the binding region.

      Reviewer #2:

      We thank the Reviewer for their positive feedback.

      Improving presentation to include more computational details.

      We thank the Reviewer for raising this critical point. We agree that the manuscript is tailored for a biology audience, as the data are particularly relevant for that community. Nevertheless, we also understand the importance of providing sufficient methodological detail for computational readers. We will add appropriate computational information in the main text.

      More quantitative analysis in addition to visual structures.

      We will add an uncertainty estimate for the HDX calculations using bootstrapping and include additional information on bond distances for Y161. We will also incorporate time-series data showing the distance of the peptide from the groove across all replicas.
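      The bootstrapping mentioned above can be illustrated with a minimal sketch. It assumes one scalar value of an observable per independent replica; the function name `bootstrap_ci` and the numbers are hypothetical and are not taken from the authors' HDX pipeline.

```python
import random
import statistics


def bootstrap_ci(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    means = []
    for _ in range(n_resamples):
        # Draw n values with replacement and record the resampled mean.
        resample = [values[rng.randrange(n)] for _ in range(n)]
        means.append(statistics.fmean(resample))
    means.sort()
    # Indices of the lower and upper percentile bounds.
    lo_idx = int(round((alpha / 2) * n_resamples))
    hi_idx = int(round((1 - alpha / 2) * n_resamples)) - 1
    return statistics.fmean(values), means[lo_idx], means[hi_idx]


# Hypothetical per-replica values of an observable (illustrative numbers only).
replica_values = [0.42, 0.47, 0.39, 0.51, 0.45, 0.44, 0.48, 0.40]
mean, lower, upper = bootstrap_ci(replica_values)
print(f"mean = {mean:.3f}, 95% CI = [{lower:.3f}, {upper:.3f}]")
```

      Resampling whole replicas rather than individual frames is a deliberate choice in this sketch: consecutive MD frames are correlated, so treating them as independent samples would tend to understate the uncertainty.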

      Reviewer #3:

      We appreciate the Reviewer's positive feedback on our work.

      A potential weakness of the study is the usage of equilibrium (unbiased) molecular dynamics simulations, so that only processes and conformational changes on the microsecond time scale can be probed. Furthermore, there can be inaccuracies and biases in the description of unfolded peptides and protein segments due to the protein force fields. Here, it should be noted that the authors do acknowledge these possible limitations of their study in the conclusions.

      We appreciate the Reviewer's thoughtful comment. As noted in our response to Reviewer 1, we plan to address the concern about sampling by applying orthogonal methods. We agree with the Reviewer that some form of enhanced sampling is necessary if we want to assess binding in a more quantitative way, e.g., via free energy calculations. However, we also realize that applying any enhanced sampling scheme to our system is very challenging, given its large size and the complex peptide-protein interactions, which are not easily captured in a few collective variables. After a careful assessment and some preliminary tests, we decided that estimating free energies using enhanced sampling would necessitate a separate paper due to both the conceptual complexity of the project and the size of the necessary sampling campaign.

    1. eLife Assessment

      This work provides one of the first important attempts to look at Drosophila immune responses against bacterial, viral, and fungal pathogens in a way that combines the roles of four major arms in immunity (Imd signaling, Toll signaling, phagocytosis, and melanization) rather than studying them separately. The findings are convincing, and the tools provided can be used as they are, or built upon, in various contexts.

    2. Reviewer #1 (Public review):

      Summary:

      The innate immune system serves as the first line of defense against invading pathogens. Four major immune-specific modules - the Toll pathway, the Imd pathway, melanization, and phagocytosis - play critical roles in orchestrating the immune response. Traditionally, most studies have focused on the function of individual modules in isolation. However, in recent years, it has become increasingly evident that effective immune defense requires intricate interactions among these pathways.

      Despite this growing recognition, the precise roles, timing, and interconnections of these immune modules remain poorly understood. Moreover, addressing these questions represents a major scientific undertaking.

      Strengths:

      In this manuscript, Ryckebusch et al. systematically evaluate both the individual and combined contributions of these four immune modules to host defense against a range of pathogens. Their findings significantly enhance our understanding of the layered architecture of innate immunity.

      Weaknesses:

      While I have no critical concerns regarding the study, I do have several suggestions to offer that may help further strengthen the manuscript. These include:

      (1) Have the authors validated the efficiency of the mutants used in this study? It would be helpful to include supporting data or references confirming that the mutations effectively disrupted the intended immune pathways.

      (2) Given the extensive use of double, triple, and quadruple mutants, a more detailed description of the mutant construction process is warranted.

    3. Reviewer #2 (Public review):

      Summary:

      In this work, the authors take a holistic view of Drosophila immunity by selecting four major components of fly immunity often studied separately (Toll signaling, Imd signaling, phagocytosis, and melanization), and studying their combinatory effects on the efficiency of the immune response. They achieve this by using fly lines mutant for one of these components, or modules, as well as for a combination of them, and testing the survival of these flies upon infection with a plethora of pathogens (bacterial, viral, and fungal).

      Strengths:

      It is clear that this manuscript has required a large amount of hands-on work, considering the number of pathogens, mutations, and timepoints tested. In my opinion, this work is a very welcome addition to the literature on fly immune responses, which obviously do not occur in one type of response at a time, but in parallel, subsequently, and/or are interconnected. I find that the major strength of this work is the overall concept, which is made possible by the mutations designed to target the specific immune function of each module (at least seemingly) without major effects on other functions. I believe that the combinatory mutants will be of use for the fly community and enable further studies of the interplay of these components of immune response in various settings.

      To control for the effects arising from the genetic variation other than the intended mutations, the mutants have been backcrossed into a widely used, isogenized Drosophila strain called w1118. Therefore, the differences accounted for by the genotype are controlled.

      I also appreciate that the authors have investigated the two possible ways of dealing with an infection: tolerance and resistance, and how the modules play into those.

      Weaknesses:

      While controlling for the background effects is vital, the w1118 background is problematic (an issue not limited to this manuscript) because of the wide effects of the white mutation on several phenotypes (also other than eye color/eyesight). It is a possibility that the mutation influences the functionality of the immune response components, for example, via effects of the faulty tryptophan handling on the metabolism of the animal.

      I acknowledge that it is not reasonable to ask for data in different backgrounds better representing a "wild type" fly (how that is defined is another question), but I think this matter should be brought up and discussed.

      The whole study has been conducted on male flies. Immune responses show quite extensive sex-specific variation across a variety of species studied, including the fly, but the reasons for this variation are not fully understood. Therefore, I suggest that the authors conduct a subset of experiments on female flies to see if the findings apply to both sexes, especially the infection-specificity of the module combinations.

    1. eLife Assessment

      This important study reports an advancement in the diagnosis of Animal African Trypanosomosis (AAT), which adapts a CRISPR-based diagnostic tool (SHERLOCK4AAT) to detect different trypanosome species responsible for AAT. The evidence supporting the conclusions is convincing and in line with the current state-of-the-art diagnostics. This study will be of interest to the fields of Epidemiology, Public Health, and Veterinary Medicine.

    2. Reviewer #1 (Public review):

      Summary:

      This study addresses a critical gap in veterinary diagnostics by developing a CRISPR-based diagnostic toolbox (SHERLOCK4AAT) for detecting animal African trypanosomosis. It describes the development and field deployment of SHERLOCK4AAT, a CRISPR-Cas13-based diagnostic toolbox for the eco-epidemiological surveillance of animal African trypanosomosis (AAT) in West Africa.

      The authors successfully created and validated species-specific assays for multiple trypanosomes, including T. congolense, T. vivax, T. theileri, T. simiae, and T. suis, alongside pan-trypanosomatid and pan-Trypanozoon assays. The field validation in pigs from Guinea and Côte d'Ivoire revealed high trypanosome prevalence (62.7%), frequent co-infections, and importantly identified T. b. gambiense in one animal at each site, suggesting pigs may serve as potential reservoirs for this human-infective parasite.

      A major strength of the study lies in its methodological innovation. By adapting SHERLOCK to target both conserved and species-discriminating sequences, the authors achieved high sensitivity and specificity in detecting Trypanosoma species. Their use of dried blood spots, validated thresholds through ROC analyses, and statistical robustness (e.g., Bayesian latent class modeling) provides a strong foundation for their conclusions.

      The results are significant: over 60% of pigs tested positive for at least one trypanosome species, with co-infections observed frequently and T. b. gambiense detected in pigs at both sites. These findings have direct implications for the role of animal reservoirs in human disease transmission and underscore the value of pigs as sentinel hosts in gHAT elimination efforts.

      The limitations are well acknowledged, particularly the suboptimal sensitivity of the T. vivax assay and the reliance on synthetic controls for T. suis and T. simiae. However, these limitations do not undermine the overall conclusions, and the paper provides a clear roadmap for further assay refinement and implementation.

      This study offers a timely, impactful, and well-substantiated contribution to the field. The SHERLOCK4AAT toolbox holds promise for improving AAT diagnostics in resource-limited settings and advancing One Health surveillance frameworks.

      Strengths:

      (1) The adaptation of SHERLOCK technology for AAT represents a significant technical advancement, offering higher sensitivity than traditional parasitological methods and the ability to detect multiple species simultaneously.

      (2) Rigorously performed with validation using appropriate controls, ROC curve analyses, and Bayesian latent class modelling, establishing clear analytical sensitivity and specificity for most assays.

      (3) Testing 424 pig samples across two countries provides robust evidence of the tool's utility and reveals important epidemiological insights about trypanosome diversity and prevalence.

      (4) The identification of T. b. gambiense in pigs at both sites has significant implications for HAT elimination strategies and highlights the need for integrated One Health approaches.

      (5) The use of dried blood spots and RNA detection for active infections makes the approach practical for field surveillance in resource-limited settings.

      Weaknesses:

      (1) The manuscript would benefit from a more detailed discussion of practical considerations such as cost, equipment requirements, and training needs for implementing SHERLOCK in endemic areas and rural settings, which would improve applicability.

      (2) Limited discussion of pig selection criteria: More justification for choosing pigs as sentinel animals and discussion of potential limitations of this approach would strengthen the manuscript.

      (3) More details on why certain genes were targeted would strengthen the methods.

      (4) Table formatting could be improved for readability.

      (5) Some figures are complex and would benefit from additional explanations in the legends.

    3. Reviewer #2 (Public review):

      Summary:

      The manuscript is important due to the significance of the findings. The strength of evidence is convincing.

      Strengths:

      (1) Using a novel SHERLOCK4AAT toolkit for diagnosis.

      (2) Identification of various sub-species of Trypanosomes.

      (3) Differentiating the animal subspecies from the human one.

      Weaknesses:

      (1) The title is too long, and the use of definite articles should be reduced in the title.

      (2) The route of blood sample collection in the animals should be well defined and explained.

    4. Reviewer #3 (Public review):

      Summary:

      The study adapts a CRISPR-based detection toolkit (SHERLOCK assay) using conserved and species-specific targets for the detection of some members of the Trypanosomatidae family of veterinary importance, and species-specific assays to differentiate between the six most common animal trypanosome species responsible for AAT (SHERLOCK4AAT). The assays were able to discriminate between Trypanozoon (T. b. brucei, T. evansi, and T. equiperdum), T. congolense (Savannah, Forest, Kilifi, and Dzanga-Sangha), T. vivax, T. theileri, T. simiae, and T. suis. The design of both broad and species-specific assays was based primarily on sequences of the 18S rRNA, GAPDH (glyceraldehyde-3-phosphate dehydrogenase), and invariant flagellum antigen (IFX) genes for species identification. Most importantly, the authors showed varying limits of detection for the different SHERLOCK assays, which are somewhat comparable to those of the PCR-derived molecular techniques currently used for detecting animal trypanosomes, even though some of these methodologies have used other primers that target genes such as ITS1 and 7SL sRNA.

      The data presented in the study are particularly useful and of significant interest for the diagnosis of AAT in affected areas.

      Strengths:

      The assays convincingly allow for the analysis and detection of most trypanosomes in AAT.

      Weaknesses:

      The assay cannot distinguish T. b. brucei, T. evansi, and T. equiperdum using the 18S rRNA gene, and the IFX-based assay does not achieve the sensitivity required for detection of T. vivax. T. brucei brucei and T. vivax are, in addition to T. congolense, the most predominant infective species in animals; therefore, a reliable assay should be able to convincingly detect these to allow for proper use of the diagnostic assay.

    1. eLife Assessment

      This important study investigates frequency-dependent effects of transcutaneous tibial nerve stimulation (TTNS) on bladder function in healthy humans and, through a computational model, shows that low-frequency stimulation accelerates, and high-frequency delays, the urge to void. The integration of experimental and modeling approaches provides a solid foundation for clinical trials targeting urinary retention. However, concerns were raised about over-interpretation of modest effects and the limited physiological validity of the computational model, especially its mismatch with typical bladder behaviour and lack of quantitative validation.

    2. Reviewer #1 (Public review):

      Summary:

      The research investigates the frequency-dependent effects of transcutaneous tibial nerve stimulation (TTNS) on bladder function in healthy humans and via a computational model. The authors report that low-frequency (1 Hz) TTNS accelerates the urge to void, while high-frequency (20 Hz) TTNS delays it, corroborated by a computational model suggesting brainstem-mediated mechanisms. The work bridges experimental and theoretical approaches to propose a novel framework for TTNS applications in urinary retention.

      Strengths:

      (1) The integration of human experiments and computational modeling is a major strength. The model successfully replicates bladder dynamics and provides mechanistic insights into frequency-dependent effects.

      (2) Identifies potential therapeutic applications for urinary retention, a condition with limited non-invasive treatments.

      (3) Figures are clear and illustrative, and supplementary materials provide essential methodological depth.

      (4) Controlled experimental design (e.g., single-blinded, fluid/caffeine restrictions, etc.), detailed computational model parameters and validation against animal data, transparency in data exclusion criteria and statistical adjustments.

      Weaknesses:

      (1) The study uses healthy participants; extrapolation to clinical populations (e.g., urinary retention patients) requires validation.

      (2) The simulated bladder capacity (100-150 mL) is lower than physiological ranges (300-400 mL). While the authors note this, the impact on model validity should be further addressed.

      (3) The model omits nociceptive afferents, limiting its applicability to pathological conditions like overactive bladder.

      (4) The lack of significant differences in urge intensity between groups (despite timing differences) warrants deeper discussion. Is the primary effect on efferent activity (as suggested) rather than sensory perception?

      (5) One of the highlights of this study is the identification of the effect of low-frequency (1 Hz) tibial nerve stimulation (TNS) on facilitating bladder contraction. Although the authors have clarified this effect in healthy participants, it would strengthen the conclusion if a UAB animal model (e.g., PMCID: PMC7927909, PMC8163611, PMC7847056, PMC8799394) were used to evaluate the same effect.

    3. Reviewer #2 (Public review):

      Summary:

      Tibial nerve (electrical) stimulation (TNS) has emerged over the past 15 years as a non-invasive method to treat bladder overactivity, but interestingly, new animal work has suggested that TNS could actually be used to excite the bladder when appropriately tuning the stimulation frequency, effectively inverting its effect, perhaps opening the door to treat different conditions (e.g., UAB). The present study tests how healthy people respond to low and high frequency TNS, with the authors showing that they can substantially delay people's first sensation of bladder fullness with high frequencies (20Hz, shown many times before) but also that they can slightly hasten people's first sensation with low frequencies (1Hz, new result in humans). Moreover, the authors develop a computational model of interconnected conductance-based simulated neurons arranged in a physiologically plausible circuit that reproduces some aspects of the frequency-dependent effects of TNS. Their simulations suggest that we might expect low-frequency TNS to also increase the duration of bladder contractions in humans. The study highlights a potential new research direction, optimizing TNS stimulation parameters to increase basal bladder excitability.

      Strengths:

      The main strength of the work is to call attention to a new possibility of inverting the effect of TNS in humans by manipulating stimulation frequency, opening new indications for the therapy. This is highly relevant because of the recent popularity of TNS and its non-invasiveness, which lends itself to rapid testing and evaluation for new conditions and a high willingness to adopt. The authors convincingly demonstrate a modest excitatory effect on bladder sensation with low-frequency TNS, which clearly warrants further investigation.

      The high-level design of the hypotheses, concepts, and experiments is clearly articulated in both the methods and in particularly clear diagrams, letting the reader focus their attention on the most important findings.

      It is rare to develop a new computational model of the lower urinary tract at a systems level, and even more so for it to incorporate circuits in the spinal cord and brainstem centers, and this work undoubtedly advances the field's ability to engineer such systems. Further, because the model is comprised of linked conductance-based point-neurons, it is an excellent tool to investigate how an arguably plausible wiring diagram for neural control of the LUT could result in stimulation frequency-dependent effects on pelvic efferents. It is a proof of concept demonstrating how their mechanistic hypothesis of TNS could be implemented neurophysiologically by the nervous system.

      Weaknesses:

      The main drawback of the work is the frequent overinterpretation of the results. The human study and computational model are both proof-of-principle studies because the experimental effect size and sample size are modest, and the computational model is poorly validated and does not generate physiologically typical cystometric responses in simulations that are designed to recapitulate nominal LUT behavior.

      Despite the stated caveats about the small effect in the human study, it should be emphasized throughout that this result is most reasonably interpreted as showing the possibility that TNS can have a low-frequency excitatory effect that merits follow-up, rather than a conclusive demonstration. The effect size is small (as the authors note) and should be placed in context with some minimally clinically important difference, if possible. The result is statistically significant, but even this may be subject to revision due to the small sample and the effect of post-hoc outlier removal and data analysis choices.

      Given the apparent mismatch between the model and the cystometric behavior at the systems level in the "normal" case (e.g., low capacity, low voiding efficiency, omitted pressure profiles, frequency, etc.) and the absence of quantitative model validation (e.g., it was not compared directly with any experimental data from human urodynamics or rodent cystometry, beyond the initial fit to the neural data, no sensitivity analyses were performed, no goodness of fit computed, etc.) the discussion should be much more circumspect about interpreting the results at a systems level and should probably contain a paragraph explicitly detailing the limitations of the model. The subsequent interpretation should focus narrowly on the neural circuitry, rather than things like contraction duration, where the model is at its strongest. As written, the authors over-interpret what the in silico study can reasonably be used to infer about LUT function.

      More justification is needed for why the contraction duration of the model is the central focus of analysis, when it connects only tentatively to the human study results, which focus on urgency. While not necessarily incorrect, a clearer link or motivation should be offered for how this informs our understanding of frequency-dependent TNS afferent or efferent inhibition during filling (which was the focus of the human studies and the abstract). In other words, why doesn't the model reproduce the 1Hz excitation effect of expediting void onset (or urgency in the human study), and why is it justified to look at contraction duration as a surrogate measure?

      The authors claim that "voiding behavior occurred earlier [at 1Hz stim in the model]", pointing to Figure 6A as evidence, but this panel appears to show a single example model run where 1Hz voiding occurs only ~1s earlier (display makes this very hard to estimate). This is insufficient evidence to support the claim. Later, it is stated that "TNS did not ... void much earlier". The claims should be made compatible, and all such claims should have reasonable supporting evidence.

      There are a number of reporting concerns that can be easily addressed:

      (1) Human Study:

      (a) To interpret the human study analysis, a fuller description of the "optional 10-minute extension" is necessary. How were participants presented with this option, how was blinding preserved, what fraction of participants accepted, and did phase 1 results influence their decisions to continue?

      (b) For reproducibility, details about the TNS parameters should be articulated, such as the method of determining "motor thresholds" (unless this is synonymous with "urge to urinate"), the shape of the stimulation pulses (e.g., biphasic, charge balanced), typical applied current, etc.

      (2) The Computational Model

      (a) The code availability statement for this type of work is inadequate. The model used for simulations in this work, as well as the code used to initialize (and randomize synaptic connections), needs to be hosted publicly because i) a model this intricate is extremely hard to reproduce/verify without code, ii) simulations are an essential piece of the argument, iii) hosting code requires very little overhead. Although there is an appropriate level of detail in the model description, it would not be possible to reproduce the model in any reasonable amount of time (or at all) because of the implementation-level details that are, understandably, omitted from the methods (e.g., what is a "unit", what 'exactly' do the connections in the PMC and PAG diagrams relate to, what were the final parameters used for all conductances, which parameters were "matched" to the original papers and which were not, etc.).

      (b) Critical cystometric/urodynamic values that are typically analyzed to assess healthy LUT function are detrusor pressure (timeseries) and/or post-void residual or voiding efficiency (scalars). These should be included to verify that the model is representative of the "normal" case. This is especially important because the model's "normal" behavior appears to have extremely low voiding efficiency (Figure 6A).

    1. eLife Assessment

      This valuable study provides insights into the structure and function of bacterial contractile injection systems that are present in the cytoplasm of many Streptomyces strains. A convincing high-resolution model of the structure of extended forms of the cytoplasmic contractile injection system assembly from Streptomyces coelicolor is presented, with some investigation of the membrane protein CisA in attachment of the extended assembly to the inner face of the cytoplasmic membrane and the firing of the system. The work expands the current understanding of these diverse bacterial nanomachines.

    2. Reviewer #2 (Public review):

      Summary:

      The paper addresses how the S. coelicolor contractile injection system (CISSc) interacts with the membrane, how it contracts and fires, and how it affects both cell viability and differentiation, which it has been implicated to do in previous work from this group and others. The Streptomyces CIS systems have been enigmatic in the sense that they are free-floating in the cytoplasm in an extended form and are seen in contracted conformation (i.e. after having been triggered) mainly in dead and partially lysed cells, suggesting involvement in some kind of regulated cell death. So, how do the structure and function of the CISSc system compare to other types of CIS from other bacteria and phages, does it interact with the cytoplasmic membrane, how does it do that, and is the membrane interaction involved in the suggested role in stress-induced, regulated cell death? The authors address these questions by investigating the role of a membrane protein, CisA, that is encoded by a gene in the CIS gene cluster in S. coelicolor. Further, they show for the first time the structure of the assembled CISSc, purified from the cytoplasm of S. coelicolor, analysed using single-particle cryo-electron microscopy.

      Strengths:

      The beautiful visualisations of the CIS system, both by cryo-electron tomography of intact bacterial cells and by single-particle electron microscopy of purified CIS assemblies, are clearly the strengths of the paper, both in terms of methods and results. Further, the paper provides genetic evidence that the membrane protein CisA is required for the contraction of the CISSc assemblies that are seen in partially lysed or ghost cells of the wild type. The conclusion that CisA is a transmembrane protein and the inferred membrane topology are well supported by experimental data. The cryo-EM data suggest that CisA is not a stable part of the extended form of the CISSc assemblies. These findings raise the question of what CisA does. Interestingly, AlphaFold modelling suggests that the cytoplasmic part of CisA interacts directly with the base plate protein Cis11.

      Weaknesses:

      The investigations of the role of CisA in function, membrane interaction, and triggering of contraction of CIS assemblies are key parts of the paper and are highlighted in the title. However, the data presented to answer these questions are partially incomplete and have some limitations.

      As an example, although the modelling that suggests interaction between CisA and the base plate protein Cis11 appears compelling, the interaction has not yet been possible to test and verify experimentally. Further, it remains unclear whether or how CisA recruits the CISSc system to the membrane. Overall, the mechanism by which CisA may act on CISSc and cause firing remains largely unclear.

      Further, the paper does not provide new insights into the role of the CISSc system in growth or developmental biology of streptomycetes. The assay of how CisA affects the function of the system involves monitoring stress-induced loss of viability based on loss of cytoplasmic GFP signal, as described in a previous paper. The assay looks only at single hyphal fragments released from mycelial networks or mycelial pellets, and it could have been interesting to observe effects also under other growth conditions. Similarly, the effect on the developmental life cycle is limited to showing accelerated sporulation in the CisA mutant, similar to what was previously shown for mutants lacking other parts of the system. The paper shows that CisA is needed for the observed phenotypic effects of the CISSc system, but the overall biological roles of the CISSc and CisA remain elusive.

      Concluding remarks:

      This paper provides new insights into the structure of the unusual subclass of bacterial contractile injection systems (CIS) that is constituted by the cytoplasmically located systems found in streptomycetes. Importantly, the work also describes a membrane protein, CisA, that likely links the CISSc to the cytoplasmic membrane and is required for its function and likely its triggering. The paper will be of large interest in the field, and it will likely be the basis for further and more mechanistic and functional investigations of the Streptomyces CIS systems.

    3. Reviewer #3 (Public review):

      Summary

      In this work, Casu et al. have reported the characterization of a previously uncharacterized membrane protein CisA encoded in a non-canonical contractile injection system of Streptomyces coelicolor, CISSc, which is a cytosolic CIS significantly distinct from both intracellular membrane-anchored T6SSs and extracellular CISs. The authors have presented the first high-resolution structure of the extended CISSc, which revealed important structural insights into the extended state of this non-canonical CIS.

      To further explore how CISSc interacts with the cytoplasmic membrane, they set out to investigate a membrane protein, CisA, encoded in the CISSc cluster and previously hypothesized to be the membrane adaptor for CISSc; however, the structure revealed that it was not associated with CISSc. Using fluorescence microscopy and a cell fractionation assay, the authors verified that CisA is indeed a membrane-associated protein. They further determined experimentally that CisA has a cytosolic N-terminal domain and a periplasmic C-terminus. The functional analysis of the cisA mutant revealed that CisA is not required for CISSc assembly but is essential for contraction. As a result, the deletion significantly affects CISSc-mediated cell death upon stress, timely differentiation, and secondary metabolite production. Although the work did not resolve the mechanistic detail of how CisA interacts with the CISSc structure, the authors used in-silico prediction of protein-protein interactions between monomeric CisA and CISSc components using AlphaFold2-Multimer, which identified baseplate protein Cis11 as a potential interaction partner. This prediction provides a strong basis for future investigations into the molecular mechanism by which CisA mediates contraction via interactions with CIS structural components such as Cis11. Using AlphaFold3, the authors also estimated the oligomerization state of CisA, which can be present as a pentamer. The authors further suggested that this oligomerization is mediated by the interaction of the C-terminal solute-binding like domain.

      In general, the work provides solid data and a strong foundation for future investigation toward understanding the mechanism of CISSc contraction, and potentially, the relation between the membrane association of CISSc, the sheath contraction and the cell death.

      Major Strength:

      The paper is well-structured, and the conclusions of the study are supported by solid data and careful data interpretation. The authors provided strong evidence on (1) the high-resolution structure of extended CISSc determined by cryo-EM, and the subsequent comparison with known eCIS structures, which sheds light on both its similarities to and differences from other subtypes of eCISs in detail; (2) the topological features of CisA using fluorescence microscopic analysis, cell fractionation and PhoA-LacZα reporter assays; (3) functions of CisA in CISSc-mediated cell death and secondary metabolite production, likely via the regulation of sheath contraction; (4) structural prediction of the oligomerization state of CisA and potential interaction partners of CIS structure.

      Weakness:

      Due to technical limitations, the authors are not able to experimentally demonstrate the direct interaction between CisA and the baseplate complex of CISSc, since they could not express cisA in E. coli due to its potential toxicity. Therefore, there is a lack of biochemical analysis of direct interaction between CisA and the baseplate wedge. However, they have provided solid AlphaFold2-multimer prediction data and identified baseplate protein Cis11 as a potential interaction partner. Such predictions will guide future work towards biochemical analysis to verify such interaction.

      While there is no direct evidence showing that CisA is responsible for tethering CISSc to the membrane upon stress, and the spatial and temporal relation between membrane association and contraction remains unclear, I recognize that this is beyond the scope of the current work, so I would expect further investigation to address these questions in the future.

      Conclusion

      Overall, the work provides a valuable contribution to our understanding on the structure of a much less understood subtype of CISs, which is unique compared to both membrane-anchored T6SSs and host-membrane targeting eCISs. Authors have successfully demonstrated the role of CisA in the contraction of CISSc, along with solid and detailed analysis of the contraction state of the particles with or without CisA using cryo-ET. Using structural modeling, authors also identified the potential oligomerization state and possible interaction partner within the CIS particle.

      Importantly, the work serves as a strong foundation to further investigate how the sheath contraction works here. The work contributes to expanding our understanding of the diverse CIS superfamilies, with significant novelty.

    4. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Contractile Injection Systems (CIS) are versatile machines that can form pores in membranes or deliver effectors. They can act extra- or intracellularly. When intracellular, they are positioned to face the exterior of the cell and hence should be anchored to the cell envelope. The authors previously reported the characterization of a CIS in Streptomyces coelicolor, including significant information on the architecture of the apparatus. However, how the tubular structure is attached to the envelope was not investigated. Here they provide a wealth of evidence to demonstrate that a specific gene within the CIS gene cluster, cisA, encodes a membrane protein that anchors the CIS to the envelope. More specifically, they show that:

      - CisA is not required for assembly of the structure but is important for proper contraction and CIS-mediated cell death

      - CisA is associated with the membrane (fluorescence microscopy, cell fractionation) through a transmembrane segment (lacZ-phoA topology fusions in E. coli)

      - Structural prediction of interaction between CisA and a CIS baseplate component

      - In addition they provide a high-resolution model structure of the >750-polypeptide Streptomyces CIS in its extended conformation, revealing new details of this fascinating machine, notably in the baseplate and cap complexes.

      All the experiments are well controlled, including trans-complementation of all tested phenotypes.

      One important piece of information that is missing is the oligomeric state of CisA.

      Thank you for this suggestion. We now provide information on the potential oligomeric state of CisA. We performed further AlphaFold3 modelling of CisA using an increasing number of CisA protomers (1 to 8). We ran predictions for the configuration using the sequence of the well-folded C-terminal CisA domain (amino acids 285-468), which includes the transmembrane domain and the conserved domain that shares similarities to carbohydrate-degrading domains. The obtained confidence scores (mean values for pTM=0.73, ipTM=0.7, n=5) indicate that CisA can assemble into a pentamer and that this oligomerization is mediated through the interaction of the C-terminal solute-binding like superfamily domain.

      We have added this information to the revised manuscript (Fig. 3b/c) and further discuss the possible implications of CisA oligomerization for its proposed mode of action.

      While it would have been great to test the interaction between CisA and Cis11, or to perform cryo-electron microscopy assays of detergent-extracted CIS structures to maintain the interaction with CisA, I believe that the toxicity of CisA upon overexpression or upon expression in E. coli renders these studies difficult and that they will require a significant amount of time and optimization to be performed. It is worth mentioning that this study is of significant novelty in the CIS field because, except for Type VI secretion systems, very few membrane proteins or complexes responsible for CIS attachment have been identified and studied.

      We thank this reviewer for their highly supportive and positive comments on our manuscript and we are grateful for their recognition of the novelty of our study, particularly in the context of membrane proteins and complexes involved in CIS attachment.

      We agree that further experimental evidence on direct interaction between CisA and Cis11 would have strengthened our model on CisA function. However, as noted by this reviewer, this additional work is technically challenging and currently beyond the scope of this study.

      Reviewer #2 (Public review):

      Summary:

      The overall question that is addressed in this study is how the S. coelicolor contractile injection system (CISSc) works and affects both cell viability and differentiation, which it has been implicated to do in previous work from this group and others. The CISSc system has been enigmatic in the sense that it is free-floating in the cytoplasm in an extended form and is seen in contracted conformation (i.e. after having been triggered) mainly in dead and partially lysed cells, suggesting involvement in some kind of regulated cell death. So, how do the structure and function of the CISSc system compare to those of related CIS from other bacteria, does it interact with the cytoplasmic membrane, how does it do that, and is the membrane interaction involved in the suggested role in stress-induced, regulated cell death? The authors address these questions by investigating the role of a membrane protein, CisA, that is encoded by a gene in the CIS gene cluster in S. coelicolor. Further, they analyse the structure of the assembled CISSc, purified from the cytoplasm of S. coelicolor, using single-particle cryo-electron microscopy.

      Strengths:

      The beautiful visualisation of the CIS system both by cryo-electron tomography of intact bacterial cells and by single-particle electron microscopy of purified CIS assemblies are clearly the strengths of the paper, both in terms of methods and results. Further, the paper provides genetic evidence that the membrane protein CisA is required for the contraction of the CISSc assemblies that are seen in partially lysed or ghost cells of the wild type. The conclusion that CisA is a transmembrane protein and the inferred membrane topology are well supported by experimental data. The cryo-EM data suggest that CisA is not a stable part of the extended form of the CISSc assemblies. These findings raise the question of what CisA does.

      We thank Reviewer #2 for the overall positive evaluation of our manuscript and the constructive criticism.

      Weaknesses:

      The investigations of the role of CisA in function, membrane interaction, and triggering of contraction of CIS assemblies, are important parts of the paper and are highlighted in the title. However, the experimental data provided to answer these questions appear partially incomplete and not as conclusive as one would expect.

      We acknowledge that some aspects of our work remain unanswered. We are currently unable to conduct additional experiments because the two leading postdoctoral researchers on this project have moved on to new positions. We currently don’t have the extra manpower with a similar skill set to pick up the project.

      The stress-induced loss of viability is only monitored with one method: an in vivo assay where cytoplasmic sfGFP signal is compared to FM5-95 membrane stain. Addition of a sublethal level of nisin led to loss of sfGFP signal in individual hyphae in the WT, but not in the cisA mutant (similarly to what was previously reported for a CIS-negative mutant). Technically, this experiment and the example images that are shown give rise to some concern. Only individual hyphal fragments are shown that do not look like healthy and growing S. coelicolor hyphae. Under the stated growth conditions, S. coelicolor strains would normally have grown as dense hyphal pellets. It is therefore surprising that only these unbranched hyphal fragments are shown in Fig. 4ab.

      We thank this Reviewer for their thoughtful criticism regarding the viability assays and the data presented in Figure 4. We acknowledge the importance of ensuring that the presented images reflect the physiological state of S. coelicolor under the stated growth conditions and recognize that hyphal fragments shown in Figure 4 do not fully capture the typical morphology of S. coelicolor. As pointed out by this reviewer, S. coelicolor grows in large hyphal clumps when cultured in liquid media, making the quantification of fluorescence intensities in hyphae expressing cytoplasmic GFP or stained with the membrane dye FM5-95 particularly challenging. To improve the image analysis and quantification of GFP and FM5-95-fluorescent intensities across the three S. coelicolor strains (wildtype, cisA deletion mutant and the complemented cisA mutant), we vortexed the cell samples before imaging to break up hyphal clumps, increasing hyphal fragments. The hyphae shown in our images were selected as representative examples across three biological replicates.

      Further, S. coelicolor would likely be in a stationary phase when grown 48 h in the rich medium that is stated, giving rise to concern about the physiological state of the hyphae that were used for the viability assay. It would be valuable to know whether actively growing mycelium is affected in the same way by the nisin treatment, and also whether the cell death effect could be detected by other methods.

      The reasoning behind growing S. coelicolor for 48 h before performing the fluorescence-based viability assay was that we (DOI: 10.1038/s41564-023-01341-x ) and others (e.g.: DOI: 10.1038/s41467-023-37087-7 ) previously showed that the levels of CIS particles peak at the transition from vegetative to reproductive/stationary growth, thus indicating that CIS activity is highest during this growth stage. The obtained results in this manuscript are consistent with previous results, in which we showed a similar effect on the viability of wildtype versus cis-deficient S. coelicolor strains (DOI: 10.1038/s41564-023-01341-x ) using nisin, the protonophore CCCP and UV radiation. The results presented in this study and our previous study are based on biological triplicate experiments and appropriate controls. Furthermore, our results are in agreement with the findings reported in a complementary study by Vladimirov et al. (DOI: 10.1038/s41467-023-37087-7 ) that used a different approach (SYTO9/PI staining of hyphal pellets) to demonstrate that CIS-deficient mutants exhibit decreased hyphal death.

      Taken together, we believe that the results obtained from our fluorescence-based viability assay provide strong experimental evidence that functional CIS mediate hyphal cell death in response to exogenous stress.

      The model presented in Fig. 5 suggests that stress leads to a CisA-dependent attachment of CIS assemblies to the cytoplasmic membrane, and then triggering of contraction, leading to cell death. This model makes testable predictions that have not been challenged experimentally. Given that sublethal doses of nisin seem to trigger cell death, there appear to be possibilities to monitor whether activation of the system (via CisA?) indeed leads to at least temporally increased interaction of CIS with the membrane.

      We thank this reviewer for their suggestions on how to test our model further. This is a challenging experiment because we do not know the exact dynamics of how nisin stress is perceived and transmitted to CisA and CIS particles.

      In an attempt to address this point, we have performed co-immunoprecipitation experiments using S. coelicolor cells that produced CisA-FLAG as bait, and which were treated with a sub-lethal nisin concentration for 0/15/45 min. Mass spectrometry analysis of co-eluted peptides did not show the presence of CIS-associated peptides at the analyzed timepoints. While we cannot exclude the possibility that our experimental assay requires further optimization to successfully demonstrate a CisA-CIS interaction (e.g. optimization of the use of detergents to improve the solubilization of CisA from Streptomyces membranes, which is currently not an established method), an alternative and equally valid hypothesis is that the interaction between CIS particles and CisA is transient and therefore difficult to capture. We would like to mention, however, that we did detect CisA peptides in crude purifications of CIS particles from nisin-stressed cells (Supplementary Table 2, manuscript: line 301/302), supporting our proposed model that CisA can associate with CIS particles in vivo.

      Further, would not the model predict that stress leads to an increased number of contracted CIS assemblies in the cytoplasm? No clear difference in length of the isolated assemblies in Fig. S7 is seen between untreated and nisin-exposed cells, and also no difference between assemblies from WT and cisA mutant hyphae.

      The reviewer is correct that there is no clear difference in length in the isolated CIS particles shown in Figure S7. This is in line with our results, which show that CisA is not required for the correct assembly of CIS particles and their ability to contract in the presence and absence of nisin treatment. The purpose of Figure S7 was to support this statement. We would like to note that the particles shown in Figure S7 were purified from cell lysates using a crude sheath preparation protocol, during which CIS particles generally contract irrespective of the presence or absence of CisA. Thus, we cannot comment on whether there is an increased number of contracted CIS assemblies in the cytoplasm of nisin-exposed cells. To answer this point, we would need to acquire additional cryo-electron tomograms (cryoET) of the different strains treated with nisin. CryoET is an extremely time- and labor-intensive task and, given that we currently don’t know the exact dynamics of the CIS-CisA interaction following exogenous stress, we believe this experiment is beyond the scope of this work.

      The interaction of CisA with the CIS assembly is critical for the model but is only supported by Alphafold modelling, predicting interaction between cytoplasmic parts of CisA and Cis11 protein in the baseplate wedge. An experimental demonstration of this interaction would have strengthened the conclusions.

      We agree that direct experimental evidence of this interaction would have further strengthened the conclusions of our study, and we have extensively tried to provide additional experimental evidence. Unfortunately, because of the toxicity of cisA expression in E. coli and the possibly transient nature of the interaction under the experimental conditions used, we were unable to confirm this interaction by biochemical or biophysical techniques, such as co-purification or bacterial two-hybrid assays. Despite these technical challenges, we believe that the AlphaFold predictions provided a valuable hypothesis about the role of CisA in firing and the function of CIS particles in S. coelicolor.

      The cisA mutant showed a similarly accelerated sporulation as was previously reported for CIS-negative strains, which supports the conclusion that CisA is required for function of CISSc. But the results do not add any new insights into how CIS/CisA affects the progression of the developmental life cycle and whether this effect has anything to do with the regulated cell death that is caused by CIS. The same applies to the effect on secondary metabolite production, with no further mechanistic insights added, except reporting similar effects of CIS and CisA inactivations.

      Thank you for your feedback on this aspect of the manuscript. We would like to note that the main focus of this study was to provide further insight into how CIS contraction and firing are mediated in Streptomyces. We used the analysis of accelerated sporulation and secondary metabolite production as a readout to directly assess the functionality of CIS in the presence or absence of CisA and to complement the in situ cryoET data. In summary, our data significantly expand our knowledge of CIS function and firing in Streptomyces and suggest a model in which CisA plays an essential role in mediating the interaction of CIS particles with the membrane, which is required for CIS-mediated cell death. We discuss this model in more detail in the revised manuscript (Line 274-283).

      We agree that we still don’t fully understand the full nature of the signals that trigger CIS contraction, but we do know that the production of CIS is an integral part of the Streptomyces multicellular life cycle as demonstrated by two independent previous studies by us and others (DOI: 10.1038/s41564-023-01341-x and DOI: 10.1038/s41467-023-37087-7 ).

      We further speculate that the assembly and CisA-dependent firing of Streptomyces CIS particles could present a molecular mechanism to dismantle part of the vegetative mycelium. This form of “regulated cell death” could provide two key benefits: (1) to prevent the spread of local cellular damage to the rest of mycelium and (2) to provide additional nutrients for the rest of the mycelium to delay the terminal differentiation into spores, which in turn also affects the production of secondary metabolites.

      Concluding remarks:

      The work will be of interest to anyone interested in contractile injection systems, T6SS, or similar machineries, as well as for people working on the biology of streptomycetes. There is also a potential impact of the work in the understanding of how such molecular machineries could have been co-opted during evolution to become a mechanism for regulated cell death. However, this latter aspect remains poorly understood. Even though this paper adds excellent new structural insights and identifies a putative membrane anchor, it remains elusive how the Streptomyces CIS may lead to cell death. It is also unclear what the advantage would be to trigger death of hyphal compartments in response to stress, as well as how such cell death may impact (or accelerate) the developmental progression. Finally, one cannot help but wonder whether the Streptomyces CIS could have any role in protection against phage infection.

      We thank Reviewer #2 for the overall supportive assessment of our work. We will briefly discuss the impact of functional CIS on Streptomyces development in the revised manuscript. We previously tested whether Streptomyces CIS could contribute to defense against phages but have not found any experimental evidence to support this idea (unpublished data). The analysis of phage defense mechanisms is an underdeveloped area in Streptomyces research, partly due to the currently limited availability of a diverse phage panel.

      Reviewer #3 (Public review):

      Summary:

      In this work, Casu et al. have reported the characterization of a previously uncharacterized membrane protein CisA encoded in a non-canonical contractile injection system of Streptomyces coelicolor, CISSc, which is a cytosolic CIS significantly distinct from both intracellular membrane-anchored T6SSs and extracellular CISs. The authors have presented the first high-resolution structure of the extended CISSc, which revealed important structural insights into this conformational state. To further explore how CISSc interacts with the cytoplasmic membrane, they set out to investigate CisA, which was previously hypothesized to be the membrane adaptor. However, the structure revealed that it was not associated with CISSc. Using fluorescence microscopy and a cell fractionation assay, the authors verified that CisA is indeed a membrane-associated protein. They further determined experimentally that CisA has a cytosolic N-terminal domain and a periplasmic C-terminus. The functional analysis of the cisA mutant revealed that CisA is not required for CISSc assembly but is essential for contraction. As a result, the deletion significantly affects CISSc-mediated cell death upon stress, timely differentiation, and secondary metabolite production. Although the work did not resolve the mechanistic detail of how CisA interacts with the CISSc structure, it provides solid data and a strong foundation for future investigation toward understanding the mechanism of CISSc contraction, and potentially, the relation between the membrane association of CISSc, the sheath contraction and the cell death.

      Strengths:

      The paper is well-structured, and the conclusions of the study are supported by solid data and careful data interpretation. The authors provided strong evidence on (1) the high-resolution structure of extended CISSc determined by cryo-EM, and the subsequent comparison with known eCIS structures, which sheds light on both its similarities to and differences from other subtypes of eCISs in detail; (2) the topological features of CisA using fluorescence microscopic analysis, cell fractionation and PhoA-LacZα reporter assays; (3) functions of CisA in CISSc-mediated cell death and secondary metabolite production, likely via the regulation of sheath contraction.

      Weaknesses:

      (1) The data presented are not sufficient to provide mechanistic details of CisA-mediated CISSc contraction, as the authors are not able to experimentally demonstrate the direct interaction between CisA and the baseplate complex of CISSc (hypothesized to be via Cis11 by structural modeling), since they could not express cisA in E. coli due to its potential toxicity. Therefore, there is a lack of biochemical analysis of direct interaction between CisA and the baseplate wedge. In addition, there is no direct evidence showing that CisA is responsible for tethering CISSc to the membrane upon stress, and the spatial and temporal relation between membrane association and contraction remains unclear. Further investigation will be needed to address these questions in the future.

      We thank Reviewer #3 for the supportive evaluation and constructive feedback on our study in the non-public review. We appreciate the recognition of the technical limitations of experimentally demonstrating a direct interaction between CisA and the CIS baseplate complex, and we agree that further investigations in the future will hopefully provide a full mechanistic understanding of the spatiotemporal interaction of CisA and CIS particles and the subsequent CIS firing.

      To further improve the manuscript, we will revise the text and clarify figures and figure legends as suggested in the non-public review.

      Discussion:

      Overall, the work provides a valuable contribution to our understanding on the structure of a much less understood subtype of CISs, which is unique compared to both membrane-anchored T6SSs and host-membrane targeting eCISs. Importantly, the work serves as a good foundation to further investigate how the sheath contraction works here. The work contributes to expanding our understanding of the diverse CIS superfamilies.

      Thank you.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      - Magnification of the potential CisA-Cis11 model, with side chains at the interface, should be shown in Supplementary Figures 9/10 to help the reader appreciate the interaction between the two subunits.

      Done. A zoomed-in view of the relevant side chains at the CisA-Cis11 interface has been added to Supplementary Figure 9e. For clarity, we decided not to highlight these residues in Supplementary Figure 10 because they are identical to those in Supplementary Figure 9e.

      - A model where CisA is positioned onto the baseplate (by merging the CisA-Cis11 model and the baseplate structure) will also be informative for the reader.

      We agree that such a presentation would be helpful to visualize the proposed CisA-Cis11 interaction. However, the Cis11 residues predicted to bind CisA are buried in our cryoEM single-particle structure of the elongated Streptomyces CIS. This is not surprising, as the structure is based on a previously established non-contractile CIS mutant variant (PMCID: PMC10066040), which means we were only able to capture one specific configuration of the baseplate complex in the current work. This baseplate configuration is most likely structurally distinct from the baseplate configuration in contracted CIS particles. A similar observation was also reported for the baseplate complex of eCIS particles from Algoriphagus machipongonensis (PMCID: PMC8894135).

      We speculate that in Streptomyces, initial non-specific contacts between CisA and cytoplasmic CIS particles induce a rearrangement of baseplate components, resulting in the exposure of the relevant Cis11 residues, which in turn facilitates a transient interaction between CisA and Cis11. This interaction then leads to additional conformational changes within the baseplate complex, triggering sheath contraction and CIS firing.

      We believe that a transient binding step is a crucial part of the activation process, contributing to the dynamic nature of the system.

      - Providing information on the oligomeric state of CisA will strengthen the manuscript. Authors may consider having blue-native gel analysis of CisA-3xFLAG extracted from Streptomyces or E. coli membranes, or in vivo chemical cross-linking coupled to SDS-PAGE analyses. In case these quite straightforward experiments are not possible, the authors may consider providing AF3 models of various CisA multimers.

      Thank you for these suggestions. Unfortunately, we currently don’t have the capability to conduct additional experiments. However, we have performed additional AF3 modelling to explore potential different configurations of CisA. The results of these analyses suggest that CisA can assemble into a pentamer (see also Response to reviewer 1). We speculate that CisA may exist in different oligomeric states and that membrane-localized CisA monomers oligomerize into a larger protein complex in response to a cellular or extracellular (e.g. nisin) signal, which could then directly or indirectly interact with CIS particles in the cytoplasm to facilitate their recruitment to the membrane and CIS firing. Such a stress-dependent conformational change of CisA could also be a safety mechanism to prevent accidental interaction of CisA with CIS particles and CIS firing.

      We now show the AF model for the predicted CisA pentamer in Figure 3b/c and discuss the potential implications of the different CisA configurations in the revised manuscript.

      Reviewer #2 (Recommendations for the authors):

      - The quantification of contracted versus extended CIS assemblies in the cytoplasm is only presented for the tomograms from the cisA mutant (graph in Fig. S2d). However, there are no data for the WT and complemented mutant to compare with. It would help to add such data, or at least refer to the previous quantification done for the WT in the previous paper. Further, would it be possible to illustrate the difference by measuring lengths of CIS assemblies and plot length distributions (assuming the extended ones are long and contracted are short)?

      Thank you for your suggestions. We have included the results from our previous quantification of CIS assembly states observed in the WT in the revised manuscript (lines 106–110).

      In the acquired tomograms of CIS particles observed in intact and dead hyphae, we consistently observed only two CIS conformations: the fully extended state (average length of 233 nm, diameter of 18 nm) and the fully contracted state (average length of 124 nm, diameter of 23 nm). We have added this information to the revised manuscript (lines 112-114).

      - The Western blot in Fig. 3d, top panel, contains additional bands that are not mentioned. Are they non-specific bands? Absent in disA mutant? It would help if it was clarified in the legend what they are.

      Correct, these additional bands are non-specific bands, which are also visible in the lysate and soluble fraction of the wild-type sample (negative control, no FLAG-tagged protein). We have now labelled these bands in the figure and clarified the figure legend.

      - Fig. S8a needs improvement. It was not possible to clearly see the stated effect of disA deletion on secondary metabolite production in these photos.

      We agree and have removed figure panel S8a from the manuscript. The quantification of total actinorhodin production shown in Figure S8b convincingly shows a significant reduction of actinorhodin production in the cisA deletion mutant compared to the wild type and the complemented mutant.

      - It is not an important point, but the paragraph in lines 109-116 appears more like a re-iteration of the Introduction than Results.

      We agree. We have removed the highlighted text from the Results section and added some of the information to the introduction.

      - Line 206 appears to have a typo. Should it not be WT instead of WT cisA?

      Correct. This is a typo which has been fixed. Thank you.

      - At the end of the Discussion, it is suggested that a stepwise mechanism of recruiting CIS to the membrane and then triggering firing would prevent unwanted activation and self-inflicted death. Since both steps appear to be dependent on DisA, it would be good to more clearly spell out how such a stepwise mechanism would work and how it could prevent spontaneous and erroneous firing of the system.

      Thank you for this suggestion. We have revised the text to clarify the proposed stepwise mechanism. Based on additional structural modeling, we propose that the conserved extra-cytoplasmic domain of CisA may play a role in sensing stress signals. Binding of a ‘stress-associated molecule’ could induce a conformational change in CisA, a hypothesis supported by: (1) Foldseek protein structure searches, which suggest that the conserved C-terminal CisA domain resembles substrate/solute-binding proteins, and (2) AlphaFold3 models predicting that CisA can form a pentamer via its putative substrate-binding domain. This suggests that a transition from CisA monomers to pentamers in response to stress may serve as a key checkpoint, activating CisA and facilitating the recruitment of CIS assemblies to the membrane, either directly or indirectly. Conversely, in the absence of a stress signal, CisA is likely to remain in its monomeric (resting) form, incapable of triggering CIS firing. We have revised the discussion to explain the proposed model in more detail.

      We recognize that this model poses many testable hypotheses that we currently cannot test but aim to address in the future.

      Reviewer #3 (Recommendations for the authors):

      There are a few concerns potentially worth addressing to strengthen the study or for future investigation.

      (1) It would be worth considering moving the first part of the result ('CisA is required for CISSc contraction in situ') after presenting the structure of extended CISSc, and combining it with the last part of the result section ('CisA is essential for the cellular function of CISSc'), as both parts describe the functional characterization of CisA.

      We appreciate the reviewer’s suggestion but have chosen to retain the current order of the results. As this manuscript focuses on the role of CisA, we believe that first establishing a functional link between CisA and CIS contraction provides essential context and motivation for the study.

      (2) Line 169: it is not clear to me if the fusion of CisA with mCherry is functional (if it complements the native CisA). Moreover, it was not shown if its localization changes under nisin stress or in the strain with non-contractile CISSc.

      We have not tested if the CisA-mCherry fusion is fully functional. While we cannot exclude the possibility that the activity of this protein fusion is compromised in vivo, we believe that the described accumulation of CisA-mCherry at the membrane is accurate. This conclusion is further supported by the results obtained from protein fractionation experiments and the membrane topology assay (Figure 3).

      We did not examine if the localization of CisA-mCherry changes in CIS mutant strains under nisin-stress, but this is something we will follow up on in the future.

      (3) In ref 18, the previous work from the same team presented a functional fluorescent fusion of Cis2 (sheath), thus, it will be interesting to see if (i) Cis2 localization and dynamics is affected by the absence of CisA under normal and stressed conditions; (ii) if Cis2 shows any co-localization with CisA under normal and especially stressed conditions, and potentially, its timing correlation to ghost cell formation by time-lapse imaging of both fusions.

      We thank this reviewer for the suggestions, and we plan to address these questions in the future.

      (4) Line 261: it was hypothesized by authors that the cytosolic portion of CisA was required for interacting with Cis11. While it was not possible to verify the direct interaction at current state, a S. coelicolor mutant lacking this cytosolic domain may be of help to indirectly test the hypothesis. Moreover, it would be interesting to see if the cytosolic region alone is enough to induce the contraction upon stress (by removing the TM-C region). If so, whether it leads to cell death, or if it is insufficient to cause cell death without membrane association despite the sheath contraction. If not, it would suggest that membrane association occurs before contraction.

      These are really great suggestions and if we had the manpower and resources, we would have performed these experiments. We plan to follow up on these questions in the future.

      However, additional structural modelling of CisA indicates that CisA may exist in different configurations (see response to Reviewer #1 and #2), a monomeric and/or a pentameric configuration. In these structural models (revised Figure 3), CisA oligomerization is mediated by the annotated periplasmic solute-binding domain. It is conceivable that CisA oligomerization (e.g. in response to a stress signal) presents a critical checkpoint that results in a conformational change within CisA monomers that subsequently drives CisA oligomerization into a configuration primed to interact with CIS particles. We would therefore speculate that the expression of just the cytoplasmic CisA domain may not be sufficient for CIS contraction and cell death.

      (5) Line 263: as it was not possible to express full-length cisA in E. coli, making it difficult to assess the interaction between CisA and Cis11, it may be worth considering expressing the cytosolic portion of CisA (ΔTM-C) instead of full-length CisA, or alternatively performing a co-immunoprecipitation assay of CisA (i.e., with an affinity tag) from S. coelicolor cultures under stressed conditions. However, I am aware that these may be beyond the scope of this work but can be considered for future investigation in general.

      Thank you for your suggestions and your understanding that some of this work is beyond the scope of this work. We have performed CisA-FLAG co-immunoprecipitation experiments from S. coelicolor cultures that were treated with nisin for 0/15/45 min. However, mass spectrometry analysis of co-eluted peptides did not show the presence of CIS-associated peptides at the analysed timepoints. While we cannot exclude technical issues with our assays that resulted in an inefficient solubilization of CisA from Streptomyces membranes, an alternative hypothesis is that the interaction between CIS particles and CisA is very transient and therefore difficult to capture. We would like to mention, however, that we did detect CisA peptides in crude purifications of CIS particles from nisin-stressed cells (Supplementary Table 2, manuscript: line 301/302), supporting our proposed model that CisA can associate with CIS particles in vivo.

      Minor points:

      (1) I would suggest moving Supplementary Fig 2d, with control quantification of the WT strain and complementation strain (similar to Fig 3g from ref 18), to the main Fig 1, as this quantitative representation would allow better comparison without going back and forth to ref 18.

      Thank you for your suggestion. Instead of moving Supplementary Fig. 2d to the main figure, we have added additional information in lines 106–110 to discuss the previous quantification of CIS assembly states in the WT, as described in our earlier work. We believe this approach allows readers to easily reference our established quantification without compromising the flow of the main figures.

      (2) Line 52/785: as work of Ref 12 has recently been published DOI: 10.1126/sciadv.adp7088, the reference should be updated accordingly.

      This reference has been updated. Thank you.

      (3) A brief description of key differences between contracted (ref 18) and extended sheath structure will be a good addition for a broader audience.

      Thank you for this suggestion. We have added more information on lines 178–180.

      (4) Fig 3d: it is not clear how well the samples from different fractions were normalized in amount (volume and cell density), but there was an inconsistency in the amount of CisA-Flag in lysate, vs. soluble and membrane fractions (total protein amount combined from soluble fraction and membrane fraction together seemed to be more than in the lysate, while in theory it should be more or less equal; and the amount of WhiA from WT seemed to be less than from the CisA-Flag strain). In the method section, it was mentioned that 'The final pellet was dissolved in 1/10 of the initial volume with wash buffer (no urea). Equi-volume amounts of fractions were mixed with 2x SDS sample buffer and analyzed by immunoblotting.' But it is still not clear whether equivalent amounts (normalized to the same OD for example) were used and if we could directly compare. A brief clarification in the legend of how samples were prepared is needed.

      The samples were normalized by first using the same volume of starting material (similar culture density and incubation period for each strain) and by loading equal volumes of each fraction for analysis. After fractionation, equi-volume amounts of the soluble and membrane protein fractions were mixed with 2× SDS sample buffer and subjected to immunoblotting, ensuring a consistent basis for comparison between samples. We have revised the figure legend and the Materials and Methods section to make this clear.

      We agree that the amount of CisA-3xFLAG appears slightly lower in the “Lysate” fraction compared to the “Membrane” fraction in Figure 3d (now Fig. 3f). However, this does not affect the overall conclusion of this experiment, showing that CisA-3xFLAG is clearly enriched in the membrane fraction.

      For reference, please find below the uncropped version of this Western blot image. Based on the signal of the unspecific bands, we would like to argue that equal amounts of samples obtained from the WT control strain (no FLAG epitope present) and a strain producing CisA-3xFLAG were loaded for each of the fractions. When we revisited this data, we noted that the protein size marker was wrong. This has been fixed.

      Author response image 1.

      (5) Fig. 4f: statistical analysis is missing.

      The missing statistical analysis has been added to this figure and figure legend.

    1. eLife Assessment

      This important study provides information on the TMEM16 family of membrane proteins, which play roles in lipid scrambling and ion transport. By simulating 27 structures representing five distinct family members, the authors captured hundreds of lipid scrambling events, offering insights into the mechanisms of lipid translocation and the specific protein regions involved in these processes. While the data on comparison of scrambling competence is compelling, the evidence for outside-the-groove scramblase activity without experimental validation is missing and is based on a limited set of observed events.

    2. Reviewer #1 (Public review):

      Summary:

      The manuscript investigates lipid scrambling mechanisms across TMEM16 family members using coarse-grained molecular dynamics (MD) simulations. While the study presents a statistically rigorous analysis of lipid scrambling events across multiple structures and conformations, several critical issues undermine its novelty, impact, and alignment with experimental observations.

      Review on revised version:

      The referee notes that the authors, in their response letter, have concurred with most of the concerns originally raised. Specifically, the authors acknowledge the referee's view that the manuscript primarily confirms previously reported findings and does not present a significantly novel advance, particularly regarding the central observation of groove-mediated lipid scrambling in the open Ca²⁺-bound TMEM16 structures. The authors have also acknowledged the potential discrepancies with existing experimental studies and have addressed this point candidly through additional discussion. Furthermore, the referee appreciates that the authors have echoed the concern regarding the limited statistical robustness of the observed scrambling events. Given that the authors have essentially affirmed the key points raised in the initial review, the referee believes that these acknowledgements reinforce the basis of the original assessment. Therefore, the referee maintains the original opinion that, despite its technical merits and useful discussion made in the revised version, the manuscript does not offer sufficient novelty or mechanistic depth.

    3. Reviewer #2 (Public review):

      Summary:

      Stephens et al. present a comprehensive study of TMEM16-members via coarse-grained MD simulations (CGMD). They particularly focus on the scramblase ability of these proteins and aim to characterize the "energetics of scrambling". Through their simulations, the authors interestingly relate protein conformational states to membrane's thickness and link those to the scrambling ability of TMEM members, measured as the trespassing tendency of lipids across leaflets. They validate their simulation with a direct qualitative comparison with Cryo-EM maps.

      Strengths:

      The study demonstrates an efficient use of CGMD simulations to explore lipid scrambling across various TMEM16 family members. By leveraging this approach, the authors are able to bypass some of the sampling limitations inherent in all-atom simulations, providing a more comprehensive and high-throughput analysis of lipid scrambling. Their comparison of different protein conformations, including open and closed groove states, presents a detailed exploration of how structural features influence scrambling activity, adding significant value to the field. A key contribution of this study is the finding that groove dilation plays a central role in lipid scrambling. The authors observe that for scrambling-competent TMEM16 structures, there is substantial membrane thinning and groove widening. The open Ca2+-bound nhTMEM16 structure (PDB ID 4WIS) was identified as the fastest scrambler in their simulations, with scrambling rates as high as 24.4 ± 5.2 events per μs. This structure also shows significant membrane thinning (up to 18 Å), which supports the hypothesis that groove dilation lowers the energetic barrier for lipid translocation, facilitating scrambling.

      The study also establishes a correlation between structural features and scrambling competence, though analyses often lack statistical robustness and quantitative comparisons. The simulations differentiate between open and closed conformations of TMEM16 structures, with open-groove structures exhibiting increased scrambling activity, while closed-groove structures do not. This finding aligns with previous research suggesting that the structural dynamics of the groove are critical for scrambling. Furthermore, the authors explore how the physical dimensions of the groove qualitatively correlate with observed scrambling rates. For example, TMEM16K induces increased membrane thinning in its open form, suggesting that membrane properties, along with structural features, play a role in modulating scrambling activity.

      Another significant finding is the concept of "out-of-the-groove" scrambling, where lipid translocation occurs outside the protein's groove. This observation introduces the possibility of alternate scrambling mechanisms that do not follow the traditional "credit-card model" of groove-mediated lipid scrambling. In their simulations, the authors note that these out-of-the-groove events predominantly occur at the dimer interface between TM3 and TM10, especially in mammalian TMEM16 structures. While these events were not observed in fungal TMEM16s, they may provide insight into Ca2+-independent scrambling mechanisms, as they do not require groove opening.

      Weaknesses:

      A significant challenge of the study is the discrepancy between the scrambling rates observed in CGMD simulations and those reported experimentally. Despite the authors' claim that the rates are in line with experiment, the observed differences can mean large energetic discrepancies in describing scrambling (larger than a 1 kT barrier in reality). For instance, the authors report scrambling rates of 10.7 events per μs for TMEM16F and 24.4 events per μs for nhTMEM16, which are several orders of magnitude faster than experimental rates. While the authors suggest that this discrepancy could be due to the Martini 3 force field's faster diffusion dynamics, this explanation does not fully account for the large difference in rates. A more thorough discussion of how the choice of force field and simulation parameters influences the results, and how these discrepancies can be reconciled with experimental data, would strengthen the conclusions. Likewise, rate calculations in the study are based on 10 μs simulations, while experimental scrambling occurs over seconds. This timescale discrepancy limits the study's accuracy, as the simulations may not capture rare or slow scrambling events that are observed experimentally and therefore might underestimate the kinetics of scrambling. It is, however, important to recognize that it is hard (borderline unachievable) to pinpoint reasonable kinetics for systems like this using the currently available computational power and force field accuracy. The faster diffusion in simulations may lead to overestimated scrambling rates, making the simulation results less comparable to real-world observations. I would therefore read the findings qualitatively rather than quantitatively. An interesting observation is the asymmetry in the scrambling rates of the two monomers. Since MARTINI is known to be limited in correctly sampling protein dynamics, the authors, in order to preserve the fold, have applied a strong (500 kJ mol⁻¹ nm⁻²) elastic network. However, I am wondering how the ENM applies across the dimer and whether any asymmetry can be noticed in the application of restraints for each monomer and at the dimer interface. How might this have biased the asymmetry in the scrambling rates observed between the monomers? Is this artificially obtained from restraining the initial structure, or is the asymmetry somehow gatekeeping the scrambling mechanism to occur mainly across a single monomer? Answering this question would have far-reaching implications for better describing the mechanism of scrambling.
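
      As a rough guide to what a rate mismatch implies energetically, the relation below converts a ratio of rates into a difference in effective barrier height. It assumes a simple Arrhenius/Eyring-type picture in which simulation and experiment share the same prefactor; this assumption, and the symbols k_sim, k_exp and n, are introduced here for illustration only and are not taken from the manuscript.

```latex
\[
k \propto \exp\!\left(-\frac{\Delta G^{\ddagger}}{k_{B}T}\right)
\quad\Longrightarrow\quad
\Delta\Delta G^{\ddagger}
= \Delta G^{\ddagger}_{\mathrm{exp}} - \Delta G^{\ddagger}_{\mathrm{sim}}
= k_{B}T \,\ln\frac{k_{\mathrm{sim}}}{k_{\mathrm{exp}}}
\]
\[
\frac{k_{\mathrm{sim}}}{k_{\mathrm{exp}}} = 10^{n}
\quad\Longrightarrow\quad
\Delta\Delta G^{\ddagger} = n \ln(10)\, k_{B}T \approx 2.3\, n\, k_{B}T
\]
```

      Under these assumptions, each order of magnitude in rate corresponds to roughly 2.3 kT of barrier height, so a mismatch of several orders of magnitude comfortably exceeds 1 kT.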

      Notably, the manuscript does not explore the impact of membrane composition on scrambling rates. While the authors use a specific lipid composition (DOPC) in their simulations, they acknowledge that membrane composition can influence scrambling activity. However, the study does not explore how different lipids or membrane environments, or varying membrane curvature and tension, could alter scrambling behaviour. I appreciate that this might have been beyond the scope of this particular paper and that the authors plan to pursue these questions further, as this work sets a strong protocol for such a study. Placing scrambling in the context of membrane composition is particularly relevant since the authors note that TMEM16K's scrambling rate increases tenfold in thinner membranes, suggesting that lipid-specific or membrane-thickness-dependent effects could play a role.

      Comments on revisions:

      I have carefully reviewed the authors' replies, which address the points I raised, and the authors have improved the manuscript by making the changes outlined in their response. In particular, I am pleased to see that the authors report ensemble averages in Figure 1-supplement 1 and add relevant information in a newly created table. I welcome the refinement of the discussion towards a more cautious quantitative comparison of experimental and computational scrambling rates. I still feel that proper statistical analysis to compare the distributions in Figure 3-figure supplement 6 would have made the points claimed even stronger, but, at the same time, I do see the authors' point in commenting on the differences between these distributions more qualitatively. Overall, I support the publication of this manuscript; it has been a pleasure to read.

    4. Reviewer #3 (Public review):

      Summary:

      The paper investigates the TMEM16 family of membrane proteins, which play roles in lipid scrambling and ion transport. A total of 27 experimental structures from five TMEM16 family members were analyzed, including mammalian and fungal homologs (e.g., TMEM16A, TMEM16F, TMEM16K, nhTMEM16, afTMEM16). The selected structures covered both Ca²⁺-bound (open) and Ca²⁺-free (closed) states to allow comparison of conformations, and were preprocessed (e.g., modeling missing loops) and equilibrated. Coarse-grained simulations were performed in DOPC membranes for 10 microseconds to capture scrambling events. These events were identified by tracking lipids transitioning between the two membrane leaflets, and the authors analysed the correlation between scrambling rates and structural properties such as groove dilation and membrane thinning. They report 700 scrambling events across structures, and Figure 2 elaborates on how open structures show higher activity, as expected. The authors also address how scrambling may require an open groove; this and other aspects of the scrambling mechanism remain somewhat controversial in the field.
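
      To make the idea of "tracking lipids transitioning between the two membrane leaflets" concrete, the following is a minimal sketch of one way such events could be counted. It is not the authors' analysis code: the function names, the use of a headgroup z-coordinate relative to the bilayer midplane, and the 1.0 nm cutoff are all hypothetical choices made for illustration.

```python
def count_scrambling_events(z_traj, cutoff=1.0):
    """Count leaflet-to-leaflet transitions for a single lipid.

    z_traj: headgroup z positions relative to the bilayer midplane (nm),
            one value per frame (hypothetical input format).
    cutoff: a lipid is assigned to a leaflet only when |z| > cutoff
            (hypothetical value), so brief excursions near the midplane
            are not counted as events.
    """
    events = 0
    leaflet = 0  # +1 = upper, -1 = lower, 0 = not yet assigned
    for z in z_traj:
        if z > cutoff:
            current = 1
        elif z < -cutoff:
            current = -1
        else:
            continue  # between leaflets: keep the previous assignment
        if leaflet != 0 and current != leaflet:
            events += 1
        leaflet = current
    return events


def scrambling_rate(trajectories, sim_time_us, cutoff=1.0):
    """Total events over all lipids divided by simulation length (events per microsecond)."""
    total = sum(count_scrambling_events(t, cutoff) for t in trajectories)
    return total / sim_time_us
```

      The cutoff acts as a simple hysteresis band: a lipid that dips towards the midplane and returns to its original leaflet is not counted, whereas a lipid that settles in the opposite leaflet is counted once.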

      Strengths:

      The strength of this study emerges from the comparative analysis of multiple structural starting points to understand global/local motions of the protein with respect to lipid movement. Although the protein is well studied, both experimentally and computationally, the analysis of conformational events in different family members, especially regarding membrane thickness compared to fungal scramblases, offers good insights.

      Weaknesses:

      A weakness of the work is that it does not fully reconcile with the experimental evidence of Ca²⁺-independent scrambling observed in prior studies, although this is challenging to achieve using coarse-grained molecular simulations. Previous reports have identified lipid crossing, packing defects and other associated events, so it is difficult to place this paper in that context. Moreover, the absence of validation leaves certain claims, like alternative scrambling pathways, speculative.

    5. Author response:

      The following is the authors’ response to the current reviews.

      We wanted to clarify Reviewer #1’s latest comment in the last round of review, “Furthermore, the referee appreciates that the authors have echoed the concern regarding the limited statistical robustness of the observed scrambling events.” We appreciate the follow-up information provided by Reviewer #1 that their comment is specifically about the low-count alternative pathway events that we observe at the dimer interface, and not the statistics of the manuscript overall, as they believe that “the study presents a statistically rigorous analysis of lipid scrambling events across multiple structures and conformations” (Reviewer #1). We agree with the Reviewer and acknowledge that overall our coarse-grained study represents the most comprehensive single study of the entire TMEM16 family to date.


      The following is the authors’ response to the original reviews.

      Public Review:

      Reviewer #1 (Public review):

      Summary:

      The manuscript investigates lipid scrambling mechanisms across TMEM16 family members using coarse-grained molecular dynamics (MD) simulations. While the study presents a statistically rigorous analysis of lipid scrambling events across multiple structures and conformations, several critical issues undermine its novelty, impact, and alignment with experimental observations.

      Critical issues:

      (1) Lack of Novelty:

      The phenomenon of lipid scrambling via an open hydrophilic groove is already well-established in the literature, including through atomistic MD simulations. The authors themselves acknowledge this fact in their introduction and discussion. By employing coarse-grained simulations, the study essentially reiterates previously known findings with limited additional mechanistic insight. The repeated observation of scrambling occurring predominantly via the groove does not offer significant advancement beyond prior work.

      We agree with the reviewer’s statement regarding the lack of novelty when it comes to our observations of scrambling in the groove of open Ca2+-bound TMEM16 structures. However, we feel that the inclusion of closed structures in this study, which attempts to address the yet unanswered question of how scrambling by TMEM16s occurs in the absence of Ca2+, offers new observations for the field. In our study we specifically address to what extent the induced membrane deformation, which has been theorized to help lipids cross the bilayer, especially in the absence of Ca2+, contributes to the rate of scrambling (see references 36, 59, and 66). There are also several TMEM16F structures solved under activating conditions (bound to Ca2+ and in the presence of PIP2) which feature structural rearrangements to TM6 that may be indicative of an open state (PDB 6P48) and had not been tested in simulations. We show that these structures do not scramble and thereby present evidence against an out-of-the-groove scrambling mechanism for these states. Although we find a handful of examples of lipids being scrambled by Ca2+-free structures of TMEM16 scramblases, none of our simulations suggest that these events are related to the degree of deformation.

      (2) Redundancy Across Systems:

      The manuscript explores multiple TMEM16 family members in activating and non-activating conformations, but the conclusions remain largely confirmatory. The extensive dataset generated through coarse-grained MD simulations primarily reinforces established mechanistic models rather than uncovering fundamentally new insights. The effort, while statistically robust, feels excessive given the incremental nature of the findings.

      Again, we agree with the reviewer’s statement that our results largely confirm those published by other groups and our own. We think there is, however, value in comparing the scrambling competence of these TMEM16 structures in a consistent manner in a single study, to reduce inconsistencies that may be introduced by the different simulation methods, parameters, and environmental variables such as lipid composition used in other published works on single family members. The consistency across our simulations and the high number of observed scrambling events have allowed us to confirm that the mechanism of scrambling is shared by multiple family members and relies most obviously on groove dilation.

      (3) Discrepancy with Experimental Observations:

      The use of coarse-grained simulations introduces inherent limitations in accurately representing lipid scrambling dynamics at the atomistic level. Experimental studies have highlighted nuances in lipid permeation that are not fully captured by coarse-grained models. This discrepancy raises questions about the biological relevance of the reported scrambling events, especially those occurring outside the canonical groove.

      We thank the reviewer for bringing up the possible inaccuracies introduced by coarse graining our simulations. This is also a concern for us, and we address this issue extensively in our discussion. As the reviewer pointed out above, our CG simulations have largely confirmed existing evidence in the field which we think speaks well to the transferability of observations from atomistic simulations to the coarse-grained level of detail. We have made both qualitative and quantitative comparisons between atomistic and coarse-grained simulations of nhTMEM16 and TMEM16F (Figure 1, Figure 4-figure supplement 1, Figure 4-figure supplement 5) showing the two methods give similar answers for where lipids interact with the protein, including outside of the canonical groove. We do not dispute the possible discrepancy between our simulations and experiment, but our goal is to share new nuanced ideas for the predicted TMEM16 scrambling mechanism that we hope will be tested by future experimental studies.

      (4) Alternative Scrambling Sites:

      The manuscript reports scrambling events at the dimer-dimer interface as a novel mechanism. While this observation is intriguing, it is not explored in sufficient detail to establish its functional significance. Furthermore, the low frequency of these events (relative to groove-mediated scrambling) suggests they may be artifacts of the simulation model rather than biologically meaningful pathways.

      We agree with the reviewer that our observed number of scrambling events in the dimer interface is too low to present it as strong evidence for it being the alternative mechanism for Ca2+-independent scrambling. This will require additional experiments and computational studies which we plan to do in future research. However, we are less certain that these are artifacts of the coarse-grained simulation system as we observed a similar event in an atomistic simulation of TMEM16F.

      Conclusion:

      Overall, while the study is technically sound and presents a large dataset of lipid scrambling events across multiple TMEM16 structures, it falls short in terms of novelty and mechanistic advancement. The findings are largely confirmatory and do not bridge the gap between coarse-grained simulations and experimental observations. Future efforts should focus on resolving these limitations, possibly through atomistic simulations or experimental validation of the alternative scrambling pathways.

      Reviewer #2 (Public review):

      Summary:

      Stephens et al. present a comprehensive study of TMEM16-members via coarse-grained MD simulations (CGMD). They particularly focus on the scramblase ability of these proteins and aim to characterize the "energetics of scrambling". Through their simulations, the authors interestingly relate protein conformational states to the membrane's thickness and link those to the scrambling ability of TMEM members, measured as the trespassing tendency of lipids across leaflets. They validate their simulation with a direct qualitative comparison with Cryo-EM maps.

      Strengths:

      The study demonstrates an efficient use of CGMD simulations to explore lipid scrambling across various TMEM16 family members. By leveraging this approach, the authors are able to bypass some of the sampling limitations inherent in all-atom simulations, providing a more comprehensive and high-throughput analysis of lipid scrambling. Their comparison of different protein conformations, including open and closed groove states, presents a detailed exploration of how structural features influence scrambling activity, adding significant value to the field. A key contribution of this study is the finding that groove dilation plays a central role in lipid scrambling. The authors observe that for scrambling-competent TMEM16 structures, there is substantial membrane thinning and groove widening. The open Ca2+-bound nhTMEM16 structure (PDB ID 4WIS) was identified as the fastest scrambler in their simulations, with scrambling rates as high as 24.4 ± 5.2 events per μs. This structure also shows significant membrane thinning (up to 18 Å), which supports the hypothesis that groove dilation lowers the energetic barrier for lipid translocation, facilitating scrambling.

      The study also establishes a correlation between structural features and scrambling competence, though analyses often lack statistical robustness and quantitative comparisons. The simulations differentiate between open and closed conformations of TMEM16 structures, with open-groove structures exhibiting increased scrambling activity, while closed-groove structures do not. This finding aligns with previous research suggesting that the structural dynamics of the groove are critical for scrambling. Furthermore, the authors explore how the physical dimensions of the groove qualitatively correlate with observed scrambling rates. For example, TMEM16K induces increased membrane thinning in its open form, suggesting that membrane properties, along with structural features, play a role in modulating scrambling activity.

      Another significant finding is the concept of "out-of-the-groove" scrambling, where lipid translocation occurs outside the protein's groove. This observation introduces the possibility of alternate scrambling mechanisms that do not follow the traditional "credit-card model" of groove-mediated lipid scrambling. In their simulations, the authors note that these out-of-the-groove events predominantly occur at the dimer interface between TM3 and TM10, especially in mammalian TMEM16 structures. While these events were not observed in fungal TMEM16s, they may provide insight into Ca2+-independent scrambling mechanisms, as they do not require groove opening.

      Weaknesses:

      A significant challenge of the study is the discrepancy between the scrambling rates observed in CGMD simulations and those reported experimentally. Despite the authors' claim that the rates are in line with experiment, the observed differences can mean large energetic discrepancies in describing scrambling (larger than a 1 kT barrier in reality). For instance, the authors report scrambling rates of 10.7 events per μs for TMEM16F and 24.4 events per μs for nhTMEM16, which are several orders of magnitude faster than experimental rates. While the authors suggest that this discrepancy could be due to the Martini 3 force field's faster diffusion dynamics, this explanation does not fully account for the large difference in rates. A more thorough discussion on how the choice of force field and simulation parameters influence the results, and how these discrepancies can be reconciled with experimental data, would strengthen the conclusions. Likewise, rate calculations in the study are based on 10 μs simulations, while experimental scrambling rates occur over seconds. This timescale discrepancy limits the study's accuracy, as the simulations may not capture rare or slow scrambling events that are observed experimentally and therefore might underestimate the kinetics of scrambling. It's however important to recognize that it's hard (borderline unachievable) to pinpoint reasonable kinetics for systems like this using the currently available computational power and force field accuracy. The faster diffusion in simulations may lead to overestimated scrambling rates, making the simulation results less comparable to real-world observations. I would therefore read the findings qualitatively rather than quantitatively. An interesting observation is the asymmetry observed in the scrambling rates of the two monomers.
Since MARTINI is known to be limited in correctly sampling protein dynamics, the authors - in order to preserve the fold - have applied a strong (500 kJ mol-1 nm-2) elastic network. However, I am wondering how the ENM applies across the dimer and if any asymmetry can be noticed in the application of restraints for each monomer and at the dimer interface. How can this have potentially biased the asymmetry in the scrambling rates observed between the monomers? Is this artificially obtained from restraining the initial structure, or is the asymmetry somehow gatekeeping the scrambling mechanism to occur majorly across a single monomer? Answering this question would have far-reaching implications to better describe the mechanism of scrambling.

      The main aim of our computational survey was to directly compare all relevant published TMEM16 structures in both open and closed states using the Martini 3 CGMD force field. Our standardized simulation and analysis protocol allowed us to quantitatively compare scrambling rates across the TMEM16 family, something that has never been done before. We do acknowledge that direct comparison between simulated and experimental scrambling rates is complicated and is best interpreted qualitatively. In line with other reports (e.g., Li et al, PNAS 2024), lipid scrambling in CGMD is 2-3 orders of magnitude faster than typical experimental findings. In the CG simulation field, these increased dynamics due to the smoother energy landscape are a well-known phenomenon. In our view, this is a valuable trade-off for being able to capture statistically robust scrambling dynamics and gain mechanistic understanding in the first place, since these are currently challenging to obtain otherwise. For example, with all-atom MD it would have been near-impossible to conclude that groove openness and high scrambling rates are closely related, simply because one would only measure a handful of scrambling events in (at most) a handful of structures.

      Considering the elastic network: the reviewer is correct in that the elastic network restrains the overall structure to the experimental conformation. This is necessary because the Martini 3 force field does not accurately model changes in secondary (and tertiary) structure. In fact, by retaining the structural information from the experimental structures, we argue that the elastic network helped us arrive at the conclusion that groove openness is the major contributing factor in determining a protein’s scrambling rate. This is best exemplified by the asymmetric X-ray structure of TMEM16K (5OC9), in which the groove of one subunit is more dilated than the other. In our simulation, this information was stored in the elastic network, yielding a 4x higher rate in the open groove than in the closed groove, within the same trajectory.

      Notably, the manuscript does not explore the impact of membrane composition on scrambling rates. While the authors use a specific lipid composition (DOPC) in their simulations, they acknowledge that membrane composition can influence scrambling activity. However, the study does not explore how different lipids or membrane environments or varying membrane curvature and tension, could alter scrambling behaviour. I appreciate that this might have been beyond the scope of this particular paper and the authors plan to further chase these questions, as this work sets a strong protocol for this study. Contextualizing scrambling in the context of membrane composition is particularly relevant since the authors note that TMEM16K's scrambling rate increases tenfold in thinner membranes, suggesting that lipid-specific or membrane-thickness-dependent effects could play a role.

      Considering different membrane compositions: for this study, we chose to keep the membranes as simple as possible. We opted for pure DOPC membranes because DOPC (1) has negligible intrinsic curvature, (2) forms fluid membranes, and (3) was used previously by others (Li et al, PNAS 2024). As mentioned by the reviewer, we believe our current study defines a good, standardized protocol and solid baseline for future efforts looking into the additional effects of membrane composition, tension, and curvature that could all affect TMEM16-mediated lipid scrambling.

      Reviewer #3 (Public review):

      Strengths:

      The strength of this study emerges from a comparative analysis of multiple structural starting points and understanding global/local motions of the protein with respect to lipid movement. Although the protein is well-studied, both experimentally and computationally, the understanding of conformational events in different family members, especially membrane thickness less compared to fungal scramblases offers good insights.

      We appreciate the reviewer recognizing the value of the comparative study. In addition to valuable insights from previous experimental and computational work, we hope to put forward a unifying framework that highlights various TMEM16 structural features and membrane properties that underlie scrambling function.

      Weaknesses:

      The weakness of the work is to fully reconcile with experimental evidence of Ca²⁺-independent scrambling rates observed in prior studies, but this part is also challenging using coarse-grain molecular simulations. Previous reports have identified lipid crossing, packing defects, and other associated events, so it is difficult to place this paper in that context. However, the absence of validation leaves certain claims, like alternative scrambling pathways, speculative.

      Answer: It is generally difficult to quantitatively compare bulk measurements of scrambling phenomena with simulation results. The advantage of simulations is to directly observe the transient scrambling events at a spatial and temporal resolution that is currently unattainable for experiments. The current experimental evidence for the precise mechanism of Ca2+-independent scrambling is still under debate. We therefore hope to leverage the strength of MD and statistical rigor of coarse-grained simulations to generate testable hypotheses for further structural, biochemical, and computational studies.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      The findings are largely confirmatory and do not bridge the gap between coarse-grained simulations and experimental observations. Future efforts should focus on resolving these limitations, possibly through atomistic simulations or experimental validation of the alternative scrambling pathways.

      While we agree with what the reviewer may be hinting at regarding limitations of coarse-grained MD simulations, we believe that our study holds much more merit than this comment suggests. We have provided something that has yet to be done in the field: a comprehensive study that directly compares the scrambling rates of multiple TMEM16 family members in different conformations using identical simulation conditions. Our work clearly shows that a sufficiently dilated groove is the major structural feature that enables robust scrambling for all TMEM16 scramblase members with solved structures. While all TMEM16s cause significant distortion and thinning of the membrane, we assert that the extreme thinning observed around open grooves is significantly enhanced by the lipid scrambling itself as the two leaflets merge through lipid exchange. We saw no evidence that membrane thinning/distortion alone, in the absence of an open groove, could support scrambling at the rates observed under activating conditions or even the low rates observed in Ca2+-independent scrambling. Moreover, our handful of observations of scrambling events outside of the groove, which have not yet been reported in any study, opens an exciting new direction for studying alternative scrambling mechanisms. That said, we are currently following up on many of the observations reported here, such as scrambling events outside the groove, the kinetics of scrambling, the possibility that lipids line the groove of non-scramblers like TMEM16A, etc. This is being done experimentally with our collaborators through site-directed mutagenesis and with all-atom MD in our lab. Unfortunately, it is well beyond the scope of the current study to include all of this in the current paper.

      Reviewer #2 (Recommendations for the authors):

      Major comments and questions:

      (1) Line 214 and Figure 1- Figure Supplement 1: why have you only compared the final frame of the trajectory to the cryo-EM structure? Even if these comparisons are qualitative, they should be representative of the entire trajectory, not a single frame.

      We thank the reviewer for this suggestion and replaced the single-frame snapshots in Figure 1-figure supplement 1 with ensemble-averaged headgroup densities. The overall agreement between membrane shapes in CGMD and cryo-EM was not affected by this change.

      (2) Lines 228-231: You comment 'Residues in this site on nhTMEM16 and TMEMF also seem to play a role in scrambling but the mechanism by which they do so is unclear.' This is something you could attempt to quantify in the simulations by calculating the correlation between scrambling and protein-membrane interactions/contacts in this site. Can you speculate on a mechanism that might be a contributing factor?

      We probed the correlation between these residues and scrambling lipids, as suggested by the reviewer, and interestingly not all scrambling lipids interact with these residues. Yet there is strong lipid density in this vicinity (see insets in Figure 1 and Figure 4-figure supplement 2). These observations lead us to suspect these residues impact scrambling indirectly through influencing the conformation of the protein or flexibility and shape of the membrane. This interpretation fits with mutagenesis studies highlighting a role for these residues in scrambling (see refs 59, 62, and 67). Specifically, Falzone et al. 2022 (ref 59) suggested that they may thin the membrane near the groove, but this has not been tested via structure determination and a detailed model of how they impact scrambling is missing. We could address this question with in silico mutations; however, CG simulation is not an appropriate method to study large scale protein dynamics, and AA simulations are likely best, but beyond the scope of this paper.

      (3) Lines 240-245 and Figure 1B: This section discusses the coupling between membrane distortions and the sinusoidal curve around the protein, however, Figure 1B only shows snapshots of the membrane distortions. Is it possible to understand how these two collective variables are correlated quantitatively (as opposed to the current qualitative analysis)?

      We believe that it may be possible to quantitatively capture these two key features of the membrane, as we did previously with nhTMEM16 using our continuum elasticity-based model of the membrane (Bethel and Grabe 2016). Our model agreed with all atom MD surfaces to within ~1 Å, hence showing good quantitative agreement throughout the entire membrane. However, we doubt that we could distill the essence of our model down to a simple functional relationship between the sinusoidal wave and pinching, which we think the reviewer is asking. Rather, we believe that the large-scale sinusoidal distortion (collective variable 1) and pinching/distortion (collective variable 2) near the groove arise from the interplay of the specific protein surface chemistry for each protein (patterning of polar and non-polar residues) and the membrane. This is why we chose to simply report the distinct patterns that the family members impose on the surrounding membrane, which we think is fascinating. Specifically, Fig. 1B shows that different TMEM16 family members distort the membrane in different ways. Most notably, fungal TMEM16s feature a more pronounced sinusoidal deformation, whereas the mammalian members primarily produce local pinching. Then, in Fig. 3A we show that the thinning at the groove happens in all structures and is more pronounced in open, scrambling-competent conformations. In other words, proteins can show very strong thinning (e.g. TMEM16K, 5OC9) even though the membrane generally remains flat.

      (4) Lines 257-258: Authors comment that TMEM16A lacks scramblase activity yet can achieve a fully lipid-lined groove (note the typo - should be lipid-lined, not lipid-line). Is a fully lipid-lined groove a prerequisite for scramblase activity? Are lipid-lined grooves the only requirement for scramblase activity? Could the authors clarify exactly what the prerequisite for scramblase activity is to avoid any confusion; this will be useful for later descriptions (i.e. line 295) where scrambling competence is again referred to. Additionally, the associated figure panel (Figure 1D) shows a snapshot of this finding but lacks any statistical quantifications - is a fully lipid-lined groove a single event? Perhaps the additional analyses, such as the groove-lipid contacts, may be useful here.

      The definition of lipid scrambling is that a lipid fully transitions from one membrane leaflet to the other. While a single lipid could transition through the groove on its own, it is well documented in both atomistic and CG MD simulations, that lipid scrambling typically happens through a lipid-lined groove, as shown in Fig. 1A-B. The lipids tend to form strong choline-to-phosphate interactions with nearest neighbors that make this energetically favorable. That said, lipid-lined grooves are not sufficient for robust scrambling, which is what we show in Fig. 1D where the non-scrambler TMEM16A did in fact feature a lipid-lined groove. As suggested, we performed contact analysis and found that residue K645 on TM6 in the middle of the groove contacts lipids in 9.2% of the simulation frames.

      To get a better understanding of how populated the TM4-TM6 pathway is with lipids across all simulated structures, we determined for every simulation frame how many headgroup beads resided in the groove. This indicates that the ion-conductive state of TMEM16A (5OYB*, Fig. 1D) only had 1 lipid in the pathway, on average, meaning that the configuration shown in Fig. 1D is indeed exceptional. As a reference, our strongest scrambler, nhTMEM16 4WIS, had an average of 2.8 lipids in the groove. We added a table containing the means and standard deviations that resulted from this analysis as Figure 1-table supplement 1.
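
      The per-frame groove occupancy described above can be sketched as follows. This is only an illustration under stated assumptions, not the analysis script used for the paper: the groove is approximated as an axis-aligned box, and the function name `groove_occupancy`, the box bounds, and the coordinates are all hypothetical.

```python
from statistics import mean, pstdev

def groove_occupancy(frames, box_min, box_max):
    """Count headgroup beads inside a box-shaped region for every frame.

    frames: list of frames, each a list of (x, y, z) bead coordinates in Å.
    box_min, box_max: (x, y, z) corners of the hypothetical groove region.
    Returns (per-frame counts, mean count, population standard deviation).
    """
    counts = []
    for beads in frames:
        n_inside = sum(
            all(lo <= c <= hi for c, lo, hi in zip(bead, box_min, box_max))
            for bead in beads
        )
        counts.append(n_inside)
    return counts, mean(counts), pstdev(counts)

# Synthetic coordinates (not simulation data): 3 frames with 4 beads each.
frames = [
    [(1, 1, 1), (5, 5, 5), (20, 0, 0), (-1, 2, 2)],
    [(11, 1, 1), (5, 5, 15), (2, 2, 2), (30, 30, 30)],
    [(1, 1, 1), (2, 2, 2), (3, 3, 3), (50, 0, 0)],
]
counts, avg, sd = groove_occupancy(frames, (0, 0, 0), (10, 10, 10))
print(counts, avg, round(sd, 3))
```

      A real analysis would define the groove region relative to the TM4 and TM6 helices rather than with a fixed box.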

      (5) Lines 295-298: Because the scrambling rates of the Ca²⁺-bound and Ca²⁺-free structures fall within overlapping error margins, it becomes difficult to definitively state that Ca²⁺ binding significantly enhances scrambling activity. This undermines the claim that the Ca²⁺-bound structure is the strongest scrambler. The authors should conduct statistical analyses to determine if the difference between the two conditions is statistically significant.

      In contrast to the reviewer’s comment, we do not claim that Ca2+-binding itself enhances lipid scrambling. Instead, what we show is that WT structures that are solved in an open conformation (all of which are Ca2+-bound, except 6QM6) are robust scramblers. For nhTMEM16, we did not observe any scrambling events for the closed-groove proteins, making further statistical analysis redundant.

      (6) The authors claim that the scrambling rates derived from their MD simulations are in "excellent agreement" with experimental findings (lines 294-295), despite significant discrepancy between simulated and experimentally measured rates. For example, the simulated rate of 24.4 ± 5.2 events/µs for the open, Ca²⁺-bound fungal nhTMEM16 (PDB ID 4WIS) corresponds to approximately 24 million events per second, which is vastly higher than experimental rates. Experimental studies have reported scrambling rate constants of ~0.003 s⁻¹ for TMEM16 family members in the absence of Ca²⁺, measured under physiological conditions (https://doi.org/10.1038/s41467-019-11753-1 ). Even with Ca²⁺ activation, scrambling rates remain several orders of magnitude lower than the rates observed in simulations. Moreover, this highlights a larger problem: lipid scrambling rates occur over timescales that are not captured by these simulations. While the authors allude to these discrepancies (lines 605-606), they should be emphasised in the text, as opposed to the table caption. These should also be reconducted to differences between the membrane compositions of different studies.

      We agree with the spirit of the reviewer’s comment, and because of that, we were very careful not to claim that we reproduce experimental scrambling rates, just that the trends (scrambling-competent, or not) are correct. On lines 294-295, we actually said that the scrambling rates in our simulations excellently agree with “the presumed scrambling competence of each experimental structure”, which is true. 

      As explained extensively in the discussion section of our paper (and by many others), direct comparison between MD (e.g., Martini 3, but also atomistic force fields) dynamics and experimental measurements is challenging. The primary goal of our paper is to quantify and compare the scrambling capacity of different TMEM16 family members and different states, within a CGMD context.

      That said, we agree with the reviewer that we may have missed rare or long-timescale events (as is the case in any MD experiment) and added this point to the discussion.
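
      The size of the gap raised in comment 6 can be made concrete with a short unit conversion. The sketch below uses the two numbers quoted in the reviewer's comment (24.4 events/µs simulated, ~0.003 s⁻¹ measured without Ca²⁺); the function names are hypothetical. The barrier estimate assumes a simple Arrhenius-type relation with identical prefactors, which is an assumption made here for illustration and not a claim from the manuscript. Because the comparison pairs a Ca²⁺-bound simulation with a Ca²⁺-free measurement, it should be read as an upper bound on the mismatch.

```python
import math

US_PER_S = 1e6  # microseconds per second

def orders_of_magnitude(sim_rate_per_us, exp_rate_per_s):
    """log10 of the ratio between a simulated and an experimental rate."""
    sim_rate_per_s = sim_rate_per_us * US_PER_S
    return math.log10(sim_rate_per_s / exp_rate_per_s)

def apparent_barrier_shift_kT(sim_rate_per_us, exp_rate_per_s):
    """Barrier difference in units of kT implied by the rate ratio,
    assuming rate ~ exp(-barrier / kT) with equal prefactors."""
    sim_rate_per_s = sim_rate_per_us * US_PER_S
    return math.log(sim_rate_per_s / exp_rate_per_s)

gap = orders_of_magnitude(24.4, 0.003)
shift = apparent_barrier_shift_kT(24.4, 0.003)
print(f"{gap:.1f} orders of magnitude, about {shift:.0f} kT")
```

      Under these assumptions the two quoted numbers differ by roughly ten orders of magnitude, which is why the simulated rates are best used to rank structures against each other rather than to predict absolute kinetics.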

      (7) To address these discrepancies, the authors should: i) emphasize that simulated rates serve as qualitative indicators of scrambling competence rather than absolute values comparable to experimental findings and ii) discuss potential reasons for the divergence, such as simulation timescale limitations or lipid bilayer compositions that may favor scrambling and force field inaccuracies.

      Please see our answer to question 6. Within the context of our CGMD survey, we confidently call our results quantitative. However, we agree with the reviewer that comparison with experimental scrambling rates is qualitative and should be interpreted with caution. To reflect this, we rewrote the first sentence of the relevant paragraph in the discussion section.

      (8) Line 310: Can the authors provide a rationale as to why one monomer has a wider groove than the other? Perhaps a contact analysis could be useful. See the comment above about ENM.

      The simulation of Ca2+-bound TMEM16K was initiated from an asymmetric X-ray structure in which chain B features a more dilated groove than chain A (PDB 5OC9). The backbones of TM4 and TM6 in the closed groove (A) are close enough together to be directly interconnected by the elastic network. In contrast, TM4 and TM6 in the more dilated subunit (B) are not restricted by the elastic network and, as a consequence, display some “breathing” behavior (Fig. 3B and Fig. 3-Suppl. 6A), giving rise to a ~4x higher scrambling rate. We explicitly added the word “cryo-EM” and the PDB ID to the sentence to emphasize that the asymmetry stems from the original experimental structure.

      When answering this question, we also corrected a mislabeled chain identifier in Fig. 2-Suppl. 3A: the original manuscript read ‘chain A’ where it should have read ‘chain B’.

      (9) Line 312: Authors speculate that increased groove width likely accounts for increased scrambling rates. For statistical significance, authors should attempt to correlate scrambling rates and groove width over the simulation period.

      The Reviewer is referring to our description of the scrambling rates we measured for TMEM16K, where we noted that the groove with the highest scrambling rate is, on average, wider than the groove of the opposite subunit, which is below 6 Å. We do not suggest that the correlation between scrambling and groove width is continuous, as the Reviewer may have interpreted from our original submission; rather, we think it is a binary outcome – lipids cannot easily enter narrow grooves (< 6 Å), and hence scrambling can only occur once this threshold is reached, at which point it occurs at a near-constant rate. We showed this for 4 different family members in the original Fig. 3B, where scrambling events (black dots) were much more likely during, or right after, groove dilation to distances > 6 Å.

      (10) Line 359: Authors have plotted the minimum distance between residues TM4 and TM6 in Fig. 3A/B, claiming that a wide groove is required for scrambling. Upon closer examination, it is clear that several of these distributions overlap, reducing the statistical significance of these claims. Statistical tests (i.e. KS-tests) should be performed to determine whether the differences in distributions are significant.

      The Reviewer appears to be asking for a statistical test between the six distance distributions represented by the data in Fig. 3A for the scrambling competent structures (6QP6*, 8B8J, 6QM6, 7RXG, 4WIS, 5OC9), and we think this is being asked because it is believed that we are making a claim that the greater the distance, the greater the scrambling rate. If we have interpreted this comment correctly, we are not making this claim. Rather, we are simply stating that we only observe robust scrambling when the groove width regularly separates beyond 6 Å. The full distance distributions can now be found in Figure 3-figure supplement 6B, and we agree there is significant overlap between some of these distributions. However, the distinguishing characteristic of the 6 distributions from scrambling competent proteins is that they all access large distances, while the others do not. Notably, TMEM16F proteins (6QP6*, 8B8J) are below the 6 Å threshold on average, but they have wide standard deviations and spend well over ¼ of their time in the permissive regime (the upper error bar in the whisker plots in Fig. 3A is the 75% boundary).
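
      The threshold argument above can be illustrated with a small sketch: a groove can sit below 6 Å on average and still spend a substantial fraction of frames in the permissive regime. The width values below are synthetic, and the function name `fraction_open` is hypothetical; only the 6 Å threshold comes from the response.

```python
from statistics import mean, quantiles

THRESHOLD_A = 6.0  # groove-width threshold in Å quoted in the response

def fraction_open(widths, threshold=THRESHOLD_A):
    """Fraction of frames in which the groove width exceeds the threshold."""
    return sum(w > threshold for w in widths) / len(widths)

# Synthetic TM4-TM6 minimum distances (Å), one value per frame.
widths = [4.8, 5.2, 5.5, 5.9, 6.3, 7.1, 5.1, 6.8]

frac = fraction_open(widths)
avg_width = mean(widths)
# Quartiles, i.e. the box boundaries of a box-and-whisker plot.
q1, median, q3 = quantiles(widths, n=4, method="inclusive")

print(f"mean = {avg_width:.2f} Å, upper quartile = {q3:.2f} Å, "
      f"fraction above threshold = {frac:.3f}")
```

      In this toy series the mean stays below the threshold while the upper quartile lies above it, which is the pattern described for the TMEM16F structures.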

      (11) Line 363-364: The authors state that all TMEM16 structures thin the membrane. Could the authors include a description of how membrane thinning is calculated, for instance, is the entire membrane considered, or is thinning calculated on a membrane patch close to the protein? Do membrane patches closer to the transmembrane protein increase or decrease thickness due to hydrophobic packing interactions? The latter question is of particular concern since Martini3 has been shown to induce local thinning of the membrane close to transmembrane helices, yielding thicknesses 2-3 Å thinner than those reported experimentally (https://doi.org/10.1016/j.cplett.2023.140436). This could be an important consideration in the authors' comparison to the bulk membrane thickness (line 364). Finally, how is the 'bulk membrane thickness' measured (i.e., from the CG simulations, from AA simulations, or from experiments)?

      Regarding the calculation of thinning and bulk membrane thickness, as described in Method “Quantification of membrane deformations”, the minimal membrane thickness, or thinning, is defined as the shortest distance between any two points from the interpolated upper and lower leaflet surfaces constructed using the glycerol beads (GL1 and GL2). Bulk membrane thickness is calculated by taking the vertical distance between the averaged glycerol surfaces at the membrane edge.
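
      The two definitions above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code used in the paper: the leaflet surfaces are given as already-interpolated lists of (x, y, z) points, the "membrane edge" is passed in as an explicit set of (x, y) grid positions, and the function names and coordinates are hypothetical.

```python
import math

def minimal_thickness(upper, lower):
    """Shortest distance between any point on the upper surface
    and any point on the lower surface."""
    return min(math.dist(p, q) for p in upper for q in lower)

def bulk_thickness(upper, lower, edge_xy):
    """Vertical distance between the mean heights of the two surfaces,
    using only points whose (x, y) position belongs to the edge set."""
    z_up = [z for x, y, z in upper if (x, y) in edge_xy]
    z_lo = [z for x, y, z in lower if (x, y) in edge_xy]
    return sum(z_up) / len(z_up) - sum(z_lo) / len(z_lo)

# Synthetic 3 x 3 grid (Å): flat leaflets at z = +15 and z = -15,
# pinched to z = +9 and z = -9 at the central grid point.
grid = [(x, y) for x in (0, 10, 20) for y in (0, 10, 20)]
center = (10, 10)
upper = [(x, y, 9 if (x, y) == center else 15) for x, y in grid]
lower = [(x, y, -9 if (x, y) == center else -15) for x, y in grid]
edge = {xy for xy in grid if xy != center}

thin = minimal_thickness(upper, lower)
bulk = bulk_thickness(upper, lower, edge)
print(thin, bulk)
```

      The all-pairs search is quadratic in the number of surface points, which is acceptable for a coarse grid; a finer grid would call for a spatial index.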

      The concern of localized membrane deformation due to force field artifacts is well-founded. However, the sinusoidal deformations shown here are much greater than 2-3 Å Martini3 imperfections, and they extend for up to 10 Å radially away from the protein into the bulk membrane (see Figure 3-figure supplement 1-5 for more of a description). Most importantly, the sinusoidal wave patterns set up by the proteins are very similar to those described in the previous continuum calculation and all-atom MD for nhTMEM16 (https://www.pnas.org/doi/full/10.1073/pnas.1607574113).

      (12) Line 374: The authors state a 'positive correlation' between membrane thinning/groove opening and scrambling rates. To support this claim, the authors should report the correlation coefficients.

      We have removed any discussion concerning correlations between the magnitude of the scrambling rate and the degree of membrane thinning/groove opening. Rather we simply state that opening beyond a threshold distance is required for robust scrambling, as shown in our analysis in Fig. 3A.

      Concerning the relation between thinning and scrambling: Instantaneous membrane thinning is poorly defined (because it is governed by fluctuations of single lipids), and therefore difficult to correlate with the timing of individual scrambling events in a meaningful way.  Moreover, as we state later in that same section, “we argue that the extremely thin membranes are likely correlated with groove opening, rather than being an independent contributing factor to lipid scrambling”.

      (13) Line 396: It is stated that TMEM16A is not a scramblase but the simulated scrambling activity is not zero. How can you be sure that you are monitoring the correct collective variable if you are getting a false positive with respect to experiments?

      We only observe 2 scrambling events in 10 μs, which is a very small rate compared to the scrambling-competent states. In a previous large Martini CG simulation survey that inspired our protocol (Li et al, PNAS 2024), the authors employed a 1 event/μs cut-off to distinguish scramblers from non-scramblers. Hence, they would have called TMEM16A a non-scrambler as well. We expect that false positives in this context might be an artifact of the CG force field, or it could be that TMEM16A can scramble but too slowly to be experimentally detected. Regarding the collective variable for lipid flipping, it is correct, and we know that this lipid actually flipped.

      (14) Line 402: Distance distributions for the electrostatic interactions between E633 and K645 should be included in the manuscript. This is also the case for the interactions between E843-K850 (lines 491-492).

      Our description of interactions between lipid headgroups and E633 and K645 in TMEM16A (5OYB*) is based on qualitative observations of the MD trajectory, and we highlight an example of this interaction in Figure 3-video 4. The video clearly shows that the lipid headgroups in the center of the groove orient themselves such that the phosphate bead (red) rests just above K645 (blue) and at other times the choline bead (blue) rests just below E633 (red). We do not think an additional plot with the distance distributions between lipids and these residues will add to our understanding of how lipids interact with residues in the TMEM16A pore.

      We made a similar qualitative observation for the interaction between the POPC choline to E843 and POPC phosphate to K850 while watching the AAMD simulation trajectory of TMEM16F (PDB ID 6QP6). Given that this was a single observation, and the same interactions does not appear in CG simulation of the same structure (see simulation snapshots in Figure 4-figure supplement 5) we do not think additional analysis would add significantly to our understanding of which residues may stabilize lipids in the dimer interface.

      (15) Lines 450-451: 'As the groove opens, water is exposed to the membrane core and lipid headgroups insert themselves into the water-filled groove to bridge the leaflets.' Is this a qualitative observation? Could the authors report the correlation between groove dilation and the number of water permeation events?

      Yes, this is qualitative, and it sketches the order of events during scrambling, and we revised the main text starting at line 450 to indicate this. As illustrated by the density isosurfaces in Appendix 1-Figure 2A, the amount of water found in the closed versus open grooves is striking – there is a significant flood of water that connects the upper and lower solutions upon groove opening. Moreover, Appendix 1-Figure 2B shows much greater water permeation for open structures (4WIS, 7RXG, 5OC9, 8B8J, …) compared to closed structures (6QMB, 6QMA, 8B8Q, and many of the non-labeled data in the figure that all have closed grooves and near 0 water permeation). A notable exception is TMEM16A (7ZK3*8), which has water permeation but a closed groove and little-to-no lipid scrambling.

      Minor Comments:

      (1) Inconsistent use of '10' and 'ten' throughout.

      We would like to kindly point out that we do not find examples of inconsistent use.

      (2) Line 32: 'TM6 along with 3, 4 and 5...' should be 'TM6 along with TM3, TM4 and TM5...'. Same in line 142. Naming should stay consistent.

      Changes are reflected in the updated manuscript.

      (3) Line 141: do you mean traverse (i.e. to travel across)? Or transverse (i.e. to extend across the membrane)?

      This is a typo. We meant “traverse”. Thanks for pointing it out.

      (4) Line 142: 'greasy' should be 'strongly hydrophobic'.

      Changes are reflected in the updated manuscript.

      (5) Line 143-144: "credit card mechanism" requires quotation marks.

      Changes are reflected in the updated manuscript.

      (6) Line 144: state if Nectria haematococca is mammalian or fungal, this is not obvious for all readers.

      Changes are reflected in the updated manuscript.

      (7) Line 147-148: Is TMEM16A/TMEM16K fungal or mammalian? What was the residue before the mutation and which residue is mutated? Perhaps the nomenclature should read as TMEM16X10Y where X=the residue prior to the mutation, 10 is a placeholder for the residue number that is mutated and Y=the new residue following mutation.

      “TMEM16” is the protein family. “A” denotes the specific homolog rather than residue.  

      (8) Lines 157-158: same as 10, it is unclear if these are fungal or mammalian.

      Clarifications added.

      (9) Line 184: "...CGMD simulation" should be "...CGMD simulations".

      Changes made.

      (10) Line 191-192: It would help to create a table of all of the mutants (including if they are mammalian or fungal) summarizing the salt concentrations, lipid and detergent environments, the presence of modulators/activators, etc.

      We added this information to Appendix 1-Table 1 in the supplemental information. We did not specify NaCl concentrations, because all experimental procedures used standard physiological values (100-150 mM).

      (11) Line 210: inconsistencies with 'CG' and 'coarse-grain'.

      Changes made.

      (12) Figure 1 caption: '...totaling ~2μs (B)...' is missing the full stop after 2μs.

      Changes made.

      (13) Figure 1B: it may be useful to label where the Ca2+ ion binds or include a schematic.

      We updated Fig. 1A to illustrate where Ca2+ binds.

      (14) Line 311: Are these mean distances? The authors should add standard deviations.

      Yes, they are. We added the standard deviations to the text.

      (15) Line 321-322: Perhaps a schematic in Figure 2 would be useful to visualize the structural features described here.

      We would kindly refer interested readers to reference [60].

      (16) Line 377: '...are likely a correlate of groove opening...' should read as: '...are likely correlated to groove opening...'.

      Thank you for pointing it out. Changes made.

      (17) Line 398: the '...empirically determined 6Å threshold for scrambling.' Was this determined from the simulations or from experiments? What does "empirically" mean here? Please state this.

      This value was determined from the simulations. Based on our analysis of the correlation between scrambling rate and groove dilation, we found that the minimal TM4/6 distance of 6 Å can distinguish between the high and low activity scramblers. The exact numerical value is somewhat arbitrary as there is a range of values around 6 Å that serve to distinguish scramblers from non-scramblers.
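
      The threshold logic described here can be illustrated with a minimal sketch. All distance values below are hypothetical placeholders, not data from the manuscript, and the function names are invented for illustration. The sketch assumes that a wider groove (larger minimal TM4/6 distance) goes with higher scrambling activity, and it shows why any cut-off inside the gap between the two groups separates them equally well.

```python
def classify_by_groove_width(min_tm4_tm6_distance, threshold=6.0):
    """Label a structure by its minimal TM4/6 distance in angstroms.

    The 6.0 default mirrors the threshold quoted in the response; the
    inclusive comparison (>=) is an assumption.
    """
    if min_tm4_tm6_distance >= threshold:
        return "high activity"
    return "low activity"


def separating_range(low_activity, high_activity):
    """Return the (lower, upper) bounds of thresholds that split both groups.

    Any threshold strictly above the widest low-activity groove and at or
    below the narrowest high-activity groove separates the groups.
    Returns None when the two groups overlap.
    """
    upper_low = max(low_activity)
    lower_high = min(high_activity)
    if upper_low < lower_high:
        return (upper_low, lower_high)
    return None


# Hypothetical minimal TM4/6 distances (angstroms), for illustration only.
low_activity = [3.1, 4.0, 5.2]
high_activity = [6.8, 7.5, 9.0]

bounds = separating_range(low_activity, high_activity)
print("separating range:", bounds)
for d in low_activity + high_activity:
    print(d, classify_by_groove_width(d))
```

      In this toy data set every threshold between 5.2 and 6.8 Å gives the same classification, which is the sense in which the exact value of 6 Å is somewhat arbitrary.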

      (18) Figure 4: This figure should be labelled as A, B, C and D, with the figure caption updated accordingly.

      We updated Figure 4 and its caption.

      Reviewer #3 (Recommendations for Authors):

      The authors must do additional simulations to further validate their claim with different lipids and further substantiate dimer interface independent of Ca2+ ions.

      Thank you for the suggestion. We completely agree that studying scrambling in the context of a diverse lipid environment is an exciting area to explore. We are indeed actively working on a project that shares a similar idea. We decided not to include that study because we think the additional discussion involved would be excessive for the current manuscript. We do, however, look forward to publishing our findings in a separate manuscript in the near future. In terms of Ca2+-independent scrambling, we are planning mutagenesis studies with our experimental collaborator that target the residues we identified along the dimer interface.

      Since calcium ions are critical for the stability of these structures, authors should show that they were placed throughout the simulations consistently.

      As stated in the Methods section “Coarse-grained system preparation and simulation detail”, all Ca2+ ions are manually placed into the coarse-grained structure from the beginning of the simulation at their corresponding positions in the experimental structure and harmonically bonded to adjacent acidic residues throughout the duration of the simulation. We have also added a label to Fig 1A to indicate where the two Ca2+ ions are located.

      The comparison with experimental structures should be consistent with complete simulation, and not the last structure of the trajectory. Depending on the conformational variability, this might be misleading.

      We agree and updated Fig. 1-supplement figure 1 accordingly. The overall agreement between membrane shapes in CGMD and cryo-EM was not affected by this change.

    1. Reviewer #1 (Public review):

      Summary:

      Meteorin proteins were initially described as secreted neurotrophic factors. In this manuscript, Eggeler et al. demonstrate a novel role for Meteorins in establishing left-right axis formation in the zebrafish embryo. The authors generated null mutations in each of the three zebrafish meteorin genes - metrn, metrnla, and metrnlab. Triple mutant embryos displayed phenotypes strongly associated with left-right defects such as heart looping and visceral organ placement, and disrupted expression of Nodal-responsive genes, as did single mutants for metrn and metrnla. The authors then go on to demonstrate that these defects in left-right asymmetry are likely due to defects in Kupffer's Vesicle and the progenitor dorsal forerunner cells, including impaired lumen formation and reduced fluid flow, reduced clustering among DFCs, impaired DFC migration, mislocalization of apical proteins ZO-1 and aPKC, and detachment of DFCs from the EVL. Notably, the authors found that expression of marker genes sox32 and sox17 was not affected, suggesting Meteorins are required for DFC/KV morphogenesis but not necessarily fate specification. Finally, the authors show genetic interaction between Meteorins and integrin receptors, which were previously implicated in left-right patterning. In a supplemental figure, the manuscript also presents data showing expression of meteorin genes around the chick Hensen's node, suggesting that the left-right patterning functions may be conserved among vertebrates.

      Strengths:

      Strengths of this study include the generation of a triple mutant line that targets all known zebrafish meteorin family members. The experiments presented in this study were rigorous, especially with respect to quantification and statistical analysis.

      Weaknesses:

      Although the authors convincingly demonstrate a role for Meteorins in zebrafish left-right patterning, data supporting a conserved role in other vertebrates is compelling but limited to one supplemental figure. This aspect would be interesting to follow up in future studies.

      Comments on revisions:

      I thank the authors for their thoughtful responses to the reviewers. They have adequately addressed all of my concerns.

    2. Author response:

      The following is the authors’ response to the original reviews

      Public Review:

      Reviewer #1 (Public review):

      Summary:

      Meteorin proteins were initially described as secreted neurotrophic factors. In this manuscript, Eggeler et al. demonstrate a novel role for Meteorins in establishing left-right axis formation in the zebrafish embryo. The authors generated null mutations in each of the three zebrafish meteorin genes - metrn, metrnla, and metrnlab. Triple mutant embryos displayed phenotypes strongly associated with left-right defects such as heart looping and visceral organ placement, and disrupted expression of Nodal-responsive genes, as did single mutants for metrn and metrnla. The authors then go on to demonstrate that these defects in left-right asymmetry are likely due to defects in Kupffer's Vesicle and the progenitor dorsal forerunner cells, including impaired lumen formation and reduced fluid flow, reduced clustering among DFCs, impaired DFC migration, mislocalization of apical proteins ZO-1 and aPKC, and detachment of DFCs from the EVL. Notably, the authors found that expression of marker genes sox32 and sox17 was not affected, suggesting Meteorins are required for DFC/KV morphogenesis but not necessarily fate specification. Finally, the authors show genetic interaction between Meteorins and integrin receptors, which were previously implicated in left-right patterning. In a supplemental figure, the manuscript also presents data showing expression of meteorin genes around the chick Hensen's node, suggesting that the left-right patterning functions may be conserved among vertebrates.

      Strengths:

      Strengths of this study include the generation of a triple mutant line that targets all known zebrafish meteorin family members. The experiments presented in this study were rigorous, especially with respect to quantification and statistical analysis.

      Weaknesses:

      Although the authors convincingly demonstrate a role for Meteorins in zebrafish left-right patterning, data supporting a conserved role in other vertebrates is compelling but limited to one supplemental figure.

      We thank the reviewer for their thoughtful summary of our study and for highlighting the strengths of our work, including the generation of the triple mutant line and the rigor of our experimental design and quantitative analyses. We also appreciate the constructive feedback regarding the limited functional data supporting the conservation of Meteorin function in other vertebrates. We agree that this is an important aspect that could be further explored. While functional studies in additional species are beyond the current scope, we will consider such experiments in future work.

      We would like to highlight the phylogenetic analysis of Meteorin proteins we have already performed and included in the manuscript (Fig. S7D), which illustrates the evolutionary conservation of this protein family and supports the possibility of a conserved role in left-right patterning.

      Additionally, we have expanded the methods and discussion to include: (1) details on zebrafish viability in contrast to reported embryonic lethality in metrn mutant mice, (2) the background strains used in our study, (3) observed variability in DFC number and potential batch effects and (4) clarification of our 'convergence ratio' quantification approach.

      Reviewer #2 (Public review):

      Summary:

      In this manuscript, the authors describe their study on the role of meteorins in establishing the left-right organizer. The left-right organizer is a transient organ in vertebrate embryos in which rotating cilia cause a fluid flow that breaks the left-right symmetry and coordinates lateralization of internal organs such as gut and heart. In zebrafish, the left-right organizer (also named Kupffer's vesicle) is formed by dorsal forerunner cells, but very little is known about how dorsal forerunner cells coalesce and form this ciliated vesicle in the embryo. The authors mutated the three meteorin-coding genes in zebrafish and observed that mutations in each one of these cause laterality defects, with the strongest defects observed in the triple mutant. Loss of meteorins affects the expression of nodal genes, which play essential roles in establishing organ laterality. Meteorins are widely expressed in developing embryos, and expression in lateral plate mesoderm and dorsal forerunner cells was observed. The meteorin triple mutant embryos display defects in the migration and clustering of the dorsal forerunner cells, impairing Kupffer's vesicle formation and cilia rotation. Finally, the authors show that meteorins genetically interact with integrins.

      Strengths:

      - These authors went through the lengthy process of generating triple mutants affecting all three meteorin genes. This provides robust genetic evidence on the role of meteorins in establishing organ laterality and avoids the difficulty of interpreting results that are confounded by redundant functions of meteorins.

      - The use of live imaging on triple mutants is appreciated.

      - High-quality imaging of dorsal forerunner cells to quantify cell migration and its relation to Kupffer's vesicle formation.

      Weaknesses:

      - Lack of a model for how meteorins regulate dorsal forerunner cell migration.

      - Only genetic data to suggest a link between meteorins and integrins

      - Besides its role in DFC migration, meteorins may also play a more direct role in regulating Nodal signaling, which is not addressed here.

      We appreciate the recognition of the strengths of our study, particularly the generation of the triple meteorin mutants and the use of high-resolution imaging to quantify DFC behavior and Kupffer’s vesicle formation—both of which were central to providing robust evidence for Meteorins' role in left-right patterning.

      We also value the reviewer’s comments on areas that need further exploration, including the need for a mechanistic model explaining how Meteorins regulate DFC migration, the genetic interaction with integrins, and the potential direct involvement of Meteorins in Nodal signaling.

      We agree that deeper mechanistic insights would strengthen the study. While our findings suggest that Meteorins influence DFC migration and clustering through integrin pathways, a detailed mechanistic dissection, particularly regarding the yet unidentified Meteorin receptor, lies beyond the current scope. However, we consider this a key aspect for future research and have discussed it further in the revised discussion section.

      In response to the reviewer’s suggestions, we have expanded the discussion to address the limitations of the current data linking Meteorins and integrins, including relevant citations to studies that implicate integrins in similar contexts. Additionally, we have added a more detailed discussion of the potential for Meteorins to directly influence Nodal signaling, and we cite a relevant study to support this possibility.

      Once again, we thank the reviewer for their insightful and constructive comments. These points raise important directions for future investigation that will further advance our understanding of Meteorin function in left-right axis formation.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      In the Results section (p. 9), the authors state, "...a reduced ZO-1 enrichment at the apical junctions of triplMUT GFP-positive DFCs could be detected." However, in Fig. 4F-G, the areas of ZO-1 enrichment indicated by arrowheads appear quite far from the DFCs themselves, making it unclear if these ZO-1-enriched areas are apical DFC junctions (as stated in the text) or instead are part of the EVL. Is it possible to include an additional cell membrane marker or other landmarks? In addition, the differences in ZO-1 accumulation between mutants and WT appear relatively modest. Is it possible to provide quantification of this effect?

      We appreciate the reviewer’s request for additional stainings and further clarification, and we would like to highlight that the requested quantifications of ZO-1 accumulation, including statistical analysis, are already provided in Fig. S5E.

      In mouse, loss of Meteorin is embryonic lethal yet the zebrafish triple mutants are viable. Could the authors discuss this discrepancy?

      We have expanded the discussion to address this point, suggesting that species-specific differences in compensatory mechanisms may explain the observed differences in viability. We would like to reiterate that while one study has reported embryonic lethality in metrn mutant mice, this specific mouse line has not been further investigated in any recent publications. Additionally, in collaboration with the lab of Alain Chédotal, we generated independent metrn and metrnl mutant mouse lines, which did not exhibit the phenotype described in the previously mentioned study.

      It has been reported that TL and AB strains exhibit variable numbers of DFCs and thus laterality defects (Moreno-Ayala et al., 2021, Cell Reports 34(2):108606). Would it be possible for the authors to report background strains used in this study and those used to generate the meteorin knock-outs?

      We appreciate the comment highlighting the importance of specifying the background strains used in our study. We have now included this information in the methods section, detailing the zebrafish strains utilized throughout our experiments.

      For statistical analysis, would it be possible for the authors to report the number of clutches examined to control for batch effects (especially given the wide variability in DFC numbers as noted above)?

      For further clarification, we have now included an additional explanation of the number of clutches in the methods section.

      In the Methods section (p. 19), the description of how the convergence ratio was computed was somewhat unclear. Could the authors provide a citation or include a diagram/schematic?

      We have revised the Methods section to provide a clearer definition of the convergence ratio and have included a schematic (Fig. 4D) to illustrate how it was calculated.

      Reviewer #2 (Recommendations for the authors):

      - Meteorins are widely expressed in the embryo. Can the authors comment on whether meteorin expression is required in the dorsal forerunner cells (DFCs) or in other cells? This could be addressed by knockdown experiments in DFCs as described by others (PMID: 15716348)

      We thank the reviewer for this important comment. In our study, we have shown that Meteorins are not required for the identity of DFCs, as several DFC-specific markers remain expressed in the respective cells within the meteorin mutant background (see Fig. S4).

      - In fig1d and 1e the authors use heterotaxy to describe visceral organ placement. The embryo shown in 1d seems to display situs inversus instead of heterotaxy, which is defined as discordance in organ position. The authors should clarify this.

      We agree with the reviewer and have revised the figures and figure legends to clarify the distinction between situs inversus and heterotaxy.

      - In Fig2 the authors show that nodal pathway genes are reduced, suggesting reduced Nodal signaling. How do they explain this, given that loss of cilia rotation generally leads to randomization of Nodal signaling but not a reduction in signaling?

      Following this suggestion, we have now added a further discussion on the possibility that Meteorins could directly regulate Nodal signaling in addition to their role in DFC migration and have cited a relevant study.

      - Reduced Nodal signaling in the LPM leads to organ laterality defects. Most anterior tissues like the heart are more sensitive to perturbation in Nodal signaling in the LPM compared to more posterior organs like gut (see also PMID: 25684355). Since in triple mutants the position of the heart is more affected than the position of the visceral organs this suggests that meteorins play an additional role in Nodal signaling in the LPM. As others have shown that meteorins regulate nodal activity (PMID: 24558432), the authors should address this further.

      As described above, we have now added a further discussion on the possibility that Meteorins could directly regulate Nodal signaling in addition to their role in DFC migration and have cited a relevant study. Further investigation into a possible direct role of Meteorins in Nodal signaling will be pursued in future work.

      - The term 'convergence ratio' is not clearly described and confusing as convergence is also used for the movement of LPM cells towards the midline.

      As noted in response to Reviewer #1, we have revised the Methods section and included a schematic in Fig. 4D to better explain this parameter.

      We are grateful for the thoughtful critiques from both reviewers, which have been very constructive and improved the clarity of our study. We believe that the revisions we have made address the concerns raised, and we look forward to your evaluation of our revised manuscript.

    3. eLife Assessment

      This study presents important insights into the regulation of left-right organ formation. By combining genetic perturbation of all three Meteorin genes in zebrafish and timelapse imaging, the authors identify an essential role for this protein family in the establishment of left-right patterning. They provide convincing evidence that Meteorins are required for the morphogenesis of dorsal forerunner cells, the precursors of the left-right organizer (also named Kupffer's vesicle) in zebrafish. In line with this, Meteorins were shown to genetically interact with integrins ItgaV and Itgb1b to regulate dorsal forerunner cell clustering.

    1. eLife Assessment

      The study provides a valuable analysis of escape from X-inactivation based on three rare female GTEx donors with non-mosaic X-inactivation. The methods and analyses are solid and broadly support the authors' claims. Their data are more comprehensive than those presented previously and add significant weight to evidence for which genes are inactivated or escape from X inactivation in humans.

    2. Reviewer #1 (Public review):

      Summary:

      This manuscript investigates genes that escape X-Chromosome Inactivation (XCI) across human tissues, using females that exhibit skewed or non-random XCI. The authors identified 2 female individuals with skewed XCI in the GTEx database, in addition to the 1 female skewed sample in this database that has been described in a previous publication (Ref.16). The authors also determined the XCI escape status of 380 X-linked genes across 30 different tissues.

      Strengths:

      The novelty of this manuscript is that the authors have identified the XCI expression status for a total of 380 genes across 30 different human tissues, and also discovered the XCI status (escape, variable escape, or silenced) for 198 X-linked genes, whose status was previously not determined. This report is a good resource for the field of XCI, and would benefit from additional analyses and clarification of their comparisons of XCI status.

    3. Reviewer #2 (Public review):

      Summary:

      Gylemo et al. present a manuscript focused on identifying the X-inactivation or X-inactivation escape status for 380 genes across 30 normal human tissues. X-inactivation status of X-linked genes across tissues is important for understanding sex-specific differences in X-linked gene expression and therefore traits, and the likely effect of X-linked pathogenic variants in females. These new data are significant as they double the number of genes that have been classified in the human, and double the number of tissues studied previously.

      Strengths:

      The strengths of this work are that they analyse 3 individuals from the GTEx dataset (2 newly identified, 1 previously identified and published) that have highly/completely skewed X inactivation, which allows the study of escape from X inactivation in bulk RNA-sequencing. The number of individuals and breadth of tissues analysed adds significantly to both the number of genes that have been classified and the weight of evidence for their claims. The additional 198 genes that have been classified and the reclassification of genes that previously had only limited support for their status is useful for the field.

      In analysing the data they find that tissue-specific escape from X inactivation appears relatively rare. Rather, if genes escape, even variably, it tends to occur across tissues. Similarly if a gene is inactivated, it is stable across tissues.

      Comments on revised version:

      The authors have answered all of my queries. While they have not been able to pinpoint the genetic cause of the highly skewed XCI cases in their cohort, I agree this is beyond the scope of this study. I have no further requests.

    4. Reviewer #3 (Public review):

      Summary:

      Nestor and colleagues identify genes escaping X chromosome inactivation (XCI) in rare individuals with non-mosaic XCI (nmXCI) whose tissue-specific RNA-seq datasets were obtained from the GTEx database. Because XCI is non-mosaic, read counts representing a second allele are tested for statistically significant escape, in this case > 2.5% of active X expression. Whereas a prior GTEx analysis found only one nmXCI female, this study finds two additional donors in GTEx, therefore expanding the number of assessed X-linked genes to 380. Although this is fewer than half of X-linked genes, the study demonstrates that, although rare, nmXCI females are represented in RNA-seq databases such as GTEx. Therefore, this analytical approach is worth pursuing in other (larger) databases as well, to provide deeper insight into escape from XCI, which is relevant to X-linked diseases and sex differences.
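
      The idea of testing second-allele read counts against a 2.5% threshold can be sketched as follows. This summary does not state which statistical test the authors used, so the one-sided exact binomial test below is an assumption chosen for illustration, as are the function names, the significance level of 0.05, and the simplification that the threshold applies to the fraction of second-allele reads among all reads at a heterozygous site. The read counts are hypothetical.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def escapes_xci(second_allele_reads, total_reads, threshold=0.025, alpha=0.05):
    """Test whether the second-allele fraction significantly exceeds threshold.

    Null hypothesis: the true second-allele fraction is at most `threshold`.
    Returns (call, p_value), where call is True when the null is rejected.
    The test choice and alpha are illustrative assumptions.
    """
    if total_reads <= 0 or not 0 <= second_allele_reads <= total_reads:
        raise ValueError("read counts are inconsistent")
    p_value = binomial_upper_tail(second_allele_reads, total_reads, threshold)
    return p_value < alpha, p_value


# Hypothetical read counts at two heterozygous sites.
for reads, depth in [(30, 200), (3, 200)]:
    call, p = escapes_xci(reads, depth)
    print(f"{reads}/{depth} reads: escape={call}, p={p:.3g}")
```

      Under these assumptions, a site with 30 of 200 reads from the second allele is called as escaping, whereas 3 of 200 reads is compatible with a fraction at or below 2.5% and is not.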

      Strengths:

      The analysis is well-documented, straightforward and valuable. The supplementary tables are useful, and the claims in the main text are well-supported.

      Weaknesses:

      There are very few, except that this escape catalogue is limited to 3 donors, based on a single (representative) tissue screen in 285 female donors, mostly using muscle samples. However, if only pituitary samples had been screened, nmXCI-1 would have been missed. Additional donors in the 285 representative samples cross a lower threshold of AE = 0.4. It would be worthwhile to query all tissues of the 285 donors to discover more nmXCI cases, as currently fewer than half of X-linked genes received a call using this very worthwhile approach.

      Comments on revised version:

      The authors incorporated some textual changes, but deferred any new analysis, or expansion from these two new skewed donors to include more individuals/tissues, or going more in depth for individual genes to future manuscripts. They appear to have that option at eLife.

    5. Reviewer #4 (Public review):

      Summary:

      This study by Gylemo et al. investigates genes that escape X-Chromosome Inactivation (XCI) by analyzing RNA-sequencing data from three female individuals with highly skewed XCI identified in the GTEx database, two newly reported and one previously described. Utilizing these rare non-mosaic XCI cases, the authors assess allelic expression patterns across 30 normal human tissues to classify the XCI status of 380 X-linked genes, including 198 not previously annotated. The study provides a broader and more comprehensive catalog of XCI escape, contributing valuable insights into sex-specific gene expression and the potential implications of X-linked variants in disease.

      Strengths:

      The primary strength of this work lies in its expanded scope: it doubles the number of tissues and significantly increases the number of X-linked genes with known XCI status compared to previous studies. By focusing on rare individuals with non-random XCI, the authors provide a unique opportunity to observe allelic expression and classify escape status with more confidence. Their findings that escape from XCI is relatively consistent across tissues (rather than tissue-specific) enhance the understanding of XCI mechanisms. The methodology is robust, the data are well-documented, and the supplementary resources are comprehensive. This study thus represents a valuable resource for the XCI field and a promising basis for future investigations.

      Weaknesses:

      Despite its strengths, the study is limited by its reliance on only three individuals, which restricts statistical power and generalizability. Concerns were raised regarding the comparability of XCI status across tissue types and cell lines, particularly given that previous classifications may have used cancer or immortalized cells. Additionally, more could be done to explore the genetic basis behind the observed skewed XCI, which might affect the conclusions about escape patterns. Finally, the authors are encouraged to expand their approach to additional RNA-seq datasets or single-cell analyses to validate their findings and potentially discover more individuals with skewed XCI, which would deepen understanding of this important biological phenomenon.

    6. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer 1 (Public review):

      (1) The authors state that they have reclassified the allelic expression status of 32 genes (shown in Table S5, Supplementary Figure 3). The concern is the source of the tissue or cell line which was originally used to make the classification of XCI status, and whether the comparisons are equivalent. For example, if cell lines (and not tissues) were used to define the XCI status for EGFL6, TSPAN6, and CXorf38, then how can the authors be sure that the escape status in whole tissues would be the same? Also, along these lines, the authors should consider whether escape status in previous studies using immortalized/cancer cell lines (such as the meta-analyses done in Balaton publication) would be different compared to healthy tissues (seems like it should be). Therefore, making comparisons between healthy whole tissues and cancer cell lines doesn't make sense.

      Indeed, many previous classifications were based on clonal cell lines, which could result in atypical patterns of escape due to the profound and varied effects of adaptation to culture. However, one of the primary goals of our study was to directly determine allele-specific expression from the X-chromosome in healthy primary tissues, in part to exclude the potential confounding effects of cell culture. 

      Whereas we do perform comparisons with cell culture-based classifications, we also provide detailed comparisons with the previous classification of Tukiainen et al, which also uses primary human tissues. In addition, whereas the comparison with Balaton et al is not optimal, we hold that it is valuable as it reveals which genes may exhibit aberrant escape patterns in culture. Finally, despite the above reservations, our comparison revealed an over-whelming agreement with previous research which suggests that in the vast majority of cases, escape appears to be correctly maintained in culture. 

      (2) The authors note that skewed XCI is prevalent in the human population, and cite some publications (references 8, 10-12). If RNAseq data is available from these female individuals with skewed XCI (such as ref 12), the authors should consider using their allelic expression pipeline to identify XCI status of more X-linked genes.

      Indeed, we completely agree and are in the process of obtaining these data, which has proven complex and time-consuming in the current regulatory environment.

      (3) It has been well established that the human inactive X has more XCI escape genes compared to the mouse inactive X. In light of the authors' observations across human tissues, how does the XCI status compare with the same tissues in mice?

      This is a very interesting point, and a comparison we are currently working on. However, this is a major undertaking and one that is outside the scope of this study. We do appreciate the differences between mice and humans at the X-chromosome level and could only speculate that the overlap is relatively small, as the number of escapees in mice has been shown to be far lower than in humans.

      Reviewer 2 (Public review):

      In my view there are only minor weaknesses in this work, that tend to come about due to the requirement to study individuals with highly skewed X inactivation. I wonder whether the cause of the highly skewed X inactivation may somehow influence the likelihood of observing tissue-specific escape from X inactivation. In this light, it would be interesting to further understand the genetic cause for the highly skewed X inactivation in each of these three cases in the whole exome sequencing data. Future additional studies may validate these findings using single-cell approaches in unrelated individuals across tissues, where there is normal X inactivation.

      We thank the reviewer for their positive assessment of our work. This is a point we have and continue to grapple with. We cannot rule out that the genetic cause of complete skewing may influence tissue-specific XCI.  Moreover, the genetic cause for the non-mosaic XCI is currently unclear and is likely to vary between individuals, which could also result in inter-individual variation in tissue-specific escape. We are currently performing large prospective studies in the tissues of healthy females to specifically address this point.

      Reviewer 3 (Public review):

      There are very few, except that this escape catalogue is limited to 3 donors, based on a single (representative) tissue screen in 285 female donors, mostly using muscle samples. However, if only pituitary samples had been screened, nmXCI-1 would have been missed. Additional donors in the 285 representative samples cross a lower threshold of AE = 0.4. It would be worthwhile to query all tissues of the 285 donors to discover more nmXCI cases, as currently fewer than half of X-linked genes received a call using this very worthwhile approach.

      We thank the reviewer for their positive assessment of our work. Of course, we agree that a tissue-wide screen in all individuals would have been optimal and is a line of research we are currently pursuing. However, the analysis of allele-specific expression in all 5,000 RNA-seq samples is a massive undertaking and was simply not practicable within the time-scale of this study. 

      Recommendations for the authors:

      Reviewer #2 (Recommendations for the authors):

      Thanks to the authors for an interesting manuscript! I enjoyed reading it and the care that has gone into explaining the analyses and the findings. There are a few recommendations that I have for strengthening the work.

      We thank the reviewer for the nice feedback. Much appreciated.

      (1) I would like to see a genetic analysis of the three individuals, to try and identify the genetic causes of the skewed X inactivation beyond just considering the XIC or translocations. The cause of the highly skewed X inactivation would be of interest to many.

      This is certainly a very interesting avenue of research and one that we are currently focusing on. However, in the current study we simply had too few skewed XCI females to assess this  in an exhaustive manner. To tackle this issue, we have begun a prospective study of healthy females to identify additional non-mosaic females.

      (2) I wonder whether the cause of the skewed XCI may somehow influence the assessment of tissue-specific escape? If there is a problem with X inactivation itself, perhaps escape would also be different, making it appear more constitutive than tissue-specific?

      This is a point we have and continue to grapple with. We cannot rule out that the genetic cause of complete skewing may influence tissue-specific XCI.  Moreover, the genetic cause for the non-mosaic XCI is currently unclear and is likely to vary between individuals, which could result in inter-individual variation in tissue-specific escape.

      (3) Presentation/wording suggestions:

      I think the abstract is likely a bit inaccessible to those outside the field. I am in the X inactivation field, but don't use the term non-mosaic X inactivation, but rather would call it highly skewed, or non-random X inactivation. In my view, it would be simpler for the abstract to call non-mosaic XCI highly skewed XCI instead, or to use more words to ensure it is clear for the reader.

      We agree that the terminology of completely skewed/non-mosaic XCI could be more clearly defined in the abstract and have clarified this. “Using females that are non-mosaic (completely skewed) for X-inactivation (nmXCI) has proven a powerful and natural genetic system for profiling X-inactivation in humans.”

      I would consider calling the always escape genes constitutive escapees, while the variable may be facultative.

      This is something we have also considered and have received differing feedback on. However, we will definitely keep this in mind for future publications.

      Line 132, it would be useful to explain median >0.475 as less than 2.5% of reads coming from the inactive allele here, not just in the methods. Can you also explain why this cutoff was chosen?

      We thank the reviewer for this clarification. A clarification has been added to the main text as suggested.

      The cutoff was applied to account for potential variations in skewing, given that we screened only a single tissue sample per individual. Although nmXCI females are theoretically expected to have 0% of reads originating from the 'inactive' allele, this is not always observed due to (a) technical errors such as PCR or sequencing inaccuracies, or (b) differences in skewing between tissue types.
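      The relationship between the screening cutoff and the fraction of reads from the inactive allele can be made concrete with a minimal sketch. It assumes allelic expression is defined as AE = |0.5 − reference reads / total reads|, which is consistent with the statement that a median above 0.475 corresponds to less than 2.5% of reads from the inactive allele; the function names and the example read counts are hypothetical and are not taken from the authors' pipeline.

```python
from statistics import median

def allelic_expression(ref_reads: int, alt_reads: int) -> float:
    """AE = |0.5 - ref/(ref+alt)|; 0 is balanced, 0.5 is fully mono-allelic (assumed definition)."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this SNP")
    return abs(0.5 - ref_reads / total)

def inactive_fraction(ae: float) -> float:
    """Fraction of reads from the less-expressed allele implied by a given AE."""
    return 0.5 - ae

def passes_screen(snp_counts, threshold: float = 0.475) -> bool:
    """True if the median AE over heterozygous SNPs exceeds the screening threshold."""
    return median(allelic_expression(r, a) for r, a in snp_counts) > threshold

# Hypothetical (ref, alt) read counts at three heterozygous X-linked SNPs.
skewed = [(100, 0), (99, 1), (0, 50)]
mosaic = [(60, 40), (55, 45), (70, 30)]
print(passes_screen(skewed), passes_screen(mosaic))
```

      Under this definition, an AE of 0.475 leaves 0.5 − 0.475 = 0.025 of reads for the other allele, which is the 2.5% figure quoted in the comment.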

      Lines 156-160 describe how the heterozygous SNPs were identified in relation to Figure 2. I read these in the methods so that I could understand Figure 1, so I suggest moving this section up.

      We have moved the section as suggested by the reviewer.

      Line 156, consider adding in a sentence to describe what is shown in Figures 2A and B i.e, the overlap of SNPs and spread along the X.

      We have added a sentence describing what is shown in Figures 2A and 2B as suggested by the reviewer.

      Line 217, it would be useful to give the % of genes that show tissue-specific escape, to quantify rare.

      We have added a sentence quantifying ‘rare’ at the suggested line.

      (4) Typos:

      Line 119, missing 'the most' before extensive (and remove an).

      We thank the reviewer for pointing this out. This error has been corrected.

      Reviewer #3 (Recommendations for the authors):

      Some results in the supplementary figures were quite striking. What is going on with DDX3X and ZRSR2? How come total read counts are so different between individuals?

      Indeed, this is a very intriguing observation and one that we have simply failed to understand thus far. We are currently performing a large prospective study to obtain a greater number of non-mosaic females and tissue samples. Hopefully, additional observations across females will allow us to gain further insights into the inter-individual behaviour of DDX3X and ZRSR2.

      One item I would like to see added is some analysis to address the cause of these extremely skewed XCI individuals. The copy number analysis suggests there are some segmental deletions on the X in all three nmXCI cases. Where are these deletions, and do any fall in the region of the X-inactivation centre? Have the authors performed any analysis of potentially deleterious X-linked variants in the WGS or WES data? Why are these donors so skewed? It's interesting that UPIC was still more skewed than the other two.

      The features the reviewer points out are not segmental deletions: the same variation in coverage is found in all females we have examined, including females with mosaic XCI (see Author response image 1 below, where the same pattern of slightly lower read counts is observed at the same sites in all female samples). No deletions were identified in the XIC region. No analysis was performed of deleterious X-linked variants. Why the donors are so skewed is unknown and intriguing. Indeed, identifying the origin of extreme skewing (including in the females in this study) is now the main focus of the group. Whereas UPIC had trisomy 17, which has likely resulted in the observed skewing, we have not yet found a genetic variant that could explain the skewing observed in 13PLJ or ZZPU.

      Author response image 1.

      Copy number as log2 ratio using 500kb bins across the X-chromosome for 3 mosaic XCI females (1QPFJ, OXRO, and RU1J) and 3 nmXCI females, UPIC, nmXCI-1 and nmXCI-2.
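      As an illustration of what a log2 copy-number ratio in 500 kb bins means, the following is a minimal sketch and not the authors' method. It assumes read start positions on the X chromosome are available as plain integers for a sample and for a reference, and that each bin is normalised by library size; all names are hypothetical.

```python
import math
from collections import Counter

BIN_SIZE = 500_000  # 500 kb bins, as in the image caption

def bin_counts(positions, bin_size=BIN_SIZE):
    """Count reads per fixed-width bin, keyed by bin index."""
    return Counter(pos // bin_size for pos in positions)

def log2_ratios(sample_positions, reference_positions, bin_size=BIN_SIZE):
    """Per-bin log2 of (sample read fraction / reference read fraction).

    Bins with no reference reads are skipped; bins with no sample reads
    are reported as -inf.
    """
    sample = bin_counts(sample_positions, bin_size)
    reference = bin_counts(reference_positions, bin_size)
    s_total, r_total = sum(sample.values()), sum(reference.values())
    if s_total == 0 or r_total == 0:
        raise ValueError("both inputs need at least one read")
    ratios = {}
    for b in sorted(reference):
        if sample[b] == 0:
            ratios[b] = float("-inf")
        else:
            ratios[b] = math.log2((sample[b] / s_total) / (reference[b] / r_total))
    return ratios

# Toy positions: the sample has half the reference's share of reads in bin 0.
reference = [100, 200, 600_000, 700_000]
sample = [100, 600_000, 700_000, 700_001]
print(log2_ratios(sample, reference))
```

      A dip that recurs at the same bins in every sample, mosaic or not, points to a coverage artefact rather than a deletion specific to one donor, which is the argument made in the response above.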

      This is not necessary to address with new analyses, but as alluded to above, the authors could screen more than a single representative tissue. And to apply this analysis to larger databases (UK biobank), which the authors may be planning to do already.

      This is an avenue of research we are currently investigating.

      The code is well-documented and accessible. Additional information on the manual reclassification (to deal with inflated binomial P-values) would be helpful. Why not require a minimal threshold for escape (10% of active X allele) in addition to a significant binomial P (inactive X exp. > 2.5% of active)?

      We thank the reviewer for this positive assessment of the code. 

      Indeed, how to define ‘escape’ is a vexed issue, and one we feel has been given undue weight within the field. In reality, studies of escape are often dealing with sparse data (e.g. read depth), few observations (genes and individuals) and substantial amounts of missing data. Thus, it is unlikely that a standard statistical approach will be sensitive and specific across different studies and data types. Similarly, cut-offs, though useful, would also need to be adjusted to the data type and quality in any given study.

      Whereas we initially used a significant binomial P-value as our sole test (often quoted as ‘best practice’), this resulted in wide-spread inflation of P-values. Thus, we switched to manually curating the allelic expression status of all 380 genes using the empirical guideline of allelic ratio >0.4 (also a commonly used cut-off) as indicating mono-allelic expression. We considered combining the binomial P-value with the cut-off but felt that this would result in an overly complex definition of escape and would unnecessarily exclude many genes from classification, due to the opposing effects of low/high read depth on the binomial and cut-off approaches respectively.
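      The opposing effects of read depth on the two approaches can be sketched as follows. This is an illustrative example under stated assumptions, not the authors' code: it assumes a one-sided exact binomial test against a null inactive-allele fraction of 2.5%, and an allelic expression value defined as |0.5 − active reads / total reads| with 0.4 as the mono-allelic cut-off. The function names and read counts are hypothetical.

```python
import math

def binomial_upper_tail(k: int, n: int, p0: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p0), summed in log space to avoid underflow."""
    log_p, log_q = math.log(p0), math.log1p(-p0)
    total = 0.0
    for i in range(k, n + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                   + i * log_p + (n - i) * log_q)
        total += math.exp(log_pmf)
    return min(1.0, total)

def allelic_expression(active_reads: int, inactive_reads: int) -> float:
    """AE = |0.5 - active/(active+inactive)| (assumed definition)."""
    return abs(0.5 - active_reads / (active_reads + inactive_reads))

def summarise(active_reads: int, inactive_reads: int, p0: float = 0.025, cutoff: float = 0.4):
    """Return (binomial P-value, AE, cut-off based call) for one gene."""
    n = active_reads + inactive_reads
    p_value = binomial_upper_tail(inactive_reads, n, p0)
    ae = allelic_expression(active_reads, inactive_reads)
    call = "mono-allelic" if ae > cutoff else "escape candidate"
    return p_value, ae, call

# Deep coverage: 3.2% inactive-allele reads gives a very small P-value,
# yet AE stays above the 0.4 cut-off.
print(summarise(9680, 320))
# Shallow coverage: the same test has far less power at a comparable fraction.
print(summarise(97, 3))
```

      With deep coverage, a small excess over the null fraction becomes highly significant even though the allelic ratio remains above the cut-off, which is one way the inflation of P-values described above can arise.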

      Indeed, it is because of the difficulty of both accurate and objective ‘classification’ of escape that we placed an emphasis on clearly displaying all data for each gene in each individual, to allow readers to see all the data on which each classification was based.

    1. eLife Assessment

      This important study uncovers the mechanism of inhibition of a membrane pyrophosphatase by non-hydrolyzable phosphonate substrate analogs. Convincing crystallography, EPR spectroscopy, and functional measurements support the presence of a distinct conformational equilibrium of TmPPase in solution, and further support the notion of asymmetric inhibitor binding at the active site, while maintaining a symmetric conformation at the periplasmic interface.

    2. Reviewer #1 (Public review):

      Summary:

      This work examines the binding of several phosphonate compounds to a membrane-bound pyrophosphatase using several different approaches, including crystallography, electron paramagnetic resonance spectroscopy, and functional measurements of ion pumping and pyrophosphatase activity. The work synthesizes these different approaches into a model of inhibition by phosphonates in which the two subunits of the functional dimer interact differently with the phosphonate. This asymmetry in the two subunits of the dimer is consistent with past studies of this system.

      Strengths:

      This study integrates a variety of approaches, including structural biology, spectroscopic measurements of protein dynamics, and functional measurements. Overall, data analysis was thoughtful, with careful analysis of the substrate binding sites (for example, calculation of POLDER omit maps). This study agrees with previous studies that have detected functional asymmetry in the membrane PPase dimer.

    3. Reviewer #3 (Public review):

      Summary:

      Membrane-bound pyrophosphatases (mPPases) are homodimeric proteins that hydrolyze pyrophosphate and pump H+/Na+ across membranes. They are an attractive drug target against protist pathogens. Non-hydrolysable PPi analogue bisphosphonates such as risedronate (RSD) and pamidronate (PMD) serve as primary drugs currently used. Bisphosphonates have a P-C-P bond, whose central carbon can accommodate up to two substituents, allowing a large compound variability. Here the authors solved two TmPPase structures in complex with the bisphosphonates etidronate (ETD) and zoledronate (ZLD) and monitored their conformational ensemble using DEER spectroscopy in solution. These results reveal the inhibition mechanism by these compounds, which is crucial for developing future small-molecule inhibitors.

      Strengths:

      The authors show that seven different bisphosphonates can inhibit TmPPase with IC50 values in the micromolar range. Branched aliphatic and aromatic modifications showed weaker inhibition. High-resolution structures for TmPPase with ETD (3.2 Å) and ZLD (3.3 Å) are determined. These structures reveal the binding mode and shed light on the inhibition mechanism. The nature of modification on the bisphosphonate alters the conformation of the binding pocket. The conformational heterogeneity is further investigated using EPR/DEER spectroscopy under several conditions. Altogether, this provides convincing evidence for a distinct conformational equilibrium of TmPPase in solution and further supports the notion of asymmetric inhibitor binding at the active site, while maintaining a symmetric conformation at the periplasmic interface.

    4. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:  

      Reviewer #1 (Public review):  

      Summary:  

      This work examines the binding of several phosphonate compounds to a membrane-bound pyrophosphatase using several different approaches, including crystallography, electron paramagnetic resonance spectroscopy, and functional measurements of ion pumping and pyrophosphatase activity. The work attempts to synthesize these different approaches into a model of inhibition by phosphonates in which the two subunits of the functional dimer interact differently with the phosphonate.  

      Strengths:  

      This study integrates a variety of approaches, including structural biology, spectroscopic measurements of protein dynamics, and functional measurements. Overall, data analysis was thoughtful, with careful analysis of the substrate binding sites (for example, calculation of POLDER omit maps).  

      Weaknesses:  

      Unfortunately, the protein did not crystallize with the more potent phosphonate inhibitors. Instead, structures were solved with two compounds with weak inhibitory constants >200 micromolar, which limits the molecular insight into compounds that could possibly be developed into small molecule inhibitors. Likewise, the authors choose to focus the spectroscopy experiments on these weaker binders, missing an opportunity to provide insight into the interaction between more potent binders and the protein. 

      We acknowledge the reviewer's concern regarding the choice of weaker inhibitors. We attempted co-crystallization with all available inhibitors, including those with higher potency. However, despite numerous efforts, these potent inhibitors yielded low-resolution crystals, making them unsuitable for detailed structural analysis. Therefore, we chose to focus on the weaker binders, as we were able to obtain high-quality crystal structures for these compounds. This allowed us to perform DEER spectroscopy and monitor conformational TmPPase state ensembles in solution, with the added advantage of accurately analysing the data against structural models derived from X-ray crystallography. Using these weaker inhibitors enabled a more precise interpretation of the DEER data, thus providing reliable insights into the conformational dynamics and inhibition mechanism. As suggested by the reviewer, in the revised version we add new DEER experiments, conditions and analysis on two of the more potent inhibitors (alendronate and pamidronate) to provide additional insight into their interactions. Furthermore, we also added DEER data on the cytoplasmic side of TmPPase, at a new site that we identified (with the advantage of being an endogenous cysteine residue) and spin labelled (C599R1), given that the DEER data for the previous T211R1 cytoplasmic site were difficult to interpret owing to the highly dynamic nature of this region. The new pair C599R1 yielded high-quality DEER traces and indicated, more clearly than T211R1, distance distributions consistent with asymmetry across the sampled conditions. Again, as suggested by the reviewer, alendronate and pamidronate DEER measurements were also recorded for this site (cytoplasmic side; C599R1) as well as the periplasmic side (S525R1).

      In general, the manuscript falls short of providing any major new insight into membrane-bound pyrophosphatases, which are a very well-studied system. Subtle changes in the structures and ensemble distance distributions suggest that the molecular conformations might change a little bit under different conditions, but this isn't a very surprising outcome. It's not clear whether these changes are functionally important, or just part of the normal experimental/protein ensemble variation. 

      We respectfully disagree with the reviewer. The scale of motions particularly seen in solution (and now on a new reliable spin pair (C599R1) located on the cytoplasmic side) correspond to those seen in the full panoply of crystal structures of mPPases. Some proteins undergo very large conformational changes during catalysis – such as the rotary ATPase. This one does not, meaning that the precise motions we describe here are relevant and observed in solution for the first time. Conformational changes in the ensemble, whether large or small, represent essential protein motions which underlie key mPPase catalytic function. These dynamic transitions are extremely challenging to monitor, especially in so many conditions and our DEER spectroscopy data demonstrate the sensitivity and resolution necessary to monitor these subtle changes in equilibria, even if these are only a few Angstroms. For several of the conditions we investigated by DEER in solution, corresponding X-ray structures have been solved, with the derived distances agreeing well with the DEER distributions. This further validates the biological relevance of the structures, and reveals the complete conformational ensemble, intractable using other current approaches. Indeed, some conformational states were previously seen using serial time-resolved X-ray static structures and were consistent with asymmetry.

      The ZLD-bound crystal structure doesn't predict the DEER distances, and the conformation of Na+ binding site sidechains in the ZLD structure doesn't predict whether sodium currents occur. This might suggest that the ZLD structure captures a conformation that does not recapitulate what is happening in solution/ a membrane. 

      We agree with the reviewer that the ZLD-bound crystal structure does not predict the DEER distances. However, we believe this discrepancy arises from the steric bulkiness of the ZLD inhibitor, which prevents the closure of the hydrolytic centre. Additionally, the absence of Na+ at the ion gate in the ZLD-bound structure suggests that Na+ transport does not occur, a conclusion further supported by our electrometric measurements. We agree with the reviewer; distances observed in the DEER experiments might represent a potential new conformation in solution, not captured by the static X-ray structure, thereby offering new insights into the dynamic nature of the protein under physiological conditions. This serves to emphasize the complementarity of the DEER approach to X-ray crystallography and redoubles the importance of using both techniques. Finally, the static X-ray structures have not captured the asymmetric conformations that must exist to explain half-of-the-sites reactivity, whereas DEER yields distance distributions, across all 16 cases tested here (two mutants with eight conditions each), that are consistent with asymmetry.

      Reviewer #2 (Public review):  

      Summary:  

      Crystallographic analysis revealed the asymmetric conformation of the dimer in the inhibitor-bound state. Based on this result, which is consistent with previous time-resolved analysis, the authors verified the dynamics and the distance between introduced spin labels by DEER spectroscopy in solution and predicted possible patterns of the asymmetric dimer.  

      Strengths:  

      Crystal structures with inhibitor bound provide detailed coordination in the binding pocket thus useful information for the mPPase field and maybe for drug development.  

      Weaknesses:  

      The distance information measured by DEER is advantageous for verifying the dynamics and structure of membrane protein in solution. However, regarding T211 data, which, as the authors themselves stated, lacks measurement precision, it is unclear for readers how confident one can judge the conclusion leading from these data for the cytoplasmic side. 

      We thank the reviewer for acknowledging the advantageous use of the DEER methodology for identifying dynamic states of membrane proteins in solution. In our original manuscript, we used two sites in our analysis: S525 (periplasm) and T211 (cytoplasm), in which S525R1 yielded high-quality DEER data, while T211R1 yielded weak (or no) visual oscillations, leading to broad distributions for the several conditions tested. In the revised manuscript, we have now added a third site on the cytoplasmic side (C599R1, located at TMH14), which yielded high-quality DEER data comparable to S525R1. Both the C599R1 and S525R1 spin pairs generated distance distributions for all 16 conditions (two mutants with eight conditions each) that were described well by the solution-state ensemble adopting a predominantly asymmetric conformation.  

      Furthermore, we have tailored our interpretation of the T211R1 DEER data, and refrain from using the data to draw conclusions about the TmPPase conformational ensemble in the presence of different inhibitors. However, we still opted to include the T211R1 data in the SI because they confirm an important structural feature of mPPase in solution conditions; the intrinsically dynamic behaviour of the loop5-6 where T211 is located. This observation in solution is also consistent with our previous (Kellosalo et al., Science, 2012; Li et al., Nat. Commun, 2016; Vidilaseris et al., Sci. Adv., 2019; Strauss et al., EMBO Rep., 2024) and current X-ray crystallography data. To reiterate, we excluded T211R1 from any analysis relating to mPPase asymmetry and our conclusions were entirely based on the S525R1 and new C599R1 DEER data, which allowed us to monitor both sides on the membrane.  

      The distance information for the luminal site, which the authors claim is more accurate, does not indicate either the possibility or the basis for why it is the ensemble of two components and not simply a structure with a shorter distance than the crystal structure.  

      We thank the reviewer for pointing out this possibility and alternative interpretation of our DEER data. We now provide further analysis to show that our DEER data from both membrane sides reporters are highly consistent with (although they cannot completely exclude) asymmetry and rephrase to be inclusive of other possibilities. Importantly, this additional possibility does not affect the current interpretation of the data in our manuscript. Furthermore, we have removed Fig. 6 from the manuscript, and we now include a direct comparison of the in silico predicted distribution coming from the asymmetric hybrid structure with the 8 conditions tested, for both mutants (i.e. S525R1 and C599R1).

      Reviewer #3 (Public review):  

      Summary:  

      Membrane-bound pyrophosphatases (mPPases) are homodimeric proteins that hydrolyze pyrophosphate and pump H+/Na+ across membranes. They are attractive drug targets against protist pathogens. Non-hydrolysable PPi analogue bisphosphonates such as risedronate (RSD) and pamidronate (PMD) serve as primary drugs currently used. Bisphosphonates have a P-C-P bond, whose central carbon can accommodate up to two substituents, allowing a large compound variability. Here the authors solved two TmPPase structures in complex with the bisphosphonates etidronate (ETD) and zoledronate (ZLD) and monitored their conformational ensemble using DEER spectroscopy in solution. These results reveal the inhibition mechanism of these compounds, which is crucial for developing future small molecule inhibitors.  

      Strengths:  

      The authors show that seven different bisphosphonates can inhibit TmPPase with IC50 values in the micromolar range. Branched aliphatic and aromatic modifications showed weaker inhibition.  

      High-resolution structures for TmPPase with ETD (3.2 Å) and ZLD (3.3 Å) are determined. These structures reveal the binding mode and shed light on the inhibition mechanism. The nature of modification on the bisphosphonate alters the conformation of the binding pocket.  

      The conformational heterogeneity is further investigated using DEER spectroscopy under several conditions.  

      Weaknesses:  

      The authors observed asymmetry in the TmPPase-ETD structure above the hydrolytic center. The structural asymmetry arises due to differences in the orientation of ETD within each monomer at the active site. As a result, loop5-6 of the two monomers is oriented differently, resulting in the observed asymmetry. The authors attempt to further establish this asymmetry using DEER spectroscopy experiments. However, the (over)interpretation of these data leads to more confusion than any further understanding. DEER data suggest that the asymmetry observed in the TmPPase-ETD structure in this region might be funneled from the broad conformational space under the crystallization conditions. 

      We respectfully disagree with the reviewer. The asymmetry was previously established using serial time-resolved crystallography (Strauss et al., EMBO Rep, 2024) and biochemical assays (e.g. Malinen et al., Prot. Sci., 2022; Artukka et al., Biochem J, 2018; Luoto et al., PNAS, 2013) and partially seen in one static structure (Vidilaseris et al., Sci Adv 2019). The DEER data here show that the previously proposed asymmetry is also present within the TmPPase conformational ensemble in solution conditions, and this presence of asymmetry is consistent across all DEER data. Although we cannot rule out the possibility that the TmPPase monomers adopt a metastable intermediate state, in such a case we would expect the distance changes reported by DEER to be symmetric across both membrane sides. However, we observe a symmetry breaking between the cytoplasmic and periplasmic TmPPase sites. Indeed, the DEER data yield distance distributions similar to that of the hybrid asymmetric structure under all conditions: apo, +Ca, +Ca/ETD, +ETD, +ZLD, +IDP, +PAM and +ALE.

      DEER data for position T211R1 at the enzyme entrance reveal a highly flexible conformation of loop5-6 (and do not provide any direct evidence for asymmetry, Figure EV8).

      Please see relevant response above. We acknowledge that T211 is indeed situated on a highly dynamic loop, which is important for gating and our DEER data confirm the high flexibility of this protein region. Given we have not observed dipolar oscillations, leading to broad distributions, we have stated in the original manuscript that we will not establish the presence of any asymmetry in solution on the basis of T211, rather relying on the S525R1 and the new C599R1 sites, for which we have acquired high-quality DEER data, as was also pointed out and has been commented on by all reviewers. We have provided data at the C599R1 position (same cytoplasmic side as 211 for which we have now limited our analysis to a minimum) which further provides evidence for asymmetry, including two new conditions.

      Similarly, data for position S521R1 near the exit channel do not directly support the proposed asymmetry for ETD.  

      The reviewer appears to suggest that we hold the S525R1 DEER data as direct proof of asymmetry; this is combative on the grounds that to directly prove asymmetry would require time-resolved DEER measurements, far beyond the scope of this work. Rather, we have applied DEER measurements to explore whether asymmetry (observed previously via time-resolved X-ray crystallography) is also present (or indeed a possibility) in solution. All our S525R1 and C599R1 DEER data (recorded for eight conditions) are consistent with asymmetry (see also detailed response above).

      Despite the high quality of the data, they reveal a very similar distance distribution. The reported changes in distances are very small (+/- 0.3 nm), which can be accommodated by a change of spin label rotamer distribution alone. Further, these spin labels are located on a flexible loop, thereby making it difficult to directly relate any distance changes to the global conformation

      We thank the reviewer for recognising the high quality of our DEER data for the S525R1 site, which we now complement with a new pair on the cytoplasmic-facing membrane side (C599R1) with DEER data of comparable quality. Visual oscillations in the raw traces, as seen here for both spin pairs, reportedly lead to highly accurate and reliable distributions, able to separate (in fortuitous cases) helical movements of only a few Angstroms (Peter et al., Nature Comms 13:4396, 2022; Klose et al., Biophys J 120:4842-4858, 2021). The ability of DEER/PELDOR to offer near-Angstrom resolution was also previously demonstrated by the acquisition and solution of high-resolution multi-subunit spin-labelled membrane protein structures (Pliotas et al., PNAS, 2012; Pliotas et al., Nat Struct Mol Biol, 2015; Pliotas, Methods Enzymol, 2017), as was its ability to detect small conformational changes (of similar magnitude to those in mPPase) in different integral membrane protein systems (Kapsalis et al., Nature Comms, 2019; Kubatova et al., PNAS, 2023; Schmidt et al., JACS, 2024; Lane et al., Structure, 2024; Hett et al., JACS, 2021; Zhao et al., Nature, 2024), occurring under different conditions and/or stimuli in solution and/or lipid environment. The changes here are not below the detection sensitivity of DEER (e.g. ~7 Angstroms between the two modal distance extremes, +Ca vs +IDP for S525R1), with all other conditions showing intermediate changes.

      We agree with the reviewer that these changes are relatively small, but they are expected for membrane ion pumps. Indeed, none of the mPPase structures show helical movements of greater than half a turn, and that only in helices 6 and 12. There appear to be larger-scale loop closing motions of the 5-6 loop that includes T211, due to the presence of E217 which binds to one of the Mg<sup>2+</sup> ions that coordinate the leaving group phosphate. This is, inter alia, the reason that this loop is so flexible: it cannot order before substrate is bound.  

      The reviewer suggests that the subtle distance shifts detected arise only from changes of label rotamer distribution. However, the concerted nature of the modal distance shifts with respect to multiple different conditions at a single labelling site strongly suggests that preferential rotamer orientations are not the cause. Indeed, for so many spin labels to undergo an arbitrary shift that the modal distance of the entire distribution changes, in the absence of any conformational change, appears improbable. Here we have the resolution to detect such subtle differences by DEER, given there are unambiguous shifts in our time-domain data (i.e. the position of the minimum of the first dipolar oscillation) (Fig 4) and these are reflected in the modal distances in the distributions. We also refrain from performing any quantitative analysis and use qualitative trends in modal distance shifts only, all of which support our proposed model of a symmetry breaking across the membrane face. To further belabour this point, we do not quantify the DEER data (for instance through parametric fitting) to extract populations of different conformational states, and we appreciate that to do so would be highly prone to error; however, we do (and can, we feel, without over-interpretation) assert that the modal distances shift.

      The interpretations listed below are not supported by the data presented:  

      (1) 'In the presence of Ca2+, the distance distribution shifts towards shorter distances, suggesting that the two monomers come closer at the periplasmic side, and consistent with the predicted distances derived from the TmPPase:Ca structure.'

      Problem: This is a far-stretched interpretation of a tiny change, which is not reliable for the reasons described in the paragraph above. 

      While the authors overall agree with the reviewer assessment that ±0.3 nm is a small (not a minor) change, there are literature examples quantifying (or using for quantification) distribution peaks separated by similar Δr. (Kubatova et al., PNAS, 2023; Schmidt et al., JACS, 2024; Hett et al., JACS, 2021; Zhao et al., Nature, 2024). However, the time-domain data clearly indicate the position of the first minimum of the dipolar oscillation shifts to shorter dipolar evolution time. The sensitivity of the time-domain data to subtle changes in dipolar coupling frequency is significantly improved compared to the distance distributions.
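
      The sensitivity argument above can be illustrated with the standard point-dipole relation between inter-spin distance and dipolar frequency for two nitroxides, ν ≈ 52.04 MHz·nm³ / r³. The following is a minimal sketch, not the analysis used in the manuscript: the two distances are hypothetical values chosen only to show the effect of a 0.3 nm shift, and the first minimum is approximated as half a period of a single cosine, which ignores the orientation averaging present in a real DEER trace.

```python
# Dipolar coupling constant for a pair of nitroxide spins (g ~ 2.0023), in MHz * nm^3.
DIPOLAR_CONSTANT_MHZ_NM3 = 52.04


def dipolar_frequency_mhz(r_nm: float) -> float:
    """Perpendicular dipolar frequency (MHz) for an inter-spin distance r (nm)."""
    return DIPOLAR_CONSTANT_MHZ_NM3 / r_nm ** 3


def first_minimum_us(r_nm: float) -> float:
    """Approximate time (microseconds) of the first minimum of cos(2*pi*nu*t).

    Assumption: a single cosine, so the minimum sits at half a period.
    """
    return 1.0 / (2.0 * dipolar_frequency_mhz(r_nm))


# Hypothetical distances (nm), chosen only to illustrate a 0.3 nm shift.
for r in (4.0, 4.3):
    print(f"r = {r:.1f} nm: nu = {dipolar_frequency_mhz(r):.3f} MHz, "
          f"first minimum near {first_minimum_us(r):.3f} us")
```

      Because the frequency scales as 1/r³, a shorter distance gives a higher frequency and therefore an earlier first minimum, which is the direction of shift described for the time-domain data.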

      Importantly, we have fitted Gaussians to the experimental distance distributions of 525R1 output by the Comparative DEER Analyzer 2.0 and observed a change in the distribution width in the presence of Ca2+, implying that the rotameric freedom of the spin label is restricted. However, the CW-EPR spectra for 525R1 indicate that the rotational correlation time of the spin label is highly consistent between conditions (the spectra are almost identical); this cannot be explained simply by rotameric preference of the spin label (as asserted by reviewer 3), as there is no (further) immobilisation observed from the CW-EPR of the apo-state (Figure EV9) to that in the presence of Ca2+. Furthermore, in the absence of conformational changes, it is reasonable to assume (and demonstrable from the CW-EPR data) that the rotamer cloud should not significantly change between conditions. However, Gaussian fits of the two extreme cases yielding the longest (i.e., in the presence of IDP) and shortest (in the presence of ZLD) modal distances for the 525R1 DEER data indicated significant (i.e., above the noise floor after Tikhonov validation) probability density for the IDP condition at 50 Å (P(r) = 0.18). This occurs at four standard deviations above the mean of the Gaussian fit to the +ZLD condition, which by random chance should occur with <0.007% probability.
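
      The probability quoted above can be checked against the tail of a normal distribution. The sketch below is an illustrative check rather than the authors' calculation; the function name is hypothetical, and whether the quoted figure refers to a one-sided or a two-sided tail is an assumption, so both are computed.

```python
import math


def gaussian_tail_probability(z: float, two_sided: bool = False) -> float:
    """Probability of a normal variate lying at least z standard deviations from the mean.

    One-sided: P(X - mu >= z * sigma). Two-sided: P(|X - mu| >= z * sigma).
    """
    upper_tail = 0.5 * math.erfc(z / math.sqrt(2.0))
    return 2.0 * upper_tail if two_sided else upper_tail


one_sided = gaussian_tail_probability(4.0)
two_sided = gaussian_tail_probability(4.0, two_sided=True)
print(f"one-sided: {100 * one_sided:.4f} %, two-sided: {100 * two_sided:.4f} %")
```

      Under either convention the four-sigma tail probability lies below the 0.007% bound stated in the response.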

      As in the previous response, the method can detect changes of such magnitude, which are not small but physiologically relevant and expected for integral membrane proteins such as mPPases. Indeed, even in equally (or more) complex systems such as heptameric mechanosensitive channel proteins, DEER provided sub-Angstrom accuracy when a spin-labelled high-resolution XRC structure was solved (Pliotas et al., PNAS, 2012; Pliotas et al., Nat Struct Mol Biol, 2015). Although this is an ideal and uncommon case, in which DEER accuracy was experimentally validated by another high-resolution structural method on a modified membrane protein, it demonstrates the power of the method: when strong oscillations are present in the raw DEER data (as here for mPPase S525R1 and C599R1), Angstrom resolution is achievable in such challenging protein classes, even when multiple distances are present.

      (2) 'Based on the DEER data on the IDP-bound TmPPase, we observed significant deviations between the experimental and the in silico distances derived from the TmPPase:IDP X-ray structure for both cytoplasmic- (T211R1) and periplasmic-end (S525R1) sites (Figure 4D and Figure EV8D). This deviation could be explained by the dimer adopting an asymmetric conformation under the physiological conditions used for DEER, with one monomer in a closed state and the other in an open state.'  

      Problem: The authors are trying to establish asymmetry using the DEER data. Unfortunately, no significant difference is observed (between simulation and experiment) for position 525 as the authors claim (Figure 4D bottom panel). The observed difference for position 112 must be accounted for by the flexibility and the data provide no direct evidence for any asymmetry.  

      Reviewer 3 is incorrect in suggesting that we are trying to prove asymmetry through the DEER data. That is a well-known fact in the literature (e.g. Vidilaseris et al, Sci Adv 2019) where we show (1) that the exit channel inhibitor ATC (i.e. close to S525R1) binds better in solution to the TmPPase:PPi complex than the TmPPase:PPi<sub>2</sub> complex, and (2) that ATC binds in an asymmetric fashion to the TmPPase:IDP<sub>2</sub> complex with just one ATC dimer on one of the exit channels. We merely use the DEER data to support this well-established fact.  

      However, because we agree that the DEER data in the presence of IDP do not provide direct proof of asymmetry, particularly for the cytoplasmic-facing mutant T211R1, we have refrained from interpreting the T211R1 data beyond identifying a highly dynamic loop region (as evidenced by the broad distributions). As pointed out by the reviewer, the differences in distance distributions between conditions observed for T211R1 likely arise from conformational heterogeneity in solution. Furthermore, we now report DEER data on another new site (C599R1), which is also on the cytoplasmic side and yields high-quality DEER data comparable to the S525R1 data (commended for their quality by both the reviewers). The C599R1 measurements show that in all conditions tested, highly similar distributions are observed, inconsistent with the in silico predicted distance distributions from the symmetric X-ray structures, but consistent with an asymmetric hybrid structure (i.e. open-closed) in solution. Importantly, the difference between the fully open (6.8 nm modal distance) and fully closed (4.8 nm modal distance) states of the C599R1 dimer is larger than for the S525R1 dimer pair. Thus, delineating the asymmetric hybrid conformation from the symmetric conformations is more robust.

      (3) 'Our new structures, together with DEER distance measurements that monitor the conformational ensemble equilibrium of TmPPase in solution, provide further solid experimental evidence of asymmetry in gating and transitional changes upon substrate/inhibitor binding.'  

      Problem: See above. The DEER data do not support any asymmetry. 

      We feel that the reviewer comments here are somewhat unfounded. All the DEER data (for the 525R1 periplasmic and C599R1 cytoplasmic sites) are described, most parsimoniously, using an asymmetric hybrid structure. In particular, the new C599R1 distance distributions are poorly described by the symmetric X-ray crystal structures, with a conserved modal distance of approx. 5.8 nm throughout the tested conditions that aligns nicely with the in silico predictions from the asymmetric hybrid structure. Additionally, all S525R1 and C599R1 data well exceed the relevant criteria of the recent white paper (Schiemann et al., 2021, JACS) from the EPR community to be considered reliably interpretable: strong visual oscillations in the raw traces; signal-to-noise ratio w.r.t. modulation depth of > 20 in all cases; replicates performed and added to the main text or supplementary; near-quantitative labelling efficiency (evidenced by lack of free spin label signal in the CW-EPR spectra); and analysis using the CDA (now Figure EV10) to avoid confirmation bias.

      While the DEER data do not prove asymmetry, we do not claim proof of asymmetry in the above sentence. We agree to rephrase the sentence above as: “Our new structures, together with DEER distance measurements that monitor the conformational ensemble of TmPPase in solution, do not exclude asymmetry in gating and transitional changes upon substrate/inhibitor binding and are consistent with our proposed model.” We feel that this reframed conjecture of asymmetry is well founded; indeed, comparing all 16 experimentally derived DEER distance distributions for the 525R1 and 599R1 sites with in silico modelling performed on the hybridised asymmetric structure (i.e., comprised of one monomer bound to Ca2+ and another bound to IDP) yields overlap coefficients (Islam and Roux, JPC B, 2015) of >0.85. This implies that the envelope of the modelled distance distribution lies largely within the envelope of the experimental distance distributions. Thus, the DEER data support asymmetry (previously observed by time-resolved XRC) in solution. While we appreciate that ideally one would measure time-resolved DEER to directly correlate kinetics of conformational changes within the ensemble to the catalytic cycle of mPPase (and this is something we aim to do in the future), it is far beyond the scope of this study.
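
      To make the overlap comparison concrete, one common way to score the agreement between two distance distributions is the integral of their pointwise minimum after normalising each to unit area (1 for identical distributions, 0 for disjoint ones). The sketch below assumes that definition; the exact form used in the cited work and in the manuscript may differ. The two Gaussians are synthetic placeholders, not experimental or modelled data from the study, and all function names are hypothetical.

```python
import math


def trapezoid(x, y):
    """Trapezoidal integral of y over the grid x."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i]) for i in range(len(x) - 1))


def overlap_coefficient(r, p, q):
    """Integral of min(P, Q) after normalising both distributions to unit area."""
    area_p = trapezoid(r, p)
    area_q = trapezoid(r, q)
    pointwise_min = [min(a / area_p, b / area_q) for a, b in zip(p, q)]
    return trapezoid(r, pointwise_min)


def gaussian(r, mean, sigma):
    """Unnormalised Gaussian profile evaluated on the grid r."""
    return [math.exp(-0.5 * ((x - mean) / sigma) ** 2) for x in r]


# Synthetic placeholder distributions on a 2-8 nm axis (not data from the study).
r_axis = [2.0 + 0.01 * i for i in range(601)]
experimental = gaussian(r_axis, 5.8, 0.30)
modelled = gaussian(r_axis, 5.9, 0.30)

score = overlap_coefficient(r_axis, experimental, modelled)
print(f"overlap coefficient: {score:.3f}")
```

      With this definition the score is symmetric in the two inputs and insensitive to their original scaling, which makes it convenient for comparing a modelled distribution against several experimental conditions.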

      Indeed, half-of-the-sites reactivity has been demonstrated in at least the following papers: Vidilaseris et al, Sci Adv, 2019; Strauss et al, EMBO Rep, 2024; Malinen et al, Prot Sci, 2022; Artukka et al, Biochem J, 2018; Luoto et al, PNAS, 2013. Half-of-the-sites activity requires asymmetry in the mechanism, and therefore asymmetric motions in the active site (viz 211) and exit channel (viz 525). As mentioned above, we have demonstrated this for other inhibitors (Vidilaseris et al 2019) and as part of a time-resolved experiment (Strauss et al 2024). In fact, given the wealth of evidence showing that the symmetrical crystal structures sample a non- or less-productive conformation of the protein, it would be quixotic to propose the DEER experiments - in solution - do not generate asymmetric conformations. It certainly doesn’t obey Occam’s razor of choosing the simplest possible explanation that covers the data.

      (4) Based on these observations, and the DEER data for +IDP, which is consistent with an asymmetric conformation of TmPPase being present in solution, we propose five distinct models of TmPPase (Figure 7).  

      Problem: Again, the DEER data do not support any asymmetry and the authors may revisit the proposed models. 

      We have redressed the proposed models and limited them to four asymmetric models to clearly illustrate the apo/+Ca/+Ca:ETD-state (model 1) and highlight the distinct binding patterns of various inhibitors (ETD, ZLD and IDP; model 2-4), which result in a variety of closed/open-open states. In this version, we clarify that the proposed models are not solely based on the DEER data but all DEER data recorded for multiple conditions, inhibitors and for two opposite membrane side facing reporters are highly consistent, and are grounded in both current and previously solved structures, with the DEER data providing additional consistency with these models.

      (5) 'In model 2 (Figure 7), one active site is semi-closed, while the other remains open. This is supported by the distance distributions for S525R1 and T211R1 for +Ca/ETD informed by DEER, which agrees with the in silico distance predictions generated by the asymmetric TmPPase:ETD X-ray structure'  

      Problem: Neither convincing nor supported by the data 

      We respectfully disagree with the reviewer. However, owing to the conformational heterogeneity of T211R1, we now exclude T211R1 data from quantitative interpretation of changes to the conformational ensemble. Instead, we include new DEER data from site C599R1, which provides high-quality and convincing data that is consistent with asymmetry at the cytoplasmic face, and inconsistent with in silico distance distributions derived from symmetric X-ray crystal structures. Furthermore, the S525R1 distance distributions for the +ETD (corresponding to +Ca/ETD) and +ZLD conditions were directly compared with both the apo-state distance distribution (corresponding to a fully open, symmetric conformation) and the in silico predicted distributions of the asymmetric hybrid structure (corresponding to an open-closed conformation). Overlap coefficients were calculated (given in the main text) that indicated the +ETD (corresponding to +Ca/ETD) and +ZLD S525R1 distributions were more consistent with the apo-state distance distribution. This suggests that while on the cytosolic face of the membrane, an open-closed conformation is favoured, on the periplasmic face, a symmetric open-open conformation is favoured.

      Recommendations for the authors:  

      Reviewer #1 (Recommendations for the authors):   

      (1) The DEER experiments were performed with the two crystallized inhibitors, ETD and ZLD, along with previously characterized IDP. It would increase the impact if a tighter-binding phosphonate was examined since the inhibitory mechanism of these molecules is of greater interest. 

      We acknowledge the reviewer concern regarding the choice of weaker inhibitors. We chose to focus on the weaker binders, as we were able to obtain high-quality crystal structures for these compounds. This allowed us to perform DEER spectroscopy with the added advantage of accurately analysing the data against structural models derived from X-ray crystallography. In the revised version, we also include results from alendronate and pamidronate, two of the tighter inhibitors, which show similar and consistent results to the others.

      (2) I'm not able to find the concentrations of ETD and ZLD used for the DEER experiments. This information should be added to the Methods section on sample prep for EPR. 

      The information is already mentioned in the Method section on sample preparation for EPR spectroscopy (page 24), where we indicated that the protein aliquots were incubated with a final concentration of 2 mM inhibitors or 10 mM CaCl2 (30 min, RT). However, we recognise that this may not have been sufficiently clear. To clarify, we now explicitly state that the concentration of ETD and ZLD (amongst other inhibitors) used for the DEER experiments is 2 mM.  

      (3) There should be additional detail about the electrometry replicates. Does "triplicate" mean three measurements on the same sensor, three different sensors, and different protein preparations? At a minimum, data should be collected from three different sensors to ensure that the negative results (lack of current) for ETD and ZLD are not due to a failed sensor prep. In addition, Data from the other replicates should be shown in a supplementary figure, either the traces, or in a summary figure. Are the traces shown collected on the same sensor? They could be, in principle, since the inhibitor is washed away after each perfusion. 

      Yes, by 'triplicate', we mean three measurements taken on the same sensor. All traces shown were collected from a single sensor. Thank you for your advice; we now show here additional data from other sensors that display the same pattern. As for the possibility of a failed sensor preparation, this is unlikely since we always ensure the sensor quality with the substrate (PPi) as a positive control after each measurement.

      Author response image 1.

      (4) I'm confused by the NEM modification assay, and I don't think there is enough information in this manuscript for a reader to figure out what is happening. Why is the protein active if an inhibitor is present? I understand that there is a conformational change in the presence of the inhibitor that buries a cysteine, but the inhibitor itself should diminish function, correct? Is the inhibitor removed before testing the function? In addition, it would be clearer if the cysteines that are modified are indicated in the main text. I don't understand what is being shown in Figure EV2. Shouldn't the accessible cysteines in the apo form be shown? Finally, the sentence "IDP has been reported to prevent the NEM modification..." does not make sense to me. Should the word "by" be removed from this sentence? 

      We apologize for the confusion. Yes, the inhibitors were removed before testing the protein function. In Figure EV2, the accessible cysteines are shown for both the apo and IDP-bound states. As seen, the accessible cysteines in the IDP-bound states are fewer than those in the apo state, meaning fewer cysteines are available for modification. Consequently, more activity is retained when IDP binds due to the reduction in accessible cysteines. We have addressed this in the manuscript (see the method section on the NEM modification assay).

      (5) Why does the model in Figure 7 show the small molecules bound to only one subunit, when they are crystallized in both subunits? 

      We propose that the binding of small molecules to both subunits in the crystal structure is likely a result of substrate inhibition, given the excess inhibitor used during crystallisation (e.g. Artukka, et al., Biochemical Journal, 2018; Vidilaseris, et al., Science Advances, 2022). Our PELDOR data indicate that in solution, TmPPase with small molecules bound is in an intermediate state between both subunits being closed and both being open, most likely with at least one subunit in an open state. This is also consistent with previous kinetic studies (Anashkin, V. A., et al., International Journal of Molecular Sciences, 22, 2021), which showed that the binding constant of IDP to the second subunit is around 120 times higher than that of the first subunit.

      (6) The authors argue that the two ETDs bound in the two protomers adopt distinct conformations. Can this be further supported, for example, by swapping the position of the two ETDs between the two protomers and calculating a difference map (there should be corresponding negative/positive density if the modelling of the two different conformations is robust)? 

      As per the reviewer suggestion, we swapped the positions of the two ETDs between the protomers and calculated the difference electron density map. This analysis, presented in Figure EV3, reveals corresponding negative and positive electron density peaks, indicating that the ETDs indeed adopt distinct conformations in each protomer, supporting the accuracy of our modeling.

      (7) Are the changes in loop conformation possibly due to crystal packing differences for the two protomers? 

      We examined the crystal packing of the two protomers and found no interactions at the loop regions (red coloured in Author response image 2 below) that could be attributed to crystal packing differences. Therefore, we rule out this possibility.

      Author response image 2.

      (8) Typos:  

      Legend for Figure EV2 cystine - cysteine  

      Page 14, last sentence of the first paragraph: further - further  

      Figure 6 legend: there is no reference to panel B.  

      Thanks for pointing out the typos; they are now fixed.

      Reviewer #2 (Recommendations for the authors):  

      (1) T211 is located on the same loop where ligand/inhibitor-coordinating side chains (E217, D218) are located. It has not been tested whether spin labeling here would affect inhibitor binding. 

      We tested the activity of all mutants before spin labelling, but not the activity of the spin-labelled mutants. MTSSL spin labels are typically not structurally perturbing. In particular, the T211R1 site that the reviewer is referring to is now not included in our interpretation of conformational changes occurring during mPPase’s functional cycle.

      (2) Why should the spin label be introduced to T211, which is recognized as a flexible region in the crystal structure? Authors should search for suitable residues except for T211 and other residues in this loop to evaluate the cytoplasmic distance. 

      We acknowledge the reviewer’s concern regarding the flexibility of the T211 region for spin labelling. Given the challenges associated with TmPPase, including reduced protein expression, loss of function, or inaccessibility upon spin labelling at certain sites, we have explored alternative residues. After extensive testing, we identified C599 as a suitable site for spin labelling resulting in high-quality DEER data. The results from spin labelling at C599 have been incorporated into the revised manuscript.

      (3) On the other hand, DEER data for S525 is solid, as the authors stated. This residue is located on the luminal side of the enzyme. However, the description of the luminal side structure and the comparison of symmetric/asymmetric dimer in this part are missing in the paper. 

      We thank the reviewer for their positive assessment of the S525R1 DEER data. The data for the 525 and now also the 599 spin pairs are indeed solid, given the strong visual oscillations we observed, particularly in such a challenging system.

      We presented the periplasmic sites in the crystal structure dimer (Figure 4A), highlighting both the symmetrical region and the asymmetric model in Figure 4. In the revised version, we include additional details about this region and our rationale for labeling at position S525.

      (4) The conclusion models (Figure 7) are misleading. In the crystal structure, the 5-6Loop distance between each monomer should be close given the location of the dimer interface, and the actual distance between T211 in the structure (for example, in 5lzq) is about 10A. Nevertheless, the model depicts this distance longer than S525 (40.7A in 5LZQ), which would give a false impression. 

      We would like to apologize for the misleading model. We have now corrected the models to ensure they are consistent with their respective regions in the crystal structures.

      (5) P8 last paragraph  

      It is hard to imagine that in a crystal lattice, the straight inhibitor always binds to monomer A, and the neighboring monomer is always attached to a slightly tilted inhibitor, which causes asymmetry. For example, wouldn't it mean that it would first bind to one of them, which would then affect the neighboring monomer via 5-6 Loop, which would then affect its binding pose? So in this case, the inhibitor did not ARAISE asymmetry, and this is where it is misleading for readers. 

      We apologize for the confusion. What we intended to convey is that the first inhibitor binds to one protomer, which then affects the conformation of the neighbouring monomer, ultimately influencing its binding pose. This is required for half-of-the-sites reactivity, which is well-established in this system. This is reflected in our crystal structure, where we observed asymmetry in the loop 5-6 region and the ETD orientation between the two protomers. We have addressed this in the manuscript accordingly.

      (6) P11 L4 EV10 instead of EV8? 

      Thanks for pointing this out. We have corrected it accordingly.

      (7) P11 L5 It is difficult to determine whether the peak is broad or sharp. Should be evaluated quantitatively by showing the half-value width of the peak. This may also be helpful to judge whether the peak is a mixture of two components or a single one. 

      We have taken this analysis out and rephrased the sentence in question. We have also added the FWHM values, as the reviewer suggested, and the corresponding standard deviations for the distance distributions (under the approximation of a Gaussian distribution).
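
      For a Gaussian approximation, the FWHM and the standard deviation are related by FWHM = 2·sqrt(2·ln 2)·σ ≈ 2.355·σ, so either quantity can be reported from the other. The sketch below only illustrates that conversion; the function names are hypothetical, and the relation holds only insofar as a single Gaussian describes the distance distribution.

```python
import math

# Conversion factor between the standard deviation and FWHM of a Gaussian.
FWHM_PER_SIGMA = 2.0 * math.sqrt(2.0 * math.log(2.0))


def fwhm_from_sigma(sigma: float) -> float:
    """Full width at half maximum of a Gaussian with standard deviation sigma."""
    return FWHM_PER_SIGMA * sigma


def sigma_from_fwhm(fwhm: float) -> float:
    """Standard deviation of a Gaussian with the given full width at half maximum."""
    return fwhm / FWHM_PER_SIGMA


print(f"FWHM per unit sigma: {FWHM_PER_SIGMA:.4f}")
```

      A distribution containing two overlapping components would typically show a FWHM larger than this relation predicts from the width of either component alone, which is why reporting the width is useful for judging whether a peak is a single population.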

      (8) Throughout the paper, the topology of the enzyme may be difficult to follow for readers who are not experts in this field. Please indicate the membrane plane's location or a figure's viewpoint in the caption. 

      We acknowledge the importance of making our figures accessible to all readers. In the revised manuscript, we have enhanced the clarity of our figures by explicitly indicating the membrane plane’s location and specifying the viewpoint in each figure caption. For example, we have added annotations such as “Top view of the superposition of chain A (cyan) and chain B (wheat), showing the relative movements (black arrow) of helices. The membrane plane is indicated by dashed lines.”

      (9) Figure 2B Check the color of the helix.  

      IDP and ETD are almost the same color, so it is difficult to see the superposition. It would be easier to understand the reading by, for example, using a lighter or transparent color set only for IDPs.  

      We acknowledge the reviewer concern regarding the colour similarity between the IDP and ETD in Figure 2B, which hinders clear differentiation. To enhance visual distinction, we have adjusted the colour scheme by changing the TmPPase:IDP structure colour to light blue. This modification improves the clarity of the superposition, making the structural differences more discernible.

      (10) Figure 2C Check the coordination state (dotted line), there appears to be coordination between E217Cg and Mg. Also, water that is located near N492 appears to be a bit distant from Mg, why does this act as a ligand? Stereo view or view from different angles, and distance information would help the reader understand the bonding state in more detail.  

      Yes, we confirm that Mg<sup>2+</sup> is coordinated by the oxygen atoms from both the side chain and main chain of residue E217. The water molecule near N492 is not directly coordinated with Mg<sup>2+</sup> but interacts with the O5 atom of one of the phosphate groups in ETD. To enhance clarity, we have updated Figure 2C (and other related figures) to include stereo views.  

      (11) Figure 5A: in the Bottom view (lower left), the symmetric dimer does not look symmetric. Better to view from a 2-fold axis exactly.  

      We have taken this figure out entirely and instead add a direct comparison to the in silico predicted distribution from the asymmetric hybrid structure to all 16 experimental DEER distributions. We have added the symmetric and asymmetric structures to Fig. 4A and view the symmetric structure along the 2-fold axis, as suggested.   

      (12) Figure 5B: Indicate which data is plotted in the caption.  

      As mentioned above, we have taken this figure out, as we felt quantifying two overlapping populations from a single Gaussian was over-interpretation of the data, and at the suggestion of reviewer 3, we have tailored our interpretation here.  

      (13) Figure EV8:  

      Because the authors discuss a lot about their conclusive model based on this data, Figure EV8 should be treated as a main figure, not a supplement. However, this reviewer has serious concerns about the measurement in this figure. Because DEER for T211 is too noisy, I don't see the point in discussing this in detail. For example, in the Ca/ETD data, there is a peak near 50A, but it would be difficult for TM5 to move away from this distance unless the protein unfolds. I do not find it meaningful to discuss using measurement results in which such an impossible distance is detected as a peak.  

      A: Show top view as in Figure 5  

      D: 2nd row dotted line. Regarding the in silico model that is used as a reference to compare the distance information, the distance of 40-50 A for T211 in the Ca-bound form is hard to imagine. PDB 4av6 model shows that T211 is disordered and not visible, but given the position of the TM5 helix, it does not appear to be that different from the IDR binding structure (5LZQ, 10A between two T211). The structures of in silico models are not shown in the figure, as it is only mentioned as modeled in RoseTTAFold. Please indicate their structures, especially focused on the relative orientation of T211 and S525 in the dimer, which would allow readers to determine the distances.  

      We acknowledge the reviewer’s concerns regarding Figure EV8 and the DEER data for T211R1. Upon re-evaluation, we recognize that the non-oscillating nature of the DEER data for T211R1 leads to broad distributions, indicating increased conformational dynamics, which is expected for a highly dynamic loop. Consequently, we have limited the discussion and interpretation of T211R1 in the revised manuscript and focused more on C599R1.

      Reviewer #3 (Recommendations for the authors):  

      A careful interpretation of the data in view of these limitations and without directly linking to asymmetry could solve the problem of the over-interpretation of the DEER data.  

      We respectfully disagree with the reviewer. Please see our detailed response above.  

      Additional comments:  

      (1) Did the authors use a Cys-less construct for spin labeling and DEER experiments?  

      We utilized a nearly Cys-less construct in which all native cysteines were mutated to serine, except for Cys183, which was retained due to its buried location and functional importance. We then introduced single cysteine mutations for spin labelling. For C599, Ser599 was reverted to cysteine.

      (2) The time data for position T211R1 is too short for most cases (Figure EV8D) for a reliable distance determination. No confidence interval is given for the '+Ca' sample distance distributions.  

      We recorded longer time traces for two of the conditions to better assign the background. We did not use the 211R1 data to reach any conclusions regarding asymmetry, which were based on the 525R1 and the 599R1 data. We now simply include T211R1 data to indicate the high mobility observed at loop5-6. We have added the confidence interval for the +Ca condition.  

      (3) It is recommended to mention the 2+1 artefact obvious at the end of the DEER data. 

      In the methods section, we have mentioned that the “2+1” artefact present at the end of the S525R1 and T211R1 DEER data likely arises from using a 65 MHz offset rather than the 80 MHz offset used for the C599R1 data; the larger offset avoids significant overlap of the pump and detection pulses. We also mention in the methods section that, owing to the intense “2+1” artefact, we decided to truncate the artefact to minimise its impact on data treatment. We used the lower offset of 65 MHz to maximise the achievable signal-to-noise ratio (SNR) because, particularly for the T211R1 data, the detected echo was quite weak. This was further exacerbated by the poor transverse relaxation time observed at that site.  

      (4) Please check the number of significant digits for all the reported values. 

      We have addressed the number of significant digits as requested.

      (5) Please report the mean distances from DEER experiments with the standard deviation or FWHM.

      We have addressed this in the revised manuscript: we report modal distances rather than mean distances and provide the FWHM and standard deviation.

    1. eLife Assessment

      This study presented valuable findings regarding the basic molecular pathways leading to the cystogenesis of Autosomal Dominant Polycystic Kidney Disease, suggesting BICC1 functions as both a minor causative gene for PKD and a modifier of PKD severity. Although some solid data were supplied to show the functional and structural interactions between BICC1 and PKD2 and their relevance to the pathogenesis of ADPKD, the characterization of such interactions appears to be incomplete, which renders the specific relevance of these findings for disease etiology unclear.

    2. Reviewer #1 (Public review):

      In this manuscript, Tran et al. investigate the interaction between BICC1 and ADPKD genes in renal cystogenesis. Using biochemical approaches, they reveal a physical association between Bicc1 and PC1 or PC2 and identify the motifs in each protein required for binding. Through genetic analyses, they demonstrate that Bicc1 inactivation synergizes with Pkd1 or Pkd2 inactivation to exacerbate PKD-associated phenotypes in Xenopus embryos and potentially in mouse models. Furthermore, by analyzing a large cohort of PKD patients, the authors identify compound BICC1 variants alongside PKD1 or PKD2 variants in trans, as well as homozygous BICC1 variants in patients with early-onset and severe disease presentation. They also show that these BICC1 variants repress PC2 expression in cultured cells.

      Overall, the concept that BICC1 variants modify PKD severity is plausible, the data are robust, and the conclusions are largely supported. However, several aspects of the study require clarification and discussion:

      (1) The authors devote significant effort to characterizing the physical interaction between Bicc1 and Pkd2. However, the study does not examine or discuss how this interaction relates to Bicc1's well-established role in posttranscriptional regulation of Pkd2 mRNA stability and translation efficiency.

      (2) Bicc1 inactivation appears to downregulate Pkd1 expression, yet it remains unclear whether Bicc1 regulates Pkd1 through direct interaction or by antagonizing miR-17, as observed in Pkd2 regulation. This should be further examined or discussed.

      (3) The evidence supporting Bicc1 and ADPKD gene cooperativity, particularly with Pkd1, in mouse models is not entirely convincing, likely due to substantial variability and the aggressive nature of Bpk/Bpk mice. Increasing the number of animals or using a milder Bicc1 strain, such as jcpk heterozygotes, could help substantiate the genetic interaction.

    3. Reviewer #2 (Public review):

      Tran and colleagues report evidence supporting the expected yet undemonstrated interaction between the Pkd1 and Pkd2 gene products Pc1 and Pc2 and the Bicc1 protein in vitro, in mice, and collaterally, in Xenopus and HEK293T cells. The authors go on to convincingly identify two large and non-overlapping regions of the Bicc1 protein important for each interaction and to perform gene dosage experiments in mice that suggest that Bicc1 loss of function may compound with Pkd1 and Pkd2 decreased function, resulting in PKD-like renal phenotypes of different severity. These results led to examining a cohort of very early onset PKD patients to find three instances of co-existing mutations in PKD1 (or PKD2) and BICC1. Finally, preliminary transcriptomics of edited lines gave variable and subtle differences that align with the theme that Bicc1 may contribute to the PKD defects, yet are mechanistically inconclusive.

      These results are potentially interesting, despite the limitation, also recognized by the authors, that BICC1 mutations seem exceedingly rare in PKD patients and may not "significantly contribute to the mutational load in ADPKD or ARPKD". The manuscript has several intrinsic limitations that must be addressed.

      The manuscript contains factual errors, imprecisions, and language ambiguities. This has the effect of making this reviewer wonder how thorough the research reported and analyses have been.

    4. Reviewer #3 (Public review):

      Summary:

      This study investigates the role of BICC1 in the regulation of PKD1 and PKD2 and its impact on cystogenesis in ADPKD. By utilizing co-IP and functional assays, the authors demonstrate physical, functional, and regulatory interactions between these three proteins.

      Strengths:

      (1) The scientific principles and methodology adopted in this study are excellent, logical, and reveal important insights into the molecular basis of cystogenesis.

      (2) The functional studies in animal models provide tantalizing data that may lead to a further understanding and may consequently lead to the ultimate goal of finding a molecular therapy for this incurable condition.

      (3) In describing the patients from the Arab cohort, the authors have provided excellent human data for further investigation in large ADPKD cohorts. Even though there was no patient material available, such as HUREC, the authors have studied the effects of BICC1 mutations and demonstrated its functional importance in a Xenopus model.

      Weaknesses:

      This is a well-conducted study and could have been even more impactful if primary patient material was available to the authors. A further study in HUREC cells investigating the critical regulatory role of BICC1 and potential interaction with mir-17 may yet lead to a modifiable therapeutic target.

      Conclusion:

      The authors achieve their aims. The results reliably demonstrate the physical and functional interaction between BICC1 and PKD1/PKD2 genes and their products.

      The impact is hopefully going to be manifold:

      (1) Progressing the understanding of the regulation of the expression of PKD1/PKD2 genes.

      (2) Role of BICC1 in mir/PKD1/2 complex should be the next step in the quest for a modifiable therapeutic target.

    5. Author response:

      Reviewer #1 (Public Review):

      In this manuscript, Tran et al. investigate the interaction between BICC1 and ADPKD genes in renal cystogenesis. Using biochemical approaches, they reveal a physical association between Bicc1 and PC1 or PC2 and identify the motifs in each protein required for binding. Through genetic analyses, they demonstrate that Bicc1 inactivation synergizes with Pkd1 or Pkd2 inactivation to exacerbate PKD-associated phenotypes in Xenopus embryos and potentially in mouse models. Furthermore, by analyzing a large cohort of PKD patients, the authors identify compound BICC1 variants alongside PKD1 or PKD2 variants in trans, as well as homozygous BICC1 variants in patients with early-onset and severe disease presentation. They also show that these BICC1 variants repress PC2 expression in cultured cells.

      Overall, the concept that BICC1 variants modify PKD severity is plausible, the data are robust, and the conclusions are largely supported. However, several aspects of the study require clarification and discussion:

      (1) The authors devote significant effort to characterizing the physical interaction between Bicc1 and Pkd2. However, the study does not examine or discuss how this interaction relates to Bicc1's well-established role in posttranscriptional regulation of Pkd2 mRNA stability and translation efficiency.

      The reviewer is correct that the present study has not addressed the downstream consequences of this interaction, considering that Bicc1 is a posttranscriptional regulator of Pkd2 (and potentially Pkd1). We think that the Bicc1/Pkd1/Pkd2 complex retains Bicc1 in the cytoplasm and thus restricts its participation in posttranscriptional regulation. As we do not yet have experimental data to support this model, we have not included it in the manuscript. However, we will update the discussion of the manuscript to further elaborate on the potential mechanism of the Bicc1/Pkd1/Pkd2 complex.

      (2) Bicc1 inactivation appears to downregulate Pkd1 expression, yet it remains unclear whether Bicc1 regulates Pkd1 through direct interaction or by antagonizing miR-17, as observed in Pkd2 regulation. This should be further examined or discussed.

      This is a very interesting comment. The group of Vishal Patel published that PKD1 is regulated by a mir-17 binding site in its 3’UTR (PMID: 35965273). We, however, have not evaluated whether BICC1 participates in this regulation. A definitive answer would require us to utilize some of the mice described in the above reference, which is beyond the scope of this manuscript. We will, however, revise the discussion to elaborate on this potential mechanism.

      (3) The evidence supporting Bicc1 and ADPKD gene cooperativity, particularly with Pkd1, in mouse models is not entirely convincing, likely due to substantial variability and the aggressive nature of Bpk/Bpk mice. Increasing the number of animals or using a milder Bicc1 strain, such as jcpk heterozygotes, could help substantiate the genetic interaction.

      We initially performed the analysis using the Bicc1 complete knockout that we previously reported (PMID 20215348), focusing on compound heterozygotes. Yet, as with the Pkd1/Pkd2 compound heterozygotes (PMID 12140187), no cyst development was observed by the time we sacrificed the mice at P21. Our strain is similar to the above-mentioned jcpk, which is characterized by a short, abnormal transcript thought to result in a null allele (PMID: 12682776). We thank the reviewer for pointing us to the reference showing that heterozygous mice develop glomerular cysts as adults (PMID: 7723240). This suggestion is an interesting idea that we will investigate. In general, we agree with the reviewer that a better understanding of the contribution of Bicc1 to the adult PKD phenotype will be critical. To this end, we are currently generating a floxed allele of Bicc1 that will allow us to address the cooperativity in the adult kidney when, for example, crossed to the Pkd1<sup>RC/RC</sup> mice. Yet, these experiments are unfortunately beyond the scope of this manuscript.

      Reviewer #2 (Public Review):

      Tran and colleagues report evidence supporting the expected yet undemonstrated interaction between the Pkd1 and Pkd2 gene products Pc1 and Pc2 and the Bicc1 protein in vitro, in mice, and collaterally, in Xenopus and HEK293T cells. The authors go on to convincingly identify two large and non-overlapping regions of the Bicc1 protein important for each interaction and to perform gene dosage experiments in mice that suggest that Bicc1 loss of function may compound with Pkd1 and Pkd2 decreased function, resulting in PKD-like renal phenotypes of different severity. These results led to examining a cohort of very early onset PKD patients to find three instances of co-existing mutations in PKD1 (or PKD2) and BICC1. Finally, preliminary transcriptomics of edited lines gave variable and subtle differences that align with the theme that Bicc1 may contribute to the PKD defects, yet are mechanistically inconclusive.

      These results are potentially interesting, despite the limitation, also recognized by the authors, that BICC1 mutations seem exceedingly rare in PKD patients and may not "significantly contribute to the mutational load in ADPKD or ARPKD". The manuscript has several intrinsic limitations that must be addressed.

      As mentioned above, the study was designed to explore whether there is an interaction between BICC1 and PKD1/PKD2 and whether this interaction is functionally important. How this translates into clinical relevance will require additional studies (and we have addressed this in the discussion of the manuscript).

      The manuscript contains factual errors, imprecisions, and language ambiguities. This has the effect of making this reviewer wonder how thorough the research reported and analyses have been.

      We respectfully disagree with the reviewer on the latter interpretation. The study was performed with rigor. We have carefully assessed the critiques raised by the reviewer. Most of the criticisms raised by the reviewer will be easily addressed in the revised version of the manuscript. Yet, none of the critiques raised by the reviewer seems to directly impact the overall interpretation of the data.

      Reviewer #3 (Public Review):

      Summary:

      This study investigates the role of BICC1 in the regulation of PKD1 and PKD2 and its impact on cystogenesis in ADPKD. By utilizing co-IP and functional assays, the authors demonstrate physical, functional, and regulatory interactions between these three proteins.

      Strengths:

      (1) The scientific principles and methodology adopted in this study are excellent, logical, and reveal important insights into the molecular basis of cystogenesis.

      (2) The functional studies in animal models provide tantalizing data that may lead to a further understanding and may consequently lead to the ultimate goal of finding a molecular therapy for this incurable condition.

      (3) In describing the patients from the Arab cohort, the authors have provided excellent human data for further investigation in large ADPKD cohorts. Even though there was no patient material available, such as HUREC, the authors have studied the effects of BICC1 mutations and demonstrated its functional importance in a Xenopus model.

      Weaknesses:

      This is a well-conducted study and could have been even more impactful if primary patient material was available to the authors. A further study in HUREC cells investigating the critical regulatory role of BICC1 and potential interaction with mir-17 may yet lead to a modifiable therapeutic target.

      This is an excellent suggestion. We agree with the reviewer that it would have been interesting to analyze HUREC material from the affected patients. Unfortunately, besides DNA and the phenotypic analysis described in the manuscript, neither human tissue nor primary patient-derived cells were collected before the two patients with the BICC1 p.Ser240Pro mutation passed away. To address this missing link, we have, as a first pass, generated HEK293T cells carrying the BICC1 p.Ser240Pro variant. While these admittedly are not kidney epithelial cells, they indeed show a reduced level of PC2 expression. These data are shown in the manuscript. We have not yet addressed how this relates to its crosstalk with miR-17.

      Conclusion:

      The authors achieve their aims. The results reliably demonstrate the physical and functional interaction between BICC1 and PKD1/PKD2 genes and their products.

      The impact is hopefully going to be manifold:

      (1) Progressing the understanding of the regulation of the expression of PKD1/PKD2 genes.

      (2) Role of BICC1 in mir/PKD1/2 complex should be the next step in the quest for a modifiable therapeutic target.

    1. eLife Assessment

      The ratio of nuclei to cell volume is a well-controlled parameter in eukaryotic cells. This important study now substantially advances our understanding of the regulatory relationship between cell size and the number of nuclei by identifying novel players in this process. The evidence supporting the conclusions is compelling, with biochemical assays and state-of-the-art microscopy. The paper will be of broad interest for cell biologists and fungal biotechnologists seeking to understand mechanisms determining cell size and number of nuclei, and why this knowledge is also of significant importance for the production of enzymes, and thus production strains not only of Aspergillus oryzae, but also other industrially used fungi.

    2. Reviewer #1 (Public review):

      Filamentous fungi are established workhorses in biotechnology, with Aspergillus oryzae as a prominent example with a thousand-year history. Still, the cell biology and biochemical properties of the production strains are not well understood. The paper of the Takeshita group describes the change in nuclear numbers and correlates it to different production capacities. They used microfluidic devices to really correlate the production with nuclear numbers. In addition, they used microdissection to understand expression profile changes and found an increase in ribosomes. The analysis of two genes involved in cell volume control in S. pombe did not reveal conclusive answers to explain the phenomenon. It appears that it is a multi-trait phenotype. Finally, they identified SNPs in many industrial strains and tried to correlate them to the capability of increasing their nuclear numbers.

      The methods used in the paper range from high-quality cell biology, Raman spectroscopy, to atomic force and electron microscopy, and from laser microdissection to the use of microfluidic devices to study individual hyphae.

      This is a very interesting, biotechnologically relevant paper with the application of excellent cell biology. I have only minor suggestions for improvement.

    3. Reviewer #2 (Public review):

      Summary:

      In the study presented by Itani and colleagues, it is shown that some strains of Aspergillus oryzae - especially those used industrially for the production of sake and soy sauce - develop hyphae with a significantly increased number of nuclei and cell volume over time. These thick hyphae are formed by branching from normal hyphae and grow faster and therefore dominate the colonies. The number of nuclei positively correlates with the thicker hyphae and also the amount of secreted enzymes. The addition of nutrients such as yeast extract or certain amino acids enhanced this effect. Genome and transcriptome analyses identified genes, including rseA, that are associated with the increased number of nuclei and enzyme production. The authors conclude from their data involvement of glycosyltransferases, calcium channels, and the tor regulatory cascade in the regulation of cell volume and number of nuclei. Thicker hyphae and an increased number of nuclei were also observed in high-production strains of other industrially used fungi such as Trichoderma reesei and Penicillium chrysogenum, leading to the hypothesis that the mentioned phenotypes are characteristic of production strains, which is of significant interest for fungal biotechnology.

      Strengths:

      The study is very comprehensive and involves the application of diverse state-of-the-art cell biological, biochemical, and genetic methods. Overall, the data are properly controlled and analyzed, figures and movies are of excellent quality.

      The results are particularly interesting with regard to the elucidation of molecular mechanisms that regulate the size of fungal hyphae and their number of nuclei. For this, the authors have discovered a very good model: (regular) strains with a low number of nuclei and strains with a high number of nuclei. Also, the results can be expected to be of interest for the further optimization of industrially relevant filamentous fungi.

      Weaknesses:

      There are only a few open questions concerning the activity of the many nuclei in production strains (active versus inactive), their number of chromosomes (haploid/diploid), and whether hyper-branching always leads to propagation of nuclei.

    4. Reviewer #3 (Public review):

      Summary:

      The authors seek to determine the underlying traits that support the exceptional capacity of Aspergillus oryzae to secrete enzymes and heterologous proteins. To do so, they leverage the availability of multiple domesticated isolates of A. oryzae along with other Aspergillus species to perform comparative imaging and genomic analysis.

      Strengths:

      The strength of this study lies in the use of multifaceted approaches to identify significant differences in hyphal morphology that correlate with enzyme secretion, which is then followed by the use of genomics to identify candidate functions that underlie these differences.

      Weaknesses:

      There are aspects of the methods that would benefit from the inclusion of more detail on how experiments were performed and data interpreted.

      Overall, the authors have achieved their aims in that they are able to clearly document the presence of two distinct hyphal forms in A. oryzae and other Aspergillus species, and to correlate the presence of the thicker, rapidly growing form with enhanced enzyme secretion. The image analysis is convincing. The discovery that the addition of yeast extract and specific amino acids can stimulate the formation of the novel hyphal form is also notable. Although the conclusions are generally supported by the results, this is perhaps less so for the genetic analysis as it remains unclear how direct the role of RseA and the calcium transporters might be in supporting the formation of the thicker hyphae.

      The results presented here will impact the field. The complexity of hyphal morphology and how it affects secretion is not well understood despite the importance of these processes for the fungal lifestyle. In addition, the description of approaches that can be used to facilitate the study of these different hyphal forms (i.e., stimulation using yeast extract or specific amino acids) will benefit future efforts to understand the molecular basis of their formation.

    5. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Filamentous fungi are established workhorses in biotechnology, with Aspergillus oryzae as a prominent example with a thousand-year history. Still, the cell biology and biochemical properties of the production strains are not well understood. The paper of the Takeshita group describes the change in nuclear numbers and correlates it to different production capacities. They used microfluidic devices to really correlate the production with nuclear numbers. In addition, they used microdissection to understand expression profile changes and found an increase in ribosomes. The analysis of two genes involved in cell volume control in S. pombe did not reveal conclusive answers to explain the phenomenon. It appears that it is a multi-trait phenotype. Finally, they identified SNPs in many industrial strains and tried to correlate them to the capability of increasing their nuclear numbers.

      The methods used in the paper range from high-quality cell biology, Raman spectroscopy, to atomic force and electron microscopy, and from laser microdissection to the use of microfluidic devices to study individual hyphae.

      This is a very interesting, biotechnologically relevant paper with the application of excellent cell biology. I have only minor suggestions for improvement.

      We sincerely appreciate your fair and positive evaluation of our work. Thank you for your suggestions for improvement. We respond to each of them appropriately.

      Reviewer #2 (Public review):

      Summary:

      In the study presented by Itani and colleagues, it is shown that some strains of Aspergillus oryzae - especially those used industrially for the production of sake and soy sauce - develop hyphae with a significantly increased number of nuclei and cell volume over time. These thick hyphae are formed by branching from normal hyphae and grow faster and therefore dominate the colonies. The number of nuclei positively correlates with the thicker hyphae and also the amount of secreted enzymes. The addition of nutrients such as yeast extract or certain amino acids enhanced this effect. Genome and transcriptome analyses identified genes, including rseA, that are associated with the increased number of nuclei and enzyme production. The authors conclude from their data involvement of glycosyltransferases, calcium channels, and the tor regulatory cascade in the regulation of cell volume and number of nuclei. Thicker hyphae and an increased number of nuclei were also observed in high-production strains of other industrially used fungi such as Trichoderma reesei and Penicillium chrysogenum, leading to the hypothesis that the mentioned phenotypes are characteristic of production strains, which is of significant interest for fungal biotechnology.

      Strengths:

      The study is very comprehensive and involves the application of diverse state-of-the-art cell biological, biochemical, and genetic methods. Overall, the data are properly controlled and analyzed, figures and movies are of excellent quality.

      The results are particularly interesting with regard to the elucidation of molecular mechanisms that regulate the size of fungal hyphae and their number of nuclei. For this, the authors have discovered a very good model: (regular) strains with a low number of nuclei and strains with a high number of nuclei. Also, the results can be expected to be of interest for the further optimization of industrially relevant filamentous fungi.

      Weaknesses:

      There are only a few open questions concerning the activity of the many nuclei in production strains (active versus inactive), their number of chromosomes (haploid/diploid), and whether hyper-branching always leads to propagation of nuclei.

      We are very grateful for your recognition of our findings, the proposed model, and their significance for future applications. We are grateful for the questions, which contribute to a more accurate understanding.

      Our responses to each are provided below. Necessary experiments are in progress.

      Reviewer #3 (Public review):

      Summary:

      The authors seek to determine the underlying traits that support the exceptional capacity of Aspergillus oryzae to secrete enzymes and heterologous proteins. To do so, they leverage the availability of multiple domesticated isolates of A. oryzae along with other Aspergillus species to perform comparative imaging and genomic analysis.

      Strengths:

      The strength of this study lies in the use of multifaceted approaches to identify significant differences in hyphal morphology that correlate with enzyme secretion, which is then followed by the use of genomics to identify candidate functions that underlie these differences.

      Weaknesses:

      There are aspects of the methods that would benefit from the inclusion of more detail on how experiments were performed and data interpreted.

      Overall, the authors have achieved their aims in that they are able to clearly document the presence of two distinct hyphal forms in A. oryzae and other Aspergillus species, and to correlate the presence of the thicker, rapidly growing form with enhanced enzyme secretion. The image analysis is convincing. The discovery that the addition of yeast extract and specific amino acids can stimulate the formation of the novel hyphal form is also notable. Although the conclusions are generally supported by the results, this is perhaps less so for the genetic analysis as it remains unclear how direct the role of RseA and the calcium transporters might be in supporting the formation of the thicker hyphae.

      The results presented here will impact the field. The complexity of hyphal morphology and how it affects secretion is not well understood despite the importance of these processes for the fungal lifestyle. In addition, the description of approaches that can be used to facilitate the study of these different hyphal forms (i.e., stimulation using yeast extract or specific amino acids) will benefit future efforts to understand the molecular basis of their formation.

      We are very grateful for your fair and thoughtful evaluation of our work. We agree that the genetic analysis in the latter part is relatively weaker compared to the imaging analysis in the first half. Rather than a single mutation causing a dramatic phenotypic change, we believe that the accumulation of various mutations through breeding leads to the observed phenotype, making it difficult to clearly demonstrate causality. Since transcriptome and SNP analyses have revealed key pathways and phenotypes, it would be gratifying if these insights could contribute to future applications utilizing filamentous fungi.

    1. eLife Assessment

      The manuscript presents a valuable finding that CCDC32, beyond its reported role in AP2 assembly, follows AP2 to the plasma membrane and regulates clathrin-coated pit assembly and dynamics. The authors further suggest that the alpha-helical region of CCDC32 interacts with AP2 via the alpha appendage domain to mediate this function. While live-cell and ultrastructural imaging data are solid, future biochemical studies will be needed to confirm the proposed CCDC32-AP2 interaction.

      [Editors' note: this paper was reviewed by Review Commons.]

    2. Reviewer #1 (Public review):

      This is a revision of a manuscript previously submitted to Review Commons. The authors have partially addressed my comments, mainly by expanding the introduction and discussion sections. Sandy Schmid, a leading expert on the AP2 adaptor and CME, has been added as a co-corresponding author. The main message of the manuscript remains unchanged. Through overexpression of fluorescently tagged CCDC32, the authors propose that, in addition to its established role in AP2 assembly, CCDC32 also follows AP2 to the plasma membrane and regulates CCP maturation. The manuscript presents some interesting ideas, but there are still concerns regarding data inconsistencies and gaps in the evidence.

      (1) eGFP-CCDC32 was expressed at 5-10 times higher levels than endogenous CCDC32. This high expression can artificially drive CCDC32 to the cell surface via binding to the alpha appendage domain (AD), an interaction that may not occur under physiological conditions.

      (2) Which region of CCDC32 mediates alpha AD binding? Strangely, the only mutant tested in this work, Δ78-98, still binds AP2, but shifts to binding only mu and beta. If the authors claim that CCDC32 is recruited to mature AP2 via the alpha AD, then a mutant deficient in alpha AD binding should not bind AP2 at all. Such a mutant is critical for establishing the model proposed in this work.

      (3) The concept of hemicomplexes is introduced abruptly. What is the evidence that such hemicomplexes exist? If CCDC32 binds to hemicomplexes, this must occur in the cytosol, as only mature AP2 tetramers are recruited to the plasma membrane. The authors state that CCDC32 binds the AD of alpha but not beta, so how can the Δ78-98 mutant bind mu and beta?

      (4) The reported ability of CCDC32 to pull down AP2 beta is puzzling. Beta is not found in the CCDC32 interactome in two independent studies using 293 and HCT116 cells (BioPlex). In addition, clathrin is also absent in the interactome of CCDC32, which is difficult to reconcile with a proposed role in CCPs. Can the authors detect CCDC32 binding to clathrin?

      (5) Figure 5B appears unusual; is this a chimera? Figure 5C likely reflects a mixture of immature and mature AP2 adaptor complexes.

      (6) CCDC32 is reduced by about half in siRNA knockdown. Why not use CRISPR to completely eliminate CCDC32 expression?

    3. Reviewer #2 (Public review):

      Yang et al. describe CCDC32 as a new clathrin-mediated endocytosis (CME) accessory protein. The authors show that CCDC32 binds directly to AP2 via a small alpha helical region and cells depleted for this protein show defective CME. Finally, the authors show that the CCDC32 nonsense mutations found in patients with cardio-facial-neuro-developmental syndrome (CFNDS) disrupt the interaction of this protein to the AP2 complex. The results presented suggest that CCDC32 may act as both a chaperone (as recently published) and a structural component of the AP2 complex.

      Strengths: The conclusions presented are generally well supported by experimental data and the authors carefully point out the differences between their results and the results by Wan et al. (PNAS 2024).

      Weaknesses: The experiments regarding the role of CCDC32 in CFNDS still require some clarifications to make them clearer to scientists working on this disease. The authors fail to describe that the CCDC32 isoform they use in their studies is different from the one used when CFNDS patient mutations were described. This may create some confusion. Also, the authors did not discuss that the frame-shift mutations in patients may be leading to nonsense mediated decay.

    4. Reviewer #3 (Public review):

      In this manuscript, Yang et al. characterize the endocytic accessory protein CCDC32, which has implications in cardio-facio-neuro-developmental syndrome (CFNDS). The authors clearly demonstrate that the protein CCDC32 has a role in the early stages of endocytosis, mainly through the interaction with the major endocytic adaptor protein AP2, and they identify regions taking part in this recognition. Through live cell fluorescence imaging and electron microscopy of endocytic pits, the authors characterize the lifetimes of endocytic sites, the formation rate of endocytic sites and pits and the invagination depth, in addition to transferrin receptor (TfnR) uptake experiments. Binding between CCDC32 and CCDC32 mutants to the AP2 alpha appendage domain is assessed by pull down experiments. While interaction between CCDC32 and the alpha appendage domain of AP2 is clearly described, a discussion of potential association with other AP2 domains would be beneficial to understand the impact of CCDC32 in endocytosis.

      Together, these experiments allow deriving a phenotype of CCDC32 knock-down and CCDC32 mutants within endocytosis, which is a very robust system, in which defects are not so easily detected. A mutation of CCDC32, mimicking CFNDS mutations, is also addressed in this study and shown to have endocytic defects.

      In summary, the authors present a strong combination of techniques, assessing the impact of CCDC32 in clathrin mediated endocytosis and its binding to AP2.

    5. Author response:

      (1) General Statements

      As you will see in our attached rebuttal to the reviewers, we have added several new experiments and revised the manuscript to fully address their concerns.

      (2) Point-by-point description of the revisions

      Reviewer #1:

      Evidence, reproducibility and clarity

      Summary:

      The manuscript by Yang et al. describes a new CME accessory protein. CCDC32 has been previously suggested to interact with AP2 and in the present work the authors confirm this interaction and show that it is a bona fide CME regulator. In agreement with its interaction with AP2, CCDC32 recruitment to CCPs mirrors the accumulation of clathrin. Knockdown of CCDC32 reduces the amount of productive CCPs, suggestive of a stabilisation role in early clathrin assemblies. Immunoprecipitation experiments mapped the interaction of CCDC32 to the α-appendage of the AP2 complex α-subunit. Finally, the authors show that the CCDC32 nonsense mutations found in patients with cardio-facial-neuro-developmental syndrome disrupt the interaction of this protein to the AP2 complex. The manuscript is well written and the conclusions regarding the role of CCDC32 in CME are supported by good quality data. As detailed below, a few improvements/clarifications are needed to reinforce some of the conclusions, especially the ones regarding CFNDS.

      We thank the referee for their positive comments. In light of a recently published paper describing CCDC32 as a co-chaperone required for AP2 assembly (Wan et al., PNAS, 2024, see reviewer 2), we have added several additional experiments to address all concerns and consequently gained further insight into CCDC32-AP2 interactions and the important dual role of CCDC32 in regulating CME. 

      Major comments:

      (1) Why could the protein be visualized at CCPs only after knockdown of the endogenous protein? This is highly unusual, especially in stable cell lines. Could it be that the tag interferes with the function of the expressed protein, rendering it incapable of outcompeting the endogenous protein? Does this point to a regulated recruitment?

      The reviewer is correct, this would be unusual; however, it is not the case. We misspoke in the text (although the figure legend was correct): these experiments were performed without siRNA knockdown, and we can indeed detect eGFP-CCDC32 being recruited to CCPs in the presence of endogenous protein. Nonetheless, we repeated the experiment to be certain (see Author response image 1).

      Author response image 1.

      Cohort-averaged fluorescence intensity traces of CCPs (marked with mRuby-CLCa) and CCP-enriched eGFP-CCDC32(FL).

      (2) The disease mutation used in the paper does not correspond to the truncation found in patients. The authors use a 1-54 truncation, but the patients described in Harel et al. have frameshifts at positions 19 (Thr19Tyrfs*12) and 64 (Glu64Glyfs*12), while the patient described in Abdalla et al. has the deletion of two introns, leading to a frameshift around amino acid 90. Moreover, to precisely test the function of these disease mutations, one would need to add the extra amino acids generated by the frameshift. For example, as denoted in the mutation description in Harel et al., the frameshift at position 19 changes Threonine 19 to a Tyrosine and adds a run of 12 extra amino acids (Thr19Tyrfs*12).

      The labels of the disease mutants p.(Thr19Tyrfs*12) and p.(Glu64Glyfs*12) are based on a 194 aa polypeptide version of CCDC32 initiated at a nonconventional start site that contains a 9 aa peptide (VRGSCLRFQ) upstream of the N-terminus we show. Thus, we are indeed using the appropriate mutation site (see: https://www.uniprot.org/uniprotkb/Q9BV29/entry). The reviewer is correct that we have not included the extra 12 aa in our construct; however, as these residues are not present in the other CFNDS mutants, we think it unlikely that they contribute to the disease phenotype. Rather, as neither of the clinically observed mutations contains the 78-98 aa sequence required for AP2 binding and CME function, we are confident that this defect contributed to the disease. Thus, we are including the data on the CCDC32(1-54) mutant, as we believe these results provide a valuable physiological context to our studies.

      (3) The frameshift caused by the CFNDS mutations (especially the one studied) will likely lead to nonsense mediated RNA decay (NMD). The frameshift is well within the rules where NMD generally kicks in. Therefore, I am unsure about the functional insights of expressing a disease-related protein which is likely not present in patients.

      We thank the reviewer for bringing up this concern. However, as shown in new Figure S1, the mutant protein is expressed at comparable levels as the WT, suggesting that NMD is not occurring.

      (4) Coiled coils generally form stable dimers. The typically hydrophobic core of these structures is not suitable for transient interactions. This complicates the interpretation of the results regarding the role of this region as the place where the interaction to AP2 occurs. If the coiled coil holds a stable CCDC32 dimer, disrupting this dimer could reduce the affinity to AP2 (by reduced avidity) to the actual binding site. A construct with an orthogonal dimeriser or a pulldown of the delta78-98 protein with the GST AP2a-AD could be a good way to sort out this issue.

      We were unable to model a stable dimer (or other oligomer) of this protein with high confidence using AlphaFold 3.0. Moreover, we were unable to detect endogenous CCDC32 co-immunoprecipitating with eGFP-CCDC32 (Fig. S6C). Thus, we believe that the moniker, based solely on the alpha-helical content of the protein, is a misnomer. We have explained this in the main text.

      Minor comments:

      (1) The authors interchangeably use the term "flat CCPs" and "flat clathrin lattices". While these are indeed related, flat clathrin lattices have been also used to refer to "clathrin plaques". To avoid confusion, I suggest sticking to the term "flat CCPs" to refer to the CCPs which are in their early stages of maturation.

      Agreed. Thank you for the suggestion. We have renamed these structures flat clathrin assemblies, as they do not acquire the curvature needed to classify them as pits, and do not grow to the size that would classify them as plaques.

      Significance

      General assessment:

      CME drives the internalisation of hundreds of receptors and surface proteins in practically all tissues, making it an essential process for various physiological processes. This versatility comes at the cost of a large number of molecular players and regulators. To understand this complexity, unravelling all the components of this process is vital. The manuscript by Yang et al. gives an important contribution to this effort as it describes a new CME regulator, CCDC32, which acts directly at the main CME adaptor AP2. The link to disease is interesting, but the authors need to refine their experiments. The requirement for endogenous knockdown for recruitment of the tagged CCDC32 is unusual and requires further exploration.

      Advance:

      The increased frequency of abortive events presented by CCDC32 knockdown cells is very interesting, as it hints to an active mechanism that regulates the stabilisation and growth of clathrin coated pits. The exact way clathrin coated pits are stabilised is still an open question in the field.

      Audience:

      This is a basic research manuscript. However, given the essential role of CME in physiology and the growing number of CME players involved in disease, this manuscript can reach broader audiences.

      We thank the referee for recognizing the ‘interesting’ advances our studies have made and for considering these studies as ‘an important contribution’ to ‘an essential process for various physiological processes’ and able ‘to reach broader audiences’. We have addressed and reconciled the reviewer’s concerns in our revised manuscript. 

      Field of expertise of the reviewer:

      Clathrin mediated endocytosis, cell biology, microscopy, biochemistry.

      Reviewer #2:

      Evidence, reproducibility and clarity

      In this manuscript, the authors demonstrate that CCDC32 regulates clathrin-mediated endocytosis (CME). Some of the findings are consistent with a recent report by Wan et al. (2024 PNAS), such as the observation that CCDC32 depletion reduces transferrin uptake and diminishes the formation of clathrin-coated pits. The primary function of CCDC32 is to regulate AP2 assembly, and its depletion leads to AP2 degradation. However, this study did not examine AP2 expression levels. CCDC32 may bind to the appendage domain of AP2 alpha, but it also binds to the core domain of AP2 alpha.

      We thank the reviewer for drawing our attention to the Wan et al. paper, which appeared while this work was under review. However, our in vivo data are not fully consistent with the report from Wan et al. The discrepancies reveal a dual function of CCDC32 in CME that was masked by complete knockout vs siRNA knockdown of the protein, and also likely affected by the position of the GFP-tag (C- vs N-terminal) on this small protein. Thus:

      -  Contrary to Wan et al., we do not detect any loss of AP2 expression (see new Figure S3A-B) upon siRNA knockdown. Most likely the ~40% residual CCDC32 present after siRNA knockdown is sufficient to fulfill its catalytic chaperone function but not its structural role in regulating CME beyond the AP2 assembly step.  

      - Contrary to Wan et al., we have shown that CCDC32 indeed interacts with the intact AP2 complex (Figure S3C and 6B,C): all 4 subunits of the AP2 complex co-IP with full-length eGFP-CCDC32. Interestingly, whereas full-length CCDC32 pulls down the intact AP2 complex, the ∆78-98 mutant retains its ability to pull down the β2-µ2 hemicomplex, but its interactions with α:σ2 are severely reduced. While this result is consistent with the report of Wan et al. that CCDC32 binds to the α:σ2 hemicomplex, it also suggests that the interactions between CCDC32 and AP2 are more complex and will require further studies.

      - Contrary to Wan et al., we provide strong evidence that CCDC32 is recruited to CCPs. Interestingly, modeling with AlphaFold 3.0 identifies a highly probable interaction between alpha helices encoded by residues 66-91 on CCDC32 and residues 418-438 on α. The latter are masked by µ2-C in the closed conformation of the AP2 core, but exposed in the open conformation triggered by cargo binding, suggesting that CCDC32 might only bind to membrane-bound AP2.

      Thus, our findings are indeed novel and indicate striking multifunctional roles for CCDC32 in CME, making the protein well worth further study. 

      (1) Besides its role in AP2 assembly, CCDC32 may potentially have another function on the membrane. However, there is no direct evidence showing that CCDC32 associates with the plasma membrane.

      We disagree; our data clearly show that CCDC32 is recruited to CCPs (Fig. 1B) and that CCPs that fail to recruit CCDC32 are short-lived and likely abortive (Fig. 1C). Wan et al. did not observe any colocalization of C-terminally tagged CCDC32 to CCPs, whereas we detect recruitment of our N-terminally tagged construct, which we also show is functional (Fig. 6F). Further, we have demonstrated the importance of the C-terminal region of CCDC32 in membrane association (see new Fig. S7). Thus, we speculate that a C-terminally tagged CCDC32 might not be fully functional. Indeed, SIM images of the C-terminally tagged CCDC32 in Wan et al. show large (~100 nm) structures in the cytosol, which may reflect aggregation.

      (2) CCDC32 binds to multiple regions on AP2, including the core domain. It is important to distinguish the functional roles of these different binding sites.

      We have localized the AP2-ear binding region to residues 78-99 and shown these to be critical for the functions we have identified. As described above, we now include data that are complementary to those of Wan et al. However, our data also clearly point to additional binding modalities. We agree that it will be important to map these additional interactions and identify their functional roles, but this is beyond the scope of this paper.

      (3) AP2 expression levels should be examined in CCDC32 depleted cells. If AP2 is gone, it is not surprising that clathrin-coated pits are defective.

      Agreed and we have confirmed this by western blotting (Figure S3A-B) and detect no reduction in levels of any of the AP2 subunits in CCDC32 siRNA knockdown cells. As stated above this could be due to residual CCDC32 present in the siRNA KD vs the CRISPR-mediated gene KO.

      (4) If the authors aim to establish a secondary function for CCDC32, they need to thoroughly discuss the known chaperone function of CCDC32 and consider whether and how CCDC32 regulates a downstream step in CME.

      Agreed. We have described the Wan et al paper, which came out while our manuscript was in review, in our Introduction.  As described above, there are areas of agreement and of discrepancies, which are thoroughly documented and discussed throughout the revised manuscript.  

      (5) The quality of Figure 1A is very low, making it difficult to assess the localization and quantify the data.

      The low signal:noise in Fig. 1A that the reviewer is concerned about is due to a diffuse distribution of CCDC32 on the inner surface of the plasma membrane. We now describe this binding more explicitly; we believe it reflects a specific interaction mediated by the C-terminus of CCDC32. Thus, the degree of diffuse membrane binding we observe follows: eGFP-CCDC32(FL) > eGFP-CCDC32(∆78-98) > eGFP-CCDC32(1-54) ~ eGFP/background (see new Fig. S7). Importantly, the colocalization of CCDC32 at CCPs is confirmed by the dynamic imaging of CCPs (Fig. 1B).

      (6) In Figure 6, why aren't AP2 mu and sigma subunits shown?

      Agreed. Not being aware of CCDC32’s possible dual role as a chaperone, we had assumed that the AP2 complex was intact.  We have now added this data in Figure 6 B,C and Fig. S3C, as discussed above. 

      Page 5, top, this sentence is confusing: "their surface area (~17 x 10 nm²) remains significantly less than that required for the average 100 nm diameter CCV (~3.2 x 103 nm²)."

      Thank you for the criticism. We have clarified the sentence and corrected a typo, which would definitely be confusing. The section now reads, “While the flat CCSs we detected in CCDC32 knockdown cells were significantly larger than in control cells (Fig. 4D, mean diameter of 147 nm vs. 127 nm, respectively), they are much smaller than typical long-lived flat clathrin lattices (d≥300 nm) (Grove et al., 2014). Indeed, the surface area of the flat CCSs that accumulate in CCDC32 KD cells (mean ~1.69 x 10⁴ nm²) remains significantly less than the surface area of an average 100 nm diameter CCV (~3.14 x 10⁴ nm²). Thus, we refer to these structures as ‘flat clathrin assemblies’ because they are neither curved ‘pits’ nor large ‘lattices’. Rather, the flat clathrin assemblies represent early, likely defective, intermediates in CCP formation.”
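
      The two corrected areas can be checked with elementary geometry. The sketch below is illustrative only: it assumes that a flat CCS can be approximated by a disc with the mean diameter quoted above (147 nm) and that a CCV is a sphere of 100 nm diameter. The function names are hypothetical and are not taken from the authors' analysis.

```python
import math


def disc_area(diameter_nm: float) -> float:
    """Area of a flat disc in nm^2 (stand-in for a flat CCS)."""
    return math.pi * (diameter_nm / 2) ** 2


def sphere_surface_area(diameter_nm: float) -> float:
    """Surface area of a sphere in nm^2 (stand-in for a CCV)."""
    return 4 * math.pi * (diameter_nm / 2) ** 2


# Assumed inputs: mean flat-CCS diameter in KD cells and average CCV diameter.
flat_ccs_area = disc_area(147)
ccv_area = sphere_surface_area(100)

print(f"flat CCS (disc approximation): {flat_ccs_area:.3g} nm^2")
print(f"CCV (sphere): {ccv_area:.3g} nm^2")
print(f"flat CCS / CCV area ratio: {flat_ccs_area / ccv_area:.2f}")
```

      Under these assumptions the disc comes out at roughly 1.70 x 10⁴ nm², close to the reported mean of ~1.69 x 10⁴ nm², and the sphere at roughly 3.14 x 10⁴ nm², so an average flat assembly holds about half of the surface needed for a vesicle.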

      Significance

      Overall, while this work presents some interesting ideas, it remains unclear whether CCDC32 regulates AP2 beyond the assembly step.

      Our responses above argue that we have indeed established that CCDC32 regulates AP2 beyond the assembly step. We have also identified several discrepancies between our findings and those reported by Wan et al., most notably binding between CCDC32 and mature AP2 complexes and the AP2-dependent recruitment of CCDC32 to CCPs.  It is possible that these discrepancies may be due to the position of the GFP tag (ours is N-terminal, theirs is C-terminal; we show that the N-terminal tagged CCDC32 rescues the knockdown phenotype, while Wan et al., do not provide evidence for functionality of the C-terminal construct). 

      Reviewer #3: 

      Evidence, reproducibility and clarity (Required): 

      In this manuscript, Yang et al. characterize the endocytic accessory protein CCDC32, which has implications in cardio-facio-neuro-developmental syndrome (CFNDS). The authors clearly demonstrate that the protein CCDC32 has a role in the early stages of endocytosis, mainly through the interaction with the major endocytic adaptor protein AP2, and they identify regions taking part in this recognition. Through live cell fluorescence imaging and electron microscopy of endocytic pits, the authors characterize the lifetimes of endocytic sites, the formation rate of endocytic sites and pits and the invagination depth, in addition to transferrin receptor (TfnR) uptake experiments. Binding between CCDC32 and CCDC32 mutants to the AP2 alpha appendage domain is assessed by pull down experiments. Together, these experiments allow deriving a phenotype of CCDC32 knock-down and CCDC32 mutants within endocytosis, which is a very robust system, in which defects are not so easily detected. A mutation of CCDC32, known to play a role in CFNDS, is also addressed in this study and shown to have endocytic defects.

      We thank the reviewer for their positive remarks regarding the quality of our data and the strength of our conclusions.  

      In summary, the authors present a strong combination of techniques, assessing the impact of CCDC32 in clathrin mediated endocytosis and its binding to AP2, whereby the following major and minor points remain to be addressed: 

      - The authors show that CCDC32 depletion leads to the formation of brighter and static clathrin coated structures (Figure 2), but that these accounted for only 7.8% and masked the 'normal' dynamic CCPs. At the same time, the authors show that the absence of CCDC32 induces pits with shorter lifetimes (Figure 1 and Figure 2), the 'majority' of the pits.

      Clarification is needed as to how the authors arrive at these conclusions and these numbers. The authors should also provide (and visualize) the corresponding statistics. The same statement is made again later on in the manuscript, where the authors explain their electron microscopy data. Was the number derived from there? 

      These points are critical to understanding CCDC32's role in endocytosis and are key to understanding the model presented in Figure 8. The numbers of how many pits accumulate in flat lattices versus normal endocytosis progression and the actual time scales could be included in this model and would make the figure much stronger. 

      Thank you for these comments.  We understand the paradox between the visual impression and the reality of our dynamic measurements. We have been visually misled by this in previous work (Chen et al., 2020), which emphasizes the importance of unbiased image analysis afforded to us through the well-documented cmeAnalysis pipeline, developed by us (Aguet et al., 2013) and now used by many others (e.g. (He et al., 2020)). 

      The % of static structures was not derived from electron microscopy data, but quantified using cmeAnalysis, which automatically provides the lifetime distribution of CCPs. We have now clarified this in the manuscript and added a histogram (Fig. S4) quantifying the fraction of CCPs in lifetime cohorts <20s, 21-60s, 61-100s, 101-150s and >150s (static).
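
      The cohort histogram described above amounts to binning track lifetimes and reporting the fraction in each bin. The following is a minimal sketch of that bookkeeping, not the cmeAnalysis implementation: the lifetimes are hypothetical, the function names are invented for illustration, and placing a track of exactly 20 s in the shortest cohort is an assumption, since the cohort labels leave that boundary open.

```python
from collections import Counter

# Upper bounds (seconds) of each lifetime cohort. Tracks longer than the
# last bound are counted as static. Using "<=" at each bound is an assumption.
COHORTS = [(20, "<=20s"), (60, "21-60s"), (100, "61-100s"), (150, "101-150s")]
STATIC_LABEL = ">150s (static)"


def cohort_label(lifetime_s: float) -> str:
    """Return the cohort label for a single track lifetime."""
    for upper_bound, label in COHORTS:
        if lifetime_s <= upper_bound:
            return label
    return STATIC_LABEL


def cohort_fractions(lifetimes_s):
    """Fraction of tracks in each cohort, in cohort order."""
    if not lifetimes_s:
        raise ValueError("at least one lifetime is required")
    counts = Counter(cohort_label(t) for t in lifetimes_s)
    labels = [label for _, label in COHORTS] + [STATIC_LABEL]
    return {label: counts[label] / len(lifetimes_s) for label in labels}


# Hypothetical lifetimes in seconds, for illustration only.
example_lifetimes = [8, 12, 15, 18, 25, 40, 55, 70, 120, 200]
fractions = cohort_fractions(example_lifetimes)
for label, fraction in fractions.items():
    print(f"{label:>15}: {fraction:.0%}")
```

      Reporting fractions rather than raw counts is what allows a small static population to be compared across conditions even when the total number of detected tracks differs between movies.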

      - In relation to the above point, the statistics of Figure 2E-G and the analysis leading there should also be explained in more detail: For example, what are the individual points in the plot (also in Figures 6G and 7G)? The authors should also use a few phrases to explain software they use, for example DASC, in the main text. 

      Each point in these bar graphs represents a movie, where n≥12. These details have been added to the respective figure legend. We have also added a brief description of DASC analysis in the text. 

      -  There are several questions related to the knock-down experiments that need to be addressed:

      Firstly, knock-down of CCDC32 does not seem to be very strong (Figure S2B). Can the level of knock-down be quantified? 

      We have now quantified the KD efficiency. It is ~60%. This turns out to be fortuitous (see responses to reviewer 2), as a recent publication, which came out after we completed our study, has shown by CRISPR-mediated knockout that CCDC32 also plays an essential chaperone function required for AP2 assembly. We do not see any reduction in AP2 levels or its complex formation under our conditions (see new Supplemental Figure S3), which suggests that the effects of CCDC32 on CCP dynamics are more sensitive to CCDC32 concentration than its role as a chaperone. Our phenotypes would have been masked by more efficient depletion of CCDC32.

      On page 6 it is indicated that the eGFP-CCDC32(1-54) and eGFP-CCDC32(∆78-98) constructs are siRNA-resistant. However, in Fig. S2B, these proteins do not show any signal in the western blot, so it is not clear if they are expressed or simply not detected by the antibody. The presence of these proteins after silencing endogenous CCDC32 needs to be confirmed to support Figures 6 and 7, which critically rely on the presence of the CCDC32 mutants. 

      Unfortunately, the C-terminally truncated CCDC32 proteins are not detected because they lack the antibody epitope; indeed, even the ∆78-98 deletion is poorly detected (compare the GFP blot in new S1A with the anti-CCDC32 blot in S1B). However, these constructs contain the same siRNA-resistance mutation as the full-length protein. That they are expressed and siRNA-resistant can be seen in Fig. S2A (now Fig. S1A) blotting for GFP.

      In Figures 6 and 7, siRNA knock-down of CCDC32 is only indicated for sub-figures F to G. Is this really the case? If not, the authors should clarify. The siRNA knock-down in Figure 1 is also only mentioned in the text, not in the figure legend. The authors should pay attention to make their figure legends easy to understand and unambiguous. 

      No, it is not the case.  Thank you for pointing out the uncertainty. We have added these details to the Figure legends and checked all Figure legends to ensure that they clearly describe the data shown.  

      - It is not exactly clear how the curves in Figure 3C (lower panel) on the invagination depth were obtained. Can the authors clarify this a bit more? For example, what are kT and kE in Figure 3A? What is I0? And how did the authors derive the logarithmic function used to quantify the invagination depth? In the main text, the authors say that the traces were 'logarithmically transformed'. This is not a technical term. The authors should refer to the actual equation used in the figure. 

      This analysis was developed by the Kirchhausen lab (Saffarian and Kirchhausen, 2008). We have added these details and reference them in the Figure legend and in the text. We also now use the more accurate descriptor ‘log-transformed’.

      - In the discussion, the claim 'The resulting dysregulation of AP2 inhibits CME, which further results in the development of CFNDS.' is maybe a bit too strong of a statement. Firstly, because the authors show themselves that CME is perturbed, but by no means inhibited. Secondly, the molecular link to CFNDS remains unclear. Even though CCDC32 mutants seem to be responsible for CFNDS and one of the mutants has been shown in this study to have a defect in endocytosis and AP2 binding, a direct link between CCDC32's function in endocytosis and CFNDS remains elusive. The authors should thus provide a more balanced discussion on this topic. 

      We have modified and softened our conclusions, which now read that the phenotypes we see likely “contribute to” rather than “cause” the disease.

      - In Figure S1, the authors annotate the presence of a coiled-coil domain, which they also use later on in the manuscript to generate mutations. Could the authors specify (and cite) where and how this coiled-coil domain has been identified? Is this predicted helix indeed a coiled-coil domain, or just a helix, as indicated by the authors in the discussion?

      See response to Reviewer 1, point 4. We have changed this wording to alpha-helix. The ‘coiled-coil’ reference is historical and unlikely a true reflection of CCDC32 structure. AlphaFold 3.0 predictions were unable to identify with certainty any coiled-coil structures, even if we modelled potential dimers or trimers; and we find no evidence of dimerization of CCDC32 in vivo. We have clarified this in the text.

      Minor comments

      - In general, a more detailed explanation of the microscopy techniques used and the information they report would be beneficial to provide access to the article also to non-expert readers in the field. This concerns particularly the analysis methods used, for example: 

      How were the cohort-averaged fluorescence intensity and lifetime traces obtained? 

      How do the tools cmeAnalysis and DASC work? A brief explanation would be helpful. 

      We have expanded Methods to add these details, and also described them in the main text. 

      - The axis label of Figure 2B is not quite clear. What does 'TfnR uptake % of surface bound' mean? Maybe the authors could explain this in more detail in the figure legend? Is the drop in uptake efficiency also accessible by visual inspection of the images? It would be interesting to see that. 

      This is a standard measure of CME efficiency. 'TfnR uptake % of surface bound' = Internalized TfnR/Surface bound TfnR. Again, images may be misleading as defects in CME lead to increased levels of TfnR on the cell surface, which in turn would result in more Tfn uptake even if the rate of CME is decreased.
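
      The ratio defined above, and the reason raw images can mislead, can be illustrated with a small numerical sketch. The intensities below are hypothetical values in arbitrary units and the function name is invented for illustration; only the ratio itself (internalized TfnR divided by surface-bound TfnR) comes from the response.

```python
def uptake_percent(internalized: float, surface_bound: float) -> float:
    """Internalized TfnR expressed as a percentage of surface-bound TfnR."""
    if surface_bound <= 0:
        raise ValueError("surface_bound must be positive")
    return 100.0 * internalized / surface_bound


# Hypothetical signals (arbitrary units): the knockdown cells carry more
# receptor on the surface but internalize the same absolute amount.
control = {"internalized": 30.0, "surface_bound": 100.0}
knockdown = {"internalized": 30.0, "surface_bound": 150.0}

control_pct = uptake_percent(**control)
knockdown_pct = uptake_percent(**knockdown)
print(f"control: {control_pct:.0f}% of surface bound")
print(f"knockdown: {knockdown_pct:.0f}% of surface bound")
```

      In this made-up case the internalized signal looks identical in both conditions, yet the normalized efficiency is lower in the knockdown, which is the situation the response warns about.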

      - Figure 4: How is the occupancy of CCPs in the plasma membrane measured? What are the criteria used to divide CCSs into Flat, Dome or Sphere categories? 

      We have expanded Methods to add these details. Based on the degree of invagination, the shapes of CCSs were classified as either: flat CCSs with no obvious invagination; dome-shaped CCSs that had a hemispherical or less invaginated shape with visible edges of the clathrin lattice; and spherical CCSs that had a round shape with the edges of the clathrin lattice not visible in 2D projection images. In most cases, the shapes were obvious in 2D PREM images. In uncertain cases, the degree of CCS invagination was determined using images tilted at ±10–20 degrees. The areas of CCSs were measured using ImageJ and used for the calculation of the CCS occupancy on the plasma membrane.
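
      One plausible reading of the occupancy measure is the summed CCS area divided by the imaged membrane area. The sketch below assumes that definition, which the response does not state explicitly; the areas, the field size, and the function name are hypothetical.

```python
def ccs_occupancy_percent(ccs_areas_nm2, membrane_area_nm2: float) -> float:
    """Summed CCS area as a percentage of the imaged membrane area."""
    if membrane_area_nm2 <= 0:
        raise ValueError("membrane_area_nm2 must be positive")
    return 100.0 * sum(ccs_areas_nm2) / membrane_area_nm2


# Hypothetical example: three CCS outlines measured in a 2 um x 2 um field.
ccs_areas_nm2 = [1.2e4, 1.7e4, 0.9e4]
field_area_nm2 = 2000.0 * 2000.0

occupancy = ccs_occupancy_percent(ccs_areas_nm2, field_area_nm2)
print(f"CCS occupancy: {occupancy:.2f}% of membrane area")
```

      Normalizing to membrane area makes fields of different size comparable, but the result still depends on how each CCS outline is traced, so the tracing criteria should accompany any reported occupancy.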

      - Figure 5B: Can the authors explain, where exactly the GFP was engineered into AP2 alpha? This construct does not seem to be explained in the methods section. 

      We have added this information. The construct, which corresponds to an insertion of GFP into the flexible hinge region of AP2, at aa649, was first described by (Mino et al., 2020) and shown to be fully functional.  This information has been added to the Methods section.

      - Figure S1B: The authors should indicate the colour code used for the structural model.

      We have expanded our structural modeling using AlphaFold 3.0 in light of the recent publication suggesting that CCDC32 interacts with the µ2 subunit and does not bind full-length AP2. These results are described in the text. The color coding now reflects certainty values given by AlphaFold 3.0 (Fig. S6B, D). 

      - The list of primers referred to in the materials and methods section does not exist. There is a Table S1, but this contains different data. The actual Table S1 is not referenced in the main text. This should be done. 

      We apologize for this error. We have now added this information in Table S2.

      Significance (Required):

      In this study, the authors analyse a so-far poorly understood endocytic accessory protein, CCDC32, and its implication for endocytosis. The experimental tool set, which allows CCP dynamics and invagination to be quantified, is clearly a strength of the article: it makes it possible to assess the impact of an accessory protein on the endocytic uptake mechanism, which is normally very robust to mutations. Only through this detailed analysis of endocytosis progression could the authors detect clear differences in the presence and absence of CCDC32 and its mutants. If the above points are successfully addressed, the study will provide very interesting and highly relevant work allowing a better understanding of the early phases in CME with implications for disease.

      The study is thus of potential interest to an audience interested in CME, in disease and its molecular reasons, as well as for readers interested in intrinsically disordered proteins to a certain extent, claiming thus a relatively broad audience. The presented results may initiate further studies of the so-far poorly understood and less well known accessory protein CCDC32.

      We thank the reviewer for their positive comments on the significance of our findings and the importance of our detailed phenotypic analysis made possible by quantitative live cell microscopy. We also believe that our new structural modeling of CCDC32 and our findings of complex and extensive interactions with AP2 make the reviewer’s point regarding intrinsically disordered proteins even more interesting and relevant to a broad audience. We trust that our revisions indeed address the reviewer’s concerns.

      The field of expertise of the reviewer is structural biology, biochemistry and clathrin mediated endocytosis. Expertise in cell biology is rather superficial.

      References:

      Aguet, F., Costin N. Antonescu, M. Mettlen, Sandra L. Schmid, and G. Danuser. 2013. Advances in Analysis of Low Signal-to-Noise Images Link Dynamin and AP2 to the Functions of an Endocytic Checkpoint. Developmental Cell. 26:279-291.

      Chen, Z., R.E. Mino, M. Mettlen, P. Michaely, M. Bhave, D.K. Reed, and S.L. Schmid. 2020. Wbox2: A clathrin terminal domain–derived peptide inhibitor of clathrin-mediated endocytosis. Journal of Cell Biology. 219.

      Grove, J., D.J. Metcalf, A.E. Knight, S.T. Wavre-Shapton, T. Sun, E.D. Protonotarios, L.D. Griffin, J. Lippincott-Schwartz, and M. Marsh. 2014. Flat clathrin lattices: stable features of the plasma membrane. Mol Biol Cell. 25:3581-3594.

      He, K., E. Song, S. Upadhyayula, S. Dang, R. Gaudin, W. Skillern, K. Bu, B.R. Capraro, I. Rapoport, I. Kusters, M. Ma, and T. Kirchhausen. 2020. Dynamics of Auxilin 1 and GAK in clathrin-mediated traffic. J Cell Biol. 219.

      Mino, R.E., Z. Chen, M. Mettlen, and S.L. Schmid. 2020. An internally eGFP-tagged α-adaptin is a fully functional and improved fiduciary marker for clathrin-coated pit dynamics. Traffic. 21:603-616.

      Saffarian, S., and T. Kirchhausen. 2008. Differential evanescence nanometry: live-cell fluorescence measurements with 10-nm axial resolution on the plasma membrane. Biophys J. 94:2333-2342.

    1. Author Response:

      We sincerely thank the reviewers and the editorial team for their thoughtful and constructive evaluation of our manuscript. We are very pleased that both reviewers and the Reviewing Editor found the work to be compelling and of interest to the community studying membrane-associated condensates. Below we outline our planned revisions in response to the public reviews.

      Reviewer #1

      We appreciate Reviewer #1’s positive evaluation of the study’s significance and the utility of our theoretical framework.

      1. Understandably, the authors used one system to test their theory (ZO-1). However, to establish a theoretical framework, this is sufficient.

      Response: We acknowledge this limitation. While we agree that additional systems would strengthen the generality of our theory, we note that the focus of this work is to introduce and validate a theoretical framework. As the reviewer notes, this is sufficient for establishing the framework. Nonetheless, we are open to further collaborations or future studies to test the model with other systems.

      Reviewer #2

      We are grateful for Reviewer #2’s detailed comments and will address each of the points as follows:

      1. In the theoretical section, what has previously been known, compared to which equations are new, should be made more clear.

      Response: We will revise the theory section to clearly distinguish previously established formulations from novel contributions.

      2. Some assumptions in the model are made purely for convenience and without sufficient accompanying physical justification. E.g., the authors should justify, on physical grounds, why binding rate effects are/could be larger than the other fluxes.

      Response: We will expand the discussion to provide key physical justification, especially to explain why binding rate effects are/could be larger than the other fluxes.

      3. I feel that further mechanistic explanation as to why bulk phase separation widens the regime of surface phase separation is warranted.

      Response: We will elaborate on the mechanism underlying this coupling.

      4. The major advantage of the non-dilute theory as compared with a best parameterized dilute (or homogenous) theory requires further clarification/evidence with respect to capturing the experimental data.

      Response: We will clarify this comparison more explicitly and highlight how the non-dilute model captures key nonlinear behaviors and concentration-dependent adsorption phenomena that the dilute model fails to reproduce.

      5. Discrete (particle-based) molecular modelling could help to delineate the quantitative improvements that the non-dilute theory has over the previous state-of-the-art. Also, this could help test theoretical statements regarding the roles of bulk-phase separation, which were not explored experimentally.

      Response: We appreciate the suggestion and agree that such modeling would be valuable. However, this is beyond the scope of the current study. We will add a discussion on how discrete simulations could be used to further test our theory in future work.

      6. Discussion of the caveats and limitations of the theory and modelling is missing from the text.

      Response: We will add a paragraph outlining caveats and limitations of the modelling.

      We believe these changes will significantly improve the clarity and impact of our manuscript, and we thank the reviewers again for their valuable input.

    2. eLife Assessment

      This important study presents a compelling theoretical framework for understanding phase separation of membrane-bound proteins, with a focus on the organization of tight junction components. By incorporating non-dilute binding effects into thermodynamic models and validating the model's predictions with in vitro experiments on the tight junction protein ZO-1, the authors provide a quantitative tool that will be of interest for biologists interested in membrane-associated condensates. While further clarification of model assumptions and broader mechanistic context would strengthen the work even further, the combination of theory and experiment here is robust and a key advancement in the field.

    3. Reviewer #1 (Public review):

      Summary:

      Biomolecular condensates are an essential part of cellular homeostatic regulation. In this manuscript, the authors develop a theoretical framework for the phase separation of membrane-bound proteins. They show the effect of non-dilute surface binding and phase separation on tight junction protein organization.

      Strengths:

      It is an important study, considering that the phase separation of membrane-bound molecules is taking the center stage of signaling, spanning from immune signaling to cell-cell adhesion. A theoretical framework will help biologists to quantitatively interpret their findings.

      Weaknesses:

      Understandably, the authors used one system to test their theory (ZO-1). However, to establish a theoretical framework, this is sufficient.

    4. Reviewer #2 (Public review):

      Summary:

      The authors present a clear expansion of biophysical (thermodynamic) theory regarding the binding of proteins to membrane-bound receptors, accounting for higher local concentration effects of the protein. To partially test the expanded theory, the authors perform in vitro experiments on the binding of ZO1 proteins to Claudin2 C-terminal receptors anchored to a supported lipid bilayer, and capture the effects that surface phase separation of ZO1 has on its adsorption to the membrane.

      Strengths:

      (1) The derived theoretical framework is consistent and largely well-explained.

      (2) The experimental and numerical methodologies are transparent.

      (3) The best-parameterized non-dilute theory is in reasonable agreement with experiments.

      Weaknesses:

      (1) In the theoretical section, what has previously been known, compared to which equations are new, should be made more clear.

      (2) Some assumptions in the model are made purely for convenience and without sufficient accompanying physical justification. E.g., the authors should justify, on physical grounds, why binding rate effects are/could be larger than the other fluxes.

      (3) I feel that further mechanistic explanation as to why bulk phase separation widens the regime of surface phase separation is warranted.

      (4) The major advantage of the non-dilute theory as compared with a best parameterized dilute (or homogenous) theory requires further clarification/evidence with respect to capturing the experimental data.

      (5) Discrete (particle-based) molecular modelling could help to delineate the quantitative improvements that the non-dilute theory has over the previous state-of-the-art. Also, this could help test theoretical statements regarding the roles of bulk-phase separation, which were not explored experimentally.

      (6) Discussion of the caveats and limitations of the theory and modelling is missing from the text.

    1. Author response:

      We thank the reviewers for their thoughtful and constructive feedback. As the reviewers noted, dissecting the contributions of Gtr1/2 and Pib2 to TORC1 signaling across diverse nutrient states is a technically and conceptually challenging problem. Indeed, many of the issues raised—including the interpretation of non-canonical TORC1 readouts (e.g., Rps6, Par32), the influence of strain auxotrophy and media composition, and the limitations of phosphoproteomic analysis performed under a single growth condition—underscore the challenges of working with the TORC1 signaling system.

      In response to the reviewers’ comments, we have undertaken a broader and more systematic analysis of TORC1 regulation across defined nitrogen transitions, building directly on the signaling framework established in Figures 6 and 8 of this manuscript. This work, which includes expanded phosphoproteomic profiling and the use of refined genetic tools, supports and extends the key conclusions of Cecil et al. Specifically, it reinforces the existence of a Pib2-dependent TORC1 output under nitrogen-limited conditions and further clarifies the physiological relevance of the intermediate TORC1 activity state. Due to the scope and depth of this expanded work, we are reporting those findings in a separate publication. Nonetheless, we view the data presented here as a key foundational step in establishing a non-redundant framework for Gtr1/2- and Pib2-dependent control of TORC1.

      We have therefore made minor changes to the manuscript to clarify our use of different growth media and to temper our conclusions where appropriate. These changes, together with the context of ongoing work, should reinforce the value of Cecil et al. in advancing our understanding of TORC1 and nutrient signaling in eukaryotes.

    1. eLife Assessment

      This study provides an in-depth exploration of the impact of X-linked ZDHHC9 gene mutations on cognitive deficits and epilepsy, with a particular focus on the expression and function of ZDHHC9 in myelin-forming oligodendrocytes (OLs). These valuable findings offer insights into ZDHHC9-related X-linked intellectual disability (XLID) and shed light on the regulatory mechanisms of palmitoylation in myelination. The experimental design and analysis of results are solid, providing a reference for further research in this field.

    2. Reviewer #1 (Public review):

      Summary:

      Having shown that acyltransferase ZDHHC9 expression is far higher in myelinating oligodendrocytes (OLs) than in other CNS cell types, Jeong and colleagues focus on exploring the role of ZDHHC9 in myelinating OLs, in particular in the palmitoylation of several myelin proteins. This study is relevant in the context of X-linked intellectual disability as it suggests a more relevant role for myelinating glia than previously thought. It also provides useful insights into the mechanisms of ZDHHC9-associated XLID and into the palmitoylation-dependent control of myelination.

      Strengths:

      Well-written paper

      In general, good data quality

      Use of transgenic strategies (in addition to the ZDHHC9 KO) strengthens the data and claims

      Weaknesses:

      A few claims might have needed better experimental support, but new data and revised discussion sections addressed some of these weaknesses

    3. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public Review):

      In this work Jeong and colleagues focus on exploring the role of the acyltransferase ZDHHC9 in myelinating OLs, in particular in the palmitoylation of several myelin proteins. After confirming the specific enrichment of the Zdhhc9 transcript in mouse and human OLs, the authors examine the subcellular localization of the protein in vitro and observe that, in comparison with other isoforms, ZDHHC9 localizes to OL cell bodies and to discrete puncta in the processes. These observations (Figures 1 and 2) led the authors to hypothesize that ZDHHC9 plays an important role in myelination. No gross changes were detected in OL development in Zdhhc9 KO mice and analyses from P28 Zdhhc9 KO mice crossed with Mobp-EGFP reporter mice did not show changes in EGFP+ OL differentiation (Figure 3).

      However, and given the observed subcellular localization of ZDHHC9 in OL processes (Figure 2) and the observation that the percentage of unmyelinated axons is increased in Zdhhc9 KO (Figure 6), early time points to examine the differentiated pools of OLs and their capacity to extend processes/contact axons need to be considered.

      We appreciate this point, but due to the order in which experiments were performed, the ZDHHC9 KO mouse colony that we maintained after initial submission of this work contains homozygous MOBP-EGFP, but not the mT/mG transgene that would be most optimal for the proposed experiment. We hope the reviewer appreciates that it would take considerable time and effort regarding mouse breeding to cross out the MOBP and add back the mT/mG. We nonetheless appreciate the importance of the point raised and therefore examined an earlier developmental time point (P21, 3 weeks) to quantify OLs and NG2+ OPCs. In our updated Fig 3C1-C3, we use Mobp-EGFP mice to show that Zdhhc9 KO does not significantly affect the number of EGFP+ OLs at this time point in the cortex, corpus callosum and spinal cord. We also show that in corpus callosum, Zdhhc9 KO does not significantly affect the number of NG2+ OPCs at this earlier time point (Fig 3D, E). Furthermore, immunostaining to detect BCAS1, a marker of pre-mature OLs, also revealed no qualitative difference with ZDHHC9 loss at P21. We show representative images from these BCAS1 experiments in an updated Fig S3. While these new experiments do not address the morphology of OLs in Zdhhc9 KO, they do provide further evidence that deficits in myelination in young Zdhhc9 KO mice (Figure 6) are not likely due to gross differences in OPC or OL numbers during development.

      Maturation of OLs in Zdhhc9 KO was examined by crossing Zdhhc9 KO with Pdgfra-CreER;R26-EGFP and tracking the newly EGFP-labelled OPCs after tamoxifen administration. No changes in the numbers of EGFP+ OLs were detected. The authors concluded that the loss of ZDHHC9 does not alter oligodendrogenesis in either the young or mature CNS. The authors observed defects in Zdhhc9 KO OL protrusions that they attributed to abnormal OL membrane expansion (Fig 4 and 5). Can they show evidence for this?

      This is an important point, and we appreciate the opportunity to explain the reasoning behind our initial statement more fully, while noting that other explanations are possible. Fig 5B (an Imaris-assisted reconstruction using the EGFP cell fill/morphology marker) highlights large spheroid-like distensions along OL processes. We reason that these spheroids are enclosed by the OL lipid membrane because if the membrane were ruptured, the EGFP signal would likely diffuse. This in turn suggests that the caliber of the OL process at the position of the spheroid is grossly abnormal i.e. the membrane has hyper-expanded. Given that OL membrane growth during myelination extends in two directions, i.e., spiral growth to the axonal surface and longitudinal growth along the axon, it is possible that spheroid-like structures are formed by uneven myelin growth. We recognize that we cannot yet conclude whether and how spheroid formation might be linked to the myelination deficit that we observe in Zdhhc9 KO mice. However, defining the subcellular mechanism for spheroid formation may provide further insights into this issue. We have therefore largely retained the original statement but have added the reasoning above to our revised Discussion.

      The authors report that Zdhhc9 KO primary and secondary branches in OLs were longer, some contained spheroid-like swellings and the OL protrusion complexity was higher. However, these data are partially contradictory to what they show in OL differentiation experiments in vitro (Fig 7). There is also no evidence for increased membrane expansion in Zdhhc9 knockdown myelin-forming cells in culture. How to reconcile this?

      We appreciate the reviewer’s interest in this issue. Several non-mutually exclusive factors could account for the differences in OL morphology in vitro versus in vivo caused by Zdhhc9 loss. First, morphology in vivo may well be influenced by the axons and/or other extrinsic components around each OL that are not present in our primary cultures. Second, OL growth in vivo is highly 3-dimensional, whereas growth in culture is largely 2-dimensional – it may be difficult to support formation of spheroids (by definition, a 3-dimensional structure) in the latter situation. Finally, Zdhhc9 is absent in vivo from the beginning of development until the time points examined, whereas in our cultured OL experiments, Zdhhc9 shRNA is virally delivered to OPC cultures at DIV2 and likely acutely affects Zdhhc9 expression predominantly in committed OLs (following the switch to differentiation medium at DIV3). These differences may also affect the ability of other PATs or, potentially, palmitoylation-independent subcellular processes, to compensate for Zdhhc9 loss. We have more fully explained these points in our revised Discussion. 

      Reviewer #2 (Public Review):

      This study provides an in-depth exploration of the impact of X-linked ZDHHC9 gene mutations on cognitive deficits and epilepsy, with a particular focus on the expression and function of ZDHHC9 in myelin-forming oligodendrocytes (OLs). These findings offer crucial insights into understanding ZDHHC9-related X-linked intellectual disability (XLID) and shed light on the regulatory mechanisms of palmitoylation in myelination. The experimental design and analysis of results are convincing, providing a valuable reference for further research in this field. However, upon careful review, I believe the article still needs further improvement and supplementation in the following aspects:

      (1) Regarding the subcellular localization experiment of ZDHHC9 mutants in OL, it is currently limited to in vitro cultured OL, lacking validation in vivo OL or myelin sheath. Additionally, it is necessary to investigate whether the abnormal subcellular localization of ZDHHC9 mutants affects their enzyme activity and palmitoylation modification of substrate proteins.

      This is an important point but is technically challenging to address in vivo as it would likely require delivery of AAV to express ZDHHC9wt and XLID mutants specifically in OLs, preferably in the absence of endogenous ZDHHC9. We hope the reviewers would agree that this experiment is beyond the scope of the current study. However, we did compare the ability of ZDHHC9wt and XLID mutants to palmitoylate MBP, and to autopalmitoylate (sometimes used as a surrogate measure of PAT activity) in transfected heterologous cells. Although we recognize that this over-expression system is less physiological than a native OL, it has the benefit of being able to readily compare transfected wt vs mutant forms of ZDHHC9 with minimal contribution from endogenous ZDHHC9. Intriguingly, using this system, we found that autopalmitoylation activity of the XLID ZDHHC9-P150S mutant does not differ significantly from that of ZDHHC9wt, and that this mutant is still capable of palmitoylating MBP. Moreover, the R96W mutant, while impaired in autopalmitoylation, still palmitoylated MBP approximately 50% as effectively as ZDHHC9wt in our cell-based assay. These findings suggest that ZDHHC9-P150S and, probably, ZDHHC9-R96W mutants might still be able to palmitoylate substrates in OLs if they were properly localized. This possibility in turn suggests that impaired subcellular targeting in addition to, or instead of, impaired catalytic activity, may be a key factor in certain cases of ZDHHC9-associated XLID. We have expanded our Figure 8 (new panels 8E-G) to show these additional experiments and have summarized the conclusions above in our revised Discussion. We thank the reviewer for suggesting that we further investigate this issue.

      (2) The experimental period (P21+21 days) using genetic labeling to track the development of myelinating cells may not be long enough. It is recommended to extend the observation time and analyze at more time points to more comprehensively reflect the impact of Zdhhc9 KO.

      We appreciate this point from the reviewer but, regrettably, we did not maintain the Pdgfra-CreER; R26-EGFP; Zdhhc9 KO mouse line and hope the reviewer appreciates that it would take considerable time and effort to rederive this line and then perform the suggested extended time course experiments. However, we note for the reviewer that our preliminary studies did not reveal any effect of Zdhhc9 KO on the number of MOBP-EGFP+ OLs in 6-month-old mice (not shown), consistent with a model in which Zdhhc9 loss does not affect OPC-OL commitment per se.

      (3) The author speculates that Zdhhc9 may regulate myelination by affecting the membrane localization of specific myelin proteins, but lacks direct experimental evidence to support this. It is suggested to detect the expression and distribution of relevant proteins in the myelin of Zdhhc9 KO mice.

      We share the reviewer’s interest in this point but realized that it is more technically challenging to address than might be initially thought. The main protein we would implicate and seek to test is MBP, but we already found that there is no gross change in MBP distribution in vivo in Zdhhc9 KO mice (Fig 3A). However, an anti-MBP antibody recognizes all forms of MBP, not just the specific splice variants whose palmitoylation is affected by ZDHHC9 loss. Specifically assessing nanoscale distribution of these splice variants would require a way (e.g. anti-MBP splice form-specific antibodies that are compatible with immuno-EM) to distinguish these variants from other, non-palmitoylated forms of MBP. Although such an antibody could be an important tool, we hope the reviewers would agree that developing and characterizing such a reagent is beyond the scope of the current study.

      We do, however, note that the lack of gross change in MBP distribution and levels in Zdhhc9 KO mice is consistent with the relatively mild phenotype of these mice, compared with shiverer (shi/shi) mice, in which MBP is completely lost. In shiverer, CNS compact myelin is almost absent (PMID: 671037; PMID: 88695; PMID: 460693) and, as the name suggests, mice display a shivering gait, and exhibit seizures and early death. In contrast, Zdhhc9 KO mice show only subtle behavioral deficits (PMID: 29944857). These differences are all consistent with a model in which Zdhhc9 KO mice, despite their significantly reduced MBP palmitoylation (Fig 8), have grossly normal distribution and levels of MBP when all splice variants are assessed (Fig 3, Fig 8). It is not inconceivable that Zdhhc9 KO mice have a nanoscale change in the distribution of MBP, particularly of specific palmitoylated splice variants, within myelin that profoundly affects myelin ultrastructure, without grossly altering MBP distribution. However, an alternative and not mutually exclusive possibility is that aberrant palmitoylation of other Zdhhc9 substrates accounts for, or contributes to, the abnormalities in myelin at the ultrastructural level. Addressing this issue would require a multi-pronged approach, not just to assess palmitoylation and distribution of such proteins in Zdhhc9 KO, but also to test whether they are direct Zdhhc9 substrates, in order to rule out indirect effects. We hope reviewers would agree that this is best left to a separate study. However, in our revised Discussion we now summarize what can be inferred regarding Zdhhc9-dependent effects on total and splice variant-specific distribution and levels of MBP.

      (4) Although the article mentions the association of Zdhhc9 with intellectual disabilities, it does not involve behavioral analysis of Zdhhc9 KO mice. It is recommended to supplement some behavioral experimental data to support the important role of Zdhhc9 in maintaining normal cognitive function, enhancing the clinical relevance of the article.

      We appreciate this point from the reviewer. The behavior of the same ZDHHC9 KO mouse line that we used was reported in PMID: 31747610 and in PMID: 29944857. In the former study, Zdhhc9 KO mice were reported to display seizures reminiscent of phenotypes in human patients with ZDHHC9 mutation. The latter study assessed performance of Zdhhc9 KO mice in several tasks that test cognitive function. Specifically, the KO mice were reported to display “altered behaviour in the open-field test, elevated plus maze and acoustic startle test that is consistent with a reduced anxiety level; a reduced hang time in the hanging wire test that suggests underlying hypotonia but which may also be linked to reduced anxiety [and] deficits in the Morris water maze test of hippocampal-dependent spatial learning and memory.” We have incorporated these findings in our revised Discussion, where we summarize how these phenotypes are common, not just to human patients with ZDHHC9 mutation, but also to other human neurodevelopmental conditions and mouse models in which ID is a common feature.

      (5) For the abnormal myelination observed in Zdhhc9 KO mice, including unmyelinated large-diameter axons and excessively myelinated small-diameter axons, the article lacks in-depth research and explanation on the exact mechanism and mode of action of ZDHHC9 in regulating myelination.

      We share the reviewer’s interest in this point but again note that gaining definitive insights into this issue is far from trivial. Convincing evidence of a causative mechanism would require an exhaustive identification of ZDHHC9 in vivo substrates, followed by point mutation of substrate palmitoylation site(s) to determine the extent to which loss of palmitoylation of such protein(s) phenocopies ZDHHC9 loss. Nonetheless, it is possible to break this question down and to summarize what we do and do not know. For example, our experiments in cultured OLs show that ZDHHC9 loss causes cell-autonomous deficits in morphological maturation of these cells. We also know that ZDHHC9 loss results in impaired palmitoylation of MBP, a direct substrate for ZDHHC9. Moreover, loss of ZDHHC9 at Golgi outposts in OLs (a phenotype observed with several XLID-associated mutant forms of ZDHHC9, even those with no significant loss of catalytic activity) correlates with intellectual disability. Together, these findings are consistent with a model in which ZDHHC9 action at OL Golgi outposts is critical for normal myelination. However, it is yet to be determined whether the key substrates of ZDHHC9 include MBP, other palmitoyl-proteins that are key constituents of CNS myelin, or proteins whose palmitoylation is important for myelin protein trafficking and targeting. Another non-mutually exclusive possibility is that ZDHHC9 acts at Golgi outposts but indirectly, for example to drive the expression of myelin protein genes. Future experiments, including but not limited to palmitoyl-proteomics in ZDHHC9 (OL-specific) KO mice, will be needed to provide more definitive insights into this issue. We have expanded our Discussion of links between ZDHHC9 mutation and impaired myelination to summarize the above points.

      (6) The function of ZDHHC9 in OL may be related to the Golgi apparatus, but its exact role in these structures is still unclear. It is suggested to discuss in more detail the role of ZDHHC9 in the Golgi apparatus in the discussion section.

      We appreciate this point, which we considered as related to point (5) above. In our revised Discussion we highlight how ZDHHC9 action at Golgi outposts may involve direct palmitoylation of myelin proteins, palmitoylation of proteins that direct myelin proteins to the myelin membrane and/or activation of gene expression programs that serve to drive myelination. We further note that these possibilities are not mutually exclusive.

      (7) More experimental support and in-depth research are needed on the detailed mechanism of how ZDHHC9 and Golga7 cooperatively regulate MBP palmitoylation, and how this decrease in palmitoylation level leads to myelination defects.

      This is another important point – our new experiments suggest that, although some XLID mutations markedly affect ZDHHC9’s ability to palmitoylate MBP, others do not, yet all of the mutant forms fail to localize to Golgi outposts. These findings are consistent with a model in which the subcellular location at which ZDHHC9 palmitoylates MBP, and potentially other substrates, is critical for normal myelination. Interestingly, despite their marked differences in basal catalytic activity (as assessed by autopalmitoylation), wt and all XLID forms of ZDHHC9 appear to show enhanced activity (measured by both auto- and MBP palmitoylation) in the presence of Golga7, suggesting that the association with Golga7 (which also localizes to Golgi outposts) is central to ZDHHC9 activity. This model is also highly consistent with the biased expression of Golga7 in OLs, compared to other CNS cell types (Fig 1E, 1F). Moreover, XLID-associated mutant forms of ZDHHC9 also show reduced protein stability and are impaired in their ability to form complexes with Golga7 (also known as Golgi Complex Protein 16kDa; GCP16; PMID: 37035671). Failure of ZDHHC9 XLID mutants to localize to Golgi outposts may thus be due to aberrant trafficking of mutant ZDHHC9 per se, but may also involve impaired association/stabilization of ZDHHC9/Golga7 complexes at these locations. Again, it is possible that either or both of these mechanisms, which are not mutually exclusive, contribute to impaired MBP palmitoylation and/or myelination deficits. We summarize these points in our revised Discussion.

      In summary, it is recommended that the authors address the above issues through additional experiments and improved discussions to further strengthen the credibility and clinical relevance of the article.

      Recommendations for the authors:  

      Reviewer #1 (Recommendations For The Authors):

      No gross changes were detected in OL development in Zdhhc9 KO mice and analyses from P28 Zdhhc9 KO mice crossed with Mobp-EGFP reporter mice did not show changes in EGFP+ OL differentiation (Figure 3). However, and given the observed subcellular localization of ZDHHC9 in OL processes (Figure 2) and the observation that the percentage of unmyelinated axons is increased in Zdhhc9 KO (Figure 6), ***early time points to examine the differentiated pools of OLs and their capacity to extend processes/contact axons need to be considered***.

      We appreciate this point, but due to the order in which experiments were performed, the ZDHHC9 KO mouse colony that we maintained after initial submission of this work contains homozygous MOBP-EGFP, but not the mT/mG transgene that would be most optimal for the proposed experiment. We hope the reviewer appreciates that it would take considerable time and effort regarding mouse breeding to cross out the MOBP and add back the mT/mG. We nonetheless appreciate the importance of the point raised and therefore examined an earlier developmental time point (P21, 3 weeks) to quantify OLs and NG2+ OPCs. In our updated Fig 3C1-C3, we use Mobp-EGFP mice to show that Zdhhc9 KO does not significantly affect the number of EGFP+ OLs at this time point in the cortex, corpus callosum and spinal cord. We also show that in corpus callosum, Zdhhc9 KO does not significantly affect the number of NG2+ OPCs at this earlier time point (Fig 3D, E). Furthermore, immunostaining to detect BCAS1, a marker of pre-mature OLs, also revealed no qualitative difference with ZDHHC9 loss at P21. We show representative images from these BCAS1 experiments in an updated Fig S3. While these new experiments do not address the morphology of OLs in Zdhhc9 KO, they do provide further evidence that deficits in myelination in young Zdhhc9 KO mice (Figure 6) are not likely due to gross differences in OPC or OL numbers during development.

      The authors observed defects in Zdhhc9 KO OL protrusions that they attributed to abnormal OL membrane expansion (Fig 4 and 5). Can they show evidence for this?

      This is an important point, and we appreciate the opportunity to explain the reasoning behind our initial statement more fully, while noting that other explanations are possible. Fig 5B (an Imaris-assisted reconstruction using the EGFP cell fill/morphology marker) highlights large spheroid-like distensions along OL processes. We reason that these spheroids are enclosed by the OL lipid membrane because if the membrane were ruptured, the EGFP signal would likely diffuse. This in turn suggests that the caliber of the OL process at the position of the spheroid is grossly abnormal i.e. the membrane has hyper-expanded. Given that OL membrane growth during myelination extends in two directions, i.e., spiral growth to the axonal surface and longitudinal growth along the axon, it is possible that spheroid-like structures are formed by uneven myelin growth. We recognize that we cannot yet conclude whether and how spheroid formation might be linked to the myelination deficit that we observe in Zdhhc9 KO mice.

      However, defining the subcellular mechanism for spheroid formation may provide further insights into this issue. We have therefore largely retained the original statement but have added the reasoning above to our revised Discussion.

      The authors report that Zdhhc9 KO primary and secondary branches in OLs were longer, that some contained spheroid-like swellings, and that OL protrusion complexity was higher. However, these data partially contradict what they show in OL differentiation experiments in vitro (Fig 7). There is also no evidence for increased membrane expansion in Zdhhc9 knockdown myelin-forming cells in culture. How do they reconcile these different findings?

      We appreciate the reviewer’s interest in this issue. Several non-mutually exclusive factors could account for the differences in OL morphology in vitro versus in vivo caused by Zdhhc9 loss. First, morphology in vivo may well be influenced by the axons and/or other extrinsic components around each OL that are not present in our primary cultures. Second, OL growth in vivo is highly 3-dimensional, whereas growth in culture is largely 2-dimensional – it may be difficult to support formation of spheroids (by definition, a 3-dimensional structure) in the latter situation. Finally, Zdhhc9 is absent in vivo from the beginning of development until the time points examined, whereas in our cultured OL experiments, Zdhhc9 shRNA is virally delivered to OPC cultures at DIV2 and likely acutely affects Zdhhc9 expression predominantly in committed OLs (following the switch to differentiation medium at DIV3). These differences may also affect the ability of other PATs or, potentially, palmitoylation-independent subcellular processes, to compensate for Zdhhc9 loss. We have more fully explained these points in our revised Discussion. 

      Page 7: "The OL processes in this culture condition correspond to large lipid-rich membranous sheets that form spiral membrane expansion on axons in vivo (49)." At which stage are authors referring to? OL processes are extended in culture before membrane formation and this is not clear here. In a 3-days differentiation culture, most OLs have not yet formed a myelin sheath (eg., Figure 2 in Zuchero et al., 2015, Dev Cell).

      We appreciate the reviewer highlighting this point. We first note that our oligodendrocyte (OL) culture conditions differ from the immunopanning method used by Zuchero et al., 2015 (original reference (Emery and Dugas, 2013)), which may affect the time course and progression of OL process elaboration and/or myelin sheath formation. We further note that in our cultures most EGFP+ processes are also MBP+ at the time point examined (strictly 3 days plus 9 hours post-differentiation). It thus seems likely that these MBP+ structures largely correspond to the MBP+ wrapping sheaths that occur in vivo, so we have therefore retained our original statement but have added this further explanation.

      Minor: Figure 6 (Legend): Time points should be indicated throughout the panels.

      We have added this information as requested

      Reviewer 2 Recommendations for the Authors:

      (1) The subcellular localization experiment with ZDHHC9 mutants is currently limited to OLs cultured in vitro and lacks validation in OLs or myelin sheaths in vivo. Additionally, it is necessary to investigate whether the abnormal subcellular localization of ZDHHC9 mutants affects their enzyme activity and their palmitoylation of substrate proteins.

      We thank the reviewer for raising this point. New data in our revised Figure 8 compares autopalmitoylation (sometimes used as a surrogate measure of PAT activity) of ZDHHC9wt and XLID mutants, and their ability to palmitoylate MBP in transfected cells. Intriguingly, we found that autopalmitoylation activity of the ZDHHC9-P150S mutant does not differ significantly from that of ZDHHC9wt, and that this mutant is still capable of palmitoylating MBP. Moreover, the R96W mutant, while impaired in autopalmitoylation, still palmitoylated MBP approximately 50% as effectively as ZDHHC9wt in our cell-based assay. These findings suggest that ZDHHC9-P150S and, probably, ZDHHC9-R96W mutants might still be able to palmitoylate substrates in OLs if they were properly localized. This possibility in turn suggests that impaired subcellular targeting in addition to, or instead of, impaired catalytic activity, may be a key factor in certain cases of ZDHHC9-associated XLID. We have expanded our Figure 8 to show these new experiments and have summarized the conclusions above in our revised Discussion. We thank the reviewer for suggesting that we further investigate this issue.

      (2) The experimental period (P21+21 days) using genetic labeling to track the development of myelinating cells may not be long enough. It is recommended to extend the observation time and analyze at more time points to more comprehensively reflect the impact of Zdhhc9 KO.

      We appreciate this point from the reviewer but, regrettably, we did not maintain the PdgfraCreER; R26-EGFP; Zdhhc9 KO mouse line and hope the reviewer appreciates that it would take considerable time and effort to rederive this line and then perform the suggested extended time course experiments. However, we note for the reviewer that our preliminary studies did not reveal any effect of Zdhhc9 KO on the number of MOBP-EGFP+ OLs in 6-month-old mice (not shown), consistent with a model in which Zdhhc9 loss does not affect OPC-OL commitment per se.

      (3) The author speculates that Zdhhc9 may regulate myelination by affecting the membrane localization of specific myelin proteins, but lacks direct experimental evidence to support this. It is suggested to detect the expression and distribution of relevant proteins in the myelin of Zdhhc9 KO mice.

      We share the reviewer’s interest in this point but realized that it is more technically challenging to address than might be initially thought. The main protein we would implicate and seek to test is MBP, but we already found that there is no gross change in MBP distribution in vivo in Zdhhc9 KO mice (Fig 3A). However, an anti-MBP antibody recognizes all forms of MBP, not just the specific splice variants whose palmitoylation is affected by ZDHHC9 loss. Specifically assessing nanoscale distribution of these splice variants would require a way (e.g. an anti-MBP splice form-specific antibody that is compatible with immuno-EM) to distinguish these variants from other, non-palmitoylated forms of MBP. Although such an antibody could be an important tool, we hope the reviewers would agree that developing and characterizing such a reagent is beyond the scope of the current study.

      We do, however, note that the lack of gross change in MBP distribution and levels in Zdhhc9 KO mice is consistent with the relatively mild phenotype of these mice, compared with shiverer (shi/shi) mice, in which MBP is completely lost. In shiverer, CNS compact myelin is almost absent (PMID: 671037; PMID: 88695; PMID: 460693) and, as the name suggests, mice display a shivering gait, and exhibit seizures and early death. In contrast, Zdhhc9 mice show only subtle behavioral deficits (PMID: 29944857). These differences are all consistent with a model in which Zdhhc9 KO mice, despite their significantly reduced MBP palmitoylation (Fig 8), have grossly normal distribution and levels of MBP when all splice variants are assessed (Fig 3, Fig 8). It is not inconceivable that Zdhhc9 KO mice have a nanoscale change in the distribution of MBP, particularly of specific palmitoylated splice variants, within myelin that profoundly affects myelin ultrastructure, without grossly altering MBP distribution. However, an alternative and not mutually exclusive possibility is that aberrant palmitoylation of other Zdhhc9 substrates accounts for, or contributes to, the abnormalities in myelin at the ultrastructural level. Addressing this issue would require a multi-pronged approach, not just to assess palmitoylation and distribution of such proteins in Zdhhc9 KO, but also to test whether they are direct Zdhhc9 substrates, in order to rule out indirect effects. We hope reviewers would agree that this is best left to a separate study. However, in our revised Discussion we now summarize what can be inferred regarding Zdhhc9-dependent effects on total and splice variant-specific distribution and levels of MBP.

      (4) Although the article mentions the association of Zdhhc9 with intellectual disabilities, it does not involve behavioral analysis of Zdhhc9 KO mice. It is recommended to supplement some behavioral experimental data to support the important role of Zdhhc9 in maintaining normal cognitive function, enhancing the clinical relevance of the article.

      We appreciate this point from the reviewer. The behavior of the same ZDHHC9 KO mouse line that we used was reported in PMID: 31747610 and in PMID: 29944857. In the former study, Zdhhc9 KO mice were reported to display seizures reminiscent of phenotypes in human patients with ZDHHC9 mutation. The latter study assessed performance of Zdhhc9 KO mice in several tasks that test cognitive function. Specifically, the KO mice were reported to display “altered behaviour in the open-field test, elevated plus maze and acoustic startle test that is consistent with a reduced anxiety level; a reduced hang time in the hanging wire test that suggests underlying hypotonia but which may also be linked to reduced anxiety [and] deficits in the Morris water maze test of hippocampal-dependent spatial learning and memory.” We have incorporated these findings in our revised Discussion, where we summarize how these phenotypes are common, not just to human patients with ZDHHC9 mutation, but also to other human neurodevelopmental conditions and mouse models in which ID is a common feature.

      (5) For the abnormal myelination observed in Zdhhc9 KO mice, including unmyelinated large-diameter axons and excessively myelinated small-diameter axons, the article lacks in-depth research and explanation on the exact mechanism and mode of action of ZDHHC9 in regulating myelination.

      We share the reviewer’s interest in this point but again note that gaining definitive insights into this issue is far from trivial. Convincing evidence of a causative mechanism would require an exhaustive identification of ZDHHC9 in vivo substrates, followed by point mutation of substrate palmitoylation site(s) to determine the extent to which loss of palmitoylation of such protein(s) phenocopies ZDHHC9 loss. Nonetheless, it is possible to break this question down and to summarize what we do and do not know. For example, our experiments in cultured OLs show that ZDHHC9 loss causes cell-autonomous deficits in morphological maturation of these cells. We also know that ZDHHC9 loss results in impaired palmitoylation of MBP, a direct substrate for ZDHHC9. Moreover, loss of ZDHHC9 at Golgi outposts in OLs (a phenotype observed with several XLID-associated mutant forms of ZDHHC9, even those with no significant loss of catalytic activity) correlates with intellectual disability. Together, these findings are consistent with a model in which ZDHHC9 action at OL Golgi outposts is critical for normal myelination. However, it is yet to be determined whether the key substrates of ZDHHC9 include MBP, other palmitoyl-proteins that are key constituents of CNS myelin, or proteins whose palmitoylation is important for myelin protein trafficking and targeting. Another non-mutually exclusive possibility is that ZDHHC9 acts at Golgi outposts but indirectly, for example to drive the expression of myelin protein genes. Future experiments, including but not limited to palmitoyl-proteomics in ZDHHC9 (OL-specific) KO mice, will be needed to provide more definitive insights into this issue. We have expanded our Discussion of links between ZDHHC9 mutation and impaired myelination to summarize the above points.

      (6) The function of ZDHHC9 in OL may be related to the Golgi apparatus, but its exact role in these structures is still unclear. It is suggested to discuss in more detail the role of ZDHHC9 in the Golgi apparatus in the discussion section.

      We appreciate this point, which we considered as related to point (5) above. In our revised Discussion we highlight how ZDHHC9 action at Golgi outposts may involve direct palmitoylation of myelin proteins, palmitoylation of proteins that direct myelin proteins to the myelin membrane and/or activation of gene expression programs that serve to drive myelination. We further note that these possibilities are not mutually exclusive.

      (7) More experimental support and in-depth research are needed on the detailed mechanism of how ZDHHC9 and Golga7 cooperatively regulate MBP palmitoylation, and how this decrease in palmitoylation level leads to myelination defects.

      This is another important point – our new experiments suggest that, although some XLID mutations markedly affect ZDHHC9’s ability to palmitoylate MBP, others do not, yet all of the mutant forms fail to localize to Golgi outposts. These findings are consistent with a model in which the subcellular location at which ZDHHC9 palmitoylates MBP, and potentially other substrates, is critical for normal myelination. Interestingly, despite their marked differences in basal catalytic activity (as assessed by autopalmitoylation), wt and all XLID forms of ZDHHC9 appear to show enhanced activity (measured by both auto- and MBP palmitoylation) in the presence of Golga7, suggesting that the association with Golga7 (which also localizes to Golgi outposts) is central to ZDHHC9 activity. This model is also highly consistent with the biased expression of Golga7 in OLs, compared to other CNS cell types (Fig 1E, 1F). Moreover, XLID-associated mutant forms of ZDHHC9 also show reduced protein stability and are impaired in their ability to form complexes with Golga7 (also known as Golgi Complex Protein 16kDa; GCP16; PMID: 37035671). Failure of ZDHHC9 XLID mutants to localize to Golgi outposts may thus be due to aberrant trafficking of mutant ZDHHC9 per se, but may also involve impaired association/stabilization of ZDHHC9/Golga7 complexes at these locations. Again, it is possible that either or both of these mechanisms, which are not mutually exclusive, contribute to impaired MBP palmitoylation and/or myelination deficits. We summarize these points in our revised Discussion.

    1. eLife Assessment

      This manuscript determines how PA28g, a proteasome regulator that is overexpressed in tumors, and C1QBP, a mitochondrial protein for maintaining oxidative phosphorylation that plays a role in tumor progression, interact in tumor cells to promote their growth, migration and invasion. Additional experiments and analyses that supported the theoretical models for the interaction have been performed in response to the reviews. The overall findings and conceptual framework are important and the evidence is solid. A logical extrapolation of this work is to test the C1QBP mutants using functional assays to determine whether the mutations can decrease the protein stability mediated by the interaction with PA28g.

    2. Reviewer #2 (Public review):

      Summary:

      The authors tried to determine how PA28g functions in oral squamous cell carcinoma (OSCC) cells. They hypothesized it may act through metabolic reprogramming in the mitochondria.

      Strengths:

      They found that the genes of PA28g and C1QBP are in an overlapping interaction network after an analysis of a genome database. They also found that the two proteins interact in coimmunoprecipitation and pull-down assays using the lysate from OSCC cells with or without expression of the exogenous genes. They used truncated C1QBP proteins to map the interaction site to the N-terminal 167 residues of C1QBP protein. They observed the levels of the two proteins are positively correlated in the cells. They provided evidence for the colocalization of the two proteins in the mitochondria and the effect on mitochondrial form and function in vitro and in vivo OSCC models, and the correlation of the protein expression with the prognosis of cancer patients.

      Comments on revision:

      The third revision added data from two point mutations of C1QBP that would disrupt a hydrogen bond network with PA28g protein. As one would expect from the structural models obtained with AlphaFold, the interaction between the two proteins, as detected by co-immunoprecipitation from cell lysate, was reduced by both mutations. Therefore, the theoretical models for the interaction were supported by the experimental data. Moving forward, the home run experiments would be to test the C1QBP mutants in functional assays to determine whether the mutations can decrease the protein stability afforded by the interaction with PA28g, and in turn decrease the effect of PA28g on mitochondria and tumor cells via C1QBP. Success of these experiments would round out this manuscript, which presents a novel finding for tumor cell biology that could be a launch pad for therapeutic intervention in tumor development.

    3. Author response:

      The following is the authors’ response to the previous reviews

      Public Reviews:

      Reviewer #2 (Public review):

      This manuscript determines how PA28g, a proteasome regulator that is overexpressed in tumors, and C1QBP, a mitochondrial protein for maintaining oxidative phosphorylation that plays a role in tumor progression, interact in tumor cells to promote their growth, migration and invasion. Evidence for the interaction and its impact on mitochondrial form and function was provided although it is not particularly strong.

      The revised manuscript corrected mislabeled data in figures and provides more details in figure legends. Misleading sentences and typos were corrected. However, key experiments that were suggested in previous reviews were not done, such as making point mutations to disrupt the protein interactions and assess the consequence on protein stability and function. Results from these experiments are critical to determine whether the major conclusions are fully supported by the data.

      The second revision of the manuscript included the proximity ligation data to support the PA28g-C1QBP interaction in cells. However, the method and data were not described in sufficient detail for readers to understand. The revision also includes the structural models of the PA28g-C1QBP complex predicted by AlphaFold. However, the method and data were not described with details for readers to understand how this structural modeling was done, what is the quality of the resulting models, and the physical nature of the protein-protein interaction such as what kind of the non-covalent interactions exist in the interface of the protein complexes. Furthermore, while the interactions mediated by the protein fragments were tested by pull-down experiments, the interactions mediated by the three residues were not tested by mutagenesis and pull-down experiments. In summary, the revision was improved, but further improvement is needed.

      Thank you very much for your comments.

      (1) Based on your suggestion, we predicted the possible interaction sites using AlphaFold 3 and found that mutations in amino acids 76 and 78 of C1QBP affect the interaction with PA28γ (Revised Appendix Figure 1J). Subsequently, a pull-down experiment also showed that after mutating the amino acids at these two sites (T76A, G78N), the amount of C1QBP that bound to PA28γ decreased (Revised Figure 1J). These results confirm that PA28γ interacts with C1QBP in a manner dependent on the N-terminus of C1QBP. These findings are now included in the Results section of the revised manuscript: “In addition, we employed AlphaFold 3 to perform energy minimization and predict hydrogen bonds between the C1QBP N-terminus (amino acids 1-167) and the PA28γ protein interaction region. The results suggest that the T76 and G78 residues of C1QBP may be key contributors to the interaction. Consistently, coimmunoprecipitation analysis demonstrated that mutations at these sites (C1QBPT76A and C1QBPG78N) significantly reduced the binding ability to PA28γ (Fig. 1J and Appendix Fig. 1J)”. We believe this additional validation strengthens the robustness of our findings.

      (2) According to your suggestion, we have added a description of the results of PLA in the figure legend (Revised Figure 1C) and the method of PLA in the appendix file (Revised Appendix file, Part “Proximity Ligation Assay”). The revised text reads as follows: (C) PLA image of UM1 cells shows the interaction between C1QBP and PA28γ in both cytoplasm and nucleus (red fluorescence).

      (3) In the light of your suggestion, we have enriched the description of AlphaFold 3 analysis in the appendix file (Revised Appendix file, Page 10-11). The revised text reads as follows:

      “Prediction and Analysis of Protein Interactions

      Protein Sequence Retrieval and Structure Prediction

      The protein sequences of C1QBP and PA28γ were obtained from the AlphaFold Protein Structure Database. Structural predictions of the protein-protein interaction between C1QBP and PA28γ were conducted using AlphaFold 3. The pLDDT (predicted local distance difference test) values were utilized to assess the confidence of the predicted models. Models with a pLDDT score above 70 were considered confident, while those with a score above 90 were categorized as very high confidence. These values were annotated in the figures to indicate the reliability of the structural predictions.”

      “Protein Preparation and Structure Optimization

      The best-scored model for the C1QBP-PA28γ interaction predicted by AlphaFold 3 was selected for further analysis. The model was imported into MOE 2022 (Molecular Operating Environment) software for protein preparation. This process included the removal of water molecules and other heteroatoms, followed by the addition of hydrogen atoms to the structure. This step was essential for optimizing the protein’s 3D conformation and ensuring the correctness of the protonation states at physiological pH.”

      “Energy Minimization and Hydrogen Bond Prediction

      The protein structure was subjected to energy minimization using the Amber10:EHT (Extended Hückel Theory) force field, with R-field 1:80 settings to refine the model’s geometry. The minimization process was performed to optimize the protein’s internal energy and ensure stable conformation, followed by calculation of hydrogen bond interactions. The interaction energies and hydrogen bonds were analyzed to identify potential binding sites and stabilize the predicted protein-protein complex.”
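
      As an illustration of the confidence bands quoted in the appendix text above (pLDDT above 70 taken as confident, above 90 as very high confidence), the thresholds could be applied as in the minimal sketch below. The function names and the label for scores at or below 70 are hypothetical and are not taken from the authors' workflow, and the example scores are made up for illustration.

```python
# Minimal sketch, assuming the two thresholds quoted in the appendix text.
# Function names and the lowest band's label are hypothetical.

def classify_plddt(score: float) -> str:
    """Map a pLDDT score (0-100 scale) to the bands quoted in the text."""
    if not 0.0 <= score <= 100.0:
        raise ValueError("pLDDT is defined on a 0-100 scale")
    if score > 90.0:
        return "very high"
    if score > 70.0:
        return "confident"
    # The quoted text does not name this band; the label is an assumption.
    return "below confident threshold"


def summarize(scores):
    """Count how many scores fall into each band."""
    counts = {"very high": 0, "confident": 0, "below confident threshold": 0}
    for s in scores:
        counts[classify_plddt(s)] += 1
    return counts


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    print(summarize([95.2, 85.0, 60.1, 72.4]))
```

      Both comparisons are strict ("above"), so a score of exactly 70 or 90 falls into the lower band; an analysis using inclusive cut-offs would need `>=` instead.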

    1. eLife Assessment

      This important study reports solid evidence for the significant role of mother-child neural synchronization and relationship quality in the development of Theory of Mind (ToM) and social cognition. The findings effectively bridge brain development with children's behavior and parenting practices, and will be of interest to researchers studying brain development and social cognition, as well as the general public.

    2. Reviewer #1 (Public review):

      The authors have undertaken a significant revision of the manuscript and addressed the vast majority of our original comments. The manuscript is significantly improved as a result and will make a nice contribution to the literature. The new framing is especially impactful.

      We have a few remaining comments to improve the manuscript:

      Q1: The authors clarified the multiple comparison correction appropriately, and included a comprehensive discussion of the study limitations related to causality and SEM. We think there could be a few further improvements to the manuscript to fully address our initial comment.

      Under the results section where the authors describe the use of structural equation modeling, we think that it would be helpful to readers to further emphasize that the current design doesn't allow for delineation of temporal sequences in development and therefore cannot reflect true mediation. These are important caveats that the authors describe beautifully in their response.

      In addition to thinking about the mediating variables, can the authors conduct a sensitivity analysis that re-orders the IV, mediator, and DV? That way, a formal comparison can be made between model fits. It would provide an empirical basis for how to temper the discussion of these findings.

      Q7: We think that this analysis (lack of significant correlations between ISS, child age, and neural maturity) and the corresponding discussion by the authors would be very interesting for readers. It does not appear that they have added this information to the text (even a supplementary file would suffice), but I think it would strengthen their conclusions about context-specific neural dynamics.

    3. Reviewer #2 (Public review):

      Summary:

      This study investigates the impact of mother-child neural synchronization and the quality of parent-child relationships on the development of Theory of Mind (ToM) and social cognition. Utilizing a naturalistic fMRI movie-viewing paradigm, the authors analyzed inter-subject neural synchronization in mother-child dyads and explored the connections between neural maturity, parental caregiving, and social cognitive outcomes. The findings indicate age-related maturation in ToM and social pain networks, emphasizing the importance of dyadic interactions in shaping ToM performance and social skills, thereby enhancing our understanding of the environmental and intrinsic influences on social cognition.

      Strengths:

      This research addresses a significant question in developmental neuroscience, by linking social brain development with children's behaviors and parenting. It also uses a robust methodology by incorporating neural synchrony measures, naturalistic stimuli, and a substantial sample of mother-child dyads to enhance its ecological validity. Furthermore, the SEM approach provides a nuanced understanding of the developmental pathways associated with Theory of Mind (ToM). The manuscript also addressed many concerns raised in the initial review. The adoption of the neuroconstructivist framework effectively frames neural and cognitive development as reciprocal, addressing prior concerns about causality. The justification for methodological choices, such as omitting resting-state baselines due to scanning challenges in children and using unit-weighted scoring for ToM tasks, further strengthens the study's credibility.

      Weaknesses:

      (1) The revised introduction has improved, particularly in framing the first goal (developmental changes in ToM and SPM networks) as a "developmental anchor" for goals 2 and 3. However, given prior research on age-related changes in these networks (e.g., Richardson et al., 2018), the authors should clarify whether this goal seeks to replicate prior findings or to extend them to new contexts. Specifying how this part differs from existing work and articulating specific hypotheses would enhance the focus.

      (2) I still have some reservations about retaining the slightly causal term "shape" in the title. While the manuscript now carefully avoids causal claims, the title may still be interpreted as implying directionality, especially by non-specialist audiences.

      (3) One more question about Figure 2A and 2B: adults and children showed highly similar response curves for video frames, yet some peaks (e.g., T02, T05, T06) are identified as ToM or SPM events only in adults. Do the statistical methods account for these differences, or do the corresponding video frames contain subtle social cues that only adults can process?

    4. Reviewer #3 (Public review):

      Summary:

      The article explores the role of mother-child interactions in the development of children's social cognition, focusing on Theory of Mind (ToM) and Social Pain Matrix (SPM) networks. Using a naturalistic fMRI paradigm involving movie viewing, the study examines relationships among children's neural development, mother-child neural synchronization, and interaction quality. The authors identified a developmental pattern in these networks, showing that they become more functionally distinct with age. Additionally, they found stronger neural synchronization between child-mother pairs compared to child-stranger pairs, with this synchronization and neural maturation of the networks associated with the mother-child relationship and parenting quality.

      Strengths:<br /> This is a well-written paper, and using dyadic fMRI and naturalistic stimuli enhances its ecological validity, providing valuable insights into the dynamic interplay between brain development and social interactions.

      Weaknesses:<br /> The current sample size (N = 34 dyads) is a limitation, particularly given the use of SEM, which generally requires larger samples for stable results. Although the model fit appears adequate, this does not guarantee reliability with the current sample size.

    5. Author response:

      The following is the authors’ response to the original reviews

      Reviewer #1:

      The authors sought to examine the associations between child age, reports of parent-child relationship quality, and neural activity patterns while children (and also their parents) watched a movie clip. Major methodological strengths include the sample of 3-8 year-old children in China (rare in fMRI research for both age range and non-Western samples), use of a movie clip previously demonstrated to capture theory of mind constructs at the neural level, measurement of caregiver-child neural synchrony, and assessment of neural maturity. Results provide important new information about parent-child neural synchronization during this movie and associations with reports of parent-child relationship quality. The work is a notable advance in understanding the link between the caregiving context and the neural construction of theory of mind networks in the developing brain.

      We are grateful for the reviewer’s generous and thoughtful summary of our work. We particularly appreciate the recognition of the methodological strengths—including the rare developmental sample, culturally diverse context, and use of naturalistic, theory of mind-relevant stimuli—as well as the importance of integrating neural synchrony and relational variables. The reviewer’s comments affirm the core motivation behind this study: to advance our understanding of how the caregiving environment shapes the neurodevelopment of social cognition in early childhood. We have taken all specific suggestions seriously and hope the revised manuscript more clearly communicates these contributions.

      We appreciate that the authors wanted to show support for a mediational mechanism. However, we suggest that the authors drop the structural equation modeling because the data are cross-sectional, so mediation is not appropriate. Other issues include the weak justification for including the parent-child neural synchronization as part of parenting; it could just as easily be a mechanism of change or driven by the child rather than a component of parenting behavior. The paper would be strengthened by looking at associations between selected variables of interest that are MOST relevant to the imaging task in a regression type of model. Furthermore, the authors need to be more explicit about corrections for multiple comparisons throughout the manuscript; some of the associations are fairly weak so claims may need to be tempered if they don't survive correction.

      Thank you for the feedback on the use of SEM in our study. We recognize the limitations of using SEM to infer mediation with cross-sectional data and acknowledge that longitudinal designs are better suited for such analyses. However, our goal was not to establish causality but to explore potential pathways linking parenting, personal traits, and Theory of Mind (ToM) behavior to social cognition outcomes. SEM allowed us to simultaneously examine the relationships among these latent constructs, providing a cohesive framework for understanding the interplay of these factors. That said, we understand your concern and are willing to revise the manuscript to de-emphasize causal interpretations of the SEM findings.

      We thank the reviewer for raising the issue of correction for multiple comparisons. We confirm that all correlation analyses reported in the manuscript have been corrected for multiple comparisons using the False Discovery Rate (FDR) procedure. In the revised manuscript, we now explicitly indicate FDR correction for all relevant p-values to ensure clarity and transparency. Where this information was previously missing, we have corrected the oversight and clearly labeled the results as FDR-corrected or uncorrected where appropriate. Additionally, we have carefully reviewed our interpretation of all reported associations. For any results that were close to the significance threshold, we have tempered our claims and now describe them as marginally significant associations to avoid overstating our findings.

      The corresponding changes have been made in the Discussion section of the revised manuscript.
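
      For readers less familiar with FDR correction, the procedure referred to in this response can be sketched as follows. This is a minimal illustration that assumes the Benjamini-Hochberg step-up variant (the response does not say which FDR procedure was used); the function name `benjamini_hochberg` and the example p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return (adjusted p-values, reject flags) in the original input order.

    Assumes the Benjamini-Hochberg step-up procedure; this is an
    illustrative sketch, not the authors' analysis code.
    """
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    reject = [q <= alpha for q in adjusted]
    return adjusted, reject


# Hypothetical p-values, for illustration only.
p_demo = [0.001, 0.008, 0.039, 0.041, 0.20]
q_demo, keep_demo = benjamini_hochberg(p_demo)
for p, q, keep in zip(p_demo, q_demo, keep_demo):
    print(f"p = {p:.3f}  q = {q:.5f}  survives FDR: {keep}")
```

      In this made-up example, the third and fourth p-values fall below 0.05 before correction but not after it, which is the situation the reviewer had in mind when asking that weak associations be tempered.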

      Reverse correlation analysis is sensible given what prior developmental fMRI studies have done. But reverse correlation analysis may be more prone to overfitting and noise, and lacks sensitivity to multivariate patterns. Might inter-subject correlation be useful for *within* the child group? This would minimize noise and allow for non-linear patterns to emerge.

      We appreciate the reviewer’s thoughtful suggestion regarding potential limitations of reverse correlation analysis. While we agree that inter-subject correlation (ISC) within the child group may be useful in other contexts, our primary goal in using reverse correlation was not to identify temporally distributed or multivariate response patterns, but rather to isolate specific events within the naturalistic stimulus that reliably evoke Theory of Mind (ToM) and Social Pain-related responses in adults—who possess more stable and mature neural signatures. These adult-derived events serve as anchors for subsequent developmental comparisons and provide a principled way to define timepoints of interest that are behaviorally and theoretically meaningful.

      Using reverse correlation in adults allows us to identify canonical ToM and Social Pain events in a data-driven yet hypothesis-informed manner. We then examine how children’s neural responses to these same events vary with age, neural maturity, and dyadic synchrony. This approach is consistent with prior work in developmental social neuroscience (e.g., Richardson et al., 2018) and offers a valid framework for identifying interpretable social-cognitive events in naturalistic stimuli.

      We have now clarified the rationale for using adult-based reverse correlation in the revised manuscript and explicitly stated its advantages for identifying targeted ToM and Social Pain content in the stimulus.

      The corresponding changes have been made on page 17 of the revised manuscript.

      “We employed reverse correlation analysis in adults to identify discrete events within the movie that elicited reliable neural responses across participants in ToM and SPM networks.

      The events of adults were chosen for this analysis due to the relative stability and maturity of their social brain responses, allowing for robust detection of canonical ToM and social pain-related moments. These events, once identified, served as stimulus-locked timepoints for subsequent analyses in the child cohort. This approach enables us to examine how children's responses to well-characterized, socially meaningful events vary with age and parent-child dyadic dynamics.”
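
      The general logic of the reverse correlation step described above can be sketched in code. This is a simplified illustration, not the authors' pipeline: it assumes each adult contributes one averaged time course per network, that each time course is z-scored, that each timepoint is tested against zero with a one-sample t statistic, and that an "event" is a run of at least two consecutive timepoints above threshold. The threshold `t_threshold`, the run length `min_len`, the function names, and the toy data are all assumptions made for illustration.

```python
import math
from statistics import mean, stdev


def zscore(series):
    """Standardize one participant's time course to mean 0 and SD 1."""
    mu, sd = mean(series), stdev(series)
    return [(x - mu) / sd for x in series]


def reverse_correlation_events(timecourses, t_threshold=2.0, min_len=2):
    """Return (start, end) index pairs where the group response is reliably above zero.

    timecourses: one list of signal values per participant, all the same length.
    t_threshold and min_len are illustrative assumptions, not published settings.
    """
    z = [zscore(tc) for tc in timecourses]
    n_subj, n_time = len(z), len(z[0])
    flagged = []
    for t in range(n_time):
        values = [subj[t] for subj in z]
        mu, sd = mean(values), stdev(values)
        if sd > 0:
            # One-sample t statistic across participants at this timepoint.
            t_stat = mu / (sd / math.sqrt(n_subj))
        else:
            # Identical values across participants: treat a positive mean as maximal.
            t_stat = math.inf if mu > 0 else 0.0
        flagged.append(t_stat > t_threshold)
    events, start = [], None
    # The trailing False acts as a sentinel that closes a run ending at the last timepoint.
    for t, is_on in enumerate(flagged + [False]):
        if is_on and start is None:
            start = t
        elif not is_on and start is not None:
            if t - start >= min_len:
                events.append((start, t - 1))
            start = None
    return events


# Toy data: four hypothetical participants sharing a response peak at timepoints 4-5.
toy = [
    [0, 1, 0, 1, 5, 6, 0, 1, 0, 1],
    [1, 0, 1, 0, 6, 5, 1, 0, 1, 0],
    [0, 0, 1, 1, 5, 5, 0, 0, 1, 1],
    [1, 1, 0, 0, 6, 6, 1, 1, 0, 0],
]
print(reverse_correlation_events(toy))
```

      Timepoints returned by a procedure of this kind in adults would then serve as the stimulus-locked windows in which children's responses are examined.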

      No learning effects or temporal lagged effects are tested in the current study, so the results do not support the authors' conclusions that the data speak to Bandura's social learning theory. The authors do mention theories of biobehavioral synchrony in the introduction but do not discuss this framework in the discussion (which is most directly relevant to the data). The data can also speak to other neurodevelopmental theories of development (e.g.,neuroconstructivist approaches), but the authors do not discuss them. The manuscript would benefit from significantly revising the framework to focus more on biobehavioral synchrony data and other neurodevelopmental approaches given the prior work done in this area rather than a social psychology framework that is not directly evaluated.

      We appreciate the reviewer’s thoughtful and constructive feedback. We agree that the current study does not directly test mechanisms central to Bandura’s social learning theory, such as observational learning over time or behavioral modeling. In light of this, we have significantly revised the theoretical framing of the manuscript to focus more directly on the biobehavioral synchrony framework, which more accurately reflects the dyadic neural measures employed in this study and is better supported by our findings.

      Specifically, we have expanded the Discussion to contextualize our findings in terms of biobehavioral synchrony, emphasizing how inter-subject neural synchronization may reflect coordinated parent-child engagement and emotional attunement. We have also incorporated insights from neurodevelopmental and neuroconstructivist models, acknowledging that social cognitive development is shaped by dynamic interactions between neural maturation and environmental input over time.

      Although we continue to briefly reference Bandura’s theory to situate our findings within broader social-cognitive frameworks, we have clearly delineated the boundaries of what our data can support and have tempered previous claims. These changes are intended to better align our conceptual framing with the empirical evidence and relevant theoretical models.

      The corresponding changes have been made on pages 11-12 of the revised manuscript.

      “Insights into mechanisms of Neuroconstructivist Perspectives and Bandura’s social learning theory

      Our findings align with a neuroconstructivist perspective, which conceptualizes brain development as an emergent outcome of reciprocal interactions between biological constraints and context-specific environmental inputs. Rather than presuming fixed traits or linear maturation, this perspective highlights how neural circuits adaptively organize in response to experience, gradually supporting increasingly complex cognitive functions49. It offers a particularly powerful lens for understanding how early caregiving environments modulate the maturation of social brain networks.

      Building on this framework, the present study reveals that moment-to-moment neural synchrony between parent and child, especially during emotionally salient or socially meaningful moments, is associated with enhanced Theory of Mind performance and reduced dyadic conflict. This suggests that beyond age-dependent neural maturation, dyadic neural coupling may serve as a relational signal, embedding real-time interpersonal dynamics into the child’s developing neural architecture [1]. Our data demonstrate that children’s brains are not merely passively maturing, but are also shaped by the relational texture of their lived experiences, particularly interactions characterized by emotional engagement and joint attention. Importantly, this adds a new dimension to neuroconstructivist theory: it is not simply whether the environment shapes development, but how the quality of interpersonal input dynamically calibrates neural specialization. Interpersonal variation leaves detectable signatures in the brain, and our use of neural synchrony as a dyadic metric illustrates one potential pathway through which caregiving relationships exert formative influence on the developing social brain.

      The contribution of this work lies not in reiterating the interplay of nature and nurture, but in specifying the mechanistic role of interpersonal neural alignment as a real-time, context-sensitive developmental input. Neural synchrony between parent and child may function as a form of relationally grounded, temporally structured experience that tunes the child’s social brain toward contextually relevant signals. Unlike generalized enrichment, this form of neural alignment is inherently personalized and contingent—features that may be especially potent in shaping social cognitive circuits during early childhood.

      Although our study was not designed to directly examine learning mechanisms such as imitation or reinforcement, the findings can be viewed as broadly consistent with social learning theory. Bandura's theory posits that human behavior is shaped by observational learning and modeling from others in one's environment [2-4]. According to Bandura, children acquire social cognitive skills by observing and interacting with their parents and other significant figures in their environment. This dynamic interplay shapes their ability to understand and predict the behavior of others, which is crucial for the development of ToM and other social competencies.”

      References

      (1) Hughes, C. et al. Origins of individual differences in theory of mind: From nature to nurture? Child development 76, 356-370 (2005).

      (2) Koole, S. L. & Tschacher, W. Synchrony in psychotherapy: A review and an integrative framework for the therapeutic alliance. Frontiers in psychology 7, 862 (2016).

      (3) Liu, D., Wellman, H. M., Tardif, T. & Sabbagh, M. A. Theory of mind development in Chinese children: a meta-analysis of false-belief understanding across cultures and languages. Developmental Psychology 44, 523 (2008).

      (4) Frith, U. & Frith, C. D. Development and neurophysiology of mentalizing. Philosophical Transactions of the Royal Society of London. Series B: Biological Sciences 358, 459-473 (2003).

      The significance and impact of the findings would be clearer if the authors more clearly situated the findings in the context of (a) other movie and theory of mind fMRI task data during development; and (b) existing data on parent-child neural synchrony (often uses fNIRS or EEG). What principles of brain and social cognition development do these data speak to? What is new?

      We thank the reviewer for this thoughtful comment. In response, we have revised the Discussion section to more clearly situate our findings within two key literatures: (a) fMRI studies examining Theory of Mind using movie-based and traditional task paradigms across development, and (b) research on parent-child neural synchrony. We now articulate more explicitly how our findings advance current understanding of the neural architecture of social cognition in childhood, and how they contribute new insights into the relational processes shaping brain function. These revisions clarify the conceptual and empirical novelty of our study, particularly in its use of naturalistic fMRI, simultaneous child-parent dyads, and integration of neural maturity with interpersonal synchrony.

      The corresponding changes have been made on page 12 of the revised manuscript.

      “Our findings contribute to and extend prior research using fMRI paradigms to investigate ToM development in children.  Previous work has shown that these networks become increasingly specialized and differentiated throughout childhood [1-3]. The current study extends these findings by demonstrating that the development of social brain networks is a gradual process that continues beyond the preschool years and is related to children's chronological age. This finding is consistent with behavioral research indicating that ToM and social abilities continue to develop and refine throughout middle childhood and adolescence [4]. Importantly, we move beyond prior work by combining reverse correlation with naturalistic stimuli to isolate discrete, behaviorally meaningful events (e.g., mental state attribution, social rejection) and relate children’s brain responses to adult patterns and social outcomes. This event-level analysis in a dyadic context offers greater ecological and interpretive precision than traditional block or condition-based designs. Our study provides novel evidence for the neural underpinnings of this protracted development, suggesting that the functional maturation of social brain networks may support the continued acquisition and refinement of social cognitive skills.

      In parallel, our study builds on and extends a growing body of work on parent-child neural synchrony, much of which has relied on fNIRS or EEG hyperscanning to demonstrate interpersonal alignment during communication, shared attention, or cooperative tasks [5-7]. While these modalities offer fine temporal resolution, they are limited in spatial precision and typically focus on surface-level cortical regions such as the prefrontal cortex. By contrast, our naturalistic fMRI approach enables the examination of deep and distributed brain networks—specifically those supporting social cognition—within child-parent dyads during emotionally and cognitively rich scenarios. Intriguingly, we found that neural synchronization during movie viewing was higher in child-mother dyads compared to child-stranger dyads.”

      References

      (1) Jacoby, N., Bruneau, E., Koster-Hale, J. & Saxe, R. Localizing Pain Matrix and Theory of Mind networks with both verbal and non-verbal stimuli. Neuroimage 126, 39-48 (2016).

      Astington, J. W. & Jenkins, J. M. A longitudinal study of the relation between language and theory-of-mind development. Developmental Psychology 35, 1311 (1999).

      (2) Carter, E. J. & Pelphrey, K. A. School-aged children exhibit domain-specific responses to biological motion. Social Neuroscience 1, 396-411 (2006).

      (3) Cantlon, J. F., Pinel, P., Dehaene, S. & Pelphrey, K. A. Cortical representations of symbols, objects, and faces are pruned back during early childhood. Cerebral Cortex 21, 191-199 (2011).

      (4) Im-Bolter, N., Agostino, A. & Owens-Jaffray, K. Theory of mind in middle childhood and early adolescence: Different from before? Journal of experimental child psychology 149, 98-115 (2016).

      (5) Deng, X. et al. Parental involvement affects parent-adolescents brain-to-brain synchrony when experiencing different emotions together: an EEG-based hyperscanning study. Behavioural brain research 458, 114734 (2024).

      (6) Miller, J. G. et al. Inter-brain synchrony in mother-child dyads during cooperation: an fNIRS hyperscanning study. Neuropsychologia 124, 117-124 (2019).

      (7) Nguyen, T., Bánki, A., Markova, G. & Hoehl, S. Studying parent-child interaction with hyperscanning. Progress in brain research 254, 1-24 (2020).

      There is little discussion about the study limitations, considerations about the generalizability of the findings, and important next steps and future directions. What can the data tell us, and what can it NOT tell us?

      We appreciate the reviewer’s recommendation to elaborate on the study’s limitations, generalizability, and future directions. In response, we have added a dedicated section to the Discussion that critically addresses these considerations. We acknowledge the cross-sectional nature of the study, the modest sample size, and the use of a single stimulus context as key limitations. We also clarify the inferences that can be drawn from our data and what remains speculative. Finally, we outline specific future research directions.

      The corresponding changes have been made on pages 13-14 of the revised manuscript.

      “While leveraging a naturalistic movie-viewing paradigm allowed us to study children's spontaneous neural responses during a semi-structured yet engaging task, dedicated experimental designs are still needed to make stronger inferences about the cognitive processes involved. Additionally, our region-of-interest approach precluded examination of whole-brain networks; future work could explore developmental changes in broader functional circuits. The cross-sectional nature of our study is a further limitation, as it cannot definitively establish the causal directions of the observed relationships. Longitudinal designs tracking children's brain development and social cognitive abilities over time would help clarify whether early parenting impacts later neural maturation and behavioral outcomes, or vice versa. Our sample was restricted to mother-child dyads, leaving open questions about potential differences in father-child relationships and gender effects on parenting neurobiology. Larger and more diverse samples would enhance the generalizability of the findings.

      Several future directions emerge from this research. First, combining naturalistic neuroimaging with structured cognitive tasks could elucidate the specific mental processes underlying children's neural responses during movie viewing. Examining how these processes relate to real-world social behavior would further bridge neurocognitive function and ecological validity. Longitudinal studies beginning in infancy could chart the developmental trajectories of parent-child neural synchrony and their impact on long-term social outcomes. Such work could also explore sensitive periods when parenting may be most influential on social brain maturation. Finally, expanding this multimodal approach to clinical populations like autism could yield insights into atypical social cognitive development and inform tailored intervention strategies targeting parent-child relationships and neural plasticity.”

      To evaluate associations between child neural activity patterns during the movie AND parent-child synchronization patterns AND other variables such as parent-child communication and theory of mind behavior, it seems like a robust approach could be to examine whether similar synchronization patterns are associated with similar scores on different variables. Would allow for non-linear and multivariate associations.

      We greatly appreciate the reviewer’s thoughtful suggestion regarding the use of similarity-based or multivariate analyses to assess whether dyads with similar neural synchronization profiles also exhibit similar scores on behavioral or relational variables. We agree that this type of analysis—such as representational similarity analysis (RSA) or inter-subject pattern similarity—offers a powerful framework for capturing non-linear and multivariate associations, and could provide deeper insights into shared neurobehavioral patterns across participants. However, the analytic logic of similarity-based approaches typically requires the availability of comparable measures across individuals or dyads (e.g., child A and child B must both have measures of brain activity, behavior, and environment). In the present study, our focus was on the child as the behavioral and developmental target, and we did not collect parallel behavioral or cognitive variables from the parent side (e.g., adult Theory of Mind ability, emotional traits, parenting style questionnaires beyond dyadic reports). As a result, it was not feasible to construct pairwise similarity matrices across dyads that include both neural synchrony and matched behavioral dimensions from both individuals.

      Instead, our study was designed to examine how child-level outcomes (e.g., Theory of Mind performance, social functioning) are associated with (a) the child’s neural responses to specific social events, and (b) the degree of neural synchronization with their mother, as a marker of relational engagement. The analytical emphasis, therefore, remained on within-child variation, modulated by the quality of the parent-child interaction.

      Were there associations between parent-child neural synchronization and child age? What was the association between neural maturity and parent-child neural synchronization?

      We thank the reviewer for raising this important point regarding associations between parent-child neural synchronization (ISS), child age, and neural maturity.

      As reported in the original manuscript, we did not observe significant correlations between parent-child ISS and child age for either the Theory of Mind (ToM) or Social Pain Matrix (SPM) networks (all ps > 0.1). In an additional analysis, we also found no significant correlation between ISS and neural maturity (Author response image 1, r = 0.2503, p = 0.1533).

      These findings indicate that parent-child neural synchronization in this naturalistic viewing context is not simply explained by age-related maturation or children's neural maturity level. Instead, ISS may predominantly reflect real-time interpersonal engagement or relational dynamics rather than individual developmental trajectories or neural maturity.

      Author response image 1.

      Scatterplot showing the association between parent-child inter-subject synchronization (ISS) and neural maturity, averaged across the Theory of Mind (ToM) and Social Pain Matrix (SPM) networks. Each point represents one dyad. No significant correlation was observed between ISS and neural maturity (r = 0.2503, p = 0.1533), suggesting that interpersonal neural synchronization and individual neural maturation may reflect dissociable aspects of social brain development.
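
      To put the reported correlation in context, the short sketch below computes the t statistic and an approximate 95% confidence interval for r = 0.2503. It assumes n = 34 dyads, the sample size mentioned in the reviews; the response does not state how many dyads entered this particular analysis, so that figure is an assumption, as are the function names.

```python
import math


def t_statistic(r, n):
    """t statistic for testing a Pearson r against zero (n - 2 degrees of freedom)."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% confidence interval for a Pearson r via Fisher's z-transform."""
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# r is the value reported above; n = 34 is assumed from the sample size
# given in the reviews, not confirmed for this specific analysis.
r_reported, n_assumed = 0.2503, 34
t_val = t_statistic(r_reported, n_assumed)
low, high = fisher_ci(r_reported, n_assumed)
print(f"t({n_assumed - 2}) = {t_val:.2f}")
print(f"approximate 95% CI for r: [{low:.2f}, {high:.2f}]")
```

      Under that assumption, the t statistic is about 1.46, below the two-tailed critical value of roughly 2.04 for 32 degrees of freedom, and the interval runs from about -0.10 to 0.54. The non-significant result is therefore compatible with anything from no association to a moderate positive one, so it is better read as an absence of evidence for an association than as evidence that the two measures are unrelated.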

      The rationale for splitting the ages into 3 groups is unclear and creates small groups that could be more prone to spurious associations. Why not look at age continuously?

      We thank the reviewer for raising this important point. We fully agree that analyzing age as a continuous variable is statistically more robust and minimizes concerns about spurious associations due to arbitrary groupings.

      To clarify, all primary statistical models—including correlational analyses—treated age as a continuous variable, and our core developmental inferences are based on these continuous-age findings.

      In addition to these analyses, we included age group comparisons as a supplementary approach, guided by both theoretical considerations and visual inspection of the data. Specifically, we aimed to explore whether functional differentiation between social brain networks (e.g., ToM and SPM) might begin to emerge non-linearly or earlier than expected, particularly in the youngest children. Such early neural divergence may not be well-captured by linear trends alone. The grouped analysis allowed us to illustrate that network differentiation was already observable in children under age 5, suggesting that certain aspects of social brain organization may emerge earlier than classically assumed.

      We have now clarified this rationale in the revised manuscript and emphasized that the group-based analysis was used solely to highlight developmental shifts that may not follow a linear pattern, and not for formal hypothesis testing.

      The corresponding changes have been made on page 9 of the revised manuscript.

      “While our primary analyses treated age as a continuous variable, we also performed exploratory group-based comparisons to probe for potential non-linear developmental shifts in social brain network organization. This approach revealed that the differentiation between ToM and SPM networks was already present in the youngest group (ages 3–4), suggesting that early neural specialization may begin prior to the age at which ToM behavior is reliably observed. These group-level observations provide complementary evidence to the continuous analyses and may inform future work examining sensitive periods or early markers of social brain development.”

      Tables would be improved if they were more professionally formatted (e.g., names of the variables rather than variable abbreviation codes).

      We appreciate the reviewer’s suggestion to improve the clarity and professionalism of our tables. In the revised manuscript, we have reformatted all tables to include full variable names rather than abbreviations or coded labels, and we ensured consistency in terminology across the manuscript text, tables, and figure legends. We have also added explanatory footnotes where needed to clarify any derived or composite measures. We hope these revisions improve the accessibility and readability of the results for a broader audience.

      Reviewer #2:

      Summary:

      This study investigates the impact of mother-child neural synchronization and the quality of parent-child relationships on the development of Theory of Mind (ToM) and social cognition. Utilizing a naturalistic fMRI movie-viewing paradigm, the authors analyzed inter-subject neural synchronization in mother-child dyads and explored the connections between neural maturity, parental caregiving, and social cognitive outcomes. The findings indicate age-related maturation in ToM and social pain networks, emphasizing the importance of dyadic interactions in shaping ToM performance and social skills, thereby enhancing our understanding of the environmental and intrinsic influences on social cognition.

      Strengths:

      This research addresses a significant question in developmental neuroscience, by linking social brain development with children's behaviors and parenting. It also uses a robust methodology by incorporating neural synchrony measures, naturalistic stimuli, and a substantial sample of mother-child dyads to enhance its ecological validity. Furthermore, the SEM approach provides a nuanced understanding of the developmental pathways associated with Theory of Mind (ToM).

      We appreciate the reviewer's positive evaluation and valuable comments. In response, we have revised the manuscript thoroughly to address the concerns raised and have prepared a point-by-point response to each issue. We believe the revised manuscript is now significantly improved.

      Upon reviewing the introduction, I feel that the first goal - developmental changes of the social brain and its relation to age - seems somewhat distinct from the other two goals and the main research question of the manuscript. The authors might consider revising this section to enhance the overall coherence of the manuscript. Additionally, the introduction lacks a clear background and rationale for the importance of examining age-related changes in the social brain.

      We thank the reviewer for this thoughtful observation. In response, we have revised the Introduction to better integrate the developmental aspect of the social brain with the broader research aims. We now explicitly link age-related changes in social brain organization to the emergence of social cognitive abilities and highlight why early childhood (ages 3–8) represents a particularly formative period. This revision clarifies that our first aim—examining functional specialization and neural maturity in Theory of Mind (ToM) and Social Pain Matrix (SPM) networks—serves as a developmental foundation for understanding how dyadic influences, such as neural synchrony and caregiving quality, shape children’s social cognition.

      We have also improved the rationale for examining age-related change, drawing on key literature in developmental neuroscience to show how the early emergence and specialization of social brain networks provide a necessary context for interpreting interpersonal neural dynamics.

      The corresponding changes have been made on page 3 of the revised manuscript.

      “These findings suggest that the development of specialized brain regions for reasoning about others' mental states and physical sensations is a gradual process that continues throughout childhood.

      Understanding how these networks differentiate with age is essential not only for mapping typical brain development, but also for contextualizing the role of environmental influences. By establishing normative patterns of neural maturity and differentiation, we can better interpret how relational experiences—such as caregiver-child synchrony and parenting quality—modulate these trajectories. Thus, our first goal provides a developmental anchor that grounds our investigation of interpersonal and environmental contributions to social brain function.”

      The manuscript uses both "mother-child" and "parent-child" terminology. Does this imply that only mothers participated in the fMRI scans while fathers completed the questionnaires? If so, have the authors considered the potential impact of parental roles (father vs. mother)?

      We thank the reviewer for raising this important point regarding terminology and parental roles. To clarify, all participating caregivers in the current study were biological mothers, and all behavioral questionnaires were also completed by these same mothers. No fathers were included in this study. We have revised the manuscript throughout to consistently use the term “mother-child” when referring to the specific dyads in our sample.

      We also appreciate the opportunity to elaborate on the rationale for including only mothers. Prior research has shown that maternal and paternal influences on child development are not interchangeable, and that the neural correlates of caregiving behaviors differ between mothers and fathers. For example, studies have demonstrated distinct patterns of brain activation during social and emotional processing in mothers versus fathers (Abraham et al., 2014; Swain et al., 2014). Given these differences, we deliberately focused on mother-child dyads to maintain neurobiological consistency in our analysis and reduce variance associated with heterogeneous caregiving roles. We now clarify this rationale in the revised Methods and Discussion sections.

      The corresponding changes have been made on page 14 of the revised manuscript.

      “We chose to focus exclusively on mother-child dyads in this study based on prior evidence suggesting distinct neural and behavioral caregiving profiles between mothers and fathers [1-2], allowing us to maintain role consistency and reduce variability in dyadic interactions.

      Our sample was restricted to mother-child dyads, leaving open questions about potential differences in father-child relationships and gender effects on parenting neurobiology [1]. Larger and more diverse samples would enhance the generalizability of the findings.”

      Reference:

      (1) Swain, J. E. et al. Approaching the biology of human parental attachment: Brain imaging, oxytocin and coordinated assessments of mothers and fathers. Brain research 1580, 78-101 (2014).

      (2) Abraham, E. et al. Father's brain is sensitive to childcare experiences. Proceedings of the National Academy of Sciences 111, 9792-9797 (2014).

      There is inconsistent usage of the terms ISC and ISS in the text and figures, both of which appear to refer to synchronization derived from correlation analysis. It would be beneficial to maintain consistency throughout the manuscript.

      We thank the reviewer for highlighting the inconsistent use of “ISC” and “ISS” in the original manuscript. We agree that clarity and consistency in terminology are essential. In response, we have revised the manuscript to consistently use “ISS” (inter-subject synchronization) throughout the text, figures, tables, and legends.

      Of the 50 dyads, 16 were excluded due to data quality issues, which constitutes a significant proportion. It would be helpful to know whether these excluded dyads exhibited any distinctive characteristics. Providing information on demographic or behavioral differences between the excluded and included dyads, such as Theory of Mind (ToM) performance and age range, would enhance the assessment of the findings' generalizability.

      We thank the reviewer for this important observation. We agree that understanding the characteristics of excluded participants is essential for assessing the generalizability of the findings.

      In response, we conducted comparative analyses between included and excluded dyads (N = 34 included; N = 16 excluded) on key demographic and behavioral variables, including child age, gender, and Theory of Mind (ToM) performance. These analyses revealed no significant differences between groups on any of these measures (ps > 0.1), suggesting that data exclusion due to quality issues (e.g., excessive motion, incomplete scans) did not introduce systematic bias.

      We have now added this information to the Results and Methods sections of the manuscript.

      The corresponding changes have been made on pages 6 and 17 of the revised manuscript.

      “Of the 50 initial mother-child dyads recruited, 16 were excluded due to excessive head motion (n = 11), incomplete scan sessions (n = 3), or technical issues during data acquisition (n = 2). The final sample consisted of 34 dyads. To assess potential bias introduced by data exclusion, we compared included and excluded dyads on child age, gender, and Theory of Mind performance. No significant differences were found across these variables (all ps > 0.1), suggesting that the analytic sample was demographically representative of the full cohort.

      Comparison between included and excluded dyads revealed no significant differences in child age (t = 1.23, p = 0.24), ToM scores (t = -0.54, p = 0.59), or sex distribution (χ² < 0.01, p = 0.98), indicating that data exclusion did not bias the sample in a systematic way.”
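
      As an illustration of the kind of included-versus-excluded comparison described above, the following is a minimal Python sketch. It assumes Welch's unequal-variance t statistic and an uncorrected Pearson chi-square for a 2x2 table; the response does not state which test variants were used, and the function names and example values are hypothetical, not study data.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    Assumption: unequal-variance (Welch) form; the source does not say
    which t-test variant the authors used.
    """
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def chi2_2x2(table):
    """Pearson chi-square statistic for a 2x2 table, without continuity correction."""
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    return stat


# Hypothetical toy values, NOT the study data.
included_age = [1.0, 2.0, 3.0]
excluded_age = [2.0, 3.0, 4.0]
t_stat, dof = welch_t(included_age, excluded_age)

# Rows: included / excluded; columns: girls / boys (hypothetical counts).
sex_stat = chi2_2x2([[10, 10], [5, 5]])
```

      Converting these statistics to p-values requires the t and chi-square distributions, which a statistics package would normally supply; that step is omitted here.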

      The article does not adhere to the standard practice of using a resting state as a baseline for subtracting from task synchronization. Is there a rationale for this approach? Not controlling for a baseline may lead to issues, such as whether resting state synchronization already differs between subjects with varying characteristics.

      We thank the reviewer for raising this important methodological point. We agree that controlling for baseline synchronization, such as using a resting-state scan as a comparison, can help disambiguate whether task-induced synchrony reflects genuine stimulus-driven coupling or baseline differences across individuals or dyads.

      In the present study, we focused on inter-subject synchronization (ISS) during naturalistic movie viewing, a task condition that has been widely used in previous developmental and social neuroscience research to assess shared neural engagement. We did not include a resting-state scan in the current protocol due to time constraints and the young age of our participants (ages 3–8), as longer scanning sessions often result in increased motion and reduced data quality in pediatric populations. Moreover, many prior studies using ISS in naturalistic paradigms have similarly focused on task-driven synchrony without subtracting a resting baseline (e.g., Hasson et al., 2004; Nguyen et al., 2020; Reindl et al., 2018).

      That said, we acknowledge that baseline neural synchrony across dyads may vary depending on individual or relational characteristics (e.g., temperament, arousal, attentional style), and this remains an important question for future research. In the revised Discussion, we now explicitly note the absence of a resting-state baseline as a limitation and highlight the need for future studies to examine how resting and task-based ISS may interact, particularly in the context of child-caregiver dyads.

      The corresponding changes have been made on page 13 of the revised manuscript.

      “Another limitation of the current design is the lack of a resting-state baseline for inter-subject synchronization. While our focus was on synchronization during naturalistic social processing, we cannot determine whether individual differences in ISS reflect purely task-induced coupling or are partially shaped by trait-level synchrony present at rest. Including both resting and task conditions in future work would allow for stronger inferences about stimulus-specific versus baseline-driven synchronization, especially in relation to interpersonal factors such as relationship quality or social responsiveness.”
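
      The baseline correction the reviewer describes can be sketched as follows. This is a minimal Python illustration that assumes ISS is a Pearson correlation between a mother's and a child's regional time series and that a resting-state recording is available; the present study collected no resting-state scan, so the function names and toy series are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def baseline_corrected_iss(task_mother, task_child, rest_mother, rest_child):
    """Task synchrony minus resting synchrony for one dyad (hypothetical helper)."""
    return pearson(task_mother, task_child) - pearson(rest_mother, rest_child)


# Hypothetical toy time series, NOT study data.
corrected = baseline_corrected_iss(
    task_mother=[1, 2, 3, 4],
    task_child=[2, 4, 6, 8],
    rest_mother=[1, 2, 3, 4],
    rest_child=[1, 3, 2, 4],
)
```

      A positive value would indicate synchrony during movie viewing beyond what the dyad shows at rest. In practice, correlations are often Fisher z-transformed before subtraction, which this sketch does not do.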

      The title of the manuscript suggests a direct influence of mother-child interactions on children's social brain and theory of mind. However, the use of structural equation modeling (SEM) may not fully establish causal relationships. It is possible that the development of children's social brain and ToM also enhances mother-child neural synchronization. The authors should address this alternative hypothesis of the potential bidirectional relationship in the discussion and exercise caution regarding terms that imply causality in the title and throughout the manuscript.

      We appreciate the reviewer’s careful attention to issues of causality in our manuscript. We agree that our cross-sectional design limits causal inference, and that the use of structural equation modeling (SEM) in this context does not allow for conclusions about directional or mechanistic pathways. In response, we have revised the Discussion to explicitly acknowledge these limitations, and now include an expanded section on the potential for bidirectional or co-constructed processes, consistent with neuroconstructivist frameworks.

      We have also tempered the interpretation of our SEM findings, avoiding causal language throughout the manuscript and clarifying that our analyses are exploratory and associational in nature. We hope that these changes provide a more cautious and developmentally grounded interpretation of the data.

      With regard to the title, we respectfully chose to retain the original wording, as we believe it captures the thematic focus and central research question of the paper—namely, the potential role of mother-child interaction in the development of children’s social brain and Theory of Mind. While we understand the reviewer’s concern, we note that the interpretation of this phrasing is contextualized within the manuscript, which now includes clear qualifications regarding the limits of causal inference. We have taken care to ensure that no claims of unidirectional causality are made in the body of the paper.

      The corresponding changes have been made on pages 11-12 of the revised manuscript.

      “Our findings align with a neuroconstructivist perspective, which conceptualizes brain development as an emergent outcome of reciprocal interactions between biological constraints and context-specific environmental inputs. Rather than presuming fixed traits or linear maturation, this perspective highlights how neural circuits adaptively organize in response to experience, gradually supporting increasingly complex cognitive functions [54]. It offers a particularly powerful lens for understanding how early caregiving environments modulate the maturation of social brain networks.

      Building on this framework, the present study reveals that moment-to-moment neural synchrony between parent and child, especially during emotionally salient or socially meaningful moments, is associated with enhanced Theory of Mind performance and reduced dyadic conflict. This suggests that beyond age-dependent neural maturation, dyadic neural coupling may serve as a relational signal, embedding real-time interpersonal dynamics into the child’s developing neural architecture. Our data demonstrate that children’s brains are not merely passively maturing, but are also shaped by the relational texture of their lived experiences—particularly interactions characterized by emotional engagement and joint attention. Importantly, this adds a new dimension to neuroconstructivist theory: it is not simply whether the environment shapes development, but how the quality of interpersonal input dynamically calibrates neural specialization. Interpersonal variation leaves detectable signatures in the brain, and our use of neural synchrony as a dyadic metric illustrates one potential pathway through which caregiving relationships exert formative influence on the developing social brain.

      The contribution of this work lies not in reiterating the interplay of nature and nurture, but in specifying the mechanistic role of interpersonal neural alignment as a real-time, context-sensitive developmental input. Neural synchrony between parent and child may function as a form of relationally grounded, temporally structured experience that tunes the child’s social brain toward contextually relevant signals. Unlike generalized enrichment, this form of neural alignment is inherently personalized and contingent—features that may be especially potent in shaping social cognitive circuits during early childhood.

      The cross-sectional nature of our study is a further limitation, as it cannot definitively establish the causal directions of the observed relationships. Longitudinal designs tracking children's brain development and social cognitive abilities over time would help clarify whether early parenting impacts later neural maturation and behavioral outcomes, or vice versa.”

      I would appreciate more details about the 14 Theory of Mind (ToM) tasks, which could be included in supplemental materials. The authors score them on a scale from 0 to 14 (each task 1 point); however, the tasks likely vary in difficulty and should carry different weights in the total score (for example, the test and the control questions should have different weights). Many studies have utilized the seven tasks according to Wellman and Liu (2004), categorizing them into "basic ToM" and "advanced ToM." Different components of ToM could influence the findings of the current study, which should be further examined by a more in-depth analysis.

      We thank the reviewer for raising this important point regarding the structure and scoring of the Theory of Mind (ToM) tasks. We will provide a detailed description of all 14 tasks in the Supplemental Materials, including their content, targeted mental state concepts (e.g., beliefs, desires, intentions), and design features (e.g., test/control items, task format).

      We fully agree that ToM tasks differ in complexity, and in principle, a weighted or component-based scoring approach (e.g., distinguishing basic and advanced ToM) could offer greater interpretive value. However, in our study design, tasks were administered in a fixed sequence from lower to higher difficulty, and testing was terminated if the child was unable to successfully complete three consecutive tasks. This approach is developmentally appropriate for younger children but results in non-random missingness for more advanced tasks—particularly among children at the lower end of the age range (3–4 years).

      Given this adaptive task structure, re-scoring using weighted or subscale-based approaches would introduce systematic bias, as children who struggled with early items were not administered more complex ones. As a result, a full breakdown by task type (e.g., basic vs. advanced ToM) would only reflect a restricted subsample and would not be comparable across the full cohort. For this reason, we retained the unit-weighted total ToM score as the most developmentally valid and comparable metric across participants.
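
      The stopping rule and its effect on subscale scoring can be illustrated with a short Python sketch. It assumes 14 tasks in a fixed easy-to-hard order, termination after three consecutive failures, and one point per passed task, as described above; the split into the first seven tasks ("basic") and last seven ("advanced") is purely hypothetical, as are the function names and the example child.

```python
def administer(would_pass, stop_after=3):
    """Apply the stopping rule to a fixed task sequence.

    would_pass: booleans in administration order (what the child would
    answer if given each task). Returns True/False for administered
    tasks and None for tasks never administered.
    """
    record = []
    consecutive_failures = 0
    for passed in would_pass:
        if consecutive_failures >= stop_after:
            record.append(None)  # testing already terminated
            continue
        record.append(passed)
        consecutive_failures = 0 if passed else consecutive_failures + 1
    return record


def unit_weighted_score(record):
    """One point per passed task; unadministered tasks contribute nothing."""
    return sum(1 for r in record if r is True)


# Hypothetical child: passes two tasks, then fails three in a row.
child = [True, True, False, False, False] + [True] * 9
record = administer(child)
total = unit_weighted_score(record)

# Hypothetical 7/7 split into "basic" and "advanced" subscales.
advanced = record[7:]
advanced_administered = [r for r in advanced if r is not None]
```

      For this child the total score is defined, but no advanced item was administered, so an advanced subscale score would be missing rather than zero. This is the non-random missingness the response refers to.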

      Reviewer #3:

      Summary:

      The article explores the role of mother-child interactions in the development of children's social cognition, focusing on Theory of Mind (ToM) and Social Pain Matrix (SPM) networks. Using a naturalistic fMRI paradigm involving movie viewing, the study examines relationships among children's neural development, mother-child neural synchronization, and interaction quality. The authors identified a developmental pattern in these networks, showing that they become more functionally distinct with age. Additionally, they found stronger neural synchronization between child-mother pairs compared to child-stranger pairs, with this synchronization and neural maturation of the networks associated with the mother-child relationship and parenting quality.

      Strengths:

      This is a well-written paper, and using dyadic fMRI and naturalistic stimuli enhances its ecological validity, providing valuable insights into the dynamic interplay between brain development and social interactions. However, I have some concerns regarding the analysis and interpretation of the findings. I have outlined these concerns below in the order they appear in the manuscript, which I hope will be helpful for the revision.

      We appreciate the reviewer’s thoughtful and constructive summary of the manuscript. The concerns raised regarding aspects of the analysis and interpretation have been carefully considered. Detailed point-by-point responses are provided below, along with descriptions of the corresponding revisions made to improve the clarity, precision, and interpretive caution of the manuscript.

      Given the importance of social cognition in this study, please cite a foundational empirical or review paper on social cognition to support its definition. The current first citation is primarily related to ASD research, which may not fully capture the broader context of social cognition development.

      We thank the reviewer for this helpful suggestion. We agree that a broader, foundational reference is more appropriate for introducing the concept of social cognition. In response, we have revised the Introduction to include a widely cited theoretical or review paper on social cognition to provide a more general developmental context.

      The corresponding changes have been made on page 3 of the revised manuscript.

      “Social cognition, defined as the ability to interpret and predict others' behavior based on their beliefs and intentions and to interact in complex social environments and relationships, is a crucial aspect of human development [1-2].”

      (1) Adolphs, R. The social brain: neural basis of social knowledge. Annual review of psychology 60, 693-716 (2009).

      (2) Frith, C. D. & Frith, U. Mechanisms of social cognition. Annual review of psychology 63, 287-313 (2012).

      It is standard practice to report the final sample size in the Abstract and Introduction, rather than the initial recruited sample, as high attrition rates are common in pediatric studies. For example, this study recruited 50 mother-child dyads, and only 34 remained after quality control. This information is crucial for interpreting the results and conclusions. I recommend reporting the final sample size in the abstract and introduction but specifying in the Methods that an additional 16 mother-child dyads were initially recruited or that 50 dyads were originally collected.

      We thank the reviewer for this helpful recommendation. In the original version of the manuscript, the Abstract and Introduction referenced the total number of dyads recruited (N = 50). In line with standard reporting practices and to ensure clarity regarding the analytic sample, we have now revised both the Abstract and Introduction to report the final sample size (N = 34). The full recruitment and exclusion details—including the number of dyads removed due to excessive motion or technical issues—are now clearly described in the Methods section.

      The corresponding changes have been made on pages 1 and 4 of the revised manuscript.

      In the "Neural maturity reflects the development of the social brain" section, the authors report the across-network correlation for adults, finding a negative correlation between ToM and SPM. However, the cross-network correlations for the three child groups are not reported. The statement that "the two networks were already functionally distinct in the youngest group of children we tested" is based solely on within-network positive correlations, which does not fully demonstrate functional distinctness. Including cross-network correlations for the child groups would strengthen this conclusion.

      We thank the reviewer for this insightful comment. We agree that within-network correlations alone do not fully establish functional distinctness, particularly in early development. To more directly test whether the ToM and SPM networks were already differentiated in children, we have now included the cross-network correlations between the two networks for each of the three age groups in the revised manuscript. These findings support and strengthen our original claim that the ToM and SPM networks are functionally dissociable even in early childhood, and we have revised the relevant Results sections accordingly to reflect this.

      The corresponding changes have been made on page 7 of the revised manuscript.

      “In children, each network also exhibited positive correlations within-network and negative correlations across networks (within-ToM correlation M(s.e.) = 0.31(0.04); within-SPM correlation M(s.e.) = 0.29(0.04); across-network M(s.e.) = −0.09(0.02)).

      In the Pre-junior group only (3-4-year-old children, n = 12), both ToM and SPM networks had positive within-network correlations (within-ToM correlation M(s.e.) = 0.29(0.06); within-SPM correlation M(s.e.) = 0.23(0.05); across-network M(s.e.) = −0.05(0.02)).”
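
      One way to obtain within-network and across-network values of this kind is sketched below in Python. It assumes each network is a list of ROI time series and that the summary is the plain mean of pairwise Pearson correlations; the exact pipeline (for example, per-participant averaging or Fisher z-transformation) is not described in this response, so the function names and toy data are hypothetical.

```python
import math
from itertools import combinations, product


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mean_corr(pairs):
    """Average Pearson correlation over an iterable of (series, series) pairs."""
    values = [pearson(a, b) for a, b in pairs]
    return sum(values) / len(values)


def network_correlations(tom_rois, spm_rois):
    """Return (within-ToM, within-SPM, across-network) mean correlations."""
    within_tom = mean_corr(combinations(tom_rois, 2))
    within_spm = mean_corr(combinations(spm_rois, 2))
    across = mean_corr(product(tom_rois, spm_rois))
    return within_tom, within_spm, across


# Hypothetical toy ROI time series, NOT study data.
tom = [[1, 2, 3, 4], [2, 4, 6, 8]]
spm = [[4, 3, 2, 1], [8, 6, 4, 2]]
within_tom, within_spm, across = network_correlations(tom, spm)
```

      Positive within-network values together with a negative across-network value, as in this toy case, is the pattern the revised text reports as evidence that the two networks are functionally distinct.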

      The ROIs for the ToM and SPM networks are defined based on previous literature, applying the same ROIs across all age groups. While I understand this is a common approach, it's important to note that this assumption may not fully hold, as network architecture can evolve with age. The functional ROIs or components of a network might shift, with regions potentially joining or exiting a network or changing in size as children develop. For instance, Mark H. Johnson's interactive specialization theory suggests that network composition may adapt over developmental stages. Although the authors follow the approach of Richardson et al. (2018), it would be beneficial to discuss this limitation in the Discussion. An alternative approach would be to apply data-driven analysis to justify the selection of the ROIs for the two networks.

      We thank the reviewer for this thoughtful and theoretically grounded comment. In our study, we followed the approach of Richardson et al. (2018), using a priori ROIs defined from adult meta-analyses and ToM/SPM task studies. This approach facilitates comparison with prior work and provides anatomical consistency across participants. However, we fully agree that applying adult-defined ROIs to pediatric populations involves important assumptions about the stability of network architecture across development, which may not fully hold in early childhood.

      We have now addressed this limitation more explicitly in the revised Discussion, emphasizing that the fixed-ROI approach may not capture the dynamic reorganization of social brain networks during development.

      The corresponding changes have been made on page 13 of the revised manuscript.

      “Moreover, the ROIs used to define the ToM and SPM networks were based on meta-analyses and task studies primarily conducted with adults. While this approach promotes comparability with existing literature, it assumes that the spatial organization of these networks is stable across age groups. However, theories of interactive specialization suggest that the composition and boundaries of functional networks may undergo reorganization during development, with regions potentially entering or exiting networks based on experience and maturational processes. As a result, the current analysis may not fully capture age-specific functional architecture, particularly in younger children. Future studies using data-driven or age-appropriate parcellation methods could provide more precise characterizations of how social brain networks are constructed and differentiated throughout childhood.”

      The current sample size (N = 34 dyads) is a limitation, particularly given the use of SEM, which generally requires larger samples for stable results. Although the model fit appears adequate, this does not guarantee reliability with the current sample size. I suggest discussing this limitation in more detail in the Discussion.

      We thank the reviewer for highlighting the limitations of applying structural equation modeling (SEM) with a relatively modest sample size. We agree that SEM generally benefits from larger samples to ensure model stability and parameter reliability, and that satisfactory model fit does not guarantee robustness in small-sample contexts.

      In the revised Discussion, we now more clearly acknowledge that the use of SEM in the current study is exploratory in nature, and that all results should be interpreted with caution due to potential sample size-related constraints. The model was constructed to provide an integrated view of the observed associations rather than to establish definitive pathways. We have also added a note that future research with larger samples and longitudinal designs will be needed to validate and extend the proposed model.

      The corresponding changes have been made on page 13 of the revised manuscript.

      “In addition, the modest sample size (N = 34 dyads) presents limitations for the application of structural equation modeling (SEM), which typically requires larger samples for stable estimation and generalizable inferences. While the model fit was acceptable, the results should be interpreted as exploratory and hypothesis-generating, rather than confirmatory. Future studies with larger, independent samples will be important for validating the structure and directionality of the proposed relationships.”

      Based on the above comment, I believe that conclusions regarding the relationship between social network development, parenting, and support for Bandura's theory should be tempered. The current conclusions may be too strong given the study's limitations.

      We thank the reviewer for this important and balanced observation. We agree that the conclusions drawn from the current study should reflect the exploratory nature of the analyses, as well as the methodological limitations, including the modest sample size and cross-sectional design.

      In response, we have revised the Conclusion section to use more cautious, associative language when describing the observed relationships among social brain development, parenting factors, and Theory of Mind outcomes. In particular, we have tempered statements regarding support for Bandura’s social learning theory, clarifying that while our findings are consistent with social learning frameworks, the data do not allow for direct tests of modeling or observational learning mechanisms.

      We hope these revisions help clarify the scope of the findings and improve the conceptual rigor of the manuscript.

      The corresponding changes have been made on page 14 of the revised manuscript.

      “Our study provides novel evidence that children's social cognitive development may be shaped by the intricate interplay between environmental influences, such as parenting, and biological factors, such as neural maturation. Our findings contribute to a growing understanding of the factors associated with social cognitive development and suggest the potential importance of parenting in this process. Specifically, the study points to the possible role of the parent-child relationship in supporting the development of social brain circuitry and highlights the relevance of family-based approaches for addressing social difficulties. The observed neural synchronization between parent and child, which was associated with relationship quality, underscores the potential significance of positive parental engagement in fostering social cognitive skills. Future longitudinal and clinical research can build on this multimodal approach to further clarify the neurobehavioral mechanisms underlying social cognitive development. Such research may help inform more effective strategies for promoting healthy social functioning and mitigating social deficits through targeted family-based interventions.”

      The SPM (pain) network is associated with empathic abilities, also an important aspect of social skills. It would be relevant to explore whether (or explain why) SPM development and child-mother synchronization are (or are not) related to parenting and the parent-child relationship.

      We thank the reviewer for this thoughtful and important comment regarding the role of the Social Pain Matrix (SPM) network in social cognition and empathy. We agree that this network represents a critical component of social-cognitive development and is theoretically linked to affective processing and interpersonal understanding.

      We would like to clarify that in our existing analyses, which were already included in the original submission and are detailed in the Supplemental Results, SPM network measures showed significant associations with behavioral outcomes similar to those observed for the ToM network. These outcomes included children's performance on ToM tasks as well as broader measures of social functioning. We have added more discussion in the supplementary results.

      “To further investigate the specificity of our findings, we conducted additional control analyses focusing on the individual components of the social brain networks examined in our study: the Theory of Mind (ToM) and Social Pain Matrix (SPM) networks.

      When analyzing these networks separately, we found significant correlations between neural maturity and age, as well as between inter-subject synchronization (ISS) and parent-child relationship quality for both the ToM and SPM networks individually (Fig. S1). Specifically, neural maturity within each network was positively correlated with age, indicating that both networks undergo maturation during childhood. Similarly, ISS within each network was negatively correlated with parent-child conflict scores, suggesting that both networks contribute to the observed relationship between neural synchrony and parent-child relationship quality.

      These results highlight the importance of considering the social brain as an integrated system, where the ToM and SPM networks work in concert to support social cognitive development. While each network shows age-related maturation and sensitivity to parent-child relationship quality, their combined functioning appears to be crucial for predicting broader social cognitive outcomes.”

    1. eLife Assessment

      This fundamental study examines whether synaptic cell adhesion molecules neuroligin 1-3 resident on astrocytes, rather than neurons, exert effects on synaptic structure and function. With compelling evidence, including rigorous validation of neuroligin deletion efficiency in astrocytes and independent confirmation using human neuron-mouse glia co-cultures, the authors report that deletion of neuroligins 1-3 specifically in astrocytes does not alter synapse formation or astrocyte morphology in the hippocampus or visual cortex. This study provides definitive evidence highlighting the specific role of neuronal neuroligins rather than their astrocytic counterparts in synaptogenesis.

    2. Reviewer #1 (Public review):

      Astrocytes are known to express neuroligins 1-3. Within neurons, these cell adhesion molecules perform important roles in synapse formation and function. Within astrocytes, a significant role for neuroligin 2 in determining excitatory synapse formation and astrocyte morphology was shown in 2017. However, there has been no assessment of what happens to synapses or astrocyte morphology when all three major forms of neuroligins within astrocytes (isoforms 1-3) are deleted using a well-characterized, astrocyte-specific, and inducible cre line. By using such selective mouse genetic methods, the authors here show that neuroligin 1-3 expression in astrocytes is not consequential for synapse function or for astrocyte morphology. They reach these conclusions with careful experiments employing quantitative western blot analyses, imaging and electrophysiology. They also characterize the specificity of the cre line they used. Overall, this is a very clear and strong paper that is supported by rigorous experiments. The discussion considers the findings carefully in relation to past work. This paper is of high importance, because it now raises the fundamental question of exactly what neuroligins 1-3 are actually doing in astrocytes. In addition, it enriches our understanding of the mechanisms by which astrocytes participate in synapse formation and function. The paper is very clear, well written and well illustrated with raw and average data.

      Comments on revisions:

      My previous comments have been addressed. I have no additional points to make and congratulate the authors.

    3. Reviewer #2 (Public review):

      In the present manuscript, Golf et al. investigate the consequences of astrocyte-specific deletion of Neuroligin (Nlgn) family cell adhesion proteins on synapse structure and function in the brain. Decades of prior research had shown that Neuroligins mediate their effects at synapses through their role in the postsynaptic compartment of neurons and their transsynaptic interaction with presynaptic Neurexins. More recently, it was proposed for the first time that Neuroligins expressed by astrocytes can also bind to presynaptic Neurexins to regulate synaptogenesis (Stogsdill et al. 2017, Nature). However, several aspects of the model proposed by Stogsdill et al. on astrocytic Neuroligin function conflict with prior evidence on the role of Neuroligins at synapses, prompting Golf et al. to further investigate astrocytic Neuroligin function in the current study. Using postnatal conditional deletion of Nlgn1-3 specifically from astrocytes in mice, Golf et al. show that virtually no changes in the expression of synaptic proteins or in the properties of synaptic transmission at either excitatory or inhibitory synapses are observed. Moreover, no alterations in the morphology of astrocytes themselves were found. To further extend this finding, the authors additionally analyzed human neurons co-cultured with mouse glia lacking expression of Nlgn1-4. No difference in excitatory synaptic transmission was observed between neurons cultured in the presence of wildtype vs. Nlgn1-4 conditional knockout glia. The authors conclude that while Neuroligins are indeed expressed in astrocytes and are hence likely to play some role there, this role does not include any direct consequences on synaptic structure and function, in direct contrast to the model proposed by Stogsdill et al.

      Overall, this is a strong study that addresses a fundamental and highly relevant question in the field of synaptic neuroscience. Neuroligins are not only key regulators of synaptic function, they have also been linked to numerous psychiatric and neurodevelopmental disorders, highlighting the need to precisely define their mechanisms of action. The authors take a wide range of approaches to convincingly demonstrate that under their experimental conditions, Nlgn1-3 are efficiently deleted from astrocytes in vivo, and that this deletion does not lead to major alterations in the levels of synaptic proteins or in synaptic transmission at excitatory or inhibitory synapses, or in the morphology of astrocytes. The authors have conducted an elegant and compelling analysis demonstrating efficient deletion of astrocytic Nlgn1-3, with deletion rates of 83-96% for Nlgn2 and Nlgn3, and 65-72% for Nlgn1. While the co-culture experiments provide additional support, they are not essential as the in vivo data on astrocytic Nlgn1-3 deletion are compelling on their own. Together, the data from this study provide compelling and important evidence that, whatever the role of astrocytic Neuroligins may be, they do not contribute substantially to synapse formation or function under the conditions investigated.

      Comments on revisions:

      All of my concerns have been satisfactorily addressed.

      The authors have fully addressed my concerns, and have in particular conducted a very elegant and compelling analysis of the degree of deletion of astrocytic Nlgn1-3/4 in their models. This greatly strengthens the main claims of their study and the fundamental nature of their conclusions for the field of synapse biology.

      Regarding the co-culture experiments, while I was initially concerned about the lack of controls demonstrating that glia affect synapse formation in human neurons, the authors have appropriately addressed this by clarifying the missing references and explaining that their culture system has been extensively validated in previous studies. Since the data on astrocytic Nlgn1-3 deletion in vivo are compelling on their own, the co-culture experiment provides useful additional support for the main conclusions.

      The authors have also added the mouse strain background information to the methods section as requested, which is important for interpreting potential differences with other studies.

    4. Author response:

      The following is the authors’ response to the previous reviews

      Public Reviews:

      Reviewer #1 (Public Review):

      Astrocytes are known to express neuroligins 1-3. Within neurons, these cell adhesion molecules perform important roles in synapse formation and function. Within astrocytes, a significant role for neuroligin 2 in determining excitatory synapse formation and astrocyte morphology was shown in 2017. However, there has been no assessment of what happens to synapses or astrocyte morphology when all three major forms of neuroligins within astrocytes (isoforms 1-3) are deleted using a well characterized, astrocyte specific, and inducible cre line. By using such selective mouse genetic methods, the authors here show that astrocytic neuroligin 1-3 expression in astrocytes is not consequential for synapse function or for astrocyte morphology. They reach these conclusions with careful experiments employing quantitative western blot analyses, imaging and electrophysiology. They also characterize the specificity of the cre line they used. Overall, this is a very clear and strong paper that is supported by rigorous experiments. The discussion considers the findings carefully in relation to past work. This paper is of high importance, because it now raises the fundamental question of exactly what neuroligins 1-3 are actually doing in astrocytes. In addition, it enriches our understanding of the mechanisms by which astrocytes participate in synapse formation and function. The paper is very clear, well written and well illustrated with raw and average data.

      Comments on revisions:

      My previous comments have been addressed. I have no additional points to make and congratulate the authors.

      Thank you for your acceptance.

      Reviewer #2 (Public Review):

      In the present manuscript, Golf et al. investigate the consequences of astrocyte-specific deletion of Neuroligin (Nlgn) family cell adhesion proteins on synapse structure and function in the brain. Decades of prior research had shown that Neuroligins mediate their effects at synapses through their role in the postsynaptic compartment of neurons and their transsynaptic interaction with presynaptic Neurexins. More recently, it was proposed for the first time that Neuroligins expressed by astrocytes can also bind to presynaptic Neurexins to regulate synaptogenesis (Stogsdill et al. 2017, Nature). However, several aspects of the model proposed by Stogsdill et al. on astrocytic Neuroligin function conflict with prior evidence on the role of Neuroligins at synapses, prompting Golf et al. to further investigate astrocytic Neuroligin function in the current study. Using postnatal conditional deletion of Nlgn1-3 specifically from astrocytes in mice, Golf et al. show that virtually no changes in the expression of synaptic proteins or in the properties of synaptic transmission at either excitatory or inhibitory synapses are observed. Moreover, no alterations in the morphology of astrocytes themselves were found. To further extend this finding, the authors additionally analyzed human neurons co-cultured with mouse glia lacking expression of Nlgn1-4. No difference in excitatory synaptic transmission was observed between neurons cultured in the presence of wildtype vs. Nlgn1-4 conditional knockout glia. The authors conclude that while Neuroligins are indeed expressed in astrocytes and are hence likely to play some role there, this role does not include any direct consequences on synaptic structure and function, in direct contrast to the model proposed by Stogsdill et al.

      Overall, this is a strong study that addresses a fundamental and highly relevant question in the field of synaptic neuroscience. Neuroligins are not only key regulators of synaptic function, they have also been linked to numerous psychiatric and neurodevelopmental disorders, highlighting the need to precisely define their mechanisms of action. The authors take a wide range of approaches to convincingly demonstrate that under their experimental conditions, Nlgn1-3 are efficiently deleted from astrocytes in vivo, and that this deletion does not lead to major alterations in the levels of synaptic proteins or in synaptic transmission at excitatory or inhibitory synapses, or in the morphology of astrocytes. While the co-culture experiments are somewhat more difficult to interpret due to lack of a control for the effect of wildtype mouse astrocytes on human neurons, they are also consistent with the notion that deletion of Nlgn1-4 from astrocytes has no consequences for the function of excitatory synapses. Together, the data from this study provide compelling and important evidence that, whatever the role of astrocytic Neuroligins may be, they do not contribute substantially to synapse formation or function under the conditions investigated.

      Recommendations for the authors:

      Reviewer #2 (Recommendations for the authors):

      The authors have fully addressed my concerns, and have in particular conducted a very elegant and compelling analysis of the degree of deletion of astrocytic Nlgn1-3/4 in their models. This greatly strengthens the main claims of their study and the fundamental nature of their conclusions for the field of synapse biology.

      I am somewhat less convinced by the newly added experiment to investigate deletion of Nlgns1-4 from glia in glia-neuron co-cultures. The authors provide no evidence to show that either WT or cKO glia have any effect on synapse formation or function in human neurons, and therefore, the current lack of a difference could equally result from the fact that both WT and cKO glia were non-functional altogether. The authors cite two studies to state that human neurons do not form synapses in the absence of astrocytes, Zhang et al. 2013 and Huang et al. 2017, but neither seem to be listed in the references (unless Zhang et al. 2014 was meant), making it difficult to assess the relevance of these data. However, since the data on astrocytic Nlgn1-3 deletion in vivo are compelling on their own, I do not see the co-culture experiment as essential for the main conclusions of the study.

      Minor comment:

      Please add the information on the strain background of the mice to the methods section of the manuscript. Strain background can have a significant impact on many aspects of neuronal function, and this information is therefore essential for the interpretation of potential differences to other studies.

      We deeply apologize for forgetting to include the two important references mentioned by the reviewer in the reference list. We understand that the reviewer as a result could not assess the validity of our statement that co-culture of glia is required for efficient synapse formation by human neurons that are induced from ES or iPS cells. Note that this conclusion does not postulate that all synapse formation requires glia, since the cited papers demonstrate that human neurons induced by our protocol still form scarce synapses without glia. This observation has been confirmed in many different experiments that were performed after the data presented in the cited papers. As a result of this extensive prior documentation that human neurons produced by forced expression of Ngn2 require coculture of glia for efficient synapse formation, we do not feel that we need to repeat this basic characterization of our culture system again to validate multiple previous papers and hope the reviewer will concur. We have additionally added the relevant mouse strain information to the methods section.

    1. eLife Assessment

      This fundamental work extends our understanding of the role of TGFβ2 as a modulator of mechanosensing in the eye and identifies the TRPV4 ion channel as a common regulator of Trabecular Meshwork (TM) contractility and pathological OHT. The data and evidence provided are convincing. This work will clearly be of interest to researchers investigating the role of mechanosensors in the TM and may underpin future research into treatments that aim to lower intraocular pressure. This work will additionally be of interest to the growing field of researchers investigating the regulation of force sensing via ion channels and their roles in health and disease, in particular the ion channel TRPV4.

    2. Reviewer #1 (public review):

      Summary:

      This comprehensive study employed molecular, optical, electrophysiological and tonometric strategies to establish the role of TGFβ2 in transcription and functional expression of mechanosensitive channel isoforms alongside studies of TM contractility in biomimetic hydrogels, and intraocular pressure regulation in a mouse model of TGFβ2-induced ocular hypertension. TGFβ2 upregulated expression of TRPV4 and PIEZO1 transcripts and time-dependently augmented functional TRPV4 activation. TRPV4 activation induced TM contractility whereas pharmacological inhibition suppressed TGFβ2-induced hypercontractility and abrogated ocular hypertension in eyes overexpressing TGFβ2. Trpv4-/- mice resisted TGFβ2-driven increases in IOP. These data establish a fundamental role of TGFβ as a modulator of mechanosensing and identify the TRPV4 channel as a common mechanism for TM contractility and pathological ocular hypertension.

      The manuscript is very well written and details the important function of TRPV4 in TM cell function. These data provide novel therapeutic targets and potential for disease-altering therapeutics.

    3. Reviewer #2 (public review):

      The manuscript by Christopher N. Rudzitis et al. describes the role of TGFβ2 in the transcription and functional expression of mechanosensitive channel isoforms, alongside studies on TM contractility in biomimetic hydrogels and intraocular pressure. Overall, it is a very interesting study, nicely designed, and will contribute to the available literature on TRPV4 sensitivity to mechanical forces.

    4. Author response:

      The following is the authors’ response to the original reviews

      Reviewer #1 (Public review):

      The experimental rigor and design of the nocturnal IOP experiments were weak, with low n values and differing methods of IOP measurement (conscious versus anesthetized). The same method of IOP measurement needs to be used for all measurements to make any conclusions on the circadian patterns of IOP in each condition.

      One of the goals of our study was to confirm the results from the Patel et al. (2021; PMID: 33853948) study, in which nocturnal IOP measurements were conducted in anesthetized mice and diurnal IOP measurements in awake animals, but we agree with both Reviewers that IOP should be measured under identical experimental conditions. Parenthetically, the number of animals per each treatment paradigm in the original version (N = 4) was sufficient to produce statistical significance for diurnal control vs diurnal TGFB, and diurnal control vs nocturnal control conditions.

      To address the comment, we generated an additional cohort of TGFb2-expressing mice (N = 6) in which nocturnal and diurnal measurements were performed in awake animals. The results are shown in the revised Figure 6. Similar to the anesthetized cohort, the diurnal IOP in Lv-TGFB2 mice was statistically indistinguishable from the nocturnal value, indicating that TGFB2-induced OHT is not additive to physiological (circadian) OHT. The TRPV4-dependence of ocular hypertension induced by physiological and pathological methods suggests that the channel functions as a final common mechanism for ocular hypertension.

      Reviewer #2 (Public review):

      Figure 1A-C. Often there is a difference between the massage (message?, op. authors) and transcript data. I recommend the authors to confirm with qPCR data with another mode of protein measurements.

      We are not sure we understand the Reviewer’s comment regarding the “difference between the message and transcript data” but note that the mRNA data shown in panels A & B are confirmatory of previously published transcriptomic and proteomic screens (eg, Fleenor et al., IOVS 2006; Bollinger et al., IOVS 2011;  Callaghan et al., Scientific Reports 2022; Li et al., Current Eye Research 2022 etc) and were included to show that the transcriptional response of canonical SMAD and pro-fibrotic genes unfolds as predicted from previous work. With regard to TRPV4 signaling, we expand transcriptomic data with protein analysis (Western blots) and functional analyses (measurements of TRPV4-mediated current and calcium imaging). Transcriptomic, protein expression, electrophysiological and imaging experiments revealed a remarkable consistency in TGFB2-dependence of gene (Fig. 1C) and protein expression (Fig. 1D), transmembrane current (Fig. 3C) and intracellular calcium (Fig. 2).

      Parenthetically, we attempted to get a sense for the TGFB2-dependence of Piezo1 protein expression by conducting Western blots with multiple antibodies and experimental conditions. These efforts were unsuccessful, presumably due to the complexity (30-40 TM domains) and large molecular weight (280-300 kDa) of the protein. We note, however, that Piezo1 signaling cannot account for the observed OHT given that studies by us and others (Yarishkin et al., 2021, PMID: 33226641 and Zhu et al., 2021; PMID: 33532718) associated Piezo1 signaling with facility increases. The revised manuscript reads: “The suppression of outflow facility by Piezo1 inhibitors applied under in vitro and in vivo conditions (39, 81) instead suggests that Piezo1 opposes the hypertensive functions of TRPV4.” The preprint by Redmon et al. (bioRxiv 2024, PMID 39041037) expands the TRPV4-dependence of OHT to microbead-induced, steroid-induced and nocturnal models of OHT to indicate that TRPV4 functions as a universal driver of elevated IOP. We reiterate this in the revised Discussion.

      Does direct TRPV4 activation also induce the expression of these markers? Does inhibition of TRPV4, after TGF-β treatment, prevent the expression of these markers? Is TRPV4 acting downstream of this response?

      An RNASeq study conducted by us (Rudzitis et al., under review) suggests that the agonist GSK101 has minimal effect on the fibrotic and canonical pathways shown in panels A and B. These data are beyond the scope of the present study and will be published elsewhere; however, we include the data associated with genes depicted in panels A and B for the reviewer at the end of this Response.

      We conducted an additional series of experiments to test whether TGFB2-induced upregulation of the TRPV4 and Piezo1 genes is itself TRPV4-dependent. As shown in the new SFig. 1, upregulation of the two genes is unaffected by TRPV4 inhibition.

      Figure 1D. Beta tubulin is not a membrane marker. Having staining of b tubulin in membrane fraction shows contamination from the cytoplasm. Does the overall expression also increase?

      b-tubulin associates with the plasma membrane by binding to integral membrane proteins in the plasma and organellar membranes through palmitoylation and attachment to linker proteins and as an integral component of exocytotic vesicles (Wolff, BBA 2009; Hogerheide et al., PNAS 2017). The protein is often used as a loading control for the TRPV4 protein (please see https://www.cellsignal.com/products/primary-antibodies/trpv4-antibody/65893; Grove et al., Science Signaling 2019 and Moore et al., PNAS 2013).  Parenthetically, our RNASeq studies did not find modulation of b-tubulin expression by TGFβ2 [CNR and DK, unpublished observations].

      We examined the overall (cytosolic and membrane) TRPV4 expression and observed, similarly to the membrane fraction alone (Figure 2), upregulation following cytokine stimulation:

      Author response image 1.

      Western blot, total protein extract from control and TGFb2-treated TM cells [Alomone antibody].

      In our estimation, these results do not add to the overall narrative and were not included in the paper.

      Figure 4A: it is not very clear. I recommend including a zoom image or better resolution image.

      We include a whole-page image as the new SFigure 4.

      Figure 5B and 6B. Why is there a difference between groups in the pre-injection panel? In Figure 5A, pre-injection, there is no difference between LV-TGFβ and LV-control, while in 5B there is a significant difference between these groups.

      We revised the Figure Legends to clarify that “pre-injection” in Figures 5B and 6B refers to IOP measurements before the intracameral injection of HC-06, not pre-injection of lentiviral constructs.

      Discussion section. Line 279: "TRPV4 channels in cells treated with TGFβ2 are likely to be constitutively active" ... needs to be discussed further.

      We rewrote the paragraph to clarify that TRPV4 is a thermosensitive channel that is expected to be constitutively active at the incubator temperature:

      “The effectiveness of TRPV4 inhibition in suppressing TGFB2-induced contractility (Fig. 4) is consistent with constitutive activation of TRPV4 channels in incubator-cultured cells. TRPV4 is a thermosensitive channel (Q10 ~10). Mouse TRPV4 is activated by physiological temperatures (Chung et al., 2003; Shibasaki et al., 2007) with peak activation between ~34 and 37 °C (Guler et al., 2003). The several-fold increase in functional expression of the channel in TGFB2-treated cells (Fig. 2) would be expected to promote tonic influx of Ca2+ and Ca2+-dependent cellular signaling. The abrogation of the contractile response in the presence of HC-06 indicates that TRPV4-mediated Ca2+ influx represents the principal source of calcium that drives the contractile response. Consistent with this, supplementation with the agonist GSK101 was sufficient to evoke TM contraction (Fig. 4B).”

      Line 280: "The residual contractility in HC-06-treated cells may reflect TGFβ2-mediated contributions from Piezo1." Piezo1 has a low threshold for mechanosensitivity. How do the authors discuss the observation that, in the presence of Piezo1, TRPV4 has a more prominent mechanosensory function? Is this tied to TGFβ signalling?

      This is an interesting question. Our macroscopic and single channel recordings of Piezo1 activity in TM cells recapitulate the time course published in the original Coste et al. (2010) study, showing the channel inactivates within 10-100 msec (Yarishkin et al., 2021). Thus, it is likely that the channel is largely inactivated during chronic ocular hypertension. Indeed, it has been suggested that resting membrane tension alone may be sufficient to inactivate Piezo1 (Lewis and Grandl, 2015), with cells grown on stiff substrates (e.g., under our experimental conditions) experiencing almost complete Piezo1 inactivation. We propose that the primary function of Piezo channels may be to sense and transduce transient mechanical loading. The remarkable IOP-lowering effectiveness of TRPV4 antagonists and knockdown indicates that - in contrast to Piezo1 - TRPV4 activation is sustained.  

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      The complete strain name for the Trpv4-/- mice is missing.

      Corrected.

      The layout for Figure 6 is confusing as HC-06 was only used in panels B and C but the labels are above panel A.

      Corrected.

      Reviewer #2 (Recommendations for the authors):

      Only two mice were used for the nocturnal IOP experiments. Retreating the same mice in opposite eyes and counting it as n=4 is not rigorous or justified.

      The number of mice investigated in the original submission was four. In Week 1, two mice underwent PBS injections and two mice were treated with HC-06. After the baseline was re-established in Week 2, the treatments were reversed.

      We supplemented these numbers with an additional cohort of 6 mice, with identical results re: nocturnal vs diurnal IOP. These data are presented in the revised Figure 6.

      Why are daytime IOPs measured in awake mice but nocturnal IOPs measured in isoflurane-anesthetized mice? Anesthesia is well known to affect IOP, and using two different methods could alter the results, especially when comparing between the groups. This could be why you did not see a nocturnal rise in the TGFB-injected eyes. The same method needs to be used for all measurements to make any conclusions on the circadian patterns of IOP in each condition.

      This is a good point, please see our response above.