  1. May 2024
    1. eLife assessment

      This important study investigates the relationship between transcription factor condensate formation, transcription, and 3D gene clustering of the MET regulon in the model organism S. cerevisiae. The authors provide solid experimental evidence that transcription factor condensates enhance transcription of MET-regulated genes, but the evidence that nuclear condensates per se drive MET gene clustering is incomplete and would benefit from further experimental analyses. This paper will be of interest to molecular biologists working on chromatin and transcription, although its impact would be strengthened by revising the literature citations and including additional experimental work.

    2. Reviewer #1 (Public Review):

      Summary:

      In this study, James Lee, Lu Bai, and colleagues use a multifaceted approach to investigate the relationship between transcription factor condensate formation, transcription, and 3D gene clustering of the MET regulon in the model organism S. cerevisiae. This study represents a second clear example of inducible transcriptional condensates in budding yeast, as most evidence for transcriptional condensates arises from studies of mammalian systems. In addition, this study links the genomic location of transcriptional condensates to the potency of transcription of a reporter gene regulated by the master transcription factor contained in the condensate. The strength of evidence supporting these two conclusions is strong. Less strong is evidence supporting the claim that Met4-containing condensates mediate the clustering of genes in the MET regulon.

      Strengths:

      The manuscript is for the most part clearly written, with the overriding model and specific hypothesis being tested clearly explained. Figure legends are particularly well written. An additional strength of the manuscript is that most of the main conclusions are supported by the data. This includes the propensity of Met4 and Met32 to form puncta-like structures under inducing conditions, formation of Met32-containing LLPS-like droplets in vitro (within which Met4 can colocalize), colocalization of Met4-GFP with Met4-target genes under inducing conditions, enhanced transcription of a Met3pr-GFP reporter when targeted within 1.5-5 kb of select Met4 target genes, and most impressively, evidence that several MET genes appear to reposition under transcriptionally inducing conditions. The latter is based on a recently reported novel in vivo methylation assay, MTAC, developed by the Bai lab.

      Weaknesses:

      My principal concern is that the authors fail to show convincing evidence for a key conclusion, highlighted in the title, that nuclear condensates per se drive MET gene clustering. Figure 4E demonstrates that Met4 molecules, not condensates per se, are necessary for fostering distant cis and trans interactions between MET6 and three other Met4 targets under -met inducing conditions. In addition, the paper would be strengthened by discussing a recent study conducted in yeast that comes to many of the same conclusions reported here, including the role of inducible TF condensates in driving 3D genome reorganization (Chowdhary et al, Mol. Cell 2022).

      Other concerns:

      (1) A central premise of the study is that the inducible formation of condensates underpins the induction of MET gene transcription and MET gene clustering. Yet, Figure 1 suggests (and the authors acknowledge) that puncta-like Met4-containing structures pre-exist in the nuclei of non-induced cells. Thus, the transcription and gene reorganization observed is due to a relatively modest increase in condensate-like structures. Are we dealing with two different types of Met4 condensates? (For example, different combinations of Met4 with its partners; Mediator- or Pol II-lacking vs. Mediator- or Pol II-containing; etc.?) At the very least, a comment to this effect is necessary.

      (2) Using an in vitro assay, the authors demonstrate that Met4 colocalizes with Met32 LLPS droplets (Figure 2F). Is the same true in vivo - that is, is Met32 required for Met4 condensation? This could be readily tested using auxin-induced degradation of Met32. Along similar lines, the claim that Met32 is required for MET gene clustering (line 250) requires auxin-induced degradation of this protein.

      (3) The authors use a single time point during -met induction (2 h) to evaluate TF clustering, transcription (mRNA abundance), and 3D restructuring. It would be informative to perform a kinetic analysis since such an analysis could reveal whether TF clustering precedes transcriptional induction or MET gene repositioning. Do the latter two phenomena occur concurrently or does one precede the other?

      (4) Based on the MTAC assay, MET13 does not appear to engage in trans interactions with other Met4 targets, whereas MET6 does (Figures 4C and 4E). Does this difference stem from the greater occupancy of Met4 at MET6 vs. MET13, greater association of another Met co-factor with the chromatin of MET6 vs. MET13, or something else?

    3. Reviewer #2 (Public Review):

      Summary:

      This manuscript combines live yeast cell imaging and genomic approaches to study how transcription factor (TF) condensates might help organize and enhance the transcription of the target genes in the methionine starvation response pathway. The authors show that the TFs in this response can form phase-separated condensates through their intrinsically disordered regions (IDRs), and mediate the spatial clustering of the related endogenous genes as well as a reporter inserted near the endogenous target loci.

      Strengths:

      This work uses rigorous experimental approaches, such as imaging of endogenously labeled TFs, determining expression and clustering of endogenous target genes, and reporter integration near the endogenous target loci. The importance of TFs is shown by rapid degradation. Single-cell data are combined with genomic sequencing-based assays. Control loci engineered in the same way are usually included. Some of these controls are very helpful in showing the pathway-specific effect of the TF condensates in enhancing transcription.

      Weaknesses:

      Perhaps the biggest weakness of this work is that the role of IDR and phase separation in mediating the target gene clustering is unclear. This is an important question. TF IDRs may have many functions including mediating phase separation and binding to other transcriptional molecules (not limited to proteins and may even include RNAs). The effect of IDR deletion on reduced Fano number in cells could come from reduced binding with other molecules. This should be tested on phase separation of the purified protein after IDR deletion. Also, the authors have not shown IDR deletion affects the clustering of the target genes, so IDR deletion may affect the binding of other molecules (not the general transcription machinery) that are specifically important for target gene transcription. If the self-association of the IDR is the main driving force of the clustering and target gene transcription enhancement, can one replace this IDR with totally unrelated IDRs that have been shown to mediate phase separation in non-transcription systems and still see the gene clustering and transcription enhancement effects? This work has all the setup to test this hypothesis.

      The Met4 protein was tagged with MBP but Met32 was not. The MBP tag is well known to enhance protein solubility and prevent phase separation. This makes their in vitro phase behaviors difficult to compare and led the authors to think that Met32 may be the scaffold in the co-condensates. If MBP was necessary to increase yield and solubility during expression and purification, it should be cleaved (a protease cleavage site should be engineered) to allow phase separation in vitro.

      Are ATG36 and LDS2 also supposed to be induced by -met? This should be explained clearly. The signals are high at -met.

      In Figure 6B, Met4-GFP seems to form condensates at all three loci without a very obvious difference, though 6C shows a difference. 6C is from only one picture each. The authors should probably quantify the signals from a large number of randomly selected pictures (cells) and do statistics.

    4. Reviewer #3 (Public Review):

      Summary:

      In this study, the authors probe the connections between clustering of the Met4/32 transcription factors (TFs), clustering of their regulatory targets, and transcriptional regulation. While there is an increasing number of studies on TF clustering in vitro and in vivo, there is an important need to probe whether clustering plays a functional role in gene expression. Another important question is whether TF clustering leads to the clustering of relevant gene targets in vivo. Here the authors provide several lines of evidence to make a compelling case that Met4/32 and their target genes cluster and that this leads to an increase in transcription of these genes in the induced state. First, they found that, in the induced state, Met4/32 forms co-localized puncta in vivo. This is supported by in vitro studies showing that these TFs can form condensates in vitro with Met32 being the driver of these condensates. They found that two target genes, MET6 and MET13 have a higher probability of being co-localized with Met4 puncta compared with non-target loci. Using a targeted DNA methylation assay, they found that MET13 and MET6 show Met4-dependent long-range interactions with other Met4-regulated loci, consistent with the clustering of at least some target genes under induced conditions. Finally, by inserting a Met4-regulated reporter gene at variable distances from MET6, they provide evidence that insertion near this gene is a modest hotspot for activity.

      Weaknesses:

      (1) Please provide more information on the assay for puncta formation (Figure 1). It's unclear to me from the description provided how this assay was able to quantitate the number of puncta in cells.

      (2) How does the number of puncta in cells correspond with the number of Met-regulated genes? What are the implications of this calculation?

      (3) A control for chromosomal insertion of the Met-regulated reporter was a GAL4 promoter derivative reporter. However, this control promoter seems 5-10 fold more active than the Met-regulated promoter (Figure 6). It's possible that the high activity from the control promoter overcomes some other limiting step such that chromosomal location isn't important. It would be ideal if the authors used a promoter with comparable activity to the Met-reporter as a control.

      (4) It seems like transcription from a very large number of genes is altered in the Met4 IDR mutant (Figure 7F). Why is this and could this variability affect the conclusions from this experiment?

    1. eLife assessment

      Wounds are commonly infected, which can lead to delayed or poor wound healing, thereby significantly impacting morbidity and overall quality of life for patients. This manuscript uses single cell RNA sequencing to try to understand the impact of infection on various cell types during wound healing in a mouse model. The methodology is solid and the results provide a valuable 'atlas' of the cellular changes associated with infected and uninfected wounds which will be of interest to the field.

    2. Reviewer #2 (Public Review):

      Summary:

      The authors have performed a detailed analysis of the complex transcriptional status of numerous cell types present in wounded tissue, including keratinocytes, fibroblasts, macrophages, neutrophils, and endothelial cells. The comparison between infected and uninfected wounds is interesting and the analysis suggests possible explanations for why infected wounds are delayed in their healing response.

      Strengths:

      The paper presents a thorough and detailed analysis of the scRNAseq data. The paper is clearly written and the conclusions drawn from the analysis are appropriately cautious. The results provide an important foundation for future work on the healing of infected and uninfected wounds.

      Weaknesses:

      The analysis is purely descriptive and no attempt is made to validate whether any of the factors identified are playing functional roles in wound healing. Such experiments would be appropriate for follow-up work. The experimental setup analyzes a single time point and does not include a comparison to unwounded skin. Nevertheless, the present data do provide a useful point of comparison for the field.

    1. eLife assessment

      The current manuscript re-examines an established claim in the literature that human PANX-1 is regulated by Src kinase phosphorylation at two tyrosine residues, Y199 and Y309. This issue is important for our understanding of Pannexin channel regulation. The authors present an extensive series of experiments that fail to detect PANX-1 phosphorylation at these sites. Although the authors' approach is more rigorous than the previous studies, this work relies primarily on negative results that are not unambiguously definitive; the work nonetheless provides a compelling reason for the field to reexamine conclusions drawn in earlier studies.

    2. Reviewer #2 (Public Review):

      The widely distributed pannexin 1 (PANX1) is an ATP-permeable channel that plays an important role in intercellular communication and has been implicated in various pathophysiological processes and diseases. Previous studies have demonstrated that PANX1 can be phosphorylated at two molecular sites via the non-receptor kinase Src, thereby leading to channel opening and ATP release. In this paper, the authors used a variety of methods to detect tyrosine phosphorylation of the PANX1 channel protein. However, their results showed that the commercially available antibodies against the two phosphorylation sites used in previous studies did not work well; in other words, phosphorylation changes in PANX1 could not be detected by those antibodies. Therefore, the authors call for the re-examination and evaluation of previous research results.

      In general, this is a meticulous study, using different detection methods and different expression systems.

    3. Reviewer #3 (Public Review):

      The manuscript by Ruan et al. addresses an important issue in Panx1 research, i.e. the activation of the channel formed by Panx1 via protein phosphorylation. If the authors' conclusions are correct, the previous claims for Panx1 phosphorylation on the basis of the commercial anti-phospho-Panx1 antibodies would be in question.

      This is a very detailed and comprehensive analysis making use of state-of-the-art techniques, including mass spectrometry and phos-tag gel electrophoresis.

      In general, the study is well-controlled as relating to negative controls.

      The value of this manuscript is that it could spawn new, more function-oriented studies on the activation of Panx1 channels.

      The weaknesses identified previously are reproduced below:

      Weaknesses:

      Although the manuscript addresses an important issue, the activation of the ATP-release channel Panx1 by protein phosphorylation, the data provided do not support the firm conclusion that such activation does not exist. The failure to reproduce published data obtained with commercial anti-phospho Panx1 antibodies can only be of limited interest for a subfield.

      (1) The title claiming that "Panx1 is NOT phosphorylated..." is not justified by the failure to reproduce previously published data obtained with these antibodies. If, as claimed, the antibodies do not recognize Panx1, their failure cannot be used to exclude tyrosine phosphorylation of the Panx1 protein. There is no positive control for the antibodies.

      (2) The authors claim that exogenous SRC expression does not phosphorylate Y198. DeLalio et al. 2019 show that Panx1 is constitutively phosphorylated at Y198, so an effect of exogenous SRC expression is not necessarily expected.

      (3) The authors argue that the GFP tag of Panx1 at the COOH terminus does not interfere with folding since the COOH modified (thrombin cleavage site) Panx1 folds properly, forming an amorphous glob in the cryo-EM structure. However, they do not show that the COOH-modified Panx1 folds properly. It may not, because functional data strongly suggest that the terminal cysteine dives deep into the pore. For example, the terminal cysteine, C426, can form a disulfide bond with an engineered cysteine at position F54 (Sandilos et al. 2012).

      (4) The authors dismiss the additional arguments for tyrosine phosphorylation of Panx1 given by the various previous studies on Panx1 phosphorylation. These studies did not, as implied, solely rely on the commercial anti-phospho-Panx1 antibodies, but also presented a wealth of independent supporting data. Contrary to the authors' assertion, in the previous papers the pY198 and pY308 antibodies recognized two protein bands in the size range of glycosylated and partially glycosylated Panx1.

      (5) A phosphorylation step triggering channel activity of Panx1 would be expected to occur exclusively on proteins embedded in the plasma membrane. The membrane-bound fraction is small in relation to the total protein, which is particularly true for exogenously expressed proteins. Thus, any phosphorylated protein may escape detection when total protein is analyzed. Furthermore, to be of functional consequence, only a small fraction of the channels present in the plasma membrane need to be in the open state. Consequently, only a fraction of the Panx1 protein in the plasma membrane may need to be phosphorylated. Even the high resolution of mass spectrometry may not be sufficient to detect phosphorylated Panx1 in the absence of enrichment processes.

      (6) In the electrophysiology experiments described in Figure 7, there is no evidence that the GFP-tagged Panx1 is in the plasma membrane. Instead, the image in Figure 7a shows prominent fluorescence in the cytoplasm. In addition, there is no evidence that the CBX-sensitive currents in 7b are mediated by Panx1-GFP and are not endogenous Panx1. Previous literature suggests that the hPanx1 protein needs to be cleaved (Chiu et al. 2014) or mutated at the amino terminus (Michalski et al 2018) to see voltage-activated currents, so it is not clear that the currents represent hPANX1 voltage-activated currents.

      Note from the editors: The authors provided a rebuttal to the latest review, but no additional data, so we encourage readers to read the concerns and the author responses.

    1. eLife assessment

      The authors present an important study of a multi-cellular platform involving co-culturing of various hiPSC-derived hepatocyte-like cells, cholangiocytes, stellate cells and macrophages to mimic the liver microenvironment. The aggregates are then treated with fatty acids and examined through transcriptomic and functional assays. The techniques and methodology are sound, and the evidence supporting the conclusion is convincing, although more clinically relevant data demonstrating the effect of some potential pharmacological agents on the platform would serve to strengthen the study.

    2. Reviewer #1 (Public Review):

      There is an indisputable need for better in vitro models recapitulating steatotic liver diseases. This article is from a group of well-known stem cell experts that use human induced pluripotent stem cells (hiPSCs) to build a multicellular steatosis model in vitro. While the model is strong for testing hepatocyte responses, it falls short on translational aspects as well as on non-parenchymal liver cells.

      (1) The authors should use the new nomenclature for the disease, MASLD / MASH, as proposed by the scientific societies (Rinella ME, et al. J Hepatol. 2023; 79(6):1542-1556. PMID: 37364790).

      (2) There has been a similar approach by the Takebe group (Ouchi R, et al., Cell Metab. 2019; 30(2):374-384, PMID: 31155493). What is different in this model?

      (3) The work is very technical and neither provides new mechanistic insights nor tests any new interventions. I do see the clear technical advance in the long-term culture. However, I do not see that this system would allow modelling true "chronic" changes in MASLD, e.g. steatohepatitis and/or fibrosis.

      (4) While I am very convinced about the validity of the "hepatocyte" component in this system, the NPC compartment is insufficient. The 3D model certainly does not contain Kupffer cells (which have very distinct characteristics from "M0" macrophages) and does not contain true HSCs (LX-2 is a very insufficient model). Also, the model lacks flow conditions, so pathogenic signals from the circulation / portal vein (e.g. gut-liver axis) cannot be factored in. This will only allow very limited insights into the crosstalk between hepatocytes and NPCs.

      (5) The translational value of this model remains unclear to me. The scRNA-seq data should be meticulously compared to sc/snRNA-seq data from human MASLD livers at different stages to understand what this system is able to model (maybe very early stages of steatosis?).

      (6) The study lacks a "use case" to study interventions, e.g. testing resmetirom or any other of the new MASLD drugs in this system.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors developed a 3D multi-cellular platform mimicking the complex interplays involved in the pathogenesis of NAFLD/NASH by employing hiPSC-derived parenchymal and non-parenchymal cells in combination with organoids obtained from primary human cholangiocytes and the human hepatic stellate cell line LX2. They show that hiPSC-derived hepatocytes are able to accumulate intracellular lipids in a fashion similar to human NAFLD and that prolonged accumulation leads to activation of inflammatory and fibrogenic pathways.

      Strengths:

      This is an original attempt to create a 3D all-human multicellular platform recapitulating human NAFLD/NASH. The results are very encouraging. Of particular note, fibrogenic markers in the 3D system are not extremely (artificially) activated as in the classic 2D system. This makes the proposed platform more realistic.

      Weaknesses:

      The mixture of hiPSC-derived cells and primary or cell-line cells is understandable although potentially adding some variability to the system. The only unclear aspect is the characteristic of the collagen used to create the 3D system. Which type of collagen? Human? Which stiffness?

    1. eLife assessment

      The work by Lewis and co-workers presents important findings on the role of myosin structure/energetics on the molecular mechanisms of hibernation by comparing muscle samples from small and large hibernating mammals. The solid methodological approaches have revealed insights into the mechanisms of non-shivering thermogenesis and energy expenditure.

    2. Reviewer #1 (Public Review):

      Summary:

      The evolution of non-shivering thermogenesis is of fundamental importance to understand. Here, in small mammals, the contractile apparatus of the muscle is shown to increase energy expenditure upon a drop in ambient temperature. Additionally, in the state of torpor, small hibernators did not show an increase in energy expenditure under the same challenge.

      Strengths:

      The authors have conducted a very well-planned study that has sampled the muscle of large and small hibernators from two continents. Multiple approaches were then used to identify the state of the contractile apparatus, and its energy expenditure under torpor or otherwise.

      Weaknesses:

      There was only one site of biopsy from the animals used (leg). As the authors state, it would be interesting to know if non-shivering thermogenesis is something that is regionally different in the animal, given the core body and distal limbs have different temperatures.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors utilized (permeabilized) fibers from muscle samples obtained from brown and black bears, squirrels, and Garden dormice, to provide interesting and valuable data regarding changes in myosin conformational states and energetics during hibernation and different types of activity in summer and winter. Assuming that myosin structure is similar between species, its role as a regulator of metabolism would be expected to be similar as well, yet the data reveal some interesting and perplexing differences between the selected hibernating species.

      Strengths:

      The experiments on the permeabilized fibers are complementary, sophisticated, and well-performed, providing new information regarding the characteristics of skeletal muscle fibers between selected hibernating mammalian species under different conditions (summer, interarousal, and winter).

      The studies involve complementary assessments of muscle fiber biochemistry, sarcomeric structure using X-ray diffraction, and proteomic analyses of posttranslational modifications.

      Weaknesses:

      It would be helpful to put these findings on permeabilized fibers into context with the other anatomical/metabolic differences between the species to determine the relative contribution of myosin energetics (with these other contributors) to overall metabolism in these different species, including factors such as fat volume/distribution.

    4. Reviewer #3 (Public Review):

      Summary and Strengths:

      The manuscript by Lewis et al. investigates whether myosin ATP activity may differ between states of hibernation and activity in both large and small mammals. The study interrogates (primarily) permeabilized muscle strips or myofibrils using several state-of-the-art assays, including the mant-ATP assay to investigate ATP utilization of myosin, X-ray diffraction of muscles, proteomics studies, metabolic tests, and computational simulations. The overall data suggests that ATP utilization of myosin during hibernation is different than in active conditions.

      A clear strength of this study is the use of multiple animals that utilize two different states of hibernation or torpor. Two large animal hibernators (Eurasian Brown Bear, American Black Bear) represent large animal hibernators that typically undergo a prolonged hibernation. Two small animal hibernators (Garden Dormouse, 13 Lined Ground Squirrel) undergo torpor with more substantial reductions in heart rate and body temperature, but whose torpor bouts are interrupted by short arousals that bring the animals back to near-summer like metabolic conditions.

      Especially interesting, the investigators analyze the impact that body temperature may have on myosin ATP utilization by performing assays at two different temperatures (8 and 20 degrees C, in 13 Lined Ground Squirrels).

      The multiple assays utilized provide a more comprehensive set of methods with which to test their hypothesis that muscle myosins change their metabolic efficiency during hibernation.

      Suggestions and potential Weaknesses:

      The following highlight comments from the first Public Review that this reviewer acknowledges authors may not be able to address in the current study but may merit carrying to the revised article of record.

      (1) Statistical Analysis
      The revised manuscript addresses the substantial issues. The two remaining questions may be noted for future experimental design(s): 1.c. That myosin isoforms may be considered a main effect and 1.e. The importance of biological vs statistical significance, especially for the mant-ATP chase data from the American Black Bear, where there appear to be shifts between the summer and winter data.

      (2) Consistency of DRX/SRX data
      The responses to the first Public Review on the prior version of this manuscript highlight that a potential disconnect between the mant-ATP-predicted SRX:DRX proportions and X-ray diffraction studies measuring the position of the myosin heads (Mohran et al PMID 38103642) may be outside of the scope of the current manuscript. The reviewer accepts that a substantial discussion is outside the scope of this article, but considers a brief mention of possible differences between ATP kinetics and structural movements to be of value.

      Overall, the manuscript represents a valuable data set comparing myosin properties of skeletal muscles from multiple species exhibiting different forms of hibernation/torpor.

    1. eLife assessment

      This valuable study reports a potential connection between the seminal microbiome and sperm quality/male fertility. The data are generally convincing, but the statistical methods employed need further justification. This study will be of interest to clinicians and biomedical researchers who work on microbiome and male fertility.

    2. Reviewer #2 (Public Review):

      Summary:

      The study by Mowla et al analysed the seminal microbiome together with semen quality parameters in fertile men and men from infertile couples with different infertility diagnoses. The study is of potential interest, with solid study design and methodology; nevertheless, the statistical analysis approach is not fully justified.

      - The patient groups have different diagnoses and should be handled as different groups, and not fused into one 'patient' group in analyses.
        Why are the data in tables presented as controls and cases? I would consider men from couples with recurrent pregnancy loss, unexplained infertility, and male factor infertility to have different seminal parameters (not to fuse them into one group). This means, that the statistical analyses should be performed considering each group separately, and not to fuse 3 different infertility diagnoses into one patient group.

      - Were any covariables included in the statistical analyses, e.g. age, BMI, smoking, time of sexual abstinence, etc?

      - Furthermore, it is known that 16S rRNA gene analysis does not provide sensitive enough detection of bacteria on the species level. How much do the authors trust their results on the species level?

      - Were the analyses of bacterial genera and species abundances with seminal quality parameters controlled for diagnosis and other confounders?

      Strengths:

      The cohort of participants seems to be homogenous in the sense of ethnicity and location.

      The authors stress that their study is the biggest on the microbiome in semen. However, when considering that the study consists of 4 groups (with n=46-63), it does not stand out from previous studies.

      Weaknesses:

      There is a lack of paired seminal/urine samples.

    1. Reviewer #2 (Public Review):

      Summary:

      In this study, the authors performed a screening for PDXP inhibitors to identify compounds that could increase levels of pyridoxal 5'-phosphate (PLP), the co-enzymatically active form of vitamin B6. For the screening of inhibitors, they first evaluated a library of about 42,000 compounds for activators and inhibitors of PDXP and secondly, they validated the inhibitor compounds with a counter-screening against PGP, a close PDXP relative. The final narrowing down to 7,8-DHF was done using PLP as a substrate and confirmed the efficacy of this flavonoid as an inhibitor of PDXP function. Physiologically, the authors show that, by acutely treating isolated wild-type hippocampal neurons with 7,8-DHF they could detect an increase in the ratio of PLP/PL compared to control cultures. This effect was not seen in PDXP KO neurons.

      Strengths:

      The screening and validation of the PDXP inhibitors have been done very well because the authors have performed crystallographic analysis, a counter screening, and mutation analysis. This is very important because such rigor has not been applied to the original report of 7,8-DHF as an agonist for TrkB, which is why there is so much controversy about that finding.

      Weaknesses:

      As mentioned in the summary report, the study may benefit from some in vivo analysis of PLP levels following 7,8-DHF treatment, although I acknowledge that this may be challenging because the dosage and timing of the procedure would need to be worked out.

    2. Reviewer #3 (Public Review):

      This is interesting biology. Vitamin B6 deficiency has been linked to cognitive impairment. It is not clear whether supplements are effective in restoring functional B6 levels. Vitamin B6 is composed of pyridoxal compounds and their phosphorylated forms, with pyridoxal 5'-phosphate (PLP) being of particular importance. The levels of PLP are determined by the balance between pyridoxal kinase and phosphatase activities. The authors are testing the hypothesis that inhibition of pyridoxal phosphatase (PDXP) would arrest the age-dependent decline in PLP, offering an alternative therapeutic strategy to supplements. Published data illustrating that ablation of the Pdxp gene in mice led to increases in PLP levels and improvement in learning and memory trials are consistent with this hypothesis.

      In this report, the authors conduct a screen of a library of ~40k small molecules and identify 7,8-dihydroxyflavone (DHF) as a candidate PDXP inhibitor. They present an initial characterization of this micromolar inhibitor, including a co-crystal structure of PDXP and 7,8-DHF. In addition, they demonstrate that treatment of cells with 7,8-DHF increases PLP levels. Overall, this study provides further validation of PDXP as a therapeutic target for the treatment of disorders associated with vitamin B6 deficiency and provides proof-of-concept for inhibition of the target with small-molecule drug candidates.

      Strengths include the biological context, the focus on an interesting and under-studied class of protein phosphatases that includes several potential therapeutic targets, and the identification of a small molecule inhibitor that provides proof-of-concept for a new therapeutic strategy. Overall, the study has the potential to be an important development for the phosphatase field in general.

      Weaknesses include the fact that the compound is very much an early-stage screening hit. It is an inhibitor with micromolar potency for which mechanisms of action other than inhibition of PDXP have been reported. Extensive further development will be required to demonstrate convincingly the extent to which its effects in cells are due to on-target inhibition of PDXP.

    1. eLife assessment

      This study provides fundamental new knowledge into the role of reversible cysteine oxidation and reduction in protein kinase regulation. The data provide convincing evidence that intra-molecular disulfide bonds serve a repressive regulatory role in the Brain Selective Kinases (BRSK) 1 & 2, part of the as yet understudied 'dark kinome'. The findings will be of broad interest to biochemists, structural biologists, and those interested in the rational design and development of next-generation kinase inhibitors.

    1. Reviewer #3 (Public Review):

      Summary:

      This manuscript reports the novel observation of alterations in the nuclear pore (NUP) components and the function of the nuclear envelope in knock-in models of APP and presenilin mutations. The data show that loss of NUP immunoreactivity (IR) and pore density are observed at times prior to plaque deposition in this model. The loss of NUP IR is correlated with an increase in intraneuronal Abeta IR with two monoclonal antibodies that react with the N-terminus of Abeta. Similar results are observed in cultured neurons from APP-KI and wt mice, where further results indicate that Abeta "drives" this process: incubation of neurons with oligomeric, but not monomeric or fibrillar, Abeta causes loss of NUP IR; incubation with conditioned media from KI cells, but not wt cells, also causes loss of NUP IR; and treatment with the gamma secretase inhibitor, NAPT, partially blocks the loss of NUP IR. Further data show that nuclear envelope function is altered in KI cells and that KI cells are more sensitive to TNFalpha-induced necroptosis. This is potentially an important and significant report, but how this fits within the larger picture of what is known about amyloid aggregation and accumulation and pathogenesis in neurons needs to be clarified. The results from mouse brains are strong, while the results from cultured cells are in some instances of a lower magnitude, less convincing, ambiguous, and sometimes over-interpreted.

      Comments on revised version:

      I am disappointed in the responses submitted in the revised manuscript. Although two new supplemental figures are shown, they do not provide the new data needed to address the points raised by myself and the other reviewers. For example, I asked the authors to provide data to place their observations on lower levels of NUPs and mislocalization of nuclear proteins in the context of previously published reports of nuclear amyloid pathology in APP mouse models by Pensalfini et al., 2014 and Lee et al., 2022, who report amyloid fibrils in some neuronal nuclei along with rosettes of perinuclear autophagic vacuoles containing Abeta immunoreactive material that also stains with amyloid fibril-specific antibodies. In response the authors state: "We have devoted a section of the discussion to highlight some of these findings in the context of Pensalfini et al. 2014 and Lee et al. 2022. Lee et al. tested multiple animal strains to observe the Panthos structures but did not use the App KI mouse model. Since none of our experiments directly tested their observations (e.g. perinuclear fibrils or acidity of autophagic vesicles) in App KI, we decided to take a more conservative approach in our interpretations by framing the NPC deficits without specifying the nature of the intracellular Aβ. We note in discussion that it is entirely possible that App KI animals also show the same Panthos phenotypes and the perinuclear accumulation of Aβ which results in damaged NUPs. To do that, the Panthos phenotype must first be established in App KI mice."

      But the "discussion" is just a couple of sentences that misrepresent the findings of the previous publications and offer excuses for not doing experiments that the authors should do, such as examining whether neurons with intranuclear amyloid and perinuclear autophagic vacuoles occur in the mouse model they use. These are experiments that they should do, and they would be easy to do. It is not an imposition to ask for these data because they presumably have the mouse brain tissue, so they could cut more brain sections and co-stain them with NUP antibodies and the antibodies against fibrillar Abeta and autophagic vesicle markers.

      This is just one of many comments where new data are needed but not provided. It is disappointing that the revised manuscript is not significantly improved.

    1. eLife assessment

      In this interesting study, Drożdżyk and colleagues analyze the ability of placental CALHM orthologs to form stable complexes, identifying that CALHM2 and CALHM4 form heterooligomeric channels. The authors then determine cryo-EM structures of heterooligomeric CALHM2 and CALHM4 that reveal a distinct arrangement in which the two orthologs can interact, but preferentially segregate in the channel. This is an important study; the data provide compelling support for the interpretations and overall, the work is clearly described.

    2. Reviewer #1 (Public Review):

      The Calcium Homeostasis Modulators (CALHM) are a family of large pore channels, of which the physiological role of CALHM1 and 3 is well understood, in particular their key role in taste sensation via the release of the neurotransmitter ATP. The activation mechanism of CALHM1 involves membrane depolarization and a decrease in extracellular Ca concentration, allowing the passage of large cellular metabolites. However, the activation mechanism and physiological roles of other family members are much less well understood. Many structures of homomeric CALHM proteins have been determined, revealing distinct oligomeric assemblies despite a common transmembrane domain topology. CALHM1 and 3 have been shown functionally to form heteromeric assemblies with properties distinct from those of homomeric CALHM1. However, the structural basis of heteromeric CALHM1 and 3 remains unexplored.

      In this paper, Drozdzyk et al. present an important study on the structures of heteromeric channels composed of CALHM2 and CALHM4, extending the structural understanding of the CALHM family beyond homomeric channels. The study relies primarily on cryo-EM. Despite the inherent challenges of structural determination due to the similar structural features of CALHM2 and CALHM4, the authors innovatively use synthetic nanobodies to distinguish between the subunits. Their results show a broad distribution of different heteromeric assemblies, with CALHM4 conformation similar to its homomeric form and CALHM2 conformation influenced by its proximity to CALHM4, and provide detailed insights into the interaction between CALHM2 and CALHM4.

      The manuscript is well-structured and presents clear results that support the conclusions drawn. The discovery of heteromeric CALHM channels, although currently limited to an overexpressed system, represents a significant advance in the field of large-pore channels and will certainly encourage further investigation into the physiological relevance and roles of heteromeric CALHM channels.

      Comments on the revised version:

      I appreciate the authors' efforts to try the alternative data processing strategy. Congratulations to the authors for this interesting and important work!

    3. Reviewer #2 (Public Review):

      Summary:

      The authors identified that two of the placental CALHM orthologs, CALHM2 and CALHM4, can form heterooligomeric channels that are stable following detergent solubilization. By adding fiducial markers that specifically recognize either CALHM2 or CALHM4, the authors determine a cryo-EM density map of heterooligomeric CALHM2/CALHM4 from which they can determine how the channel is assembled. Surprisingly, the two orthologs segregate into two distinct segments of the channel. This segregation enables the interfacial subunits to ease the transition between the preferred conformations of each ortholog, which are similar to the conformation that each ortholog adopts in homooligomeric channels.

      Strengths:

      Through the use of fiducial markers, the authors can clearly distinguish between the CALHM2 and CALHM4 protomers in the heterooligomeric channels, strengthening their assignment of most of the protomers. The authors take appropriate caution in identifying two subunits that are likely a mix of the two orthologs in the channel.

      Weaknesses:

      Despite the authors' efforts, no currents could be observed that corresponded to CALHM2/CALHM4 channels and thus the functional effect of their interaction is not known.

  2. Apr 2024
    1. Author response:

      The following is the authors’ response to the original reviews.

      We would like to express our gratitude to the reviewers for their suggestions and critiques as we continually strive to enhance the quality of the manuscript. We improved it by incorporating the reviewers’ suggestions, changing the content and numbering of figures (Figs 1, 3S1 were edited; 4 figures were moved to supplemental materials), and adding several analyses suggested by the reviewers along with accompanying figures (1S2, 1S3) and tables (1 and 2). These analyses include investigating the link between freezing behavior and 44-kHz calls as well as their sound mean power and duration. Also, we have introduced detailed information regarding the experiments performed and expanded the description and discussion of the results section. Finally, we added information about 44-kHz calls reported by another group, which was inspired by our findings.

      Below is the point-by-point response to the reviewers’ comments.

      Reviewer #1 (Public Review):

      Olszyński and colleagues present data showing variability from canonical "aversive calls", typically described as long 22 kHz calls rodents emit in aversive situations. Similarly long but higher-frequency (44 kHz) calls are presented as a distinct call type, including analyses both of their acoustic properties and animals' responses to hearing playback of these calls. While this work adds an intriguing and important reminder, namely that animal behavior is often more variable and complex than perhaps we would like it to be, there is some caution warranted in the interpretation of these data. The authors also do not provide adequate justification for the use of solely male rodents. With several reported sex differences in rat vocal behaviors this means caution should be exercised when generalizing from these findings.

      We fully agree that our data should be interpreted with caution and we followed the Reviewer’s suggestions along these lines (see below). Also, we appreciate the suggestion to explore the prevalence of 44-kHz calls in female subjects, which would indeed represent an important and intriguing extension of our research. However, due to present financial constraints, we can only plan such experiments. To address the comment, we have added the sentence: “Here we are showing introductory evidence that 44-kHz vocalizations are a separate and behaviorally-relevant group of rat ultrasonic calls. These results require further confirmations and additional experiments, also in form of repetition, including research on female rat subjects.”

      It is important to note that the data presented in the current manuscript originates primarily from previously conducted experiments. These earlier experiments employed male subjects only; it was due to established evidence indicating that the female estrus cycle significantly influences ultrasonic vocalization (Matochik et al., 1992). Adhering to controls for the estrus cycle would require a greater number of female subjects than males, which would not only increase animal suffering but also escalate the demands of human labor and financial costs.

      Firstly, the authors argue that the shift to higher-frequency aversive calls is due to an increase in arousal (caused by the animals having received multiple aversive foot shocks towards the end of the protocols). However, it cannot be ruled out that this shift would be due to factors such as the passage of time and increase in fatigue of the animals as they make vocalizations (and other responses) for extended periods of time. In fact the gradual frequency increase reported for 22 kHz calls and the drop in 44 kHz calls the next day in testing is in line with this.

      Answer: We would like to point out that the “increased-arousal” hypothesis, declared in the manuscript, is only a hypothesis – as reflected by the wording used. However, we changed the beginning of the sentence in question from “It could be argued” to “We would like to propose a hypothesis” to emphasize the speculative aspect of the proposed explanation behind the increase of 44-kHz ultrasonic emissions.

      Also, we do agree that other factors could contribute to the increased emission of 44-kHz calls. These factors could include: heightened fear, stress/anxiety, annoyance/anger, disgust/boredom, grief/sadness, despair/helplessness, and weariness/fatigue. We list these potential factors in the discussion. Also, we added: “It is not possible, at this stage, to determine which factors played a decisive role. Please note that the potential contribution of these factors is not mutually exclusive”. However, we propose a list of arguments supporting the idea that 44-kHz vocalizations communicate an increased negative emotional state. Among these arguments were the conclusions drawn from additional analyses, mostly inspired by the fatigue hypothesis proposed by Reviewer #1. In particular, we investigated changes in the sound mean power and duration of 22-kHz and 44-kHz calls. Specifically, we showed that the mean power of 44-kHz vocalizations did not change, and was higher than that of 22-kHz vocalizations (Fig. 1S2EF).

      Finally, Reviewer #1 listed “the gradual frequency increase reported for 22 kHz calls and the drop in 44 kHz calls the next day” as arguments for the fatigue hypothesis. We do not agree that the “increase” should be interpreted as a sign of fatigue: producing and maintaining higher-frequency calls requires greater effort from the vocalizer, on which we elaborated in the manuscript. We are also not sure which “drop in 44 kHz calls” the Reviewer is referring to. We assume it refers to fewer 44-kHz calls during testing vs. training; we suppose that the levels of arousal are lower in the test due to the shorter session time and lack of shocks, which additionally contributes to fear extinction.

      Secondly, regarding the analysis where calls were sorted using DBSCAN based on peak frequency and duration, it is not surprising that the calls cluster based on frequency and duration, i.e. the features that are used to define the 44 kHz calls in the first place. Thus presenting this clustering as evidence of them being truly distinct call types comes across as a circular argument.

      Answer: The DBSCAN sorting results were meant to convey that, as we varied the clustering ε value (which sets the degree of cluster separation), the 44-kHz vocalizations remained distinct while the 22-kHz and various short-call clusters merged. In other words: 44-kHz calls remained separate from long 22-kHz, short 22-kHz and 50-kHz vocalizations, which all consolidated into one common cluster. As a result, in this mathematical analysis, 44-kHz vocalizations remained distinct without applying human biases. Additionally, frequency and duration are the two most common features used to define all types of calls (Barker et al., 2010; Silkstone & Brudzynski, 2019a, 2019b; Willey & Spear, 2013). In summary, we did not expect the analysis to isolate the 44-kHz calls, and we were surprised by this result.
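
      The ε sweep described in this answer can be illustrated with a minimal, self-contained sketch. Everything in it is an assumption made for illustration: the points are synthetic and invented here (they are not the authors' recordings), the feature rescaling in `scale` is arbitrary, the values ε = 0.5 and ε = 1.3 and `min_pts = 4` were chosen by hand, and the small `dbscan` function is a textbook implementation rather than the software the authors used.

```python
from math import hypot

def scale(freq_khz, dur_s):
    # Arbitrary rescaling (assumption): 10 kHz and 0.25 s each span one unit.
    return (freq_khz / 10.0, dur_s * 4.0)

def make_calls():
    """Build SYNTHETIC calls as (group name, scaled point) pairs."""
    calls = []
    for f in range(20, 28, 2):                      # 20, 22, 24, 26 kHz
        for ms in (50, 100, 150, 200):
            calls.append(("short 22-kHz", scale(f, ms / 1000)))
        for ms in range(500, 1300, 100):            # 0.5 to 1.2 s
            calls.append(("long 22-kHz", scale(f, ms / 1000)))
    for f in range(36, 68, 4):                      # 36 to 64 kHz
        for ms in (50, 100):
            calls.append(("50-kHz", scale(f, ms / 1000)))
    for f in (42, 44, 46):
        for ms in range(500, 1300, 100):
            calls.append(("44-kHz", scale(f, ms / 1000)))
    return calls

def dbscan(points, eps, min_pts):
    """Textbook DBSCAN; returns one label per point, -1 marks noise."""
    labels = [None] * len(points)

    def neighbors(i):
        xi, yi = points[i]
        return [j for j, (xj, yj) in enumerate(points)
                if hypot(xi - xj, yi - yj) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1                          # provisional noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(seeds)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster                 # noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:           # core point: keep expanding
                queue.extend(reachable)
    return labels

calls = make_calls()
groups = [g for g, _ in calls]
points = [p for _, p in calls]

def summarize(eps, min_pts=4):
    """Return (number of clusters, whether 44-kHz calls share no label with others)."""
    labels = dbscan(points, eps, min_pts)
    n_clusters = len({l for l in labels if l != -1})
    own = {l for l, g in zip(labels, groups) if g == "44-kHz"}
    rest = {l for l, g in zip(labels, groups) if g != "44-kHz"}
    return n_clusters, own.isdisjoint(rest)

for eps in (0.5, 1.3):
    print(eps, summarize(eps))
```

      On these synthetic points, the smaller ε yields four clusters and the larger ε yields two, with the 44-kHz group keeping its own label in both cases. This only shows the mechanics of the sweep; whether real calls behave this way depends on the recorded data.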

      The sparsity of calls in the 30-40 kHz range (shown in the individual animal panels in Figure 2C) could in theory be explained by some bioacoustics properties of rat vocal cords, without necessarily the calls below and above that range being ethologically distinct.

      Answer: We respectfully disagree with the argument regarding sparsity. It is important to note that, during prolonged fear conditioning experiments, we observed an increased incidence of 44-kHz calls (Fig. 1E-G) of up to >19% (Fig. 1S2AB) of the total ultrasonic vocalizations during specific inter-trial intervals. Also, it is possible that in observed experimental circumstances almost every fifth call could be attributed to the vocal apparatus as an artifact of its functioning (assuming we are interpreting the Reviewer’s argument correctly). While we do not believe this to be the case, we acknowledge the importance of considering such a hypothesis.

      The behavioral response to call playback is intriguing, although again more in line with the hypothesis that these are not a distinct type of call but merely represent expected variation in vocalization parameters. Across the board animals respond rather similarly to hearing 22 kHz calls as they do to hearing 44 kHz calls, with occasional shifts of 44 kHz call responses to an intermediate between appetitive and aversive calls. This does raise interesting questions about how, ethologically, animals may interpret such variation and integrate this interpretation in their responses. However, the categorical approach employed here does not address these questions fully.

      Answer: We are unsure of the Reviewer’s critique in this paragraph and will attempt to address it to the best of our understanding. Our finding that up to >19% of calls were long, seemingly aversive 44-kHz calls, at a frequency in the defined appetitive ultrasonic range (usually >32 kHz), is unexpected rather than “expected”. We would agree that aversive call variation is expected, but not in the appetitive frequency range.

      Kindly note the findings by Saito et al. (2019), which claim that frequency band plays the main role in rat ultrasonic perception. It is possible that the higher peak frequency of 44-kHz calls may be a strong factor in their perception by rats, which is, however, modified by the longer duration and the lack of modulation.

      Also, from our experience, it is quite challenging to demonstrate different behavioral responses of naïve rats to pre-recorded 22-kHz (aversive) vs. 50-kHz (appetitive) vocalizations. Therefore, we expected that demonstrating a difference in response to two distinct, potentially aversive calls, i.e., 22-kHz vs. 44-kHz calls, would be even more difficult (to our knowledge, a comparable experiment with short vs. long 22-kHz ultrasonic vocalizations has not been done before).

      Therefore, we do not take lightly the surprising and interesting finding that “animals respond rather similarly to hearing 22 kHz calls as they do to hearing 44 kHz calls, with occasional shifts of 44 kHz call responses to an intermediate between appetitive and aversive calls”. We would rather put this description in analogous words: “the rats responded similarly to hearing 44-kHz calls as they did to hearing aversive 22-kHz calls, especially regarding heartrate change, despite the 44-kHz calls occupying the frequency band of appetitive 50-kHz vocalizations” and “other responses to 44-kHz calls were intermediate, they fell between response levels to appetitive vs. aversive playback” – which we added to the Discussion.

      Finally, we acknowledge that our findings do not present a finite and complete picture of the discussed aspects of behavioral responses to the presented ultrasonic stimuli (44-kHz vocalizations). Therefore, we have incorporated the Reviewer’s suggestion in the discussion. The added sentence reads: “Overall, these initial results raise further questions about how, ethologically, animals may interpret the variation in hearing 22-kHz vs. 44-kHz calls and integrate this interpretation in their responses.”

      In sum, rather than describing the 44kHz long calls as a new call type, it may be more accurate to say that sometimes aversive calls can occur at frequencies above 22 kHz. Individual and situational variability in vocalization parameters seems to be expected, much more so than all members of a species strictly adhering to extremely non-variable behavioral outputs.

      Answer: The surprising fact that there are presumably aversive calls beyond the commonly applied threshold, i.e. >32 kHz, that share some characteristics with 22-kHz calls is the main finding of the current publication. Whether they are ultimately classified as a new type or subtype (i.e. a separate category), or grouped with 22-kHz vocalizations into a supergroup of aversive calls, is of secondary importance and remains to be discussed with other researchers in the field.

      However, we would argue, by way of comparison, that 22-kHz calls occur at durations of <300 ms and also >300 ms, and are usually referred to in the literature as short and long 22-kHz vocalizations, respectively (not introduced with a description that “sometimes 22-kHz calls can occur at durations below 300 ms”). These are then regarded and investigated as separate groups or classes usually referred to as two different “types” (e.g., Barker et al., 2010) or “subtypes” (e.g., Brudzynski, 2015). Analogously, 44-kHz vocalizations can also be regarded as a separate type or a subtype of 22-kHz calls. The problem with the latter is that 22-kHz vocalizations are traditionally and predominantly defined by 18–32 kHz frequency bandwidth (Araya et al., 2020; Barroso et al., 2019; Browning et al., 2011; Brudzynski et al., 1993; Hinchcliffe et al., 2022; Willey & Spear, 2013).

      Reviewer #2 (Public Review):

      Olszyński et al. claim that they identified a "new-type" ultrasonic vocalization around 44 kHz that occurs in response to prolonged fear conditioning (using foot-shocks of relatively high intensity, i.e. 1 mA) in rats. Typically, negative 22-kHz calls and positive 50-kHz calls are distinguished in rats, commonly by using a frequency threshold of 30 or 32 kHz. Olszyński et al. now observed so-called "44-kHz" calls in a substantial number of subjects exposed to 10 tone-shock pairings, yet call emission rate was low (according to Fig. 1G around 15%, according to the result text around 7.5%).

      Answer: We thank the Reviewer for acknowledging the strengths of our work. Please note that Figure 1G refers to Wistar rats in the 10-trial delay fear conditioning session, in which 44-kHz calls constituted 14.1% of ultrasonic vocalizations. The 7.5% figure in the results refers to the total of vocalizations analyzed across all animal groups used in fear conditioning experiments. These values have been updated in the current version of the manuscript. Also, please note that 44-kHz calls constituted up to 19.4% of calls, on average, in one of the ITIs during the fear conditioning session. However, the prevalence of aversive calls, and of 44-kHz vocalizations in particular, varied between individual rats; we added the text: “for n = 3 rats, 44-kHz vocalizations accounted for >95% of all calls during at least one ITI (e.g., 140 of total 142, 222 of 231, and 263 of 265 tallied 44-kHz calls), and in n = 9 rats, 44-kHz vocalizations constituted >50% of calls in more than one ITI.” See also below for the description of the array of experiments analyzed and the prevalence/percentage of 44-kHz calls encountered (Tab. 1, Fig. 1S3).
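
      As a quick arithmetic check of the per-animal tallies quoted in this answer, the short sketch below recomputes each proportion from the counts given in the text. The helper name `share_percent` is introduced here purely for illustration and is not part of any analysis pipeline described by the authors.

```python
def share_percent(part, total):
    """Return part/total expressed as a percentage."""
    return 100.0 * part / total

# (44-kHz calls, all calls) for the three ITIs quoted in the response
tallies = [(140, 142), (222, 231), (263, 265)]
shares = [share_percent(k, n) for k, n in tallies]

for (k, n), s in zip(tallies, shares):
    print(f"{k}/{n} = {s:.1f}%")
```

      All three proportions exceed 95%, consistent with the ">95%" statement in the quoted text.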

      Weaknesses: I see a number of major weaknesses.

      While the descriptive approach applied is useful, the findings have only focused importance and scope, given the low prevalence of "44 kHz" calls and limited attempts made to systematically manipulate factors that lead to their emission. In fact, the data presented appear to be derived from reanalyses of previously conducted studies in most cases and the main claims are only partially supported. While reading the manuscript, I got the impression that the data presented here are linked to two or three previously published studies (Olszyński et al., 2020, 2021, 2023). This is important to emphasize for two reasons:

      (1) It is often difficult (if not impossible) to link the reported data to the different experiments conducted before (and the individual experimental conditions therein). While reanalyzing previously collected data can lead to important insight, it is important to describe in a clear and transparent manner what data were obtained in what experiment (and more specifically, in what exact experimental condition) to allow appropriate interpretation of the data. For example, it is said that in the "trace fear conditioning experiment" both single- and group-housed rats were included, yet I was not able to tell what data were obtained in single- versus group-housed rats. This may sound like a side aspect, however, in my view this is not a side aspect given the fact that ultrasonic vocalizations are used for communication and communication is affected by the social housing conditions.

      Answer: In preparing the current manuscript, we indeed used data collected during fear conditioning experiments which were described previously (Olszyński et al., 2021; Olszyński et al., 2022). Please note, however, that vocalization behavior during the fear conditioning itself was not the main subject of these publications. Our previous publications (Olszyński et al., 2020; Olszyński et al., 2021; Olszyński et al., 2022) present primarily ultrasonic-vocalization data from the playback part of the experiments, whereas here we analyze recordings obtained during fear conditioning experiments; thus we are analyzing new parts, i.e., not yet analyzed, of previously published studies. Also, we have performed additional experiments.

      In the first version of the current manuscript, we did not attempt to demonstrate exactly which calls were recorded in which conditions as the focus was to demonstrate that 44-kHz calls were emitted in several different fear-conditioning experiments. Also, as the experiments were not performed simultaneously and are results from different experimental situations, we would prefer to not compare these results directly.

      However, in the current version of the manuscript, we have introduced an additional reference system, based on Tab. 1, to more clearly indicate which rats have been employed in each analysis, e.g. the group of “Wistar rats that underwent 10 trials of fear conditioning” is described as “Tab. 1/Exp. 1-3/#2,4,8,13; n = 46”, i.e., these are the rats listed in rows 2, 4, 8, and 13 of Tab. 1.

      We have also tried to unify the analyses, in terms of rats used, as much as possible. Finally, we have also introduced Fig. 1S3 to demonstrate the prevalence of 44-kHz calls in all experiments analyzed with the note that “the experiments were not performed in parallel”.

      Regarding the Reviewer’s concerns about analyzing single- and pair-housed rats together, we have examined the ultrasonic vocalizations emitted and the freezing behavior in these two groups.

      • Ultrasonic vocalizations; when comparing the number of vocalizations, their duration, peak frequency and latency to first occurrence, equally for all types of calls and divided into types (short 22-kHz, long 22-kHz, 44-kHz, 50-kHz), the only difference was observed in peak frequency in 50-kHz vocalizations (50.7 ± 2.8 kHz for paired vs. 61.8 ± 3.1 kHz for single rats; p = 0.0280, Mann-Whitney). Since 50-kHz calls are not the subject of the current publication, we did not investigate this difference further. Also, this difference was not observed during playback experiments (Olszyński et al., 2020, Tab. 1).

      • Freezing. There were no differences between single- and pair-housed groups in freezing behavior, both in the time before first shock presentation and during fear conditioning training (Mann-Whitney).

      In summary, since the two groups did not differ in relevant ultrasonic features and freezing, we decided to present the results obtained from these rats together. However, we agree with the Reviewer, and it is possible that social housing conditions may in fact affect the emission of 44-kHz vocalizations, which could be a subject of another project – involving, e.g., larger experimental groups observed under hypothesis-oriented and defined conditions.

      (2) In at least two of the previously published manuscripts (Olszyński et al., 2021, 2023), emission of ultrasonic vocalizations was analyzed (Figure S1 in Olszyński et al., 2021, and Fig. 1 in Olszyński et al., 2023). This includes detailed spectrographic analyses covering the frequency range between 20 and 100 kHz, i.e. including the frequency range, where the "new-type" ultrasonic vocalization, now named "44 kHz" call, occurs, as reflected in the examples provided in Fig. 1 of Olszyński et al. (2023). In the materials and methods there, it was said: "USV were assigned to one of three categories: 50-kHz (mean peak frequency, MPF >32 kHz), short 22-kHz (MPF of 18-32 kHz, <0.3 s duration), long 22-kHz (MPF of 18-32 kHz, >0.3 s duration)". Does that mean that the "44 kHz" calls were previously included in the count for 50-kHz calls? Or were 44 kHz calls (intentionally?) left out? What does that mean for the interpretation of the previously published data? What does that mean for the current data set? In my view, there is a lack of transparency here.

      Answer: As mentioned above, we indeed used data collected during fear conditioning experiments which were described previously (Olszyński et al., 2021; Olszyński et al., 2022). However, in these publications, ultrasonic vocalizations emitted during playback experiments were the main subject, while the ultrasonic calls emitted during fear conditioning (performed before the playback) were only analyzed in a preliminary way. As a result, the 44-kHz vocalizations analyzed in the current manuscript were not included in the previous analyses. In particular, in Olszyński et al. (2021), we counted the overall number of ultrasonic vocalizations before the fear conditioning session to determine the basal ultrasonic emissions (Fig. S1). Then, in our next article (Olszyński et al., 2022), we again analyzed the number of all ultrasonic vocalizations before fear conditioning (Fig. S1) and restricted the analysis of vocalizations during fear conditioning to 22-kHz calls (Tab. S1 and S2).

      Also, we re-reviewed all the data used in our previous playback publications. Overall, 44-kHz calls were extremely rare in the playback parts of the experiments. There were no 44-kHz calls in the playback data used in Olszyński et al. (2022) and Olszyński et al. (2020). In Olszyński et al. (2021), one rat produced eight 44-kHz calls. These 44-kHz calls constituted 0.03% of all vocalizations analyzed in the experiment (8/24888) and were included in the total number of calls analyzed (but not in the 50-kHz group); they were not described in further detail in that publication.

      Moreover, whether the newly identified call type is indeed novel is questionable, as also mentioned by the authors in their discussion section. While they wrote in the introduction that "high-pitch (>32 kHz), long and monotonous ultrasonic vocalizations have not yet been described", they wrote in the discussion that "long (or not that long (Biały et al., 2019)), frequency-stable high-pitch vocalizations have been reported before (e.g. Sales, 1979; Shimoju et al., 2020), notably as caused by intense cholinergic stimulation (Brudzynski and Bihari, 1990) or higher shock-dose fear conditioning (Wöhr et al., 2005)" (and I wish to add that to my knowledge this list provided by the authors is incomplete). Therefore, I believe, the strong claims made in abstract ("we are the first to describe a new-type..."), introduction ("have not yet been described"), and results ("new calls") are not justified.

      Answer: We would argue that 44-kHz vocalizations were indeed reported but not described. As far as we are aware, an in-depth analysis of the properties of long, high-frequency calls and of the experimental circumstances of their emission has not yet been performed. The researchers cited above observed calls that are, at least to a degree, similar to the ones we observed, as we mentioned in the discussion section. However, since these reported 44-kHz vocalizations were not fully described, we can only guess that they may be similar to ours. We speculate that, perhaps like us, these researchers unknowingly recorded 44-kHz calls in their experiments and may also be able to describe them more extensively when re-analyzing their data, as we have done here.

      Possibly, it was difficult to find reports on vocalizations similar to the 44-kHz calls that we observed because of the canonical and accepted definitions of ultrasonic vocalization types. Biały et al. (2019) assigned them to the 22-kHz group, perhaps because their calls were often of a step variation having both low and high components. Shimoju et al. (2020) grouped them along with 50-kHz vocalizations because they appeared during stroking of rats held vertically; this procedure was compared to tickling, which usually elicits appetitive calls.

      Reviewer #2 states that there are other publications that would complete the list. We are aware of other articles authored by the same team as Shimoju et al. (2020) with different first authors. However, they report findings similar to those of the cited article. Beyond these, we are not aware of further reports and would gladly cite a more complete list of publications showing atypical, long, monotonous high-frequency vocalizations similar to those observed in our experiments. Therefore, we would argue that ultrasonic vocalizations which were long, flat, high in frequency, and repeatedly occurring in a defined behavioral situation have not been reported before. However, concerning the strong claims of novelty of our finding, we toned them down where we found this was warranted.

      In general, the manuscript is not well written/ not well organized, the description of the methods is insufficient, and it is often difficult (if not impossible) to link the reported data to the experiments/ experimental conditions described in the materials and methods section.

      Answer: The description of the methods has been adjusted and expanded. We added the requested link to each particular experiment in the form “Tab. 1/Exp. nos./# nos.”, which shows, each time, which experiments and experimental groups were analyzed. The list of the experiments and groups is found in Tab. 1.

      For example, I miss a clear presentation of basic information: 1) How many rats emitted "44 kHz" calls (in total, per experiment, and importantly, also per experimental condition, i.e. single- versus group-housed)?

      Answer: We now clearly show which experiments were performed and how many animals were tested in each condition (Tab. 1), while the prevalence of 44-kHz calls across experimental conditions and animal groups is shown in Fig. 1S3. Also, we included information regarding the number of animals and the treatment of each group of rats when reporting results. For example, we state that:

      (1a) “53 of all 84 conditioned Wistar rats (Tab. 1/Exp. 1-3/#2,4,6-8,13, Figs 1B, 1E, 1S1BC) displayed” 44-kHz vocalizations – as a general assessment; these numbers differ from those in the first version of the manuscript, where we referred only to Wistar rats conditioned 6 or 10 times.

      (1b) “From this group of rats (n = 46), n = 41 (89.1%) emitted long 22-kHz calls, and 32 of them (69.6%) emitted 44-kHz calls” – this time referring only to 10-times conditioned Wistar rats as the biggest group that could be analyzed together (Figs 1F, 1G, 1S2A).

      (1c) “for n = 3 rats, 44-kHz vocalizations accounted for >95% of all calls during at least one ITI (e.g., 140 of total 142, 222 of 231, and 263 of 265 tallied 44-kHz calls), and in n = 9 rats, 44-kHz vocalizations constituted >50% of calls in more than one ITI.”
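
      As a quick arithmetic check on the proportions quoted in (1c), the following minimal sketch recomputes each ratio from the counts given in the quotation. The variable and function names are illustrative only and are not part of the original analysis.

```python
# Counts quoted in (1c): (44-kHz calls, all calls) tallied in one ITI for three rats.
iti_counts = [(140, 142), (222, 231), (263, 265)]


def percent(part, total):
    """Return part/total expressed as a percentage."""
    return 100.0 * part / total


shares = [percent(n44, n_all) for n44, n_all in iti_counts]

for (n44, n_all), share in zip(iti_counts, shares):
    print(f"{n44}/{n_all} = {share:.1f}%")

# Each quoted example should exceed the 95% threshold stated in the text.
all_above_95 = all(share > 95.0 for share in shares)
print("all above 95%:", all_above_95)
```

      All three quoted examples lie above the stated 95% threshold.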

      (2) Out of the ones emitting "44 kHz" calls, what was the prevalence of "44 kHz" calls (relative to 22- and 50-kHz calls, e.g. shown as percentage)?

      Answer: The prevalence of 44-kHz vocalizations in all investigated experiments and groups is shown in Fig. 1S3CD. Also, more information regarding the percentage of 44-kHz calls was demonstrated in Fig. 1S2AB where we calculated the distribution of 44-kHz calls to 22-kHz calls in Wistar rats, in 10-trial fear conditioning, across the length of the session.

      Additionally, the values are listed in the sentence regarding all Wistar rats which underwent 10 trials of fear conditioning: “these vocalizations were less frequent following the first trial (1.2 ± 0.4% of all calls), and increased in subsequent trials, particularly after the 5th (8.8 ± 2.8%), through the 9th (19.4 ± 5.5%, the highest value), and the 10th (15.5 ± 4.9%) trials, where 44-kHz calls gradually replaced 22-kHz vocalizations in some rats (Fig. 1F, 1S2B, Video 1; comp Fig. 1D vs. 1E).”

      (3) How did this ratio differ between experiments and experimental conditions?

      Answer: The prevalence of 44-kHz vocalizations in all experimental conditions is shown in Fig. 1S3. However, the direct comparison of results obtained in different conditions was not the goal of the present work. Also, we would argue that such direct comparisons of the results of different experiments would not be valid. These experiments were done with different groups of animals, at different times, with different timetables of experimental manipulations.

      However, we are comfortable stating that:

      • There were more 44-kHz vocalizations during fear conditioning training than testing in all fear-conditioned Wistar rats;

      • We observed more 44-kHz vocalizations in Wistar rats compared to SHR.

      (4) Was there a link to freezing? Freezing was apparently analyzed before (Olszyński et al., 2021, 2023) and it would be important to see whether there is a correlation between "44-kHz" calls and freezing. Moreover, it would be important to know what behavior the rats are displaying while such "44-kHz" calls are emitted? (Note: Even not all 22-kHz calls are synced to freezing.) All this could help to substantiate the currently highly speculative claims made in the discussion section ("frequency increases with an increase in arousal" and "it could be argued that our prolonged fear conditioning increased the arousal of the rats with no change in the valence of the aversive stimuli"). Such more detailed analyses are also important to rule out the possibility that the "new-type" ultrasonic vocalization, the so-called "44 kHz" call, is simply associated with movement/ thorax compression.

      Answer: We analyzed freezing behavior and its association with ultrasonic emissions. The emission of 44-kHz vocalizations was associated with freezing. The results are now described and presented in the manuscript, i.e., Tab. 2, its legend and the description in Results: “Freezing during the bins of 22-kHz calls only (p < 0.0001, for both groups) and during 44-kHz calls only bins (p = 0.0003) was higher than during the first 5 min baseline freezing levels of the session. Also, the freezing associated with emissions of 44-kHz calls only was higher than during bins with no ultrasonic vocalizations (p = 0.0353), and it was also 9.9 percentage points higher than during time bins with only long 22-kHz vocalizations, but the difference was not significant (p = 0.1907; all Wilcoxon)” and “To further investigate this potential difference, we measured freezing during the emission of randomly selected single 44-kHz and 22-kHz vocalizations. The minimal freezing behavior detection window was reduced to compensate for the higher resolution of the measurements (3, 5, 10, or 15 video frames were used). There was no difference in freezing during the emission of 44-kHz vs. 22-kHz vocalizations for ≥150-ms-long calls (3 frames, p = 0.2054) and for ≥500-ms-long calls (5 frames, p = 0.2404; 10 frames, p = 0.4498; 15 frames, p = 0.7776; all Wilcoxon, Tab. 2B).”

      Please note that the general observation that "frequency increases with an increase in arousal" is not our claim but a general rule derived from a large body of observations and proposed by others (Briefer et al., 2012); we changed the wording of this statement to: “frequency usually increases with an increase in arousal (Briefer et al., 2012)”.

      The figures currently included are purely descriptive in most cases - and many of them are just examples of individual rats (e.g. majority of Fig. 1, all of Fig. 2 to my understanding, with the exception of the time course, which in case of D is only a subset of rats ("only rats that emitted 44-kHz calls in at least seven ITI are plotted" - is there any rationale for this criterion?)), or, in fact, just representative spectrograms of calls (all of Fig. 3, with the exception of G, all of Fig. 4).

      Answer: Please note that the former figures 2, 4, 6, and 8 have now been moved to supplementary figures 1S1, 2S1, 3S1, and 4S1 to better organize the presentation of data. Figures 1, 3, 5, 7 are now 1, 2, 3, 4 respectively. With regard to presenting data from individual rats, this was done to show the general patterns of ultrasonic call distributions observed. Showing the full data set as seen in Fig. 5A (now Fig. 3A) would obscure the readability of the graph without using mathematical clustering techniques such as DBSCAN.

      Concerning Reviewer #2’s question regarding the criterion of “minimum seven ITI”, we selected the highest vocalizers by taking animals above the 75th percentile of the number of ITI with 44-kHz calls. However, in the current version of the manuscript, we decided to omit this part of the analysis and the accompanying part of the figure, since it added no informative value and relied on a questionable criterion.

      Moreover, the differences between Fig. 5 and Fig. 6 are not clear to me. It seems Fig. 5B is included three times - what is the benefit of including the same figure three times?

      Answer: We hope that designating Fig. 6 as supplementary to Fig. 5 (now Figs 3S1 and 3, respectively) will make them easier to interpret. Fig. 6A (now Fig. 3S1A) is a more detailed look at the information presented in Fig. 5B (now Fig. 3B), with spectrogram images of ultrasonic vocalizations from different areas of the plot. Also, Fig. 3B (former Fig. 5B) was removed from Fig. 3S1B (former Fig. 6B).

      A systematic comparison of experimental conditions is limited to Fig. 7 and Fig. 8, the figures depicting the playback results (which led to the conclusion that "the responses to 44-kHz aversive calls presented from the speaker were either similar to 22-kHz vocalizations or in between responses to 22-kHz and 50-kHz playbacks", although it remains unclear to me why differences were seen before the experimental manipulation, i.e. the different playback types in Fig. 8B).

      Answer: There were indeed instances of such before-differences. Such differences were observed in our previous studies (Olszyński et al., 2020, Tabs S9-12; Olszyński et al., 2021, Tab. S7; Olszyński et al., 2022, Tabs S4, S9, S13, S17, S18) and most likely resulted from performing multiple comparisons. However, we think that the carry-over effect, mentioned by Reviewer #2 (see below), also played a role.

      Related to that, I miss a clear presentation of relevant methodological aspects: 1) Why were some rats single-housed but not the others?

      Answer: As stated before, data were collected from our previous experiments, and the observation of 44-kHz vocalizations in fear conditioning was an emergent discovery made when we decided to analyze ultrasonic recordings from fear conditioning procedures. Single-housed animals were part of our experiment comparing the effects of fear conditioning and social situation on the perception of ultrasonic playback, as described in Olszyński et al. (2020). Aside from this experiment, all other rats were housed in pairs.

      (2) Is the experimental design of the playback study not confounded? It is said that "one group (n = 13) heard 50-kHz appetitive vocalization playback while the other (n = 16) 22-kHz and 44-kHz aversive calls". How can one compare "44 kHz" calls to 22- and 50-kHz calls when "44 kHz" calls are presented together with 22-kHz calls but not 50-kHz calls? What about carry-over effects? Hearing one type of call most likely affects the response to the other type of call. It appears likely that rats are a bit more anxious after hearing aversive 22-kHz calls, for example. Therefore, it would not be very surprising to see that the response to "44 kHz" calls is more similar to 22-kHz calls than 50-kHz calls.

      Of note, in case of the other playback experiment it is just said that rats "received appetitive and aversive ultrasonic vocalization playback" but it remains unclear whether "44 kHz" calls are seen as appetitive or aversive. Later it says that "rats were presented with two 10-s-long playback sets of either 22-kHz or 44-kHz calls, followed by one 50-kHz modulated call 10-s set and another two playback sets of either 44-kHz or 22-kHz calls not previously heard" (and I wonder what data set was included in the figures and how - pooled?). Again, I am worried about carry-over effects here. This does not seem to be an experimental design that allows one to compare the response to the three main call types in an unbiased manner.

      Answer: We apologize for the confusing and overly brief original description of the playback experiments. We wanted to avoid the confusion associated with including several additional playback signals (please note that some are not related to the current comparisons and include different 50-kHz ultrasonic subtypes and two different subtypes of short 22-kHz calls). We lengthened the description of these playback experiments in the current version.

      In general, including more than one type of ultrasonic call as playback carries a risk of a carry-over effect as well as a habituation effect (the responses become weak). However, it greatly reduces the number of required animals. Finally, regarding the first experiment, we chose 3 playbacks to compare the rats’ reactions, as this was the most conservative choice we could think of.

      We would like to highlight that we wanted to compare specifically the rats’ responses to 22-kHz vs. 44-kHz playback (as well as the effects of playback of different subtypes of 50-kHz calls, which is not the subject of the current work). Therefore, we would argue that the design of both experiments is actually unbiased regarding this key comparison (responses to 22-kHz vs. 44-kHz playback). In both experiments, 22-kHz and 44-kHz playbacks were included in the same sequences of stimuli, counterbalanced regarding their order (i.e., taking into account possible carry-over effects), and presented to the same rats. We regarded the group of rats that heard 50-kHz recordings as a baseline/control, since we know from previous playback studies what reactions to expect from rats exposed to these vocalizations (and 22-kHz playback), while in the second experiment, we reduced the 50-kHz playback to one set in order to minimize possible habituation to multiple playbacks.

      We agree that the design of both experiments does not allow for full comparison of the effects of aversive playbacks to 50-kHz playback. Also, we agree that some carry-over effects could play a role. It was mentioned in the discussion: “Please factor in potential carryover effects (resulting from hearing playbacks of the same valence in a row) in the differences between responses to 50-kHz vs. 22/44-kHz playbacks, especially, those observed before the signal (Fig. 4AB).” However, we would still argue that the observed lack of difference in heart-rate response (Fig. 4A) and the differences regarding the number of 50-kHz calls emitted (e.g., Fig. 4S1F) are free of the constraints raised by Reviewer #2.

      We acknowledge that our studies do not give a complete picture of 44-kHz ultrasonic perception in relation to other ultrasonic bands and, given the possibility, we would like to perform more in-depth and focused experiments to study this aspect of 44-kHz calls in the future.

      Finally, regarding the second experiment, the description of the rats now includes that they “received 22-kHz, 44-kHz, and 50-kHz ultrasonic vocalization playback”, while the description of the experiment itself includes: “Responses to the pairs of playback sets were averaged”.

      Of note, what exactly is meant by "control rats" in the context of fear conditioning is also not clear to me. One can think of many different controls in a fear conditioning experiment.

      More concrete information is needed.

      Answer: This information was included in our previous publications. However, it is now also provided in the methods section of the current version of the manuscript. In general, control rats were subjected to the same procedures but did not receive electric shocks.

      Literature included in the answers

      Araya, E. I., Baggio, D. F., Koren, L. O., Andreatini, R., Schwarting, R. K. W., Zamponi, G. W., & Chichorro, J. G. (2020). Acute orofacial pain leads to prolonged changes in behavioral and affective pain components. Pain, 161(12), 2830-2840. https://doi.org/10.1097/j.pain.0000000000001970

      Barker, D. J., Root, D. H., Ma, S., Jha, S., Megehee, L., Pawlak, A. P., & West, M. O. (2010). Dose-dependent differences in short ultrasonic vocalizations emitted by rats during cocaine self-administration. Psychopharmacology (Berl), 211(4), 435-442. https://doi.org/10.1007/s00213-010-1913-9

      Barroso, A. R., Araya, E. I., de Souza, C. P., Andreatini, R., & Chichorro, J. G. (2019). Characterization of rat ultrasonic vocalization in the orofacial formalin test: Influence of the social context. Eur Neuropsychopharmacol, 29(11), 1213-1226. https://doi.org/10.1016/j.euroneuro.2019.08.298

      Biały, M., Podobinska, M., Barski, J., Bogacki-Rychlik, W., & Sajdel-Sulkowska, E. M. (2019). Distinct classes of low frequency ultrasonic vocalizations in rats during sexual interactions relate to different emotional states. Acta Neurobiol Exp (Wars), 79(1), 1-12. https://www.ncbi.nlm.nih.gov/pubmed/31038481

      Briefer, E. F., Padilla de la Torre, M., & McElligott, A. G. (2012). Mother goats do not forget their kids' calls. Proc Biol Sci, 279(1743), 3749-3755. https://doi.org/10.1098/rspb.2012.0986

      Browning, J. R., Browning, D. A., Maxwell, A. O., Dong, Y., Jansen, H. T., Panksepp, J., & Sorg, B. A. (2011). Positive affective vocalizations during cocaine and sucrose self administration: a model for spontaneous drug desire in rats. Neuropharmacology, 61(1-2), 268-275. https://doi.org/10.1016/j.neuropharm.2011.04.012

      Brudzynski, S. M. (2015). Pharmacology of Ultrasonic Vocalizations in adult Rats: Significance, Call Classification and Neural Substrate. Curr Neuropharmacol, 13(2), 180-192. https://doi.org/10.2174/1570159x13999150210141444

      Brudzynski, S. M., & Bihari, F. (1990). Ultrasonic vocalization in rats produced by cholinergic stimulation of the brain. Neurosci Lett, 109(1-2), 222-226. https://doi.org/10.1016/0304-3940(90)90567-s

      Brudzynski, S. M., Bihari, F., Ociepa, D., & Fu, X. W. (1993). Analysis of 22 kHz ultrasonic vocalization in laboratory rats: long and short calls. Physiol Behav, 54(2), 215-221. https://doi.org/10.1016/0031-9384(93)90102-l

      Hinchcliffe, J. K., Jackson, M. G., & Robinson, E. S. (2022). The use of ball pits and playpens in laboratory Lister Hooded male rats induces ultrasonic vocalisations indicating a more positive affective state and can reduce the welfare impacts of aversive procedures. Lab Anim, 56(4), 370-379. https://doi.org/10.1177/00236772211065920

      Matochik, J. A., White, N. R., & Barfield, R. J. (1992). Variations in scent marking and ultrasonic vocalizations by Long-Evans rats across the estrous cycle. Physiol Behav, 51(4), 783-786. https://doi.org/10.1016/0031-9384(92)90116-j

      Olszyński, K. H., Polowy, R., Małż, M., Boguszewski, P. M., & Filipkowski, R. K. (2020). Playback of Alarm and Appetitive Calls Differentially Impacts Vocal, Heart-Rate, and Motor Response in Rats. iScience, 23(10), 101577. https://doi.org/10.1016/j.isci.2020.101577

      Olszyński, K. H., Polowy, R., Wardak, A. D., Grymanowska, A. W., & Filipkowski, R. K. (2021). Increased Vocalization of Rats in Response to Ultrasonic Playback as a Sign of Hypervigilance Following Fear Conditioning. Brain Sci, 11(8). https://doi.org/10.3390/brainsci11080970

      Olszyński, K. H., Polowy, R., Wardak, A. D., Grymanowska, A. W., Zieliński, J., & Filipkowski, R. K. (2022). Spontaneously hypertensive rats manifest deficits in emotional response to 22-kHz and 50-kHz ultrasonic playback. Prog Neuropsychopharmacol Biol Psychiatry, 120, 110615. https://doi.org/10.1016/j.pnpbp.2022.110615

      Saito, Y., Tachibana, R. O., & Okanoya, K. (2019). Acoustical cues for perception of emotional vocalizations in rats. Scientific Reports, 9(1), 10539.

      Sales, G. D. (1979). Strain Differences in the Ultrasonic Behavior of Rats (Rattus norvegicus). Am Zool, 19(2), 513-527. https://www.jstor.org/stable/3882331

      Shimoju, R., Shibata, H., Hori, M., & Kurosawa, M. (2020). Stroking stimulation of the skin elicits 50-kHz ultrasonic vocalizations in young adult rats. J Physiol Sci, 70(1), 41. https://doi.org/10.1186/s12576-020-00770-1

      Silkstone, M., & Brudzynski, S. M. (2019a). The antagonistic relationship between aversive and appetitive emotional states in rats as studied by pharmacologically-induced ultrasonic vocalization from the nucleus accumbens and lateral septum. Pharmacology Biochemistry and Behavior, 181, 77-85. https://doi.org/10.1016/j.pbb.2019.04.009

      Silkstone, M., & Brudzynski, S. M. (2019b). Intracerebral injection of R-(-)-Apomorphine into the nucleus accumbens decreased carbachol-induced 22-kHz ultrasonic vocalizations in rats. Behavioural Brain Research, 364, 264-273. https://doi.org/10.1016/j.bbr.2019.01.044

      Willey, A. R., & Spear, L. P. (2013). The effects of pre-test social deprivation on a natural reward incentive test and concomitant 50 kHz ultrasonic vocalization production in adolescent and adult male Sprague-Dawley rats. Behav Brain Res, 245, 107-112. https://doi.org/10.1016/j.bbr.2013.02.020

      Wöhr, M., Borta, A., & Schwarting, R. K. (2005). Overt behavior and ultrasonic vocalization in a fear conditioning paradigm: a dose-response study in the rat. Neurobiol Learn Mem, 84(3), 228-240. https://doi.org/10.1016/j.nlm.2005.07.004

      Recommendations For The Authors:

      Reviewer #1 (Recommendations For The Authors):

      Additional considerations:

      The discussion of the "perfect fifth" and the proposition that this observation could be evidence of an evolutionary mechanism underlying it is rather far-fetched, especially for being presented in the Results section (with no supporting non-anecdotal evidence).

      Answer: We agree with Reviewer #1. The text was modified, and the word “evolutionary” was deleted. Instead, we expanded on the possible reason for the prevalence of the perfect fifth in the current version of the manuscript; we added that the prevalence of the perfect fifth: “could be explained by the observation that all physical objects capable of producing tonal sounds generate harmonic vibrations, the most prominent being the octave, perfect fifth, and major third (Christensen, 1993, discussed in Bowling and Purves, 2015).”
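
      For readers less familiar with the musical terms in the quoted sentence, the following minimal sketch converts the three named intervals into frequency ratios and semitone sizes. The ratios used (2:1, 3:2, 5:4) are the standard just-intonation values between adjacent low harmonics; they are an assumption brought in for illustration and are not taken from the manuscript, and the function name is hypothetical.

```python
import math


def semitones(ratio):
    """Size of a frequency ratio in equal-tempered semitones (12 per octave)."""
    return 12.0 * math.log2(ratio)


# Standard just-intonation ratios (assumed here, not taken from the manuscript).
intervals = {
    "octave": 2 / 1,         # 2nd harmonic relative to the 1st
    "perfect fifth": 3 / 2,  # 3rd harmonic relative to the 2nd
    "major third": 5 / 4,    # 5th harmonic relative to the 4th
}

for name, ratio in intervals.items():
    print(f"{name}: ratio {ratio:.3f}, {semitones(ratio):.2f} semitones")
```

      Under these assumptions, two tones form a perfect fifth when the higher frequency is about 1.5 times the lower one, i.e. roughly seven semitones apart.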

      It is not clear why Sprague-Dawleys were used as "receivers" in the playback experiment, when presumably the calls were recorded from Wistars and SHRs. While this does not critically impact the conclusions, within the species rats should be able to respond appropriately to calls made by rats of different genetic backgrounds, it adds an unnecessary source of variance.

      Answer: Sprague-Dawley rats were used to test another normotensive strain of rats. Regarding the Reviewer’s main point, we beg to differ, as we think that it is worth testing playback stimuli in different strains. Using different stimuli for different rat strains would add unnecessary variance, and it seemed logical to use the same recordings to test effects in different strains. Please note that, in spite of this additional variance, the results of both playback experiments are, in general, similar, which may point to a universal effect of 44-kHz playback across rat strains.

      It is pertinent to note that for the trace fear conditioning experiment, the rats had previously been exposed to a vocalization playback experiment. While such a pre-exposure is unlikely to be a very strong stressor, the possibility for it to influence the vocal behaviors of these rats in later experiments cannot be ruled out. It is also not clear what the control rats in this experiment experienced (home cage only?), nor what they were used for in analyses.

      Answer: In the current version of the manuscript, we have described in greater detail all the experiments performed and analyzed. We would like to emphasize that both delay and trace fear conditioning experiments with radiotelemetric transmitters were not performed specifically to elicit any particular response during fear conditioning; rather, our observation of 44-kHz vocalizations emerged as a result of re-examining the audio recordings. As a result, this work summarizes our observations of 44-kHz calls from several different experiments. It is relevant to note that 44-kHz vocalizations were observed “in rats which were exposed to vocalization playback experiment”, in rats before the playback experiments, as well as in naïve rats, without transmitters implanted, trained in fear conditioning (Tab. 1/Exp. 1-3).

      Our main message is that 44-kHz vocalizations were present in several experiments, with different conditions and subjects, while we are not attempting to compare in detail the results across the different experiments. In other words, we agree that pre-exposure to playback (and even more likely, transmitter implantation) could influence, but is not necessary for, 44-kHz ultrasonic emissions by the rats. To demonstrate this, we added a prolonged fear conditioning group with naïve Wistar rats (Exp. 3) to verify the emission of 44-kHz calls in the absence of those experimental factors.

      We modified the methods section to clarify the circumstances under which these discoveries were made, such as including the information regarding the control rats in trace fear conditioning. In particular we mention that: “Control rats were subjected to the exact same procedures but did not receive the electric shock at the end of trace periods”.

      For Figure 1A-E, only example call distributions from individual rats are shown. It would perhaps be more informative to see the full data set displayed in this manner, with color/shape codes distinguishing individuals if desired.

      Answer: Please note that Fig. 1S1 shows more examples of ultrasonic call distribution. Showing all the data would make the figure more difficult to read and interpret. The problem is partly addressed in Fig. 3A.

      It is not clear what is presented in Figure 2D vs. E, i.e. panel D is shown only for "selected rats" but the legend does not clarify how and why these rats were selected. It is also not clear why the legend reports p-values for both Friedman and Wilcoxon tests; the latter is appropriate for paired data which seems to be the case when the question is whether the call peak frequency alters across time, but the Friedman assumes non-paired input data.

      Answer: The question refers to the current Fig. 1S2C panel (former Fig. 2E panel) and the former Fig. 2D panel. The latter was not included in the current version of the manuscript, since both reviewers opposed the presentation of “selected rats” only (see above). The full description of the Fig. 1S2C panel is now in the results section together with p-values for the Friedman and Wilcoxon tests. We used the latter to investigate the difference between the first and the last ITI (selected paired data), and the Friedman test to investigate the presence of change across the series of ten ITI, since it is a suitable test for a difference between two or more paired samples.
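
      To illustrate why the Friedman test suits repeated (paired) measurements, the following minimal sketch computes the Friedman chi-square statistic by ranking values within each subject. The peak-frequency values and all function names are hypothetical and are not taken from the manuscript; no tie correction is applied and no p-value is computed.

```python
def average_ranks(values):
    """Rank values from 1 to k, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def friedman_statistic(table):
    """Friedman chi-square statistic for a subjects x conditions table."""
    n = len(table)      # number of subjects (rows)
    k = len(table[0])   # number of repeated conditions (columns)
    rank_sums = [0.0] * k
    for row in table:
        # Ranking happens within each subject, which is what makes the test paired.
        for col, rank in enumerate(average_ranks(row)):
            rank_sums[col] += rank
    sum_sq = sum(r * r for r in rank_sums)
    return 12.0 / (n * k * (k + 1)) * sum_sq - 3.0 * n * (k + 1)


# Hypothetical peak frequencies (kHz): rows are rats, columns are successive ITI.
peak_khz = [
    [42.1, 43.0, 44.2],
    [41.5, 42.8, 43.9],
    [40.9, 42.2, 44.0],
]
q = friedman_statistic(peak_khz)
print(f"Friedman statistic: {q:.2f}")
```

      Because every hypothetical rat shows the same ordering across ITI, the statistic reaches its maximum for this table size, n(k - 1) = 6. In practice the statistic is compared against a chi-square distribution with k - 1 degrees of freedom.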

      Reviewer #2 (Recommendations For The Authors):

      The weaknesses listed in the public review need to be addressed.

      Answer: We have done our best to address the weaknesses.

      Notes: 1) Page and line numbers would have been useful.

      Answer: We are including a separate manuscript version with page and line numbers.

      (2) English language needs to be improved.

      Answer: The text has been checked by two native English speakers (one with a scientific background). Both only identified minor changes to improve the text which we applied.

      (3) I am a bit unsure whether the comment about the Star Wars movie (1997) and the Game of Thrones series (2011) is supposed to be a joke.

      Answer: These are indeed two genuine examples of the perfect fifth in human music that we hope are easily recognizable and familiar to readers. Parts of the same examples of the perfect fifth can also be heard in the rat voice files provided.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      During the last decades, extensive studies (mostly neglected by the authors), using in vitro and in vivo models, have elucidated the five-step mechanism of intoxication of botulinum neurotoxins (BoNTs). The binding domain (H chain) of all serotypes of BoNTs binds polysialogangliosides and the luminal domain of a synaptic vesicle protein (which varies among serotypes). When bound to the synaptic membrane of neurons, BoNTs are rapidly internalized by synaptic vesicles (SVs) via endocytosis. Subsequently, the catalytic domain (L chain) translocates, a process triggered by the acidification of these organelles. Following translocation, the disulfide bridge connecting the H chain with the L chain is reduced by the thioredoxin reductase/thioredoxin system, and it is refolded by the chaperone Hsp90 on SV's surface. Once released into the cytosol, the L chains of different serotypes cleave distinct peptide bonds of specific SNARE proteins, thereby disrupting neurotransmission. In this study, Yeo et al. extensively revise the neuronal intoxication model, suggesting that BoNT/A follows a more complex intracellular route than previously thought. The authors propose that upon internalization, BoNT/A-containing endosomes are retro-axonally trafficked to the soma. At the level of the neuronal soma, this serotype then traffics to the endoplasmic reticulum (ER) via the Golgi apparatus. The ER SEC61 translocon complex facilitates the translocation of BoNT/A's LC from the ER lumen into the cytosol, where the thioredoxin reductase/thioredoxin system and HSP complexes release and refold the catalytic L chain. Subsequently, the L chain diffuses and cleaves SNAP25 first in the soma before reaching neurites and synapses.

      Strengths:

      I appreciate the authors' efforts to confirm that the newly established methods somehow recapitulate aspects of the BoNTs mechanism of action, such as toxin binding and uptake occurring at the level of active synapses. Furthermore, even though I consider the SNAPR approach inadequate, the genome-wide RNAi screen has been well executed and thoroughly analyzed. It includes well-established positive and negative controls, making it a comprehensive resource not only for scientists working in the field of botulinum neurotoxins but also for cell biologists studying endocytosis more broadly.

      Weaknesses:

      I have several concerns about the authors' main conclusions, primarily due to the lack of essential controls and validation for the newly developed methods used to assess toxin cleavage and trafficking into neurons. Furthermore, there is a significant discrepancy between the proposed intoxication model and existing studies conducted in more physiological settings. In my opinion, the authors have omitted over 20 years of work done in several labs worldwide (Montecucco, Montal, Schiavo, Rummel, Binz, etc.). I want to emphasize that I support changes in biological dogma only when these changes are supported by compelling experimental evidence, which I could not find in the present manuscript.

      We thank the reviewer for his reading and comments and for pointing out the discrepancy between our proposed model and the existing model. However, we respectfully disagree with the phrase of “extensive studies have elucidated the five-steps mechanism of intoxication…”. This sentence and the following imply that the model is well-established and demonstrated. It also highlights how the reviewer is convinced about this previous model.

      We contest this model for theoretical reasons and contest the strength of the evidence that supports it. We previously included references to previous work showing that the model is also being challenged by others. In light of the reviewer’s comments, we included more references in the introduction and we also make explicit our main theoretical concern in the introduction:

      “Arguably, the main problem of the model is its failure to propose a thermodynamically consistent explanation for the directional translocation of a polypeptide chain across a biological membrane. Other known instances of polypeptide membrane translocation such as the co-translational translocation into the ER indicate that it is an unfavorable process, which consumes significant energy (Alder and Theg 2003).”

      We also added the following text in the Discussion to address the reviewer’s concerns: “Our study contradicts the long-established model of BoNT intoxication, which is described in several reviews specifically dedicated to the subject 1–4. In short, these reviews support the notion that BoNTs are molecular machines able to mediate their own translocation across membranes; this notion has convinced some cell biologists interested in toxins and retrograde traffic, who describe the BoNT mode of translocation in their reviews 5,6.

      But is this notion well supported by data? A careful examination of the primary literature reveals that early studies indeed report that BoNTs form ion channels at low pH values 7,8. These studies have been extended by the use of patch-clamp 9,10. These works and others have led to various suppositions on how the toxin forms a channel and translocates the LC 1,11.

      However, only a single study claims to reconstitute in vitro the translocation of BoNT LC across membranes 12. In this paper, the authors report using a system of artificial membranes separating two aqueous compartments. They load the toxin in the cis compartment and measure the protease activity in the trans compartment after incubation. However, when the experimental conditions described are actually converted in terms of molarity, it appears that the cis compartment was loaded at 10e-8 M BoNT and that the reported translocated protease activity is equivalent to 10e-17 M (Figure 3D, 12). Thus, in this experiment, about 1 LC molecule in 100 million has crossed the membrane. Such an extremely low transfer rate does not tally with the extreme efficiency of intoxication in vivo, even while taking into account the difference between artificial and biological membranes.

      In sum, a careful analysis of the primary literature indicates that while there is ample evidence that BoNTs have the ability to affect membranes and possibly create ion channels, there is actually no credible evidence that these channels mediate translocation of the LC. As mentioned earlier, it is not clear how such a self-translocation mechanism would function thermodynamically. By contrast, our model proposes a mechanism without a thermodynamic problem, is consistent with current knowledge about other protein toxins, such as PE, Shiga and Ricin, and can help explain previously puzzling features of BoNT effects. It is worth noting that a similar self-translocation model was proposed for other protein toxins such as Pseudomonas exotoxin, which has a molecular organisation similar to that of BoNT (68). However, it has since been demonstrated that the PE toxins require cellular machinery, in particular in the ER, for intoxication (21,69,70).”

      Reviewer #2 (Public Review):

      Summary:

      The study by Yeo and co-authors addresses a long-lasting issue about botulinum neurotoxin (BoNT) intoxication. The current view is that the toxin binds to its receptors at the axon terminus by its HCc domain and is internalized in recycled neuromediator vesicles just after the release of the neuromediators. Then, the HCn domain assists the translocation of the catalytic light chain (LC) of the toxin through the membrane of these endocytic vesicles into the cytosol of the axon terminus. There, the LC cleaves its SNARE substrate and blocks neurosecretion. However, other views involving kinetic aspects of intoxication suggest that the toxin follows the retrograde axonal transport up to the nerve cell body and then back to the nerve terminus before cleaving its substrate.

      In the current study, the authors claim that the BoNT/A (isotype A of BoNT) not only progresses to the cell body but once there, follows the retrograde transport trafficking pathway in a retromer-dependent fashion, through the Golgi apparatus, until reaching the endoplasmic reticulum. Next, the LC dissociates from the HC (a process not studied here) and uses the translocon Sec61 machinery to retro-translocate into the cytosol. Only then, does the LC traffic back to the nerve terminus following the anterograde axonal transport. Once there, LC cleaves its SNARE substrate (SNAP25 in the case of BoNT/A) and blocks neurosecretion.

      To reach their conclusion, Yeo and co-authors use a combination of engineered tools: a cell line able to differentiate into neurons (ReNcell VM), a reporter dual fluorescent protein derived from SNAP25, the substrate of BoNT/A (called SNAPR), the use of either native BoNT/A or a toxin to which three copies of fragment 11 of the reporter fluorescent protein Neon Green (mNG) are fused to the N-terminus of the LC (BoNT/A-mNG11x3), and finally ReNcell VM transfected with mNG1-10 (a protein consisting of the first 10 beta strands of the mNG).

      SNAPR is stably expressed all over in the ReNcell VM. SNAPR is yellow (red and green) when intact and becomes red only when cleaved by BoNT/A LC, the green tip being degraded by the cell. When the LC of BoNT/A-mNG11x3 reaches the cytosol in ReNcell VM transfected by mNG1-10, the complete mNG is reconstituted and emits a green fluorescence.

      In the first experiment, the authors show that the catalytic activity of the LC appears first in the cell body of neurons where SNAPR is cleaved first. This phenomenon starts 24 hours after intoxication and progresses along the axon towards the nerve terminus during an additional 24 hours. In a second experiment, the authors intoxicate the ReNcell VM transfected by mNG1-10 using the BoNT/A-mNG11x3. The fluorescence appears also first in the soma of neurons, then diffuses in the neurites in 48 hours. The conclusion of these two experiments is that translocation occurs first in the cell body and that the LC diffuses in the cytosol of the axon in an anterograde fashion.

      In the second part of the study, the authors perform a siRNA screen to identify regulators of BoNT/A intoxication. Their aim is to identify genes involved in intracellular trafficking of the toxin and translocation of the LC. Interestingly, they found positive and negative regulators of intoxication. Regulators could be regrouped according to the sequential events of intoxication.

      Genes affecting binding to the cell-surface receptor (SV2) and internalization. Genes involved in intracellular trafficking. Genes involved in translocation such as reduction of the disulfide bond linking the LC to the HC and refolding in the cytosol. Genes involved in signaling such as tyrosine kinases and phosphatases. All these groups of genes may be consistent with the current view of BoNT intoxication within the nerve terminus. However, two sets of genes were particularly significant to reach the main conclusion of the work and definitely constitute an original finding important to the field. One set of genes consists of those of the retromer, and the other relates to the Sec61 translocon. This should indicate that once endocytosed, the BoNT traffics from the endosomes to the Golgi apparatus, and then to the ER. Ultimately, the LC should translocate from the ER lumen to the cytosol using the Sec61 translocon. The authors further control that the SV2 receptor for the BoNT/A traffics along the axon in a retromer-dependent fashion and that BoNT/A-mNG11x3 traverses the Golgi apparatus by fusing the mNG1-10 to a Golgi resident protein.

      Strengths:

      The findings in this work are convincing. The experiments are carefully done and are properly controlled. In the first part of the study, the activity of the LC is monitored together with the physical presence of the toxin. In the second part of the work, the most relevant genes that came out of the siRNA screen are checked individually in the ReNcell VM / BoNT/A reporter system to confirm their role in BoNT/A trafficking and retro-translocation.

      These findings are important to the fields of toxinology and medical treatment of neuromuscular diseases by BoNTs. They may explain some aspects of intoxication such as slow symptom onset, aggravation, and appearance of central effects.

      Weaknesses:

      The findings antagonize the current view of the intoxication pathway that is sustained by a vast amount of observations. The findings are certainly valid, but their generalization as the sole mechanism of BoNT intoxication should be tempered. These observations are restricted to one particular neuronal model and engineered protein tools. Other models such as isolated nerve/muscle preparations display nerve terminus paralysis within minutes rather than days. Also, the tetanus neurotoxin (TeNT), whose mechanism of action involving axonal transport to the posterior ganglia in the spinal cord is well described, takes between 5 and 15 days. It is thus possible that different intoxication mechanisms co-exist for BoNTs or even vary depending on the type of neurons.

      Although the siRNA experiments are convincing, it would be nice to reach the same observations with drugs affecting the endocytic to Golgi to ER transport (such as Retro-2, golgicide or brefeldin A) and the Sec61 retrotranslocation (such as mycolactone). Then, it would be nice to check other neuronal systems for the same observations.

      We thank the reviewer for the careful reading of, and comments on, our manuscript. The reference to “a vast amount of observation” is an argument similar to that of Reviewer 1 and is used to suggest that our study may not be applicable as a general mechanism.

      We respectfully disagree as described above and posit on the contrary that the model we propose is much more likely to be general than the model presented in current reviews for the several reasons cited (see added text in Introduction and Discussion). While we agree that more work is needed to confirm the proposed mechanisms of BoNT translocation in other models, these experiments fall outside the scope of our study.

      The fact that BoNT activity in nerve/muscle preparations has relatively fast kinetics does not contradict our model. Our model reveals primarily the requirement for trafficking to the ER membranes. This ER targeting requires trafficking through the Golgi complex, in turn explaining the requirement for trafficking to the soma of neurons in the experimental system we used. However, in neuronal cells in vivo, Golgi bodies can be found along the length of the axon; thus BoNT may not always require trafficking to the soma of the affected cells. The time required for intoxication could thus vary greatly depending on the neuronal structural organisation.

      TeNT is proposed to transfer from excitatory neurons into inhibitory neurons before exerting its action. While the details of this fascinating mechanism remain to be explored, they clearly fall beyond the purview of this manuscript.

      Regarding the use of drugs, we agree that it would be a nice addition; unfortunately we are unable to perform such experiments at this stage. Setting up a large-scale siRNA screen for the BoNT mechanism of action is challenging as it requires a special facility with controlled access and police authorisation (in Singapore) given the high toxicity of this molecule. Unfortunately, the authorisations have now lapsed.

      Reviewer #3 (Public Review):

      Summary:

      The manuscript by Yeo et al. investigates the intracellular trafficking of Botulinum neurotoxin A (BoNT/A), a potent toxin used in clinical and cosmetic applications. Contrary to the prevailing understanding of BoNT/A translocation into the cytosol, the study suggests a retrograde migration from the synapse to the soma-localized Golgi in neurons. Using a genome-wide siRNA screen in genetically engineered neurons, the researchers identified over three hundred genes involved in this process. The study employs organelle-specific split-mNG complementation, revealing that BoNT/A traffics through the Golgi in a retromer-dependent manner before moving to the endoplasmic reticulum (ER). The Sec61 complex is implicated in the retro-translocation of BoNT/A from the ER to the cytosol. Overall, the research challenges the conventional model of BoNT/A translocation, uncovering a complex route from synapse to cytosol for efficient intoxication. The findings are based on a comprehensive approach, including the introduction of a fluorescent reporter for BoNT/A catalytic activity and genetic manipulations in neuronal cell lines. The conclusions highlight the importance of retrograde trafficking and the involvement of specific genes and cellular processes in BoNT/A intoxication.

      Strengths:

      The major part of the experiments are convincing. They are well-controlled and the interpretation of their results is balanced and sensitive.

      Weaknesses:

      In my opinion, the main weakness of the paper is in the interpretation of the data equating loss of tGFP signal (when using the Red SNAPR assay) with proteolytic cleavage by the toxin. Indeed, the first step for loss of tGFP signal by degradation of the cleaved part is the actual cleavage. However, the cleaved part then needs to be degraded (by the proteasome, I presume), a process that could in principle be affected (in speed or extent) by the toxin.

      We thank the reviewer for his comments and careful reading of our manuscript.

      Regarding the read-out of the assay, we agree that the assay could be sensitive to alteration in the protein degradation pathway. We have added the following sentence in the Discussion to take it into account:

      “As noted by one reviewer, the assay may be sensitive to perturbation in the general rate of protein degradation, a consideration to keep in mind when evaluating the results of large scale screens.”

      While this may be valid for some hits in the general list, it is important to note that the main hits have been shown to affect toxin trafficking by an independent, orthogonal assay based on the split GFP reconstitution.

      Recommendations to authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) To assess the activity of BoNT/A in neurons, Yeo et al. have generated a neuronal stem line referred to as SNAPR. This cell line stably expresses a chimeric reporter protein that consists of SNAP25 flanked at its N-terminus with a tagRFPT and at its C-terminus with a tagGFP. After exposure to BoNT/A, SNAP25 is cleaved and, the C-terminal tGFP-containing moiety is rapidly degraded. I have many doubts about the validity of the described method. Indeed, BoNT/A activity is analysed in an indirect way by quantifying the degradation of the GFP moiety generated after toxin cleavage (Fig. 2). In this regard, the authors should consider that their approach is dependent, not only on the toxin's metalloprotease activity but also on the functionality of the proteasome in neurons. Therefore, considering the current dataset, it is impossible to rule out the possibility that the progression of GFP signal loss from the soma to the neurite terminals may be attributed to the different proteasome activity in these compartments. Is it conceivable that the GFP fragment generated upon toxin cleavage degrades more rapidly in the soma in comparison to axonal terminals? This alternative explanation could challenge the conclusion drawn in Fig. 2.

      The reviewer’s alternative explanation disregards the experiments performed with the split-GFP complementation approach, which indicate translocation in the soma first. The split GFP reporter is not dependent on the proteasome activity. It also disregards the genetic data implicating many genes involved in membrane retrograde traffic, which are also not consistent with the hypothesis of the reviewer. These gene depletions not only affect SNAPR degradation but also BoNT/A-mNG11 trafficking: thus, their effect cannot be attributed to a completely hypothetical, spatially heterogeneous distribution of the proteasome.

      For this reason, I strongly suggest using a more physiological approach that does not depend on proteasomal degradation or on the expression of the sensor in neurons. The authors should consider performing a time course experiment following intoxication and staining BoNT/A-cleaved SNAP25 by using specific antibodies (see Antonucci F. et al., Journal of Neuroscience, 2008 or Rheaume C. et al., Toxins 2015).

      For the above reason, we do not agree with the pressing importance of confirming by a third method using specific antibodies, especially considering that BoNT is very difficult to detect in cells when incubated at physiological levels. Incidentally, the cited paper by Antonucci F. et al. documents long-distance retrograde traffic of BoNT/A, which is in line with our data.

      An alternative approach could involve the use of microfluidic devices that physically separate axons from cell bodies. Such a separation will allow us to test the authors' primary conclusion that SNAP25 is initially cleaved in the soma. The suggested experiments will also rule out potential overexpression artifacts that could influence the authors' conclusions when using the newly developed SNAPR approach. Without these additional experiments, the authors' main conclusion that SNAP25 is cleaved first in the neuronal soma rather than at the nerve terminal is inadequate.

      As discussed above, we disagree with the doubts raised by the reviewer: we present three types of evidence (SNAPR, split GFP and genetic hits) and they all point in the same direction. Thus, we respectfully doubt that a fourth approach would convince this reviewer. Of note, we have attempted to use microfluidic devices as suggested by the reviewer; however, the ReNcell VM neurons were not able to extend axons long enough across the device.

      (2) To detect BoNT/A translocation into the cytosol, the authors have used a complementation assay by intoxicating ReNcell VM cell expressing a cytosolic HA-tagged split monomeric NeonGreen (Cyt-mNG1-10) with an engineered BoNT/A, where the catalytic domain (LC) was fused to mNG1-11. When drawing conclusions regarding the detection of cytosolic LC in the neuronal soma, the authors should highlight the limitations of this assay and explicitly describe them to the readers. Firstly, the authors need to investigate whether the addition of mNG1-11 to the LC affects the translocation process itself (by comparing with a WT, not tagged, LC).

      Additionally, from the data shown in Fig. 2C, it is evident that the Cyt-mNG1-10 is predominantly expressed in the cytosol and less detected in neurites. This raises the question of whether there might be a bias for the cell soma in this assay. To address this important concern, I suggest quantifying MFI per cell (Fig. 2D) taking into consideration the amount of HA-tagged Cyt-mNG1-10. Furthermore, I strongly suggest targeting mNG1-10 to synapses and performing a similar time course experiment to observe when LC translocation occurs at nerve terminals. Alternative experiments, to prove that BoNT/A requires retrograde trafficking before it can translocate, may be done to repeat the experiments shown in Fig. 2D in the presence of inhibitors (or by KD some of the hits identified as microtubule stabilizers) that should interfere with BoNT/A trafficking to the neuronal somata. Without these additional experiments, the authors' main conclusion that the BoNT/A catalytic domain is first detected in the neuronal soma rather than at the nerve terminal is very preliminary.

      As with the SNAPR assay, the reviewer is raising the level of doubt to very high levels. We respect his thoroughness and eagerness to question the new model. However, we note that a similar level of scrutiny is not applied to the prevalent competing model. Indeed, the data supporting the self-translocation model is based on a single in vitro experiment published in one panel, as we have explained in the Discussion (see above).

      (3) In the genome-wide RNAi screening, rather than solely assessing SV2 surface levels, it would have been beneficial to directly investigate BoNT/A binding to the neuronal membrane. For instance, this could have been achieved by using a GFP-tagged HC domain of BoNT/A. At present, the authors cannot exclude the possibility that among the 135 hits that did not affect SV2 levels, some might still inhibit BoNT/A binding to the neuronal surface. These concerns, already exemplified by B4CALT4 (which is known to be involved in the synthesis of GT1b), should be explicitly addressed in the main text.

      We agree with the reviewer that perturbation of binding of BoNT is possible. We added the following text:

      “Network analysis reveals regulators of signaling, membrane trafficking and thioreductase redox state involved in BoNT/A intoxication

      Among the positive regulators of the screen, 135 hits did not influence significantly surface SV2 levels and are thus likely to function in post-endocytic processes (Supplementary Table 2). However, we cannot formally exclude that they could affect binding of BoNT to the cell surface independently of SV2.”

      (4) The authors should clearly state which reagents they have tried to use in order to explain the challenges they faced when directly testing the trafficking of BoNT/A. The accumulation of Dendra-SV2 bulbous structures at the neurite tips in VPS35-depleted cells could be interpreted as a sign of neuronal stress/death. Have the authors investigated other proteins that do not undergo retro-axonal trafficking in a retromer-dependent manner? This control is essential. In this regard, the use of a GFP-tagged HC domain of BoNT/A could prove to be quite helpful.

      We tried multiple commercially available antibodies against BoNT but we could not get a very good signal. The postdoc in charge of this project has now gone to greener pastures and we are not in a position to provide the details corresponding to these antibodies. We did not observe significant cell death after VPS35 knockdown at the time of the experiment; however, longer-term treatment might indeed result in toxicity.

      (5) Considering my concerns related to the SNAPR system and the complementation assay to study SNAP25 cleavage and BoNT/A trafficking, I suggest validating some of their major hits (ex. VPS34 and Sec61) by performing WB or IF analysis to examine the cleavage of endogenous SNAP25. Furthermore, the authors should test VPS35 depletion in the context of the experiments performed in Fig. 6G-H, by validating that this protein is essential for BoNT/A retrograde trafficking.

      The reviewer’s concerns are well noted but, as discussed above, the two systems we used are completely orthogonal. Thus, for the reviewer’s concerns to be valid, it would have to be two completely independent artefacts giving rise to the same result. The alternative explanation is that BoNT/A translocates in the soma. Ockham’s razor dictates that the simplest explanation is the likeliest.

      (6) The introduction and the discussion section of this paper completely disregard more than 20 years of research conducted by several labs worldwide (Montecucco, Montal, Schiavo, Rummel, Binz, etc). The authors should make an effort to contextualize their data within the framework of these studies and address the significant discrepancies between their proposed intoxication model and existing research that clearly demonstrates BoNTs translocating upon the endocytic retrieval of SVs at presynaptic sites. Nevertheless, even assuming that the model proposed by the authors is accurate, numerous questions emerge. One such question is: How can the authors explain the exceptional toxicity of botulinum neurotoxin in an ex vivo neuromuscular junction preparation devoid of neuronal cell bodies (see Cesare Montecucco and Andreas Rummel's seminal studies)?

      Please see above in the answer to public reviews.

      (7) Scale bars should be added to all representative pictures.

      This has been done. Thank you for the thorough reading of our manuscript.

      Reviewer #2 (Recommendations For The Authors):

      (1) The title overstates the results. It may be indicated "in differentiated ReNcell VM".

      Title changed to: “Botulinum toxin intoxication requires retrograde transport and membrane translocation at the ER in RenVM neurons”

      (2) In the provided manuscript there are two Figure 2 and no Figure 3. This made the reading and understanding extremely difficult and should be corrected. As a result, the Figure legends do not fit the numbering. There are also discrepancies between some Figure panels (A, B, C, etc), the text, and the Legends. All this needs to be carefully checked.

      We apologize for the confusion, as the manuscript has gone through multiple rounds of revisions. We have carefully verified labels and legends.

      (3) The BoNT/A-mNG11x3 may introduce some bias that could be discussed. Would these additional peptides block LC translocation from synaptic vesicles in the nerve termini? In addition, the mNG peptides that are unfolded before complementation may direct LC towards Sec61. These aspects should be discussed.

      The comment would be valid if BoNT/A-mNG11x3 were the only approach used in the paper; however, the SNAPR reporter is used with native BoNT and shows data consistent with the split GFP approach.

      (4) In the Figure about SV2 (Fig 3 or 4): The authors did not locate SV2. The cells seem not to have the same differentiated phenotype as in Figure 1 and Figure 2/3A.

      We apologized above for the mislabeling. It is not clear what the question is here.

      (5) The authors should check whether BoNT/A wt cleaves the endogenous SNAP25 by western blot for instance in the original ReNcell VM before SNAPR engineering. This should be compared with wt SNAP25 cleavage by the BoNT/A-LC-mNG.

      It is likely that BoNT/A-LC-mNG11 has similar activity, as it only adds a small peptide at the end of the LC. At any rate, it is not clear why this is so important since both molecules translocate into the cytosol, with the same kinetics and in the same subcellular locale.

      (6) Perhaps I did not understand. How can the authors exclude that what is observed is the kinetic overproduction of the reporter substrate SNAPR?

      The authors could use SLO toxin (PNAS 98, 3185-3190, 2001) to permeabilize the cells all along their body and axon to introduce BoNT/A or LC (wt) and observe synchronized SNAPR cleavage throughout the cells.

      The concept mentioned here is not very clear to us. Is the reviewer proposing that SNAPR is produced much more efficiently at the tips of the neurites, so that its cleavage takes longer to be detected there and is apparent first in the soma? With all due respect, this is a strange hypothesis, at odds with what we know of protein dynamics in neurons (i.e. most proteins are largely made in the soma and transported or diffuse into the neurites).

      Again, the two orthogonal approaches (split GFP and the SNAPR reporter) use different constructs and methods, yet converge on similar results. Perhaps the incredulity of the reviewer might be more productively directed at the current data “demonstrating” the translocation of LC in the synaptic bouton?

      (7) The authors could also use an assay monitoring neurotransmitter release by electrophysiology measurements to check the functional consequences of the kinetic diffusion of LC activity along the axon. Can the authors exclude that some toxin molecules translocate from the endocytic vesicles and block neurotransmission within minutes or a few hours?

      It is well established that inhibition of neurotransmission does not occur within minutes in vivo and in vitro, but rather within hours or even days. This kinetic delay is experienced by many patients and is one of the key arguments against the current model of self-translocation at the synaptic vesicle level.

      Minor remarks

      Thank you for pointing out all these.

      (1) Please check typos. There are many. Check space before the parenthesis, between numbers and h (hours), reference style etc.

      Thank you. We have reviewed the text and tried to eliminate all these instances.

      (2) Line 90: The C of HC should be capitalized.

      Fixed

      (3) Line 107: add space between "neurons(Donato".

      Fixed

      (4) Line 109: space "72 h".

      Fixed

      (5) Line 115: a word is missing ? ...to show retro-axonal... ? Please clarify this sentence.

      Fixed

      (6) Figure 1E: does nm refer to nM (nanomolar)? Please correct. No mention of panel F.

      Fixed

      (7) Line 161: do you mean ~16 µm/h? Please correct.

      Fixed

      (8) Line 168, words are missing.

      Fixed, thank you

      We verified that Cyt-mNG1-10 was expressed using the HA tag; the expression was homogeneously distributed in differentiated neurons and we observed no GFP signal (Figure 2C).

      (9) Line 171: Isn't mNG 11 the eleventh beta strand of the neon green fluorescent protein, not alpha helix? Otherwise, can the authors confirm it acquires the shape of an alpha helix? Same at line 326.

      We have corrected the mistake; thanks for pointing it out.

      (10) Figure 2 is doubled. The legend of Fig 2 refers to Figure 3. There is no legend for Figure 2. Then, some figures are shifted in their numbering.

      Fixed

      (11) The fluorescence in the cell body must appear before the fluorescence in the axon due to higher volume. Please discuss.

      The fluorescence progresses in the neurites extensions in a centripetal fashion. The volume of the neurite near the cell body is not significantly different from the end of the neurite. Thus the fluorescence data is consistent with translocation in soma and not with an effect due to higher volume in the soma.

      (12) Figure 2D, right: the term intoxication is improper for this experiment. Rather, it is the presence of the BoNT/A-mNG11 that is detected. I believe the authors should be particularly careful about the use of terms: intoxication means blockade of neurosecretion, SNAPR cleavage means activity etc.

      While the reviewer is correct that it is the presence of BoNT/A-mNG11 that is detected, it remains that it is an active toxin, so the neurons are effectively intoxicated; as they are when we use the wild type toxin. We do not imply that we are measuring intoxication, but simply that the neurons are put into contact with a toxin.

      (13) Line 196: Should we read TXNRD1 is required for BoNT/A LC translocation? TXNRD1 in the current model of translocation is located in the cytoplasm and is supposed to play a role in the cleavage of the disulfide bond linking LC to HC. In the model proposed by this study, LC is translocated through the Sec61 translocon. In this case, I would assume that the protein disulfide isomerase (PDI) in the endoplasmic reticulum would reduce the LC-HC disulfide bond. In that case, TXNRD1 would not be required anymore. Please discuss.

      Why should we assume that a PDI is involved in the reduction of the LC-HC disulfide bond? In our previous studies on A-B toxins (PE and Ricin), different reduction systems seemed to be at play. There is no conceptual imperative to assume reduction in the ER because the Sec61 translocon is implicated. Reduction might occur on the cytosolic side by TXNRD1 or the effect of this reductase could be indirect.

      (14) The legend of Figure 4 (in principle Figure 5?) is not matching with the panels and panel entries are missing (Figure 4F in particular).

      Fixed

      (15) Figure 6 panels E and H, please match colors with legend (grey and another color).

      Not clear

      (16) Please indicate BoNT/A construct concentrations in all Figure legends.

      Done

      (17) Line 416: isn't SV2 also involved in epilepsy?

      Yes it is.

      (18) Line 433: as above, shouldn't the disulfide bond linking LC to HC be cleaved by PDI in the ER in this model (as for other translocating bacterial toxins) rather than by thioredoxin reductases in the cytoplasm? Please discuss.

      See above

      (19) Identification of vATPase in the screen could be consistent with the endocytic vesicle acidification model of translocation.

      Yes

      (20) Did the authors add KCl in screening controls without toxins? This should be detailed in the Materials and Methods. Could there be a KCl effect on the cells? KCl exposure for 48 hours may be highly stressful for cells. The KCl exposure should last only several minutes for toxin entry.

      We did not observe significant cell death with the cell culture conditions used. Cell viability was monitored at multiple stages, for instance by counting nuclei.

      Reviewer #3 (Recommendations For The Authors):

      Main comments: (1) In Figure 1B: could you devise a means to prevent proteasomal degradation of the tGFP cleaved part to assess whether this is formed?

      We have also used a FRET assay after intoxication and obtained similar results.

      (2) Line 152: Where it reads "was not surprising", maybe I missed something, but to me, this is indeed surprising. If the toxin is rapidly internalized and translocated (therefore, it is able to cleave SNAP25), the fact that tGFP requires 48 hours to be degraded seems surprising to me. Or does it mean that the toxin also slows down the degradation of the tGFP fragment? So, how can you differentiate between the effect being on cleavage of the fragment or in tGFP degradation?

      The reviewer is correct: the “not” was a typo introduced during rewriting, and the long delay between adding the toxin and observing cleavage was indeed surprising. Our interpretation is that it is trafficking that takes time; indeed, the kinetics of the split-GFP data indicate that the toxin takes about 48 h to fill up the entire cytosol (Fig. 2D).

      (3) Regarding the effect of Sec61G knockdown, is it possible that the observed effects are indirect and not due to the translocon being directly responsible for translocating the protein?

      As discussed in the last part of the results, Sec61 knock-down blocks intoxication but does not prevent BoNT from reaching the lumen of the ER (Figure 6G,H). Thus, Sec61 “is instrumental to the translocation of BoNT/A LC into the neuronal cytosol at the soma.”

      Minor comments:

      (1) Fig. 3E: in the legend I think one of the NT3+ should be NT3-.

      Yes, thanks for spotting it

      (2) Would you consider adding Figure S4 as a main figure?

      Thanks for the suggestion

      (3) Please, check that all microscopy image panels have scale bars.

      Done

      (4) Figure 6B (bottom panels): why does it seem that there is a lot of mNeonGreen positive signal in regions that are not positive for HA? Shouldn't complementation keep HA in the complemented protein?

      Our assumption is that there is an excess of receptor protein (HA tag) over reconstituted protein (GFP protein), given the relatively low concentration of toxin being internalized and translocated.

      Refs:

      (1) Pirazzini M, Azarnia Tehran D, Leka O, Zanetti G, Rossetto O, Montecucco C. On the translocation of botulinum and tetanus neurotoxins across the membrane of acidic intracellular compartments. Biochim Biophys Acta. 2016 Mar;1858(3):467–474. PMID: 26307528

      (2) Pirazzini M, Rossetto O, Eleopra R, Montecucco C. Botulinum Neurotoxins: Biology, Pharmacology, and Toxicology. Pharmacol Rev. 2017 Apr;69(2):200–235. PMCID: PMC5394922

      (3) Dong M, Masuyer G, Stenmark P. Botulinum and Tetanus Neurotoxins. Annu Rev Biochem. Annual Reviews; 2019 Jun 20;88(1):811–837.

      (4) Rossetto O, Pirazzini M, Fabris F, Montecucco C. Botulinum Neurotoxins: Mechanism of Action. Handb Exp Pharmacol. 2021;263:35–47. PMCID: 6671090

      (5) Williams JM, Tsai B. Intracellular trafficking of bacterial toxins. Curr Opin Cell Biol. 2016 Aug;41:51–56. PMCID: PMC4983527

      (6) Mesquita FS, van der Goot FG, Sergeeva OA. Mammalian membrane trafficking as seen through the lens of bacterial toxins. Cell Microbiol. 2020 Apr;22(4):e13167. PMCID: PMC7154709

      (7) Hoch DH, Romero-Mira M, Ehrlich BE, Finkelstein A, DasGupta BR, Simpson LL. Channels formed by botulinum, tetanus, and diphtheria toxins in planar lipid bilayers: relevance to translocation of proteins across membranes. Proc Natl Acad Sci U S A. 1985 Mar;82(6):1692–1696. PMCID: PMC397338

      (8) Donovan JJ, Middlebrook JL. Ion-conducting channels produced by botulinum toxin in planar lipid membranes. Biochemistry. 1986 May 20;25(10):2872–2876. PMID: 2424493

      (9) Fischer A, Montal M. Single molecule detection of intermediates during botulinum neurotoxin translocation across membranes. Proc Natl Acad Sci U S A. 2007 Jun 19;104(25):10447–10452. PMCID: PMC1965533

      (10) Fischer A, Nakai Y, Eubanks LM, Clancy CM, Tepp WH, Pellett S, Dickerson TJ, Johnson EA, Janda KD, Montal M. Bimodal modulation of the botulinum neurotoxin protein-conducting channel. Proc Natl Acad Sci U S A. 2009 Feb 3;106(5):1330–1335. PMCID: PMC2635780

      (11) Fischer A, Montal M. Crucial role of the disulfide bridge between botulinum neurotoxin light and heavy chains in protease translocation across membranes. J Biol Chem. 2007 Oct 5;282(40):29604–29611. PMID: 17666397

      (12) Koriazova LK, Montal M. Translocation of botulinum neurotoxin light chain protease through the heavy chain channel. Nature structural biology. 2003. p. 13–18. PMID: 12459720

      (13) Moreau D, Kumar P, Wang SC, Chaumet A, Chew SY, Chevalley H, Bard F. Genome-wide RNAi screens identify genes required for Ricin and PE intoxications. Dev Cell. 2011 Aug 16;21(2):231–244. PMID: 21782526

      (14) Bassik MC, Kampmann M, Lebbink RJ, Wang S, Hein MY, Poser I, Weibezahn J, Horlbeck MA, Chen S, Mann M, Hyman AA, Leproust EM, McManus MT, Weissman JS. A systematic mammalian genetic interaction map reveals pathways underlying ricin susceptibility. Cell. 2013 Feb 14;152(4):909–922. PMCID: PMC3652613

      (15) Tian S, Muneeruddin K, Choi MY, Tao L, Bhuiyan RH, Ohmi Y, Furukawa K, Furukawa K, Boland S, Shaffer SA, Adam RM, Dong M. Genome-wide CRISPR screens for Shiga toxins and ricin reveal Golgi proteins critical for glycosylation. PLoS Biol. 2018 Nov;16(11):e2006951. PMCID: PMC6258472

    2. eLife assessment

      In this valuable manuscript, Yeo et al. describe new methods for assessing the intracellular itinerary of Botulinum neurotoxin A (BoNT/A), a potent toxin used in clinical and cosmetic applications. The current manuscript challenges previously held views on how the catalytic portion of the toxin makes its way from the endocytic compartment to the cytosol to meet its substrates. The approach taken is deemed innovative and the experiments are carefully performed, presenting solid evidence for some of the conclusions drawn; however, the conclusions one may draw from the experimental results are somewhat limited, as the scope of the findings could be restricted to the specific neuron model and molecular tools that were used. This paper could be of interest to both cell biologists and physicians.

    3. Reviewer #1 (Public Review):

      As outlined in my previous public review, Yeo et al. revised the current neuronal intoxication model, common to all serotypes of botulinum neurotoxins. Using a combination of genetic and imaging approaches, they demonstrate that upon internalization, BoNT/A-containing endosomes undergo retro-axonal trafficking to the neuronal soma. Within the soma, this particular serotype then traffics to the endoplasmic reticulum (ER) via the Golgi apparatus. At the ER, the SEC61 translocon complex facilitates the translocation of BoNT/A's metalloprotease domain (light chain, LC) from the ER lumen into the cytosol, where the thioredoxin reductase/thioredoxin system and HSP complexes release and refold the catalytic LC. Subsequently, the LC diffuses and cleaves SNAP25 first in the soma before reaching neurites and synapses.

      Although I still acknowledge the well-executed and thoroughly analyzed genome-wide RNAi screen, I must once again highlight significant pitfalls and weaknesses in the paper due to the lack of essential controls and validations. Consequently, I suggest that readers approach the authors' findings with caution, as they may be limited to the combination of one specific cellular model and genetic engineering tools. During the revision process, the authors declined to conduct additional experiments that could have strengthened their main conclusions. These include, but are not limited to:

      (1) Investigating whether, in the newly generated cell line Red-SNAPR, the GFP fragment produced upon toxin cleavage degrades more rapidly in the soma compared to axon terminals, possibly due to differences in proteasome activity in these two compartments.

      (2) Validating toxin cleavage activity in the soma before reaching synapses by conducting an additional and more physiological approach, a time course experiment using native BoNT/A and staining BoNT/A-cleaved SNAP25 with specific antibodies.

      (3) Assessing whether the addition of mNG11 to the LC affects the translocation process itself and quantifying the mean fluorescence intensity (MFI) per cell, taking into consideration the amount of HA-tagged Cyt-mNG1-10, which appears predominantly expressed in the cytosol and less detected in neurites. This raises the question of potential bias toward the cell soma in this assay.

      (4) Validating major hits (e.g., VPS34 and Sec61) by performing WB or IF analysis to test the cleavage of endogenous SNAP25.

      Additionally, during the revision process, the authors raised concerns about the level of scrutiny applied by this reviewer, particularly in comparison to the seminal study of Lilia K. Koriazova & Mauricio Montal published in Nature Structural Biology (PMID: 12459720). In this 2003 paper, Montal's lab pioneered the use of single-channel recordings and substrate proteolysis analysis to reconstitute the translocation of BoNT/A light chain protease across an artificial lipid bilayer via the channel formed by its heavy chain. The authors highlighted that, when converting the experimental conditions from the aforementioned paper into molarity, it appears that the cis compartment was loaded with 10⁻⁸ M BoNT/A, and the reported translocated protease activity (measured by substrate cleavage) is equivalent to 10⁻¹⁷ M. This implies that only about 1 LC molecule in 100 million has crossed the membrane. The calculation performed by the authors is indeed accurate. However, readers should be informed about another piece of information present in the same paper that might help them to clarify this important point. Koriazova & Montal, in discussing this experiment, pointed out that this value (10⁻¹⁷ M) corresponds to ≈3600 LC molecules, a number close to the maximum number of channels that can be formed under the experimental conditions used. Indeed, from the same paper, quotation: 'This number is in close agreement with the maximum number of channels inserted in the bilayer under the assay condition, ≈2000 (Fig. 3a), as estimated from macroscopic membrane conductance ∼1 × 10⁵ pS and γ = 50 pS measured in 0.1 M KCl'. Another aspect that Yeo et al. forgot to mention in their rebuttal letter is that the system used by Koriazova & Montal lacks any chaperones in the trans compartment. Nowadays, we know that upon translocation, the refolding of the L chain is aided by Hsp90 (Azarnia Tehran et al., Cellular microbiology, 2017). Keeping this in mind, it is not unrealistic to hypothesize that the number of LC molecules calculated more than 22 years ago by Koriazova & Montal (in an indirect way, by checking SNAP25 cleavage using an ELISA-based assay) might be an underestimate. Indeed, the addition of Hsp90 to their system might aid in the refolding of LC molecules that, even if they have been successfully translocated, might not cleave the substrate due to their unfolded state.
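
      The channel estimate in the quotation above can be checked with simple arithmetic. The sketch below assumes identical, independently conducting channels, so that the macroscopic conductance G is the single-channel conductance γ multiplied by the number of channels N; the symbols G and N are introduced here for illustration only, while the numerical values are those quoted from Koriazova & Montal.

```latex
% Number of open channels from macroscopic and single-channel conductance
N \approx \frac{G}{\gamma}
  = \frac{1 \times 10^{5}\ \mathrm{pS}}{50\ \mathrm{pS}}
  = 2000

% Ratio of the reported translocated LC molecules to the estimated channel count
\frac{3600\ \text{LC molecules}}{2000\ \text{channels}} = 1.8\ \text{LC molecules per channel}
```

      Under these assumptions, the reported activity corresponds to roughly one to two LC molecules per channel, which is the sense in which the two numbers are in close agreement.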

      As an active scientist, I understand the challenges of peer review and publication, which can often be slow and frustrating, involving seemingly endless rounds of review. Therefore, I am in favor of the new eLife publishing model. Indeed, this paper has already been published as a Reviewed Preprint and will soon be declared the final Version of Record, accompanied by this public review. Having said that, I hope that the readers of this journal and future scientists will prove me wrong. I hope they will engage with this paper, providing comments, validations (which are currently missing), and citations as frequently as they did for the seminal works of Koriazova & Montal.

    4. Reviewer #2 (Public Review):

      Summary:

      The study by Yeo and co-authors addresses a long-lasting issue about botulinum neurotoxin (BoNT) intoxication. The current view is that the toxin binds to its receptors at the axon terminus by its HCc domain and is internalized in recycled neuromediator vesicles just after release of the neuromediators. Then, the HCn domain assists the translocation of the catalytic light chain (LC) of the toxin through the membrane of these endocytic vesicles into the cytosol of the axon terminus. There, the LC cleaves its SNARE substrate and blocks neurosecretion. However, other views involving kinetic aspects of intoxication suggest that the toxin follows the retrograde axonal transport up to the nerve cell body and then back to the nerve terminus before cleaving its substrate.

      In the current study, the authors claim that BoNT/A (isotype A of BoNT) not only progresses to the cell body but, once there, follows the retrograde trafficking pathway in a retromer-dependent fashion, through the Golgi apparatus, until reaching the endoplasmic reticulum. Next, the LC dissociates from the HC (a process not studied here) and uses the Sec61 translocon machinery to retro-translocate into the cytosol. Only then does the LC traffic back to the nerve terminus following anterograde axonal transport. Once there, the LC cleaves its SNARE substrate (SNAP25 in the case of BoNT/A) and blocks neurosecretion.

      To reach their conclusion, Yeo and co-authors use a combination of engineered tools: a cell line able to differentiate into neurons (ReNcell VN), a dual fluorescent reporter protein derived from SNAP25, the substrate of BoNT/A (called SNAPR), either native BoNT/A or a toxin in which three copies of fragment 11 of the reporter fluorescent protein Neon Green (mNG) are fused to the N-terminus of the LC (BoNT/A-mNG11x3), and finally ReNcell VN transfected with mNG1-10 (a protein consisting of the first 10 beta strands of mNG).

      SNAPR is stably expressed all over in the ReNcell VN. SNAPR is yellow (red and green) when intact and becomes red only when cleaved by BoNT/A LC, the green tip being degraded by the cell. When the LC of BoNT/A-mNG11x3 reaches the cytosol in ReNcell VN transfected by mNG1-10, the complete mNG is reconstituted and emits a green fluorescence.

      In the first experiment, the authors show that the catalytic activity of the LC appears first in the cell body of neurons where SNAPR is cleaved first. This phenomenon starts 24 h after intoxication and progresses along the axon towards the nerve terminus during an additional 24 h. In a second experiment, the authors intoxicate the ReNcell VN transfected by mNG1-10 using the BoNT/A-mNG11x3. The fluorescence appears also first in the soma of neurons, then diffuses in the neurites in 48 h. The conclusion of these two experiments is that translocation occurs first in the cell body and that the LC diffuses in the cytosol of the axon in an anterograde fashion.

      In the second part of the study, the authors perform a siRNA screen to identify regulators of BoNT/A intoxication. Their aim is to identify genes involved in intracellular trafficking of the toxin and translocation of the LC. Interestingly, they found positive and negative regulators of intoxication. Regulators could be grouped according to the sequential events of intoxication: genes affecting binding to the cell-surface receptor (SV2) and internalization; genes involved in intracellular trafficking; genes involved in translocation, such as reduction of the disulfide bond linking the LC to the HC and refolding in the cytosol; and genes involved in signaling, such as tyrosine kinases and phosphatases. All these groups of genes may be consistent with the current view of BoNT intoxication within the nerve terminus. However, two sets of genes were particularly significant to reach the main conclusion of the work and definitely constitute an original finding important to the field. One set of genes consists of those of the retromer; the other relates to the Sec61 translocon. This should indicate that, once endocytosed, the BoNT traffics from the endosomes to the Golgi apparatus, then to the ER. Ultimately, the LC should translocate from the ER lumen to the cytosol using the Sec61 translocon. The authors further verify that the SV2 receptor for BoNT/A traffics along the axon in a retromer-dependent fashion and, by fusing mNG1-10 to a Golgi resident protein, that BoNT/A-mNG11x3 traverses the Golgi apparatus.

      Strengths:

      The findings in this work are convincing. The experiments are carefully done and are properly controlled. In the first part of the study, the activity of the LC is monitored together with the physical presence of the toxin. In the second part of the work, the most relevant genes that came out of the siRNA screen are checked individually in the ReNcell VN / BoNT/A reporter system to confirm their role in BoNT/A trafficking and retro-translocation.

      These findings are important to the fields of toxinology and medical treatment of neuromuscular diseases by BoNTs. They may explain some aspects of intoxication such as slow symptom onset, aggravation and appearance of central effects.

      Weaknesses:

      The findings contradict the current view of the intoxication pathway, which is supported by a vast number of observations. The findings are certainly valid, but their generalization as the sole mechanism of BoNT intoxication should be tempered. These observations are restricted to one particular neuronal model and engineered protein tools. Other models, such as isolated nerve/muscle preparations, display nerve terminus paralysis within minutes rather than days. Also, the tetanus neurotoxin (TeNT), whose mechanism of action involving axonal transport to the posterior ganglia in the spinal cord is well described, takes between 5 and 15 days. It is thus possible that different intoxication mechanisms co-exist for BoNTs or even vary depending on the type of neurons.

      Although the siRNA experiments are convincing, it would be nice to reach the same observations with drugs affecting the endocytic to Golgi to ER transport (such as Retro-2, golgicide or brefeldin A) and the Sec61 retrotranslocation (such as mycolactone). Then, it would be nice to check other neuronal systems for the same observations.

    5. Reviewer #3 (Public Review):

      Summary:

      The manuscript by Yeo et al. investigates the intracellular trafficking of Botulinum neurotoxin A (BoNT/A), a potent toxin used in clinical and cosmetic applications. Contrary to the prevailing understanding of BoNT/A translocation into the cytosol, the study suggests a retrograde migration from the synapse to the soma-localized Golgi in neurons. Using a genome-wide siRNA screen in genetically engineered neurons, the researchers identify over three hundred genes involved in this process. The study employs organelle-specific split-mNG complementation, revealing that BoNT/A traffics through the Golgi in a retromer-dependent manner before moving to the endoplasmic reticulum (ER). The Sec61 complex is implicated in the retro-translocation of BoNT/A from the ER to the cytosol. Overall, the research challenges the conventional model of BoNT/A translocation, uncovering a complex route from synapse to cytosol for efficient intoxication. The findings are based on a comprehensive approach, including the introduction of a fluorescent reporter for BoNT/A catalytic activity and genetic manipulations in neuronal cell lines. The conclusions highlight the importance of retrograde trafficking and the involvement of specific genes and cellular processes in BoNT/A intoxication.

      Strengths:

      Most of the experiments are convincing. They are well-controlled and the interpretation of their results is balanced and sensitive.

      Weaknesses:

      In my opinion, the main weakness of the paper is that all experiments are performed using a single cellular system (RenVM neurons), as stated in the title. It is therefore unclear at the moment to what extent the findings in this paper can be generalized to other neuronal cell models or to the in vivo situation.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1:

      (1) Importantly, it would be useful to have provided more detailed information on the structure and histological properties of the murine cysts and how such findings relate to human lung cysts. Also, the authors should examine whether there is any information on Bmpr1a in human cyst formation (e.g., GWAS data).

      We fully agree that it is important to examine Bmpr1a in human cyst pathology. Unfortunately, there are no GWAS data on this. In the published RNA-seq data, which were obtained from postnatal lung specimens of congenital pulmonary airway malformation (CPAM) patients, “integrated suppression of BMP signaling pathway” was reported, although altered expression of BMPR1A was not presented. We speculate that (1) BMPR1A is critical in embryonic development, and a germline deficiency of BMPR1A may lead to early embryonic lethality prior to lung formation, as supported by mouse data; (2) as suggested by our previously published study related to TGF-beta signaling and prenatal pulmonary cysts (Miao et al., Am J Physiol Lung Cell Mol Physiol 2021), dysregulation of BMPR1A-mediated signaling in a particular time window of fetal lung development may be sufficient to cause cyst formation, so that the BMPR1A alteration may not persist in postnatal lung specimens.

      (2) Throughout the paper, there is a lack of quantification for the histological findings. Littermate controls should also be clearly defined genetically.

      We thank the reviewer for this suggestion and acknowledge the importance of quantitative measurement of the changes. We have now added quantitative data on branching number and size of the airway tips to define the difference between wild-type and Bmpr1a CKO mouse lungs in Fig. 1. “The littermate controls were the mice without any gene deletion due to lack of transgenes Tbx4-rtTA and/or TetO-Cre”, which is now added in Materials and Methods.

      (3) Figure 1 suppl: "Doxycycline" is misspelled.

      This has been corrected.

      (4) Figure1c Suppl: Hard to discern clear-cut expression of Bmpr1a protein in mesenchyme in WT. Comparable images with similar sizes of airways should be used.

      To provide a clearer comparison of Bmpr1a expression patterns between Bmpr1a CKO and control lungs, we enlarge the fluorescent stained lungs presented in Supplemental Figure 1C as suggested by the editor. Additionally, dotted lines have been added to delineate the airway boundaries from the surrounding mesenchyme to better visualize the Bmpr1a distribution in lung mesenchyme. Bmpr1a expression in fetal lung mesenchyme is easily detected at E15.5 when significant dilation of airways is presented in Bmpr1a CKO lung. It is rare to have comparable sizes of peripheral airways in the Bmpr1a CKO lung at this point.

      (5) Figure 2a: Expression of several genes studied and altered should be identified on scatter plot.

      As suggested by the reviewer, we now highlight the related genes, including Acta2, Myocd, Eln, Bmp4, Sox2, etc., in the scatter plot. In addition, we also highlight these critical genes in the heatmap (Fig. 2B and Fig. 7B).

      (6) Figure 2c: Authors should also consider staining for other smooth muscle markers.

      We now include a panel of Myh11 immunostaining in Figure 2E. Myh11 is another common marker for smooth muscle cells. Lack of Myh11 staining in Bmpr1a CKO lung airways further supports our conclusion that loss of mesenchymal Bmpr1a leads to defective airway smooth muscle growth.

      (7) Figure 3: ELN expression should be defined in a clear quantitative manner.

      We have presented RNA-seq data, real-time PCR results, immunostaining, and western blot data for in vivo samples. Additionally, we have included an in vitro experiment illustrating that Bmp4 induces Eln expression, suggesting that BMP signaling regulates Eln expression. We believe that these datasets collectively support our conclusion.

      (8) Figure 4: Additional information on p38 dependent signaling (Including in vivo studies) would potentially help to understand key molecular events and perhaps could help to address key mechanistic events, including their location and identity.

      We sincerely appreciate the insightful suggestion from the reviewer. While the study of p38-dependent signaling is definitely important to dissect the entire mechanisms, we are not going to include such experiments in this manuscript due to time constraints associated with in vivo studies.

      (9) Figure 6: Would be helpful to know whether Bmpr1a receptor is expressed in Myocd KO.

      Bmpr1a expression is not changed in Myocd KO lungs, which is now included as Figure 6C. Together with other data, this suggests that Myocd is a downstream target directly mediating Bmpr1a-regulated airway smooth muscle development.

      (10) Figure 7: Not clear how these findings, though interesting, relate to the body of studies and the pathogenesis of cyst formation. Other points: 1) The authors should re-examine/repeat co-staining in the KO mouse lung (right 2 images in the top group of 4) for Foxj1, Sox2, and CDH (right 2 images, Figure 7A). For one thing, the cadherin stain in the 2 KO images seems localized to the lumen. Secondly, the pattern of cadherin staining looks exactly the same in both KO images, suggesting an error and/or duplication 2) authors should place arrows on the heat map showing the location of SPC, Sox2, Sox9, and FoxJ1 bands 3) figure 7D graph needs numbers on y axis.

      Fig.7 provides an additional potential mechanism by which deficient Bmp signaling leads to abnormally increased Bmp ligand expression, which disrupts the formation of epithelial proximal-distal axis, and results in cystic defects. Further in vivo experiments are needed to test this, which is beyond the scope of this paper.

      The E-cadherin staining signal in the lumen is caused by the tissue section positioned at an interface between lumen and the apical membrane of the lining epithelial cells where the E-cadherin is localized.

      Triple immunostaining of E-Cadherin, Sox2, and FoxJ1 was performed on the same tissue section (upper two panels of Figure 7A), as these antibodies were derived from different species, but the images are presented in two different combinations for simplicity and clarity. For the lower two panels of Figure 7A, double immunostaining of Sox9/E-Cadherin and Spc/E-Cadherin was performed separately on different tissue sections because both the anti-Sox9 and anti-Spc antibodies were raised in rabbits.

      The genes listed in the heatmap are canonical and putative marker genes for differentiated lung epithelial cell lineages, such as Scgb1a1 for Clara cells and FoxJ1 for ciliated cells. Therefore, the progenitor cell markers Sox2 and Sox9 were not included. In the updated heatmap, four widely acknowledged epithelial cell markers (Scgb1a1, FoxJ1, Sftpb, and Sftpc) are shown in a distinct font color (red) to enhance their visibility.

      Label for the y axis of Fig.7D is now added.

      Reviewer #2 (Public Review):

      (1) The authors may be aware that a recent paper (https://doi.org/10.1038/s41598-022-24858-3) reported on transcriptional changes seen in human CPAM. It would seem that some of the molecular changes seen in human CPAM move in the opposite direction of what is reported in mice lacking mesenchymal Bmpr1a. Perhaps the authors could comment on these differences in the discussion and whether they potentially explain the etiology of CPAM or branching morphogenesis in general.

      We thank the reviewer for referring us to this paper on human CPAM. CPAM has a variety of histopathologies. Type 1 CPAM is assumed to develop from more proximal bronchial/bronchiolar airways, while type 2 CPAM develops from relatively distal bronchiolar airways. In that publication, surgically resected lung specimens were collected from type 1 CPAM patients postnatally (0.5-1 year), in which the cysts were lined with ciliated pseudostratified columnar epithelial cells. Gene expression was compared between cystic lung tissues and adjacent non-cystic lung tissues. Interestingly, integrated suppression of the BMP signaling pathway was shown by their data analysis. In our mouse model, the histopathology resembles human type 2 CPAM, with back-to-back cysts lined by a simple layer of epithelial cells. Therefore, several factors could explain the differences between their published data and our study at the molecular level: (1) different types of CPAM based on the histopathology; (2) different sampling time points: developing cysts at the fetal stage in mouse samples vs. developed cysts in postnatal human samples; (3) different comparisons of diseased and normal tissues: separate normal lungs vs. cystic lungs in mice, but cystic vs. non-cystic tissues from the same lungs in humans. We now include this reference in the Discussion.

      (2) Figure 4 shows that BMP4 increases SMADs, p38, and several muscle genes in mesenchymal cells. Figure 5 extends this finding with a clever strategy to label airway and vascular smooth muscle with different fluorescent molecules used to isolate different types of mesenchymal cells. It shows that non-vascular smooth muscle cells but not perivascular smooth muscles are responsive to BMP4 signaling as defined by increased expression of Myh11. Are there cell-restricted responses to the other genes shown in Figure 4? Given the lack of SMAD signaling and the increase seen in p38 signaling, would blocking p38 signaling influence the BMP responsiveness of these nonvascular smooth muscle cells?

      We thank the reviewer for this constructive comment. As addressed above, we will leave p38-mediated signaling and cyst formation to a follow-up study, owing to the time constraints associated with these experiments.

      (3) Figure 6 shows that mesenchymal loss of Myocd causes a deficiency of airway smooth muscle cells, but this was not sufficient to create cysts. Did the authors ever check to see if it changed Sox2-Sox9 staining in the airway epithelium?

      There is no significant change in Sox2 expression in proximal airway epithelia of Myocd CKO lungs as detected by immunostaining. The result was not included in this manuscript.

      (4) Figure 7 shows that mesenchymal loss of Bmpr1a proximalizes the distal airway as defined by loss of Sox2 and FoxJ1 (a ciliated marker) and gain in (Sox9 and SP-C) staining. But Club cells expressing Scgb1a1 and Cyp2F2 are the predominant epithelial cells in the distal airway. The transcriptomics data in panel B shows expression of these genes is less in the mutant mice. Does this mean they fail to generate Club cells or there is just less expression per cell? In other words, what are the primary epithelial cells present in the airways of mice with loss of mesenchymal Bmpr1a?

      As shown in the heatmap of Fig. 7B, the dysregulated gene expression in the Bmpr1a CKO extends beyond the featured epithelial cell markers, encompassing alterations in numerous putative marker genes. For example, several putative Club cell markers in addition to Scgb1a1 and Cyp2F2 were reduced in the Bmpr1a CKO lungs, suggesting compromised differentiation of Club cells. Additionally, we observed upregulation of some molecular markers for distal progenitors and differentiated cells in the proximal region of the airways, again suggesting a significant disruption of epithelial differentiation in the Bmpr1a CKO lungs. These abnormal cells can be further defined by a single-cell transcriptomic approach in the future.

      Recommendations for Authors:

      Reviewer #1 (Recommendations For The Authors):

      As discussed above, there may be an issue with the histological images and staining in 2 images in Figure 7A. The precise images, problems and suggestions to resolve the issue are in the Review.

      Please see our response to Reviewer 1 above.

      Reviewer #2 (Recommendations For The Authors):

      Minor Weaknesses:

      (1) Please enlarge the fluorescent stained lungs presented in Supplemental Figure 1C.

      We have revised this panel accordingly.

      (2) Figure 1D and E show that loss of Bmpr1a does not change proliferation or apoptosis on E15.5. Was that also seen through E18.5?

      We thank the reviewer for the thoughtful question about proliferation and apoptosis at later embryonic stages. Our focus here was to elucidate the mechanisms underlying abnormal branching morphogenesis and lung cyst initiation that occur prior to E15.5 in our model. Measuring the dynamic changes in cell proliferation and apoptosis at later timepoints will help to understand cyst progression, which will be our next focus.

      (3) BMP inhibitors used in Figure 4 show that BMP signaling regulates mesenchymal myogenesis independent of SMAD. But the experiments don't show how the inhibitors impact the control cells.

      We have examined the effects of the BMPR1 inhibitor LDN on the control cells. At the same dose (200 nM) and under the same serum-free culture conditions, LDN did not affect the basal level of BMP signaling (data not included) but blocked the elevation in signaling induced by exogenous BMP4 (Fig. 4E).

      (4) Bmpr1a was deleted by administering doxycycline to pregnant dams prior to lung bud formation. It caused cystic disorders by disrupting proximal airspace. Could the authors speculate on why it does not impact tracheal and bronchiolar development? In other words, does the TBX4 promoter not target these cells? Do these cells not express Bmpr1a?

      The Tbx4 enhancer does target mesenchymal cells surrounding the trachea and bronchioles. Deletion of Bmpr1a in tracheal mesenchymal cells results in disruption of tracheal cartilage formation and smooth muscle differentiation. These phenotypes are evident in the gross view of lungs from E15.5 and later (Fig. 1A). However, our manuscript focuses on the phenotype of prenatal lung cysts, and we have chosen not to include complex data on tracheal development.

    2. eLife assessment

      This valuable paper characterizes a murine model for congenital cystic airway abnormalities (CPAM). In contrast to previous assumptions that only epithelial cells are involved in the formation of pulmonary cysts, the authors provide compelling new evidence that defective BMP signaling in lung mesenchymal cells can disrupt airway development. Knowing that proper BMP signaling in mesenchymal cells is required for normal cyst-free lungs could potentially pave the way to understanding and preventing CPAM in infants at risk for this common disorder, which can be fatal if untreated. The relevance of the murine model could be enhanced by further molecular and histological comparison with human cysts.

    3. Reviewer #2 (Public Review):

      Congenital cystic airway abnormalities (CPAM) are a common, poorly understood disorder of lung airway development that can be fatal if not effectively treated at birth. This study by Luo and colleagues provides compelling new evidence that bone morphogenetic protein signaling in distal mesenchymal cells is required for normal mouse lung development. Genetic loss of BMP receptor in mice and in fetal mesenchymal cells causes type 2 or alveolar-like CPAM pathology. Furthermore, this is associated with changes in expression of Sox2-Sox9, suggesting defects in the proximal to distal cellularity of the lung. Interestingly, cysts are formed even when SMAD1 and 5, two major downstream effectors of BMP signaling, are deleted, suggesting a role for non-canonical BMP signaling. Furthermore, they were independent of ablating BMP signaling in non-vascular mesenchymal cells. The findings are compelling and provide strong evidence that cystic lung development is caused by loss of non-canonical BMP signaling in mesenchymal cells. The main weakness of the paper is that it does not identify the downstream non-canonical effector of mesenchymal BMP signaling. The authors provide a plausible suggestion that it may be p38 MAPK, which deserves further investigation. Despite this minor weakness, the overall findings are novel and considered important because they provide a foundation for new studies, including experiments that may produce drugs designed to prevent or treat newborn infants with CPAM.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      This manuscript explores the impact of serotonin on olfactory coding in the antennal lobe of locusts and odor-evoked behavior. The authors use serotonin injections paired with an odor-evoked palp-opening response assay and bath application of serotonin with intracellular recordings of odor-evoked responses from projection neurons (PNs).

      Strengths:

      The authors make several interesting observations, including that serotonin enhances behavioral responses to appetitive odors in starved and fed animals, induces spontaneous bursting in PNs, and uniformly enhances PN responses to odors. Overall, I had no technical concerns.

      Weaknesses:

      While there are several interesting observations, the conclusions that serotonin enhanced sensitivity specifically and that serotonin had feeding-state-specific effects, were not supported by the evidence provided. Furthermore, there were other instances in which much more clarification was needed for me to follow the assumptions being made and inadequate statistical testing was reported.

      Major concerns.

      • To enhance olfactory sensitivity, the expected results would be that serotonin causes locusts to perceive each odor as being at a relatively higher concentration. The authors recapitulate a classic olfactory behavioral phenomenon where higher odor concentrations evoke weaker responses which is indicative of the odors becoming aversive. If serotonin enhanced the sensitivity to odors, then the dose-response curve should have shifted to the left, resulting in a more pronounced aversion to high odor concentrations. However, the authors show an increase in response magnitude across all odor concentrations. I don't think the authors can claim that serotonin enhances the behavioral sensitivity to odors because the locusts no longer show concentration-dependent aversion. Instead, I think the authors can claim that serotonin induces increased olfactory arousal.

      The reviewer makes a valid point. Bath application of serotonin increased POR behavioral responses across all odor concentrations, and concentration-dependent aversion was also not observed. Furthermore, the monotonic relationship between projection neuron responses and the intensity of current injection is altered when serotonin is exogenously introduced (see Author response image 1; see below for more explanation). Hence, our data suggest that serotonin alters the dose-response relationship between neural/behavioral responses and odor intensity. As the reviewer recommended, we have revised our claim to state that serotonin induces an increase in olfactory arousal. The new physiology data have been added as Supplementary Figure 3 to the revised manuscript.

      • The authors report that 5-HT causes PNs to change from tonic to bursting and conclude that this stems from a change in excitability. However, excitability tests (such as I/V plots) were not included, so it's difficult to disambiguate excitability changes from changes in synaptic input from other network components.

      To confirm that the PN excitability did indeed change after serotonin application, we performed a new set of current-clamp recordings. In these experiments, we monitored the spiking activity of individual PNs as we injected different levels of current (200-1000 pA). Note that locust LNs that provide recurrent inhibition arborize and integrate inputs from a large number of sensory neurons and projection neurons. Therefore, activating a single PN should not activate the local neurons, and hence should not recruit the antennal lobe network.

      We found that the total spiking activity monotonically increased with the magnitude of the current injection in all four PNs recorded (Author response image 1). However, after serotonin injection, we found that the spiking activity remained relatively stable and did not systematically vary with the magnitude of the current injection. While the changes in odor-evoked responses may incorporate both excitability changes in individual PNs and recurrent feedback inhibition through GABAergic LNs, these results from our current injection experiments unambiguously indicate that there are changes in excitability at the level of individual PNs. We have added this result to the revised manuscript.

      Author response image 1.

      Current-injection-induced spiking activity in individual PNs is altered after serotonin application. (A) Representative intracellular recordings showing membrane potential fluctuations as a function of time for one projection neuron (PN) in the locust antennal lobe. A two-second window when a positive 200-1000 pA current was applied is shown. Firing patterns before (left) and after (right) serotonin application are shown for comparison. Note that the spiking activity changes after the 5HT application. The black bar represents the 20 mV scale. (B) Dose-response curves showing the average number of action potentials (across 5 trials) during the 2-second current pulse before (green) and after (purple) serotonin for each recorded PN. Note that the current intensity was systematically increased from 200 pA to 1000 pA. (C) The mean number of spikes across the four recorded cells during current injection is shown. The color progression represents the intensity of applied current, ranging from 200 pA (leftmost bar) to 1000 pA (rightmost bar). The dose-response trends before (green) and after (purple) 5HT application are shown for comparison. The error bars represent SEM across the four cells.
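
      The summary described in panels B and C (averaging spike counts over trials within each cell, then taking the mean and SEM across cells at each current step) can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names, the data layout, and all spike counts are invented for the example, and the strict-increase check is only one simple way to ask whether a dose-response trend is monotonic.

```python
from math import sqrt
from statistics import mean, stdev


def dose_response(spike_counts):
    """Summarize spike counts per current step.

    spike_counts maps cell id -> {current in pA -> list of per-trial spike counts}.
    Returns {current: (mean across cells, SEM across cells)}.
    """
    cells = list(spike_counts.values())
    currents = sorted(cells[0])
    summary = {}
    for current in currents:
        # Average over trials within each cell first, then across cells.
        per_cell = [mean(cell[current]) for cell in cells]
        sem = stdev(per_cell) / sqrt(len(per_cell)) if len(per_cell) > 1 else 0.0
        summary[current] = (mean(per_cell), sem)
    return summary


def increases_monotonically(summary):
    """True if the across-cell mean strictly increases with injected current."""
    means = [summary[current][0] for current in sorted(summary)]
    return all(a < b for a, b in zip(means, means[1:]))


# Invented spike counts for two hypothetical cells (NOT data from the study).
before_5ht = {
    "cell_a": {200: [2, 2], 600: [5, 5], 1000: [9, 9]},
    "cell_b": {200: [4, 4], 600: [7, 7], 1000: [11, 11]},
}
after_5ht = {
    "cell_a": {200: [6, 7], 600: [7, 6], 1000: [6, 6]},
    "cell_b": {200: [8, 7], 600: [7, 8], 1000: [7, 7]},
}

print(increases_monotonically(dose_response(before_5ht)))
print(increases_monotonically(dose_response(after_5ht)))
```

      With real recordings, the same two calls would show whether the graded relationship between injected current and spike count survives serotonin application.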

      • There is another explanation for the theoretical discrepancy between physiology and behavior, which is that odor coding is further processed in higher brain regions (i.e., other than the antennal lobe) not studied in the physiological component of this study. This should at least be discussed.

      This is a valid argument. For our model of neural mapping onto behavior to work, we only need the odorant that evokes or suppresses PORs to activate a distinct set of neurons. Having said that, our extracellular recording results (Fig. 6E) indicate that hexanol (high POR) and linalool (low POR) do activate highly non-overlapping sets of PNs in the antennal lobe. Hence, our results suggest that the segregation of neural activity based on behavioral relevance already begins in the antennal lobe. We have added this clarification to the discussion section.

      • The authors cannot claim that serotonin underlies a hunger state-dependent modulation, only that serotonin impacts responses to appetitive odors. Serotonin enhanced PORs for starved and fed locusts, so the conclusion would be that serotonin enhances responses regardless of the hunger state. If the authors had antagonized 5-HT receptors and shown that feeding no longer impacts POR, then they could make the claim that serotonin underlies this effect. As it stands, these appear to be two independent phenomena.

      This is also a valid point. We have clarified this in the revised manuscript.

      Reviewer #2 (Public Review):

      Summary:

      The authors investigate the influence of serotonin on feeding behavior and electrophysiological responses in the antennal lobe of locusts. They find that serotonin injection changes behavior in an odor-specific way. In physiology experiments, they can show that antennal lobe neurons generally increase their baseline firing and odor responses upon serotonin injection. Using a modeling approach, the authors propose a framework on how a general increase in antennal lobe output can lead to odor-specific changes in behavior. The authors finally suggest that serotonin injection can mimic a change in a hunger state.

      Strengths:

      This study shows that serotonin affects feeding behavior and odor processing in the antennal lobe of locusts, as serotonin injection increases activity levels of antennal lobe neurons. This study provides another piece of evidence that serotonin is a general neuromodulator within the early olfactory processing system across insects and even phyla.

      Weaknesses:

      I have several concerns regarding missing control experiments, unclear data analysis, and interpretation of results.

      A detailed description of the behavioral experiments is lacking. Did the authors also provide a mineral oil control and did they analyze the baseline POR response? Is there an increase in baseline response after serotonin exposure already at the behavioral output level? It is generally unclear how naturalistic the chosen odor concentrations are. This is especially important as behavioral responses to different concentrations of odors are differently modulated after serotonin injection (Figure 2: Linalool and Ammonium).

      POR protocol: Sixth instar locusts (Schistocerca americana) of either sex were starved for 24-48 hours before the experiment or, for the satiated condition, taken straight from the colony and fed blades of grass. Locusts were immobilized by placing them in a plastic tube and securing their body with black electrical tape (see Author response image 2). Locusts were given 20-30 minutes to acclimatize after placement in the immobilization tube. The head of the locust, along with the antennae and maxillary palps, protruded out of the immobilization tube so that the locust could move them freely. Note that the maxillary palps are sensory organs close to the mouthparts that are used to grab food and help with the feeding process.

      It is worth noting that our earlier studies had shown that the presentation of ‘appetitive odorants’ triggers locusts to open their maxillary palps even when no food is presented (Saha et al., 2017; Nizampatnam et al., 2018; Nizampatnam et al., 2022; Chandak and Raman, 2023). Furthermore, our earlier results indicate that the probability of palp opening varies across different odorants (Chandak and Raman, 2023). We chose four odorants that had a diverse range of palp-opening: supra-median (hexanol), median (benzaldehyde), and sub-median (linalool). Therefore, each locust in our experiments was presented with one concentration of four odorants (hexanol, benzaldehyde, linalool, and ammonium) in a pseudorandomized order. The odorants were chosen based on our physiology results such that they evoked different levels of spiking activity.

      The odor pulse was 4 s in duration and the inter-pulse interval was set to 60 s. The experiments were recorded using a web camera (Microsoft) placed right in front of the locusts. The camera was fully automated with a custom MATLAB script to start recording 2 seconds before the odor pulse and end recording at odor termination. An LED was used to track the stimulus onset/offset. The POR responses were manually scored offline. Responses to each odorant were scored as 0 or 1 depending on whether the palps remained closed or opened. A positive POR was defined as a movement of the maxillary palps during the odor presentation time window, as shown on the locust schematic (Main Paper Figure 1).
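
      The binary scoring described above lends itself to a simple tally of response probability per odorant. The sketch below is illustrative only: the function name, the data layout, and the example scores are assumptions made for this example and do not come from the study.

```python
from collections import defaultdict


def por_probability(trials):
    """Fraction of trials with a palp-opening response, per odorant.

    trials is an iterable of (odorant, score) pairs, where score is
    0 (palps stayed closed) or 1 (palps opened during the odor window).
    """
    counts = defaultdict(lambda: [0, 0])  # odorant -> [n_open, n_total]
    for odorant, score in trials:
        if score not in (0, 1):
            raise ValueError(f"POR score must be 0 or 1, got {score!r}")
        counts[odorant][0] += score
        counts[odorant][1] += 1
    return {odor: n_open / n_total for odor, (n_open, n_total) in counts.items()}


# Invented scores for illustration (NOT data from the study).
example_trials = [
    ("hexanol", 1), ("hexanol", 1), ("hexanol", 0), ("hexanol", 1),
    ("linalool", 0), ("linalool", 0), ("linalool", 1), ("linalool", 0),
]
example_por = por_probability(example_trials)
print(example_por)
```

      Keeping the per-trial scores rather than only the summary fraction makes it possible to pair each animal's responses before and after a manipulation.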

      Author response image 2.

      Pictures showing the behavior experiment setup and representative palp-opening responses in a locust.

      As the reviewer inquired, we performed a new series of POR experiments in which we examined POR responses to mineral oil and hexanol before and after serotonin injection. For this study, we used 10 locusts that were starved for 24-48 hours before the experiment. Note that hexanol was diluted to 1% (v/v) in mineral oil. Our results reveal that locust PORs to hexanol (~50% PORs) were significantly higher than those triggered by mineral oil (~10% PORs). Injection of serotonin increased the POR response rate to hexanol but did not alter the PORs evoked by mineral oil (Author response image 3).

      Author response image 3.

      Serotonin does not alter the palp-opening responses evoked by paraffin oil. The PORs before and after serotonin (5HT) injection are summarized and shown as a bar plot for hexanol and paraffin oil. Striped bars signify the data collected after 5HT injection. Significant differences are identified in the plot (one-tailed paired-sample t-test; *p<0.05).

      Regarding recordings of potential PNs - the authors do not provide evidence that they did record from projection neurons and not other types of antennal lobe neurons. Thus, these claims should be phrased more carefully.

      In the locust antennal lobe, only the cholinergic projection neurons fire full-blown sodium spikes. The GABAergic local neurons only fire calcium ‘spikelets’ (Laurent, TINS, 1996; Stopfer et al., 2003; see Author response image 4 for an example). Hence, we are pretty confident that we are only recording from PNs. Furthermore, due to the physiological properties of the LNs, their signals being too small, they are also not detected in the extracellular recordings from the locust antennal lobe. Hence, we are confident with our claims and conclusion.

      Author response image 4.

      PN vs LN physiological differences. Left: A representative raw voltage trace recorded from a local neuron before, during, and after a 4-second odor pulse is shown. Note that the local neurons in the locust antennal lobe do not fire full-blown sodium spikes but only fire small calcium spikelets. Right: A raw voltage trace recorded from a representative projection neuron is shown for comparison. Sodium spikes are clearly visible during spontaneous and odor-evoked periods. The gray bar represents the 4-second odor pulse. The vertical black bar represents 40 mV.

      The presented model suggests labeled lines in the antennal lobe output of locusts. Could the presented model also explain a shift in behavior from aversion to attraction - such as seen in locusts when they switch from a solitarious to a gregarious state? The authors might want to discuss other possible scenarios, such as that odor evaluation and decision-making take place in higher brain regions, or that other neuromodulators might affect behavioral output. Serotonin injections could affect behavior via modulation of other cell types than antennal lobe neurons. This should also be discussed - the same is true for potential PNs - serotonin might not directly affect this cell type, but might rather shut down local inhibitory neurons.

      There are multiple questions here. First, regarding solitary vs. gregarious states, we are currently repeating these experiments on solitary locusts. Our preliminary results (not included in the manuscript) indicate that the solitary animals have increased olfactory arousal and respond with a higher POR but are less selective and respond similarly to multiple odorants. We are examining the physiology to determine whether the model for mapping neural responses onto behavior could also explain observations in solitary animals.

      Second, this reviewer makes the same point raised by Reviewer 1. We agree that odor evaluation and decision-making might take place in higher brain regions. All we can conclude from our data is that a segregation of neural activity based on behavioral relevance might provide the simplest approach to map a non-specific increase in stimulus-evoked neural responses onto odor-specific changes in behavioral outcome. Furthermore, our results indicate that hexanol and linalool, two odorants that had an increase and a decrease in PORs after serotonin injection, respectively, had only minimal neural response overlap in the antennal lobe. These results suggest that the formatting of neural activity to support varying behavioral outcomes might already begin in the antennal lobe. We have added this to our discussion.

      Third, regarding serotonin impacting PNs, we performed a new set of current-clamp experiments to examine this issue (Author response image 1). Our results clearly show that projection neuron activity in response to current injections (that should not incorporate feedback inhibition through local neurons) was altered after serotonin injection. Therefore, the observed changes in the odor-evoked neural ensemble activity should incorporate modulation at both individual PN level and at the network level. We have added this to our discussion as well.

      Finally, the authors claim that serotonin injection can mimic the starved state behavioral response. However, this is only shown for one of the four odors that are tested for behavior (HEX), thus the data does not support this claim.

      We note that Hex is the only appetitive odorant in the panel. But, as reviewer 1 has also brought up a similar point, we have toned down our claims and will investigate this carefully in a future study.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      • Was the POR of the locusts towards linalool and ammonium higher than towards a blank odor cartridge? I ask because the locusts appear to be less likely to respond to these odors and so I am concerned that this assay is not relevant to the ecological context of these odors. In other words, perhaps serotonin did not enhance the responses to these odors in this assay, because this is not a context in which locusts would normally respond to these odors.

      The POR response to linalool and ammonium is lower and comparable to that of paraffin oil. Serotonin does not increase POR responses to paraffin oil but does increase response to hexanol (an appetitive odorant). We have clarified this using new data (Author response image 5).

      • It seems to me that Figure 5C is the crux for understanding the potential impact of 5-HT on odor coding, but it is somewhat confusing and underutilized. Is the implication that 5-HT decorrelates spontaneous activity such that when an odor stimulus arrives, the odor-evoked activity deviates to a greater degree? The authors make claims about this figure that require the reader to guess as to the aspect of the figure to which they are referring.

      The reviewer makes an astute observation. Yes, the spontaneous activity in the antennal lobe network before serotonin introduction is not correlated with the ensemble spontaneous activity after serotonin bath application. Remarkably, the odor-evoked responses were highly similar, both in the reduced PCA space and when assayed using high-dimensional ensemble neural activity vectors. Whether the changes in network spontaneous activity have a function in odor detection and recognition is not fully understood and cannot be convincingly answered using our data. But this is something that we had pondered.

      • The modeling component summarized in Figure 6 needs clarification and more detail. Perhaps example traces associated with positive weighting within neural ensemble 1 relative to neural ensemble 2? I struggled to understand conceptually how the model resolved the theoretical discrepancy between physiology and behavior.

      As recommended, here is a plot showing the responses of four PNs that had positive weights to hexanol and linalool. As can be expected, each PN in this group had higher responses to hexanol and no response to linalool. Further, the four PNs that received negative weights had response only to linalool.

      Author response image 5.

      Odor-evoked responses of four PNs that received positive weights in the model (top panel), and four PNs that were assigned negative weights in the model (bottom).
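
      One way to see how a non-specific increase in PN responses could produce odor-specific changes in behavior is a signed linear readout. The sketch below assumes such a readout purely for illustration; the manuscript's actual model may differ, and the weights, response values, and the multiplicative gain standing in for serotonin are all invented.

```python
def linear_readout(weights, responses):
    """Weighted sum of PN responses; a larger value stands for a more likely POR."""
    if len(weights) != len(responses):
        raise ValueError("weights and responses must have the same length")
    return sum(w * r for w, r in zip(weights, responses))


def apply_gain(responses, gain):
    """Scale every PN response by the same factor (a non-specific increase)."""
    return [gain * r for r in responses]


# Hypothetical ensemble: PNs 0-3 get positive weights, PNs 4-7 negative weights.
weights = [1.0, 1.0, 1.0, 1.0, -1.0, -1.0, -1.0, -1.0]

# Invented firing rates (NOT data from the study): hexanol drives only the
# positively weighted PNs, linalool only the negatively weighted PNs.
hexanol = [10.0, 8.0, 12.0, 9.0, 0.0, 0.0, 0.0, 0.0]
linalool = [0.0, 0.0, 0.0, 0.0, 7.0, 6.0, 9.0, 8.0]

gain = 1.5  # assumed uniform increase in odor-evoked responses

for name, odor in (("hexanol", hexanol), ("linalool", linalool)):
    before = linear_readout(weights, odor)
    after = linear_readout(weights, apply_gain(odor, gain))
    print(name, before, after)
```

      Because the two odorants activate non-overlapping groups with opposite signs, the same gain pushes the hexanol readout up and the linalool readout down, which is the qualitative pattern the response describes.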

      • Was there a significant difference between the PORs of hungry vs. fed locusts? The authors state that they differ and provide statistics for the comparisons to locusts injected with 5-HT, but then don't provide any statistical analyses of hungry vs. fed animals.

      The POR responses to HEX (an appetitive odorant) were significantly different between the starved and fed locusts.

      Author response image 6.

      A bar plot summarizing PORs to all four odors for satiated locusts (highlighted with stripes), before (dark shade) and after (lighter shade) 5HT injection. For comparison, PORs of starved locusts before 5HT injection are plotted as well (without stripes). Significance was determined using a one-tailed paired-sample t-test (*p<0.05).

      • Were any of the effects of 5-HT on odor-evoked PN responses significant? No statistics are provided.

      We examined the distribution of odor-evoked responses in PNs before and after 5HT introduction. We found that the overall distribution was not significantly different between the two (one-tailed paired-sample t-test; p = 0.93).

      Author response image 7.

      Comparison of the distribution of odor-evoked PN responses before (green) and after (purple) 5HT introduction. One-tailed paired sample t-test was used to compare the two distributions.
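
      For readers unfamiliar with the paired-sample t-test used throughout these responses, the test statistic is the mean of the within-pair differences divided by its standard error. The sketch below computes only the statistic and degrees of freedom using the Python standard library; the function name and the example values are invented, and obtaining a one-tailed p-value would additionally require the t distribution (for instance, SciPy's `ttest_rel` accepts an `alternative` argument for this).

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(before, after):
    """Return (t, degrees of freedom) for paired samples, testing after - before.

    Raises ZeroDivisionError if every pair differs by the same amount,
    because the standard error of the differences is then zero.
    """
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    if len(before) < 2:
        raise ValueError("need at least two pairs")
    diffs = [a - b for a, b in zip(after, before)]
    standard_error = stdev(diffs) / sqrt(len(diffs))
    return mean(diffs) / standard_error, len(diffs) - 1


# Invented paired measurements (NOT data from the study).
before_values = [3, 5, 4, 6]
after_values = [5, 6, 6, 9]
t_value, dof = paired_t_statistic(before_values, after_values)
print(t_value, dof)
```

      For a one-tailed test of an increase, a large positive t supports the alternative; a t near zero or negative does not.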

      • The authors interchangeably use "serotonin", "5HT" and "5-HT" throughout the manuscript, but this should be consistent.

      This has been fixed in the revised manuscript.

      • On page 2 the authors provide an ecological relevance for linalool as being an additive in pesticides, however, linalool is a common floral volatile chemical. Is the implication that locusts have learned to associate linalool with pesticides?

      Linalool is a terpenoid alcohol that has a floral odor but has also been used as a pesticide and insect repellent [Beier et al., 2014]. As shown in Author response image 2, it evoked the least POR responses amongst a diverse panel of 22 odorants that were tested. We have clarified how we chose odorants based on the prior dataset in the Methods section.

      • In Figure 1, there should be a legend in the figure itself indicating that the black box indicates the absence of POR and the white box indicates presence, rather than just having it in the legend text.

      Done.

      • In Figure 2, the raw data from each animal can be moved to the supplements. The way it is presented is overwhelming and the order of comparisons is difficult to follow.

      Done.

      • For the induction of bursting in PNs by the application of 5-HT, were there any other metrics observed such as period, duration of bursts, or peak burst frequency? The authors rely on ISI, but there are other bursting metrics that could also be included to understand the nature of this observation. In particular, whether the bursts are likely due to changes in intrinsic biophysical properties of the PNs or polysynaptic effects.

      We could use other metrics as the reviewer suggests. Our main point is that the spontaneous activity of individual PNs changed. We have added new current-injection experiments to show that the PN output in response to square pulses of current changes after serotonin application (Author response image 1).

      • Were 4-vinyl anisole, 1-nonanol, and octanoic acid selected as additional odors because they had particular ecological relevance, or was it for the diversity of chemical structure?

      These odorants were selected based on both chemical structure and ecological relevance. The logic was to assemble a very diverse odor panel consisting of a food odorant (hexanol), an aggregation pheromone (4-vinyl anisole), a sex pheromone (benzaldehyde), an acid (octanoic acid), a base (ammonium), and an alcohol (1-nonanol). Additionally, we selected these odors based on previous neural and behavioral data on these odorants (Chandak and Raman, 2023; Traner and Raman, 2023; Nizampatnam et al., 2022 & 2018; Saha et al., 2017 & 2013).

      Reviewer #2 (Recommendations For The Authors):

      The electrophysiology dataset combines all performed experiments across all tested different PN-odor pairs. How many odors have been tested in a single PN and how many PNs have been tested for a single odor? This information is not present in the current manuscript. Can the authors exclude that there are odor-specific modulations?

      In total, our dataset includes recordings from 19 PNs. Seven PNs were tested on a panel of seven odorants (4-vinyl anisole, 1-nonanol, octanoic acid, Hex, Bza, Lool, and Amn), and the remaining twelve were tested with the four main odorants used in the study (Hex, Bza, Lool, and Amn). This information has been added to the Methods section.

      How did the authors choose the concentrations of serotonin injections and bath applications - is this a naturalistic amount?

      The serotonin concentration for the electrophysiology experiments was chosen by trial and error: 0.01 mM was the highest concentration that did not cause cell death. For the behavioral experiments, we increased the concentration (0.1 M) because anatomical structures in the locust's head, such as air sacs and sheath, as well as hemolymph, cause some degree of dilution that we cannot control.

      Behavior experiments were performed 3 hours after injection - ephys experiments 5-10 minutes following bath application. Can the authors exclude that serotonin affects neural processing differently on these different timescales?

      We cannot exclude this possibility. We did the electrophysiology experiments 5-10 minutes after bath application because it would be extremely hard to hold cells for three hours.

      A longer delay was required for our behavioral experiments because, after injection, the locusts tended to be more agitated, with larger spontaneous palp movements, and exhibited unprompted vomiting. A 3-hour period allowed the locusts to return to baseline levels of movement after 5HT introduction. [This information has been added to the Methods section of the revised manuscript.]

      Concerning the analysis of electrophysiological data. The authors should correct for changes in the baseline before performing PCA analysis. And how much of the variance is explained by PC1 and PC2?

      We did not correct for baseline changes or subtract baseline as we wanted to show that the odor-evoked neural responses still robustly encoded information about the identity of the odorant.
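
      The reviewer's question about how much variance PC1 and PC2 explain can be answered directly from the PCA decomposition. The following is a minimal sketch, assuming the responses are arranged as an observations × neurons matrix; the array `responses`, its shape, and the function name are hypothetical placeholders and do not come from the study.

```python
import numpy as np

def explained_variance_ratio(data):
    """Fraction of total variance captured by each principal component.

    data: 2-D array, rows = observations (e.g. time bins),
          columns = features (e.g. individual PNs).
    """
    centered = data - data.mean(axis=0)            # PCA operates on mean-centred columns
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2               # proportional to the PCA eigenvalues
    return variances / variances.sum()

# Hypothetical placeholder data: 200 time bins x 19 neurons of random numbers.
rng = np.random.default_rng(0)
responses = rng.normal(size=(200, 19))
ratios = explained_variance_ratio(responses)
pc1_pc2_share = ratios[:2].sum()                   # share of variance in the first two PCs
```

      Reporting `pc1_pc2_share` alongside a two-dimensional PCA plot tells the reader how much of the response structure the plot leaves out.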

      The authors should perform dye injections after recordings to visualize the cell type they recorded from. Serotonin might affect also other cell types in the antennal lobe.

      As mentioned above, in the locust antennal lobe only PNs fire full-blown sodium spikes, and LNs only fire calcium spikelets (Author response image 4). Since these signals are small, they will be buried under the noise floor when using extracellular recording electrodes for monitoring responses in the antennal lobe.

      Hence we are pretty certain what type of cells we are recording from.

      There were several typos in the manuscript, please check again.

      We have fixed many of the grammatical errors and typos in the revised version.

    2. Reviewer #1 (Public Review):

      Summary:

      This manuscript explores the impact of serotonin on olfactory coding in the antennal lobe of locusts and odor-evoked behavior. The authors use serotonin injections paired with an odor-evoked palp-opening response assay and bath application of serotonin with intracellular recordings of odor-evoked responses from projection neurons (PNs).

      Strengths:

      The authors make several interesting observations, including that serotonin enhances behavioral responses to appetitive odors in starved and fed animals, induces spontaneous bursting in PNs, directly impacts PN excitability, and uniformly enhances PN responses to odors.

      Weakness:

      The one remaining issue to be resolved is the theoretical discrepancy between the physiology and the behavior. The authors provide a computational model that could explain this discrepancy and provide the caveat that, while the physiological data were collected from the antennal lobe, other olfactory processing stages could be involved. Indeed, other processing stages could be the sites for the computational functions proposed by the model. There is an additional caveat: the physiological data were collected 5-10 minutes after serotonin application, whereas the behavioral data were collected 3 hours after serotonin application. It is difficult to link physiological processes induced 5 minutes into serotonin application to behavioral consequences 3 hours subsequent to serotonin application. The discrepancy between physiology and behavior could easily reflect the timing of action of serotonin (i.e. differences between immediate and longer-term impact).

      Overall, the study demonstrates the impact of serotonin on odor-evoked responses of PNs and odor-guided behavior in locusts. Serotonin appears to have non-linear effects, including changing the firing patterns of PNs from monotonic to bursting and altering behavioral responses in an odor-specific manner, rather than uniformly across all stimuli presented.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors investigate the influence of serotonin on feeding behavior and electrophysiological responses in the antennal lobe of locusts. They find that serotonin injection changes behavior in an odor-specific way. In physiology experiments, they can show that projection neurons in the antennal lobe generally increase their baseline firing and odor responses upon serotonin injection. Using a modeling approach the authors propose a framework on how a general increase in antennal lobe output can lead to odor-specific changes in behavior.

      Strengths:

      This study shows that serotonin affects feeding behavior and odor processing in the antennal lobe of locusts, as serotonin injection increases activity levels of projection neurons. This study provides another piece of evidence that serotonin is a general neuromodulator within the early olfactory processing system across insects and even phyla.

      Weaknesses:

      I still have several concerns regarding the generalizability of the model and interpretation of results. The authors cannot provide evidence that serotonin modulation of projection neurons impacts behavior.

      The authors show that odor identity is maintained after 5-HT injection, however, the authors do not show if PN responses to different odors were differently affected after serotonin exposure.

      Regarding the model, the authors show that the model works for odors with non-overlapping PN activation. However, only one appetitive, one neutral, and one aversive odor has been tested and modeled here. Can the fixed-weight model also hold for other appetitive and aversive odors that might share more overlap between active PNs? How could the model generate BZA attraction in 5-HT exposed animals (as seen in behavior data in Figure 1) if the same PNs just get activated more?

      The authors should still not exclude the possibility that serotonin injections could affect behavior via modulation of other cell types than projection neurons. This should still be discussed, serotonin might rather shut down baseline activation of local inhibitory neurons - and thus lead to the interesting bursting phenotypes, which can also be seen in the baseline response, due to local PN-to-LN feedback.

      The authors did not fully tone down their claims regarding causality between serotonin and starved state behavioral responses. There is no proof that serotonin injection mimics starved behavioral responses.

    1. Author response:

      We would like to thank the reviewers for their helpful comments. We note that both reviews are strongly supportive with comments including, “a biophysical tour de force” (rev #1), “the study is exemplary” (rev #2), and “represents a roadmap for future work” (rev #2). Below we respond to each reviewer comment.

      Reviewer #1

      This study provides a detailed and quantitative description of the allosteric mechanisms resulting in the paradoxical activation of BRAF kinase dimers by certain kinase inhibitors. The findings provide a much needed quantitative basis for this phenomenon and may lay the foundation for future drug development efforts aimed at the important cancer target BRAF. The study builds on evidence obtained by multiple independent biophysical methods.

      Summary:

      The authors quantitatively describe the complex binding equilibria of BRAF and its inhibitors resulting in some cases in the paradoxical activation of BRAF dimer when bound to ATP competitive inhibitors. The authors use a biophysical tour de force involving FRET binding assays, NMR, kinase activity assays and DEER spectroscopy.

      We are gratified by the reviewer’s supportive summary.

      Strengths:

      The strengths of the study are the beautifully conducted assays that allow for a thorough characterization of the allostery in this complex system. Additionally, the use of F-NMR and DEER spectroscopy provide important insights into the details of the process. The resulting model for binding of inhibitors and dimerization (Fig.4) is very helpful.

      Weaknesses:

      This is a complex system and its communication is inherently challenging. It might be of interest to the broader readership to understand the implications of the model for drug development and therapy.

      We agree with the reviewer that this is a complicated system. With regard to inhibitor development, a key insight is that designing aC-in state inhibitors that avoid paradoxical activation may be non-trivial because these molecules not only induce dimers but also tend to bind the second dimer subunit more weakly than the first, due to allosteric asymmetry and/or inherently different affinities for each RAF isoform. We feel the full implications for future therapeutic development are an extensive topic that is beyond the scope of our work, which is focused on the properties of current inhibitors.

      Recommendations for the author:

      The experimental work, analysis and resulting model are excellent. I had some difficulty following the complex model in some instances and it may be useful to review the description of the model and see whether it can be made more palatable to the broader readership. I think it would be useful to discuss the model presented in reference 40 (Kholodenko) and to compare it to the presented model here.

      We regret any confusion with regard to the nature of the model. Our analysis was built upon the model developed by Boris Kholodenko as reported in his 2015 Cell Reports paper. This formed the theoretical framework that, combined with our experimental data, allowed us to parameterize this model to obtain experimental values for the equilibrium constants and allosteric coupling factors.

      Reviewer #2

      This manuscript combines elegant biophysical solution measurements to address paradoxical kinase activation by Type II BRAF inhibitors. The novel findings challenge prevailing models, through experiments that are rigorous and carefully controlled. The study is exemplary in the breadth of strategies it uses to address protein kinase dynamics and inhibitor allostery.

      Summary:

      This manuscript uses FRET, 19F-NMR and DEER/EPR solution measurements to examine the allosteric effects of a panel of BRAF inhibitors (BRAFi). These include first-generation aC-out BRAFi, and more recent Type I and Type II aC-in inhibitors. Intermolecular FRET measurements quantify Kd for BRAF dimerization and inhibitor binding to the first and second subunits. Distinct patterns are found between aC-in BRAFi, where Type I BRAFi bind equally well to the first and second subunits within dimeric BRAF. In contrast, Type II BRAFi show stronger affinity for the first subunit and weaker affinity for the second subunit, an effect named "allosteric asymmetry". Allosteric asymmetry has the potential for Type II inhibitors to promote dimerization while favoring occupancy of only one subunit (BBD form), leading to enrichment of an active dimer.

      Measurements of in vitro BRAF kinase activity correlate amazingly well with the calculated amounts of the half site-inhibited BBD forms with Type II inhibitors. This suggests that the allosteric asymmetry mechanism explains paradoxical activation by this class of inhibitors. DEER/EPR measurements further examine the positioning of helix aC. They show systematic outward movement of aC with Type II inhibitors, relative to the aC-in state with Type I inhibitors, and further show that helix aC adopts multiple states and is therefore dynamic in apo BRAF. This makes a strong case that negative cooperativity between sites in the BRAF dimer can account for paradoxical kinase activation by Type II inhibitors by creating a half site-occupied homodimer, BBD. In contrast, Type I inhibitors and aC-out inhibitors do not fit this model, and are therefore proposed to be explained by previously proposed models involving negative allostery between subunits in BRAF-CRAF heterodimers, RAS priming, and transactivation.
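
      The link between allosteric asymmetry and enrichment of the half-occupied BBD form can be illustrated with a textbook two-site binding model. The sketch below is a deliberate simplification and not the model used in the study: it assumes a pre-formed dimer, ignores inhibitor-induced dimerization, and uses hypothetical function names and dissociation constants chosen only for illustration.

```python
import math

def dimer_occupancy(inhibitor, kd_first, kd_second):
    """Return fractions (apo, half_occupied, fully_occupied) of a pre-formed dimer.

    inhibitor : free inhibitor concentration
    kd_first  : microscopic dissociation constant for a site on an empty dimer
    kd_second : microscopic dissociation constant for the remaining empty site
    All three values must share the same concentration units.
    """
    w_apo = 1.0
    w_half = 2.0 * inhibitor / kd_first               # two equivalent sites can bind first
    w_full = inhibitor ** 2 / (kd_first * kd_second)  # both sites occupied
    total = w_apo + w_half + w_full
    return w_apo / total, w_half / total, w_full / total

def peak_concentration(kd_first, kd_second):
    """Inhibitor concentration at which the half-occupied fraction is maximal."""
    return math.sqrt(kd_first * kd_second)

# Hypothetical constants (arbitrary units), not values from the study.
symmetric = dimer_occupancy(peak_concentration(0.1, 0.1), 0.1, 0.1)
asymmetric = dimer_occupancy(peak_concentration(0.01, 1.0), 0.01, 1.0)
half_symmetric = symmetric[1]     # peak half-occupied fraction, equal affinities
half_asymmetric = asymmetric[1]   # peak half-occupied fraction, weaker second site
```

      In this simplified picture the half-occupied fraction rises and then falls with inhibitor concentration, and weakening the second site relative to the first raises the peak, which is the qualitative behavior the review attributes to Type II inhibitors.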

      Strengths:

      This study integrates orthogonal spectroscopic and kinetic strategies to characterize BRAF dynamics and determine how it impacts inhibitor allostery. The unique combination of approaches presented in this study represents a road map for future work in the important area of protein kinase dynamics. The work represents a worthy contribution not only to the field of BRAF regulation but protein kinases in general.

      Weaknesses:

      Some questions remain regarding the proposed model for Type II inhibitors and its comparison to Type I and aC-out inhibitors that would be useful to clarify. Specifically, it would be helpful to address whether the activation of BRAF by Type II inhibitors, while strongly correlated with BBD model predictions in vitro, also depends on CRAF via BRAF-CRAF in cells and therefore overlaps with the mechanisms of paradoxical activation by Type I and aC-out inhibitors.

      We agree with the reviewer that this is a worthy question to be pursued. However, given the substantial experimental effort required for such an endeavor, and the highly supportive nature of the reviewer comments, including that “This is a strong manuscript that I feel is well above the bar for publication”, we believe this effort is more appropriate for a future study.

      This is a strong manuscript that I feel is well above the bar for publication. Nevertheless, it is recommended that the authors consider addressing the following points in order to support their major conclusions.

      (1) Fig 3D shows similar effects of Type II and Type I inhibitors in the biphasic increase of cellular pMEK/pERK. From this, the authors argue that Type II inhibitors are explained by negative allostery in the BRAF homodimer (based on Fig 2E), while Type I inhibitors are not. But it seems possible that despite the terrific correlation between BBD and BRAF kinase activities measured in vitro, CRAF is still important to explain pathway activation in cells. It also seems conceivable that the calculated %BBD between different Type II inhibitors may not correlate as well with their effects on pathway activation in cells. These possibilities should be addressed.

      We agree with the reviewer that it is likely that CRAF contributes to paradoxical activation by type II inhibitors in cells. It is also likely that other cellular factors such as RAS-priming and membrane recruitment play a role in activation. However, we note that for the type II inhibitors there is good agreement between the biophysical predictions and the concentration regimes in which activation is observed in cells, suggesting that these predictions are capturing a key part of the activation process that occurs in cells.

      (2) In Fig 2A, is it possible to report the activity of dimeric BRAF-WT in the absence of inhibitor? This would help confirm that the maximal activity measured after titrating inhibitor is indeed consistent with the predicted %BBD population, which would be expected to have half of the specific activity of BB.

      In principle, it is possible to determine the catalytic activity of apo dimers (BB) by combining our model predictions for the concentration of BB dimers and our activity measurements. However, because the activity assays are performed at nanomolar kinase concentrations, whereas the baseline dimerization affinity of BRAF is in the micromolar range, the observed activity of apo BRAF arises from a small subpopulation of dimers (on the order of 4 percent under the conditions of our experiments) and is therefore difficult to define accurately. As a result, we deemed it more suitable to compare our results to published activity measurements derived from 14-3-3-activated dimers which should represent fully dimerized BRAF. This analysis, as reported in Figure 2E, suggests that the BBD activity is approximately half of that of BB.
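
      The argument that only a small subpopulation of apo BRAF is dimeric at assay concentrations follows from a simple monomer-dimer equilibrium. The sketch below assumes the scheme 2 M ⇌ D with Kd = [M]²/[D]; the function name and the numerical values are hypothetical illustrations, not the measured parameters of the study.

```python
import math

def dimer_fraction(total_protein, kd_dimer):
    """Fraction of protomers found in dimers for 2 M <-> D, Kd = [M]^2 / [D].

    Mass balance: total_protein = [M] + 2 [D].
    Both arguments must use the same concentration units.
    """
    # Substituting [D] = [M]^2 / Kd gives 2 [M]^2 / Kd + [M] - total = 0;
    # the positive root of this quadratic is the free monomer concentration.
    monomer = (-kd_dimer + math.sqrt(kd_dimer ** 2 + 8.0 * kd_dimer * total_protein)) / 4.0
    dimer = monomer ** 2 / kd_dimer
    return 2.0 * dimer / total_protein

# Hypothetical example: 20 nM kinase with a 1 uM dimerization Kd (both in uM).
fraction_in_dimers = dimer_fraction(0.02, 1.0)
```

      When the total concentration is far below Kd, the dimer fraction is approximately 2·total/Kd, so nanomolar protein with a micromolar Kd leaves only a few percent of protomers in dimers.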

      (3) The 19F-NMR experiments make a good case for broadening of the helix aC signal in the BRAF dimer. From this, the study proposes that after inhibitor binds one subunit, the second unoccupied subunit retains dynamics. It would be useful to address this experimentally, if possible. For example, can the 19F-NMR signal be measured in the presence of inhibitor, to support the prediction that the unoccupied subunit is indeed dynamic and samples multiple conformations as in apo BRAF?

      We agree with the reviewer that it would be interesting to determine the dynamic response of BRAF to inhibitor binding. However, this is a challenging undertaking due to the biochemical heterogeneity that occurs at subsaturating inhibitor concentrations. For example, at any given inhibitor concentration, BRAF exists as a mixture of monomers, apo dimers, dimers with one inhibitor molecule, and dimers with two inhibitor molecules bound. This makes it challenging to relate the 19F NMR signal to a single biochemical state. Addressing this would require a substantial experimental effort that we feel is beyond the scope of this study.

    1. eLife assessment

      This valuable paper describes innovative force measurements of the bending modulus of gliding cyanobacteria, along with measurements of the critical buckling length of the cells, which in combination lead to insight into how these cells produce the force necessary to move. Quantitative analysis convincingly shows that the propulsive force and resistive friction coefficient are strongly coupled, which supports propulsion based on adhesion forces rather than slime extrusion.

    2. Reviewer #1 (Public Review):

      The paper combines experiments on freely gliding cyanobacteria, buckling experiments using two-dimensional V shaped corners, and micropipette force measurements with theoretical models to study gliding forces in these organisms. The aim is to quantify these forces and use the results to perhaps discriminate between competing mechanisms by which these cells move. A large data set of possible collision events is analyzed, buckling events evaluated, and critical buckling lengths estimated. A line elasticity model is used to analyze the onset of buckling and estimate the effective (viscous type) friction/drag that controls the dynamics of the rotation that ensues post-buckling. This value of the friction/drag is compared to a second estimate obtained by consideration of the active forces and speeds in freely gliding filaments. The authors find that these two independent estimates of friction/drag correlate with each other and are comparable in magnitude. The experiments are conducted carefully, the device fabrication is novel, the data set is interesting, and the analysis is solid. The authors conclude that the experiments are consistent with the propulsion being generated by adhesion forces rather than slime extrusion. While consistent with the data, this conclusion is inferred.

      Summary:

      The paper addresses important questions on the mechanisms driving the gliding motility of filamentous cyanobacteria. The authors aim to understand these by estimating the elastic properties of the filaments, and by comparing the resistance to gliding under a) freely gliding conditions, and b) in post-buckled rotational states. Experiments are used to estimate the propulsion force density on freely gliding filaments (assuming overdamped conditions). Experiments are combined with a theoretical model based on Euler beam theory to extract friction (viscous) coefficients for filaments that buckle and begin to rotate about the pinned end. The main results are estimates for the bending stiffness of the bacteria, the propulsive tangential force density, the buckling threshold in terms of the length, and estimates of the resistive friction (viscous drag) providing the dissipation in the system and balancing the active force. It is found that experiments on the two bacterial species yield nearly identical values of 𝑓 (albeit with rather large variations). The authors conclude that the experiments are consistent with the propulsion being generated by adhesion forces rather than slime extrusion.

      Strengths of the paper:

      The strengths of the paper lie in the novel experimental setup and measurements that allow for the estimation of the propulsive force density, critical buckling length, and effective viscous drag forces for movement of the filament along its contour - the axial (parallel) drag coefficient, and the normal (perpendicular) drag coefficient (I assume this is the case, since the post-buckling analysis assumes the bent filament rotates at a constant frequency). These direct measurements are important for serious analysis and discrimination between motility mechanisms.

      Weaknesses:

      There are aspects of the analysis and discussion that may be improved. I suggest that the authors take the following comments into consideration while revising their manuscript.

      The conclusion that adhesion via focal adhesions is the cause for propulsion rather than slime protrusion, is consistent with the experimental results that the frictional drag correlates with propulsion force. At the same time, it is hard to rule out other factors that may result in this (friction) viscous drag - (active) force relationship while still being consistent with slime production. More detailed analysis aiming to discriminate between adhesion vs slime protrusion may be outside the scope of the study, but the authors may still want to elaborate on their inference. It would help if there was a detailed discussion on the differences in terms of the active force term for the focal adhesion-based motility vs the slime motility.

      Can the authors comment on possible mechanisms (perhaps from the literature) that indicate how isotropic friction may be generated in settings where focal adhesions drive motility. A key aspect here would probably be estimating the extent of this adhesion patch and comparing it to a characteristic contact area. Can lubrication theory be used to estimate characteristic areas of contact (knowing the radius of the filament, and assuming a height above substrate)? If the focal adhesions typically cover areas smaller than this lubrication area, it may suggest the possibility that bacteria essentially present a flat surface insofar as adhesion is concerned, leading to transversely isotropic response in terms of the drag. Of course, we will still require the effective propulsive force to act along the tangent.

      I am not sure why the authors mention that the power of the gliding apparatus is not rate limiting. The only way to verify this would be to put these in highly viscous fluids where the drag of the external fluid comes into the picture as well (if focal adhesions are on the substrate facing side, and the upper side is subject to ambient fluid drag). Also, the friction referred to here has the form of a viscous drag (no memory effect, and thus not viscoelastic or gel-like), and it is not clear if forces generated by adhesion involve other forms of drag such as chemical friction via temporary bonds forming and breaking. In quasi-static settings and under certain conditions such as separation of chemical and elastic time scales, bond friction may yield overall force proportional to local sliding velocities.

      For readers from a non-fluids background, some additional discussion of the drag forces, and the forms of friction would help. For a freely gliding filament if 𝑓 is the force density (per unit length), then steady gliding with a viscous frictional drag would suggest (as mentioned in the paper) 𝑓 ∼ 𝑣₀ 𝐿 𝜂∥. The critical buckling length is then dependent on 𝑓 and on 𝐵 the bending modulus. Here the effective drag is defined per length. I can see from this that if the active force is fixed, and the viscous component resulting from the frictional mechanism is fixed, the critical buckling length will not depend on the velocity (unless I am missing something in their argument), since the velocity is not a primitive variable, and is itself an emergent quantity.

    3. Reviewer #2 (Public Review):

      In the presented manuscript, the authors first use structured microfluidic devices with gliding filamentous cyanobacteria inside in combination with micropipette force measurements to measure the bending rigidity of the filaments. The distribution of bending rigidities is very broad.

      Next, they use triangular structures to trap the bacteria with the front against an obstacle. Depending on the length and rigidity, the filaments buckle under the propulsive force of the cells. The authors use theoretical expressions for the buckling threshold to infer propulsive force, given the measured length and (mean-) stiffnesses. They find nearly identical values for both species, 𝑓 ∼ (1.0 ± 0.6) nN∕µm, nearly independent of the velocity. These measurements have to be taken with additional care, as the inferred forces depend strongly on the bending rigidity, which already shows a broad distribution.

      Finally, they measure the shape of the filament dynamically to infer friction coefficients via Kirchhoff theory. In this section they report a strong correlation with velocity and report propulsive forces that vary over two orders of magnitude.

      From a theoretical perspective, not many new results are presented. The authors repeat the well-known calculation for filaments buckling under propulsive load and arrive at the literature result of buckling when the dimensionless number (f L^3/B) is larger than 30.6 as previously derived by Sekimoto et al in 1995. In my humble opinion, the "buckling theory" section belongs to methods. Finally, the authors use molecular dynamics type simulations similar to other models to reproduce the buckling dynamics from the experiments.
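
      The buckling criterion quoted above, f L^3/B > 30.6, can be inverted to obtain either the critical length or the force density. The sketch below assumes only that criterion; the function names and the numerical inputs are hypothetical, and any consistent set of units may be used.

```python
BUCKLING_THRESHOLD = 30.6  # dimensionless value of f * L**3 / B quoted in the review

def critical_length(force_density, bending_modulus):
    """Length above which a filament with the given f and B is predicted to buckle."""
    return (BUCKLING_THRESHOLD * bending_modulus / force_density) ** (1.0 / 3.0)

def force_density_from_critical_length(critical_len, bending_modulus):
    """Force density f inferred from a measured critical length and bending modulus."""
    return BUCKLING_THRESHOLD * bending_modulus / critical_len ** 3

def buckles(length, force_density, bending_modulus):
    """True if the dimensionless load f * L**3 / B exceeds the threshold."""
    return force_density * length ** 3 / bending_modulus > BUCKLING_THRESHOLD

# Hypothetical inputs in arbitrary consistent units, for illustration only.
f_inferred = force_density_from_critical_length(2.0, 1.0)
f_stiffer = force_density_from_critical_length(2.0, 2.0)   # doubled bending modulus
f_longer = force_density_from_critical_length(2.2, 1.0)    # critical length 10% larger
```

      The inferred force density scales linearly with B and with the inverse cube of the critical length, which is why the broad distribution of bending rigidities noted by the reviewer carries over directly into the force estimate.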

      Data and source code are available via trusted institutional or third-party repositories that adhere to policies that make data discoverable, accessible and usable.

    4. Reviewer #3 (Public Review):

      Summary:

      This paper presents novel and innovative force measurements of the biophysics of gliding cyanobacteria filaments. These measurements allow for estimates of the resistive force between the cell and substrate and provide potential insight into the motility mechanism of these cells, which remains unknown.

      Strengths:

      The authors used well-designed microfabricated devices to measure the bending modulus of these cells and to determine the critical length at which the cells buckle. I especially appreciated the way the authors constructed an array of pillars and used it to do 3-point bending measurements and the arrangement the authors used to direct cells into a V-shaped corner in order to examine at what length the cells buckled. By examining the gliding speed of the cells before buckling events, the authors were able to determine how strongly the buckling length depends on the gliding speed, which could be an indicator of how the force exerted by the cells depends on cell length; however, the authors did not comment on this directly.

      Weaknesses:

      There are no major weaknesses in the paper.

    5. Author response:

      Reviewer 1:

      The paper “Quantifying gliding forces of filamentous cyanobacteria by self-buckling” combines experiments on freely gliding cyanobacteria, buckling experiments using two-dimensional V-shaped corners, and micropipette force measurements with theoretical models to study gliding forces in these organisms. The aim is to quantify these forces and use the results to perhaps discriminate between competing mechanisms by which these cells move. A large data set of possible collision events is analyzed, buckling events evaluated, and critical buckling lengths estimated. A line elasticity model is used to analyze the onset of buckling and estimate the effective (viscous type) friction/drag that controls the dynamics of the rotation that ensues post-buckling. This value of the friction/drag is compared to a second estimate obtained by consideration of the active forces and speeds in freely gliding filaments. The authors find that these two independent estimates of friction/drag correlate with each other and are comparable in magnitude. The experiments are conducted carefully, the device fabrication is novel, the data set is interesting, and the analysis is solid. The authors conclude that the experiments are consistent with the propulsion being generated by adhesion forces rather than slime extrusion. While consistent with the data, this conclusion is inferred.

      We thank the reviewer for the positive evaluation of our work.

      Summary:

      The paper addresses important questions on the mechanisms driving the gliding motility of filamentous cyanobacteria. The authors aim to understand these by estimating the elastic properties of the filaments, and by comparing the resistance to gliding under a) freely gliding conditions, and b) in post-buckled rotational states. Experiments are used to estimate the propulsion force density on freely gliding filaments (assuming over-damped conditions). Experiments are combined with a theoretical model based on Euler beam theory to extract friction (viscous) coefficients for filaments that buckle and begin to rotate about the pinned end. The main results are estimates for the bending stiffness of the bacteria, the propulsive tangential force density, the buckling threshold in terms of the length, and estimates of the resistive friction (viscous drag) providing the dissipation in the system and balancing the active force. It is found that experiments on the two bacterial species yield nearly identical values of f (albeit with rather large variations). The authors conclude that the experiments are consistent with the propulsion being generated by adhesion forces rather than slime extrusion.

      We appreciate this comprehensive summary of our work.

      Strengths of the paper:

      The strengths of the paper lie in the novel experimental setup and measurements that allow for the estimation of the propulsive force density, critical buckling length, and effective viscous drag forces for movement of the filament along its contour – the axial (parallel) drag coefficient, and the normal (perpendicular) drag coefficient (I assume this is the case, since the post-buckling analysis assumes the bent filament rotates at a constant frequency). These direct measurements are important for serious analysis and discrimination between motility mechanisms.

      We thank the reviewer for this positive assessment of our work.

      Weaknesses:

      There are aspects of the analysis and discussion that may be improved. I suggest that the authors take the following comments into consideration while revising their manuscript.

      The conclusion that adhesion via focal adhesions is the cause for propulsion rather than slime protrusion is consistent with the experimental results that the frictional drag correlates with propulsion force. At the same time, it is hard to rule out other factors that may result in this (friction) viscous drag - (active) force relationship while still being consistent with slime production. More detailed analysis aiming to discriminate between adhesion vs slime protrusion may be outside the scope of the study, but the authors may still want to elaborate on their inference. It would help if there was a detailed discussion on the differences in terms of the active force term for the focal adhesion-based motility vs the slime motility.

      We appreciate this critical assessment of our conclusions. Of course we are aware that many different mechanisms may lead to similar force/friction characteristics, and that a definitive conclusion on the mechanism would require the combination of various techniques, which is beyond the scope of this work. Therefore, we were very careful in formulating the discussion of our findings, refraining, in particular, from a singular conclusion on the mechanism but instead indicating “support” for one hypothesis over another, and emphasizing “that many other possibilities exist”.

      The most common competing hypotheses for bacterial gliding suggest that either slime extrusion at the junctional pore complex [A1], rhythmic contraction of fibrillar arrays at the cell wall [A2], focal adhesion sites connected to intracellular motor-microtubule complexes [A3], or modified type-IV pilus apparatuses [A4] provide the propulsion forces. For the slime extrusion hypothesis, which is still widespread today, one would rather expect an anticorrelation of force and friction: more slime extrusion would generate more force, but also enhance lubrication. The other hypotheses are more consistent with the trend we observed in our experiments, because both pili and focal adhesion require direct contact with a substrate. How contraction of fibrillar arrays would micromechanically couple to the environment is not clear to us, but direct contact might still facilitate force transduction. Please note that these hypotheses were all postulated without any mechanical measurements, solely based on ultra-structural electron microscopy and/or genetic or proteomic experiments. We see our work as complementary to that, providing a mechanical basis for evaluating these hypotheses.

      We agree with the referee that narrowing down this discussion to focal adhesion should have been avoided. We rewrote the concluding paragraph (page 8):

      “…it indicates that friction and propulsion forces, despite being quite variable, correlate strongly. Thus, generating more force comes, inevitably, at the expense of added friction. For lubricated contacts, the friction coefficient is inversely proportional to the thickness of the lubricating layer (Snoeijer et al., 2013), and we conjecture active force and drag both increase due to a more intimate contact with the substrate. This supports mechanisms like focal adhesion (Mignot et al., 2007) or a modified type-IV pilus (Khayatan et al., 2015), which generate forces through contact with extracellular surfaces, as the underlying mechanism of the gliding apparatus of filamentous cyanobacteria: more contacts generate more force, but also closer contact with the substrate, thereby increasing friction to the same extent. Force generation by slime extrusion (Hoiczyk and Baumeister, 1998), in contrast, would lead to the opposite behavior: More slime generates more propulsion, but also reduces friction. Besides fundamental fluid-mechanical considerations (Snoeijer et al., 2013), this is rationalized by two experimental observations: i. gliding velocity correlates positively with slime layer thickness (Dhahri et al., 2013) and ii. motility in slime-secretion deficient mutants is restored upon exogenous addition of polysaccharide slime. Still we emphasize that many other possibilities exist. One could, for instance, postulate a regulation of the generated forces to the experienced friction, to maintain some preferred or saturated velocity.”

      Can the authors comment on possible mechanisms (perhaps from the literature) that indicate how isotropic friction may be generated in settings where focal adhesions drive motility? A key aspect here would probably be estimating the extent of this adhesion patch and comparing it to a characteristic contact area. Can lubrication theory be used to estimate characteristic areas of contact (knowing the radius of the filament, and assuming a height above the substrate)? If the focal adhesions typically cover areas smaller than this lubrication area, it may suggest the possibility that bacteria essentially present a flat surface insofar as adhesion is concerned, leading to a transversely isotropic response in terms of the drag. Of course, we will still require the effective propulsive force to act along the tangent.

      We thank the referee for suggesting to estimate the dimensions of the contact region. Both pili and focal adhesion sites would be of sizes below one micron [A3, A4], much smaller than the typical contact region in the lubricated contact, which is on the order of the filament radius (few microns). So indeed, isotropic friction may be expected in this situation [A5] and is assumed frequently in theoretical work [A6–A8]. Anisotropy may then indeed be induced by active forces [A9], but we are not aware of measurements of the anisotropy of friction in bacterial gliding.

      For a more precise estimate using lubrication theory, rheology and extrusion rate of the secreted polysaccharides would have to be known, but we are not aware of detailed experimental characterizations.

      We extended the paragraph in the buckling theory on page 5 regarding the assumption of isotropic friction:

      “We use classical Kirchhoff theory for a uniform beam of length L and bending modulus B, subject to a force density ⃗b = −f ⃗t − η ⃗v, with an effective active force density f along the tangent ⃗t, and an effective friction proportional to the local velocity ⃗v, analogous to existing literature (Fily et al., 2020; Chelakkot et al., 2014; Sekimoto et al., 1995). Presumably, this friction is dominated by the lubrication drag from the contact with the substrate, filled by a thin layer of secreted polysaccharide slime which is much more viscous than the surrounding bulk fluid. Speculatively, the motility mechanism might also comprise adhering elements like pili (Khayatan et al., 2015) or foci (Mignot et al., 2007) that increase the overall friction (Pompe et al., 2015). Thus, the drag due to the surrounding bulk fluid can be neglected (Man and Kanso, 2019), and friction is assumed to be isotropic, a common assumption in motility models (Fei et al., 2020; Tchoufag et al., 2019; Wada et al., 2013). We assume…”

      We also extended the discussion regarding the outcome of isotropic friction (page 7):

      “…Thus we plot f/v over η in Figure 4 D, finding nearly identical values over about two decades. Since f and η are not correlated with v0, this is due to a correlation between f and η. This relation is remarkable in two aspects: On the one hand, it indicates that friction is mainly isotropic. This suggests that friction is governed by an isotropic process like bond friction or lubrication from the slime layer in the contact with the substrate, the latter being consistent with the observation that mutants deficient in slime secretion do not glide but exogenous addition of slime restores motility (Khayatan et al., 2015). In contrast, hydrodynamic drag from the surrounding bulk fluid (Man and Kanso, 2019), or the internal friction of the gliding apparatus would be expected to generate strongly anisotropic friction. If the latter were dominant, a snapping-like transition into the buckling state would be expected, rather than the continuously growing amplitude that is observed in experiments. On the other hand, it indicates that friction and propulsion forces…”

      I am not sure why the authors mention that the power of the gliding apparatus is not rate-limiting. The only way to verify this would be to put these in highly viscous fluids where the drag of the external fluid comes into the picture as well (if focal adhesions are on the substrate-facing side, and the upper side is subject to ambient fluid drag). Also, the friction referred to here has the form of a viscous drag (no memory effect, and thus not viscoelastic or gel-like), and it is not clear if forces generated by adhesion involve other forms of drag such as chemical friction via temporary bonds forming and breaking. In quasi-static settings and under certain conditions such as the separation of chemical and elastic time scales, bond friction may yield overall force proportional to local sliding velocities.

      We agree with the referee that the origin of the friction is not easily resolved. Lubrication yields an isotropic force density that is proportional to the velocity, and the same could be generated by bond friction. Importantly, both types of friction would be assumed to be predominantly isotropic. We explicitly referred to lubrication drag because it has been shown that mutants deficient in slime extrusion do not glide [A4].

      Assuming, in contrast, that in free gliding, friction with the environment is not rate limiting, but rather the internal friction of the gliding apparatus, i.e., the available power, we would expect a rather different behavior during early-buckling evolution. During early buckling, the tangential motion is stalled, and the dynamics is dominated by the growing buckling amplitude of filament regions near the front end, which move mainly transversely. For geometric reasons, in this stage the (transverse) buckling amplitude grows much faster than the rear part of the filament advances longitudinally. Thus that motion should not be impeded much by the internal friction of the gliding apparatus, but by external friction between the buckling parts of the filament and the ambient. The rate at which the buckling amplitude initially grows should be limited by the accumulated compressive stress in the filament and the transverse friction with the substrate. If the latter were much smaller than the (longitudinal) internal friction of the gliding apparatus, we would expect a snapping-like transition into the buckled state, which we did not observe.

      In our paper, we do not intend to evaluate the exact origin of the friction; quantifying the gliding force is the main objective. A linear force-velocity relation agrees with our observations. A detailed analysis of friction in cyanobacterial gliding would be an interesting direction for future work.

      To make these considerations clearer, we rephrased the corresponding paragraph on pages 7 & 8:

      “…Thus we plot f/v over η in Figure 4 D, finding nearly identical values over about two decades. Since f and η are not correlated with v0, this is due to a correlation between f and η. This relation is remarkable in two aspects: On the one hand, it indicates that friction is mainly isotropic. This suggests that friction is governed by an isotropic process like bond friction or lubrication from the slime layer in the contact with the substrate, the latter being consistent with the observation that mutants deficient in slime secretion do not glide but exogenous addition of slime restores motility (Khayatan et al., 2015). In contrast, hydrodynamic drag from the surrounding bulk fluid (Man and Kanso, 2019), or the internal friction of the gliding apparatus would be expected to generate strongly anisotropic friction. If the latter were dominant, a snapping-like transition into the buckling state would be expected, rather than the continuously growing amplitude that is observed in experiments. On the other hand, it indicates that friction and propulsion forces…”

      For readers from a non-fluids background, some additional discussion of the drag forces, and the forms of friction would help. For a freely gliding filament if f is the force density (per unit length), then steady gliding with a viscous frictional drag would suggest (as mentioned in the paper) f ∼ v! L η||. The critical buckling length is then dependent on f and on B the bending modulus. Here the effective drag is defined per length. I can see from this that if the active force is fixed, and the viscous component resulting from the frictional mechanism is fixed, the critical buckling length will not depend on the velocity (unless I am missing something in their argument), since the velocity is not a primitive variable, and is itself an emergent quantity.

      We are not sure what “f ∼ v! L η||” means, possibly the spelling was corrupted in the forwarding of the comments.

      We assumed an overdamped motion in which the friction force density ff (per unit length of the filament) is proportional to the velocity v0, i.e. ff ∼ η v0, with a friction coefficient η. Overdamped means that the friction force density is equal and opposite to the propulsion force density, so the propulsion force density is f ∼ ff ∼ η v0. The total friction and propulsion forces can be obtained by multiplication with the filament length L, which is not required here. In this picture, v0 is an emergent quantity and f and η are assumed as given and constant. Thus, by observing v0, f can be inferred up to the friction coefficient η. Therefore, by using two descriptive variables, L and v0, with known B, the primitive variable η can be inferred by logistic regression, and f then follows from the overdamped equation of motion.
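
      The chain of relations in this reply can be written compactly. The following is only a sketch that restates formulas quoted elsewhere in this response (the overdamped force balance and the buckling threshold with the prefactor 30.5722); the equalities are meant in magnitude, and nothing beyond those quoted relations is assumed.

```latex
\begin{align}
  f &= f_f = \eta\, v_0 \\
  L_c &= \left( \frac{30.5722\, B}{f} \right)^{1/3} \\
  \Rightarrow\quad L_c(v_0) &= \left( \frac{\eta\, v_0}{30.5722\, B} \right)^{-1/3}
\end{align}
```

      With B known and L and v0 observed, η is the only remaining unknown in the last line, which is why it can be determined by the regression.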

      To clarify this, we revised the corresponding section on page 5 of the paper:

      “The substrate contact requires lubrication from polysaccharide slime to enable bacteria to glide (Khayatan et al., 2015). Thus we assume an over-damped motion with co-linear friction, for which the propulsion force f and the free gliding velocity v0 of a filament are related by f = η v0, with a friction coefficient η. In this scenario, f can be inferred both from the observed Lc ∼ (f/B)^(−1/3) and, up to the proportionality coefficient η, from the observed free gliding velocity. Thus, by combining the two relations, one may expect also a strong correlation between Lc and v0. In order to test this relation for consistency with our data, we include v0 as a second regressor, by setting x = (L−Lc(v0))/∆Lc in Equation 1, with Lc(v0) = (η v0/(30.5722 B))^(−1/3), to reflect our expectation from theory (see below). Now, η rather than f is the only unknown, and its ensemble distribution will be determined in the regression. Figure 3 E,F show the buckling behavior…”
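
      The two-regressor model described in this paragraph can be sketched numerically. This is a minimal illustration, not the authors' analysis code: Equation 1 is not reproduced in this response, so the standard logistic function p = 1/(1 + exp(−x)) is assumed here, and the function names and all numerical values are hypothetical. Units are assumed consistent (for example B in nN µm², η v0 in nN/µm, lengths in µm).

```python
import math

# Dimensionless buckling threshold f * Lc**3 / B quoted in the text.
BUCKLING_CONSTANT = 30.5722


def critical_length_from_velocity(eta, v0, B):
    """Lc(v0) = (eta * v0 / (30.5722 * B)) ** (-1/3), using f = eta * v0."""
    return (eta * v0 / (BUCKLING_CONSTANT * B)) ** (-1.0 / 3.0)


def buckling_probability(L, v0, eta, B, delta_Lc):
    """Assumed logistic form with x = (L - Lc(v0)) / delta_Lc.

    eta is the only unknown parameter once B is measured; delta_Lc sets
    the width of the transition from 'never buckles' to 'always buckles'.
    """
    x = (L - critical_length_from_velocity(eta, v0, B)) / delta_Lc
    return 1.0 / (1.0 + math.exp(-x))


# Illustrative (made-up) parameter values, not measurements from the paper.
eta, B, delta_Lc = 0.5, 5.0e4, 40.0
for L in (100.0, 150.0, 200.0):
    p = buckling_probability(L, v0=2.0, eta=eta, B=B, delta_Lc=delta_Lc)
    print(f"L = {L:5.1f} -> p(buckle) = {p:.2f}")
```

      In an actual fit, η and ∆Lc would be adjusted to maximize the likelihood of the observed buckle/no-buckle outcomes; that step is omitted here.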

      Reviewer 2:

      In the presented manuscript, the authors first use structured microfluidic devices with gliding filamentous cyanobacteria inside in combination with micropipette force measurements to measure the bending rigidity of the filaments.

      Next, they use triangular structures to trap the bacteria with the front against an obstacle. Depending on the length and rigidity, the filaments buckle under the propulsive force of the cells. The authors use theoretical expressions for the buckling threshold to infer propulsive force, given the measured length and stiffnesses. They find nearly identical values for both species, f ∼ (1.0 ± 0.6) nN/µm, nearly independent of the velocity.

      Finally, they measure the shape of the filament dynamically to infer friction coefficients via Kirchhoff theory. This last part seems a bit inconsistent with the previous inference of propulsive force. Before, they assumed the same propulsive force for all bacteria and showed only a very weak correlation between buckling and propulsive velocity. In this section, they report a strong correlation with velocity, and report propulsive forces that vary over two orders of magnitude. I might be misunderstanding something, but I think this discrepancy should have been discussed or explained.

      We regret the misunderstanding of the reviewer regarding the velocity dependence, which indicates that the manuscript should be improved to convey these relations correctly.

      First, in the Buckling Measurements section, we did not assume the same propulsion force for all bacteria. The logistic regression yields an ensemble median for Lc (and thus an ensemble median for f), along with the width ∆Lc of the distribution (and thus also the width of the distribution of f). Our result f ∼ (1.0 ± 0.6) nN/µm indicates the median and the width of the distribution of the propulsion force densities across the ensemble of several hundred filaments used in the buckling measurements. The large variability of the forces found in the second part is consistently reflected by this very wide distribution of active forces detected in the logistic regression in the first part.

      We made small modifications to the buckling theory paragraph to clarify that in the first part, a distribution of forces rather than a constant value is inferred (page 6):

      “Inserting the population median and quartiles of the distributions of bending modulus and critical length, we can now quantify the distribution of the active force density for the filaments in the ensemble from the buckling measurements. We obtain nearly identical values for both species, f ∼ (1.0±0.6) nN/µm, where the uncertainty represents a wide distribution of f across the ensemble rather than a measurement error.”

      The same holds, of course, when inferring the distribution of the friction coefficients (page 5):

      “The substrate contact requires lubrication from polysaccharide slime to enable bacteria to glide (Khayatan et al., 2015). Thus we assume an over-damped motion with co-linear friction, for which the propulsion force f and the free gliding velocity v0 of a filament are related by f = η v0, with a friction coefficient η. In this scenario, f can be inferred both from the observed Lc ∼ (f/B)^(−1/3) and, up to the proportionality coefficient η, from the observed free gliding velocity. Thus, by combining the two relations, one may expect also a strong correlation between Lc and v0. In order to test this relation for consistency with our data, we include v0 as a second regressor, by setting x = (L−Lc(v0))/∆Lc in Equation 1, with Lc(v0) = (η v0/(30.5722 B))^(−1/3), to reflect our expectation from theory (see below). Now, η rather than f is the only unknown, and its ensemble distribution will be determined in the regression. Figure 3 E,F show the buckling behavior…”

      The (naturally) wide distribution of force (and friction) leads to a distribution of Lc as well. However, due to the small exponent in the buckling threshold Lc ∼ f^(−1/3), the distribution of Lc is not as wide as the distributions of the individually inferred f or η. This is visualized in panel G of Figure 3, plotting Lc as a function of v0 (v0 is equivalent to f, up to a proportionality coefficient η). The natural length distribution, in contrast, is very wide. Therefore, the buckling propensity of a filament is most strongly characterized by its length, while force variability, which alters Lc of the individual, plays a secondary role.
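
      The narrowing effect of the cube root can be made concrete with a short calculation. This is a sketch that assumes only the scaling Lc ∼ f^(−1/3) at fixed B, as given by the buckling threshold; the function name is hypothetical.

```python
def critical_length_ratio(force_ratio):
    """Factor by which Lc changes when f is multiplied by force_ratio.

    Follows from Lc ~ f ** (-1/3) at fixed bending modulus B.
    """
    return force_ratio ** (-1.0 / 3.0)


# Spread in force density versus the resulting spread in critical length.
for ratio in (2.0, 10.0, 100.0):
    print(f"f x {ratio:g} -> Lc x {critical_length_ratio(ratio):.3f}")
```

      A hundredfold variation in f, comparable to the two decades mentioned for the individually inferred values, changes Lc by only a factor of about 4.6.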

      In order to clarify this, we edited the last paragraph of the Buckling Measurements section on page 5 of the manuscript:

      “…Within the characteristic range of observed velocities (1 − 3 µm/s), the median Lc depends only mildly on v0, as compared to its rather broad distribution, indicated by the bands in Figure 3 G. Thus a possible correlation between f and v0 would only mildly alter Lc. The natural length distribution (cf. Appendix 1—figure 1), however, is very broad, and we conclude that growth rather than velocity or force distributions most strongly impacts the buckling propensity of cyanobacterial colonies. Also, we hardly observed short and fast filaments of K. animale, which might be caused by physiological limitations (Burkholder, 1934).”

      Second, in the Profile analysis section, we did not report a correlation between force and velocity. As can be seen in Figure 4—figure Supplement 1, neither the active force nor the friction coefficient, as determined from the analysis of individual filaments, show any significant correlation with the velocity. This is also written in the discussion (page 7):

      “We see no significant correlation between L or v0 and f or η, but the observed values of f and η cover a wide range (Figure 4 B, C and Figure 4—figure Supplement 1).”

      Note that this is indeed consistent with the logistic regression: Using v0 as a second regressor did not significantly reduce the width of the distribution of Lc as compared to the simple logistic regression, indicating that force and velocity are not strongly correlated.

      In order to clarify this in the manuscript, we modified that part (page 7):

      “…We see no significant correlation between L or v0 and f or η, but the observed values of f and η cover a wide range (Figure 4 B,C and Figure 4—figure Supplement 1). This is consistent with the logistic regression, where using v0 as a second regressor did not significantly reduce the width of the distribution of critical lengths or active forces. The two estimates of the friction coefficient, from logistic regression and individual profile fits, are measured in (predominantly) orthogonal directions: tangentially for the logistic regression where the free gliding velocity was used, and transversely for the evolution of the buckling profiles. Thus we plot f/v over η in Figure 4 D, finding nearly identical values over about two decades. Since f and η are not correlated with v0, this is due to a correlation between f and η. This relation is remarkable in two aspects: On the one hand, it indicates that friction is mainly isotropic…”

      From a theoretical perspective, not many new results are presented. The authors repeat the well-known calculation for filaments buckling under propulsive load and arrive at the literature result of buckling when the dimensionless number (f L^3/B) is larger than 30.6, as previously derived by Sekimoto et al. in 1995 [1] (see [2] for a clamped boundary condition and simulations). Other theoretical predictions for pushed semi-flexible filaments [1–4] are not discussed or compared with the experiments. Finally, the authors use molecular dynamics type simulations similar to [2–4] to reproduce the buckling dynamics from the experiments. Unfortunately, no systematic comparison is performed.

      [1]        Ken Sekimoto, Naoki Mori, Katsuhisa Tawada, and Yoko Y Toyoshima. Symmetry breaking instabilities of an in vitro biological system. Physical review letters, 75(1):172, 1995.

      [2]       Raghunath Chelakkot, Arvind Gopinath, Lakshminarayanan Mahadevan, and Michael F Hagan. Flagellar dynamics of a connected chain of active, polar, brownian particles. Journal of The Royal Society Interface, 11(92):20130884, 2014.

      [3]       Rolf E Isele-Holder, Jens Elgeti, and Gerhard Gompper. Self-propelled worm-like filaments: spontaneous spiral formation, structure, and dynamics. Soft matter, 11(36):7181–7190, 2015.

      [4]       Rolf E Isele-Holder, Julia Jäger, Guglielmo Saggiorato, Jens Elgeti, and Gerhard Gompper. Dynamics of self-propelled filaments pushing a load. Soft Matter, 12(41):8495–8505, 2016.

      We thank the reviewer for pointing us to these publications, in particular the work by Sekimoto, which we were not aware of. We agree with the referee that the calculation is straightforward (basically known since Euler, up to modified boundary conditions). Our paper focuses on experimental work; the molecular dynamics simulations were included mainly as a consistency check and not intended to generate the beautiful post-buckling patterns observed in references [2-4]. However, such shapes do emerge in filamentous cyanobacteria, and with the data provided in our manuscript, simulations can be quantitatively matched to our experiments, which will be covered by future work.

      We included the references in the revision of our manuscript, and a statement that we do not claim priority on these classical theoretical results.

      Introduction, page 2:

      “…Self-buckling is an important instability for self-propelling rod-like micro-organisms to change the orientation of their motion, enabling aggregation or the escape from traps (Fily et al., 2020; Man and Kanso, 2019; Isele-Holder et al., 2015; Isele-Holder et al., 2016). The notion of self-buckling goes back to work of Leonhard Euler in 1780, who described elastic columns subject to gravity (Elishakoff, 2000). Here, the principle is adapted to the self-propelling, flexible filaments (Fily et al., 2020; Man and Kanso, 2019; Sekimoto et al., 1995) that glide onto an obstacle. Filaments buckle if they exceed a certain critical length Lc ∼ (B/f)^(1/3), where B is the bending modulus and f the propulsion force density…”

      Buckling theory, page 5:

      “…The buckling of gliding filaments differs in two aspects: the propulsion forces are oriented tangentially instead of vertically, and the front end is supported instead of clamped. Therefore, with L < Lc all initial orientations are indifferently stable, while for L > Lc, buckling induces curvature and a resultant torque on the head, leading to rotation (Fily et al., 2020; Chelakkot et al., 2014; Sekimoto et al., 1995). Buckling under concentrated tangential end-loads has also been investigated in literature (de Canio et al., 2017; Wolgemuth et al., 2005), but leads to substantially different shapes of buckled filaments. We use classical Kirchhoff theory for a uniform beam of length L and bending modulus B, subject to a force density ⃗b = −f ⃗t − η ⃗v, with an effective active force density f along the tangent ⃗t, and an effective friction proportional to the local velocity ⃗v, analogous to existing literature (Fily et al., 2020; Chelakkot et al., 2014; Sekimoto et al., 1995)…”

      Further on page 6:

      “To derive the critical self-buckling length, Equation 5 can be linearized for two scenarios that lead to the same Lc: early-time small amplitude buckling and late-time stationary rotation at small and constant curvature (Fily et al., 2020; Chelakkot et al., 2014; Sekimoto et al., 1995). […] Thus, in physical units, the critical length is given by Lc = (30.5722 B/f)^(1/3), which is reproduced in particle based simulations (Appendix Figure 2) analogous to those in Isele-Holder et al. (2015, 2016).”
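
      The threshold relation quoted above can be inverted to estimate the force density from a measured bending modulus and critical length. The sketch below assumes only Lc = (30.5722 B/f)^(1/3) and consistent units (for example B in nN µm², Lc in µm, giving f in nN/µm); the function names and the input values are hypothetical and are not measurements from the manuscript.

```python
# Dimensionless buckling threshold f * Lc**3 / B quoted in the text.
BUCKLING_CONSTANT = 30.5722


def critical_length_from_force(B, f):
    """Critical self-buckling length Lc = (30.5722 * B / f) ** (1/3)."""
    return (BUCKLING_CONSTANT * B / f) ** (1.0 / 3.0)


def force_density_from_critical_length(B, Lc):
    """Inverse relation: f = 30.5722 * B / Lc**3."""
    return BUCKLING_CONSTANT * B / Lc ** 3


# Illustrative (made-up) inputs to show the round trip.
B_example, Lc_example = 1.0e5, 150.0
f_example = force_density_from_critical_length(B_example, Lc_example)
print(f"f  = {f_example:.3f} (force per unit length)")
print(f"Lc = {critical_length_from_force(B_example, f_example):.1f}")
```

      Because f scales with the inverse cube of Lc, a modest uncertainty in the critical length translates into a roughly threefold larger relative uncertainty in the inferred force density.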

      Discussion, page 7 & 8:

      “…This, in turn, has dramatic consequences on the exploration behavior and the emerging patterns (Isele-Holder et al., 2015, 2016; Abbaspour et al., 2021; Duman et al., 2018; Prathyusha et al., 2018; Jung et al., 2020): (L/Lc)^3 is, up to a numerical prefactor, identical to the flexure number (Isele-Holder et al., 2015, 2016; Duman et al., 2018; Winkler et al., 2017), the ratio of the Peclet number and the persistence length of active polymer melts. Thus, the ample variety of non-equilibrium phases in such materials (Isele-Holder et al., 2015, 2016; Prathyusha et al., 2018; Abbaspour et al., 2021) may well have contributed to the evolutionary success of filamentous cyanobacteria.”

      Reviewer 3:

      Summary:

      This paper presents novel and innovative force measurements of the biophysics of gliding cyanobacteria filaments. These measurements allow for estimates of the resistive force between the cell and substrate and provide potential insight into the motility mechanism of these cells, which remains unknown.

      We thank the reviewer for the positive evaluation of our work. We have revised the manuscript according to their comments and detail our replies and modifications next to the individual points below.

      Strengths:

      The authors used well-designed microfabricated devices to measure the bending modulus of these cells and to determine the critical length at which the cells buckle. I especially appreciated the way the authors constructed an array of pillars and used it to do 3-point bending measurements and the arrangement the authors used to direct cells into a V-shaped corner in order to examine the length at which the cells buckled. By examining the gliding speed of the cells before buckling events, the authors were able to determine how strongly the buckling length depends on the gliding speed, which could be an indicator of how the force exerted by the cells depends on cell length; however, the authors did not comment on this directly.

      We thank the referee for the positive assessment of our work. Importantly, we do not see a significant correlation between buckling length and gliding speeds, and we also do not see a correlation with filament length, consistent with the assumption of a propulsion force density that is more or less homogeneously distributed along the filament. Note that each filament consists of many metabolically independent cells, which renders cyanobacterial gliding a collective effort of many cells, in contrast to gliding of, e.g., myxobacteria.

      In response also to the other referees’ comments, we modified the manuscript to reflect more on the absence of a strong correlation between velocity and force/critical length. We modified the Buckling measurements section on page 5 of the paper:

      “The substrate contact requires lubrication from polysaccharide slime to enable bacteria to glide (Khayatan et al., 2015). Thus we assume an over-damped motion with co-linear friction, for which the propulsion force f and the free gliding velocity v0 of a filament are related by f = η v0, with a friction coefficient η. In this scenario, f can be inferred both from the observed Lc ∼ (f/B)^(−1/3) and, up to the proportionality coefficient η, from the observed free gliding velocity. Thus, by combining the two relations, one may expect also a strong correlation between Lc and v0. In order to test this relation for consistency with our data, we include v0 as a second regressor, by setting x = (L−Lc(v0))/∆Lc in Equation 1, with Lc(v0) = (η v0/(30.5722 B))^(−1/3), to reflect our expectation from theory (see below). Now, η rather than f is the only unknown, and its ensemble distribution will be determined in the regression. Figure 3 E, F show the buckling behavior…”

      Further, we edited the last paragraph of the Buckling measurements section on page 5 of the manuscript:

      “Within the characteristic range of observed velocities (1 − 3 µm/s), the median Lc depends only mildly on v0, as compared to its rather broad distribution, indicated by the bands in Figure 3 G. Thus a possible correlation between f and v0 would only mildly alter Lc. The natural length distribution (cf. Appendix 1—figure 1), however, is very broad, and we conclude that growth rather than velocity or force distributions most strongly impacts the buckling propensity of cyanobacterial colonies. Also, we hardly observed short and fast filaments of K. animale, which might be caused by physiological limitations (Burkholder, 1934).”

      We also rephrased the corresponding discussion paragraph on page 7:

      “…Thus we plot f/v over η in Figure 4 D, finding nearly identical values over about two decades. Since f and η are not correlated with v0, this is due to a correlation between f and η. This relation is remarkable in two aspects: On the one hand, it indicates that friction is mainly isotropic. This suggests that friction is governed by an isotropic process like bond friction or lubrication from the slime layer in the contact with the substrate, the latter being consistent with the observation that mutants deficient in slime secretion do not glide but exogenous addition of slime restores motility (Khayatan et al., 2015). In contrast, hydrodynamic drag from the surrounding bulk fluid (Man and Kanso, 2019), or the internal friction of the gliding apparatus would be expected to generate strongly anisotropic friction. If the latter were dominant, a snapping-like transition into the buckling state would be expected, rather than the continuously growing amplitude that is observed in experiments. On the other hand, it indicates that friction and propulsion forces…”

      Weaknesses:

      There were two minor weaknesses in the paper.

      First, the authors investigate the buckling of these gliding cells using an Euler beam model. A similar mathematical analysis was used to estimate the bending modulus and gliding force for Myxobacteria (C.W. Wolgemuth, Biophys. J. 89: 945-950 (2005)). A similar mathematical model was also examined in G. De Canio, E. Lauga, and R.E Goldstein, J. Roy. Soc. Interface, 14: 20170491 (2017). The authors should have cited these previous works and pointed out any differences between what they did and what was done before.

      We thank the reviewer for pointing us to these references. The paper by Wolgemuth is a theoretical work, describing A-motility in myxobacteria by a concentrated propulsion force at the rear end of the bacterium, possibly stemming from slime extrusion. This model was a little later refuted by [A3], who demonstrated that focal adhesion along the bacterial body and thus a distributed force powers A-motility, a mechanism that has by now been investigated in great detail (see [A10]). The paper by De Canio et al. contains a thorough theoretical analysis of a filament that is clamped at one end and subject to a concentrated tangential load on the other. Since both models comprise a concentrated end-load rather than a distributed propulsion force density, they describe a substantially different motility mechanism, leading also to substantially different buckling profiles. Consequently, these models cannot be applied to cyanobacterial gliding.

      We included both citations in the revision and pointed out the differences to our work in the introduction (page 2):

      “…A few species appear to employ a type-IV-pilus related mechanism (Khayatan et al., 2015; Wilde and Mullineaux, 2015 ), similar to the better-studied myxobacteria (Godwin et al., 1989; Mignot et al., 2007; Nan et al., 2014; Copenhagen et al., 2021 ), which are short, rod-shaped single cells that exhibit two types of motility: S (social) motility based on pilus extension and retraction, and A (adventurous) motility based on focal adhesion (Chen and Nan, 2022 ) for which also slime extrusion at the trailing cell pole was earlier postulated as mechanism (Wolgemuth et al., 2005 ). Yet, most gliding filamentous cyanobacteria do not exhibit pili and their gliding mechanism appears to be distinct from myxobacteria (Khayatan et al., 2015 ).”

      And in Buckling theory, page 5:

      “….The buckling of gliding filaments differs in two aspects: the propulsion forces are oriented tangentially instead of vertically, and the front end is supported instead of clamped. Therefore, with L < Lc all initial orientations are indifferently stable, while for L > Lc, buckling induces curvature and a resultant torque on the head, leading to rotation (Fily et al., 2020; Chelakkot et al., 2014; Sekimoto et al., 1995 ). Buckling under concentrated tangential end-loads has also been investigated in literature (de Canio et al., 2017; Wolgemuth et al., 2005 ), but leads to substantially different shapes of buckled filaments.”

      The second weakness is that the authors claim that their results favor a focal adhesion-based mechanism for cyanobacterial gliding motility. This is based on their result that friction and adhesion forces correlate strongly. They then conjecture that this is due to more intimate contact with the surface, with more contacts producing more force and pulling the filaments closer to the substrate, which produces more friction. They then claim that a slime-extrusion mechanism would necessarily involve more force and lower friction. Is it necessarily true that this latter statement is correct? (I admit that it could be, but is it a requirement?)

      We thank the referee for raising this interesting question. Our claim regarding slime extrusion is based on three facts: i. Mutants deficient in slime extrusion do not glide, but start gliding as soon as slime is provided externally [A4]. ii. A positive correlation between speed and slime layer thickness was observed in Nostoc [A11]. iii. The fluid mechanics of lubricated sliding contacts is very well understood and predicts a decreasing resistance with increasing layer thickness.

      We included these considerations in the revision of our manuscript (page 8):

      “…it indicates that friction and propulsion forces, despite being quite variable, correlate strongly. Thus, generating more force comes, inevitably, at the expense of added friction. For lubricated contacts, the friction coefficient is proportional to the thickness of the lubricating layer (Snoeijer et al., 2013 ), and we conjecture active force and drag both increase due to a more intimate contact with the substrate. This supports mechanisms like focal adhesion (Mignot et al., 2007 ) or a modified type-IV pilus (Khayatan et al., 2015 ), which generate forces through contact with extracellular surfaces, as the underlying mechanism of the gliding apparatus of filamentous cyanobacteria: more contacts generate more force, but also closer contact with the substrate, thereby increasing friction to the same extent. Force generation by slime extrusion (Hoiczyk and Baumeister, 1998 ), in contrast, would lead to the opposite behavior: More slime generates more propulsion, but also reduces friction. Besides fundamental fluid-mechanical considerations (Snoeijer et al., 2013 ), this is rationalized by two experimental observations: i. gliding velocity correlates positively with slime layer thickness (Dhahri et al., 2013 ) and ii. motility in slime-secretion deficient mutants is restored upon exogenous addition of polysaccharide slime. Still we emphasize that many other possibilities exist. One could, for instance, postulate a regulation of the generated forces to the experienced friction, to maintain some preferred or saturated velocity.”

      Related to this, the authors use a model with isotropic friction. They claim that this is justified because they are able to fit the cell shapes well with this assumption. How would assuming a non-isotropic drag coefficient affect the shapes? It may be that it does equally well, in which case, the quality of the fits would not be informative about whether or not the drag was isotropic or not.

      The referee raises another very interesting point. Given the typical variability and uncertainty in experimental measurements (cf. error bars in Figure 4 A), a model with slightly anisotropic friction could be fitted to the observed buckling profiles as well, without a significant increase in the mismatch. Yet, strongly anisotropic friction would not be consistent with our observations.

      Importantly, however, we did not infer isotropic friction from the fit quality, but from a comparison between free gliding and early buckling (Figure 4 D). In early buckling, the dominant motion is in the transverse direction, while longitudinal motion is insignificant for geometric reasons. Thus, independent of the underlying model, mostly the transverse friction coefficient is inferred. In contrast, free gliding is a purely longitudinal motion, and thus only the friction coefficient for longitudinal motion can be inferred. These two friction coefficients are compared in Figure 4 D. Still, the scatter of those data would allow fitting a certain anisotropy within the error margins. What we can exclude based on our observations is the case of strongly anisotropic friction. If there is no ab-initio reason for anisotropy, nor a measurement that indicates it, we prefer to stick with the simplest assumption. We carefully chose our wording in the Discussion as “mainly isotropic” rather than “isotropic” or “fully isotropic”.

      We added a small statement to the Discussion on page 7 & 8:

      “... Thus we plot f/v over η in Figure 4 D, finding nearly identical values over about two decades. Since f and η are not correlated with v0, this is due to a correlation between f and η. This relation is remarkable in two aspects: On the one hand, it indicates that friction is mainly isotropic. This suggests that friction is governed by an isotropic process like bond friction or lubrication from the slime layer in the contact with the substrate, the latter being consistent with the observation that mutations deficient of slime secretion do not glide but exogenous addition of slime restores motility (Khayatan et al., 2015 ). In contrast, hydrodynamic drag from the surrounding bulk fluid (Man and Kanso, 2019 ), or the internal friction of the gliding apparatus would be expected to generate strongly anisotropic friction. If the latter was dominant, a snapping-like transition into the buckling state would be expected, rather than the continuously growing amplitude that is observed in experiments. On the other hand, it indicates that friction and propulsion forces ...”

      Recommendations for the authors

      The discussion regarding how the findings of this paper imply that cyanobacteria filaments are propelled by adhesion forces rather than slime extrusion should be improved, as this conclusion seems questionable. There appears to be an inconsistency with a buckling force said to be only weakly dependent on the gliding velocity, while its ratio with the velocity correlates with a friction coefficient. Finally, data and source code should be made publicly available.

      In the revised version, we have modified the discussion of the force generating mechanism according to the reviewer suggestions. The perception of inconsistency in the velocity dependence of the buckling force was based on a misunderstanding, as we detailed in our reply to the referee. We revised the corresponding section to make it more clear. Data and source code have been uploaded to a public data repository.

      Reviewer #2 (recommendations for the authors)

      Despite eLife policy, the authors do not provide a Data Availability Statement. For the presented manuscript, data and source code should be provided “via trusted institutional or third-party repositories that adhere to policies that make data discoverable, accessible and usable.” https://elifesciences.org/inside-elife/51839f0a/for-authors-updates-to-elife-s-data-sharing-policies

      Most of the issues in this reviewer’s public review should be easy to correct, so I would strongly support the authors to provide an amended manuscript.

      We added the Data Availability Statement in the amended manuscript.

      References

      [A1] E. Hoiczyk and W. Baumeister. “The junctional pore complex, a prokaryotic secretion organelle, is the molecular motor underlying gliding motility in cyanobacteria”. In: Curr. Biol. 8.21 (1998), pp. 1161–1168. doi: 10.1016/s0960-9822(07)00487-3.

      [A2] N. Read, S. Connell, and D. G. Adams. “Nanoscale Visualization of a Fibrillar Array in the Cell Wall of Filamentous Cyanobacteria and Its Implications for Gliding Motility”. In: J. Bacteriol. 189.20 (2007), pp. 7361–7366. doi: 10.1128/jb.00706-07.

      [A3] T. Mignot, J. W. Shaevitz, P. L. Hartzell, and D. R. Zusman. “Evidence That Focal Adhesion Complexes Power Bacterial Gliding Motility”. In: Science 315.5813 (2007), pp. 853–856. doi: 10.1126/science.1137223.

      [A4] Behzad Khayatan, John C. Meeks, and Douglas D. Risser. “Evidence that a modified type IV pilus-like system powers gliding motility and polysaccharide secretion in filamentous cyanobacteria”. In: Mol. Microbiol. 98.6 (2015), pp. 1021–1036. doi: 10.1111/mmi.13205.

      [A5] Tilo Pompe, Martin Kaufmann, Maria Kasimir, Stephanie Johne, Stefan Glorius, Lars Renner, Manfred Bobeth, Wolfgang Pompe, and Carsten Werner. “Friction-controlled traction force in cell adhesion”. In: Biophysical journal 101.8 (2011), pp. 1863–1870.

      [A6] Hirofumi Wada, Daisuke Nakane, and Hsuan-Yi Chen. “Bidirectional bacterial gliding motility powered by the collective transport of cell surface proteins”. In: Physical Review Letters 111.24 (2013), p. 248102.

      [A7] Joël Tchoufag, Pushpita Ghosh, Connor B Pogue, Beiyan Nan, and Kranthi K Mandadapu. “Mechanisms for bacterial gliding motility on soft substrates”. In: Proceedings of the National Academy of Sciences 116.50 (2019), pp. 25087–25096.

      [A8] Chenyi Fei, Sheng Mao, Jing Yan, Ricard Alert, Howard A Stone, Bonnie L Bassler, Ned S Wingreen, and Andrej Kosmrlj. “Nonuniform growth and surface friction determine bacterial biofilm morphology on soft substrates”. In: Proceedings of the National Academy of Sciences 117.14 (2020), pp. 7622–7632.

      [A9] Arja Ray, Oscar Lee, Zaw Win, Rachel M Edwards, Patrick W Alford, Deok-Ho Kim, and Paolo P Provenzano. “Anisotropic forces from spatially constrained focal adhesions mediate contact guidance directed cell migration”. In: Nature communications 8.1 (2017), p. 14923.

      [A10] Jing Chen and Beiyan Nan. “Flagellar motor transformed: biophysical perspectives of the Myxococcus xanthus gliding mechanism”. In: Frontiers in Microbiology 13 (2022), p. 891694.

      [A11] Samia Dhahri, Michel Ramonda, and Christian Marliere. “In-situ determination of the mechanical properties of gliding or non-motile bacteria by atomic force microscopy under physiological conditions without immobilization”. In: PLoS One 8.4 (2013), e61663.

    1. Author response:

      We extend our gratitude to the two reviewers and the editors at eLife for their meticulous examination of our manuscript, as well as for their valuable feedback and positive assessment. We are particularly pleased to observe in both the reviews and the editorial evaluation the recognition of the importance of our findings. Through this provisional response, we wish to convey to the editors, reviewers, and the readership of eLife our intention to enhance the paper by incorporating a detailed description of the sections pertaining to MAD analysis, data interpretation with combined HS-AFM and PCA methods, and specific portions of the discussions. This will involve editing the manuscript accordingly and providing separate explanations in the “author response”. We acknowledge that such additions will strengthen the comprehensiveness of our work and render it more self-contained.

      Moreover, in alignment with the recommendations from the review team, we will provide a thorough discussion of published data and offer a clearer explanation of our utilized methods, thereby providing a more robust foundation for our conclusions.

    2. eLife assessment

      In this manuscript the authors present high-speed atomic force microscopy (HSAFM) to analyze real-time structural changes in actin filaments induced by cofilin binding. This important study enhances our understanding of actin dynamics which plays a crucial role in a broad spectrum of cellular activities. However, some technical questions remain, making the data interpretation incomplete.

    3. Reviewer #1 (Public Review):

      The authors provided a detailed analysis of the real-time structural changes in actin filaments resulting from cofilin binding, using High-Speed Atomic Force Microscopy (HSAFM). The cofilin family controls the lifespan of actin filaments in cells by severing the filament and promoting depolymerization. Understanding the effects of cofilin on actin filament structure is critical. It is widely acknowledged that cofilin binding significantly shortens the pitch of the actin helix. The authors previously reported (1) that this shortening extends to the unbound region of the actin filament on the pointed end side of the cluster. In this study, the authors presented substantially improved AFM images and provided detailed accounts of the dynamics observed. It was found that a minimal cofilin-binding cluster, consisting of 2-4 molecules, could induce changes in the helical parameters over one or more actin crossover repeats. Adjacent to the cofilin-binding clusters, the actin crossovers were observed to shorten within seconds, and this shortening was limited to one side of the cluster. Additionally, phosphate binding to the actin filament was observed to stabilize the helical twist, suggesting a mechanism in which cofilin preferentially binds to ADP-bound actin filaments. These findings significantly advance our understanding of actin filament dynamics, which is essential for a wide range of cellular processes.

      However, I propose that the sections about MAD and certain parts of the discussions need substantial revisions.

      MAD analysis

      The authors have presented findings that the mean axial distance (MAD) within actin filaments exhibits a significant dependency on the helical twist, a conclusion not previously derived despite extensive analyses through electron microscopy (EM) and molecular dynamics (MD) simulations. Notably, the MAD values span from 4.5 nm (8.5 pairs per half helical pitch, HHP) to 6.5 nm (4.5 pairs/HHP) as depicted in Figure 3C. The inner domain (ID) of actin remains very similar across C, G, and F forms (2, 3), maintaining similar ID-ID interactions in both cofilactin and bare actin filaments and keeping the axial distance between subunits identical in both states. This suggests that the ID is unlikely to undergo significant structural changes, even with fluctuations in the filament's twist, so that the ID-ID interactions and the axial distances are preserved. The broad range of MAD values reported is therefore difficult to explain. A careful reassessment of the MAD analysis is recommended to ensure accuracy.

      In determining axial distances, the authors extracted measurements from filament line profiles. It is advised to account for potential anomalies such as missing peaks or pseudo-peaks, which could arise from noise interference. An example includes the observation of three peaks in HHP6 of Figure Supplement 5C, corresponding to 4.5 pairs. Peak intervals measured from the graph were 5, 11.8, 8.7, and 5.7 nm. The second region (11.8 nm) appears excessively long. If one peak is hidden in the second region, the MAD becomes 5.5 nm.

      Compiling histograms of axial distances (ADs) rather than focusing solely on MAD may provide deeper insights. If the AD is too long or too short, the authors should suspect the presence of missing peaks or pseudo-peaks due to noise. If 4.4 or 5.5 pairs/HHP regions tend to contain missing peaks and 7.5-8.5 pairs/HHP regions tend to contain pseudo peaks, this may explain the MAD dependency on the helical twist.
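      The screening suggested above can be sketched in a few lines of Python. This is only an illustration, not the authors' analysis pipeline: the function name, the 5.5 nm reference spacing (taken from the corrected value quoted in the example above) and the two threshold factors are assumptions chosen for demonstration.

```python
def flag_axial_distances(intervals_nm, reference_nm=5.5,
                         long_factor=2.0, short_factor=0.5):
    """Label each peak-to-peak interval from a filament line profile.

    reference_nm, long_factor and short_factor are illustrative
    assumptions, not values taken from the manuscript.
    """
    labels = []
    for distance in intervals_nm:
        if distance >= long_factor * reference_nm:
            # An interval about twice the reference may hide an undetected peak.
            labels.append("possible missing peak")
        elif distance <= short_factor * reference_nm:
            # A very short interval may stem from a noise-induced pseudo-peak.
            labels.append("possible pseudo-peak")
        else:
            labels.append("ok")
    return labels


# Peak intervals quoted in the review for HHP6 of Figure Supplement 5C.
intervals = [5.0, 11.8, 8.7, 5.7]
print(flag_axial_distances(intervals))
```

      With these assumed thresholds only the 11.8 nm interval is flagged; a histogram of all intervals would show whether such outliers cluster in particular pairs/HHP classes.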

      Additionally, Figure 3E indicates a first decay constant of 0.14 seconds, substantially shorter than the frame rate (0.5 sec/frame). This suggests significant variations in line profiles between frames, attributable either to overly rapid dynamics or a low signal-to-noise ratio. Implementing running frame averages (of 2-3 frames) is recommended to distinguish between these scenarios. If the dynamics are indeed fast, the averaged frame's line profile may degrade, complicating peak identification. Conversely, if poor signal-to-noise ratio is the cause, averaging frames could facilitate peak detection. In the latter case, the authors can find the optimal number of frame averages and obtain better line profiles with fewer missing and pseudo-peaks.
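      A running frame average of the kind recommended here could look as follows. This is a minimal sketch under the assumption that each frame has already been reduced to a line profile of equal length; the function name and the toy data are hypothetical.

```python
def running_frame_average(profiles, window=2):
    """Average consecutive line profiles over a sliding window of frames.

    profiles: list of equally long lists of height values, one per frame.
    window:   number of consecutive frames to average (e.g. 2 or 3).
    Returns len(profiles) - window + 1 averaged profiles.
    """
    if window < 1 or window > len(profiles):
        raise ValueError("window must be between 1 and the number of frames")
    averaged = []
    for start in range(len(profiles) - window + 1):
        block = profiles[start:start + window]
        # zip(*block) groups the values measured at the same position
        # along the filament across the frames in the window.
        averaged.append([sum(column) / window for column in zip(*block)])
    return averaged


# Toy example: three frames with two positions each.
frames = [[0.0, 2.0], [2.0, 4.0], [4.0, 6.0]]
print(running_frame_average(frames, window=2))
```

      Repeating peak detection on the averaged profiles for window sizes of 1, 2 and 3 would indicate whether peaks become sharper (noise-limited) or smeared out (dynamics-limited).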

      Discussions

      The authors suggest a strong link between the C-form of actin and the formation of a short pitch helix. However, Oda et al. (3) have demonstrated that the C-form is highly unstable in the absence of cofilin binding, casting doubt on the possibility of the C-form propagating without cofilin binding. Moreover, in one strand of the cofilactin, interactions between actin subunits are limited to those between the inner domains (ID-ID interactions), which are quite similar to the interactions observed in bare actin filaments. This similarity implies that ID-ID interactions alone are insufficient to determine the helical parameters, suggesting that the presence of cofilin is essential for the formation of the short pitch helix in the cofilactin filament. Thus, crossover repeats are not necessarily shortened even if the actin form is C-form.

      Narita (4) proposes that the facilitation of cofilin binding may occur through a shortening in the helix pitch, independent of a change to the C-form of actin. Furthermore, the dissociation of the D-loop from an adjacent actin subunit leads directly to the transition of actin to the G-form, which is considered the most stable configuration for the actin molecule (3).

      The mechanism by which the shortened pitch propagates remains a critical and unresolved issue. It appears that this propagation is not a result of the C-form's propagation but likely involves an unidentified mechanism. Identifying and understanding this mechanism represents an essential direction for future research.

      (1) K. X. Ngo et al., Cofilin-induced unidirectional cooperative conformational changes in actin filaments revealed by high-speed atomic force microscopy. eLife 4 (2015).

      (2) K. Tanaka et al., Structural basis for cofilin binding and actin filament disassembly. Nature communications 9, 1860 (2018).

      (3) T. Oda et al., Structural Polymorphism of Actin. Journal of molecular biology 431, 3217-3228 (2019).

      (4) A. Narita, ADF/cofilin regulation from a structural viewpoint. Journal of muscle research and cell motility 41, 141-151 (2020).

    4. Reviewer #2 (Public Review):

      Summary:

      This study by Ngo et al. uses mostly high-speed AFM to estimate conformational changes within actin filaments, as they get decorated by cofilin. The authors build on their earlier study (Ngo et al. eLife 2015) where they used the same technique to monitor the expansion of cofilin clusters on actin filaments, and the propagation of the associated conformational changes in the filament (reduction of the helical pitch). Here, they propose a higher-resolution description of the binding of cofilin to actin filaments.

      Strengths:

      The high-speed AFM technique used here is quite original for addressing this question, compared to classical light and electron microscopy techniques. It can certainly bring valuable information as it provides a high spatial resolution while monitoring live events. Also, in this paper, a nice effort was made to make the 3D structures and conformational changes clear and understandable.

      Weaknesses:

      The paper also has a number of limitations, which I detail below.

      In addition to AFM, the authors also propose a Principal Component Analysis (PCA) of existing structural data on actin protomers. However, this part seems very similar to another published work by others (Oda et al. JMB 2019), which is not even cited.

      The asymmetrical growth of cofilin clusters has so far only been seen using AFM, by the same authors (Ngo et al. eLife 2015). Using fluorescence microscopy, others have reported a very symmetrical expansion of cofilin clusters (Wioland et al. Curr Biol 2017). This is not mentioned at all here. It should be discussed, and explanations for this discrepancy could be proposed.

      Regarding the AFM technique, I have the following concerns.

      The filaments appear densely packed on the surface, and even clearly in register in some images (if not most images, e.g., Figs 3A, 4BC, 5A). Why is that? Isn't there a risk that this could affect the result? This suggests there is some interaction between the filaments.

      The properties of the lipid layer and its interaction with the actin filaments are not clear at all. Poor control of these interactions is a problem if one aims to measure conformational changes at high resolution. The strength of the interaction appears to be tuned by the ratio of lipids put on the surface to change its electrostatic charge. A strong attachment likely does more than suppress torsional motion (as claimed in Fig 8A). It may also hinder cofilin binding in several ways (lower availability of binding sites on the filament facing the surface, electrostatic interactions between cofilin and the surface, etc.).

      How do we know that the variations over time are not mostly experimental noise, i.e. variations between repeats of the same measurement? As shown in Fig 3, correlation is mostly lost from one image to the next, and rather stable after that.

      The identification of cofilactin regions relies on the additional height of the "peaks", due to the presence of cofilin. It thus seems that cofilin is detected every half helical pitch (HHP), but not in between, thereby setting the resolution for the localization of cluster borders to one HHP. It thus seems difficult to claim that there is a change in helicity without cofilin decoration over this distance. In Fig 7, the change in helicity could be due to cofilin decoration that is undetected because cofilins have not yet reached the next peak.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This study provides valuable insights into how chromatin-bound PfMORC controls gene expression in the asexual blood stage of Plasmodium falciparum. By interacting with key nuclear proteins, PfMORC appears to affect expression of genes important for host invasion and subtelomeric var genes. Correlating transcriptomic data with in vivo chromatin insights, the study provides solid evidence for the central role of PfMORC in epigenetic transcriptional regulation through modulation of chromatin compaction.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The study provides valuable insights into the role of PfMORC in Plasmodium's epigenetic regulation, backed by a comprehensive methodological approach. The overarching goal was to understand the role of PfMORC in epigenetic regulation during asexual blood stage development, particularly its interactions with ApiAP2 TFs and its potential involvement in the regulation of genes vital for Plasmodium virulence. To achieve this, they conducted various analyses. These include a proteomic analysis to identify nuclear proteins interacting with PfMORC, a study to determine the genome-wide localization of PfMORC at multiple developmental stages, and a transcriptomic analysis in PfMORCHA-glmS knockdown parasites. Taken together, this study suggests that PfMORC is involved in chromatin assemblies that contribute to the epigenetic modulation of transcription during the asexual blood stage development.

      Strengths:

      The study employed a multi-faceted approach, combining proteomic, genomic, and transcriptomic analyses, providing a holistic view of PfMORC's role. The proteomic analysis successfully identified several nuclear proteins that may interact with PfMORC. The genome-wide localization offered valuable insights into PfMORC's function, especially its predominant recruitment to subtelomeric regions. The results align with previous findings on PfMORC's interaction with ApiAP2 TFs. Notably, the authors meticulously contextualized their findings with prior research, including pre-prints, adding credibility to their work.

      Weaknesses:

      While the study identifies potential interacting partners and loci of binding, direct functional outcomes of these interactions remain an inference. The authors heavily rely on past research for some of their claims. While it strengthens some assertions, it might indicate a lack of direct evidence in the current study for particular aspects. The declaration that PfMORC may serve as an attractive drug target is substantial. While the data suggests its involvement in essential processes, further studies are required to validate its feasibility as a drug target.

      Reviewer #2 (Public Review):

      Summary:

      This paper, entitled "Plasmodium falciparum MORC protein modulates gene expression through interaction with heterochromatin", describes the role of PfMORC during the intra-erythrocytic cycle of Plasmodium falciparum. Garcia et al. investigated the PfMORC-interacting proteins and PfMORC genomic distribution in trophozoites and schizonts. They also examined the transcriptome of the parasites after partial knockdown of the transcript.

      Strengths:

      This study is a significant advance in the knowledge of the role of PfMORC in heterochromatin assembly. It provides an in-depth analysis of the PfMORC genomic localization and its correlation with other chromatin marks and ApiAP2 transcription factor binding.

      Weaknesses:

      However, most of the conclusions are based on the function of interacting proteins and the genomic localization of the protein. The authors did not investigate the direct effects of PfMORC depletion on heterochromatin marks. Furthermore, the results of the transcriptomic analysis are puzzling as 50% of the transcripts are downregulated, a phenotype not expected for a heterochromatin marker.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Suggestions for improved or additional experiments, data, or analyses.

      • Figure 1A and Table 1: the authors should incorporate a volcano plot in their proteomic results presentation. This graphical representation can provide a more intuitive grasp of the most relevant proteins associated with PfMORC in terms of both their abundance and significance. It will aid in swiftly pinpointing proteins with the most notable differential associations. This will complement the comprehensive overview provided by the authors, referencing past research where PfMORC was detailed.

      We thank the reviewer for the suggestion. We agree with the reviewer that the volcano plot we now provide does indeed bring comprehensive information on associations between PfMORC and other cellular proteins. The volcano plot, presented in the revised manuscript as Figure 1A, was generated using the normalized MS/MS counts from the anti-GFP and 3D7 (control) proteomics datasets (n=3). The potential PfMORC interacting proteins were determined using the fold changes and p-values between the two datasets, as provided in Table 1.

      Several protein interactors were strongly supported by statistical analysis (p-value), while others showed weaker p-value due to variability between replicates. Indeed, the total number of proteins identified in the three replicates, shown in the Venn diagram (Supplemental Figure 1D), exhibits a good overlap between the replicates but a lower number of identified proteins in the GFP-E1 sample. This variability was observed also in the statistical analysis. Indeed, by analyzing the GFP/3D7 ratios, some proteins have a significant difference in abundance (fold change greater than 1.5x) in one of the groups but do not meet the statistical threshold. For more clarity, we have included the -log p-value for the proteins listed in Table 1.

      Overall, these results demonstrate that many ApiAP2 proteins and several chromatin-associated factors interact with PfMORC.
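      For readers unfamiliar with volcano plots, the two coordinates of each protein can be derived as in the following sketch. It is a generic illustration, not the authors' code: the function names, the example counts and the default cutoffs are assumptions, and the p-value is taken as given rather than computed from the replicates.

```python
import math


def volcano_point(mean_ip, mean_control, p_value):
    """Return (log2 fold change, -log10 p) for one protein.

    mean_ip and mean_control are mean normalized counts in the pull-down
    and control samples; both are assumed to be greater than zero.
    """
    log2_fold_change = math.log2(mean_ip / mean_control)
    neg_log10_p = -math.log10(p_value)
    return log2_fold_change, neg_log10_p


def is_candidate(mean_ip, mean_control, p_value,
                 min_log2_fc=1.5, max_p=0.05):
    """Apply an enrichment cutoff and a significance cutoff.

    Both thresholds are illustrative defaults, not the study's settings.
    """
    log2_fold_change, _ = volcano_point(mean_ip, mean_control, p_value)
    return log2_fold_change >= min_log2_fc and p_value <= max_p


# Hypothetical protein with 8 counts in the pull-down and 2 in the control.
print(volcano_point(8.0, 2.0, 0.01))
print(is_candidate(8.0, 2.0, 0.2))
```

      The second call illustrates the case described above: a protein that is clearly enriched but fails the statistical threshold because of variability between replicates.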

      • Given the plethora of proteins detected in the PfMORC eluate, it raises the question of how many are genuine MORC interactors versus those that are merely nearby molecules acting adjacently. These might incidentally end up in the immunoprecipitate due to unintended interactions with DNA or chromatin. While the M&M section mentions that the beads were thoroughly washed, there is no specification about the washing buffer or its stringency (i.e., salinity level). At higher salinities, one could isolate core complexes of interactors associated with DNA or even RNA carryover.

      We apologize for this omission and have now added the buffer composition used to wash the beads. This section now reads "To perform the co-immunoprecipitation we followed the manufacturer protocol (ChromoTek, gta-20). Samples were lysed in modified RIPA buffer (50 mM Tris, pH 7.5, 150 mM NaCl, 0.5% sodium deoxycholate, 1% Nonidet P-40, 10 µg/ml aprotinin, 10 µg/ml leupeptin, 10 µg/ml, 1 mM phenylmethylsulfonyl fluoride, benzamidine) for 30 min on ice. The lysate was precleared with 50 µl of protein A/G-Agarose beads at 4°C for 1 h and clarified by centrifugation at 10,000 × g for 10 min. The precleared lysate was incubated overnight with an anti-GFP antibody using anti-GFP-Trap-A beads (ChromoTek, gta-20). The magnetic beads were then pelleted using a magnet (Invitrogen) and washed 3 times with wash buffer (10 mM Tris/Cl pH 7.5, 150 mM NaCl, 0.05 % Nonidet™ P40 Substitute, 0.5 mM EDTA)."

      We used the same salt concentration for immunoprecipitation as was used in the lysis buffer to minimize the binding of non-specific proteins. The wash buffer composition is updated in the revised manuscript. The immunoprecipitations were done in biological triplicates to ensure reproducibility and statistical support. A number of proteins are common across all three replicates. We also used wild-type parasites (non-GFP) as a negative control to eliminate non-specific hits, and we used a log2-fold change ≥1.5 relative to wild type parasites as our cutoff between the comparison groups.

      We believe that these conditions provide the stringency required to identify high-confidence PfMORC-interacting proteins, although this still leaves open the possibility of additional lower-affinity interactions. Future studies will certainly follow up candidate interaction partners to better define this complex. However, the complexity we observe resembles that reported previously in Toxoplasma gondii (Farhat et al. 2020, Nat Microbiol) as well as in another report on the PfMORC complexes: https://elifesciences.org/reviewed-preprints/92499

      • The authors demonstrate that PfMORC creates distinct peaks in and around HP1-bound areas (Figure 2F), hinting at a specific role for PfMORC in heterochromatin compaction, boundary definition, and gene silencing. This pattern is clearly depicted in an example in Figure 2F. It would be beneficial to know if this enrichment profile is replicated elsewhere and, if so, it would be worthwhile to quantify it.

      This is an excellent point. Yes, this pattern is seen across the entire genome, where PfMORC is apposed to PfHP1-bound heterochromatic regions. As indicated in the manuscript, we have quantified this effect genome-wide; however, since we already display compiled data for Chromosome 2 (at both chromosome ends) pertaining to the position of PfMORC relative to PfHP1 we do not feel it is essential to provide such a figure for the entire genome as it does not alter the central message of our manuscript. Figure 2F is representative of the genome-wide distribution of PfMORC relative to PfHP1. The raw genome-wide data are available in Supplementary Information for further inspection of specific loci on other chromosomes.

      Recommendations for improving the writing and presentation.

      MAIN TEXT

      Panel e, referenced both in the main text and legend, is missing from Figure 4. This missing panel represents a significant finding of the study, highlighting according to the authors a low correlation between ChIP-seq gene targets and RNA-seq DEGs. This observation implies that PfMORC's global occupancy is more aligned with shaping chromatin architecture than directly regulating specific gene targets. In light of this, the authors should rephrase parts of their manuscript (including abstract and title) to avoid suggesting that PfMORC acts primarily (directly) as a gene regulator, emphasizing instead its role in influencing the topological structure of chromosomes.

      We have modified the title as suggested by the reviewer to more accurately reflect that PfMORC modulates chromatin architecture rather than acting as a direct regulator of specific genes. Our new title is: A Plasmodium falciparum MORC protein complex modulates epigenetic control of gene expression through interaction with heterochromatin

      We apologize for the omission of Figure 4e, which is now included in the revised manuscript. We found PfMORC occupancy on all chromosomes at subtelomeric regions, which are known to harbor genes related to immune evasion and antigenic variation (including most of the var genes). This study is also in agreement with Bryant et al. (PMID 32816370) which reported PfMORC occupancy along with PfISW1 at var gene promoters. PfMORC has also been identified in complexes with various ApiAP2 proteins in a proteome-wide study (Hillier et al. Cell Rep, PMID 31390575), as well as in immunoprecipitations of PfAP2-G2 (Singh et al., Mol Micro, PMID 33368818) and PfAP2-P (Subudhi et al., Nat Microbiol, PMID 37884813). The recent study by Subudhi et al. reports that PfAP2-P is involved in the regulation of var gene expression, antigenic variation, trophozoite development and parasite egress. It is therefore possible that PfMORC may have different effects on transcriptional regulation through interactions with different ApiAP2 transcription factors. Our comparison of PfMORC with known ApiAP2 protein occupancy reveals a high level of overlap, indicating that PfMORC may affect gene expression in various ways throughout the asexual cycle. Additionally, Hillier et al. show that PfMORC interaction is not limited to ApiAP2 but also implicates several other chromatin remodellers, which is consistent with our own results. We do not imply direct regulation of transcription via PfMORC in our manuscript. To the contrary, we suggest that it interacts with heterochromatin and thereby plays a role in the epigenetic control of asexual blood stage transcriptional regulation which is also clarified in the revised abstract.

      Another limitation of the differential gene expression analysis was the use of the glmS ribozyme system, which resulted in only 50% depletion of the PfMORC transcript. The remaining PfMORC may be sufficient to maintain gene expression, masking changes that we therefore could not detect. It is consequently difficult to attribute PfMORC function to chromatin architecture alone and to exclude a role in gene expression.

      If we believe that PfMORC in Plasmodium isn't mainly adjusting gene expression, the authors' suggestion that MORC is targeted by some AP2s becomes puzzling. How do we make sense of these different ideas? The authors need to clarify this to maintain consistency in their findings.

      Based on our data, we hypothesize that PfMORC acts as an accessory protein for ApiAP2 transcription factors. In a number of studies, including ours and the concurrent publication in eLife (https://elifesciences.org/reviewed-preprints/92499), PfMORC co-immunoprecipitated with several ApiAP2 proteins, suggesting that it has multiple functions. In our previous study we showed that PfMORC expression is highest in mid and late asexual stages. A comparison of PfMORC occupancy with that of six ApiAP2 proteins (which have different expression profiles) suggests plasticity in PfMORC function. We have revised our discussion to make this hypothesis more transparent for the readers.

      The authors should cite Farhat et al. 2020 (Extended Data Fig. 1a), as it similarly identified 3 different ELM2-containing proteins in Toxoplasma MORC-associated complexes. This previous work provides context and supports the observations made with PfMORC in this study.

      Thank you for the suggestion and pointing out this omission. We have indeed cited the work of the Farhat group in the original manuscript and have now included this additional reference to corroborate the text and provide further support to our conclusions.

      Minor corrections to the text and figures.

      • Panel e is missing from Figure 4.

      As mentioned above Panel e is now included in Figure 4.

      • The captions are very minimally detailed. An effort must be made to better describe the panels as well as which statistical tests were used. As it stands, this is not really up to standard.

      We have elaborated the captions with more detailed descriptions, and we now provide additional information where further clarification was necessary.

      Reviewer #2 (Recommendations For The Authors):

      • The study lacks a direct correlation between the inferred function of PfMORC and the heterochromatin state of the genome after its depletion. It would be interesting to perform ChIP-seq on known heterochromatin markers such as H3K9me3, HP1 or H3K36me2/3 to measure the consequences of PfMORC depletion on global heterochromatin and its boundaries.

      While the proposed experiments are certainly interesting, they are beyond the scope of this study. The current manuscript is focused on PfMORC occupancy, its interacting partners, and its impact on differential gene regulation after PfMORC depletion in asexual parasites. Nonetheless, we did in fact compare PfMORC occupancy with that of various heterochromatin markers (H2A.Z, H3K9ac, H3K4me3, H3K27ac, H3K18ac, H3K9me3, H3K36me2/3, H4K20me3, and H3K4me1) at the 30hpi and 40hpi time points. These data are presented in Supplemental Figure 9. We did not find any significant colocalization, but documented the presence of PfMORC in H3K36me2-depleted regions.

      • The PfMORC depletion was performed using a glms-based genetic system and the reviewer did not find any quantification of the depletion level at 24h or 36h. This is particularly important as the authors present RNA-seq data at these time points.

      We would like to clarify that RNA-seq was performed on 32hpi parasites after approximately 48 h treatment with 2.5 mM GlcN. At the trophozoite and schizont stage, PfMORC expression is high, which is why we selected these time points for RNA-seq (32hpi) and ChIP-seq (30hpi and 40hpi). PfMORC protein expression after GlcN treatment is analyzed in our previous paper (Singh et al., Sci Rep, PMID 33479315), where treatment with 2.5 mM GlcN leads to 50% reduction in PfMORC transcript at 32hpi. This is referenced in the Results section; we decided not to repeat the same experiment in the current manuscript.

      • The authors performed a thorough analysis of the correlations between ApiAP2 binding, histone modification and genomic localization of PfMORC (their chip-seq data). However, they found an inverse relationship between H3K36me2, a known histone repressive mark, and PfMORC genomic localization. This is particularly surprising when PfMORC itself is presented as a heterochromatin marker. The wording of this data is confusing in the results section (lines 257-258) and never discussed further. This important data should at least be discussed to make sense of this apparent contradiction.

      H3K36me2 indeed acts as a global repressive mark in P. falciparum. However, our hypothesis implies that PfMORC not only overlaps with H3K36me2-depleted regions, but also interacts with other epigenetic regulators. Therefore, we propose that PfMORC is part of chromatin remodeling complexes involved in heterochromatin dynamics. Moreover, we did not see any overlap with several other heterochromatin markers, suggesting that PfMORC has a unique binding preference not shared with them. Based on this study and parallel work submitted by Chahine et al. (https://elifesciences.org/reviewed-preprints/92499#abstract), it is evident that PfMORC is crucial for gene regulation and chromatin structure maintenance, as shown in other organisms. Currently, we do not know what the apparent mutual exclusion between H3K36me2 and PfMORC implies mechanistically or how PfMORC interaction with heterochromatin aids in chromatin integrity. In Arabidopsis thaliana, MORC binding leads to chromatin compaction and reduces DNA accessibility to transcription factors, thereby repressing gene expression. In P. falciparum, overlap in the binding region of PfMORC with different transcription factors suggests several possibilities that require further investigation. Since there is only one gene encoding a PfMORC protein in P. falciparum, it is possible that PfMORC function is not limited to chromatin integrity, but that it may also modulate gene expression at different stages. Fully exploring the function of PfMORC will require investigating the functional role of the other interacting partners we and others have identified.

      We have modified the result section per the reviewer's suggestion, and we now also discuss this finding in more detail in the discussion section.

      • The ChIP-seq data are central to this manuscript. However, the presentation of this data in Figure 2A suggests that it is very noisy (particularly for Chr1). It would be of interest to present the called peaks together with the normalized data so that the reader can assess the quality of the ChIP-seq data.

      Our results clearly demonstrate the enrichment of PfMORC in sub-telomeric regions and internal heterochromatic islands. These results are consistent across all of our replicates taken at two independent time points of parasite asexual blood stage development and correlate well with the results of Le Roch: https://elifesciences.org/reviewed-preprints/92499. The raw data files have been provided and can be re-analyzed by any user.

      • The RNA-seq data showed that only a few genes are affected after 24 h of PfMORC depletion. Furthermore, there is an equal number of up- and down-regulated genes. It is not clear why depletion of a heterochromatin marker would induce down-regulation of genes. How these data relate to the partial depletion of PfMORC is not discussed.

      We would like to clarify that the RNA-seq experiment was performed at 32hpi following GlcN-induced knockdown as previously described (Singh et al., Sci Rep, PMID 33479315). Briefly, synchronous, early trophozoite-stage (24hpi) PfMORCglmS-HA parasites were treated with 2.5 mM GlcN until they reached the trophozoite stage (32hpi) in the next cycle. These parasites were then collected for analysis by RNA-seq. We did not detect a substantial log-fold change at this point because only 50% of the transcripts were depleted in the glmS-based PfMORC knockdown system. However, we have seen a distinctive pattern of up- (60) and down- (103) regulated DEGs that comprise egress-related genes or surface antigens. We believe that PfMORC interacts with different ApiAP2 proteins, as shown in Figure 3A, and consequently exhibits multiple functions. This finding has now been corroborated in several other recent studies (see response to Reviewer 1 above).

    2. eLife assessment

      This study provides valuable insights into how chromatin-bound PfMORC controls gene expression in the asexual blood stage of Plasmodium falciparum. By interacting with key nuclear proteins, PfMORC is predicted to affect expression of genes important for host invasion and variable subtelomeric gene families. Correlating transcriptomic data with in vivo chromatin analysis, the study provides convincing evidence for the role of PfMORC in epigenetic transcriptional regulation.

    3. Reviewer #1 (Public Review):

      Summary:

      The study provides valuable insights into the role of PfMORC in Plasmodium's epigenetic regulation, backed by a comprehensive methodological approach. The overarching goal was to understand the role of PfMORC in epigenetic regulation during asexual blood stage development, particularly its interactions with ApiAP2 TFs and its potential involvement in the regulation of genes vital for Plasmodium virulence. To achieve this, they conducted various analyses. These include a proteomic analysis to identify nuclear proteins interacting with PfMORC, a study to determine the genome-wide localization of PfMORC at multiple developmental stages, and a transcriptomic analysis in PfMORCHA-glmS knockdown parasites. Taken together, this study suggests that PfMORC is involved in chromatin assemblies that contribute to the epigenetic modulation of transcription during the asexual blood stage development.

      Strengths:

      The study employed a multi-faceted approach, combining proteomic, genomic, and transcriptomic analyses, providing a holistic view of PfMORC's role. The proteomic analysis successfully identified several nuclear proteins that may interact with PfMORC. The genome-wide localization offered valuable insights into PfMORC's function, especially its predominant recruitment to subtelomeric regions. The results align with previous findings on PfMORC's interaction with ApiAP2 TFs. Notably, the authors meticulously contextualized their findings with prior research adding credibility to their work.

      Weaknesses:

      While the study identifies potential interacting partners and loci of binding, direct functional outcomes of these interactions remain an inference. The use of the glmS ribozyme system to achieve a 50% reduction in PfMORC transcript levels makes it difficult to understand the role of PfMORC solely in terms of chromatin architecture without considering its impact on gene expression. Although assessing the overall impact of acute MORC depletion was beyond the scope of the study, it would have been informative.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1:

      I will summarize my comments and suggestions below.

      (1) Abstract:

      "Non-catalytic (pseudo)kinase signaling mechanisms have been described in metazoans, but information is scarce for plants." To the best of my understanding EFR is an active protein kinase in vitro and in vivo and cannot be considered a pseudokinase. Consider rephrasing.

      We rephrased to: “Non-catalytic signaling mechanisms of protein kinase domains have been described in metazoans, but information is scarce for plants.”

      (2) Page 4: It should be noted, that while membrane associated Rap-RiD systems have been used in planta to activate receptor kinase intracellular domains by promoting interaction with a co-receptor kinase domain, this system does not resemble the actual activation mechanism in the plasma membrane. This would be worth discussing when introducing the system. For example, the first substrates of the RK signaling complex may also be membrane associated and not freely diffuse in solution, which may be important for enzyme-substrate interaction.

      We inserted on page 4: “The RiD system was previously applied in planta, maintaining membrane-association by N-terminal myristoylation (Kim et al., 2021). For the in vitro experiments, the myristoylation sites were excluded to facilitate the production of recombinant protein.”

      (3) Page 4 and Fig 1: The catalytic Asp in BRI1 is D1027 and not D1009 (https://pubmed.ncbi.nlm.nih.gov/21289069/). Please check and prepare the correct mutant protein if needed.

      We clarified this in the text by stating that we mutated the HRD-aspartate to asparagine in all our catalytic-dead mutants: “Kinase-dead variants with the catalytic residue (HRD-aspartate) replaced by asparagine (EFRD849N and BRI1D1009N), had distinct effects […]”. D1027 in BRI1 is the DFG-Asp, which was not mutated in our study.

      (4) Page 4 and Fig 1: Is BIK1 a known component of the BR signaling pathway and a direct BRI1 substrate? Or in other words how specific is the trans-phosphorylation assay? In my opinion, a more suitable substrate for BRI1/BAK1 would be BSK1 or BSK3 (for example https://pubmed.ncbi.nlm.nih.gov/30615605/).

      Kinase-dead BIK1 is a reported substrate of BRI1. We clarified this in the results section by inserting: “BIK1 was chosen as it is a reported substrate of both EFR/BAK1 and BRI1/BAK1 complexes (Lin et al., 2013).”

      (5) Fig. 1B Why is BIK1 D202N partially phosphorylated in the absence of Rap? I would suggest to add control lanes showing BRI1, EFR, FLS2, BAK1 and BIK1 in isolation. Given that a nice in vitro activation system with purified components is available, why not compare the different enzyme kinetics rather than band intensities at only 1 enzyme : substrate ratio?

      BIK1 D202N is partially phosphorylated due to the presence of active BAK1 that is capable of transphosphorylating BIK1 D202N as it has been reported in a previous study: (DOI: 10.1038/s41586-018-0471-x).

      (6) Page 4 and Fig 1: Is the kinase dead variant of EFR indeed kinase dead? I could still see a decent autorad signal for this mutant when expressed in E. coli (Fig 1 A in Bender et al., 2021; https://pubmed.ncbi.nlm.nih.gov/34531323/)? If this mutant is not completely inactive, could this change the interpretation of the experiments performed with the mutant protein in vitro and in planta in the current manuscript? In my opinion, it could be possible that a partially active EFR mutant can be further activated by BAK1, and in turn can phosphorylate BIK1 D202N. The difference in autorad signal for BRI1D1009N and EFRD849N is very small, and the entire mechanism hinges on this difference.

      We would like to emphasize that the mechanism hinges on the difference between non-dimerized and dimerized kinase domains in the in vitro kinase assay. BRI1 D1009N fails to enhance BIK1 D202N trans-phosphorylation compared to the non-dimerized sample, while EFR D849N is still capable of enhancing BIK1 transphosphorylation upon dimerization as indicated by quantification of autorads (Figure 1B/C). We have also addressed this point in a section on the limitations of our study.

      (7) Fig 1B. "Our findings therefore support the hypothesis that EFR increases BIK1 phosphorylation by allosterically activating the BAK1 kinase domain." To the best of my understanding presence of wild-type EFR in the EFR-BAK1 signaling complex leads to much better phosphorylation of BIK1D202N when compared to the EFRD849N mutant. How does that support the allosteric mechanism? By assuming that the D849N mutant is in an inactive conformation and fully catalytically inactive (see above)? Again, I think the data could also be interpreted in such a way that the small difference in autorad signal for BIK1 between BRI1 inactive (but see above) and EFR inactive is due to EFR not being completely kinase dead (see above), rather than EFR being an allosteric regulator. To clarify this point I would suggest to (1) perform quantitative auto- and trans-(generic substrate) phosphorylation assays with wt and D849N EFR to derive enzyme kinetic parameters, to (2) include the EFRD849N mutant in the HDX analysis and (3) to generate transgenic lines for EFRD849N/F761H/Y836F // EFRD849N/F761H/SSAA and compare them to the existing lines in Fig. 3.

      Mutations of proteins, especially those that require conformational plasticity for their function can have pleiotropic effects as the mutation may affect the conformational plasticity and consequently catalytic and non-catalytic functions that depend on the conformational plasticity. In such cases, it is difficult to fully untangle catalytic and non-catalytic functions. Coming back to EFR D849N, the D849N mutation may also impact the non-catalytic function by altering the conformational plasticity, explaining the difference observed in EFR vs EFR D849N. As you rightly suggested, HDX would be a way to address this but would still not clarify whether catalytic activity contributes to activation. We instead attempted to produce analog sensitive EFR variants for in vivo characterization of EFR-targeted catalytic inhibition. Unfortunately, we failed in producing an analog-sensitive variant for which we could show ATP-analog binding. To address your concern, we inserted a section on limitations of the study.

      (8) Fig. 2B,C, supplement 3 C,D. Has it been assessed if the different EFR versions were expressed to similar protein levels and still localized to the PM?

      Localization of the mutant receptors has not been explicitly evaluated by confocal microscopy. However, the selected mutation EFRF761H is shown to accumulate in stable Arabidopsis lines (Figure 3 – Supplement 1C) and BAK1 could be coIPed by all EFR variants upon elf18-treatment (Figure 3 B), indicating plasma membrane localization.

      (9) How the active-like conformation of EFR is in turn activating BAK1 is poorly characterized, but appears to be the main step in the activation of the receptor complex. Extending the HDX analyses to resting and Rap-activated receptor complexes could be a first step to address this question. I tried to come up with an experimental plan to test if indeed the kinase activity of BAK1 and not of EFR is essential for signal propagation, but this is a complex issue. You would need to be able to mimic an activated form of EFR (which you can), to make sure its inactive (possibly, see above) and likewise to engineer a catalytically inactive form of BAK1 in an active-like state (difficult). As such a decisive experiment is difficult to implement, I would suggest to discuss different possible interpretations of the existing data and alternative scenarios in the discussion section of the manuscript.

      We addressed your concern whether BAK1 kinase activity is essential for signaling propagation by pairing EFRF761H and BAK1D416N (Figure 4 Supplement 2 C) which fails to induce signaling. In this case, EFRF761H is in its activated conformation but cannot activate downstream signaling. We also attempted to address your concern by an in vitro kinase assay by pairing EFR and BAK1D416N and using a range of concentrations of the substrate BIK1D202N. We observed that catalytic activity of BAK1 but not EFR was essential for BIK1 phosphorylation. However, this experiment does not address whether activated EFR can efficiently propagate signaling in the absence of BAK1 catalytic activity. In the limitations of the study section, we now discuss the catalytic importance of EFR for signaling activation.

      Author response image 1.

      BIK1 trans-phosphorylation depends on BAK1 catalytic activity. Increasing concentrations of BIK1 D202N were used as substrate for Rap-induced dimers of EFR-BAK1, EFR D849N-BAK1, and EFR-BAK1 D416N respectively. BIK1 trans-phosphorylation depended on the catalytic activity of BAK1. Proteins were purified from E. coli λPP cells. Three experiments yielded similar results of which a representative is shown here.

      Reviewer #2:

      All of my suggestions are minor.

      Figure 1B, I think it would be more useful to readers to explain the amino acid in the D-N change, rather than just call it D-to-N? Also, please label the bands on the stained gel; the shift on FKBP-BRI1 and FKBP-EFR are noticeable on the Coomassie stain.

      We implemented your suggestions.

      Figure 1-Supplement 1. There is still a signal in pS612 BAK1 (it states 'also failed to induce BAK1 S612 phosphorylation' in the text, which is not quite correct). Also, could mention the gel shift seen in BAK1, which appears absent in Y836F.

      We corrected the text which now states: “To test whether the requirement for Y836 phosphorylation is similar, we immunoprecipitated EFR-GFP and EFRY836F-GFP from mock- or elf18-treated seedlings and probed co-immunoprecipitated BAK1 for S612 phosphorylation. EFRY836F also obstructed the induction of BAK1 S612 phosphorylation (Figure 1 – Supplement 1), indicating that EFRY836F and EFRSSAA impair receptor complex activation.” The gel shift of BAK1 you pointed out was not observed in replications and thus we prefer not to comment on it.

      Figure 2 and 3 are full of a, b, c,d's, which I don't understand. Sorry

      We used uppercase letters to indicate subpanels and lowercase letters to indicate the results of the statistical testing. In the figure caption, we have clarified that the lowercase letters refer to statistical comparisons.

      Figure 2 A. If each point on the x-axis is one amino acid, I think it would again be useful to name the amino acids that the gold or purple or blue colored lines extend through.

      Each point stands for a peptide which are sorted by position of their starting amino acid from N-terminus to C-terminus. We now added plots of HDX for individual peptides that correspond to the highlighted region in subpanel A.

      Figure Supplement 1 is very small for what it is trying to show, even on the printed page. If this residue were to be phosphorylated, what would happen to the H-bond?

      We suppose that VIa-Tyr phosphorylation would break the H-bond and cause displacement of the αC-β4 loop. Recent studies, published after our submission, highlight the importance of this loop for substrate coordination and ATP binding. Thus, phosphorylating VIa-Tyr and displacing this loop may render the kinase rather unproductive. We have expanded the discussion to include this point.

      Figure 2B: Tyr 836 is not present in any of the alignments in Figure 2A. This should be rectified, because the text talks about the similarity to Tyr 156 in PKA.

      We have adjusted the alignments such that they now contain the VIa-Tyr residues of EFR and PKA.

      Figure 4D. Is there any particular reason that these blots are so hard to compare for FKBP and BAK1?

      We assume this refers to Figure 4 – Supplement 2 D. FKBP-EFR and FRB-BAK1 are both approximately the size of RuBisCO, the most abundant protein in plant protein samples, which overlaps with the FKBP- and FRB-tagged kinases on the blot. Thus, it is difficult to detect these proteins.

      Reviewer #3:

      (1) The paper reporting the allosteric activation mechanism of EGFR should be cited.

      Will be included.

      (2) The authors showed that "Rap addition increased BIK1 D202N phosphorylation when the BRI1 or EFR kinase domains were dimerized with BAK1, but no such effect was observed with FLS2". Please explain why FLS2 failed to enhance BIK1 transphosphorylation by Rap treatment?

      Even though BIK1 is a reported downstream signaling component of FLS2/BAK1, it might not be the most relevant one, and related RLCKs, like PBL1, might be better substrates for dimerized FLS2/BAK1. We have not tested this, however. Alternatively, the purified FLS2 kinase domain might be labile and unfold quickly even though it was kept on ice until the start of the assay, or the N-terminal FKBP-tag may disrupt function. As the reason for our observation is not clear, we have removed FLS2 in vitro dimerization experiments from the manuscript.

      (3) Based solely on the data presented in Figure 1, it can be concluded that EFR's kinase activity is not required to facilitate BIK1 transphosphorylation. Therefore, the title of Figure 1, "EFR Allosterically Activates BAK1," may be inappropriate.

      We have changed the figure title to: “EFR facilitates BIK1 trans-phosphorylation by BAK1 non-catalytically.”

      (4) In Figure 1- Supplement 1, I could not find any bands in anti-GFP and anti-BAK1 pS612 of input. Please redo it.

      Indeed, we could not detect protein in the input samples of this experiment. BAK1 S612 phosphorylation is an activation mark and not necessarily expected to be abundant enough for detection in input samples. EFR-GFP, however, is usually detected in input samples and is reported in Macho et al. 2014 from which manuscript these lines come. Why EFR-GFP is not detected in this set of experiments is unclear but, in our opinion, does not detract from the conclusions drawn since similar amounts of EFR-GFP are pulled-down across all samples.

      (5) For Figure 2A, please mark the structure represented by each color directly in the figure.

      We have made the suggested change.

      (6) Please modify "EFRF761/Y836F and EFRF761H/SSAA restore BIK1 trans-phosphorylation" to "EFRF761H/Y836F and EFRF761H/SSAA restore BIK1 trans-phosphorylation".

      Thank you for spotting this. We changed it.

      (7) The HDX-MS analysis demonstrated that the EFR (Y836F) mutation inhibits the formation of the active-like conformation. Conversely, the EFR (F761H) mutation serves as a potent intragenic suppressor, significantly stabilizing the active-like conformation. Confirming through HDX-MS conformational testing that the EFR (Y836F F761H) double mutation does not hinder the formation of the active-like EFR kinase conformation would greatly strengthen the conclusions of the article.

      Response: We agree that this would be beneficial, and we attempted to do it but failed to produce enough protein for HDX-MS analysis. We now state this in an extra section of the paper (“Limitations of the study”).

    2. eLife assessment

      This manuscript reports important in vitro biochemical and in planta experiments to study the receptor activation mechanism of plant membrane receptor kinase complexes with non-catalytic intracellular kinase domains. Several lines of evidence convincingly show that one such putative pseudokinase, the immune receptor EFR achieves an active conformation following phosphorylation by a co-receptor kinase, and then in turn activates the co-receptor kinase allosterically to enable it to phosphorylate down-stream signaling components. This manuscript will be of interest to scientists focusing on cell signalling and allosteric regulation.

    3. Reviewer #1 (Public Review):

      Summary

      The authors use an elegant but somewhat artificial heterodimerisation approach to activate the isolated cytoplasmic domains of different receptor kinases (RKs) including the receptor kinase BRI1 and EFR. The developmental RK BRI1 is known to be activated by the co-receptor BAK1. Active BRI1 is then able to phosphorylate downstream substrates. The immune receptor EFR is also an active protein kinase and is likewise activated by the co-receptor BAK1. EFR however appears to have little or no kinase activity but seems to use an allosteric mechanism to in turn enable BAK1 to phosphorylate the substrate kinase BIK1. EFR tyrosine phosphorylation by BAK1 appears to trigger a conformational change in EFR, activating the receptor. Likewise, kinase activating mutations can cause similar conformational transitions in EFR and also in BAK1 in vitro and in planta.

      Strengths:

      I particularly liked the HDX experiments coupled with mutational analysis (Fig. 2) and the design and testing of the kinase activating mutations (Fig. 3), as they provide novel mechanistic insights into the activation mechanisms of EFR and of BAK1. These findings are nicely extended by the large-scale identification of EFR-related RKs from different species with potentially similar activation mechanisms (Fig. 5).

      Weaknesses:

      In my opinion, there are currently two major issues with the present manuscript. (1) The authors have previously reported that the EFR kinase activity is dispensable for immune signaling (https://pubmed.ncbi.nlm.nih.gov/34531323/) but the wild-type EFR receptor still leads to a much better phosphorylation of the BIK1 substrate when compared to the kinase inactive D849N mutant protein (Fig. 1). (2) How the active-like conformation of EFR is in turn activating BAK1 is poorly characterized, but appears to be the main step in the activation of the receptor complex. Extending the HDX analyses to resting and Rap-activated receptor complexes could be a first step to address this question, but these HDX studies were not carried out due to technical limitations.

      Overall this is an interesting study that aims to advance our understanding of the activation mechanisms of different plant receptor kinases with important functions in plant immunity.

    4. Reviewer #2 (Public Review):

      Summary:

      Transmembrane signaling in plants is crucial for homeostasis. In this study, the authors set out to understand to what extent catalytic activity in the EFR tyrosine kinase is required in order to transmit a signal. This work was driven by mounting data that suggest many eukaryotic kinases do not rely on catalysis for signal transduction, relying instead on conformational switching to relay information. The crucial findings reported here involve the realisation that a kinase-inactive EFR can still activate (i.e. lead to downstream phosphorylation of) its partner protein BAK1. Using a convincing set of biochemical, mass spectrometric (HD-exchange) and in vivo assays, the team suggest a model in which EFR is likely phosphorylated in the canonical activation segment (where two Ser residues are present), which is sufficient to generate a conformation that can activate BAK1 through dimerisation. A model is put forward involving C-helix positioning in BAK1, and the model is extended to other 'non-RD' Arabidopsis kinases that likely do not require kinase activity for signaling.

      Strengths:

      The work uses logical and well-controlled approaches throughout, and is clear and convincing in most areas, linking data from IPs, kinase assays (including clear 32P-based biochemistry), HDX-MS data (from non-phosphorylated EFR), structural biology, oxidative burst data and infectivity assays. Repetitions and statistical analysis all appear appropriate.

      Overall, the work builds a convincing story and the discussion does a clear job of explaining the potential impact of these findings (and perhaps an explanation of why so many Arabidopsis kinases are 'pseudokinases', including XPS1 and XIIa6, where this is shown explicitly).

      Weaknesses:

      No major weaknesses are noted from reviewing the data and the paper follows a logical course built on solid foundations; the use of Tables to explain various experimental data pertinent to the reported studies is appreciated.

      (1) The use of a, b, c, d in Figures 2C and 3C etc is confusing to this referee, and is now addressed in the latest version.

      (2) The debate about kinase v pseudokinases is well over a decade old. For non-experts, the kinase alignments/issues raised are in PMID: 23863165 and might prove useful if cited.

      (3) Early on in the paper, the concept of kinases and pseudokinases related to R-spine (and extended R-spine) stability and regulation really needs to be more adequately introduced to explain what comes next; e.g. some of the key work in this area for RAF and Tyr kinases where mutual F-helix Phe amino acid changes are evaluated (conceptually similar to this study of the E-helix Tyr to Phe changes in EFR) should be cited (PMID: 17095602, 24567368 and 26925779).

      (4) In my version, some of the experimental text is also currently in the wrong order (and no page numbers, so hard for me to state exactly where in the manuscript); However, I am certain that Figure 2C is mentioned in the text when the data are actually shown in Figure 3C for the EFR-SSAA protein.

      (5) Tyr 156 in PKA is not shown in Supplement 1, 2A as suggested in the text; for readers, it will be important to show the alignment of the Tyr residue in other kinases; this has been updated in the second version. Although it is clearly challenging to generate phosphorylated EFR (seemingly through Codon-expansion here?), it appears unlikely that a phosphorylated EFR protein, even semi-pure, couldn't have been assayed to test the idea that the phosphorylation drives/supports downstream signaling. What about a DD or EE mutation, as commonly used (perhaps over-used) in MEK-type studies?

      Impact:

      The work is an important new step in the huge amount of follow-up work needed to examine how kinases and pseudokinases 'talk' to each other in (especially) the plant kingdom, where significant genetic expansions have occurred. The broader impact is that we might understand better how to manipulate signaling for the benefit of plants and mankind; as the authors suggest, their study is a natural progression both of their own work, and the kingdom-wide study of the Kannan group.

    5. Reviewer #3 (Public Review):

      The study presents strong evidence for allosteric activation of plant receptor kinases, which enhances our understanding of the non-catalytic mechanisms employed by this large family of receptors.

      Plant receptor kinases (RKs) play a critical role in transducing extracellular signals. The activation of RKs involves homo- or heterodimerization of the RKs, and it is believed that mutual phosphorylation of their intracellular kinase domains initiates downstream signaling. However, this model faces a challenge in cases where the kinase domain exhibits pseudokinase characteristics. In their recent study, Mühlenbeck et al. reveal the non-catalytic activation mechanisms of the EFR-BAK1 complex in plant receptor kinase signaling. Specifically, they aimed to determine that the EFR kinase domain activates BAK1 not through its kinase activity, but rather by utilizing a "conformational toggle" mechanism to enter an active-like state, enabling allosteric trans-activation of BAK1. The study sought to elucidate the structural elements and mutations of EFR that affect this conformational switch, as well as explore the implications for immune signaling in plants. To investigate the activation mechanisms of the EFR-BAK1 complex, the research team employed a combination of mutational analysis, structural studies, and hydrogen-deuterium exchange mass spectrometry (HDX-MS) analysis. For instance, through HDX-MS analysis, Mühlenbeck et al. discovered that the EFR (Y836F) mutation impairs the accessibility of the active-like conformation. On the other hand, they identified the EFR (F761H) mutation as a potent intragenic suppressor capable of stabilizing the active-like conformation, highlighting the pivotal role of allosteric regulation in BAK1 kinase activation. The data obtained from this methodology strengthens their major conclusion. Moreover, the researchers propose that the allosteric activation mechanism may extend beyond the EFR-BAK1 complex, as it may also be partially conserved in the Arabidopsis LRR-RK XIIa kinases. This suggests a broader role for non-catalytic mechanisms in plant RK signaling.

      The allosteric activation mechanism was demonstrated for receptor tyrosine kinases (RTKs) many years ago. A similar mechanism has been suggested for the activation of plant RKs, but experimental evidence for this conclusion is lacking. Data in this study represent a significant advancement in our understanding of non-catalytic mechanisms in plant RK signaling. By shedding light on the allosteric regulation of BAK1, the study provides a new paradigm for future research in this area.

    1. Author response:

      eLife assessment

      This study investigates associations between retrotransposon element expression and methylation with age and inflammation, using multiple public datasets. The study is valuable because a systematic analysis of retrotransposon element expression during human aging has been lacking. However, the data provided are incomplete due to the sole reliance on microarray expression data for the core analysis of the paper.

      Both reviewers found this study to be important. We have selected the microarray datasets of human blood adopted by a comprehensive study of ageing published in Nature Communications (DOI: 10.1038/ncomms9570). We only included the datasets specifically collected for ageing studies. Therefore, the large RNA-seq cohorts for cancer, cardiovascular, and neurological diseases were not relevant to this study and cannot be included.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Tsai and Seymen et al. investigate associations between RTE expression and methylation and age and inflammation, using multiple public datasets. The concept of the study is in principle interesting, as a systematic analysis of RTE expression during human aging is lacking.

      We thank the reviewer for the positive comment.

      Unfortunately, the reliance on expression microarray data, used to perform the core analysis of the paper places much of the study on shaky ground. The findings of the study would not be sufficiently supported until the authors validate them with more suitable methods.

      In our discussion section in the manuscript, we have clarified that “we are aware of the limitations imposed by using microarray in this study, particularly the low number of intergenic probes in the expression microarray data. Our study can be enriched with the advent of large RNA-seq cohorts for aging studies in the future.” However, the application of microarray for RTE expression analysis was introduced previously. In fact, in a manuscript published by Reichmann et al. (DOI: 10.1371/journal.pcbi.1002486) which was cited 76 times, the authors showed and experimentally verified that cryptic repetitive element probes present in Illumina and Affymetrix gene expression microarray platforms can accurately and sensitively monitor repetitive element expression data. Inspired by this methodological manuscript with reasonable acceptance by other researchers, we trusted that the RTE microarray probes could accurately quantify RTE expression at class and family levels.

      Strengths:

      This is a very important biological problem.

      Weaknesses:

      RNA microarray probes are obviously biased to genes, and thus quantifying transposon analysis based on them seems dubious. Based on how arrays are designed there should at least be partial (perhaps outdated evidence) that the probe sites overlap a protein-coding or non-coding RNA.

      We disagree with the reviewer that quantifying transposon analysis based on microarray data is dubious. As previously shown by Reichmann et al., the quantification is reliable as long as the probes do not overlap with annotated genes and they are in the correct orientation to detect sense repetitive element transcripts. Reichmann et al. identified 1,400 repetitive element probes in version 1.0, version 1.1 and version 2.0 of the Illumina Mouse WG-6 Beadchips by comparing the genomic locations of the probes with the RepeatMasked regions of the mouse genome. We applied the same criteria for Illumina Human HT-12 V3 (29431 probes) and V4 (33963 probes) to identify the RTE-specific probes.
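
      As an illustration only, the two selection criteria described in this response (no overlap with annotated genes, and sense orientation relative to a repeat) can be sketched as a simple interval filter. This is not the authors' or Reichmann et al.'s code. The `Interval` type, the function names, and the convention that a probe's strand is the strand of the transcript it detects are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Hypothetical genomic interval: 0-based start, exclusive end."""
    chrom: str
    start: int
    end: int
    strand: str  # "+" or "-"


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def select_rte_probes(probes, repeats, genes):
    """Return names of probes that sit in a same-strand repeat and touch no gene.

    probes:  dict mapping probe name -> Interval (assumed probe annotation)
    repeats: list of Interval (e.g. RepeatMasker-style annotations)
    genes:   list of Interval (annotated genes to exclude)
    """
    selected = []
    for name, probe in probes.items():
        # Criterion 1: probe lies in a repeat and detects its sense transcript.
        in_sense_repeat = any(
            overlaps(probe, r) and probe.strand == r.strand for r in repeats
        )
        # Criterion 2: probe does not overlap any annotated gene.
        hits_gene = any(overlaps(probe, g) for g in genes)
        if in_sense_repeat and not hits_gene:
            selected.append(name)
    return selected
```

      The linear scans keep the sketch short; a real annotation set would normally call for an interval index. Whether intronic overlap counts as overlap with a gene depends on which gene intervals are passed in, which is the point the reviewer raises.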

      The authors state they only used intergenic probes, but based on supplementary files, almost half of RTE probes are not intergenic but intronic (n=106 out of 264).

      All our identified RTE probes overlap with intergenic regions. However, due to their repetitive nature, some probes overlap with intronic regions, too. We can replace "intergenic" with "noncoding" in our revision to show that they do not overlap with the exons of protein-coding genes. However, we do not rule out the possibility that some of our detected RTE probes might overlap noncoding RNAs. In fact, the border between coding and non-coding genomes has recently become very fuzzy with new annotations of the genome. RTE RNAs can be easily considered as non-coding RNAs if we challenge our junk DNA view.

      This is further complicated by the fact that not all this small subset of probes is available in all analyzed datasets. For example, 232 probes were used for the MESA dataset but only 80 for the GTP dataset. Thus, RTE expression is quantified with a set of probes which is extremely likely to be highly affected by non-RTE transcripts and that is also different across the studied datasets. Differences in the subsets of probes could very well explain the large differences between datasets in multiple of the analyses performed by the authors, such as in Figure 2a, or 3a. It is nonetheless possible that the quantification of RTE expression performed by the authors is truly interpretable as RTE expression, but this must be validated with more data from RNA-seq. Above all, microarray data should not be the main type of data used in the type of analysis performed by the authors.

      In this study, we did not compare MESA with GTP etc. We have analysed each dataset separately based on the available data for that dataset. Therefore, sacrificing one analysis because of the lack of information from the other does not make sense. We would do that if we were after comparing different datasets. Moreover, the datasets are not comparable because they were produced from different blood cell types.

      Reviewer #2 (Public Review):

      Summary:

      Yi-Ting Tsai and colleagues conducted a systematic analysis of the correlation between the expression of retrotransposable elements (RTEs) and aging, using publicly available transcriptional and methylome microarray datasets of blood cells from large human cohorts, as well as single-cell transcriptomics. Although DNA hypomethylation was associated with chronological age across all RTE biotypes, the authors did not find a correlation between the levels of RTE expression and chronological age. However, expression levels of LINEs and LTRs positively correlated with DNA demethylation, and inflammatory and senescence gene signatures, indicative of "biological age". Gene set variation analysis showed that the inflammatory response is enriched in the samples expressing high levels of LINEs and LTRs. In summary, the study demonstrates that RTE expression correlates with "biological" rather than "chronological" aging.

      Strengths:

      The question the authors address is both relevant and important to the fields of aging and transposon biology.

      We thank the reviewer for finding this study relevant and important.

      Weaknesses:

      The choice of methodology does not fully support the primary claims. Although microarrays can detect certain intergenic transposon sequences, the authors themselves acknowledge in the Discussion section that this method's resolution is limited. More critical considerations, however, should be addressed when interpreting the results. The coverage of transposon sequences by microarrays is not only very limited (232 unique probes) but also predetermined. This implies that any potential age-related overexpression of RTEs located outside of the microarray-associated regions, or of polymorphic intact transposons, may go undetected. Therefore, the authors should be more careful while generalising their conclusions.

      This is a bioinformatics study, and we have already admitted and discussed the limitations in the discussion section of this manuscript. All technologies have their own limitations, and this should not stop us from shedding light on scientific facts because of inadequate information. In the manuscript, we have discussed that all large and proper ageing studies were performed using microarray technology. Peters et al. (DOI: 10.1038/ncomms9570) adopted all these microarray data in their transcriptional landscape of ageing manuscript. Our study essentially applies the Reichmann et al. method to the peripheral blood-related data from the Peters et al. manuscript. Since hypomethylation due to ageing is a well-established and broad epigenetic reprogramming, it is unlikely that only a fraction of RTEs is affected by this phenomenon. Therefore, the subsampling of RTEs should not affect the result so much. Indeed, this is supported in our study by the inverse correlation between DNA methylation and RTE expression for LINE and SINE classes despite having limited numbers of probes for LINE and SINE expression.

      Additionally, for some analyses, the authors pool signals from RTEs by class or family, despite the fact that these groups include subfamilies and members with very different properties and harmful potentials. For example, while sequences of older subfamilies might be passively expressed through readthrough transcription, intact members of younger groups could be autonomously reactivated and cause inflammation. The aggregation of signals by the largest group may obscure the potential reactivation of smaller subgroups. I recommend grouping by subfamily or, if not possible due to the low expression scores, by subgroup. For example, all HERV subfamilies are from the ERVL family.

      We agree with the reviewer that different subfamilies of RTEs play different roles through their activation. However, we will lose our statistical power if we study RTE subfamilies with a few probes. Global epigenetic alteration and derepression of RTEs by ageing have been observed to be genome-wide. While our systematic analysis across RTE classes and families cannot capture alterations in subfamilies due to statistical power, it is still relevant to the research question we are addressing.

      Next, Illumina arrays might not accurately represent the true abundance of TEs due to non-specific hybridization of genomic transposons. Standard RNA preparations always contain traces of abundant genomic SINEs unless DNA elimination is specifically thorough. The problem of such noise should be addressed.

      We have checked the RNA isolation step from MESA, GTP, and GARP manuscripts. The total RNA was isolated using the Qiagen mini kit following the manufacturer’s recommendations. The authors of these manuscripts did not mention whether they eliminated genomic DNA, but we assumed they were aware of the DNA contamination and eliminated it based on the manufacturer’s recommendations. We have looked up the literature about non-specific hybridization of RTEs but could not find any evidence to support this observation. We would appreciate the reviewers providing more evidence about such RTE contaminations.

      Lastly, scRNAseq was conducted using 10x Genomics technology. However, quantifying transposons in 10x sequencing datasets presents major challenges due to sparse signals.

      Applying the scTE pipeline (https://www.nature.com/articles/s41467-021-21808-x), we have found that the statistical power of quantifying RTE classes (LINE, SINE, and LTR) or RTE families (L1, L2, Alu, ERVK, etc.) is as good as that for each individual gene. However, our proposed method cannot analyse RTE subfamilies, and we did not do that.

      Smart-seq single-cell technology is better suited to this particular purpose.

      We agree with the reviewer that Smart-seq provides higher yield than 10x, but there are no Smart-seq data available for ageing studies.

      Anyway, it would be more convincing if the authors demonstrated TE expression across different clusters of immune cells using standard scRNAseq UMAP plots instead of boxplots.

      Since the number of RTE reads per cell is low, showing the expression of RTEs per cell in UMAP may not be the best statistical approach to show the difference between the aged and young groups. This is why we chose to analyse with pseudobulk and displayed differential expression using boxplot rather than UMAP for each immune cell type.
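
      As a rough illustration of the pseudobulk idea mentioned here (and not the authors' pipeline), sparse per-cell RTE counts can be summed within each donor and cell type before any comparison between age groups. The input layout, the function name, and the counts-per-million normalisation below are assumptions chosen for the sketch.

```python
from collections import defaultdict


def pseudobulk_cpm(cells):
    """Aggregate per-cell counts into one value per (donor, cell_type).

    cells: iterable of (donor, cell_type, rte_count, total_count) tuples,
           a hypothetical flat layout of a single-cell count table.
    Returns a dict mapping (donor, cell_type) -> RTE counts per million reads.
    """
    rte_sum = defaultdict(int)
    total_sum = defaultdict(int)
    for donor, cell_type, rte_count, total_count in cells:
        key = (donor, cell_type)
        rte_sum[key] += rte_count      # pool the sparse RTE signal across cells
        total_sum[key] += total_count  # pool library size for normalisation
    # Skip groups with no reads at all to avoid division by zero.
    return {
        key: 1e6 * rte_sum[key] / total_sum[key]
        for key in rte_sum
        if total_sum[key] > 0
    }
```

      Each donor then contributes one value per cell type, which is the kind of summary a boxplot of aged versus young groups would display.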

      I recommend validating the data by RNAseq, even on small cohorts. Given that the connection between RTE overexpression and inflammation has been previously established, the authors should consider better integrating their observations into the existing knowledge.

      Until recently, there was no publicly available, large, non-cancer RNA-seq cohort for ageing studies. We tried to gain access to the two RNA-seq datasets suggested by reviewer 2: Marquez et al. 2020 (phs001934.v1.p1, controlled access) and Morandini et al. 2023 (GSE193141, public access).

      Unfortunately, Marquez et al. 2020 data is not accessible because the authors only provide the data for projects related to cardiovascular diseases. However, we did analyse Morandini et al. 2023 data, and we can confirm that no association was observed between any class and family of RTEs with chronological ageing, which is the second strong piece of evidence supporting the statement in the manuscript. However, as expected, we found a positive correlation between RTE expression and IFNI signature score.

    2. eLife assessment

      This study investigates associations between retrotransposon element expression and methylation with age and inflammation, using multiple public datasets. The study is valuable because a systematic analysis of retrotransposon element expression during human aging has been lacking. However, the data provided are incomplete due to the sole reliance on microarray expression data for the core analysis of the paper.

    3. Reviewer #1 (Public Review):

      Summary:

      Tsai and Seymen et al. investigate associations between RTE expression and methylation and age and inflammation, using multiple public datasets. The concept of the study is in principle interesting, as a systematic analysis of RTE expression during human aging is lacking. Unfortunately, the reliance on expression microarray data, used to perform the core analysis of the paper places much of the study on shaky ground. The findings of the study would not be sufficiently supported until the authors validate them with more suitable methods.

      Strengths:

      This is a very important biological problem.

      Weaknesses:

      RNA microarray probes are obviously biased to genes, and thus quantifying transposon analysis based on them seems dubious. Based on how arrays are designed there should at least be partial (perhaps outdated evidence) that the probe sites overlap a protein-coding or non-coding RNA. The authors state they only used intergenic probes, but based on supplementary files, almost half of RTE probes are not intergenic but intronic (n=106 out of 264). This is further complicated by the fact that not all this small subset of probes is available in all analyzed datasets. For example, 232 probes were used for the MESA dataset but only 80 for the GTP dataset. Thus, RTE expression is quantified with a set of probes which is extremely likely to be highly affected by non-RTE transcripts and that is also different across the studied datasets. Differences in the subsets of probes could very well explain the large differences between datasets in multiple of the analyses performed by the authors, such as in Figure 2a, or 3a. It is nonetheless possible that the quantification of RTE expression performed by the authors is truly interpretable as RTE expression, but this must be validated with more data from RNA-seq. Above all, microarray data should not be the main type of data used in the type of analysis performed by the authors.

    4. Reviewer #2 (Public Review):

      Summary:

      Yi-Ting Tsai and colleagues conducted a systematic analysis of the correlation between the expression of retrotransposable elements (RTEs) and aging, using publicly available transcriptional and methylome microarray datasets of blood cells from large human cohorts, as well as single-cell transcriptomics. Although DNA hypomethylation was associated with chronological age across all RTE biotypes, the authors did not find a correlation between the levels of RTE expression and chronological age. However, expression levels of LINEs and LTRs positively correlated with DNA demethylation, and inflammatory and senescence gene signatures, indicative of "biological age". Gene set variation analysis showed that the inflammatory response is enriched in the samples expressing high levels of LINEs and LTRs. In summary, the study demonstrates that RTE expression correlates with "biological" rather than "chronological" aging.

      Strengths:

      The question the authors address is both relevant and important to the fields of aging and transposon biology.

      Weaknesses:

      The choice of methodology does not fully support the primary claims. Although microarrays can detect certain intergenic transposon sequences, the authors themselves acknowledge in the Discussion section that this method's resolution is limited. More critical considerations, however, should be addressed when interpreting the results. The coverage of transposon sequences by microarrays is not only very limited (232 unique probes) but also predetermined. This implies that any potential age-related overexpression of RTEs located outside of the microarray-associated regions, or of polymorphic intact transposons, may go undetected. Therefore, the authors should be more careful while generalising their conclusions.

      Additionally, for some analyses, the authors pool signals from RTEs by class or family, despite the fact that these groups include subfamilies and members with very different properties and harmful potentials. For example, while sequences of older subfamilies might be passively expressed through readthrough transcription, intact members of younger groups could be autonomously reactivated and cause inflammation. The aggregation of signals by the largest group may obscure the potential reactivation of smaller subgroups. I recommend grouping by subfamily or, if not possible due to the low expression scores, by subgroup. For example, all HERV subfamilies are from the ERVL family.

      Next, Illumina arrays might not accurately represent the true abundance of TEs due to non-specific hybridization of genomic transposons. Standard RNA preparations always contain traces of abundant genomic SINEs unless DNA elimination is specifically thorough. The problem of such noise should be addressed.

      Lastly, scRNAseq was conducted using 10x Genomics technology. However, quantifying transposons in 10x sequencing datasets presents major challenges due to sparse signals. Smart-seq single-cell technology is better suited to this particular purpose. Anyway, it would be more convincing if the authors demonstrated TE expression across different clusters of immune cells using standard scRNAseq UMAP plots instead of boxplots.

      I recommend validating the data by RNAseq, even on small cohorts. Given that the connection between RTE overexpression and inflammation has been previously established, the authors should consider better integrating their observations into the existing knowledge.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This study provides an important finding that the local abundance of metabolites impacts the biology of the tumor microenvironment by utilizing kidney tumors from patients and adjacent normal tissues. The evidence supporting the claims of the authors is convincing although certain caveats need to be taken into consideration as the authors acknowledged in the paper. The work will be of interest to the research community working on metabolism and on kidney cancer especially.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The present study addresses how the local abundance of metabolites impacts the biology of the tumor microenvironment. The authors enroll patients harboring kidney tumors and use freshly resected tumor material for metabolic studies. Specifically, the authors separate the adjacent normal kidney tissue from the tumor material and then harvest the interstitial fluid from the normal kidney (KIF) or the tumor (TIF) for quantitative metabolomics. The plasma samples from the patient are used for comparison. Additionally, the authors also compare metabolite levels in the plasma of patients with kidney versus lung cancer (or healthy donors) to address how specific tumor types might contribute to circulating levels of metabolites. Altogether, the authors find that the metabolite levels in the KIF and TIF, although vastly different than plasma, are largely overlapping. These findings indicate that tissue of origin appears to have a stronger role in determining the local metabolic environment of tumors than the genetics or biochemistry of the tumor itself.

      Strengths:

      The biggest strength of the current study is the use of human patient-derived samples. The cohort size (~50 patients) is relatively large, which adds to the rigor of the work. The work also relies on a small pool of metabolites that can be quantitatively measured using methods developed by the authors. Focusing on a smaller metabolic pool also likely increases the signal-to-noise ratio and enables the more rigorous determination of any underlying differences. The manuscript is well-written and highlights both the significance of the findings and also acknowledges many of the caveats. The recognition of the metabolic contributions of surrounding normal tissue as the primary driver of local nutrient abundance is a novel finding in the work, which can be leveraged in future studies.

      We thank the Reviewer for their careful evaluation of the study and for their supportive comments.

      Weaknesses:

      The work has certain caveats, some of which have been already recognized by the authors. These include the use of steady-state metabolites and the possibility of cross-contamination of some TIF into the adjacent KIF. This study is also unable to distinguish the mechanisms driving the metabolic changes in KIF/TIF relative to circulating levels in plasma.

      We agree with the Reviewer that these are important caveats to consider when interpreting the results of this study.

      The relative similarity of KIF and TIF is quite surprising. However, this interpretation is presently based on a sampling of only ~100 polar metabolites and ~200 lipid molecules. It is, perhaps, possible that future technological developments that enable more comprehensive quantitative metabolic profiling might distinguish between KIF and TIF composition.

      The Reviewer raises another important point that our interpretation of KIF vs TIF is limited to the ~300 metabolites we measured. We agree it would be worthwhile quantifying more metabolites where technically feasible to further characterize similarities and differences in nutrient availability between tumor and normal tissues.

      In vitro tissue culture is recognized to suffer from ‘non-physiological’ nutrient dependencies, which are impacted by the composition of culture media. Thus, in vivo studies remain our current gold standard in mechanistic studies of tumor metabolism. It is presently unclear whether the findings of this work will be recapitulated in any of the kidney cancer in vivo models and thus be functionally testable.

      We thank the Reviewer for calling attention to the limitations of cell culture media in studying tumor metabolism. While both in vitro and in vivo approaches have inherent limitations, formulating culture media based on metabolite concentrations measured here and in other studies provides a tool to study the influence of nutrient availability on kidney cell or kidney cancer cell phenotypes in vitro. We also agree with the Reviewer that it would be valuable to determine whether the findings in our study are recapitulated in mouse models of kidney cancer, as this might enable investigation into the factors that modulate nutrient availability in this tissue context.

      Reviewer #2 (Public Review):

      The study employs quantitative metabolomic and lipidomic analyses to scrutinize tumor interstitial fluid (TIF), adjacent normal kidney interstitial fluid (KIF), and plasma samples from renal cell carcinoma (RCC) patients. The authors delve into the intricate world of renal cell carcinoma and its tumor microenvironment, shedding light on the factors that shape nutrient availability in both cancerous and adjacent normal tissues. The authors prove that non-cancer-driven tissue factors play a dominant role in shaping nutrient availability in RCC. This finding opens up new avenues for research, suggesting that the tumor microenvironment is profoundly influenced by factors beyond the presence of cancer cells. This study not only contributes valuable insights into RCC metabolism but also prompts a reevaluation of the factors governing nutrient availability in tumor microenvironments more broadly. Overall, it represents a significant step forward in our understanding of the intricate interplay between cancer and its surrounding milieu.

      We thank the Reviewer for their evaluation of our work and for their supportive comments.

      The study is overall well-constructed, including appropriate analysis. Likewise, the manuscript is written clearly and supported by high-quality figures. Since the authors exclusively employed samples from RCC patients and did not include kidney interstitial fluid and plasma samples from healthy individuals, we cannot accurately assess the true significance and applicability of the results until the role of cancer cells in reshaping KIF is understood. In essence, some metabolite levels in the tumor interstitial fluid did not show an increase or decrease compared to the adjacent normal kidney interstitial fluid. However, the levels of these metabolites in both TIF and KIF might be higher or lower than those in kidney interstitial fluid from healthy individuals, and the roles of these metabolites should not be overlooked. Similar concerns extend to plasma levels, emphasizing the importance of metabolites that synchronously change in RCC TIF, KIF, and plasma, whether elevated or reduced.

      We agree with the Reviewer that an important caveat in considering the study findings is that we do not have KIF values from healthy individuals. Since resection of normal kidney is not a common procedure, obtaining KIF samples from healthy patients was not possible to complement our analysis. We further agree that the metabolite levels we measured in KIF or plasma are plausibly impacted by the presence of RCC. We did compare the composition of polar metabolites in the plasma from RCC, lung cancer, and healthy patients, highlighting how cystine is affected by tumor presence and/or sample collection methodology. We also point out that factors such as diet will impact metabolites in both blood and tissues.

      Reviewer #3 (Public Review):

      In this study, the authors utilized mass spectrometry-based quantification of polar metabolites and lipids in normal and cancerous tissue interstitial fluid and plasma. This showed that nutrient availability in tumor interstitial fluid was similar to that of interstitial fluid in adjacent normal kidney tissue, but that nutrients found in both interstitial fluid compartments were different from those found in plasma. This suggests that the nutrients in kidney tissue differ from those found in blood and that nutrients found in kidney tumors are largely dictated by factors shared with normal kidney tissue. Those data could be useful as a resource to support further study and modeling of the local environment of RCC and normal kidney physiology.

      We thank the Reviewer for their time considering our paper and for their supportive comments.

      In Figures 1D and 1E, there were about 30% of polar metabolites and 25% of lipids significantly different between TIF and KIF, which could be key factors for RCC tumors. This reviewer considers that the authors should make comments on this.

      We agree with the Reviewer that the metabolites that significantly differ between TIF and KIF are of interest, particularly for those studying RCC tumor metabolism. We comment on some of the metabolites driving differences between TIF and KIF in our discussion of Figure 2, and in the revised manuscript we now include a new figure showing a heatmap that enables visualization of these metabolites (Figure 2-Supplement 1A-B).

      Recommendations for the authors:

      From the Reviewing Editor:

      Figure 2 needs to plot heatmaps for both upregulated and downregulated metabolites in TIF.

      We agree and now include heatmaps for significantly differing polar metabolites and lipids in TIF vs KIF as requested by Reviewer 3 (Figure 2-Supplement 1A-B). For completeness, we also include heatmaps for metabolites differing between healthy and RCC plasma (Figure 2-Supplement 2C) and for NSCLC and RCC plasma (Figure 2-Supplement 2D).

      There is a need to show whether the differences in these metabolites between plasma and tissue interstitial fluid are specific to RCC patients or if they are also present in normal individuals.

      Unfortunately, it has not been possible for us to collect KIF from healthy individuals. Since resection of normal kidney is not a common procedure, we have no way to obtain sufficient KIF samples from healthy patients for this measurement. We discuss this as a limitation of the study.

      Reviewer #1 (Recommendations For The Authors):

      a. The authors should provide additional details about the methodology to separate the KIF and TIF. Contaminating metabolites from surrounding tissue or the peritoneal fluids could impact interpretation and it would be helpful to understand how these challenges were addressed during tissue collection for this study. Additionally, was the collected tissue minced or otherwise dissociated? If so, could these procedures cause tissue lysis and contaminate the KIF/TIF with intracellular components?

      We thank the Reviewer for the suggestions to include more information about the sampling methodology. Care was taken to minimize cell lysis incurred by the processing methodology, as tissues were not minced, smashed, or dissociated; however, there is still a possibility of some level of tissue lysis that is pre-existing or occurs during the isolation procedure. We note this caveat in the text (lines 218-220) and have updated the Methods with more details of the sampling and processing of the samples.

      b. Although the authors focus on metabolites that are elevated in TIF (relative to KIF and plasma), it would be equally relevant to consider the converse. Metabolites that are reduced in TIF, either due to underproduction or overconsumption, could render the tumors auxotrophic for some critical dependencies and identify some novel metabolic vulnerabilities. In this regard, Figure 2 could have a heatmap of the top metabolites that are elevated and depleted specifically in the TIF.

      We agree with the Reviewer it is useful to include heatmaps to better display the metabolites that significantly differ between TIF and KIF and now include these in Figure 2-Supplement 1A-B.

      c. The future utilization of this knowledge would depend on our ability to model these differences. Would interstitial tissue from a normal mouse kidney or tumor-bearing mouse kidney recapitulate the same differences relative to mouse plasma?

      We agree with the Reviewer that it would be worth determining whether the findings in our study are recapitulated in mouse models of kidney cancer, which would support future investigation into the factors that modulate nutrient availability. This is an interesting question, but we did not have access to endogenously arising models of RCC, which have been a limitation for the field, and comparison of normal mouse kidney metabolite data to human metabolite data is problematic for obvious reasons. Thus, we had no choice but to discuss this as a limitation of the study.

      Reviewer #2 (Recommendations For The Authors):

      In this study, Abbott et al. investigated the metabolic profile of renal cell carcinoma (RCC) by analyzing the tumor interstitial fluid (TIF), adjacent normal kidney interstitial fluid (KIF), and plasma samples from patients. The results indicate that nutrient composition in TIF closely resembles that of KIF, suggesting that tissue-specific factors, rather than tumor-driven alterations, have a more significant impact on nutrient levels. These findings are interesting. The study is overall well-constructed, including appropriate analysis, and the manuscript is written clearly and supported by high-quality figures. However, some issues are raised which if addressed, would strengthen the paper.

      We thank the Reviewer for their suggestions to improve the paper.

      The authors found a difference in the number of metabolites when comparing TIF or KIF lipid composition with plasma. The discoveries are intriguing; however, I am keen to understand whether the differences in these metabolites between plasma and tissue interstitial fluid are specific to RCC patients or if they are also present in normal individuals. I am particularly interested in identifying which metabolites could serve as potential diagnostic markers, intervention targets, or potentially reshape the tumor microenvironment. Because, even though some metabolite levels show no difference between TIF and KIF in RCC patients, I wonder if these metabolite levels in KIF increase or decrease compared to the interstitial fluid in healthy individuals. I am intrigued by the metabolites that simultaneously increase or decrease in both TIF and KIF compared to the kidney interstitial fluid in healthy individuals.

      We agree with the Reviewer that it would be interesting to measure kidney interstitial fluid from healthy patients to be able to compare metabolites changing due to the presence of RCC tumor. As we discuss in response to the public review, this was not possible as we could not obtain material from healthy individuals for analysis. Nevertheless, we agree it warrants future study if material were available.

      The analysis conducted using plasma from healthy donors, as applauded by the author, is noteworthy. The author seems to have found that cystine levels do not differ between RCC patient plasma and tissue interstitial fluid. However, considering that the cystine concentration in patient plasma is approximately two-fold higher than in plasma from healthy individuals, it is likely that cystine levels in patient tissue fluid have also increased nearly two-fold compared to levels in the interstitial fluid of normal kidney tissues. This finding aligns with the discovery of elevated GSH levels in cancer cells.

      We agree with the Reviewer that a higher cystine concentration in RCC patient plasma and interstitial fluid is interesting, and also considered this in relationship to past findings including reports of elevated GSH levels in RCC. However, we think this observation is driven at least in part by the fasting status of the patients pre-surgery. This does not rule out some part being related to the presence of the tumor, as this would be consistent with elevated GSH levels as noted by the Reviewer. Future studies will be needed to further delineate the factors that impact elevated cystine levels in both interstitial fluid and plasma.

      Some minor typos, such as "HIF1􀀀-driven" should be corrected.

      We thank the Reviewer for pointing out this typo and we have corrected it in the revised manuscript.

    2. Reviewer #3 (Public Review):

      In this study, the authors utilized mass spectrometry-based quantification of polar metabolites and lipids in normal and cancerous tissue interstitial fluid and plasma. This showed that nutrient availability in tumor interstitial fluid was similar to that of interstitial fluid in adjacent normal kidney tissue, but that nutrients found in both interstitial fluid compartments were different from those found in plasma. This suggests that the nutrients in kidney tissue differ from those found in blood and that nutrients found in kidney tumors are largely dictated by factors shared with normal kidney tissue. Those data could be useful as a resource to support further study and modeling of the local environment of RCC and normal kidney physiology.

    3. eLife assessment

      This study provides an important finding that the local abundance of metabolites impacts the biology of the tumor microenvironment by utilizing kidney tumors from patients and adjacent normal tissues. The evidence supporting the claims of the authors is convincing. The work will be of interest to the research community working on metabolism and kidney cancer especially.

    4. Reviewer #1 (Public Review):

      (a) Summary: The present study addresses how the local abundance of metabolites impacts the biology of the tumor microenvironment. The authors enroll patients harboring kidney tumors and use freshly resected tumor material for metabolic studies. Specifically, the authors separate the adjacent normal kidney tissue from the tumor material and then harvest the interstitial fluid from the normal kidney (KIF) or the tumor (TIF) for quantitative metabolomics. The plasma samples from the patient are used for comparison. Additionally, the authors also compare metabolite levels in the plasma of patients with kidney versus lung cancer (or healthy donors) to address how specific tumor types might contribute to circulating levels of metabolites. Altogether, the authors find that the metabolite levels in the KIF and TIF, although vastly different than plasma, are largely overlapping. These findings indicate that tissue of origin appears to have a stronger role in determining the local metabolic environment of tumors than the genetics or biochemistry of the tumor itself.

      (b) Strengths: The biggest strength of the current study is the use of human patient-derived samples. The cohort size (~50 patients) is relatively large, which adds to the rigor of the work. The work also relies on a small pool of metabolites that can be quantitatively measured using methods developed by the authors. Focusing on a smaller metabolic pool also likely increases the signal-to-noise ratio and enables the more rigorous determination of any underlying differences. The manuscript is well-written and highlights both the significance of the findings and also acknowledges many of the caveats. The recognition of the metabolic contributions of surrounding normal tissue as the primary driver of local nutrient abundance is a novel finding in the work, which can be leveraged in future studies.

      (c) Weaknesses: The work has certain caveats, some of which have been already recognized by the authors. These include the use of steady-state metabolites and the possibility of cross-contamination of some TIF into the adjacent KIF. This study is also unable to distinguish the mechanisms driving the metabolic changes in KIF/TIF relative to circulating levels in plasma.

      The relative similarity of KIF and TIF is quite surprising. However, this interpretation is presently based on sampling of only ~100 polar metabolites and ~200 lipid molecules. It is, perhaps, possible that future technological developments that enable more comprehensive quantitative metabolic profiling might distinguish between KIF and TIF composition.

      In vitro tissue culture is recognized to suffer from 'non-physiological' nutrient dependencies, which are impacted by the composition of culture media. Thus, in vivo studies remain our current gold-standard in mechanistic studies of tumor metabolism. It is presently unclear whether the findings of this work will be recapitulated in any of the kidney cancer in vivo models and thus be functionally testable.

      The authors have acknowledged these caveats and where possible provided textual clarifications and updated figures in their revised manuscript. Future work will be required to model these changes in animal models.

    5. Reviewer #2 (Public Review):

      The study employs quantitative metabolomic and lipidomic analyses to scrutinize tumor interstitial fluid (TIF), adjacent normal kidney interstitial fluid (KIF), and plasma samples from renal cell carcinoma (RCC) patients. The authors delve into the intricate world of renal cell carcinoma and its tumor microenvironment, shedding light on the factors that shape nutrient availability in both cancerous and adjacent normal tissues. The authors prove that non-cancer-driven tissue factors play a dominant role in shaping nutrient availability in RCC. This finding opens up new avenues for research, suggesting that the tumor microenvironment is profoundly influenced by factors beyond the presence of cancer cells. This study not only contributes valuable insights into RCC metabolism but also prompts a reevaluation of the factors governing nutrient availability in tumor microenvironments more broadly. Overall, it represents a significant step forward in our understanding of the intricate interplay between cancer and its surrounding milieu.

      The study is overall well-constructed, including appropriate analysis. Likewise, the manuscript is written clearly and supported by high-quality figures. Since the authors exclusively employed samples from RCC patients and did not include kidney interstitial fluid and plasma samples from healthy individuals, we cannot accurately assess the true significance and applicability of the results until the role of cancer cells in reshaping KIF is understood. In essence, some metabolite levels in the tumor interstitial fluid did not show an increase or decrease compared to the adjacent normal kidney interstitial fluid. However, the levels of these metabolites in both TIF and KIF might be higher or lower than those in kidney interstitial fluid from healthy individuals, and the roles of these metabolites should not be overlooked. Similar concerns extend to plasma levels, emphasizing the importance of metabolites that synchronously change in RCC TIF, KIF, and plasma, whether elevated or reduced.

    1. eLife assessment

      This important study presents a novel pipeline for the large-scale genomic prediction of members of the non-ribosomal peptide group of pyoverdines based on a dataset from nearly 2000 Pseudomonas genomes. The advance presented in this study is largely based on solid evidence, although some main claims are only incompletely supported. This study on bacterial siderophores has broad theoretical and practical implications beyond a singular subfield.

    2. Reviewer #1 (Public Review):

      The manuscript introduces a bioinformatic pipeline designed to enhance the structure prediction of pyoverdines, revealing an extensive and previously overlooked diversity in siderophores and receptors. Utilizing a combination of feature sequence and phylogenetic approaches, the method aims to address the challenging task of predicting structures based on dispersed gene clusters, particularly relevant for pyoverdines.

      Predicting structures based on gene clusters is still challenging, especially pyoverdines as the gene clusters are often spread to different locations in the genome. An improved method would indeed be highly useful, and the diversity of pyoverdine gene clusters and receptors identified is impressive.

      However, so far the method basically aligns the structural genes and domains involved in pyoverdine biosynthesis and then predicts A domain specificity to predict the encoded compounds. Both methods are not particularly new as they are included in other tools such as PRISM (10.1093/nar/gkx320) or Sandpuma (https://doi.org/10.1093/bioinformatics/btx400) among others. The study claims superiority in A domain prediction compared to existing tools, yet the support is currently limited, relying on a comparison solely with AntiSMASH. A more extensive and systematic comparison with other tools is needed.

      Additionally, in contradiction to the authors' claims, the method's applicability seems constrained to well-known and widely distributed gene clusters. The absence of predictions for new amino acids raises concerns about its generalizability to NRPS beyond the studied cases.

      The manuscript lacks clarity on how the alignment of structural genes operates when dealing with multiple NRPS gene clusters on different genome contigs. How would the alignment of each BGC work?

      Another critical concern is that a main challenge in NRPS structure prediction is not the backbone prediction but rather the prediction of tailoring reactions, which is not addressed in the manuscript at all, and this limitation extensively restricts the applicability of the method.

      The manuscript presents a potentially highly useful bioinformatic pipeline for pyoverdine structure prediction, showcasing a commendable exploration of siderophore diversity. However, some of the claims made remain unsubstantiated. Overall, while the study holds promise, further validation and refinement are required to fulfill its potential impact on the field of bioinformatic structure prediction.

    3. Reviewer #2 (Public Review):

      Pyoverdines, siderophores produced by many Pseudomonads, are one of the most diverse groups of specialized metabolites and are frequently used as model systems. Thousands of Pseudomonas genomes are available, but large-scale analyses of pyoverdines are hampered by the biosynthetic gene clusters (BGCs) being spread across multiple genomic loci and existing tools' inability to accurately predict amino acid substrates of the biosynthetic adenylation (A) domains. The authors present a bioinformatics pipeline that identifies pyoverdine BGCs and predicts the A domain substrates with high accuracy. They tackled a second challenging problem by developing an algorithm to differentiate between outer membrane receptor selectivity for pyoverdines versus other siderophores and substrates. The authors applied their dataset to thousands of Pseudomonas strains, producing the first comprehensive overview of pyoverdines and their receptors and predicting many new structural variants.

      The A domain substrate prediction is impressive, including the correction of entries in the MIBiG database. Their high accuracy came from a relatively small training dataset of A domains from 13 pyoverdine BGCs. The authors acknowledge that this small dataset does not include all substrates, and correctly point out that new sequence/structure pairs can be added to the training set to refine the prediction algorithm. The authors could have been more comprehensive in finding their training set data. For instance, the authors claim that histidine "had not been previously documented in pyoverdines", but the sequenced strain P. entomophila L48 incorporates His (10.1007/s10534-009-9247-y). The workflow cannot differentiate between different variants of Asp and OHOrn, and it's not clear if this is a limitation of the workflow, the training data, or both. The prediction workflow holds up well in Burkholderiales A domains; however, the authors fail to mention in the main text that they achieved these numbers by adding more A domains to their training set.

      To validate their predictions, they elucidated structures of several new pyoverdines, and their predictions performed well. However, the authors did not include their MS/MS data, making it impossible to validate their structures. In general, the biggest limitation of the submitted manuscript is the near-empty methods section, which does not include any experimental details for the 20 strains or details of the annotation pipeline (such as "Phydist" and "Syndist"). The source code also does not contain the requisite information to replicate the results or re-use the pipeline, such as the antiSMASH version and required flags. That said, skimming through the source code and data (kindly provided upon request) suggests that the workflow itself is sound and a clear improvement over existing tools for pyoverdine BGC annotation.

      Predicting outer membrane receptor specificity is likewise a challenging problem and the authors have made a promising achievement by finding specific gene regions that differentiate the pyoverdine receptor FpvA from FpvB and other receptor families. Their predictions were not tested experimentally, but the finding that only predicted FpvA receptors were proximate to the biosynthesis genes lends credence to the predictive power of the workflow. The authors find predicted pyoverdine receptors across an impressive 468 genera, an exciting finding for expanding the role of pyoverdines as public goods beyond Pseudomonas. However, whether or not these receptors can recognize pyoverdines (and if so, which structures!) remains to be investigated.

      In all, the authors have assembled a rich dataset that will enable large-scale comparative genomic analyses. This dataset could be used by a variety of researchers, including those studying natural product evolution, public good eco/evo dynamics, and NRPS engineering.

    4. Reviewer #3 (Public Review):

      Summary:

      Secondary metabolites are produced by numerous microorganisms and have important ecological functions. A major problem is that neither the function of a secondary metabolite enzyme nor the resulting metabolite can be precisely predicted from gene sequence data.

      In the current paper, the authors addressed this highly relevant question.

      The authors developed a bioinformatic pipeline to reconstruct the complete secondary metabolism pathway of pyoverdines, a class of iron-scavenging siderophores produced by Pseudomonas spp. These secondary metabolites are biosynthesized by a series of non-ribosomal peptide synthetases and require a specific receptor (FpvA) for uptake. The authors combined knowledge-guided learning with phylogeny-based methods to predict with high accuracy encoding NRPSs, substrate specificity of A domains, pyoverdine derivatives, and receptors. After validation, the authors tested their pipeline with sequence data from 1664 phylogenetically distinct Pseudomonas strains and were able to determine 18,292 enzymatic A domains involved in pyoverdine synthesis, reliably predicted 97.8% of their substrates, identified 188 different pyoverdine molecule structures and 4547 FpvA receptor variants belonging to 94 distinct groups. All the results and predictions were clearly superior to predictions that are based on antiSMASH. Novel pyoverdine structures were elucidated experimentally by UHPLC-HR-MS/MS.

      To assess the extendibility of the pipeline, the authors chose Burkholderiales as a test case. The pipeline maintained a high prediction accuracy of 83% within Burkholderiales, which was higher than that of antiSMASH (67%).

      Together, the authors concluded that supervised learning based on a few known compounds produced by species from the same genus probably outperforms generalized prediction algorithms trained on many products from a diverse set of microbes for NRPS substrate predictions. As a result, they also show that both pyoverdine and receptor diversity have been vastly underestimated.

      Strengths:

      The authors developed a very useful bioinformatic pipeline with high accuracy for secondary metabolites, at least for pyoverdines. The pipeline has several advantages compared to existing pipelines like the extensively used antiSMASH program, e.g. it can be applied to draft genomes, shows reduced erroneous gene predictions, etc. The accuracy was impressively demonstrated by the discovery of novel pyoverdines whose structures were experimentally substantiated by UHPLC-HR-MS/MS.

      The manuscript is very well written, and the data and the description of the generation of pipelines are easy to follow.

      Weaknesses:

      The only major comment I have is the uncertainty of whether the pipeline can be applied to more complex non-ribosomal peptides. In the current study, the authors only applied their pipeline to a very narrow field, i.e., pyoverdines of Pseudomonas and Burkholderia strains.

    1. Author response:

      eLife assessment

      This study provides valuable evidence indicating that Syngap1 regulates the synaptic drive and membrane excitability of parvalbumin- and somatostatin-positive interneurons in the auditory cortex. Since haplo-insufficiency of Syngap1 has been linked to intellectual disabilities without a well-defined underlying cause, the central question of this study is timely. However, the support for the authors' conclusions is incomplete in general and some parts of the experimental evidence are inadequate. Specifically, the manuscript requires further work to properly evaluate the impact on synaptic currents, intrinsic excitability parameters, and morphological features.

      We are happy that the editors found that our study provides valuable evidence and that the central question is timely. We thank the reviewers for their detailed comments and suggestions. Below, we provide a point-by-point answer (in blue) to the specific comments and indicate the changes to the manuscript and the additional experiments we plan to perform to answer these comments.

      Public Reviews:

      Reviewer #1 (Public Review):

      The study is designed to assess the role of Syngap1 in regulating the physiology of the MGE-derived PV+ and SST+ interneurons. Syngap1 is associated with some mental health disorders, and PV+ and SST+ cells are the focus of many previous and likely future reports from studies of interneuron biology, highlighting the translational and basic neuroscience relevance of the authors' work.

      Strengths of the study are using well-established electrophysiology methods and the highly controlled conditions of ex vivo brain slice experiments combined with a novel intersectional mouse line, to assess the role of Syngap1 in regulating PV+ and SST+ cell properties. The findings revealed that in the mature auditory cortex, Syngap1 haploinsufficiency decreases both the intrinsic excitability and the excitatory synaptic drive onto PV+ neurons from Layer 4. In contrast, SST+ interneurons were mostly unaffected by Syngap1 haploinsufficiency. Pharmacologically manipulating the activity of voltage-gated potassium channels of the Kv1 family suggested that these channels contributed to the decreased PV+ neuron excitability by Syngap insufficiency. These results therefore suggest that normal Syngap1 expression levels are necessary to produce normal PV+ cell intrinsic properties and excitatory synaptic drive, albeit, perhaps surprisingly, inhibitory synaptic transmission was not affected by Syngap1 haploinsufficiency.

      Since the electrophysiology experiments were performed in the adult auditory cortex, while Syngap1 expression was potentially affected since embryonic stages in the MGE, future studies should address two important points that were not tackled in the present study. First, what is the developmental time window in which Syngap1 insufficiency disrupted PV+ neuron properties? Albeit the embryonic Syngap1 deletion most likely affected PV+ neuron maturation, the properties of Syngap-insufficient PV+ neurons do not resemble those of immature PV+ neurons. Second, whereas the observation that Syngap1 haploinsufficiency affected PV+ neurons in auditory cortex layer 4 suggests auditory processing alterations, MGE-derived PV+ neurons populate every cortical area. Therefore, without information on whether Syngap1 expression levels are cortical area-specific, the data in this study would predict that by regulating PV+ neuron electrophysiology, Syngap1 normally controls circuit function in a wide range of cortical areas, and therefore a range of sensory, motor and cognitive functions. These are relatively minor weaknesses regarding interpretation of the data in the present study that the authors could discuss.

      We agree with the reviewer on the proposed open questions, which we will certainly discuss in the revised manuscript we are preparing. We do have experimental evidence suggesting that Syngap1 mRNA is expressed by PV+ and SST+ neurons in different cortical areas, during early postnatal development and in adulthood; therefore, we agree that it will be important, in future experiments, to tackle the question of when the observed phenotypes arise.

      Reviewer #2 (Public Review):

      Summary:

      In this manuscript, the authors investigated how partial loss of SynGap1 affects inhibitory neurons derived from the MGE in the auditory cortex, focusing on their synaptic inputs and excitability. While haploinsufficiency of SynGap1 is known to lead to intellectual disabilities, the underlying mechanisms remain unclear.

      Strengths:

      The questions are novel.

      Weaknesses:

      Despite the interesting and novel questions, there are significant concerns regarding the experimental design and data quality, as well as potential misinterpretations of key findings. Consequently, the current manuscript fails to contribute substantially to our understanding of SynGap1 loss mechanisms and may even provoke unnecessary controversies.

      Major issues:

      (1) One major concern is the inconsistency and confusion in the intermediate conclusions drawn from the results. For instance, while the sEPSC data indicates decreased amplitude in PV+ and SOM+ cells in cHet animals, the frequency of events remains unchanged. In contrast, the mEPSC data shows no change in amplitudes in PV+ cells, but a significant decrease in event frequency. The authors conclude that the former observation implies decreased excitability. However, traditionally, such observations on mEPSC parameters are considered indicative of presynaptic mechanisms rather than changes of network activity. The subsequent synapse counting experiments align more closely with the traditional conclusions. This issue can be resolved by rephrasing the text. However, it would remain unexplained why the sEPSC frequency shows no significant difference. If the majority of sEPSC events were indeed mediated by spiking (which is blocked by TTX), the average amplitudes and frequency of mEPSCs should be substantially lower than those of sEPSCs. Yet, they fall within a very similar range, suggesting that most sEPSCs may actually be independent of action potentials. But if that was indeed the case, the changes of purported sEPSC and mEPSC results should have been similar.

      We understand the reviewer’s perspective; indeed, we asked ourselves the very same question regarding why the sEPSC and mEPSC frequency fall within a similar range when we analysed neuron means (bar graphs). We have already recorded sEPSCs followed by mEPSCs from several PV neurons (control and cHet) and are in the process of analyzing the data. We will add this data to the revised version of the manuscript. We will also rephrase the manuscript to present multiple potential interpretations of the data.

      We hope that we have correctly interpreted the reviewer's concern. However, if the question is why sEPSC amplitude but not frequency is affected in cHet vs ctrl, then the reviewer’s comment is perhaps based on the assumption that the amplitude and frequency of miniature events should be lower for all events compared to those observed for spontaneous events. However, it's essential to note that changes in the mean amplitude of sEPSCs are primarily driven by alterations in large sEPSCs (>9-10pA, as shown in cumulative probability in Fig. 1b right), with smaller ones being relatively unaffected. Consequently, a reduction in sEPSC amplitude may not necessarily result in a significant decrease in frequency since their values likely remain above the detection threshold of 3 pA. This could explain the lack of a significant decrease in the average inter-event interval of sEPSCs (as depicted in Fig. 1b left).
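
      The detection-threshold argument can be illustrated with a toy calculation. The sketch below uses invented amplitudes (not data from this study) and assumes, purely for illustration, that only events larger than 9 pA are reduced, by a hypothetical 20%; the 3 pA detection threshold is the one stated above. Under these assumptions the mean amplitude drops while the number of detected events, and hence the event frequency, stays the same.

```python
# Toy illustration only: amplitudes are invented, not recorded data.
DETECTION_THRESHOLD_PA = 3.0   # detection threshold stated in the response
LARGE_EVENT_CUTOFF_PA = 9.0    # "large" events, per the >9-10 pA range above
SCALE_LARGE = 0.8              # hypothetical 20% reduction of large events


def shrink_large_events(amplitudes):
    """Scale down only the events above the large-event cutoff."""
    return [a * SCALE_LARGE if a > LARGE_EVENT_CUTOFF_PA else a
            for a in amplitudes]


def detected(amplitudes):
    """Keep events that remain at or above the detection threshold."""
    return [a for a in amplitudes if a >= DETECTION_THRESHOLD_PA]


def mean(values):
    return sum(values) / len(values)


control = [4.0, 5.5, 7.0, 10.0, 14.0, 20.0, 30.0]
reduced = shrink_large_events(control)

n_control = len(detected(control))
n_reduced = len(detected(reduced))
mean_control = mean(detected(control))
mean_reduced = mean(detected(reduced))
```

      Because every reduced event in this toy set still exceeds 3 pA, the event count is unchanged even though the mean amplitude is lower.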

      If the question is whether we should see the same parameters affected by the genetic manipulation in both sEPSC and mEPSC, then another critical consideration is the involvement of the releasable pool in mEPSCs versus sEPSCs. Current knowledge suggests that activity-dependent and -independent release may not necessarily engage the same pool of vesicles or target the same postsynaptic sites. This concept has been extensively explored (reviewed in Kavalali, 2015). Consequently, while we may have traditionally interpreted activity-dependent and -independent data assuming they utilize the same pool, this is no longer accurate. The current discussion in the field revolves around understanding the mechanisms underlying such phenomena. Therefore, comparisons between sEPSCs and mEPSCs may not yield conclusive data but rather speculative interpretations. For a rigorous analysis, particularly in this context involving thousands of events, it is essential to assess these data sets (mEPSCs vs sEPSCs) separately and provide cumulative probability curves. This approach allows for a more comprehensive understanding of the underlying distributions and helps to elucidate any potential differences between the two types of events. We will rephrase the text, and as mentioned above, add additional data, to better reflect these considerations.

      (2) Another significant concern is the quality of synapse counting experiments. The authors attempted to colocalize pre- and postsynaptic markers Vglut1 and PSD95 with PV labelling. However, several issues arise. Firstly, the PV labelling seems confined to soma regions, with no visible dendrites. Given that the perisomatic region only receives a minor fraction of excitatory synapses, this labeling might not accurately represent the input coverage of PV cells. Secondly, the resolution of the images is insufficient to support clear colocalization of the synaptic markers. Thirdly, the staining patterns are peculiar, with PSD95 puncta appearing within regions clearly identified as somas by Vglut1, hinting at possible intracellular signals. Furthermore, PSD95 seems to delineate potential apical dendrites of pyramidal cells passing through the region, yet Vglut1+ partners are absent in these segments, which are expected to be the marker of these synapses here. Additionally, the cumulative density of Vglut2 and Vglut1 puncta exceeds expectations, and it's surprising that subcortical fibers labeled by Vglut2 are comparable in number to intracortical Vglut1+ axon terminals. Ideally, N(Vglut1)+N(Vglut2) should be equal or less than N(PSD95), but this is not the case here. Consequently, these results cannot be considered reliable due to these issues.

      We apologize, as it appears that the images we provided have caused confusion. The selected images represent a single focal plane of a confocal stack, which was visually centered on the PV cell somata. We chose just one confocal plane because we thought it showed more clearly the apposition of presynaptic and postsynaptic immunolabeling around the somata. In the revised version of the manuscript, we will provide higher magnification images, which will clearly show how we identified and selected the region of interest for the quantification of colocalized synaptic markers. In our confocal stacks, we can also identify PV immunolabeled dendrites and colocalized vGlut1/PSD95 or vGlut2/PSD95 puncta on them; but these do not appear in the selected images because, as explained, only one focal plane, centered on the PV cell somata, was shown.

      We acknowledge the reviewer's point that in PV+ cells the majority of excitatory inputs are formed onto dendrites; however, we focused on the somatic excitatory inputs to PV cells, because despite their lower number, they produce much stronger depolarization in PV neurons than dendritic excitatory inputs (Hu et al., 2010; Norenberg et al., 2010). Further, quantification of perisomatic putative excitatory synapses is more reliable since, by using PV immunostaining, we can visualize the soma and larger primary dendrites, but smaller, higher-order dendrites are not always detectable. Of note, PV-positive somata receive more excitatory synapses than SST-positive and pyramidal neuron somata, as found by electron microscopy studies in the visual cortex (Hwang et al., 2021; Elabbady et al., 2024).

      Regarding the comment on the density of vGlut1 and vGlut2 puncta, the numbers appear high and similar between the two markers because we present normalized data (cHet normalized to their control values for each set of immunolabelling) to clearly represent the differences between genotypes. This information is present in the legends, but we apologize for not clearly explaining it in the methods section. We will provide a more detailed explanation of our methods in the revised manuscript.

      Briefly, immunostained sections were imaged using a Leica SP8-STED confocal microscope with a 63x (NA 1.4) objective at 1024 × 1024 pixels, z-step = 0.3 μm, stack size of ~15 μm. Images were acquired from the auditory cortex from at least 3 coronal sections per animal. All confocal parameters were kept constant throughout the acquisition of an experiment. All images shown in the figures are from a single confocal plane. To quantify the number of vGlut1/PSD95 or vGlut2/PSD95 putative synapses, images were exported as TIFF files and analyzed using Fiji (ImageJ) software. We first manually outlined the profile of each PV cell soma (identified by PV immunolabeling). At least 4 innervated somata were selected in each confocal stack. We then used a series of custom-made macros in Fiji as previously described (Chehrazi et al., 2023). After background subtraction (rolling value = 10) and Gaussian blur (σ value = 2) filtering, the stacks were binarized and vGlut1/PSD95 or vGlut2/PSD95 puncta were independently identified around the perimeter of a targeted soma in the focal plane with the highest soma circumference. Puncta were quantified after filtering particles for size (between 0 and 2 μm²) and circularity (between 0 and 1). Data quantification was done by investigators blind to the genotype, and data are presented normalized to control values for each experiment.
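
      The final size and circularity filter can be sketched as follows. This is not the custom Fiji macro used for the analysis; it is a minimal Python illustration that assumes the standard ImageJ definition of circularity, 4π × area / perimeter², and uses hypothetical particle measurements. Only the bounds (0-2 μm² and 0-1) come from the description above; capping circularity at 1 is a choice made for this sketch.

```python
import math

AREA_RANGE_UM2 = (0.0, 2.0)      # size bounds stated in the text
CIRCULARITY_RANGE = (0.0, 1.0)   # circularity bounds stated in the text


def circularity(area, perimeter):
    """Circularity as 4*pi*area / perimeter**2, capped at 1 in this sketch."""
    if perimeter <= 0:
        return 0.0
    return min(1.0, 4.0 * math.pi * area / perimeter ** 2)


def keep_particle(area, perimeter):
    """True if a particle passes both the size and the circularity filter."""
    c = circularity(area, perimeter)
    return (AREA_RANGE_UM2[0] <= area <= AREA_RANGE_UM2[1]
            and CIRCULARITY_RANGE[0] <= c <= CIRCULARITY_RANGE[1])


# Hypothetical particles as (area in um^2, perimeter in um); not study data.
particles = [
    (0.5, 2.6),   # small, roughly round punctum
    (1.8, 6.0),   # larger, more elongated punctum
    (3.5, 7.0),   # too large: rejected by the size filter
]
kept = [p for p in particles if keep_particle(*p)]
```

      With circularity bounds of 0-1 every shape is admitted, so in this configuration the size bound is the criterion that actually excludes particles.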

      (3) One observation from the minimal stimulation experiment was concluded by an unsupported statement. Namely, the change in the onset delay cannot be attributed to a deficit in the recruitment of PV+ cells, but it may suggest a change in the excitability of TC axons.

      We agree with the reviewer; please see our answer to the point below.

      (4) The conclusions drawn from the stimulation experiments are also disconnected from the actual data. To make conclusions about TC release, the authors should have tested release probability using established methods, such as paired-pulse changes. Instead, the only observation here is a change in the AMPA components, which remained unexplained.

      We agree with the reviewer and we will perform additional paired-pulse ratio experiments at different intervals. We will rephrase the discussion and our interpretation and potential hypothesis according to the data obtained from this new experiment.

      (5) The sampling rate of CC recordings is insufficient to resolve the temporal properties of the APs. Therefore, the phase-plots cannot be interpreted (e.g. axonal and somatic AP components are not clearly separated), raising questions about how AP threshold and peak were measured. The low sampling rate also masks the real derivative of the AP signals, making them apparently faster.

      We acknowledge that a higher sampling rate could offer a more detailed analysis of the action potential waveform. However, in the context of action potential analysis, it is acceptable to use sampling rates ranging from 10 kHz to 20 kHz (Golomb et al., 2007; Stevens et al., 2021; Zhang et al., 2023), which are considered adequate in the context of the present study. Indeed, our study aims to evaluate "relative" differences in the electrophysiological phenotype when comparing groups following a specific genetic manipulation. A sampling rate of 10 kHz is commonly employed in similar studies, including those conducted by our collaborator and co-author S. Kourrich (e.g., Kourrich and Thomas 2009, Kourrich et al., 2013), as well as others (Russo et al., 2013; Ünal et al., 2020; Chamberland et al., 2023).

      Despite being acquired at a lower sampling rate than potentially preferred by the reviewer, our data clearly demonstrate significant differences between the experimental groups, especially for parameters that are negligibly or not affected by the sampling rate used here (e.g., #spikes/input, RMP, Rin, Cm, Tm, AP amplitude, AP latency, AP rheobase).

      Regarding the phase-plots, we agree that a higher sampling rate would have resulted in smoother curves and more accurate absolute values. However, the differences were sufficiently pronounced to discern the relative variations in action potential waveforms between the experimental groups.
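
      To make the sampling-rate point concrete, the sketch below builds a phase plot (dV/dt versus V) from a sampled voltage trace using finite differences. At 10 kHz the sample interval is 0.1 ms, so each dV/dt value is an average over 0.1 ms. The threshold criterion used here (first sample at which dV/dt exceeds 20 mV/ms) and the synthetic trace are assumptions for illustration only; they are not the procedure or the data of this study.

```python
SAMPLING_RATE_HZ = 10_000
DT_MS = 1000.0 / SAMPLING_RATE_HZ   # 0.1 ms between samples at 10 kHz
DVDT_CRITERION = 20.0               # mV/ms; assumed threshold criterion


def phase_plot(voltage_mv, dt_ms=DT_MS):
    """Return (V, dV/dt) pairs computed with forward differences."""
    return [(voltage_mv[i], (voltage_mv[i + 1] - voltage_mv[i]) / dt_ms)
            for i in range(len(voltage_mv) - 1)]


def ap_threshold(voltage_mv, dt_ms=DT_MS, criterion=DVDT_CRITERION):
    """Voltage at the first sample where dV/dt exceeds the criterion."""
    for v, dvdt in phase_plot(voltage_mv, dt_ms):
        if dvdt > criterion:
            return v
    return None


# Synthetic upstroke (mV), one value per 0.1 ms sample; not recorded data.
trace = [-60.0, -59.5, -58.5, -56.0, -45.0, -20.0, 10.0, 25.0, 15.0, -10.0]
threshold = ap_threshold(trace)
peak = max(trace)
```

      A higher sampling rate shortens the interval over which each difference is averaged, which mainly changes the fast upstroke portion of the plot.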

      A related issue is that the Methods section lacks essential details about the recording conditions, such as bridge balance and capacitance neutralization.

      We indeed performed bridge balance and neutralized the capacitance before starting every recording. We will add the information in the methods.

      (6) Interpretation issue: One of the most fundamental measures of cellular excitability, the rheobase, was differentially affected by cHet in BCshort and BCbroad. Yet, the authors concluded that the cHet-induced changes in the two subpopulations are common.

      We are uncertain if we have correctly interpreted the reviewer's comment. While we observed distinct impacts on the rheobase (Fig. 7d and 7i), there seems to be a common effect on the AP threshold (Fig. 7c and 7h), as interpreted and indicated in the final sentence of the results section for Figure 7 (page 12). If our response does not address the reviewer's comment adequately, we would greatly appreciate it if the reviewer could rephrase their feedback.

      (7) Design issue:

      The Kv1 blockade experiments are disconnected from the main manuscript. There is no experiment that shows the causal relationship between changes in DTX and cHet cells. It is only an interesting observation on AP halfwidth and threshold. However, how they affect rheobase, EPSCs, and other topics of the manuscript are not addressed in DTX experiments.

      Furthermore, Kv1 currents were never measured in this work, nor was the channel density tested. Thus, the DTX effects are not necessarily related to changes in PV cells, which can potentially generate controversies.

      While we acknowledge the reviewer's point that Kv1 currents and density weren't specifically tested, an important insight provided by Fig. 5 is the prolonged action potential latency. This delay is significantly influenced by slowly inactivating subthreshold potassium currents, namely the D-type K+ current. It's worth noting that D-type current is primarily mediated by members of the Kv1 family. The literature supports a role for Kv1.1-containing channels in modulating responses to near-threshold stimuli in PV cells (Wang et al., 1994; Goldberg et al., 2008; Zurita et al., 2018). However, we recognize that besides the Kv1 family, other families may also contribute to the observed changes.

      To address this concern, we will revise our interpretation. We will opt for the more accurate term "D-type K+ current" and only speculate about the involved channel family in the discussion. It is not our intention to open unnecessary controversy, but to present the data we obtained. We believe this approach and rephrasing the discussion as proposed will prevent unnecessary controversy and instead foster fruitful discussions.

      (8) Writing issues:

      Abstract:

      The auditory system is not mentioned in the abstract.

      One statement in the abstract is unclear. What is meant by "targeting Kv1 family of voltage-gated potassium channels was sufficient..."? "Targeting" could refer to altered subcellular targeting of the channels, simple overexpression/deletion in the target cell population, or targeted mutation of the channel, etc. Only the final part of the Results reveals that none of the above applies and that these channels were instead blocked selectively.

      We agree with the reviewer and we will rephrase the abstract accordingly.

      Introduction:

      There is a contradiction in the introduction. The second paragraph describes in detail the distinct contribution of PV and SST neurons to auditory processing. But at the end, the authors state that there are "relatively few reports on PV+ and SST+ cell-intrinsic and synaptic properties in adult auditory cortex". Please be more specific about the unknown properties.

      We agree with the reviewer and we will rephrase more specifically.

      (9) The introduction emphasizes the heterogeneity of PV neurons, which certainly influences the interpretation of the results of the current manuscript. However, the initial experiments did not consider this and handled all PV cell data as a pooled population.

      In the initial experiments, we handled all PV cell data together because we wanted to be rigorous and not make assumptions/biases on the different PV cells, which in later experiments we were to distinguish based on the intrinsic properties alone. We will make this point clear in the revised manuscript.

      (10) The interpretation of the results strongly depends on unpublished work, which potentially provides the physiological and behavioral contexts about the role of GABAergic neurons in SynGap-haploinsufficiency. The authors cite their own unpublished work, without explaining the specific findings and relation to this manuscript.

      We agree with the reviewer and apologize for the lack of clarity. Our unpublished work is in revision right now. We will provide more information and update references in the revised version of this manuscript.

      (11) The introduction of Sholl analysis experiments mentions SOM staining; however, there is no such data about this cell type in the manuscript.

      We apologize for the error; we will replace SOM with SST (SOM and SST are two commonly used acronyms for somatostatin-expressing interneurons).

      Reviewer #3 (Public Review):

      This paper compares the synaptic and membrane properties of two main subtypes of interneurons (PV+, SST+) in the auditory cortex of control mice vs mutants with Syngap1 haploinsufficiency. The authors find differences at both levels, although predominantly in PV+ cells. These results suggest that altered PV-interneuron functions in the auditory cortex may contribute to the network dysfunction observed in Syngap1 haploinsufficiency-related intellectual disability. The subject of the work is interesting, and most of the approach is direct and quantitative, which are major strengths. There are also some weaknesses that reduce its impact for a broader field.

      (1) The choice of mice with conditional (rather than global) haploinsufficiency makes the link between the findings and Syngap1 relatively easy to interpret, which is a strength. However, it also remains unclear whether an entire network with the same mutation at a global level (affecting also excitatory neurons) would react similarly.

      The reviewer raises an interesting and pertinent open question which we will address in the discussion of the revised paper.

      (2) There are some (apparent?) inconsistencies between the text and the figures. Although the authors appear to have used a sophisticated statistical analysis, some datasets in the illustrations do not seem to match the statistical results. For example, neither Fig 1g nor Fig 3f (eNMDA) reach significance despite large differences.

      We respectfully disagree; we do not think the text and figures are inconsistent. In the cited examples, the large apparent differences in mean values do not reach significance because of the large variability in the data; further, we did not exclude any data points, because we wanted to be rigorous. In particular, for Fig. 1g, statistical analysis shows a significant increase in the inter-mEPSC interval (*p=0.027, LMM) when all events are considered (cumulative probability plots), while there is no significant difference in the inter-mEPSC interval for the inter-cell mean comparison (inset, p=0.354, LMM). The inter-cell mean comparison does not show a difference with the Mann-Whitney test either (p=0.101; the data are not normally distributed, hence the choice of the Mann-Whitney test). For Fig. 3f (eNMDA), the higher mean value for the cHet versus the control is driven by two data points which are particularly high, while the other data points overlap with the control values. The Mann-Whitney test also shows no statistical difference (p=0.174).

      In the manuscript, discussion of the data is based on the results of the LMM analysis, which takes into account both the number of cells and the number of mice from which these cells were recorded. We chose this statistical approach because it does not rely on the assumption that cells recorded from the same mouse are independent variables. In the supplemental tables, we provided the results of the statistical analysis done with both LMM and the most commonly used Mann-Whitney test (for data that are not normally distributed) or t-test (for normally distributed data), for each data set.
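
      For readers less familiar with this approach, one common way to write such a model is given below. It assumes a random intercept per mouse; the exact model specification is not stated here, so this is an illustrative form rather than the model actually fitted.

```latex
% Illustrative linear mixed model, assuming a random intercept per mouse.
% y_{ij}: measurement from cell i recorded in mouse j
% g_j:    genotype of mouse j (0 = control, 1 = cHet)
% u_j:    mouse-specific random intercept shared by all cells of mouse j
\[
  y_{ij} = \beta_0 + \beta_1\, g_j + u_j + \varepsilon_{ij},
  \qquad
  u_j \sim \mathcal{N}(0, \sigma_u^2),
  \qquad
  \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
\]
```

      In this form, the genotype effect is tested through β₁, while the shared term u_j captures the correlation among cells recorded from the same mouse, which tests that treat every cell as independent ignore.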

      Also, the legend to Fig 9 indicates the presence of "a significant decrease in AP half-width from cHet in absence or presence of a-DTX", but the bar graph does not seem to show that.

      We apologize for our lack of clarity. In the legend of Fig. 9, we reported the statistical comparisons between 1) cHet mice in absence of a-DTX and control mice and 2) cHet mice in presence of a-DTX and control mice. We will rephrase the description of the results and the legend of the figure to avoid confusion.

      (3) The authors mention that the lack of differences in synaptic current kinetics is evidence against a change in subunit composition. However, in some Figures, for example, 3a, the kinetics of the recorded currents appear dramatically different. It would be important to know and compare the values of the series resistance between control and mutant animals.

      We agree with the reviewer that there appears to be a qualitative difference in eNMDA decay between conditions, although quantified eNMDA decay itself is similar between groups. We used a cutoff of 15 % for the series resistance (Rs), which is considerably more stringent than the cutoffs typically used in electrophysiology, the vast majority of which are between 20 and 30%. To address this concern, we re-examined Rs and compared it between groups, and found no difference for Rs in eAMPA (13.2±0.5 in WT n=16 cells, 7 mice vs 13.7±0.3 in cHet n=14 cells, 7 mice, p=0.432 LMM) and eNMDA (12.7±0.7 in WT n=6 cells, 3 mice vs 13.8±0.7 in cHet n=6 cells, 5 mice, p=0.231, LMM). Thus, the apparent qualitative difference in eNMDA decay stems from inter-cell variability rather than inter-group differences. Notably, this discrepancy between the trace (Fig. 3a) and the data (Fig. 3f, right) is largely due to inter-cell variability, particularly in eNMDA, where a higher but non-significant decay rate is driven by a couple of very high values (Fig. 3f, right). In the revised manuscript, we will show traces that better represent our findings.

      (4) A significant unexplained variability is present in several datasets. For example, the AP threshold for PV+ includes points between -50 and -40 mV, but also values at around -20/-15 mV, which seems too depolarized to generate healthy APs (Fig 5c, Fig 7c).

      We acknowledge the variability in AP threshold data, with some APs appearing too depolarized to generate healthy spikes. However, we meticulously examined each AP that spiked at these depolarized thresholds and found that other intrinsic properties (such as Rin, Vrest, AP overshoot, etc.) all indicate that these cells are healthy. Therefore, to maintain objectivity and provide unbiased data to the community, we opted to include them in our analysis. It's worth noting that similar variability has been observed in other studies (Bengtsson Gonzales et al., 2020; Bertero et al., 2020).

      Further, we conducted a significance test on AP threshold excluding these potentially unhealthy cells and found that the significant differences persist. After removing two outliers from the cHet group with values of -16.5 and 20.6 mV, we obtain: -42.6±1.01 mV in control, n=33, 15 mice vs -36.2±1.1 mV in cHet, n=38 cells, 17 mice, ***p<0.001, LMM. Thus, whether these cells are included or excluded, our interpretations and conclusions remain unchanged.

      We would like to clarify that these data have not been corrected for the junction potential. We will add this information in the revised version.

      (5) I am unclear as to how the authors quantified colocalization between VGluts and PSD95 at the low magnification shown in Supplementary Figure 2.

      We apologize for our lack of clarity. Although the analysis was done at high resolution, the figures were focused on showing multiple PV somata receiving excitatory inputs. We will add higher magnification figures and more detailed information in the methods of the revised version. Please also see our response to reviewer #2.

      (6) The authors claim that "cHet SST+ cells showed no significant changes in active and passive membrane properties", but this claim would seem to be directly refuted by the data of Fig 8f. In the absence of changes in either active or passive membrane properties, shouldn't the current/#AP plot remain unchanged?

      While we acknowledge the theoretical expectation that changes in intrinsic parameters should correlate with alterations in neuronal firing, the absence of differences in the parameters analyzed in this study should not overshadow the clear and significant decrease in firing rate observed in cHet SST+ cells. This decrease serves as a compelling indication of reduced intrinsic neuronal excitability. It's certainly possible that other intrinsic factors, not assessed in this study, may have contributed to this effect. However, exploring these mechanisms is beyond the scope of our current investigation. We will rephrase the discussion and add this limitation of our study in the revised version.

      (7) The plots used for the determination of AP threshold (Figs 5c, 7c, and 7h) suggest that the frequency of acquisition of current-clamp signals may not have been sufficient; this value is not included in the Methods section.

      This study utilized a sampling rate of 10 kHz, which is a standard rate for action potential analysis in the present context. We will describe the technical details more extensively in the Methods section of the revised manuscript we are preparing. While we acknowledge that a higher sampling rate could have enhanced the clarity of the phase plot, our recording conditions, as detailed in our response to Rev#2/comment#5, were suitable for the objectives of this study.

      Reference list

      Bengtsson Gonzales C, Hunt S, Munoz-Manchado AB, McBain CJ, Hjerling-Leffler J (2020) Intrinsic electrophysiological properties predict variability in morphology and connectivity among striatal Parvalbumin-expressing Pthlh-cells. Scientific Reports, 10, 15680. https://doi.org/10.1038/s41598-020-72588-1

      Bertero A, Zurita H, Normandin M, Apicella AJ (2020) Auditory long-range parvalbumin cortico-striatal neurons. Frontiers in Neural Circuits, 14, 45. http://doi.org/10.3389/fncir.2020.00045

      Chamberland S, Nebet ER, Valero M, Hanani M, Egger R, Larsen SB, Eyring KW, Buzsáki G, Tsien RW (2023) Brief synaptic inhibition persistently interrupts firing of fast-spiking interneurons. Neuron, 111, 1264–1281. http://doi.org/10.1016/j.neuron.2023.01.017

      Chehrazi P, Lee KKY, Lavertu-Jolin M, Abbasnejad Z, Carreño-Muñoz MI, Chattopadhyaya B, Di Cristo G (2023) The p75 Neurotrophin Receptor in Preadolescent Prefrontal Parvalbumin Interneurons Promotes Cognitive Flexibility in Adult Mice. Biol Psychiatry, 94, 310-321. https://doi.org/10.1016/j.biopsych.2023.04.019

      Elabbady L, Seshamani S, Mu S, Mahalingam G, Schneider-Mizell C, Bodor AL, Bae JA, Brittain D, Buchanan J, Bumbarger DJ, Castro MA, Dorkenwald S, Halageri A, Jia Z, Jordan C, Kapner D, Kemnitz N, Kinn S, Lee K, Li K…Collman F (2024) Perisomatic features enable efficient and dataset wide cell-type classifications across large-scale electron microscopy volumes. bioRxiv, https://doi.org/10.1101/2022.07.20.499976

      Goldberg EM, Clark BD, Zagha E, Nahmani M, Erisir A, Rudy B (2008) K+ Channels at the axon initial segment dampen near-threshold excitability of neocortical fast-spiking GABAergic interneurons. Neuron, 58, 387–400. https://doi.org/10.1016/j.neuron.2008.03.003

      Golomb D, Donner K, Shacham L, Shlosberg D, Amitai Y, Hansel D (2007) Mechanisms of firing patterns in fast-spiking cortical interneurons. PLoS Computational Biology, 3(8), e156. http://doi.org/10.1371/journal.pcbi.0030156

      Hu H, Martina M, Jonas P (2010). Dendritic mechanisms underlying rapid synaptic activation of fast-spiking hippocampal interneurons. Science, 327, 52–58. http://doi.org/10.1126/science.1177876

      Hwang YS, Maclachlan C, Blanc J, Dubois A, Petersen CH, Knott G, Lee SH (2021). 3D ultrastructure of synaptic inputs to distinct gabaergic neurons in the mouse primary visual cortex. Cerebral Cortex, 31, 2610–2624. http://doi.org/10.1093/cercor/bhaa378

      Kavalali E (2015) The mechanisms and functions of spontaneous neurotransmitter release. Nature Reviews Neuroscience, 16, 5–16. https://doi.org/10.1038/nrn3875

      Kourrich S, Thomas MJ (2009) Similar neurons, opposite adaptations: psychostimulant experience differentially alters firing properties in accumbens core versus shell. Journal of Neuroscience, 29, 12275-12283. http://doi.org/10.1523/JNEUROSCI.3028-09.2009

      Kourrich S, Hayashi T, Chuang JY, Tsai SY, Su TP, Bonci A (2013) Dynamic interaction between sigma-1 receptor and Kv1.2 shapes neuronal and behavioral responses to cocaine. Cell, 152, 236–247. http://doi.org/10.1016/j.cell.2012.12.004

      Norenberg A, Hu H, Vida I, Bartos M, Jonas P (2010) Distinct nonuniform cable properties optimize rapid and efficient activation of fast-spiking GABAergic interneurons. Proceedings of the National Academy of Sciences, 107, 894–9. http://doi.org/10.1073/pnas.0910716107

      Stevens SR, Longley CM, Ogawa Y, Teliska LH, Arumanayagam AS, Nair S, Oses-Prieto JA, Burlingame AL, Cykowski MD, Xue M, Rasband MN (2021) Ankyrin-R regulates fast-spiking interneuron excitability through perineuronal nets and Kv3.1b K+ channels. Elife, 10, e66491. http://doi.org/10.7554/eLife.66491

      Russo G, Nieus TR, Maggi S, Taverna S (2013) Dynamics of action potential firing in electrically connected striatal fast-spiking interneurons. Frontiers in Cellular Neuroscience, 7, 209. https://doi.org/10.3389/fncel.2013.00209

      Ünal CT, Ünal B, Bolton MM (2020) Low-threshold spiking interneurons perform feedback inhibition in the lateral amygdala. Brain Structure and Function, 225, 909–923. http://doi.org/10.1007/s00429-020-02051-4

      Wang H, Kunkel DD, Schwartzkroin PA, Tempel BL (1994) Localization of Kv1.1 and Kv1.2, two K channel proteins, to synaptic terminals, somata, and dendrites in the mouse brain. The Journal of Neuroscience, 14, 4588-4599. https://doi.org/10.1523/JNEUROSCI.14-08-04588.1994

      Zhang YZ, Sapantzi S, Lin A, Doelfel SR, Connors BW, Theyel BB (2023) Activity-dependent ectopic action potentials in regular-spiking neurons of the neocortex. Frontiers in Cellular Neuroscience, 17. https://doi.org/10.3389/fncel.2023.1267687

      Zurita H, Feyen PLC, Apicella AJ (2018) Layer 5 callosal parvalbumin-expressing neurons: a distinct functional group of GABAergic neurons. Frontiers in Cellular Neuroscience, 12, 53. https://doi.org/10.3389/fncel.2018.00053

    2. eLife assessment

      This study provides valuable evidence indicating that SynGap1 regulates the synaptic drive and membrane excitability of parvalbumin- and somatostatin-positive interneurons in the auditory cortex. Since haplo-insufficiency of SynGap1 has been linked to intellectual disabilities without a well-defined underlying cause, the central question of this study is timely. However, the support for the authors' conclusions is incomplete in general and some parts of the experimental evidence are inadequate. Specifically, the manuscript requires further work to properly evaluate the impact on synaptic currents, intrinsic excitability parameters, and morphological features.

    3. Reviewer #1 (Public Review):

      The study is designed to assess the role of Syngap1 in regulating the physiology of the MGE-derived PV+ and SST+ interneurons. Syngap1 is associated with some mental health disorders, and PV+ and SST+ cells are the focus of many previous and likely future reports from studies of interneuron biology, highlighting the translational and basic neuroscience relevance of the authors' work.

      Strengths of the study are the use of well-established electrophysiology methods and the highly controlled conditions of ex vivo brain slice experiments, combined with a novel intersectional mouse line, to assess the role of Syngap1 in regulating PV+ and SST+ cell properties. The findings revealed that in the mature auditory cortex, Syngap1 haploinsufficiency decreases both the intrinsic excitability and the excitatory synaptic drive onto PV+ neurons from Layer 4. In contrast, SST+ interneurons were mostly unaffected by Syngap1 haploinsufficiency. Pharmacologically manipulating the activity of voltage-gated potassium channels of the Kv1 family suggested that these channels contributed to the decrease in PV+ neuron excitability caused by Syngap1 haploinsufficiency. These results therefore suggest that normal Syngap1 expression levels are necessary to produce normal PV+ cell intrinsic properties and excitatory synaptic drive, although, perhaps surprisingly, inhibitory synaptic transmission was not affected by Syngap1 haploinsufficiency.

      Since the electrophysiology experiments were performed in the adult auditory cortex, while Syngap1 expression was potentially affected since embryonic stages in the MGE, future studies should address two important points that were not tackled in the present study. First, what is the developmental time window in which Syngap1 insufficiency disrupted PV+ neuron properties? Although the embryonic Syngap1 deletion most likely affected PV+ neuron maturation, the properties of Syngap1-insufficient PV+ neurons do not resemble those of immature PV+ neurons. Second, whereas the observation that Syngap1 haploinsufficiency affected PV+ neurons in auditory cortex layer 4 suggests auditory processing alterations, MGE-derived PV+ neurons populate every cortical area. Therefore, without information on whether Syngap1 expression levels are cortical area-specific, the data in this study would predict that by regulating PV+ neuron electrophysiology, Syngap1 normally controls circuit function in a wide range of cortical areas, and therefore a range of sensory, motor and cognitive functions. These are relatively minor weaknesses regarding interpretation of the data in the present study that the authors could discuss.

    4. Reviewer #2 (Public Review):

      Summary:

      In this manuscript, the authors investigated how partial loss of SynGap1 affects inhibitory neurons derived from the MGE in the auditory cortex, focusing on their synaptic inputs and excitability. While haplo-insufficiency of SynGap1 is known to lead to intellectual disabilities, the underlying mechanisms remain unclear.

      Strengths:

      The questions are novel.

      Weaknesses:

      Despite the interesting and novel questions, there are significant concerns regarding the experimental design and data quality, as well as potential misinterpretations of key findings. Consequently, the current manuscript fails to contribute substantially to our understanding of SynGap1 loss mechanisms and may even provoke unnecessary controversies.

      Major issues:

      (1) One major concern is the inconsistency and confusion in the intermediate conclusions drawn from the results. For instance, while the sEPSC data indicates decreased amplitude in PV+ and SOM+ cells in cHet animals, the frequency of events remains unchanged. In contrast, the mEPSC data shows no change in amplitudes in PV+ cells, but a significant decrease in event frequency. The authors conclude that the former observation implies decreased excitability. However, traditionally, such observations on mEPSC parameters are considered indicative of presynaptic mechanisms rather than changes in network activity. The subsequent synapse counting experiments align more closely with the traditional conclusions. This issue can be resolved by rephrasing the text. However, it would remain unexplained why the sEPSC frequency shows no significant difference. If the majority of sEPSC events were indeed mediated by spiking (which is blocked by TTX), the average amplitudes and frequency of mEPSCs should be substantially lower than those of sEPSCs. Yet, they fall within a very similar range, suggesting that most sEPSCs may actually be independent of action potentials. But if that were indeed the case, the changes in the purported sEPSC and mEPSC results should have been similar.

      (2) Another significant concern is the quality of synapse counting experiments. The authors attempted to colocalize pre- and postsynaptic markers Vglut1 and PSD95 with PV labelling. However, several issues arise. Firstly, the PV labelling seems confined to soma regions, with no visible dendrites. Given that the perisomatic region only receives a minor fraction of excitatory synapses, this labeling might not accurately represent the input coverage of PV cells. Secondly, the resolution of the images is insufficient to support clear colocalization of the synaptic markers. Thirdly, the staining patterns are peculiar, with PSD95 puncta appearing within regions clearly identified as somas by Vglut1, hinting at possible intracellular signals. Furthermore, PSD95 seems to delineate potential apical dendrites of pyramidal cells passing through the region, yet Vglut1+ partners are absent in these segments, which are expected to be the marker of these synapses here. Additionally, the cumulative density of Vglut2 and Vglut1 puncta exceeds expectations, and it's surprising that subcortical fibers labeled by Vglut2 are comparable in number to intracortical Vglut1+ axon terminals. Ideally, N(Vglut1)+N(Vglut2) should be equal to or less than N(PSD95), but this is not the case here. Consequently, these results cannot be considered reliable due to these issues.

      (3) One observation from the minimal stimulation experiment was concluded by an unsupported statement. Namely, the change in the onset delay cannot be attributed to a deficit in the recruitment of PV+ cells, but it may suggest a change in the excitability of TC axons.

      (4) The conclusions drawn from the stimulation experiments are also disconnected from the actual data. To make conclusions about TC release, the authors should have tested release probability using established methods, such as paired-pulse changes. Instead, the only observation here is a change in the AMPA components, which remained unexplained.

      (5) The sampling rate of CC recordings is insufficient to resolve the temporal properties of the APs. Therefore, the phase-plots cannot be interpreted (e.g. axonal and somatic AP components are not clearly separated), raising questions about how AP threshold and peak were measured. The low sampling rate also masks the real derivative of the AP signals, making them apparently faster. A related issue is that the Methods section lacks essential details about the recording conditions, such as bridge balance and capacitance neutralization.

      (6) Interpretation issue: One of the most fundamental measures of cellular excitability, the rheobase, was differentially affected by cHet in BCshort and BCbroad. Yet, the authors concluded that the cHet-induced changes in the two subpopulations are common.

      (7) Design issue: The Kv1 blockade experiments are disconnected from the main manuscript. There is no experiment that shows the causal relationship between changes in DTX and cHet cells. It is only an interesting observation on AP halfwidth and threshold. However, how they affect rheobase, EPSCs, and other topics of the manuscript are not addressed in DTX experiments. Furthermore, Kv1 currents were never measured in this work, nor was the channel density tested. Thus, the DTX effects are not necessarily related to changes in PV cells, which can potentially generate controversies.

      (8) Writing issues:

      Abstract:

      The auditory system is not mentioned in the abstract.

      One statement in the abstract is unclear. What is meant by "targeting Kv1 family of voltage-gated potassium channels was sufficient..."? "Targeting" could refer to altered subcellular targeting of the channels, simple overexpression/deletion in the target cell population, or targeted mutation of the channel, etc. Only the final part of the Results reveals that none of the above applies and that these channels were instead blocked selectively.

      Introduction:

      There is a contradiction in the introduction. The second paragraph describes in detail the distinct contribution of PV and SST neurons to auditory processing. But at the end, the authors state that "relatively few reports on PV+ and SST+ cell-intrinsic and synaptic properties in adult auditory cortex". Please be more specific about the unknown properties.

      (9) The introduction emphasizes the heterogeneity of PV neurons, which certainly influences the interpretation of the results of the current manuscript. However, the initial experiments did not consider this and handled all PV cell data as a pooled population.

      (10) The interpretation of the results strongly depends on unpublished work, which potentially provide the physiological and behavioral contexts about the role of GABAergic neurons in SynGap-haploinsufficiency. The authors cite their own unpublished work, without explaining the specific findings and relation to this manuscript.

      (11) The introduction of the Sholl analysis experiments mentions SOM staining; however, there is no such data about this cell type in the manuscript.
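
      As a rough illustration of point (5), the sketch below shows how the dV/dt estimate used in a phase plot depends on the sampling rate. It is only a sketch under stated assumptions: the Gaussian-shaped transient, its amplitude and width, the sampling rates, and all function names are hypothetical and are not taken from the manuscript, so it says nothing about the actual recordings.

```python
import math

def synthetic_spike(t_ms, amp_mv=100.0, rest_mv=-70.0, t_peak_ms=2.0, sigma_ms=0.2):
    """Gaussian-shaped voltage transient standing in for an AP (hypothetical shape)."""
    return rest_mv + amp_mv * math.exp(-((t_ms - t_peak_ms) ** 2) / (2.0 * sigma_ms ** 2))

def sample(rate_khz, duration_ms=4.0):
    """Sample the synthetic trace at rate_khz; returns time (ms) and voltage (mV) lists."""
    dt = 1.0 / rate_khz  # ms per sample
    n = int(round(duration_ms * rate_khz)) + 1
    t = [i * dt for i in range(n)]
    v = [synthetic_spike(x) for x in t]
    return t, v

def phase_plot(t, v):
    """Return (V, dV/dt) pairs from central differences; mV/ms is numerically equal to V/s."""
    pairs = []
    for i in range(1, len(v) - 1):
        dvdt = (v[i + 1] - v[i - 1]) / (t[i + 1] - t[i - 1])
        pairs.append((v[i], dvdt))
    return pairs

def max_rise_rate(rate_khz):
    """Largest dV/dt found in the phase plot at the given sampling rate."""
    t, v = sample(rate_khz)
    return max(d for _, d in phase_plot(t, v))

# Analytic peak slope of the Gaussian transient: (amp / sigma) * exp(-1/2), in mV/ms.
TRUE_MAX = 100.0 / 0.2 * math.exp(-0.5)

if __name__ == "__main__":
    for rate in (10, 20, 50, 100):  # hypothetical sampling rates in kHz
        print(f"{rate:>4} kHz: max dV/dt = {max_rise_rate(rate):.1f} mV/ms "
              f"(analytic {TRUE_MAX:.1f})")
```

      In this toy case each finite difference is an average slope over two sampling intervals, so the coarser the sampling, the further the estimated peak dV/dt falls below the analytic value. Real action potentials have sharper onsets than this Gaussian, so the sampling rate actually needed for a given recording would have to be checked against the data.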

    5. Reviewer #3 (Public Review):

      This paper compares the synaptic and membrane properties of two main subtypes of interneurons (PV+, SST+) in the auditory cortex of control mice vs mutants with Syngap1 haploinsufficiency. The authors find differences at both levels, although predominantly in PV+ cells. These results suggest that altered PV-interneuron functions in the auditory cortex may contribute to the network dysfunction observed in Syngap1 haploinsufficiency-related intellectual disability. The subject of the work is interesting, and most of the approach is direct and quantitative, which are major strengths. There are also some weaknesses that reduce its impact for a broader field.

      (1) The choice of mice with conditional (rather than global) haploinsufficiency makes the link between the findings and Syngap1 relatively easy to interpret, which is a strength. However, it also remains unclear whether an entire network with the same mutation at a global level (affecting also excitatory neurons) would react similarly.

      (2) There are some (apparent?) inconsistencies between the text and the figures. Although the authors appear to have used a sophisticated statistical analysis, some datasets in the illustrations do not seem to match the statistical results. For example, neither Fig 1g nor Fig 3f (eNMDA) reach significance despite large differences. Also, the legend to Fig 9 indicates the presence of "a significant decrease in AP half-width from cHet in absence or presence of a-DTX", but the bar graph does not seem to show that.

      (3) The authors mention that the lack of differences in synaptic current kinetics is evidence against a change in subunit composition. However, in some Figures, for example, 3a, the kinetics of the recorded currents appear dramatically different. It would be important to know and compare the values of the series resistance between control and mutant animals.

      (4) A significant unexplained variability is present in several datasets. For example, the AP threshold for PV+ includes points between -50 and -40 mV, but also values at around -20/-15 mV, which seems too depolarized to generate healthy APs (Fig 5c, Fig 7c).

      (5) I am unclear as to how the authors quantified colocalization between VGluts and PSD95 at the low magnification shown in Supplementary Figure 2.

      (6) The authors claim that "cHet SST+ cells showed no significant changes in active and passive membrane properties", but this claim would seem to be directly refuted by the data of Fig 8f. In the absence of changes in either active or passive membrane properties, shouldn't the current/#AP plot remain unchanged?

      (7) The plots used for the determination of AP threshold (Figs 5c, 7c, and 7h) suggest that the frequency of acquisition of current-clamp signals may not have been sufficient; this value is not included in the Methods section.

    1. eLife assessment

      This study presents an important new technology for transdifferentiation of fibroblasts into muscle cells. The data and methods used for analysis were compelling. This study will have broad interest to cellular reprogramming biologists in particular as well as the general public.

    2. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This solid study investigates the transdifferentiation of chicken embryonic fibroblasts into muscle and fat cells in 3D to create whole-cut meat mimics. The study is important and provides a method to control muscle, fat, and collagen content within the 3D meat mimics and thus provides a new avenue for customized cultured meat production. Limitations of this study include the use of transgene for transdifferentiation and thus the creation of GMO food.

      We are grateful for the substantial effort that editors and reviewers put into assessing our manuscript and providing insightful feedback. We have tried to address, as much as possible, all comments and criticisms. We believe that we have now a significantly improved manuscript. Below, there is a point-by-point response.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors presented here a novel 3D fibroblast culture and transdifferentiation approach for potential meat production with GelMA hydrogel.

      Strengths:

      (1) Reduced serum concentration for 3D chicken fibroblast culture and transdifferentiation is optimized.

      (2) Efficient myogenic transdifferentiation and lipogenesis as well as controlled fat deposition are achieved in the 3D GelMA.

      Weaknesses:

      (1) While the authors stated the rationale of using fibroblasts instead of myogenic/adipogenic stem cells for meat production, the authors did not comment on the drawbacks/disadvantages of genetic engineering (e.g., forced expression of MyoD) in meat production.

      We thank the reviewer for raising this important issue. We have now described this drawback in the discussion.

      As a proof-of-concept study, we sought to explore the potential of using genome-integrating transgene tools to overexpress a transdifferentiation factor and achieve maximum muscle production. However, it is important to acknowledge that genetically modified meat products derived from the genetic engineering of cultured cells will not be suitable in terms of consumer acceptance and market viability. We are currently testing non-integrating delivery methods, such as modRNAs and chemical cocktails, to induce myogenic transdifferentiation in fibroblasts. We believe these non-integrating methods would be compatible with meat production and consumer acceptance.

      Please see lines 439-445.

      “As a proof-of-concept, we utilized the transgene method to achieve maximum myogenic induction and the final products still retain the foreign transgene fragment in the cells’ genome. It is therefore posing a risk of genetic modified food which is not suitable for mass production. In the next step, other non-transgenic means such as non-integrating vectors, chemical reprogramming, modified RNAs, and recombinant transgene removal techniques will be explored to develop transgene-free end products.”

      (2) While the authors cited one paper to state the properties and applications of GelMA hydrogel in tissue engineering and food processing, concerns/examples of the food safety with GelMA hydrogel are not discussed thoroughly.

      Thank you for pointing out this issue. We have discussed the drawbacks of using GelMA hydrogel in meat production in the main text.

      GelMA-based hydrogels have shown great potential due to their biocompatibility and mechanical tunability. They are widely used in 3D cell culture and tissue engineering for regenerative medicine, but are less common in food processing and agricultural applications. Their photo-crosslinking properties, biocompatibility, and degradability allow the material to be shaped into complex tissue structures by 3D printing or modelling. Many researchers have also used GelMA hydrogel as a scaffold for cultured meat production (Jeong et al., 2022; Li et al., 2021; Park et al., 2023). Later research will carefully consider GelMA hydrogel as well as other types of scaffold biomaterials for cost-effective and food-safety-compliant cultured meat production (Bomkamp et al., 2022).

      Bomkamp, C., Skaalure, S. C., Fernando, G. F., Ben‐Arye, T., Swartz, E. W., & Specht, E. A. (2022). Scaffolding biomaterials for 3D cultivated meat: prospects and challenges. Advanced Science (Weinh), 9(3), 2102908.

      Jeong, D., Seo, J. W., Lee, H. G., Jung, W. K., Park, Y. H., & Bae, H. (2022). Efficient Myogenic/Adipogenic Transdifferentiation of Bovine Fibroblasts in a 3D Bioprinting System for Steak-Type Cultured Meat Production. Advanced Science (Weinh), 9(31), e2202877.

      Li, Y., Liu, W., Li, S., Zhang, M., Yang, F., & Wang, S. (2021). Porcine skeletal muscle tissue fabrication for cultured meat production using three-dimensional bioprinting technology. Journal of Future Foods, 1(1), 88-97.

      Park, S., Hong, Y., Park, S., Kim, W., Gwon, Y., Jang, K.-J., & Kim, J. (2023). Designing Highly Aligned Cultured Meat with Nanopatterns-Assisted Bio-Printed Fat Scaffolds. Journal of Biosystems Engineering, 48(4), 503-511.

      We discussed the drawbacks of GelMA hydrogel. Please see lines 445-457.

      “Another food safety concern in this study is the use of GelMA hydrogel for culture meat production. Due to its excellent biocompatibility and mechanical flexibility, GelMA-based hydrogel has demonstrated significant potential in scalable 3D cell culture for creating artificial tissue ranging in sizes from millimeters to centimeters. It is widely used in 3D cell culture and tissue engineering for regenerative medicine, but less common in food processing and agricultural applications. Due to its special photo-crosslinking properties, biocompatibility and degradability, it allows this material to be shaped into complex tissue structures by 3D printing or modelling. Many researchers have also used GelMA hydrogel as a scaffold for culture meat production (Jeong et al., 2022; Li et al., 2021; Park et al., 2023). Later research will carefully consider hydrogel as well as other types of scaffold biomaterials for cost-effective and food-safety compliant culture meat production (Bomkamp et al., 2022). ”

      (3) In Fig. 4C, there seems no significant difference in the Vimentin expression between Fibroblast_MyoD and Myofibroblast. The conclusion of "greatly reduced in the myogenic transdifferentiated cells" is overstated.

      Thanks for pointing out this mistake.

      We revised the wording accordingly. Vimentin expression was reduced in fibroblast_MyoD compared to the original fibroblasts.

      Please see lines 231-233.

      “The fibroblast intermediate filament Vimentin (Tarbit et al., 2019) was abundantly expressed in the fibroblasts but reduced in the myogenic transdifferentiated cells (Figure 4C)”

      (4) The presented cell culture platform is only applied to chicken fibroblasts and should be tested in other species such as pigs and fish.

      Thank you for the suggestion.

      In this pilot cultured meat study, we utilized chicken embryonic fibroblasts. These specific cells were chosen for their near-immortal nature and robustness in culture, as well as the inducible myogenic capacity. In our previous experiments (Ren et al, Cell Reports, 2022, 40:111206), we have tested the myogenic transdifferentiation potential of fibroblasts from mice, pigs, and chickens, and observed varying efficiencies of myogenesis. It is important to note that fibroblast cells derived from different species, or even different tissues within the same species, would exhibit significant variations in their capacities for myogenic and adipogenic transdifferentiation.

      In this proof-of-concept study, we used only one source of fibroblasts for testing cultured meat production and confirmed that myogenic/adipogenic transdifferentiation can be manipulated as a feasible means to precisely control muscle, fat, and collagen content. We would expect fibroblasts of different origins to display different transdifferentiation efficiencies and thus produce various muscle/fat ratios in meat mimics. That is beyond the scope of the current study.

      Furthermore, we are also testing myogenic/adipogenic transdifferentiation of fibroblasts from pigs through non-genomic integration approaches. We believe that only non-transgenic tools are viable solutions for cultured meat production in the future. We added the species information to the discussion.

      See lines 515-517.

      “This approach can be readily extrapolated to other species such as pigs and presents promising avenues for the large-scale production of customized and versatile meat products that may cater to varying consumer preferences.”

      Reviewer #2 (Public Review):

      The manuscript by Ma et al. tries to develop a protocol for cell-based meat production using chicken fibroblasts as three-dimensional (3D) muscle tissues with fat accumulation. The authors used genetically modified fibroblasts which can be forced to differentiate into muscle cells and formulated 3D tissues with these cells and a biphasic material (hydrogel). The degrees of muscle differentiation and lipid deposition in culture were determined by immunohistochemical, biochemical, and molecular biological evaluations. Notably, the protocol successfully achieved the process of myogenic and lipogenic stimulation in the 3D tissues.

      Overall, the study is reasonably designed and performed including adequate analysis. The manuscript is clearly written with well-supported figures. While it presents valuable results in the field of cultivated meat science and skeletal muscle biology, some critical concerns were identified. First, it is unclear whether some technical approaches were really the best choice for cell-based meat production. Next, more careful evaluations and justifications would be required to properly explain biological events in the results. These points include additional evaluations and considerations with regard to myocyte alignment and lipid accumulation in the differentiated 3D tissues. The present data are very suggestive in general, but further clarifications and arguments would properly support the findings and conclusions.

      We thank the reviewer for these comments. We have performed additional experiments and analyses to address the critical questions. We also revised the text extensively to clarify or discuss some of the concerns, such as cell alignment and the cellular distribution of intramuscular fat. We expect that the revised data and text now adequately support the conclusions of the manuscript.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) In Figure 1, the authors used 1% chicken serum. Have the authors tested other lower concentrations? It will be interesting to see the lowest chicken serum concentrations in fibroblast culture and transdifferentiation;

      Thank you for your suggestion.

      Yes, we have tested lower serum concentrations, such as 1% FBS and 0.5% chicken serum. However, the cells were not in a healthy state at these low serum levels, as shown by the abnormal cell morphology and nearly absent cell growth. Please see the revised Supplementary Figure S1D, to which we added the 1% FBS and 0.5% chicken serum data. Hence, 1% chicken serum is optimal in our hands. We will also test other types of specialized serum-free medium in future experiments.

      (2) In Figure 2, the authors should quantify the fold expansion of fibroblasts cultured in 3D gel after 1, 3, 5, and 9 days since this data is important for future meat manufacturing. In addition, long-term expansion (e.g., 1 month) in 3D gel should also be shown;

      Thanks for the question. We have quantified cell growth in 3D by measuring PKH26-stained cells. After the cells were implanted into the gel, they propagated exponentially from day 1 to day 9. The cell proliferation data provide a good reference for future meat manufacturing (Figure 2D). We attempted long-term expansion in 3D but were unable to measure cell proliferation, because the 3D gel always collapsed after 12-15 days in culture for unknown reasons: either the cells grew so densely that they compromised the gel structure, or the gel matrix itself is not strong enough to persist long-term. We believe the cells will grow well long-term if we provide enough 3D attachment surface, since they grow indefinitely in 2D. We will test different 3D matrices in the future.

      Please see the revised Figure 2D for the quantification of cells.

      (3) In Figure 3, please also show MyoD staining as it'll be interesting to see the expression of exogenous and endogenous MyoD expression after dox treatment. In Figure G, the hydrogel meat seems very small, please show/discuss the maximum size of hydrogel meat that may be achieved using this approach;

      Thank you for requesting this information. We performed immunostaining with anti-MyoD and anti-Flag antibodies to show the expression of all MyoD (exogenous and endogenous) and of exogenous MyoD only after dox treatment. MyoD and 3xFlag were fused in-frame in the transgene plasmid; thus, anti-Flag staining indicates exogenous MyoD expression, and anti-MyoD staining indicates the combined expression of exogenous and endogenous MyoD.

      As shown in Figure S4, we found that almost 100% of cells were positive for MyoD staining, of which 60% expressed Flag. These data are consistent with our previous results (Ren et al., 2022, Cell Reports).

      Author response image 1.

      As for the size of the hydrogel-based cultured meat, we discussed the possibilities for scalable production of hydrogel-based whole-cut meat mimics. Please see lines 446-449. “Due to its excellent biocompatibility and mechanical flexibility, GelMA-based hydrogel has demonstrated significant potential in scalable 3D cell culture for creating artificial tissue ranging in sizes from millimeters to centimeters.”

      (4) In Figure 5 and Supplementary Figure 6, please quantify the Oil-red O+ fat cells in the 2D and 3D lipogenic induction. Also in Fig. 6B, quantify the oil-red+MHC+ cells;

      Thank you for this advice. We have quantified the Oil Red O-stained images in the Results section “Stimulate the fat deposition in chicken fibroblasts in 3D” using the analysis software ImageJ, and the quantification of the Oil Red O area was added to the corresponding graphs (Figure 5C, Figure S6C and S6F).

      However, due to the unique structure of the 3D matrix, many MHC+ and Oil Red O+ double-positive cells overlap with each other across different Z-stack layers in 3D. This overlap makes it challenging to accurately position and quantify the double-positive cells as the different layers interfere with each other.

      (5) In Figure 7, please show immunostaining images of collagen and other major ECMs;

      Thank you for this question. We tried to stain collagen networks by Picrosirius Red staining but failed. Instead, we employed laminin immunostaining to confirm that the ECM content in the 3D matrix increases steadily during cell culture.

      Please see Figure 7C. Lines 346-348.

      “the laminin protein content was accumulated and increased steadily during 3D culturation (Figure 7C) “

      (6) In Figure 8, please show hierarchical clustering analysis of whole transcriptomes of 3D_fibroblasts, 3D_MyoD, 3D+FI, and 3D_MyoD+FI. A Venn Diagram showing the overlap and distinct gene expression among these groups is also appreciated.

      Thank you for the suggestion.

      We added the hierarchical clustering analysis of whole transcriptomes of 3D_fibroblasts, 3D_MyoD, 3D+FI, and 3D_MyoD+FI using Euclidean distance with the ward.D clustering method. The result showed that these groups formed two large clusters, in which 3D+FI clustered separately and 3D_fibroblasts, 3D_MyoD, and 3D_MyoD+FI were more similar. Please see Figure 8B.

      As the reviewer suggested, we also compared the transcriptomes of 3D_MyoD, 3D+FI, and 3D_MyoD+FI to the original 3D_fibroblasts to identify differentially expressed genes (DEGs) and then analyzed the overlapping and distinct DEGs. As shown in Figure 8D, the Venn diagram showed that the majority of DEGs from 3D_MyoD+FI (3D_MyoD+FI versus 3D_fibroblasts) overlap with those of 3D_MyoD and 3D+FI, indicating that 3D_MyoD+FI is compatible with both myogenic and adipogenic function.

      Please see the revised Figure 8.
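
      The overlap analysis described for Figure 8D can be sketched with plain set operations. This is only an illustrative sketch, not the authors' pipeline: the gene identifiers below are hypothetical placeholders rather than data from the manuscript, and the function name `venn3` is an assumption introduced for this example.

```python
def venn3(a, b, c):
    """Return the sizes of the seven regions of a three-set Venn diagram."""
    a, b, c = set(a), set(b), set(c)
    return {
        "A only": len(a - b - c),
        "B only": len(b - a - c),
        "C only": len(c - a - b),
        "A&B only": len((a & b) - c),
        "A&C only": len((a & c) - b),
        "B&C only": len((b & c) - a),
        "A&B&C": len(a & b & c),
    }

# Hypothetical DEG lists, each versus 3D_fibroblasts (placeholder IDs, not real genes).
deg_myod = {"g1", "g2", "g3", "g4"}            # A: 3D_MyoD
deg_fi = {"g3", "g4", "g5", "g6"}              # B: 3D+FI
deg_myod_fi = {"g2", "g3", "g4", "g5", "g7"}   # C: 3D_MyoD+FI

regions = venn3(deg_myod, deg_fi, deg_myod_fi)

# Fraction of 3D_MyoD+FI DEGs that also appear in at least one single-treatment list.
shared_fraction = len(deg_myod_fi & (deg_myod | deg_fi)) / len(deg_myod_fi)

if __name__ == "__main__":
    for name, size in regions.items():
        print(f"{name}: {size}")
    print(f"shared fraction of 3D_MyoD+FI DEGs: {shared_fraction:.2f}")
```

      With real data, each set would hold the gene identifiers that pass the chosen significance and fold-change thresholds for that comparison. The shared fraction gives a single number behind the statement that most DEGs of the combined condition overlap with the single treatments.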

      Reviewer #2 (Recommendations For The Authors):

      In this study, the authors demonstrated a new approach for cultivated meat production using chicken fibroblasts. Specifically, the cells were cultured as 3D and induced muscle differentiation and lipid deposition. The manuscript contains a good set of data, which would be valuable to researchers in the fields of both cell-based meat and skeletal muscle biology. From the aspect of cultivated meat science, the rationale behind the idea is understandable, but it remains unclear whether the proposed approach was really the best choice to achieve their final goal. On the other hand, when we read this manuscript as a paper in skeletal muscle biology, the overall approach was not innovative enough and several uncertain issues remain. The authors should add more sufficient justifications, arguments, and discussions.

      (1) When considering their goal to produce edible meat products, the current approach has some concerns. First, there are issues with the approach used for the induction of myogenesis by MyoD transgene. This makes the end products GMO foods, which are not easily acceptable to a wide range of consumers. Next, the hydrogel was used for 3D tissue formation, but it is unclear whether this matrix type is edible, safe, and bio-comparable for cell-based meat production. The authors already discussed these points by excusing that the current work remains proof-of-concept. However, more careful considerations and justifications would be required.

      Thank you for the suggestion.

      We acknowledge that the current transgene-based myogenic induction method is not suitable for mass production of cultured meat because of GMO food concerns. We used the MyoD transgene for myogenic transdifferentiation in the first place because of its ease of genetic manipulation and maximum efficiency. We are currently testing non-integrating tools, such as chemical cocktails and modified RNAs, for myogenic transdifferentiation.

      When it comes to the applications of hydrogel in the food industry, certain types of hybrid hydrogels, such as those made from pectin or sodium polyacrylate, are not only edible but also safe for consumption. While GelMA hydrogel is typically utilized in tissue engineering and subsequent implantation in patients for therapeutic regenerative medicine purposes, it has not been commonly employed in food processing. In this study, we cultivated cells within GelMA hydrogel due to its durability and ease of use in cell culture. Moving forward, we plan to investigate alternative types of matrices to develop cultured meat suitable for food applications.

      We have now described the GMO and hydrogel drawbacks in the discussion part. Please see lines 439-457.

      “As a proof-of-concept, we utilized the transgene method to achieve maximum myogenic induction and the final products still retain the foreign transgene fragment in the cells’ genome. It is therefore posing a risk of genetic modified food which is not suitable for mass production. In the next step, other non-transgenic means such as non-integrating vectors, chemical reprogramming, modified RNAs, and recombinant transgene removal techniques will be explored to develop transgene-free end products. Another food safety concern in this study is the use of GelMA hydrogel for culture meat production. Due to its excellent biocompatibility and mechanical flexibility, GelMA-based hydrogel has demonstrated significant potential in scalable 3D cell culture for creating artificial tissue ranging in sizes from millimeters to centimeters. It is widely used in 3D cell culture and tissue engineering for regenerative medicine, but less common in food processing and agricultural applications. Due to its special photo-crosslinking properties, biocompatibility and degradability, it allows this material to be shaped into complex tissue structures by 3D printing or modelling. Many researchers have also used GelMA hydrogel as a scaffold for culture meat production (Jeong et al., 2022; Li et al., 2021; Park et al., 2023). Later research will carefully consider hydrogel as well as other types of scaffold biomaterials for cost-effective and food-safety compliant culture meat production (Bomkamp et al., 2022). ”

      (2) From the view of skeletal muscle biology, the approaches (MyoD overexpression, hydrogel-based 3D tissue formation, and lipogenic induction) have already been tested.

      Thank you for the insightful comments from the perspective of skeletal muscle cell biology. We fully agree that the current approaches, including MyoD overexpression, 3D cell culture, and lipogenic induction, are routine experiments in muscle cell biology. However, we want to highlight that using these classical and robust muscle cell approaches, combined with the unique advantages of fibroblast cells (easily accessible, immortalized, cost-effective, ...), provides a novel and practical avenue for cultured meat production. We have stated these points in the Discussion section of the revised manuscript.

      Please see lines 511-515.

      “In conclusion, we have effectively utilized immortalized chicken fibroblasts in conjunction with classical myogenic/adipogenic transdifferentiation approaches within 3D hydrogel to establish a cultured meat model. This model allows for the precise regulation of the synthesis of key components found in conventional meat, including muscle, fat, and ECM.”

      (3) The common emphasis in this manuscript is to use the advantages of 3D culture for tissue differentiation. As the authors described, skeletal muscle is a highly aligned tissue. In this study, some results successfully demonstrated advantages in terms of myocyte alignment, maturation, and lipid deposition. However, the current results cannot address whether the entire 3D tissues maintained these advantageous characteristics, because the method for 3D formation does not include any additional modifications to align the cells, such as micropatterning, scaffolding, or bioprinting.

      Thank you for the suggestion.

      We agree with the reviewer that skeletal muscle tissue is composed of well-organized, directional bundles of fibers, and that cell alignment greatly affects meat tenderness and sensory properties. Alignment of the cells within the cultured meat matrix is therefore a desirable attribute. However, achieving it would require sophisticated biomaterial engineering, mainly scaffold manipulation, which is beyond the scope of this study. The hydrogel used in this study formed pores of different sizes in random directions, so we expected the embedded cells to be entirely non-directional. Nevertheless, we found localized cell alignment in some parts of the gel matrix, which confirms cell-cell interactions (Figure 3D). We describe this feature in the Results section. In the future, we will test whether physical or electrical stimulation of the matrix can align the muscle cells throughout the whole matrix.

      Please see lines 186-190.

      “The separate XY axis views of the orthogonal projections at different depths (Figure 3D) and a multi-angle video (Supplementary Video 2) also showed the several myotubes were aligned together. Nevertheless, many myotubes were oriented in different directions, preventing the entire matrix from aligning in one direction.”

      (4) In the skeletal muscle, fat accumulation mainly occurs in adipocytes between myocytes. This means that "intra-" muscular fat deposition is identified. However, lipid deposition within myocytes also occurred in this preparation (Supplementary Figure 7C). This situation is not "intra-" muscular accumulation, which sounds different from what is going on in normal skeletal muscle tissues. Please explain what happened and what biological situations accounted for this. Also, the authors should clarify better how lipogenesis was induced in the 3D tissues, such as cell types (transdifferentiated myocytes, remained/un-transdifferentiated fibroblasts, or both).

      Thank you for the very insightful question. We have revised the corresponding text to further explain the intramuscular fat distribution in different cell types in culture meat.

      We fully agree with the reviewer that intramuscular fat accumulation occurs mainly in intramuscular adipocytes. However, under some pathological and physiological conditions in humans and animals, lipid droplets are also abundant inside myofibers (intramyocellular lipids within the myofiber cytoplasm). For instance, high intramyocellular lipid content is found in insulin-resistant patients and, paradoxically, in endurance-trained athletes (doi.org/10.1016/j.tem.2012.05.009), as well as in some farm animals under intensive selective breeding (doi:10.2174/1876142910901010059). In the current study, Oil Red O staining of lipid droplets identified lipid deposition in both the transdifferentiated myocytes and the remaining un-transdifferentiated fibroblasts in the cultured meat. This lipid distribution is comparable to the intramuscular fat storage pattern observed in some humans and animals, in which fat accumulates both in myofibers (intramyocellular lipids) and in intramuscular adipocytes (extramyocellular lipids), which reside within the muscle tissue bundle but between myofibers. We reason that the current adipogenic induction treatment caused lipogenesis in both the MyoD-transdifferentiated cells and the un-transdifferentiated fibroblasts. It is difficult to compare the absolute amount of lipid between these two cell types by Oil Red O staining, and it is almost impossible to separate them from the 3D meat mimics. Thus, we can only confirm that lipid deposition occurs in both transdifferentiated myocytes and un-transdifferentiated fibroblasts; we cannot determine which is the major contributor to the intramuscular fat content of the cultured meat.

      Please see lines 486-492.

      “In this study, the deposition of fat in the myotubes/myofibers facilitated the storage of significant lipid quantities in transdifferentiated muscle cells, known as intramyocellular lipids. Additionally, we observed Oil Red O staining in the remaining un-transdifferentiated fibroblasts, resembling cells of intramuscular adipocytes (extramyocellular lipids) found within muscle tissue. Hence, current adipogenic induction treatment caused lipogenesis in both the MyoD-transdifferentiated cells and un-transdifferentiated fibroblasts.”

    3. Reviewer #1 (Public Review):

      Summary:

      The authors presented here a novel 3D fibroblast culture and transdifferentiation approach for potential meat production with GelMA hydrogel.

      Strengths:

      (1) Reduced serum concentration for 3D chicken fibroblast culture and transdifferentiation is optimized.

      (2) Efficient myogenic transdifferentiation and lipogenesis as well as controlled fat deposition are achieved in the 3D GelMA.

    4. Reviewer #2 (Public Review):

      The manuscript by Ma et al. tries to develop a protocol for cell-based meat production using chicken fibroblasts as three-dimensional (3D) muscle tissues with fat accumulation. The authors used genetically modified fibroblasts, which can be forced to differentiate into muscle cells, and formulated 3D tissues with these cells and a biphasic material (hydrogel). The degrees of muscle differentiation and lipid deposition in culture were determined by immunohistochemical, biochemical, and molecular biological evaluations. Notably, the protocol successfully achieved the process of myogenic and lipogenic stimulation in the 3D tissues.

      As addressed after the initial review process, the manuscript is clearly written with well-supportive figures. The study design is reasonable with adequate analysis. In the revised manuscript, the authors further discussed the ideas in terms of the approach using genetic modification for cell-based meat production. However, more careful considerations may still be helpful when actually using the technology for cultivated meat production.

    1. eLife assessment

      This study reports a novel substrate and a mediator of oncogenesis downstream of mTORC1, a fundamental advance in our understanding of the mechanistic basis of mTORC1-regulated cap-dependent translation and protein synthesis. Using an array of biochemical, proteomic and functional assays, the authors provide compelling evidence for a novel mTORC1/S6K1-IBTK-eIF4A1 signaling axis that promotes cancer pathogenic translation. This work is of broad interest and significance, given the importance of aberrant protein synthesis in cancer.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      In this study, the authors examined the role of IBTK, a substrate-binding adaptor of the CRL3 ubiquitin ligase complex, in modulating the activity of the eIF4F translation initiation complex. They find that IBTK mediates the non-degradative ubiquitination of eIF4A1, promotes cap-dependent translational initiation, nascent protein synthesis, oncogene expression, and tumor cell growth. Correspondingly, phosphorylation of IBTK by mTORC1/S6K1 increases eIF4A1 ubiquitination and sustains oncogenic translation.

      Strengths:

      This study utilizes multiple biochemical, proteomic, functional, and cell biology assays to substantiate its results. Importantly, the work nominates IBTK as a unique substrate of mTORC1, and further validates eIF4A1 (a crucial subunit of the eIF4F complex) as a promising therapeutic target in cancer. Since IBTK interacts broadly with multiple members of the translation initiation complex, it will be interesting to examine its role in eIF2alpha-mediated ER stress as well as eIF3-mediated translation. Additionally, since IBTK exerts pro-survival effects in multiple cell types, it will be of relevance to characterize the role of IBTK in mediating increased mTORC1-mediated translation in other tumor types, thus potentially impacting their treatment with eIF4F inhibitors.

      Limitations/Weaknesses:

      The findings are mostly well supported by data, but some areas need clarification and could potentially be enhanced with further experiments:

      (1) Since eIF4A1 appears to function downstream of IBTK1, can the effects of IBTK1 KO/KD in reducing puromycin incorporation (in Fig 3A), cap-dependent luciferase reporter activity (Fig 3G), reduced oncogene expression (Fig 4A) or 2D growth/invasion assays (Fig 4) be overcome or bypassed by overexpressing eIF4A1? These could potentially be tested in future studies.

      We appreciate the reviewer for bringing up this crucial point. As per the reviewer's suggestion, we overexpressed Myc-eIF4A1 in IBTK-KO SiHa cells. Our findings indicate that increasing eIF4A1 levels through ectopic overexpression is unable to reverse the decrease in puromycin incorporation (Fig. S3C) or in the protein expression of eIF4A1 targets (Fig. S4E) caused by IBTK ablation. These results clearly demonstrate that the eIF4A1 dysfunction induced by IBTK ablation cannot be rescued by simply elevating eIF4A1 protein levels. Given that these results were negative, the impact of eIF4A1 overexpression on the 2D growth/invasion capacities of IBTK-KO cells was not examined further. We sincerely appreciate the reviewer's understanding regarding this matter.

      (2) The decrease in nascent protein synthesis in puromycin incorporation assays in Figure 3A suggests that the effects of IBTK KO are comparable to and additive with silvestrol. It would be of interest to examine whether silvestrol decreases nascent protein synthesis or increases stress granules in the IBTK KO cells stably expressing IBTK as well.

      We appreciate the reviewer for bringing up this crucial point. We have shown that silvestrol treatment still decreased nascent protein synthesis in IBTK-KO cells overexpressing FLAG-IBTK (Fig. S3B).

      (3) The data presented in Figure 5 regarding the role of mTORC1 in IBTK-mediated eIF4A1 ubiquitination needs further clarification on several points:

      • It is not clear if the experiments in Figure 5F with Phos-tag gels are using the FLAG-IBTK deletion mutant or the peptide containing the mTOR sites as it is mentioned on line 517, page 19 "To do so, we generated an IBTK deletion mutant (900-1150 aa) spanning the potential mTORC1-regulated phosphorylation sites" This needs further clarification.

      We appreciate the reviewer for bringing up this crucial point. The IBTK deletion mutant used in Fig. 5F is FLAG-IBTK900-1150aa. We have annotated it with smaller font size in the panel (red box) in Author response image 1.

      Author response image 1.

      • It may be of benefit to repeat the Phos-tag experiments with full-length FLAG-IBTK and/or endogenous IBTK with molecular weight markers indicating the size of migrated bands.

      We appreciate the reviewer for bringing up this crucial point. We attempted to perform Phos-tag assays to detect overexpressed full-length FLAG-IBTK or endogenous IBTK. However, we were unable to transfer full-length FLAG-IBTK or endogenous IBTK onto the nitrocellulose membrane during Phos-tag WB analysis, likely because of the limitations of this technique. In our experience, Phos-tag gels are less efficient at detecting mobility shifts of proteins with large molecular weights, and IBTK, at approximately 160 kDa, falls within this category. Considering these technical constraints, we did not include Phos-tag assay results for full-length IBTK in our study. We sincerely appreciate the reviewer's understanding regarding this matter.

      The binding of Phos-tag to phosphorylated proteins induces a mobility shift during gel electrophoresis or other protein separation techniques. This shift allows phosphorylated proteins to be visualized and quantified separately from non-phosphorylated proteins. It is important to note that these mobility shifts indicate phosphorylation status rather than actual molecular weights. Pre-stained protein markers are typically used only as a reference to assess the efficiency of protein transfer onto the membrane [Ref: 1]. For these reasons, we did not add molecular weights to the WB images.

      Reference [1]. FUJIFILM Wako Pure Chemical Corporation, https://www.wako-chemicals.de/media/pdf/c7/5e/20/FUJIFILM-Wako_Phos-tag-R.pdf

      • Additionally, torin or Lambda phosphatase treatment may be used to confirm the specificity of the band in separate experiments.

      We appreciate the reviewer for bringing up this crucial point. Torin1 is a synthetic mTOR inhibitor that prevents the binding of ATP to mTOR, leading to the inactivation of both mTORC1 and mTORC2, whereas rapamycin primarily targets mTORC1 activity and may inhibit mTORC2 in certain cell types after prolonged treatment. We have identified the mTORC1/S6K1 complex as the predominant mediator of IBTK phosphorylation. Therefore, in this context, we think that rapamycin is sufficient to inactivate the mTORC1/S6K1 pathway. As shown in Fig. 5F, the phosphorylated IBTK900-1150aa was markedly decreased while the non-phosphorylated form was simultaneously increased in rapamycin-treated cells. As per the reviewer's suggestion, we treated cells overexpressing FLAG-IBTK900-1150aa with lambda phosphatase. As shown in Fig. 5G, lambda phosphatase treatment completely abolished the mobility shifts of phosphorylated FLAG-IBTK900-1150aa. Additionally, the lowest band displayed an abundant accumulation of the non-phosphorylated form of FLAG-IBTK900-1150aa. These findings confirm that the mobility shifts observed in WB analysis correspond to the phosphorylated forms of FLAG-IBTK900-1150aa.

      • Phos-tag gels with the IBTK CRISPR KO line would also help confirm that the non-phosphorylated band is indeed IBTK.

      We appreciate the reviewer for bringing up this crucial point. As stated above, we performed Phos-tag assays to detect the mobility shifts of phosphorylated FLAG-IBTK900-1150aa. An anti-FLAG antibody, not the anti-IBTK antibody, was used for WB detection. This antibody does not cross-react with endogenous IBTK.

      • It is unclear why the lower, phosphorylated bands seem to be increasing (rather than decreasing) with AA starvation/ Rapa in Fig 5H.

      We appreciate the reviewer for bringing up this crucial point. We think the panel the reviewer mentioned is Fig. 5F. According to the principle of Phos-tag assays, proteins with higher phosphorylation levels have slower migration rates on SDS-PAGE, while proteins with lower phosphorylation levels have faster migration rates.

      As shown in Author response image 2, the green box indicates the most phosphorylated forms of FLAG-IBTK900-1150aa, the red box indicates the moderately phosphorylated forms of FLAG-IBTK900-1150aa, and the yellow box indicates the non-phosphorylated forms of FLAG-IBTK900-1150aa. AA starvation or Rapamycin treatment reduced the hyperphosphorylated forms of FLAG-IBTK900-1150aa (green box), while simultaneously increasing the hypophosphorylated (red box) and non-phosphorylated (yellow box) forms of FLAG-IBTK900-1150aa. Thus, we conclude that AA starvation or Rapamycin treatment leads to a marked decrease in the phosphorylation levels of FLAG-IBTK900-1150aa.

      Author response image 2.

      Reviewer #2 (Public Review):

      Summary:

      This study by Sun et al. identifies a novel role for IBTK in promoting cancer protein translation, through regulation of the translational helicase eIF4A1. Using a multifaceted approach, the authors demonstrate that IBTK interacts with and ubiquitinates eIF4A1 in a non-degradative manner, enhancing its activation downstream of mTORC1/S6K1 signaling. This represents a significant advance in elucidating the complex layers of dysregulated translational control in cancer.

      Strengths:

      A major strength of this work is the convincing biochemical evidence for a direct regulatory relationship between IBTK and eIF4A1. The authors utilize affinity purification and proximity labeling methods to comprehensively map the IBTK interactome, identifying eIF4A1 as a top hit. Importantly, they validate this interaction and the specificity for eIF4A1 over other eIF4 isoforms by co-immunoprecipitation in multiple cell lines. Building on this, they demonstrate that IBTK catalyzes non-degradative ubiquitination of eIF4A1 both in cells and in vitro through the E3 ligase activity of the CRL3-IBTK complex. Mapping IBTK phosphorylation sites and showing mTORC1/S6K1-dependent regulation provides mechanistic insight. The reduction in global translation and eIF4A1-dependent oncoproteins upon IBTK loss, along with clinical data linking IBTK to poor prognosis, support the functional importance.

      Weaknesses:

      While these data compellingly establish IBTK as a binding partner and modifier of eIF4A1, a remaining weakness is the lack of direct measurements showing IBTK regulates eIF4A1 helicase activity and translation of target mRNAs. While the effects of IBTK knockout/overexpression on bulk protein synthesis are shown, the expression of multiple eIF4A1 target oncogenes remains unchanged.

      Summary:

      Overall, this study significantly advances our understanding of how aberrant mTORC1/S6K1 signaling promotes cancer pathogenic translation via IBTK and eIF4A1. The proteomic, biochemical, and phosphorylation mapping approaches established here provide a blueprint for interrogating IBTK function. These data should galvanize future efforts to target the mTORC1/S6K1-IBTK-eIF4A1 axis as an avenue for cancer therapy, particularly in combination with eIF4A inhibitors.

      Reviewer #1 (Recommendations For The Authors):

      (1) Certain references should be provided for clarity. For example, page 15, line 418: "The C-terminal glycine glycine (GG) amino acid residues are essential for Ub conjugation to targeted proteins".

      We appreciate the reviewer for bringing up this crucial point. We have cited two fundamental review papers on the ubiquitin system (PMID: 22524316, 9759494) as references for this sentence.

      (2) Please describe the properties of the ΔBTB mutant on page 15 when first describing it. What motifs does it lack and has it been described before in functional studies?

      We appreciate the reviewer for bringing up this crucial point. We added a sentence to describe the properties of the ΔBTB mutant. This mutant lacks the BTB1 and BTB2 domains (deletion of aa 554–871), which have been previously demonstrated to be essential for binding to CUL3. The original reference has been added to the revised manuscript.

      (3) In Figure 2G how do the authors explain the fact that co-expression of the Ub K-ALLR mutant, which is unable to form polyubiquitin chains, formed only a moderate reduction in IBTK-mediated eIF4A1 ubiquitination?

      We appreciate the reviewer for bringing up this crucial point. The Ub K-ALLR mutant can indeed conjugate to substrate proteins, but it cannot form chains due to its absence of lysine residues, resulting in mono-ubiquitination. Multi-mono-ubiquitination refers to the attachment of single ubiquitin molecules to multiple lysine residues on a substrate protein. It's worth noting that a poly-ubiquitinated protein and a multi-mono-ubiquitinated protein appear strikingly similar in Western blot. Our findings demonstrated that the co-expression of the Ub K-ALL-R mutant resulted in only a modest reduction in IBTK-mediated eIF4A1 ubiquitination (Fig. 2G), and that eIF4A1 was ubiquitinated at twelve lysine residues when co-expressed with IBTK (Fig. S2F). As such, we conclude that the CRL3IBTK complex primarily catalyzes multi-mono-ubiquitination on eIF4A1.

      (4) In Figure 5, The identity of the seven sites in the IBTK 7ST A mutants should be specified.

      We appreciate the reviewer for bringing up this crucial point. We have specified the seven mutation sites in the IBTK-7ST A mutant (Fig. 6A).

      (5) In Figure 5, the rationale for generating antibodies only to S990/992/993, as opposed to the other mTORC1/S6K motifs should be specified.

      We appreciate the reviewer for bringing up this crucial point. Having demonstrated that IBTK can be phosphorylated (with evidence from positive Phos-tag and in vitro phosphorylation assays), we sought to detect changes in phosphorylation levels directly, using an antibody specific to phosphorylated IBTK. However, generating a separate phosphorylation-specific antibody for each of the seven sites would be expensive. Because S990/992/993 are three adjacent sites, we deemed it appropriate to generate a single antibody that recognizes the phospho-S990/992/993 epitope. Moreover, of the seven phosphorylation sites, S992 perfectly matches the consensus motif for S6K1 phosphorylation (RXRXXS). Using this antibody, we observed a substantial decrease in the phosphorylation of these three adjacent Ser residues in IBTK following either AA deprivation or Rapamycin treatment (Fig. 5L). We have specified these points in the manuscript.
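
      As an illustration of what matching the RXRXXS consensus means in practice, the sketch below scans a protein sequence for Arg-X-Arg-X-X-Ser and reports the position of each candidate serine. The sequence is invented for the example and is not the IBTK sequence; the function name and the `start` parameter are likewise assumptions made for illustration.

```python
import re

# R-x-R-x-x-S, wrapped in a lookahead so that overlapping motifs are all found.
S6K1_MOTIF = re.compile(r"(?=(R.R..S))")

def find_rxrxxs_sites(seq, start=1):
    """Return (serine_residue_number, matched_motif) for each RXRXXS match.

    `start` is the residue number of the first character in `seq`, which is
    useful when scanning a fragment rather than a full-length protein.
    """
    return [(m.start() + 5 + start, m.group(1)) for m in S6K1_MOTIF.finditer(seq)]

# Made-up 18-residue sequence containing two motif instances.
toy_seq = "AARSRTPSGGRLRAASKK"

sites = find_rxrxxs_sites(toy_seq)
print(sites)
```

      A sequence match of this kind only flags candidate sites; whether a given serine is actually phosphorylated still has to be shown experimentally, as the phospho-specific antibody does here.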

      Reviewer #2 (Recommendations For The Authors):

      The following suggestions would strengthen the study:

      (1) Directly examine the effects of IBTK modulation (knockdown/knockout/ overexpression) on eIF4A1 helicase activity.

      We appreciate the reviewer for bringing up this crucial point. We agree with the reviewer's suggestion that directly evaluating IBTK's influence on eIF4A1 helicase activity would strengthen our conclusion. However, the current eIF4A1 helicase assays, as described in previous publications [Ref: 1, 2], can only be conducted in vitro using purified recombinant proteins. For instance, it is feasible to assess the varying levels of helicase activity exhibited by recombinant wild-type or mutant eIF4A1 proteins [Ref: 2]. Importantly, there is currently no reported methodology for evaluating the helicase activity of eIF4A1 in vivo, that is, in the gene knockdown, knockout, or overexpression cellular contexts mentioned by the reviewer. Therefore, we have not performed these assays, and we sincerely appreciate the reviewer's understanding in this regard.

      Reference:

      [1] Chu J, Galicia-Vázquez G, Cencic R, Mills JR, Katigbak A, Porco JA, Pelletier J. CRISPR-mediated drug-target validation reveals selective pharmacological inhibition of the RNA helicase, eIF4A. Cell reports. 2016 Jun 14;15(11):2340-7.

      [2] Chu J, Galicia-Vázquez G, Cencic R, Mills JR, Katigbak A, Porco JA, Pelletier J. CRISPR-mediated drug-target validation reveals selective pharmacological inhibition of the RNA helicase, eIF4A. Cell reports. 2016 Jun 14;15(11):2340-7.

      (2) Justify why the expression of some but not all eIF4A1 target oncogenes is affected in IBTK-depleted/overexpressing cells. This is important if IBTK should be considered as a therapeutic target. The authors should consider which of the eIF4A1 targets are most impacted by IBTK KO. This would provide a more focused therapeutic approach in the future.

      We appreciate the reviewer for bringing up this crucial point. As the reviewer has pointed out, we assessed the protein levels of ten reported eIF4A1 target genes across three cancer cell lines (Fig. 4, Fig. S4A, C). We observed that IBTK depletion led to a substantial reduction in the protein levels of most eIF4A1-regulated oncogenes, although there were some exceptions. For instance, IBTK KO in H1299 cells had minimal influence on the protein levels of ROCK1 (Fig. S4A). Several explanations might account for this observation. First, given that our list of eIF4A1 target genes was collected from previous studies conducted in distinct cell lines, it is not unexpected for different lines to show subtle differences in the regulation of eIF4A1 target genes. Second, as a CRL3 adaptor, IBTK potentially performs other biological functions via ubiquitination of specific substrates; dysregulation of these could buffer the impact of IBTK KO on the protein expression of some eIF4A1 target genes. We added these comments to the Discussion section of the revised manuscript.

      (3) Expand mTOR manipulation experiments (inhibition, Raptor knockout, activation) and evaluate impacts on IBTK phosphorylation, eIF4A1 ubiquitination, and translation.

      The mTORC1 signaling pathway is constitutively active under normal culture conditions. In order to inhibit mTORC1 activation, we employed several approaches including AA starvation, Rapamycin treatment, or Raptor knockout. Our results have demonstrated that both AA starvation and rapamycin treatment led to a reduction in eIF4A1 ubiquitination (Fig. 5M). Moreover, we have included new findings in the revised manuscript, which highlight that Raptor knockout specifically decreases eIF4A1 ubiquitination (Fig. 5N). It is worth mentioning that the impacts of mTOR inhibition or activation on protein translation have been extensively investigated and documented in numerous studies. Therefore, in our study, we did not feel it necessary to examine these treatments further.

      (4) Although not absolutely necessary, it would be nice to see if some of these findings are true in other cancer cell types.

      We appreciate the reviewer for bringing up this crucial point. We concur with the reviewer's suggestion that including data from other cancer cell types would strengthen our conclusion. While the majority of our data is derived from two cervical cancer cell lines, we have corroborated certain key findings, such as the impact of IBTK on eIF4A1 and its target gene expression, in H1299 cells (human lung cancer) (Fig. 2C, Fig. S4A, B) and in CT26 cells (murine colon adenocarcinoma) (Fig. S4C, D). Additionally, we demonstrated that IBTK promotes IFN-γ-induced PD-L1 expression and tumor immune escape in both H1299 and CT26 cells (Fig. S6A-K).

    3. Reviewer #1 (Public Review):

      In this study, the authors examined the role of IBTK, a substrate-binding adaptor of the CRL3 ubiquitin ligase complex, in modulating the activity of the eIF4F translation initiation complex. They find that IBTK mediates the non-degradative ubiquitination of eIF4A1, promotes cap-dependent translational initiation, nascent protein synthesis, oncogene expression, and tumor cell growth. Correspondingly, phosphorylation of IBTK by mTORC1/S6K1 increases eIF4A1 ubiquitination and sustains oncogenic translation.

      Strengths:

      This study utilizes multiple biochemical, proteomic, functional and cell biology assays to substantiate its results. Importantly, the work nominates IBTK as a unique substrate of mTORC1, and further validates eIF4A1 (a crucial subunit of the eIF4F complex) as a promising therapeutic target in cancer. Since IBTK interacts broadly with multiple members of the translation initiation complex, it will be interesting to examine its role in eIF2alpha-mediated ER stress as well as eIF3-mediated translation. Additionally, since IBTK exerts pro-survival effects in multiple cell types, it will be of relevance to characterize the role of IBTK in mediating increased mTORC1-mediated translation in other tumor types, thus potentially impacting their treatment with eIF4F inhibitors.

      Limitations/Weaknesses:

      The findings are mostly well supported by data, but some areas need clarification and could potentially be enhanced with further experiments:

      (1) Since eIF4A1 appears to function downstream of IBTK1, can the effects of IBTK1 KO/KD in reducing puromycin incorporation (Fig 3A), cap-dependent luciferase reporter activity (Fig 3G), oncogene expression (Fig 4A) or 2D growth/invasion assays (Fig 4) be overcome or bypassed by overexpressing eIF4A1? These could potentially be tested in future studies.

      (2) The decrease in nascent protein synthesis in puromycin incorporation assays in Figure 3A suggests that the effects of IBTK KO are comparable to and additive with silvestrol. It would be of interest to examine whether silvestrol decreases nascent protein synthesis or increases stress granules in the IBTK KO cells stably expressing IBTK as well.

      (3) The data presented in Figure 5 regarding the role of mTORC1 in IBTK-mediated eIF4A1 ubiquitination need further clarification on several points:

      - It is not clear if the experiments in Figure 5F with Phos-tag gels are using the FLAG-IBTK deletion mutant or the peptide containing the mTOR sites, as it is mentioned on line 517, page 19: "To do so, we generated an IBTK deletion mutant (900-1150 aa) spanning the potential mTORC1-regulated phosphorylation sites". This needs further clarification.

      - It may be of benefit to repeat the Phos-tag experiments with full-length FLAG-IBTK and/or endogenous IBTK, with molecular weight markers indicating the size of migrated bands.

      - Additionally, torin or lambda phosphatase treatment may be used to confirm the specificity of the band in separate experiments.

      - Phos-tag gels with the IBTK CRISPR KO line would also help confirm that the non-phosphorylated band is indeed IBTK.

      - It is unclear why the lower, phosphorylated bands seem to be increasing (rather than decreasing) with AA starvation/Rapa in Fig 5H.

    4. Reviewer #2 (Public Review):

      Summary:

      This study by Sun et al. identifies a novel role for IBTK in promoting cancer protein translation, through regulation of the translational helicase eIF4A1. Using a multifaceted approach, the authors demonstrate that IBTK interacts with and ubiquitinates eIF4A1 in a non-degradative manner, enhancing its activation downstream of mTORC1/S6K1 signaling. This represents a significant advance in elucidating the complex layers of dysregulated translational control in cancer.

      Strengths:

      A major strength of this work is the convincing biochemical evidence for a direct regulatory relationship between IBTK and eIF4A1. The authors utilize affinity purification and proximity labeling methods to comprehensively map the IBTK interactome, identifying eIF4A1 as a top hit. Importantly, they validate this interaction and the specificity for eIF4A1 over other eIF4 isoforms by co-immunoprecipitation in multiple cell lines. Building on this, they demonstrate that IBTK catalyzes non-degradative ubiquitination of eIF4A1 both in cells and in vitro through the E3 ligase activity of the CRL3-IBTK complex. Mapping IBTK phosphorylation sites and showing mTORC1/S6K1-dependent regulation provides mechanistic insight. The reduction in global translation and eIF4A1-dependent oncoproteins upon IBTK loss, along with clinical data linking IBTK to poor prognosis, supports the functional importance. Finally, the impact of IBTK on eIF4A1 target gene expression in colon and lung cancer cell lines strengthens these findings.

      Weaknesses:

      While the effects of IBTK knockout/over-expression on bulk protein synthesis are shown, the expression of several eIF4A1 target oncogenes remains unchanged.

      Summary:

      Overall, this study significantly advances our understanding of how aberrant mTORC1/S6K1 signaling promotes cancer pathogenic translation via IBTK and eIF4A1. The proteomic, biochemical and phosphorylation mapping approaches established here provide a blueprint for interrogating IBTK function. These data should galvanize future efforts to target the mTORC1/S6K1-IBTK-eIF4A1 axis as an avenue for cancer therapy, particularly in combination with eIF4A inhibitors.

    1. eLife assessment

      This important work presents a consolidated overview of the NeuroML2 open community standard. It provides convincing evidence for its central role within a broader comprehensive software ecosystem for the development of neuronal models that are open, shareable, reproducible, and interoperable. A major strength of the presented work is the persistence of the development over more than two decades to establish, maintain, and adapt this standard to meet the evolving needs of the field. This work is of broad interest to the sub-cellular, cellular, computational, and systems neuroscience communities undertaking studies involving theory, modeling, and simulation.

    1. Author response:

      Reviewer #1

      The first is that data on the general health of mice with single and double knockouts is not shown, nor is there any data on effects in any other tissues. This gives the impression that the only phenotype is in the male reproductive system, which would be misleading if there were phenotypes in other tissues that are not reported.

      We thank the reviewer for helpful and constructive suggestions that we plan to implement in the revision. We agree with this point and we will add a statement that the effect on the urogenital system was not the only observed phenotype, although it was the most striking histological feature that we found. We did notice some other physiological differences that we are examining in detail and determining their mechanisms, for future publications.

      Furthermore, data for the genitourinary system in single knockouts are very sparse; data are described for fertility in Figure 1H, ploidy, and cell number in Figures 2B and C, plasma testosterone and luteinizing hormone levels in Figures 5C and 5D, and morphology of testis and prostate tissue for single Cdk8 knockout in Supplementary Figure 1C (although in this case the images do not appear very comparable between control and CDK8 KO, thus perhaps wider fields should be shown), but, for example, there is no analysis of different meiotic stages or of gene expression in single knockouts. It is worth mentioning that single knockouts seem to show a corresponding upregulation of the level of the paralogue kinase, indicating that any lack of phenotypes might be due to feedback compensation, which would be an interesting finding if confirmed; this has not been mentioned.

      We agree that a description of the single KO could be beneficial, but we expect no big differences with the WT or Cre-Ert. We found neither histological differences nor changes in cell counts or ratios of cell types. Our ethical committee also has concerns about sacrificing mice without major phenotypic changes, without a well formulated hypothesis about the observed effects. We plan to add histological pictures to the next version of the article.

      We thank the reviewer for raising an important point about the paralog upregulation. Indeed, our data on primary cells (Supplementary Figure 1B) suggest the upregulation of CDK19 in CDK8KO and vice versa. We will point this out in the discussion. We plan to examine the data for the testis as soon as more tissues are available.

      The second major weakness is that the correlation between double knockout and reduced expression of genes involved in steroid hormone biosynthesis is portrayed as a causal mechanism for the phenotypes observed. While this is a possibility, there are no experiments performed to provide evidence that this is the case. Furthermore, there is no evidence showing that CDK8 and/or CDK19 are directly responsible for the transcription of the genes concerned.

      We agree with the reviewer that the effects on CDK8/CDK19/CCNC could lead to the observed transcriptional changes in multiple indirect steps. There are, however, major technical challenges in examining the binding of transcription factors in the tissue, especially in Leydig cells which are a relatively minor population. We will clarify it in the revision, and strengthen this point in the discussion.

      Finally, the authors propose that the phenotypes are independent of the kinase activity of CDK8 or CDK19 because treatment of mice for a month with an inhibitor does not recapitulate the effects of the knockout, and nor does expression of two steroidogenic genes change in cultured Leydig cells upon treatment with an inhibitor. However, there are no controls for effective target inhibition shown.

      We thank the reviewer for raising this concern, which we will address in the revision. This study used the same CDK8/19 inhibitor (SNX631-6) as in the recently published study on prostate cancer (doi: 10.1172/JCI176709). That study describes the inhibitor, its target engagement in cell-free and cell-based assays, its anticancer potency, and its transcriptomic effects in vivo at the same dosage strength as in the present study, which phenocopy the effects of CDK8/19 knockdown. Additional data will be included in the revision.

      Reviewer #2

      The claim of reproductive defects in the induced double knockout of CDK8/19 resulted from the loss of CCNC via a kinase-independent mechanism is interesting but was not supported by the data presented. While the construction and analysis of the systemic induced knockout model of Cdk8 in Cdk19KO mice is not trivial, the analysis and data are weakened by the systemic effect of Cdk8 loss, making it difficult to separate the systemic effect from the local testis effect.

      We agree with the reviewer that the effects on the testis could be due to the systemic loss of CDK8 rather than its loss specifically in the testis, and we will clarify this in the revision. We will also clarify that, although our results suggest that the effects of CDK8/19 knockout are kinase-independent and that the loss of Cyclin C is a likely explanation for this kinase independence, we do not claim that it is the mechanism.

      The analysis of male sterile phenotype is also inadequate with poor image quality, especially testis HE sections. The male reproductive tract picture is also small and difficult to evaluate.

      Unfortunately, the image quality worsened during the submission process through bioRxiv. We uploaded the high-resolution pictures for the journal, but they were probably not presented to the reviewer. We will re-send the high-resolution images.

      The mice crossing scheme is unusual as you have three mice to cross to produce genotypes, while we could understand that it is possible to produce pups of desired genotypes with different mating schemes, such a vague crossing scheme is not desirable and of poor genetics practice.

      We thank the reviewer for this suggestion. Indeed, our scheme is not a representation of the actual breeding scheme but just a brief explanation of lineages used for the acquisition of the triple transgenic mice. We will include the full crossing scheme into the revision.

      Also using TAM-treated wild type as control is ok, but a better control will be TAM-treated ERT2-cre; CDK8f/f or TAM-treated ERT2 Cre CDK19/19 KO, so as to minimize the impact from the well-recognized effect of TAM.

      We used TAM-treated ERT2-cre for most of the experiments, and did not observe any major histological or physiological differences with the WT+TAM. We will make sure to present them in the revision.

      While the authors proposed that the inducible loss of CDK8 in the CDK19 knockout background is responsible for spermatogenic defects, it was not clear in which cells CDK8/19 genes are interested and which cell types might have a major role in spermatogenesis. The authors also put forward the evidence that reduction/loss of Testosterone might be the main cause of spermatogenic defects, which is consistent with the expression change in genes involved in the steroidogenesis pathway in Leydig cells of inducible double knockout. However, it is not clear how the loss of Testosterone contributed to the loss of CcnC protein.

      We agree with the reviewer that the spermatogenic defects could be caused by the effects on gene expression in tissues other than Leydig cells. Nevertheless, this is our primary hypothesis since these changes resemble the effects of chemical castration in rats (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3408499/), and in SCARKO mice (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3968405/).

      Our hypothesis is actually the reversed scenario proposed by the reviewer. We think that the loss of steroidogenic gene expression is caused by the loss of CDK8/19 and Cyclin C in Leydig cells. This, in turn, leads to a drop of testosterone levels. We will expand this explanation for clarity.

      The authors should clarify or present the data on where CDK8 and CDK19 as well as CcnC are expressed so as to help the readers understand which tissues both CDK might be functioning in and cause the loss of CcnC. It should be easier to test the hypothesis of CDK8/19 stabilizing CcnC protein using double knock-out primary cells, instead of the whole testis.

      The stabilizing effect of Cdk8/19 on CcnC has been previously discovered and reported in cell culture (doi: 10.1093/nar/gkad538), and here we have confirmed it at the level of whole tissue. Due to the limited sensitivity of single-cell sequencing (only ~5,000 transcripts are sequenced out of an average total of 500,000 transcripts per cell, so lowly expressed transcripts are not detected in all cells), it is challenging to firmly establish CDK8/19-positive and -negative tissues from single-cell data because both transcripts are minor. This image will be included in the next version. We plan to resolve this matter using two approaches. First, we will try immunohistochemistry. If this method is not sufficiently sensitive, we will analyze published single-cell sequencing data from mouse databases and re-analyze our data. So far the former approach has been challenging for us due to the absence of anti-mouse antibodies that are specific for CDK8 and CDK19 and work on tissue sections. Neither we nor others could produce tissue-specific staining with the currently available commercial antibodies. The only published specific antibody is currently not available.
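
      The sensitivity argument above can be made concrete with a short calculation. The sketch below is illustrative only: it takes the two figures quoted in the response (~5,000 sequenced out of ~500,000 transcripts per cell) and assumes, as a simplification not stated by the authors, that each molecule is captured independently with the same probability. The function name is hypothetical.

```python
# Illustrative sketch, not the authors' analysis.
# Figures taken from the response above:
CAPTURED_PER_CELL = 5_000    # transcripts sequenced per cell
TOTAL_PER_CELL = 500_000     # average transcripts present per cell

# Fraction of molecules that end up sequenced (about 1%).
capture_rate = CAPTURED_PER_CELL / TOTAL_PER_CELL


def detection_probability(copies: int, rate: float = capture_rate) -> float:
    """Probability that at least one of `copies` molecules of a transcript
    is captured, ASSUMING independent capture with equal probability."""
    return 1.0 - (1.0 - rate) ** copies


if __name__ == "__main__":
    # Hypothetical copy numbers per cell, chosen only for illustration.
    for copies in (1, 5, 20, 100):
        print(f"{copies:>3} copies/cell -> P(detected) = "
              f"{detection_probability(copies):.3f}")
```

      Under these assumptions a transcript present at a handful of copies per cell is missed in most cells, which is why the absence of reads for a minor transcript in a given cell type is weak evidence that the gene is not expressed there.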

      Since CDK8KO and CDK19KO have significantly reduced fertility compared to the wild type, it might be important to measure the sperm quantity and motility among CDK8 KO, CDK19KO, and induced DKO to evaluate spermatogenesis based on their sperm production.

      We agree that this is an interesting question. We did not do spermograms for single KOs but we don’t think that a decreased sperm count would explain CDK8KO infertility as the vasectomized males are able to produce copulative plugs in females whereas CDK8KO males do not, suggesting the absence of mating behavior as a reason for low fertility in the latter genotype.

      Some data for the inducible knockout efficiency of Cdk8 were presented in Supplemental Figure 1, but there is no legend for the supplemental figures, it was not clear which band represented the deletion band, and which tissues were examined. Tail or testis?

      We apologize for the accidental loss of supplementary figure legends, which will be presented in the next version. The efficiency of CDK8 KO in different tissues was previously examined by us in https://www.ncbi.nlm.nih.gov/gene/264064. The western blot in the MS represents deletion data for the testis.

      It seems that two months after the injection of Tam, all the Cdk8 were completely deleted, indicating extremely efficient deletion of Tam induction by two months post administration. Was the complete deletion of Cdk8 happening even earlier?

      The complete deletion of CDK8 occurs within a week, or even as early as 2-3 days, in culture, and by two weeks at the latest in vivo. We chose the two-month period to prevent effects of tamoxifen on gene expression. We examined other time points (Figure 6) and registered the onset of effects at two weeks and the maximum effect by one month.

      The authors found that Sertoli cells re-entered the cell cycle in the inducible double knockout but stopped short of careful characterization other than increased expression of cell cycle genes.

      We agree with the reviewer, and we will add Ki67 (or equivalent) staining along with Sertoli cell markers.

      Dko should be appropriately named iDKO (induced dKO).

      We will make the corresponding change.

      "We performed necropsy": not the right wording here. "Colchicine-like apoptotic bodies": what does this mean? Not clear.

      We will amend the next version to address these minor points, and we thank the reviewer for careful reading of the manuscript.

      Images throughout the manuscript suffer from poor resolution and are often blurry and hard to evaluate.

      As mentioned above, we had a problem with image quality during the submission through Biorxiv and we will provide high resolution images in the next version.

      To pinpoint the meiotic stage defect of iDKO, it is better to use the meiotic chromosome spread approach.

      Unfortunately, meiotic spreads would not be feasible or informative, due to a low number of surviving cells in iDKO and the fact that there were evidently no cells in stages after SYCP3+.

    2. eLife assessment

      This useful study reports an unexpected phenotype of atrophy of the male reproductive system and infertility upon combined knockout in adult mice of the genes encoding the two kinases CDK8 and CDK19. While the morphological evidence and single-cell transcriptomic data are solid, the proposed mechanism remains unconvincing as there is little evidence for causality, and some controls are missing. This work will be of interest to reproductive biologists, developmental biologists, and andrologists.

    3. Reviewer #1 (Public Review):

      Summary:

      In this paper, Bruter and colleagues report the effects of inducible deletion of the genes encoding the two paralogous kinases of the Mediator complex in adult mice. The physiological roles of these two kinases, CDK8 and CDK19, are currently rather poorly understood; although conserved in all eukaryotes, and among the most highly conserved kinases in vertebrates, individual knockouts of genes encoding CDK8 homologues in different species have revealed generally rather mild and specific effects, in contrast to Mediator itself. Here, the authors provide evidence that neither CDK8 nor CDK19 are required for adult homeostasis but they are functionally redundant for maintenance of reproductive tissue morphology and fertility in males.

      Strengths:

      The morphological data on the atrophy of the male reproductive system and the arrest of spermatocyte meiosis are solid and are reinforced by single-cell transcriptomics data, which is a challenging technique to implement in vivo. The main findings are important and will be of interest to scientists in the fields of transcription and developmental biology.

      Weaknesses:

      There are several major weaknesses.

      The first is that data on the general health of mice with single and double knockouts is not shown, nor is there any data on effects in any other tissues. This gives the impression that the only phenotype is in the male reproductive system, which would be misleading if there were phenotypes in other tissues that are not reported. Furthermore, data for the genitourinary system in single knockouts are very sparse; data are described for fertility in Figure 1H, ploidy, and cell number in Figures 2B and C, plasma testosterone and luteinizing hormone levels in Figures 5C and 5D, and morphology of testis and prostate tissue for single Cdk8 knockout in Supplementary Figure 1C (although in this case the images do not appear very comparable between control and CDK8 KO, thus perhaps wider fields should be shown), but, for example, there is no analysis of different meiotic stages or of gene expression in single knockouts. It is worth mentioning that single knockouts seem to show a corresponding upregulation of the level of the paralogue kinase, indicating that any lack of phenotypes might be due to feedback compensation, which would be an interesting finding if confirmed; this has not been mentioned.

      The second major weakness is that the correlation between double knockout and reduced expression of genes involved in steroid hormone biosynthesis is portrayed as a causal mechanism for the phenotypes observed. While this is a possibility, there are no experiments performed to provide evidence that this is the case. Furthermore, there is no evidence showing that CDK8 and/or CDK19 are directly responsible for the transcription of the genes concerned.

      Finally, the authors propose that the phenotypes are independent of the kinase activity of CDK8 or CDK19 because treatment of mice for a month with an inhibitor does not recapitulate the effects of the knockout, and nor does expression of two steroidogenic genes change in cultured Leydig cells upon treatment with an inhibitor. However, there are no controls for effective target inhibition shown.

    4. Reviewer #2 (Public Review):

      Summary:

      The authors tried to test the hypothesis that Cdk8 and Cdk19 stabilize the cytoplasmic CcNC protein, the partner protein of the Mediator complex including CDK8/19 and Mediator protein via a kinase-independent function by generating induced double knockout of Cdk8/19. However, the evidence presented suffers from a lack of focus and rigor and does not support their claims.

      Strengths:

      This is the first comprehensive report on the effect of a double knockout of CDK8 and CDK19 in mice on male fertility, hormones, and single-cell testicular cellular expression. The inducible knockout mice led to male sterility with severe spermatogenic defects, and the authors attempted to use this animal model to test the kinase-independent function of CDK8/19, previously reported for humans. Single-cell RNA-seq of knockout testis presented a high resolution of molecular defects of all the major cell types in the testes of the inducible double knockout mice. The authors also have several interesting findings such as reentry into cell cycles by Sertoli cells, and loss of Testosterone in induced dko that could be investigated further.

      Weaknesses:

      The claim of reproductive defects in the induced double knockout of CDK8/19 resulted from the loss of CCNC via a kinase-independent mechanism is interesting but was not supported by the data presented. While the construction and analysis of the systemic induced knockout model of Cdk8 in Cdk19KO mice is not trivial, the analysis and data are weakened by the systemic effect of Cdk8 loss, making it difficult to separate the systemic effect from the local testis effect.

      The analysis of male sterile phenotype is also inadequate with poor image quality, especially testis HE sections. The male reproductive tract picture is also small and difficult to evaluate. The mice crossing scheme is unusual as you have three mice to cross to produce genotypes, while we could understand that it is possible to produce pups of desired genotypes with different mating schemes, such a vague crossing scheme is not desirable and of poor genetics practice. Also using TAM-treated wild type as control is ok, but a better control will be TAM-treated ERT2-cre; CDK8f/f or TAM-treated ERT2 Cre CDK19/19 KO, so as to minimize the impact from the well-recognized effect of TAM.

      While the authors proposed that the inducible loss of CDK8 in the CDK19 knockout background is responsible for spermatogenic defects, it was not clear in which cells CDK8/19 genes are interested and which cell types might have a major role in spermatogenesis. The authors also put forward the evidence that reduction/loss of Testosterone might be the main cause of spermatogenic defects, which is consistent with the expression change in genes involved in the steroidogenesis pathway in Leydig cells of inducible double knockout. However, it is not clear how the loss of Testosterone contributed to the loss of CcnC protein.

      The authors should clarify or present the data on where CDK8 and CDK19 as well as CcnC are expressed so as to help the readers understand which tissues both CDK might be functioning in and cause the loss of CcnC. It should be easier to test the hypothesis of CDK8/19 stabilizing CcnC protein using double knock-out primary cells, instead of the whole testis.

      Since CDK8KO and CDK19KO both have significantly reduced fertility in comparison with wildtype, it might be important to measure the sperm quantity and motility among CDK8 KO, CDK19KO, and induced DKO to evaluate spermatogenesis based on their sperm production.

      Some data for the inducible knockout efficiency of Cdk8 were presented in Supplemental Figure 1, but there is no legend for the supplemental figures, it was not clear which band represented the deletion band, and which tissues were examined. Tail or testis? It seems that two months after the injection of Tam, all the Cdk8 were completely deleted, indicating extremely efficient deletion of Tam induction by two months post administration. Was the complete deletion of Cdk8 happening even earlier? An examination of time points of induced loss would be useful and instructional as to when is the best time to examine phenotypes.

      The authors found that Sertoli cells re-entered the cell cycle in the inducible double knockout but stopped short of careful characterization other than increased expression of cell cycle genes.

      Overall this work suffered from a lack of focus and rigor in the analysis and lack of sufficient evidence to support their main conclusions.

      Minor:

      Dko should be appropriately named iDKO (induced dKO).

      "suppress spermatogenesis and male fertility" in the title does not fit the evidence presented.

      "DKO males, had an undersized and dedifferentiated reproductive system": what is the evidence for "undifferentiated"?

      "We performed necropsy": not the right wording here.

      "Colchicine-like apoptotic bodies": what does this mean? Not clear.

      Images throughout the manuscript suffer from poor resolution and are often blurry and hard to evaluate.

      To pinpoint the meiotic stage defect of iDKO, it is better to use the meiotic chromosome spread approach.

    1. Reviewer #1 (Public Review):

      In this manuscript, the authors described a computational method catELMo for embedding TCR CDR3 sequences into numeric vectors using a deep-learning-based approach, ELMo. The authors applied catELMo to two applications: supervised TCR-epitope binding affinity prediction and unsupervised epitope-specific TCR clustering. In both applications, the authors showed that catELMo generated significantly better binding prediction and clustering performance than other established TCR embedding methods.

      The authors have addressed all of my concerns except for one as following:

      (5) GIANA's result is like

      – ## TIME:2020-12-14 14:45:14|cmd: GIANA4.py|COVID_test/rawData/hc10s10.txt|IsometricDistance_Thr=7.0|thr_v=3.7|thr_s=3.3|exact=True|Vgene=True|ST=3

      – ## Column Info: CDR3 aa sequence, cluster id, other information in the input file
      CAISDGTAASSTDTQYF 1 TRBV10-3*01 6.00384245917387e-05 0.930103216755186 COVID19:BS-EQ-0002-T1-replacement_TCRB.tsv
      CAISDGTAASSTDTQYF 1 TRBV10-3*01 4.34559031223066e-05 0.918135389545364 COVID19:BS-EQ-0002-T2-replacement_TCRB.tsv
      CANATLLQVLSTDTQYF 2 TRBV21-1*01 3.00192122958694e-05 0.878695260046097 COVID19:BS-EQ-0002-T1-replacement_TCRB.tsv
      CANATLLQVLSTDTQYF 2 TRBV21-1*01 1.44853010407689e-05 0.768125375525736 COVID19:BS-EQ-0002-T2-replacement_TCRB.ts
      ...

      as in its example file at: https://raw.githubusercontent.com/s175573/GIANA/master/data/hc10s10--RotationEncodingBL62.txt

      The results directly give the clustering results in the second column, and there is no direct distance metric for hierarchical clustering. Therefore, it is still not clear how the authors conducted the hierarchical clustering on GIANA's results. Did the hierarchical clustering apply to each of the original clusters on the CDR3 distances within the same original cluster?
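
      To illustrate the point about the second column, here is a minimal sketch of reading output in the layout quoted above into flat cluster assignments. It assumes whitespace-separated columns in the order CDR3 sequence, cluster id, other fields, and comment lines beginning with `##`; the function name is hypothetical and this is not part of GIANA itself.

```python
from collections import defaultdict


def parse_cluster_assignments(lines):
    """Group CDR3 sequences by the cluster id found in the second column.

    ASSUMED layout (taken from the example quoted above):
    column 1 = CDR3 amino-acid sequence, column 2 = cluster id,
    remaining columns = information carried over from the input file.
    Lines starting with '#' and the literal '...' are skipped.
    """
    clusters = defaultdict(list)
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#") or line == "...":
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # malformed row: no cluster id present
        cdr3, cluster_id = fields[0], fields[1]
        clusters[cluster_id].append(cdr3)
    return dict(clusters)


# Rows copied from the example output quoted in the review.
example = [
    "## Column Info: CDR3 aa sequence, cluster id, other information in the input file",
    "CAISDGTAASSTDTQYF 1 TRBV10-3*01 6.00384245917387e-05 0.930103216755186 COVID19:BS-EQ-0002-T1-replacement_TCRB.tsv",
    "CAISDGTAASSTDTQYF 1 TRBV10-3*01 4.34559031223066e-05 0.918135389545364 COVID19:BS-EQ-0002-T2-replacement_TCRB.tsv",
    "CANATLLQVLSTDTQYF 2 TRBV21-1*01 3.00192122958694e-05 0.878695260046097 COVID19:BS-EQ-0002-T1-replacement_TCRB.tsv",
]

if __name__ == "__main__":
    print(parse_cluster_assignments(example))
```

      The parsed result is a flat partition (cluster id to member sequences) with no pairwise distances, which is why a hierarchical clustering step cannot be derived from this output alone.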

    2. Author response:

      The following is the authors’ response to the original reviews.

      Thank you very much for the careful and positive reviews of our manuscript. We have addressed each comment in the attached revised manuscript. We describe the modifications below. To avoid confusion, we've changed supplementary figure and table captions to start with "Supplement Figure" and "Supplementary Table," instead of "Figure" and "Table."

      We have modified/added:

      ● Supplementary Table S1: AUC scores for the top 10 frequent epitope types (pathogens) in the testing set of epitope split.

      ● Supplementary Table S5: AUCs of TCR-epitope binding affinity prediction models with BLOSUM62 to embed epitope sequences.

      ● Supplementary Table S6: AUCs of TCR-epitope binding affinity prediction models trained on catELMo TCR embeddings and random-initialized epitope embeddings.

      ● Supplementary Table S7: AUCs of TCR-epitope binding affinity prediction models trained on catELMo and BLOSUM62 embeddings.

      ● Supplementary Figure 4: TCR clustering performance for the top 34 abundant epitopes representing 70.55% of TCRs in our collected databases.

      ● Section Discussion.

      ● Section 4.1 Data: TCR-epitope pairs for binding affinity prediction.

      ● Section 4.4.2 Epitope-specific TCR clustering.

      Public Reviews:

      Reviewer #1 (Public Review):

      In this manuscript, the authors described a computational method catELMo for embedding TCR CDR3 sequences into numeric vectors using a deep-learning-based approach, ELMo. The authors applied catELMo to two applications: supervised TCR-epitope binding affinity prediction and unsupervised epitope-specific TCR clustering. In both applications, the authors showed that catELMo generated significantly better binding prediction and clustering performance than other established TCR embedding methods. However, there are a few major concerns that need to be addressed.

      (1) There are other TCR CDR3 embedding methods in addition to TCRBert. The authors may consider incorporating a few more methods in the evaluation, such as TESSA (PMCID: PMC7799492), DeepTCR (PMCID: PMC7952906) and the embedding method in ATM-TCR (reference 10 in the manuscript). TESSA is also the embedding method in pMTnet, which is another TCR-epitope binding prediction method and is the reference 12 mentioned in this manuscript.

      TESSA is designed for characterizing TCR repertoires, so we initially excluded it from the comparison. Our focus was on models developed specifically for amino acid embedding rather than TCR repertoire characterization. However, to address the reviewer's inquiry, we conducted further evaluations. Since both TESSA and DeepTCR used autoencoder-based models to embed TCR sequences, we selected the one used in TESSA for evaluation in our downstream prediction task, conducting ten trials in total. It achieved an average AUC of 75.69 in TCR split and 73.3 in epitope split. Notably, catELMo significantly outperformed it, with an AUC of 96.04 in TCR split and 94.10 in epitope split.

      Regarding the embedding method in ATM-TCR, it simply uses BLOSUM as an embedding matrix which we have already compared in Section 2.1. Furthermore, we have provided the comparison results between our prediction model trained on catELMo embeddings with the state-of-the-art prediction models such as netTCR and ATM-TCR in Table 6 of the Discussion section.

      (2) The TCR training data for catELMo is obtained from the ImmunoSEQ platform, including SARS-CoV2, EBV, CMV, and other disease samples. Meanwhile, antigens related to these diseases and their associated TCRs are extensively annotated in the databases VDJdb, IEDB and McPAS-TCR. The authors then utilized the curated TCR-epitope pairs from these databases to conduct the evaluations for epitope binding prediction and TCR clustering. Therefore, the training data for TCR embedding may already be implicitly tuned for better representations of the TCRs used in the evaluations. This seems to be true based on Table 4, as BERT-Base-TCR outperformed TCRBert. Could catELMo be trained on PIRD as TCRBert to demonstrate catELMo's embedding for TCRs targeting unseen diseases/epitopes?

      We would like to note that catELMo was trained exclusively on TCR sequences in an unsupervised manner, which means it has never been exposed to antigen information. We also ensured that the TCRs used in catELMo's training did not overlap with our downstream prediction data. Please refer to Section 4.1 Data, where we explicitly stated, “We note that it includes no identical TCR sequences with the TCRs used for training the embedding models.” Moreover, the performance gap (~1%) between BERT-Base-TCR and TCRBert, as observed in Table 4, is relatively small, especially when compared to the performance difference (>16%) between catELMo and TCRBert.

      To further address this concern, we conducted experiments using the same number of TCRs, 4,173,895 in total, sourced exclusively from healthy ImmunoSeq repertoires. This alternative catELMo model demonstrated a similar prediction performance (based on 10 trials) to the one reported in our paper, with an average AUC of 96.35% in TCR split and an average AUC of 94.03% in epitope split.

      We opted not to train catELMo on the PIRD dataset for several reasons. First, approximately 7.8% of the sequences in PIRD also appear in our downstream prediction data, which could be a potential source of bias. Furthermore, PIRD encompasses sequences related to diseases such as Tuberculosis, HIV, CMV, among others, which the reviewer is concerned about.

      (3) In the application of TCR-epitope binding prediction, the authors mentioned that the model for embedding epitope sequences was catELMo, but how about for other methods, such as TCRBert? Do the other methods also use catELMo-embedded epitope sequences as part of the binding prediction model, or use their own model to embed the epitope sequences? Since the manuscript focuses on TCR embedding, it would be nice for other methods to be evaluated on the same epitope embedding (maybe adjusted to the same embedded vector length).

      Furthermore, the authors found that catELMo requires less training data to achieve better performance. So one would think the other methods could not learn a reasonable epitope embedding with limited epitope data, and catELMo's better performance in binding prediction is mainly due to better epitope representation.

      Reviewers 1 and 3 raised similar concerns regarding the epitope embedding approach employed in our binding affinity prediction models. We address both comments together on page 6, where we discuss the epitope embedding strategies in detail.

      (4) In the epitope binding prediction evaluation, the authors generated the test data using TCR-epitope pairs from VDJdb, IEDB, McPAS, which may be dominated by epitopes from CMV. Could the authors show accuracy categorized by epitope types, i.e. the accuracy for TCR-CMV pairs and the accuracy for TCR-SARS-CoV2 pairs separately?

      The categorized AUC scores have been added in Supplementary Table 7. We observed significant performance boosts from catELMo compared with other embedding models.

      (5) In the unsupervised TCR clustering evaluation, GIANA and TCRdist directly output the clustering result, so they should not be affected by hierarchical clustering. Why did the curves of GIANA and TCRdist change in Figure 4 when relaxing the hierarchical clustering threshold?

      For fair comparisons, we performed GIANA and TCRdist with hierarchical clustering instead of the nearest neighbor search. We have clarified it in the revised manuscript as follows.

      “Both methods are developed on the BLOSUM62 matrix and apply nearest neighbor search to cluster TCR sequences. GIANA used the CDR3 of TCRβ chain and V gene, while TCRdist predominantly experimented with CDR1, CDR2, and CDR3 from both TCRα and TCRβ chains. For fair comparisons, we perform GIANA and TCRdist only on CDR3 β chains and with hierarchical clustering instead of the nearest neighbor search.”

      (6 & 7) In the unsupervised TCR clustering evaluation, the authors examined the TCRs related to the top eight epitopes. However, there are many more epitopes curated in VDJdb, IEDB and McPAS-TCR. In real applications, the set of potential epitopes is also more complex than just eight epitopes. Could the authors evaluate the clustering result using all the TCR data from the databases? In addition to NMI, it is important to know how specific each TCR cluster is. Could the authors add the fraction of pure clusters in the results? A pure cluster is one in which all the TCRs bind to the same epitope, and it is a metric used in the method GIANA.

      We would like to note that there is a significant disparity in TCR binding frequencies across different epitopes in current databases. For instance, the most abundant epitope (KLGGALQAK) has approximately 13k TCRs binding to it, while 836 out of 982 epitopes are associated with fewer than 100 TCRs in our dataset. Furthermore, there are 9347 TCRs having the ability to bind multiple epitopes. In order to robustly evaluate the clustering performance, we originally selected the top eight frequent epitopes from McPAS and removed TCRs binding multiple epitopes to create a more balanced dataset.
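      The filtering described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the function name `select_top_epitopes`, the `(tcr, epitope)` tuple layout, and the choice to rank epitopes before removing multi-epitope TCRs are all hypothetical.

```python
from collections import Counter, defaultdict


def select_top_epitopes(pairs, k):
    """Keep TCRs that bind exactly one epitope, restricted to the k most frequent epitopes.

    pairs: iterable of (tcr, epitope) tuples (hypothetical layout).
    Assumption: epitope frequency is counted over unique pairs *before*
    multi-epitope TCRs are removed; the source does not specify the order.
    """
    unique_pairs = set(pairs)

    # Rank epitopes by the number of distinct TCRs paired with them.
    counts = Counter(epitope for _, epitope in unique_pairs)
    top = {epitope for epitope, _ in counts.most_common(k)}

    # Record every epitope each TCR binds, across the whole dataset.
    epitopes_by_tcr = defaultdict(set)
    for tcr, epitope in unique_pairs:
        epitopes_by_tcr[tcr].add(epitope)

    # Keep pairs for top epitopes whose TCR binds a single epitope.
    return sorted(
        (tcr, epitope)
        for tcr, epitope in unique_pairs
        if epitope in top and len(epitopes_by_tcr[tcr]) == 1
    )
```

      With `k=8` this would correspond to the eight-epitope setting mentioned above, although the exact tie-breaking and deduplication rules used in the study are not stated in the response.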

      We acknowledge that the real-world scenario is more complex than just eight epitopes. Therefore, we conducted clustering experiments using the top most abundant epitopes whose combined cognate TCRs make up at least 70% of TCRs across three databases (34 epitopes). This is illustrated in Supplementary Figure 5. Furthermore, we extended our analysis by clustering all TCRs after filtering out those that bind to multiple epitopes, resulting in 782 unique epitopes. We found that catELMo achieved the 3rd and 2nd best performance in NMI and Purity, respectively (see Table below). These are aligned with our previous observations of the eight epitopes.

      Author response table 1.
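      For readers unfamiliar with the purity metrics discussed in this exchange, a minimal Python sketch follows. It assumes the standard definition of cluster purity (the fraction of TCRs that belong to the majority epitope of their cluster) and the reviewer's definition of a pure cluster (all members bind the same epitope). Whether the authors' "Purity" column uses exactly this definition, and whether singleton clusters are counted, is not stated, so both are assumptions here; the function name is hypothetical.

```python
from collections import Counter, defaultdict


def cluster_purity_metrics(cluster_ids, epitopes):
    """Return (purity, fraction_of_pure_clusters) for a clustering.

    cluster_ids[i] is the cluster of TCR i; epitopes[i] is its epitope label.
    Assumption: singleton clusters count as pure clusters.
    """
    members = defaultdict(list)
    for cluster_id, epitope in zip(cluster_ids, epitopes):
        members[cluster_id].append(epitope)

    majority_total = 0  # TCRs that match their cluster's majority epitope
    pure_clusters = 0   # clusters whose members all share one epitope
    for labels in members.values():
        majority_count = Counter(labels).most_common(1)[0][1]
        majority_total += majority_count
        if majority_count == len(labels):
            pure_clusters += 1

    purity = majority_total / len(epitopes)
    pure_fraction = pure_clusters / len(members)
    return purity, pure_fraction
```

      The two numbers answer different questions: purity weights large clusters more heavily, while the fraction of pure clusters treats every cluster equally.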

      Reviewer #2 (Public Review):

      In the manuscript, the authors highlighted the importance of T-cell receptor (TCR) analysis and the lack of amino acid embedding methods specific to this domain. The authors proposed a novel bi-directional context-aware amino acid embedding method, catELMo, adapted from ELMo (Embeddings from Language Models), specifically designed for TCR analysis. The model is trained on TCR sequences from seven projects in the ImmunoSEQ database, instead of the generic protein sequences. They assessed the effectiveness of the proposed method in both TCR-epitope binding affinity prediction, a supervised task, and the unsupervised TCR clustering task. The results demonstrate significant performance improvements compared to existing embedding models. The authors also aimed to provide and discuss their observations on embedding model design for TCR analysis: 1) Models specifically trained on TCR sequences have better performance than models trained on general protein sequences for the TCR-related tasks; and 2) The proposed ELMo-based method outperforms TCR embedding models with BERT-based architecture. The authors also provided a comprehensive introduction and investigation of existing amino acid embedding methods. Overall, the paper is well-written and well-organized.

      The work has originality and has potential prospects for immune response analysis and immunotherapy exploration. TCR-epitope pair binding plays a significant role in T cell regulation. Accurate prediction and analysis of TCR sequences are crucial for comprehending the biological foundations of binding mechanisms and advancing immunotherapy approaches. The proposed embedding method presents an efficient context-aware mathematical representation for TCR sequences, enabling the capture and analysis of their structural and functional characteristics. This method serves as a valuable tool for various downstream analyses and is essential for a wide range of applications.

      Thank you.

      Reviewer #3 (Public Review):

      Here, the authors trained catELMo, a new context-aware embedding model for TCRβ CDR3 amino acid sequences for TCR-epitope specificity and clustering tasks. This method benchmarked existing work in protein and TCR language models and investigated the role that model architecture plays in the prediction performance. The major strength of this paper is comprehensively evaluating common model architectures used, which is useful for practitioners in the field. However, some key details were missing to assess whether the benchmarking study is a fair comparison between different architectures. Major comments are as follows:

      • It is not clear why epitope sequences were also embedded using catELMo for the binding prediction task. Because catELMo is trained on TCRβ CDR3 sequences, it's not clear what benefit would come from this embedding. Were the other embedding models under comparison also applied to both the TCR and epitope sequences? It may be a fairer comparison if a single method is used to encode epitope sequence for all models under comparison, so that the performance reflects the quality of the TCR embedding only.

      In our study, we indeed used the same embedding model for both TCRs and epitopes in each prediction model, ensuring a consistent approach throughout.

      Recognizing the importance of evaluating the impact of epitope embeddings, we conducted experiments in which we used BLOSUM62 matrix to embed epitope sequences for all models. The results (Supplementary Table 5) are well aligned with the performance reported in our paper. This suggests that epitope embedding may not play as critical a role as TCR embedding in the prediction tasks. To further validate this point, we conducted two additional experiments.

      Firstly, we used catELMo to embed TCRs while employing randomly initialized embedding matrices with trainable parameters for epitope sequences. It yielded similar prediction performance as when catELMo was used for both TCR and epitope embedding (Supplementary Table 6). Secondly, we utilized BLOSUM62 to embed TCRs but employed catELMo for epitope sequence embedding, resulting in performance comparable to using BLOSUM62 for both TCRs and epitopes (Supplementary Table 4). These experiment results confirmed the limited impact of epitope embedding on downstream performance.

      We conjecture that these results may be attributed to the significant disparity in data scale between TCRs (~290k) and epitopes (less than 1k). Moreover, TCRs tend to exhibit high similarity, whereas epitopes display greater distinctiveness from one another. These features of TCRs require robust embeddings to facilitate effective separation and improve downstream performance, while epitope embedding primarily serves as a categorical encoding.

      We have included a detailed discussion of these findings in the revised manuscript to provide a comprehensive understanding of the role of epitope embeddings in TCR binding prediction.

      • The tSNE visualization in Figure 3 is helpful. It makes sense that the last hidden layer features separate well by binding labels for the better performing models. However, it would be useful to know if positive and negative TCRs for each epitope group also separate well in the original TCR embedding space. In other words, how much separation between these groups is due to the neural network vs just the embedding?

      It is important to note that we used the same downstream prediction model, a simple three-linear-layer network, for all the discussed embedding methods. We believe that the separation observed in the t-SNE visualization effectively reflects the ability of our embedding model. Also, we would like to mention that it can be hard to see a clear distinction between positive and negative TCRs in the original embedding space because embedding models were not trained on positive/negative labels. Please refer to the t-SNE of the original TCR embeddings below.

      Author response image 1.

      • To generate negative samples, the author randomly paired TCRs from healthy subjects to different epitopes. This could produce issues with false negatives if the epitopes used are common. Is there an estimate for how frequently there might be false negatives for those commonly occurring epitopes that most populations might also have been exposed to? Could there be a potential batch effect for the negative sampled TCR that confounds with the performance evaluation?

      Thank you for bringing this valid and interesting point up. Generating negative samples is non-trivial since only a limited number of non-binding TCR-epitope pairs are publicly available and experimentally validating non-binding pairs is costly [1]. Standard practices for generating negative pairs are (1) pairing epitopes with healthy TCRs [2, 3], and (2) randomly shuffling existing TCR-epitope pairs [4,5]. We used both approaches (the former included in the main results, and the latter in the discussion). In both scenarios, catELMo embeddings consistently demonstrated superior performance.
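      The two negative-sampling practices named above can be illustrated with a short Python sketch. It is a simplified illustration under stated assumptions, not the authors' code: the function names, the explicit `seed` argument, and the rejection of known positive pairs during shuffling are all hypothetical choices.

```python
import random


def negatives_from_healthy(epitopes, healthy_tcrs, n, seed):
    """Strategy (1): pair randomly chosen healthy-repertoire TCRs with epitopes."""
    rng = random.Random(seed)
    return [(rng.choice(healthy_tcrs), rng.choice(epitopes)) for _ in range(n)]


def negatives_by_shuffling(positive_pairs, seed, max_tries=100):
    """Strategy (2): re-pair each TCR with an epitope drawn from the positive set.

    Epitopes are drawn from the list of positive pairs, so frequent epitopes
    are drawn more often. A draw is rejected if it recreates a known positive
    pair (an assumption; the source does not say how such collisions are handled).
    """
    rng = random.Random(seed)
    known = set(positive_pairs)
    epitopes = [epitope for _, epitope in positive_pairs]
    negatives = []
    for tcr, _ in positive_pairs:
        for _ in range(max_tries):
            candidate = (tcr, rng.choice(epitopes))
            if candidate not in known:
                negatives.append(candidate)
                break
    return negatives
```

      Calling either function with different `seed` values yields different negative sets, which is one way to check how sensitive downstream performance is to the particular negatives drawn.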

      We acknowledge the possibility of false negatives due to the finite-sized TCR database from which we randomly selected TCRs. However, we believe that the likelihood of such occurrences is low. Given the vast diversity of human TCR clonotypes, which can exceed 10^15 [6], the chance of randomly selecting a TCR that specifically recognizes a target epitope is relatively small.

      In order to investigate the batch effect, we generated new negative pairs using different seeds and observed consistent prediction performance across these variations. However, we agree that there could still be a potential batch effect for the negative samples due to potential data bias.

      We have discussed the limitations of generating negative samples in the revised manuscript.

      • Most of the models being compared were trained on general proteins rather than TCR sequences. This makes their comparison to catELMo questionable since it's not clear if the improvement is due to the training data or architecture. The authors partially addressed this with BERT-based models in section 2.4. This concern would be more fully addressed if the authors also trained the Doc2vec model (Yang et al, Figure 2) on TCR sequences as baseline models instead of using the original models trained on general protein sequences. This would make clear the strength of context-aware embeddings if the performance is worse than catELMo and BERT.

      We agree it is important to distinguish between the effects of training data and architecture on model performance.

      In Section 2.4, as the reviewer mentioned, we compared catELMo with BERT-based models trained on the same TCR repertoire data, demonstrating that architecture plays a significant role in improving performance. Furthermore, in Section 2.5, we compared catELMo-shallow with SeqVec, which share the same architecture but were trained on different data, highlighting the importance of data on the model performance.

      To further address the reviewer's concern, we trained a Doc2Vec model on the TCR sequences that have been used for catELMo training. We observed significantly lower prediction performance compared to catELMo, with an average AUC of 50.24% in TCR split and an average AUC of 51.02% in epitope split, making the strength of context-aware embeddings clear.
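      As context for the figures above, an AUC near 50% is what a random ranking of binding and non-binding pairs would produce. The following Python sketch computes AUC directly from its pairwise-ranking definition; it is a generic illustration with a hypothetical function name, not the evaluation code used in the study, and it is quadratic in the number of samples, so it is only suitable for small inputs.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a positive outscores a negative (ties count half).

    scores: predicted binding scores; labels: 1 for binding, 0 for non-binding.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

      A model that assigns the same score to every pair gets exactly 0.5 under this definition, which is why values such as 50.24% and 51.02% indicate almost no discriminative signal.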

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) It is known that TRB CDR3, the CDR1, CDR2 on TRBV gene and the TCR alpha chain also contribute to epitope recognition, but were not modeled in catELMo. It would be nice for the authors to add this as a current limitation for catELMo in the Discussion section.

      We have discussed the limitation in the revised manuscript.

      “Our study focuses on modeling the TCRβ chain CDR3 region, which is known as the primary determinant of epitope binding. Other regions, such as CDR1 and CDR2 on the TRB V gene, along with the TCRα chain, may also contribute to specificity in antigen recognition. However, a limited number of available samples for those additional features can be a challenge for training embedding models. Future work may explore strategies to incorporate these regions while mitigating the challenges of working with limited samples.”

      (2) I tried to follow the instructions to train a binding affinity prediction model for TCR-epitope pairs; however, it seems that cachetools=5.3.0 could not be found when running "pip install -r requirements.txt" in the conda environment bap. Is this cachetools version only supported from Python 3.7 onward, so that the Python 3.6.13 suggested on the GitHub repo might not work?

      This has been fixed. We have updated the README.md on our GitHub page.

      Reviewer #2 (Recommendations For The Authors):

      The article is well-constructed and well-written, and the analysis is comprehensive.

      The comments for minor issues that I have are as follows:

      (1) In the Methods section, it will be clearer if the authors interpret more on how the standard deviation is calculated in all tables. How to define the '10 trials'? Are they based on different random training and test set splits?

      ‘10 trials' refers to the process of splitting the dataset into training, validation, and testing sets using different seeds for each trial. Different trials have different training, validation, and testing sets. For each trial, we trained a prediction model on its training set and measured performance on its testing set. The standard deviation was calculated from the 10 measurements, estimating model performance variation across different random splits of the data.
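      The trial procedure described above can be sketched as follows. This is a minimal illustration, assuming an 80/10/10 split and the sample standard deviation; neither the split fractions nor the estimator (sample versus population) is given in the response, and the function names are hypothetical.

```python
import random
import statistics


def split_by_seed(items, seed, fractions=(0.8, 0.1, 0.1)):
    """Shuffle items with a trial-specific seed and cut into train/val/test.

    Assumption: 80/10/10 fractions; the actual proportions are not stated.
    """
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_train = round(fractions[0] * len(shuffled))
    n_val = round(fractions[1] * len(shuffled))
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


def summarize(scores):
    """Mean and sample standard deviation of per-trial test scores."""
    return statistics.mean(scores), statistics.stdev(scores)
```

      In use, one would call `split_by_seed(data, seed)` for ten different seeds, train and score a model on each split, and pass the ten test scores to `summarize`.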

      (2) The format of AUCs and the improvement of AUCs need to be consistent, i.e., with the percent sign.

      We have updated the format of AUCs.

      Reviewer #3 (Recommendations For The Authors):

      In addition to the recommendations in the public review, we had the following more minor questions and recommendations:

      • Could you provide some more background on the data, such as overlaps between the databases, and how the training and validation split was performed between the three databases? Also summary statistics on the length of TCR and epitope sequence data would be helpful.

      We have provided more details about data in our revision.

      • Could you comment on the runtime to train and embed using the catELMo and BERT models?

      Our training data consist of TCR sequences with relatively short lengths (averaging less than 20 amino acid residues). This characteristic significantly reduces the computational resources required compared to training large-scale language models on extensive text corpora. Leveraging standard machines equipped with two GeForce RTX 2080 GPUs, we were able to complete the training tasks within a matter of days. After training, embedding one sequence can be accomplished in a matter of seconds.

      • Typos and wording:

      • Table 1 first row of "source": "immunoSEQ" instead of "immuneSEQ"

      This has been corrected.

      • L23 of abstract "negates the need of complex deep neural network architecture" is a little confusing because ELMo itself is a deep neural network architecture. Perhaps be more specific and add that the need is for downstream tasks.

      We have made it more specific in our abstract.

      “...negates the need for complex deep neural network architecture in downstream tasks.”

      References

      (1) Montemurro, Alessandro, et al. "NetTCR-2.0 enables accurate prediction of TCR-peptide binding by using paired TCRα and β sequence data." Communications biology 4.1 (2021): 1060.

      (2) Jurtz, Vanessa Isabell, et al. "NetTCR: sequence-based prediction of TCR binding to peptide-MHC complexes using convolutional neural networks." BioRxiv (2018): 433706.

      (3) Gielis, Sofie, et al. "Detection of enriched T cell epitope specificity in full T cell receptor sequence repertoires." Frontiers in immunology 10 (2019): 2820.

      (4) Cai, Michael, et al. "ATM-TCR: TCR-epitope binding affinity prediction using a multi-head self-attention model." Frontiers in Immunology 13 (2022): 893247.

      (5) Weber, Anna, et al. "TITAN: T-cell receptor specificity prediction with bimodal attention networks." Bioinformatics 37 (2021): i237-i244.

      (6) Lythe, Grant, et al. "How many TCR clonotypes does a body maintain?." Journal of theoretical biology 389 (2016): 214-224.

    1. Author response:

      eLife assessment

      This is an important study describing a neuromuscular junction co-culture system using human cells that the authors use to study the synaptic consequences of ALS mutations. The data supporting the system are solid and show the value of using myotubes and motor neurons from the same donor. The study will be of interest to researchers who model neuromuscular junction disorders, however, the authors could more comprehensively compare and contrast their system with previous literature describing other similar models. There are also technical weaknesses that limit the interpretation of specific findings.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors propose an improved neuro-muscle co-culture system to study ALS-related functional differences in human pluripotent stem cell lines.

      Strengths:

      A simple co-culture system with functional readout.

      We appreciate the recognition that this is a simplified co-culture system with a straight-forward functional evaluation.

      Weaknesses:

      There are concerns about the lack of novelty, rigor, and clarity in the approach. The strength of the study is undermined by its reliance on transcription factors used more than a decade ago, low myocyte activity, and inadequate validation methods, such as the lack of single-cell transcriptome analysis and detailed neuromuscular synapse characterization. The evidence presented requires substantial validation through rigorous experimental approaches and resolution of the identified concerns for the study's findings to be considered significant and reliable.

      The muscle differentiation protocol used in our work is an adaptation of the protocol described in Albini S, et al. Cell Rep. 2013. This protocol was selected for its efficiency in differentiating skeletal muscles from pluripotent stem cells (PSCs). We modified the plasmids (MYOD and BAF60C) used in the original publication, for example by including the selection genes puromycin and blasticidin, to improve efficiency. Moreover, a criticism of the previously used overexpression system, especially overexpression of MYOD, is that it introduces artificial expression of this gene throughout muscle differentiation, when it is only supposed to be expressed early in myogenesis. Thus, the constructs used in our work are dox inducible, which enables us to control the expression of MYOD and restrict it to the first 48 hours. This protocol resulted in a highly efficient skeletal muscle differentiation, as noted in our manuscript. “The PSC-derived skeletal muscles were characterized by the presence of Desmin (DES) and Myosin Heavy Chain (MHC), and as early as day 8 of differentiation nearly 100% of the cells co-expressed these markers.” We agree with the reviewer that the myocyte activity identified in our work is lower compared to Albini et al. (2013), mostly explained by the modification we made to the method, from a 3D to a 2D culture. In Albini et al. (2013) the electrophysiological properties were assayed in skeletal myospheres (3D), which are known to improve contractility measurements. Conversely, in 2D cultures the cells detach from the plate when contractility intensifies. Thus, tight regulation of cell concentration is required for optimal maturation and formation of a contractile skeletal muscle culture without premature detachment of the cells.

      We believe that single-cell or single-nuclei transcriptome analysis from the co-culture setting of two well-defined cell types might yield little value for method characterization. However, as part of a follow-up study, we are performing morphological NMJ characterization and applying single-nuclei transcriptome analysis in the fALS disease context to identify specific molecular mechanisms that result in synaptic dysfunction.

      Reviewer #2 (Public Review):

      The manuscript by Chen et al from the group of Helen Miranda aims to describe an improved neuromuscular junction (NMJ) model to study synaptic dysfunction in several cases of familial ALS. Overall, the system described in the paper appears as a valid platform to study disease phenotypes with exciting results showing specific effects of GDNF on non-SOD1 ALS patient lines. The strength of the paper lies in the use of myotubes, and motor neurons derived from the same donor. However, the current study: (1) lacks a clear comparison of the current system with numerous previously described systems; (2) is limited by the number of repeat experiments in the study and (3) has no description of the synaptic phenotype observed in the study. These major points are discussed in more detail below.

      We appreciate the recognition that “the system described in the paper appears as a valid platform to study disease phenotypes with exciting results showing specific effects of GDNF on non-SOD1 ALS patient lines” and the careful evaluation of our work. We plan to address the points raised by this reviewer in the revision.

      Major points:

      (1) In the introduction the authors state (p. 4): "Finally, recent human NMJ models have been established from PSCs by differentiating these cells into both skeletal muscles and motor neurons in 2D and 3D formats. These previous systems present a remarkable advancement to the studies of human NMJs, however, they require long NMJ formation and maturation time (40 to 60 days), which restricts their sensitivity and scalability [42]"

      In fact, a number of studies have described various in-vitro NMJ systems, with the same timeframes for NMJ formation. For example, in studies by Osaki et al, 2018, Sci Adv; Bellmann et al, 2019, Biomat; Demestre et al, 2015, Stem Cell Res; Badu-Mensah et al, 2022, Biomat (this is just an exemplar selection of the papers); NMJ formation was observed as early as 14 d in culture, in line with or at least slightly longer than reported by Chen et al. With the exception of the study by Osaki et al, all co-culture systems cited above are 2D-based. The authors need to expand on this further or provide a quantitative assessment of why their system is better compared to previously published models.

      Indeed, there are previous publications that have described neuromuscular junctions (NMJs) in co-cultures of iPSC-derived skeletal muscles and motor neurons. Some of the publications mentioned above did show NMJ formation within ~20 days, albeit with several caveats such as culture heterogeneity, e.g. 50% motor neuron differentiation efficiency. We agree with the reviewer that this needs to be expanded and clarified, and we will address this concern in the revision.

      (2) Further, when comparing their results with other work it is hard to claim how the current system is (p. 5) "more reproducible, and offers a 6-fold increase in scalability compared to previous models [40-43]".

      The authors need to expand on this further.

      This is an important aspect of this work, and we believe that our protocol offers a higher reproducibility due to, at least partially, the homogeneity of the starting cultures of iPSC-derived skeletal muscles and iPSC-derived motor neurons, and that the direct 2D co-culture approach is more suitable for miniaturization compared to 3D cultures or microfluidic chamber devices. Thus, we will expand on this idea in the revision.

      (3) Although mentioned, there were no examples of the modularity of the system, which of course would strengthen the paper and help to uncover ALS mechanisms of synaptic formation, for example by combining WT myotubes and fALS motor neurons (see point 4 below). The authors should show how they would adapt to 96 well plate format to showcase the scalability of the system. Based on their data on the efficacy of synaptic formation (60 per 0.7 cm2 area), is further miniaturization allowed?

      We appreciate the points raised by the reviewer. The “mix-and-match” approach to co-culture wild-type and affected iPSC-derived skeletal muscles with iPSC-derived motor neurons is a main focus of our lab and an advantage to protocols like ours, where cells are differentiated independently and later co-cultured together; however, a comprehensive characterization of various mix-match combinations is beyond the scope of this Tools and Resources article. Since the initial submission of this manuscript, we have extensively optimized the scalability of the co-cultures from the initial 0.7 cm2 to 0.32 cm2 (96-well plates). Further miniaturization is also being optimized to 0.136 cm2 (384-well plates). This point will be clarified in the revision.

      (4) A lot of α-bungarotoxin staining corresponds to AChR clusters that do not seem to be associated with muscle and do not form normal rings of clustering (pretzel-like) associated with the NMJ in vivo. This is seen clearly in Figure 3B and Figure 5B. Figures 3B and 5B only show low-magnification images which makes it difficult to assess the specificity of localization of the pre-/post-synaptic markers. The authors should clearly show the morphologies of the NMJs formed in WT and fALS lines at high magnification. In addition, the authors should show co-localization images for α-bungarotoxin and myosin-heavy chains to confirm the localization of the bungarotoxin signal on the myotubes.

      In addition to that, the authors report that the number of functional synapses formed on a plate varies from 30 (fALS) to 60 (Ctrl) per 10,000 neurons spread over the 0.7 cm2 area (0.6%). How do the authors explain an extensive loss of α-bungarotoxin signal in Figure 5B, the majority of which likely corresponds to AChR clusters that are formed outside of neuronal connections? Such clustering can usually be observed in immature co-cultures and in vivo prior to the innervation of myotubes. One explanation could be that myotubes derived from fALS PSCs are less capable of synaptic formation. Notably, a study of PSC-derived myotubes and motor neurons from PSC lines with various SOD1 mutations has already been published, but is not cited by Chen et al (Badu-Mensah et al). Given the importance of those confounding factors, the authors should test cell-intrinsic (motor neuron-related) vs non-cell-intrinsic mechanisms by co-culturing healthy myotubes with fALS-derived motor neurons followed by NMJ quantification.

      The iPSC-derived skeletal muscle cultures were plated as a monolayer. Even though the α-bungarotoxin staining does not show NMJs with the pretzel-like shape, as is also the case in other in vitro NMJ protocols (Badu-Mensah et al, Biomat 2023; Pereira et al., Nat Commun 2021; Uzel et al., Sci Adv 2016), α-bungarotoxin does show association with the muscles. For quantification purposes we omitted the MHC staining to decrease background; however, we will include it in the revision in response to the reviewer’s concern.

      We agree with the reviewer that the suggested approaches would yield insight into disease mechanism, but they are beyond the scope of this method development study. In fact, we are very excited about our follow-up study pursuing a more in-depth analysis of cell-autonomous vs non-cell-autonomous pathogenesis to understand the NMJ dysfunction in fALS. We apologize that the “Badu-Mensah et al” work was not included; this was our oversight, and the citation will be added in the revision.

      (5) The authors present the advantage of optogenetic stimulation, but they only show the proof-of-principle and never really apply it to their studies. Specifically, with regard to Figure 6, are motor units derived from fALS PSCs incapable of being optogenetically activated to the same extent as control motor units? Does the dysfunction stem from fALS motor neurons or fALS myotubes?

      We agree that these are important questions to be addressed and are actively pursuing these experiments as part of the natural follow-up investigation from the present Tools and Resources article.

      (6) Figures 6B and 6C appear to be identical except for the addition of the GDNF effect on the fALS lines. This should all be put in one figure. The authors should also show whether GDNF-induced functional recovery is associated with recovery in the number of motor units or merely with synaptic function by quantifying the NMJ number in the presence of GDNF.

      We will combine Figures 6B and 6C in the revision. Our follow-up study also includes the interrogation of the mechanism through which GDNF rescues fALS NMJ dysfunction.

      (7) Figure 5 and Figure 6. The authors only use one line per fALS mutation and their corresponding isogenic controls. They state that the n=6 for these experiments represents the technical replication of the experiment. These experiments should be performed at least n=3 times starting from neuronal differentiation, rather than by seeding replicate wells, so that each repeat represents a true replication of the experiment. This would significantly strengthen their argument that their method is robust and the results are easily reproducible.

      We will clarify that the technical replicates originated from independent differentiations in the revision.

      (8) In the discussion, the authors may want to mention that the lack of effect of GDNF on the SOD1 lines may relate to the fact that SOD1 mutations do not lead to TDP43 pathology. Although speculative, this suggests that in cases with TDP43 mutations (their data) or sporadic disease, GDNF may be effective.

      We appreciate this suggestion and will highlight this as a possible inclusion criterion for GDNF treatment in the discussion of our revised version of the manuscript.

      (9) Although beyond the scope of this paper, it would of course be interesting to see if sporadic forms of ALS had this same phenotype.

      We agree with the reviewer, and we hope to include iPSC-derived NMJs from sporadic ALS patients in a future study.

    2. eLife assessment

      This is an important study describing a neuromuscular junction co-culture system using human cells that the authors use to study the synaptic consequences of ALS mutations. The data supporting the system are solid and show the value of using myotubes and motor neurons from the same donor. The study will be of interest to researchers who model neuromuscular junction disorders; however, the authors could more comprehensively compare and contrast their system with previous literature describing other similar models. There are also technical weaknesses that limit the interpretation of specific findings.

    3. Reviewer #1 (Public Review):

      Summary:

      The authors propose an improved neuro-muscle co-culture system to study ALS-related functional differences in human pluripotent stem cell lines.

      Strengths:

      A simple co-culture system with functional readout.

      Weaknesses:

      There are concerns about the lack of novelty, rigor, and clarity in the approach. The strength of the study is undermined by its reliance on transcription factors used more than a decade ago, low myocyte activity, and inadequate validation methods, such as the lack of single-cell transcriptome analysis and detailed neuromuscular synapse characterization. The evidence presented requires substantial validation through rigorous experimental approaches and resolution of the identified concerns for the study's findings to be considered significant and reliable.

    4. Reviewer #2 (Public Review):

      The manuscript by Chen et al from the group of Helen Miranda aims to describe an improved neuromuscular junction (NMJ) model to study synaptic dysfunction in several cases of familial ALS. Overall, the system described in the paper appears as a valid platform to study disease phenotypes with exciting results showing specific effects of GDNF on non-SOD1 ALS patient lines. The strength of the paper lies in the use of myotubes and motor neurons derived from the same donor. However, the current study: (1) lacks a clear comparison of the current system with numerous previously described systems; (2) is limited by the number of repeat experiments in the study; and (3) has no description of the synaptic phenotype observed in the study. These major points are discussed in more detail below.

      Major points:

      (1) In the introduction the authors state (p. 4): "Finally, recent human NMJ models have been established from PSCs by differentiating these cells into both skeletal muscles and motor neurons in 2D and 3D formats. These previous systems present a remarkable advancement to the studies of human NMJs, however, they require long NMJ formation and maturation time (40 to 60 days), which, restricts their sensitivity and scalability [42]"

      In fact, a number of studies have described various in-vitro NMJ systems, with the same timeframes for NMJ formation. For example, in studies by Osaki et al, 2018, Sci Adv; Bellmann et al, 2019, Biomat; Demestre et al, 2015, Stem Cell Res; Badu-Mensah et al, 2022, Biomat (this is just an exemplar selection of the papers); NMJ formation was observed as early as 14 d in culture, in line with or at least slightly longer than reported by Chen et al. With the exception of the study by Osaki et al, all co-culture systems cited above are 2D-based. The authors need to expand on this further or provide a quantitative assessment of why their system is better compared to previously published models.

      (2) Further, when comparing their results with other work it is hard to claim how the current system is (p. 5) "more reproducible, and offers a 6-fold increase in scalability compared to previous models [40-43]". The authors need to expand on this further.

      (3) Although mentioned, there were no examples of the modularity of the system, which of course would strengthen the paper and help to uncover ALS mechanisms of synaptic formation, for example by combining WT myotubes and fALS motor neurons (see point 4 below). The authors should show how they would adapt the system to a 96-well plate format to showcase its scalability. Based on their data on the efficacy of synaptic formation (60 per 0.7 cm² area), is further miniaturization feasible?

      (4) A lot of α-bungarotoxin staining corresponds to AChR clusters that do not seem to be associated with muscle and do not form the normal pretzel-like rings of clustering associated with the NMJ in vivo. This is seen clearly in Figure 3B and Figure 5B. Figures 3B and 5B only show low-magnification images, which makes it difficult to assess the specificity of localization of the pre-/post-synaptic markers. The authors should clearly show the morphologies of the NMJs formed in WT and fALS lines at high magnification. In addition, the authors should show co-localization images for α-bungarotoxin and myosin heavy chain to confirm the localization of the bungarotoxin signal on the myotubes.

      In addition to that, the authors report that the number of functional synapses formed on a plate varies from 30 (fALS) to 60 (Ctrl) per 10,000 neurons spread over the 0.7 cm² area (0.6%). How do the authors explain the extensive loss of α-bungarotoxin signal in Figure 5B, the majority of which likely corresponds to AChR clusters that are formed outside of neuronal connections? Such clustering can usually be observed in immature co-cultures and in vivo prior to the innervation of myotubes. One explanation could be that myotubes derived from fALS PSCs are less capable of synaptic formation. Notably, a study of PSC-derived myotubes and motor neurons from PSC lines with various SOD1 mutations has already been published, but is not cited by Chen et al (Badu-Mensah et al). Given the importance of those confounding factors, the authors should test cell-intrinsic (motor neuron-related) vs non-cell-intrinsic mechanisms by co-culturing healthy myotubes with fALS-derived motor neurons followed by NMJ quantification.

      (5) The authors present the advantage of optogenetic stimulation, but they only show the proof-of-principle and never really apply it to their studies. Specifically, with regard to Figure 6, are motor units derived from fALS PSCs incapable of being optogenetically activated to the same extent as control motor units? Does the dysfunction stem from fALS motor neurons or fALS myotubes?

      (6) Figures 6B and 6C appear to be identical except for the addition of the GDNF effect on the fALS lines. This should all be put in one figure. The authors should also show whether GDNF-induced functional recovery is associated with recovery in the number of motor units or merely with synaptic function by quantifying the NMJ number in the presence of GDNF.

      (7) Figure 5 and Figure 6. The authors only use one line per fALS mutation and their corresponding isogenic controls. They state that the n=6 for these experiments represents the technical replication of the experiment. These experiments should be performed at least n=3 times starting from neuronal differentiation, rather than by seeding replicate wells, so that each repeat represents a true replication of the experiment. This would significantly strengthen their argument that their method is robust and the results are easily reproducible.

      (8) In the discussion, the authors may want to mention that the lack of effect of GDNF on the SOD1 lines may relate to the fact that SOD1 mutations do not lead to TDP43 pathology. Although speculative, this suggests that in cases with TDP43 mutations (their data) or sporadic disease, GDNF may be effective.

      (9) Although beyond the scope of this paper, it would of course be interesting to see if sporadic forms of ALS had this same phenotype.

    1. eLife assessment

      This manuscript reports important findings regarding the development, anatomical placement and synaptic connectivity of a subtype of V1 spinal inhibitory interneurons. Using a combination of techniques, the authors show convincingly how V1 interneurons in the spinal cord, specifically those expressing the transcription factor Foxp2, differ in their birthdates, synaptic connectivity to motor neurons and their postnatal location. The study is an important addition to the literature on spinal cord interneurons and opens avenues for their functional assessment.

    2. Reviewer #1 (Public Review):

      To understand the spinal locomotor circuits, we need to reveal how various types of spinal interneurons work in the circuits. So far, the general roles of the cardinal groups of spinal interneurons (dI6, V0, V1, V2a, V2b, and V3) involved in locomotion have been roughly established but not fully understood. Each group is believed to contain some clades with more detailed functional differences. However, each character and function of these clades has not yet been elucidated.

      In this study, Worthy et al. investigated clades of V1 neurons, which are one of the main groups of inhibitory neurons in the spinal cord. Previous reports proposed four clades (Renshaw cells, FoxP2, Sp8, and Pou6f2) in V1 neurons defined by the expression of transcription factors. For V1 neurons in each of the four clades, the authors investigated the birth time and showed the postnatal location in the spinal cord according to the birth time. They found that FoxP2-V1 neurons are located near LMC motor neurons that project to the limb. Using genetically labeled Foxp2-V1 mice, they showed that most of the synapses of V1 neurons on the cell bodies of motor neurons were from Foxp2-V1 and Renshaw cells. Furthermore, a higher proportion of Foxp2-V1 synapses is observed on LMC motor neurons than on axial motor neurons. They proposed that Foxp2-V1, which represents 60% of V1, can be further classified according to the expression of transcription factors Otp and Foxp4.

      These results will be helpful for future analyses of the development and function of V1 neurons. In particular, the discovery of strong synaptic connections between Foxp2-V1 and LMC motor neurons will be beneficial in analyzing the role of V1 neurons in motor circuits that generate movement of the limbs.

      The conclusions of this paper are well supported by the data obtained using widely used methods. However, for some analyses, the specificity of labeling V1 clades should be clearly described.

      (1) In Figure 1, the MafB antibody (Sigma) was used to identify Renshaw cells at P5. However, according to the supplementary Figure 3D, the specificity of the MafB antibody (Sigma) is relatively low. The image of MafB-GFP, V1-INs, and MafB-IR at P5 should be added to the supplementary figure. The specificity of MafB-IR-Sigma in V1 neurons at P5 should be shown. This image also might support the description of the genetically labeled MafB-V1 distribution at P5 (page 8, lines 28-32).

      (2) The proportion of genetically labeled FoxP2-V1 in all V1 is more than 60%, although immunolabeled FoxP2-V1 is approximately 30% at P5. Genetically labeled Otp-V1 included other non-FoxP2 V1 clades (Fig. 8L-M). I wonder whether genetically labeled FoxP2-V1 might include the other three clades. The authors should show whether genetically labeled FoxP2-V1 expresses other clade markers, such as Pou6f2, Sp8, and calbindin, at P5.

    3. Reviewer #2 (Public Review):

      Summary:

      This work brings important information regarding the composition of interneurons in the mammalian spinal cord, with a developmental perspective. Indeed, for the past decades, tools inspired by developmental biology have opened up promising avenues for challenging the functional heterogeneity in the spinal cord. They rely on the fact that neurons sharing similar mature properties also share a largely similar history of expression of specific transcription factor (TF) genes during embryonic and postnatal development. For instance, neurons originating from p1 progenitors and expressing the TF Engrailed-1 form the V1 neuronal class. While such "cardinal" neuronal classes defined by one single TF indeed share numerous features - e.g., for the case of V1 neurons, a ventral positioning, an inhibitory nature and ipsilateral projections - there is accumulating evidence for a finer-grained diversity and specialization in each class which is still largely obscure. The present work studies the heterogeneity of V1 interneurons and describes multiple classes based on their birthdate, final positioning, and expression of additional TFs. In particular, it brings a solid characterization of the Foxp2-expressing V1 interneurons, for which the authors also delve into the connectivity and, hence, possible functional implications. The work will be of interest to developmental biologists and those interested in the organization of the locomotor spinal network.

      Strengths:

      This study has deeply analyzed the diversity of V1 neurons by intersecting multiple criteria: TF expression, birthdate, location in the spinal cord, diversity along the rostro-caudal axis, and for some subsets, connectivity. This illustrates and exemplifies the absolute need to not consider cardinal classes, defined by one single TF, as homogeneous. Rather, it highlights the limits of single-TF classification, and exemplifies the existence of further diversity within each cardinal class.

      Experiments are generally well performed with a satisfactory number of animals and adequate statistical tests.

      The authors have also paid strong attention to potential differences in cell-type classification when distinguishing neurons currently expressing a given TF (e.g., using antibodies) from those defined as having once expressed that TF (e.g., defined by a lineage-tracing strategy). This ambiguity is a frequent source of discrepancies in findings across studies.

      Furthermore, there is a risk in developmental studies to overlook the fact that the spinal cord is functionally specialized rostro-caudally, and to generalize features that may only be applicable to a specific segment and hence to a specific motor pool. While motoneurons share the same dorso-ventral origin and appear homogenous on a ChAT staining, specific clusters are dedicated to specific muscle groups, e.g., axial, hypaxial or limb muscles. Here, the authors make the important distinction between different lumbar levels and detail the location and connectivity of their neurons of interest with respect to specific clusters of MN.

      Finally, the authors are fully transparent on inter-animal variability in their representation and quantification. This is crucial to avoid overgeneralizing findings and to instead provide a nuanced understanding of the complexities of spinal circuits.

      Weaknesses:

      The current version of the paper is VERY hard to read. It is often extremely difficult to "see the forest for the trees" and the reader is often drowned in methodological details that provide only minor additions to the scientific message. Non-specialists in developmental biology who are still interested in spinal cord organization, especially students, might find this article challenging to digest, and there is a high risk that they will be inclined to abandon reading it. The diversity of developmental stages studied (with possible mistakes between text and figures) adds substantial complexity to the reading. It is also not clear at all why the authors choose to focus on the Foxp2 V1 from page 9. Naively, the Pou6f2 might have been equally interesting. Finally, numerous discrepancies in the referencing of figures must also be fixed. I strongly recommend an in-depth streamlining and proofreading, and possibly moving some material to supplement (e.g. page 8, and elsewhere).

      Second, although the different V1 populations have been investigated in detail regarding their development and positioning, their function is not directly investigated through gain- or loss-of-function experiments. For the Foxp2-V1, the developmental and anatomical mapping is complemented by a connectivity mapping (Fig 6s, 8), but the latter is fairly superficial compared to the former. Synapses (Fig 6) are counted on a relatively small number of motoneurons per animal, which may or may not be representative of the population. Likewise, putative synaptic inputs are only counted on neuronal somata. Motoneurons that lack axono-somatic contacts may still be contacted distally. Hence, while these data are still suggestive of differences between V1 pools, they are only weakly predictive of function.

      Third, I suggest treating the rabies labelling (Figure 8) with caution. It is known that this type of rabies vector, when delivered from the periphery, might also label sensory afferents and their post-synaptic targets in the cord through anterograde transport and transneuronal spread (e.g., Pimpinella et al., 2022). Yet I am not sure the authors have performed all the controls needed to exclude that labelled neurons, presumed here to be premotoneurons, could rather be anterogradely labelled from sensory afferents.

      Fourth, the ambition to differentiate neuronal birthdate at a half-day resolution (e.g., E10 vs E10.5) is interesting but must be considered with caution. As the authors explain in their methods, animals are caged at 7 pm, and the plug is checked the next morning at 7 am. There is hence a potential error of 12h.

    4. Reviewer #3 (Public Review):

      Building on their previous work that defined four major subgroups, or clades, of V1 interneurons largely by their transcriptional signatures, the authors perform a meticulous and comprehensive analysis of the birth timing of V1 interneurons by clade, and even by intra-clade subtype. This analysis establishes new relationships between the molecular identity, settling position, and birth time with extraordinary precision.

      These relationships are then explored through the lens of synaptic connectivity. Focusing on the FoxP2 clade, they show tight spatial correspondence between V1 and motor neuron position, and through detailed synaptic analysis, find that the FoxP2 V1 clade, as compared to Renshaw cells and other V1s, is the major contributor to V1-to-limb motor neuron connectivity. Finally, by analyzing sensory-to-V1 connectivity too, they show that the FoxP2 clade exhibits Ia-reciprocal interneuron-like convergence of proprioceptive and Renshaw cell synapses.

      Taking the development and connectivity analysis together, their work substantially advances our understanding of spinal interneurons and yields fundamental basic information about how cell type heterogeneity corresponds across developmental, molecular and anatomical features.

      An additional strength of this study is that they generate new genetic tools for labeling interneuron subpopulations, and provide insider knowledge into antibody, genetic and viral labeling that often gets swept under the rug, providing a very useful resource for further studies.

      My only criticism is that some of the main messages of the paper are buried in technical details. Better separation of the main conclusions of the paper, which should be kept in the main figures and text, and technical details/experimental nuances, which are essential but should be moved to the supplement, is critical. This will also correct the other issue with the text at present, which is that it is too long.

    1. Reviewer #2 (Public Review):

      Summary:

      The authors have developed a novel bimanual task that allows them to study how the sensorimotor control system deals with redundancy within our body. Specifically, the two hands control two robot handles that control the position and orientation of a virtual stick, where the end of the stick is moved into a target. This task has infinite solutions to any movement, where the two hands influence both tip-movement direction and stick-tilt angle. When moving to different targets in the baseline phase, participants change the tilt angle of the stick in a specific pattern that produces close to the minimum movement of the two hands needed to perform the task. In a series of experiments, the authors then apply perturbations to the stick angle and stick movement direction to examine how either tip-movement (task-relevant) or stick-angle (task-irrelevant) perturbations affect adaptation. Both types of perturbations affect adaptation, but this adaptation follows the baseline pattern of tip-movement and stick angle relation such that even task-irrelevant perturbations drive adaptation in a manner that results in task-relevant errors. Overall, the authors suggest that these baseline relations affect how we adapt to changes in our tasks. This work provides an important demonstration that underlying solutions/relations can affect the manner in which we adapt. I think one major contribution of this work will also be the task itself, which provides a very fruitful and important framework for studying more complex motor control tasks.

      Strengths:

      Overall, I find this a very interesting and well-written paper. Beyond providing a new motor task that could be influential in the field, I think it also contributes to studying a very important question - how we can solve redundancy in the sensorimotor control system, as there are many possible mechanisms or methods that could be used - each of which produces different solutions and might affect the manner in which we adapt.

      Weaknesses:

      I would like to see further discussion of what the particular chosen solution implies in terms of optimality.

      The underlying baseline strategy used by the participants appears to match the path of minimum movement of the two hands. This suggests that participants are simultaneously optimizing accuracy and minimizing some metabolic cost or effort to solve the redundancy problem. However, once the perturbations are applied, participants still use this strategy for driving adaptation. I assume that this means that the solution that participants end up with after adaptation actually produces larger movements of the two hands than required. That is - they no longer fall onto the minimum hand movement strategy - which was used to solve the problem. Can the authors clearly demonstrate whether or not this is the case? These two possibilities produce very different implications in terms of the results.

      If my interpretation is correct, such a result (using a previously found solution that no longer is optimal) reminds me of the work of Selinger et al., 2015 (Current Biology), where participants continue to walk at a non-optimal speed after perturbations unless they get trained on multiple conditions to learn the new landscape of solutions. Perhaps the authors could discuss their work within this kind of interpretation. Do the authors predict that this relation would change with extensive practice either within the current conditions or with further exploration of the new task landscape? For example, if more than one target was used in the adaptation phase of the experiment?

      On the other hand, if the adaptation follows the solution of minimum hand movement and therefore potentially effort, this provides a completely different interpretation.

      Overall, I would find the results even more compelling if the same perturbations were applied to movements to all of the targets and produced similar adaptation profiles. The question is to what degree the results derive from only providing a small subset of the environment to explore.

    2. Reviewer #3 (Public Review):

      Summary:

      This study explored how the motor system adapts to new environments by modifying redundant body movements. Using a novel bimanual stick manipulation task, participants manipulated a virtual stick to reach targets, focusing on how tip-movement direction perturbations affected both tip movement and stick-tilt adaptation. The findings indicated a consistent strategy among participants who flexibly adjusted the tilt angle of the stick in response to errors. The adaptation patterns are influenced by physical space relationships, guiding the motor system's choice of movement patterns. Overall, this study highlights the adaptability of the motor system through changes in redundant body movement patterns.

      Strengths:

      This paper introduces a novel bimanual stick manipulation task to investigate how the motor system adapts to novel environments by altering the movement patterns of our redundant body.

      Weaknesses:

      The generalizability of the findings is quite limited. It would have been interesting to see if the same relationships held for different stick lengths (i.e., the hands positioned at different start locations along the virtual stick) or when reaching targets to the left and right of a start position, not just at varying angles along one side. Alternatively, this study would have benefited from a more thorough investigation of the existing literature on redundant systems instead of primarily focusing on the lack of redundancy in endpoint-reaching tasks. Although the novel task expands the use of endpoint robots in motor control studies, the utility of this task for exploring motor control and learning may be limited.

    3. eLife assessment

      This study presents a valuable finding on how the sensorimotor control system deals with redundancy within our body, based on a novel bimanual task. The evidence supporting the authors' claims is convincing, as demonstrated over four different experiments. The work will be of interest to researchers from the motor control community and related fields, and further investigation into the interpretation of the findings could increase the generalisation of the study to a broader audience.

    4. Reviewer #1 (Public Review):

      Summary/Strengths:

      This manuscript describes a stimulating contribution to the field of human motor control. The complexity of control and learning is studied with a new task offering a myriad of possible coordination patterns. Findings are original and exemplify how baseline relationships determine learning.

      Weaknesses:

      A new task is presented: it is a thoughtful one, but because it is a new one, the manuscript section is filled with relatively new terms and acronyms that are not necessarily easy to rapidly understand.

      First, some more thoughts may be devoted to the take-home message. In the title, I am not sure manipulating a stick with both hands is a key piece of information. Also, the authors appear to insist on the term 'implicit', and I wonder if it is a big deal in this manuscript and if all the necessary evidence appears in this study that control and adaptation are exclusively implicit. As there is no clear comparison between gradual and abrupt sessions, the authors may consider removing at least from the title and abstract the words 'implicit' and 'implicitly'. Most importantly, the authors may consider modifying the last sentence of the abstract to clearly provide the most substantial theoretical advance from this study.

      It seems that a substantial finding is the 'constraint' imposed by baseline control laws on sensorimotor adaptation. This seems to echo and extend previous work of Wu, Smith et al. (Nat Neurosci, 2014): their findings, which were not necessarily always replicated, suggested that the more participants were variable in baseline, the better they adapted to a systematic perturbation. The authors may study whether residual errors are smaller or adaptation is faster for individuals with larger motor variability in baseline. Unfortunately, the authors do not present the classic time course of sensorimotor adaptation in any experiment. The adaptation is not described as typically done: the authors should thus show the changes in tip movement direction and stick-tilt angle across trials, and highlight any significant difference between baseline, early adaptation, and late adaptation, for instance. I also wonder why the authors did not include a few no-perturbation trials after the exposure phase to study after-effects in the study design: it looks like a missed opportunity here. Overall, I think that showing the time course of adaptation is necessary for the present study to provide a more comprehensive understanding of that new task, and to re-explore the role of motor variability during baseline for sensorimotor adaptation.

      The distance between hands was fixed at 15 cm with the Kinarm instead of a mechanical constraint. I wonder how much this distance varied and more importantly whether from that analysis or a force analysis, the authors could determine whether one hand led the other one in the adaptation.

      I understand the distinction between task- and end-effector irrelevant perturbation, and at the same time results show that the nervous system reacts to both types of perturbation, indicating that they both seem relevant or important. In line 32, the errors mentioned at the end of the sentence suggest that adaptation is in fact maladaptive. I think the authors may extend the Discussion on why adaptation was found in the experiments with end-effector irrelevant and especially how an internal (forward) model or a pair of internal (forward) models may be used to predict both the visual and the somatosensory consequences of the motor commands.

    1. eLife assessment

      This important work advances our understanding of how memories interact to facilitate or interfere with one another, also informing our understanding of how humans build knowledge. The study provides convincing evidence that semantic relatedness proactively benefits memory using clean experimental design, rigorous statistics, large N samples, and well-characterized stimuli. The study also demonstrates the boundaries of these proactive benefits, showing that when studied items have weaker semantic relationships, proactive interference may be observed. This research will be of interest to memory researchers as well as cognitive psychologists, neuroscientists, and educators more broadly.

    2. Reviewer #1 (Public Review):

      Summary:

      Bennion and colleagues present a careful examination of how an earlier set of memories can either interfere with or facilitate memories formed later. This impressive work is a companion piece to an earlier paper by Antony and colleagues (2022) in which a similar experimental design was used to examine how a later set of memories can either interfere with or facilitate memories formed earlier. This study makes contact with an experimental literature spanning 100 years, which is concerned with the nature of forgetting, and the ways in which memories for particular experiences can interact with other memories. These ideas are fundamental to modern theories of human memory; for example, paired-associate studies like this one are central to the theoretical idea that interference between memories is a much bigger contributor to forgetting than any sort of passive decay.

      Strengths:

      At the heart of the current investigation is a proposal made by Osgood in the 1940s regarding how paired associates are learned and remembered. In these experiments, one learns a pair of items, A-B (cue-target), and then later learns another pair that is related in some way, either A'-B (changing the cue, delta-cue), or A-B' (changing the target, delta-target), or A'-B' (changing both, delta-both), where the prime indicates that item has been modified, and may be semantically related to the original item. The authors refer to the critical to-be-remembered pairs as base pairs. Osgood proposed that when the changed item is very different from the original item there will be interference, and when the changed item is similar to the original item there will be facilitation. Osgood proposed a graphical depiction of his theory in which performance was summarized as a surface, with one axis indicating changes to the cue item of a pair and the other indicating changes to the target item, and the surface itself necessary to visualize the consequences of changing both.

      In the decades since Osgood's proposal, there have been many studies examining slivers of the proposal, e.g., just changing targets in one experiment, just changing cues in another experiment. Because any pair of experiments uses different methods, this has made it difficult to draw clear conclusions about the effects of particular manipulations.

      The current paper is a potential landmark, in that the authors manipulate multiple fundamental experimental characteristics using the same general experimental design. Importantly, they manipulate the semantic relatedness of the changed item to the original item, the delay between the study experience and the test, and which aspect of the pair is changed. Furthermore, they include both a positive control condition (where the exact same pair is studied twice), and a negative control condition (where a pair is only studied once, in the same phase as the critical base pairs). This allows them to determine when the prior learning exhibits an interfering effect relative to the negative control condition and also allows them to determine how close any facilitative effects come to matching the positive control.

      The results are interpreted in terms of a set of existing theories, most prominently the memory-for-change framework, which proposes a mechanism (recursive reminding) potentially responsible for the facilitative effects examined here. One of the central results is the finding that a stronger semantic relationship between a base pair and an earlier pair has a facilitative effect on both the rate of learning of the base pair and the durability of the memory for the base pair. This is consistent with the memory-for-change framework, which proposes that this semantic relationship prompts retrieval of the earlier pair, and the two pairs are integrated into a common memory structure that contains information about which pair was studied in which phase of the experiment. When semantic relatedness is lower, they more often show interference effects, with the idea being that competition between the stored memories makes it more difficult to remember the base pair.

      This work represents a major methodological and empirical advance for our understanding of paired-associates learning, and it sets a laudably high bar for future work seeking to extend this knowledge further. By manipulating so many factors within one set of experiments, it fills a gap in the prior literature regarding the cognitive validity of an 80-year-old proposal by Osgood. The reader can see where the observed results match Osgood's theory and where they are inconclusive. This gives us insight, for example, into the necessity of including a long delay in one's experiment, to observe potential facilitative effects. This point is theoretically interesting, but it is also a boon for future methodological development, in that it establishes the experimental conditions necessary for examining one or another of these facilitation or interference effects more closely.

      Weaknesses:

      One minor weakness of the work is that the overarching theoretical framing does not necessarily specify the expected result for each and every one of the many effects examined. For example, with a narrower set of semantic associations being considered (all of which are relatively high associations) and a long delay, varying the semantic relatedness of the target item did not reliably affect the memorability of that pair. However, the same analysis showed a significant effect when the wider set of semantic associations was used. The positive result is consistent with the memory-for-change framework, but the null result isn't clearly informative to the theory. I call this a minor weakness because I think the value of this work will grow with time, as memory researchers and theorists use it as a benchmark for new theory development. For example, the data from these experiments will undoubtedly be used to develop and constrain a new generation of computational models of paired-associates learning.

    3. Reviewer #2 (Public Review):

      Summary:

      The study focuses on how relatedness with existing memories affects the formation and retention of new memories. Of core interest were the conditions that determine when prior memories facilitate new learning or interfere with it. Across a set of experiments that varied the degree of relatedness across memories as well as retention interval, the study compellingly shows that relatedness typically leads to proactive facilitation of new learning, with interference observed only under specific conditions and at immediate test, making it an exception rather than the rule.

      Strengths:

      The study uses a well-established word-pair learning paradigm to study interference and facilitation of overlapping memories. However, it goes more in-depth than a typical interference study in the systematic variation of several factors: (1) which elements of an association are overlapping and which are altered (change target, change cue, change both, change neither); (2) how much the changed element differs from the original (word relatedness, with two ranges of relatedness considered); (3) retention period (immediate test, 2-day delay). Furthermore, each experiment has a large sample size, so both significant and null effects are robust and informative.

      The results show the benefits of relatedness, but also replicate interference effects in the "change target" condition when the new target is not related to the old target and when the test is immediate. This provides a reconciliation of some existing seemingly contradictory results on the effect of overlap on memory. Here, the whole range of conditions is mapped to convincingly show how the direction of the effect can flip across the surface of relatedness values.

      Additional strength comes from supporting analyses, such as analyses of learning data, demonstrating that relatedness leads to both better final memory and faster initial learning.

      More broadly, the study informs our understanding of memory integration, demonstrating how the interdependence of memory for related information increases with relatedness. Together with a prior study of retroactive interference and facilitation, the results provide new insights into the role of reminding in memory formation.

      In summary, this is a highly rigorous body of work that sets a great model for future studies and improves our understanding of memory organization.

      Weaknesses:

      The evidence for the proactive facilitation driven by relatedness is very convincing. However, in the finer scale results, the continuous relationship between the degree of relatedness and the degree of proactive facilitation/interference is less clear. This could be improved with some additional analyses and/or context and discussion. In the narrower range, the measure used was AS, with values ranging from 0.03-0.98, where even 0.03 still denotes clearly related words (pious - holy). Within this range from "related" to "related a lot", no relationship to the degree of facilitation was found. The wider range results are reported using a different scale, GloVe, with values from -0.14 to 0.95, where the lower end includes unrelated words (sap - laugh). It is possible that any results of facilitation/interference observed in the wider range may be better understood as a somewhat binary effect of relatedness (yes or no) rather than the degree of relatedness, given the results from the narrower condition. These two options could be more explicitly discussed. The report would benefit from providing clearer information about these measures and their range and how they relate to each other (e.g., not a linear transformation). It would be also helpful to know how the values reported on the AS scale would end up if expressed in the GloVe scale (and potentially vice-versa) and how that affects the results. Currently, it is difficult to assess whether the relationship between relatedness and memory is qualitative or quantitative. This is less of a problem with interdependence analyses where the results converge across a narrow and wider range.

      A smaller weakness is generalizability beyond the word set used here. Using a carefully crafted stimulus set and repeating the same word pairings across participants and conditions was important for memorability calculations and some of the other analyses. However, highlighting the inherently noisy item-by-item results, especially in the Osgood-style surface figures, makes it challenging to imagine how the results would generalize to new stimuli, even within the same relatedness ranges as the current stimulus sets.

    4. Reviewer #3 (Public Review):

      Summary:

      Bennion et al. investigate how semantic relatedness proactively benefits the learning of new word pairs. The authors draw predictions from Osgood (1949), which posits that the degree of proactive interference (PI) and proactive facilitation (PF) of previously learned items on to-be-learned items depends on the semantic relationships between the old and new information. In the current study, participants learn a set of word pairs ("supplemental pairs"), followed by a second set of pairs ("base pairs"), in which the cue, target, or both words are changed, or the pair is identical. Pairs were drawn from either a narrower or wider stimulus set and were tested after either a 5-minute or 48-hour delay. The results show that semantic relatedness overwhelmingly produces PF and greater memory interdependence between base and supplemental pairs, except in the case of unrelated pairs in a wider stimulus set after a short delay, which produced PI. In their final analyses, the authors compare their current results to previous work from their group studying the analogous retroactive effects of semantic relatedness on memory. These comparisons show generally similar, if slightly weaker, patterns of results. The authors interpret their results in the framework of recursive reminders (Hintzman, 2011), which posits that the semantic relationships between new and old word pairs promote reminders of the old information during the learning of the new to-be-learned information. These reminders help to integrate the old and new information and result in additional retrieval practice opportunities that in turn improve later recall.

      Strengths:

      Overall, I thought that the analyses were thorough and well-thought-out and the results were incredibly well-situated in the literature. In particular, I found that the large sample size, inclusion of a wide range of semantic relatedness across the two stimulus sets, variable delays, and the ability to directly compare the current results to their prior results on the retroactive effects of semantic relatedness were particular strengths of the authors' approach and make this an impressive contribution to the existing literature. I thought that their interpretations and conclusions were mostly reasonable and included appropriate caveats (where applicable).

      Weaknesses:

      Although I found that the paper was very strong overall, I have three main questions and concerns about the analyses.

      My first concern lies in the use of the narrow versus wider stimulus sets. I understand why the initial narrow stimulus set was defined using associative similarity (especially in the context of their previous paper on the retroactive effects of semantic similarity), and I also understand their rationale for including an additional wider stimulus set. What I am less clear on, however, is the theoretical justification for separating the datasets. The authors include a section combining them and show in a control analysis that there were no directional effects in the narrow stimulus set. The authors seem to imply in the Discussion that they believe there are global effects of the lower average relatedness on differing patterns of PI vs PF across stimulus sets (lines 549-553), but I wonder if an alternative explanation for some of their conflicting results could be that PI only occurs with pairs of low semantic relatedness between the supplemental and base pair and that because the narrower stimulus set does not include the truly semantically unrelated pairs, there was no evidence of PI.

      My next concern comes from the additive change in both measures (change in Cue + change in Target). This measure is simply a measure of overall change, in which a pair where the cue changes a great deal but the target doesn't change is treated equivalently to a pair where the target changes a lot, but the cue does not change at all, which in turn are treated equivalently to a pair where the cue and target both change moderate amounts. Given that the authors speculate that there are different processes occurring with the changes in cue and target and the lack of relationship between cue+target relatedness and memorability, it might be important to tease apart the relative impact of the changes to the different aspects of the pair.
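      The concern about the additive measure can be made concrete with a small sketch. The change values below are hypothetical numbers on an arbitrary 0-1 scale, not data from the study, and `additive_change` is an illustrative name rather than the authors' code. The point is only that three qualitatively different pairs collapse onto the same summed score, while the separate cue and target values keep them apart.

```python
# Hypothetical (cue_change, target_change) values on an arbitrary 0-1 scale.
# These numbers are illustrative only and are not taken from the study.
pairs = {
    "cue_only": (0.8, 0.0),       # cue changes a lot, target unchanged
    "target_only": (0.0, 0.8),    # target changes a lot, cue unchanged
    "both_moderate": (0.4, 0.4),  # cue and target both change moderately
}


def additive_change(cue_change, target_change):
    """Overall change as the simple sum of cue change and target change."""
    return cue_change + target_change


# The summed score is identical for all three pairs.
totals = {name: additive_change(*values) for name, values in pairs.items()}

# Keeping the two components separate preserves the distinction.
distinct_profiles = set(pairs.values())

for name, total in totals.items():
    print(f"{name}: cue={pairs[name][0]}, target={pairs[name][1]}, sum={total}")
```

      Entering cue change and target change as separate predictors (optionally with their interaction) would be one way to tease the two contributions apart.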

      Finally, it is unclear to me whether there was any online spell-checking that occurred during the free recall in the learning phase. If there wasn't, I could imagine a case where words might have accidentally received additional retrieval opportunities during learning - take for example, a case where a participant misspelled "razor" as "razer." In this example, they likely still successfully learned the word pair but if there was no spell-checking that occurred during the learning phase, this would not be considered correct, and the participant would have had an additional learning opportunity for that pair.
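      A lenient scoring rule of the kind this question alludes to could look like the sketch below. It is an assumption for illustration, not a description of the authors' procedure: the function names and the one-edit tolerance are hypothetical, and responses are compared to targets by Levenshtein edit distance.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution (or match)
            ))
        prev = curr
    return prev[-1]


def is_correct(response, target, max_edits=1):
    """Score a typed response as correct if it is within max_edits of the target.

    The tolerance of one edit is a hypothetical choice for this sketch.
    """
    return edit_distance(response.strip().lower(), target.lower()) <= max_edits


print(is_correct("razer", "razor"))  # misspelling within one edit
print(is_correct("blade", "razor"))  # a different word
```

      Any such tolerance is a trade-off: a looser threshold forgives more typos but can also accept a different real word that happens to be close to the target.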

    1. Reviewer #2 (Public Review):

      Summary:

      In the manuscript "Metabolic heterogeneity of colorectal cancer as a prognostic factor: insights gained from fluorescence lifetime imaging" by Komarova et al., the authors used fluorescence lifetime imaging and quantitative analysis to assess the metabolic heterogeneity of colorectal cancer. Generally, this work is logically well-designed, including in vitro and in vivo animal models and ex vivo patient samples. However, since the key parameter presented in this study, the BI index, is already published in a previous paper by this group (Shirshin et al., 2022), and the quantification method of metabolic heterogeneity has already been well (and even better) described in previous studies (such as the one by Heaster et al., 2019), the novelty of this study is doubted. Moreover, I am afraid that the way of data analysis and presentation in this study is not well done, which will be mentioned in detail in the following sections.

      Strengths:

      (1) Solid experiments are performed and well-organized, including in vitro and in vivo animal models and ex vivo patient samples.

      (2) Attempt and efforts to build the association between the metabolic heterogeneity and prognosis for colorectal cancer.

      Weaknesses:

      (1) The human sample number (from 21 patients) is very limited. I wonder how the limited patient number could lead to reliable diagnosis and prognosis.

      (2) The BI index or similar optical metrics have been well established by this and other groups; therefore, the novelty of this study is doubted.

    2. eLife assessment

      This study presents a valuable finding on the heterogeneity of tumour metabolism using fluorescence lifetime imaging, measured across 4 cell lines, 4 tumour types of in vivo mouse models, and 21 patient samples. The indication is that the level of heterogeneity of cellular metabolism increases with model complexity and demonstrates high heterogeneity at a clinical level. The evidence supporting the claims of the authors is solid, although the inclusion of a larger number of patient samples would have strengthened the study. The work will be of interest to medical biologists developing methods for quantifying metabolic heterogeneity.

    3. Reviewer #1 (Public Review):

      Summary:

      In this study, Komarova et al. investigate the clinical prognostic ability of cell-level metabolic heterogeneity quantified via the fluorescence lifetime characteristics of NAD(P)H. Fluorescence lifetime imaging microscopy (FLIM) has been studied as a minimally invasive approach to measure cellular metabolism in live cell cultures, organoids, and animal models. Its clinical translation is spearheaded through macroscopic implementation approaches that are capable of large sampling areas and enable access to otherwise constrained spaces but lack cellular resolution for a one-to-one transition with traditional microscopy approaches, making the interpretation of the results a complicated task. The merit of this study primarily lies in its design by analyzing with the same instrumentation and approach colorectal samples in different research scenarios, namely in vitro cells, in vivo animal xenografts, and tumor tissue from human patients. These conform to a valuable dataset to explore the translational interpretation hurdles with samples of increasing levels of complexity. For human samples, the study specifically investigates the prediction ability of NAD(P)H fluorescence metrics for the binary classification of tumors of low and advanced stage, with and without metastasis, and low and high grade. They find that NAD(P)H fluorescence properties have a strong potential to distinguish between high- and low-grade tumors and a moderate ability to distinguish advanced-stage tumors from low-stage tumors. This study provides valuable results contributing to the deployment of minimally invasive optical imaging techniques to quantify tumor properties and potentially migrate into tools for human tumor characterization and clinical diagnosis.

      Strengths:

      The investigation of colorectal samples under multiple imaging scenarios with the same instrument and approach conforms to a valuable dataset that can facilitate the interpretation of results across the spectrum of sample complexity.

      The manuscript provides a strong discussion reviewing studies that investigated cellular metabolism with FLIM and the metabolic heterogeneity of colorectal cancer in general.

      The authors thoroughly acknowledge the experimental limitations of investigating human samples ex vivo, and the analytical limitation of manual segmentation, for which they provide a path forward for higher throughput analysis.

      Weaknesses:

      To substantiate the changes in fluorescence properties at the examined wavelength range (associated with NAD(P)H fluorescence) in relationship to metabolism, the study would strongly benefit from additional quantification of metabolic-associated metrics using currently established standard methods. This is especially interesting when discussing heterogeneity, which is presumably high within and between patients with colorectal cancer, and could help explain the particularities of each sample leading to a more in-depth analysis of the acquired valuable dataset. Additionally, NAD(P)H fluorescence does not provide a complete picture of the cell/tissue metabolic characteristics. Including, or discussing the implications of including fluorescence from flavins would comprise a more compelling dataset. These additional data would also enable the quantification of redox metrics, as briefly mentioned, which could positively contribute to the prognosis potential of metabolic heterogeneity.

      In the current form of the manuscript, there is a diluted interpretation and discussion of the results obtained from the random forest and SHAP analysis regarding the ability of the FLIM parameters to predict clinicopathological outcomes. This is not only the main point the authors are trying to convey, given the title and the stated goals, but also a novel result given the scarce availability of this type of data, which could have a remarkable impact on colorectal cancer in situ diagnosis and therapy monitoring. These data merit a more in-depth analysis of the different factors involved. In this context, the authors should clarify how the "trend of association" is quantified (lines 194 and 199).

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Strengths:

      This work (almost didactically) demonstrates how to develop, calibrate, validate and analyze a comprehensive, spatially resolved, dynamical, multicellular model. Testable model predictions of (also non-monotonic) emergent behaviors are derived and discussed. The computational model is based on a widely-used simulation platform and shared openly such that it can be further analyzed and refined by the community.

      Weaknesses:

      While the parameter estimation approach is sophisticated, this work does not address issues of structural and practical non-identifiability (Wieland et al., 2021, DOI:10.1016/j.coisb.2021.03.005) of parameter values, given just tissue-scale summary statistics, and does not address how model predictions might change if alternative parameter combinations were used. Here, the calibrated model represents one point estimate (column "Value" in Suppl. Table 1) but there is specific uncertainty of each individual parameter value and such uncertainties need to be propagated (which is computationally expensive) to the model predictions for treatment scenarios.
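      The difference between a point estimate and a propagated uncertainty can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not the authors' model: a single cytokine decays exponentially, the decay rate `k` and its plausible range are hypothetical values, and the range is sampled uniformly and pushed through the model by plain Monte Carlo.

```python
import math
import random


def concentration(k, t=10.0, c0=1.0):
    """Toy model: exponential decay of a cytokine concentration at time t."""
    return c0 * math.exp(-k * t)


random.seed(0)

# Point estimate: one calibrated decay rate (hypothetical value).
point_prediction = concentration(0.10)

# Propagation: sample the decay rate from its assumed plausible range
# and evaluate the model for every sample.
samples = [random.uniform(0.05, 0.15) for _ in range(10_000)]
predictions = sorted(concentration(k) for k in samples)

# Central 95% interval of the predicted concentration.
low = predictions[int(0.025 * len(predictions))]
high = predictions[int(0.975 * len(predictions))]

print(f"point estimate: {point_prediction:.3f}")
print(f"95% interval:   {low:.3f} to {high:.3f}")
```

      For a stochastic multicellular model each sample would require one or more full simulation runs, which is why this kind of propagation is computationally expensive.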

      We thank the reviewer for the excellent suggestions and observations. The CaliPro parameterization technique applied puts an emphasis on finding a robust parameter space instead of a global optimum. To address structural non-identifiability, we utilized partial rank correlation coefficients at each iteration of the calibration process to ensure that the sensitivity of each parameter was relevant to model outputs. We also found ranges of parameter values that met the passing criteria but produced inconsistent outcomes when tested in replicate. This led us to further narrow the parameters into a single parameter set that still had stochastic variability but did not have such large variability between replicate runs that it would be unreliable. Additional discussion on this point has been added to lines 623-628. We acknowledge that there are likely other parameter sets or model rules that would produce similar outcomes, but the main purpose of the model was to better understand the system and make new predictions, which our calibration scheme allowed us to accomplish.

      Regarding practical non-identifiability, we acknowledge that there are some behaviors that are not captured in the model because those behaviors were not specifically captured in the calibration data. To ensure that the behaviors necessary to answer the aims of our paper were included, we used multiple different datasets and calibrated with multiple different output metrics. We believe we have identified the appropriate parameters to recapitulate the dominating mechanisms underlying muscle regeneration. We have added additional discussion on practical non-identifiability to lines 621-623.

      Suggested treatments (e.g. lines 484-486) are modeled as parameter changes of the endogenous cytokines (corresponding to genetic mutations!) whereas the administration of modified cytokines with changed parameter values would require a duplication of model components and interactions in the model such that cells interact with the superposition of endogenous and administered cytokine fields. Specifically, as the authors also aim at 'injections of exogenously delivered cytokines' (lines 578, 579) and propose altering decay rates or diffusion coefficients (Fig. 7), there needs to be a duplication of variables in the model to account for the coexistence of cytokine subtypes. One set of equations would have unaltered (endogenous) and another one have altered (exogenous or drugged) parameter values. Cells would interact with both of them.

      Our perturbations did not include exogenously delivered cytokines and instead were focused on microenvironmental changes in cytokine diffusion and decay rates or specific cytokine concentration levels. For example, the purpose of the VEGF delivery perturbation was to test how an increase in VEGF concentrations would alter regeneration outcome metrics with the assumption that the delivered VEGF would act in the same manner as the endogenous VEGF. We have clarified the purpose of the simulations on line 410. We agree that exploring whether model predictions would be altered if endogenous and exogenous cytokines were represented separately would be worthwhile; however, we did not explore this type of scenario.

      This work shows interesting emergent behavior from nonlinear cytokine interactions but the analysis does not provide insights into the underlying causes, e.g. which of the feedback loops dominates early versus late during a time course.

      Indeed, analyzing the model to fully understand the time-varying interactions between the multiple feedback loops is a challenge in and of itself, and we appreciate the opportunity to elaborate on our approach to addressing this challenge. First: the crosstalk/feedback between cytokines and the temporal nature was analyzed in the heatmap (Fig. 6) and lines 474-482. Second: the sensitivity of cytokine parameters to specific outputs was included in Table 9 and full-time course sensitivity is included in Supplemental Figure 2. Further correlation analysis was also included to demonstrate how cytokine concentrations influenced specific output metrics at various timepoints (Supplemental Fig. 3). We agree that further elaboration of these findings is required; therefore, we added lines 504-509 to discuss the specific mechanisms at play with the combined cytokine interactions. We also added more discussion (lines 637-638) regarding future work that could develop more analysis methods to further investigate the complex behaviors in the model.

      Reviewer #2 (Public Review):

      Strengths:

      The manuscript identified relevant model parameters from a long list of biological studies. This collation of a large amount of literature into one framework has the potential to be very useful to other authors. The mathematical methods used for parameterization and validation are transparent.

      Weaknesses:

      I have a few concerns which I believe need to be addressed fully.

      My main concerns are the following:

      (1) The model is compared to experimental data in multiple results figures. However, the actual experiments used in these figures are not described. To me as a reviewer, that makes it impossible to judge whether appropriate data was chosen, or whether the model is a suitable descriptor of the chosen experiments. Enough detail needs to be provided so that these judgements can be made.

      Thank you for raising this point. We created a new table (Supplemental table 6) that describes the techniques used for each experimental measurement.

      (2) Do I understand it correctly that all simulations are done using the same initial simulation geometry? Would it be possible to test the sensitivity of the paper results to this geometry? Perhaps another histological image could be chosen as the initial condition, or alternative initial conditions could be generated in silico? If changing initial conditions is an unreasonably large request, could the authors discuss this issue in the manuscript?

      We appreciate your insightful question regarding the initial simulation geometry in our model. The initial configuration of the fibers/ECM/microvascular structures was kept consistent but the location of the necrosis was randomly placed for each simulation. Future work will include an in-depth analysis of altered histology configuration on model predictions which has been added to lines 618-621. We did a preliminary example analysis by inputting a different initial simulation geometry, which predicted similar regeneration outcomes. We have added Supplemental Figure 5 that provides the results of that example analysis.

      (3) Cytokine knockdowns are simulated by 'adjusting the diffusion and decay parameters' (line 372). Is that the correct simulation of a knockdown? How are these knockdowns achieved experimentally? Wouldn't the correct implementation of a knockdown be that the production or secretion of the cytokine is reduced? I am not sure whether it's possible to design an experimental perturbation which affects both parameters.

      We appreciate that this important question has been posed. Yes, in order to simulate the knockout conditions, the cytokine secretion was reduced/eliminated. The diffusion and decay parameters were also adjusted to ensure that the concentration within the system was reduced. Lines 391-394 were added to clarify this assumption.

      (4) The premise of the model is to identify optimal treatment strategies for muscle injury (as per the first sentence of the abstract). I am a bit surprised that the implemented experimental perturbations don't seem to address this aim. In Figure 7 of the manuscript, cytokine alterations are explored which affect muscle recovery after injury. This is great, but I don't believe the chosen alterations can be done in experimental or clinical settings. Are there drugs that affect cytokine diffusion? If not, wouldn't it be better to select perturbations that are clinically or experimentally feasible for this analysis? A strength of the model is its versatility, so it seems counterintuitive to me to not use that versatility in a way that has practical relevance. - I may well misunderstand this though, maybe the investigated parameters are indeed possible drug targets.

      Thank you for your thoughtful feedback. The first sentence (lines 32-34) of the abstract was revised to focus on beneficial microenvironmental conditions to best reflect the purpose of the model. The clinical relevance of the cytokine modifications is included in the discussion (lines 547-558) with additional information added to lines 524-526. For example, two methods to alter diffusion experimentally are: antibodies that bind directly to the cytokine to prevent it from binding to its receptor on the cell surface and plasmins that induce the release of bound cytokines.

      (5) A similar comment applies to Figure 5 and 6: Should I think of these results as experimentally testable predictions? Are any of the results surprising or new, for example in the sense that one would not have expected other cytokines to be affected as described in Figure 6?

      We appreciate the opportunity to clarify the basis for these perturbations. The perturbations included in Figure 5 were designed to mimic the conditions of a published experiment that delivered VEGF in vivo (Arsic et al. 2004, DOI:10.1016/J.YMTHE.2004.08.007). The perturbation input conditions and experimental results are included in Table 8, and Supplemental Table 6 has been added to include experimental data and a method description of the perturbation. The results of this analysis provide both validation and new predictions, because some of the outputs were measured in the experiments while others were not. The additional output metrics and timepoints that were not collected in the experiment allow for a deeper understanding of the dynamics and mechanisms leading to the changes in muscle recovery (lines 437-454). These model outputs can provide the basis for future experiments; for example, they highlight which time points would be more important to measure and even provide predicted effect sizes that could be the basis for a power analysis (lines 639-640).

      Regarding Figure 6, the published experimental outcomes of cytokine KOs are included in Table 8. The model allowed comparison of different cytokine concentrations at various timepoints when other cytokines were removed from the system due to the KO condition. The experimental results did not provide data on the impact on other cytokine concentrations but by using the model we were able to predict temporally based feedback between cytokines (lines 474-482). These cytokine values could be collected experimentally but would be time consuming and expensive. The results of these perturbations revealed the complex nature of the relationship between cytokines and how removal of one cytokine from the system has a cascading temporal impact. Lines 533-534 have been added to incorporate this into the discussion.

      (6) In figure 4, there were differences between the experiments and the model in two of the rows. Are these differences discussed anywhere in the manuscript?

      We appreciate your keen observation and the opportunity to address these differences. The model did not match experimental results for CSA output in the TNF KO and anti-inflammatory nanoparticle perturbation or TGF levels with the macrophage depletion. While it did align with the other experimental metrics from those studies, it is likely that there are other mechanisms at play in the experimental conditions that were not captured by simulating the downstream effects of the experimental perturbations. We have added discussion of the differences to lines 445-454.

      (7) The variation between experimental results is much higher than the variation of results in the model. For example, in Figure 3 the error bars around experimental results are an order of magnitude larger than the simulated confidence interval. Do the authors have any insights into why the model is less variable than the experimental data? Does this have to do with the chosen initial condition, i.e. do you think that the experimental variability is due to variation in the geometries of the measured samples?

      Thank you for your insightful observations and questions. The lower model variability is attributed to the larger sample size of model simulations compared to experimental subjects. Running 100 simulations narrows the confidence interval (average 2.4 and max 3.3) compared to the experiments, which typically had a sample size of less than 15. If the number of simulations had been reduced to 15, the stochasticity within the model would result in a larger confidence interval (average 7.1 and max 10). There are also several possible confounding variables in the experimental protocols (e.g., variations in injury, different animal subjects for each timepoint) that are kept constant in the model simulation. We have added discussion of this point to the manuscript (lines 517-519). Future work with the model will examine how variations in conditions, such as initial muscle geometry and injury, alter regeneration outcomes and overall variability. This discussion has been incorporated into lines 640-643.
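
      The sample-size argument above can be illustrated with a minimal sketch. It assumes a normal-approximation 95% confidence interval for a mean, which is an assumption made here for illustration; the response does not state how the model confidence intervals were computed. The function name and the standard deviation value are hypothetical.

```python
import math

def ci_half_width(sample_sd, n, z=1.96):
    """Half-width of a normal-approximation confidence interval for a mean."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return z * sample_sd / math.sqrt(n)

# Hypothetical spread of one model output across stochastic replicates.
sd = 12.0
wide = ci_half_width(sd, 15)     # replicate count comparable to a typical experiment
narrow = ci_half_width(sd, 100)  # replicate count used for the simulations
ratio = wide / narrow            # equals sqrt(100 / 15) for any value of sd
```

      Under this approximation the interval shrinks with the square root of the number of replicates, so going from 15 to 100 runs narrows it by a factor of about 2.6 for the same underlying spread.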

      (8) Is figure 2B described anywhere in the text? I could not find its description.

      Thank you for pointing that out. We have added a reference for Fig. 2B on line 190.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) The model code seems to be available from https://simtk.org/projects/muscle_regen but that website requests member status ("This is a private project. You must be a member to view its contents.") and applying for membership could violate eLife's blind review process. So, this reviewer would have liked to run the model her/himself but could not. To eLife: Can the authors upload their model to a neutral server that reviewers and editors can access anonymously?

      The code has been made publicly available on the following sites:

      SimTK: https://simtk.org/docman/?group_id=2635

      Zenodo: https://zenodo.org/records/10403014

      GitHub: https://github.com/mh2uk/ABM-of-Muscle-Regeneration-with-MicrovascularRemodeling

      Line 121 has been updated with the new link and the additional resources were added to lines 654-657.

      (2) The muscle regeneration field typically studies 2D cross-sections and the present model can be well compared to these other 2D models but cells as stochastic and localized sources of diffusible cytokines may yield different cytokine fields in 3D vs. 2D. I would expect more broadened and smoothened cytokine fields (from sources in neighboring cross-sections) than what the 2D model predicts based on sources just within the focus cross-section. Such relations of 2D to 3D should be discussed.

      We thank the reviewer for the excellent suggestions and observations. It has been reported in other CompuCell3D models (Sego et al. 2017, DOI:10.1088/1758-5090/aa6ed4) that diffusion solutions converged to similar outcomes in 2D and 3D model configurations, with the 3D simulations incurring excessive computational cost without contributing any noticeable additional accuracy. Similarly, other cell-based ABMs that incorporate diffusion mechanisms (Marino et al. 2018, DOI:10.3390/computation6040058) have found that 2D and 3D versions of the model both predict the same mechanisms and that the 2D resolution was sufficient for determining outcomes. Lines 615-618 were added to elaborate on this topic.

      (3) Since the model (and title) focuses on "nonlinear" cytokine interactions, what would change if cytokine decay would not be linear (as modeled here) but saturated (with nonlinear Michaelis-Menten kinetics as ligand binding and endocytosis mechanisms would call for)?

      Thank you for raising an intriguing point. The model includes a combination of cytokine decay as well as ligand binding and endocytosis mechanisms that can be saturated. For a cytokine-dependent model behavior to occur, the cytokines necessary to induce that action had to reach a minimum threshold. Once that threshold was reached, that amount of the cytokine would be removed at that location to simulate ligand-receptor binding and endocytosis. These ligand binding and endocytosis mechanisms behave in a saturated way, removing a set amount when above a certain threshold or a defined ratio when under the threshold. Lines 313-315 were revised to clarify this point. There were certain concentrations of cytokines where we saw a plateau in outputs, likely as a result of reaching a saturation threshold (Supplemental Fig. 3). In future work, more robust mathematical simulation of binding kinetics of cytokines (e.g., using ODEs) could be included.
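
      The saturating removal rule described above (a set amount above a threshold, a defined ratio below it) can be sketched as follows. This is an illustrative reading of the response, not the authors' implementation; the function name, parameter names, and all numeric values are hypothetical.

```python
def cytokine_uptake(concentration, threshold, fixed_amount, ratio):
    """Amount of cytokine removed at one lattice site by binding and endocytosis.

    At or above the threshold a fixed amount is removed (saturated regime);
    below it, a fixed fraction of what is present is removed.
    """
    if concentration < 0:
        raise ValueError("concentration must be non-negative")
    if concentration >= threshold:
        # Never remove more than is actually present at the site.
        return min(fixed_amount, concentration)
    return ratio * concentration

# Hypothetical values: threshold 5.0, fixed removal 2.0, sub-threshold ratio 0.1.
saturated = cytokine_uptake(10.0, 5.0, 2.0, 0.1)
still_saturated = cytokine_uptake(100.0, 5.0, 2.0, 0.1)
proportional = cytokine_uptake(4.0, 5.0, 2.0, 0.1)
```

      Removal no longer grows with concentration once the threshold is exceeded, which is the qualitative behavior that Michaelis-Menten kinetics would produce at high ligand levels.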

      (4) Limitations of the model should be discussed together with an outlook for model refinement. For example, fiber alignment and ECM ultrastructure may require anisotropic diffusion. Many of the rate equations could be considered with saturation parameters etc. There are so many model assumptions. Please discuss which would be the most urgent model refinements and, to achieve these, which would be the most informative next experiments to perform.

      We appreciate your thoughtful consideration of the model's limitations and the need for a comprehensive discussion on model refinements and potential future experiments. The future direction section was expanded to discuss additional possible model refinements (lines 635-643) and additional possible experiments for model validation (lines 630-634).

      (5) It is not clear how the single spatial arrangement that is used affects the model predictions. E.g. now the damaged area surrounds the lymphatic vessel but what if the opposite corner was damaged and the lymphatic vessel is deep inside the healthy area?

      Thank you for highlighting the importance of considering different spatial arrangements in the model and its potential impact on predictions. We previously tested model perturbations that included specifying the injury surrounding the lymphatic vessel versus on the side opposite the vessel. Since this paper focuses more on cytokine dynamics, we plan to include this perturbation, along with other injury alterations, in a follow-on paper. We added more context about this in the future efforts section lines 640-643.

      (6) It seems that not only parameter values but also the initial values of most of the model components are unknown. The parameter estimation strategy does not seem to include the initial (spatial) distributions of collagen and cytokines and other model components. Please discuss how other (reasonable) initial values or spatial arrangements will affect model predictions.

      We appreciate your thoughtful consideration of unknown initial values/spatial arrangements and their potential influence on predictions. Initial cytokine levels prior to injury had a low relative concentration compared to levels post injury and were assumed to be negligible. The initial spatial distribution of cytokines was not defined as a spatial input (except in knockout simulations); instead, cytokines are secreted from cells (with baseline resident cell counts defined from the literature). The distribution of cytokines is an emergent behavior that results from the cell behaviors within the model. The collagen distribution is altered in response to clearance of necrosis by the immune cells (decreased collagen with necrosis removal) and subsequent secretion of collagen by fibroblasts. The secretion of collagen from fibroblasts was included in the parameter estimation sweep (Supplemental Table 1).

      We are working on further exploring the model sensitivity to altered spatial arrangements and have added this to the future directions section (lines 618-621), as well as provided Supplemental Figure 5 to demonstrate that model outcomes are similar with altered initial spatial arrangements.

      (7) Many details of the CC3D implementation are missing: overall lattice size, interaction neighborhood order, and "temperature" of the Metropolis algorithm. Are the typical adhesion energy terms used in the CPM Hamiltonian and if so, then how are these parameter values estimated?

      Thank you for bringing attention to the missing details regarding the CC3D implementation in our manuscript. We have included supplemental information providing greater detail for CPM implementation (Lines 808-854). We also added two additional supplemental tables for describing the requested CC3D implementation details (Supplemental Table 4) and adhesion energy terms (Supplemental Table 5).

      (8) Extending the model analysis of combinations of altered cytokine properties, which temporal schedules of administration would be of interest, and how could the timing of multiple interventions improve outcomes? Such a discussion or even analysis would further underscore the usefulness of the model.

      In response to your valuable suggestion, lines 558-562 were added to discuss the potential of using the model as a tool to perturb different cytokine combinations at varying timepoints throughout regeneration. In addition, this is also included in future work in lines 636-637.

      (9) The CPM is only weakly motivated, just one sentence on lines 142-145 which mentions diffusion in a misleading way as the CPM just provides cells with a shape and mechanical interactions. The diffusion part is a feature of the hybrid CompuCell3D framework, not the CPM.

      Thank you for bringing up this distinction. We removed the statement regarding diffusion and updated lines 143-146 to focus on CPM representation of cellular behavior and interactions. We also added a reference to supplemental text that includes additional details on CPM.

      (10) On lines 258-261 it does not become clear how the described springs can direct fibroblasts towards areas of low-density collagen ECM. Are the lambdas dependent on collagen density?

      Thank you for highlighting this area for clarification. The fibroblasts form links with low collagen density ECM and then are pulled towards those areas based on a constant lambda value. The links between the fibroblast and the ECM will only be made if the collagen is below a certain threshold. We added additional clarification to lines 260-264.

      (11) On line 281, what does the last part in "Fibers...were regenerating but not fully apoptotic cells" mean? Maybe rephrase this.

      The last part of that line indicates that there were some fibers surrounding the main injury site that were damaged but still had healthy portions, indicating that they were impacted by the injury and are regenerating but did not become fully apoptotic like the fiber cells at the main site of injury. We rephrased this line to indicate that the nearby fibers were damaged but not fully apoptotic.

      (12) Lines 290-293 describe interactions of cells and fields with localized structures (capillaries and lymphatic vessel). Please explain in more detail how "capillary agents...transport neutrophiles and monocytes" in the CPM model formalism. Are new cells added following rules? How is spatial crowding of the lattice around capillaries affecting these rules? Moreover, how can "lymphatic vessel...drain the nearby cytokines and cells"? How is this implemented in the CPM and how is "nearby" calculated?

      We appreciate your detailed inquiry into the interactions of cells and fields with localized structures. The neutrophils and monocytes are added to the simulation at the lattice sites above capillaries (within the cell layer, Fig. 2B) and undergo chemotaxis up their respective gradients. Recruited neutrophils and monocytes are randomly distributed among the healthy capillaries that do not already have an immune cell at the capillary location (a modeling artifact that is a byproduct of only having one cell per lattice site). This approach helped to prevent excessive crowding at certain capillaries. Because immune cells in the simulation are sufficiently small, chemotactic gradients are sufficiently large, and the simulation space is sufficiently large, we do not see aggregation of recruited immune cells in the CPM.

      The lymphatic vessel uptakes cytokines at lattice locations corresponding to the lymphatic vessel and will remove cells located in lattice sites neighboring the lymphatic vessel. In addition, we have included a rule in our ABM to encourage cells to migrate towards the lymphatic vessel utilizing CompuCell3D External Potential Plugin. The influence of this rule is inversely proportional to the distance of the cells to the lymphatic vessel.

      We have updated lines 294-298 and 305-309 to include the above explanation.
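
      The lymphatic-vessel rule described above, whose influence is inversely proportional to distance, can be sketched in plain Python. This is not the CompuCell3D plugin API and not the authors' code; the function name, the `strength` constant, and the `min_distance` floor (added here to avoid division by very small distances) are assumptions for illustration.

```python
import math

def lymphatic_bias(cell_xy, vessel_xy, strength=1.0, min_distance=1.0):
    """Bias vector pointing from a cell toward the lymphatic vessel.

    The magnitude is strength / distance, so cells near the vessel
    are pulled more strongly than cells far from it.
    """
    dx = vessel_xy[0] - cell_xy[0]
    dy = vessel_xy[1] - cell_xy[1]
    distance = math.hypot(dx, dy)
    if distance == 0:
        return (0.0, 0.0)  # cell is already at the vessel
    magnitude = strength / max(distance, min_distance)
    # Scale the unit direction vector by the distance-dependent magnitude.
    return (magnitude * dx / distance, magnitude * dy / distance)

near = lymphatic_bias((0.0, 0.0), (2.0, 0.0))   # 2 lattice units away
far = lymphatic_bias((0.0, 0.0), (10.0, 0.0))   # 10 lattice units away
```

      In an actual simulation a vector of this kind would be handed to whatever mechanism applies a directional force to each cell; how the authors scale and apply it is not specified in the response.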

      (13) Tables 1-4 define migration speeds as agent rules but in the typical CPM, migration speed emerges from random displacements biased by chemotaxis and other effects (like the slope of the cytokine field). How was the speed implemented as a rule while it is typically observable in the model?

      We appreciate your inquiry regarding the implementation of migration speeds. To determine the lambda parameters (Table 7) for each cell type, we tested each in a simplified control simulation with a concentration gradient for the cell to move towards. We tuned the lambda parameters within this simulation until the cell velocity output by the model aligned with the velocity reported in the literature for each cell type (Tables 1-4). We have added clarification on this to lines 177-180.
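
      One way such a calibration could be automated is a bisection search over lambda. The response only says the parameters were tuned, so bisection is a technique substituted here for illustration, and it assumes that measured speed increases monotonically with lambda. The `measure_speed` callable stands in for the simplified control simulation and is hypothetical, as is the linear toy model used in the example.

```python
def tune_lambda(measure_speed, target_speed, low, high, tol=1e-3, max_iter=60):
    """Bisection search for the lambda that reproduces a target cell speed.

    measure_speed(lam) must return the mean speed observed for a given
    lambda and is assumed to be monotonically increasing in lam.
    """
    for _ in range(max_iter):
        mid = 0.5 * (low + high)
        speed = measure_speed(mid)
        if abs(speed - target_speed) <= tol:
            return mid
        if speed < target_speed:
            low = mid   # cell is too slow: search larger lambdas
        else:
            high = mid  # cell is too fast: search smaller lambdas
    return 0.5 * (low + high)

# Toy stand-in for the control simulation: speed proportional to lambda.
def toy_speed(lam):
    return 0.02 * lam

tuned = tune_lambda(toy_speed, target_speed=1.0, low=0.0, high=200.0)
```

      With a stochastic CPM the measured speed is noisy, so each evaluation would need to average over several replicates before the comparison is meaningful.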

      (14) Line 312 shows the first equation with number (5), either add eqn. (1-4) or renumber.

      We have revised the equation number.

      (15) Typos: Line 456, "expect M1 cell" should read "except M1 cell".

      Line 452, "thresholds above that diminish fibroblast response (Supplemental Fig 3)." remains unclear, please rephrase.

      Line 473, "at 28." should read "at 28 days.".

      Line 474, is "additive" correct? Was the sum of the individual effects calculated and did that match?

      Line 534, "complexity our model" should read "complexity in our model".

      We have corrected the typos and clarified line 452 (updated line 594) to indicate that the TNF-α concentration threshold results in diminished fibroblast response. We updated terminology line 474 (updated line 512) to indicate that there was a synergistic effect with the combined perturbation.

      (16) Table 7 defines cell target volumes with the same value as their diameter. This enforces a strange cell shape. Should there be brackets to square the value of the cell diameter, e.g. Value=(12µm)^2 ?

      The target volume parameter values were selected to reflect the relative differences in average cell diameter as reported in the literature; however, there are no parameters that directly enforce a diameter for the cells in the CPM formalism separate from the volume. We have observed that these relative cell sizes allow the ABM to effectively reproduce cell behaviors described in the literature. Single cells that are too large in the ABM would be unable to migrate far enough per time step to carry out cell behaviors, and cells that are too small in the CPM would be unstable in the simulation environment and not persist in the simulation when they should. We removed the units for the cell shape values in Table 7 since the target volume is a relative parameter and does not directly represent µm.

      (17) Table 7 gives estimated diffusion constants but they appear to be too high. Please compare them to measured values in the literature, especially for MCP-1, TNF-alpha and IL-10, or relate these to their molecular mass and compare to other molecules like FGF8 (Yu et al. 2009, DOI:10.1038/nature08391).

      We utilized a previously published estimation method (Filion et al. 2004, DOI:10.1152/ajpheart.00205.2004) for estimating cytokine diffusivity within the ECM. This method incorporates the molecular masses and accounts for the combined effects of the collagen fibers and glycosaminoglycans. The paper acknowledged that the estimated value is faster than experimentally determined values, but that this was a result of the less-dense matrix composition which is more reflective of the tissue environment we are simulating in contrast to other reported measurements which were done in different environments. Using this estimation method also allowed us to more consistently define diffusion constants versus using values from the literature (which were often not recorded) that had varied experimental conditions and techniques (such as being in zebrafish embryo Yu et al. 2009, DOI:10.1038/nature08391 as opposed to muscle tissue). This also allowed for recalculation of the diffusivity throughout the simulation as the collagen density changed within the model. Lines 318-326 were updated to help clarify the estimation method.

      (18) Many DOIs in the bibliography (Refs. 7,17,20,31,40,47...153) are wrong and do not resolve because the appended directory names are not allowed in the DOI, just with a journal's URL after resolution.

      Thank you for bringing this to our attention. The incorrect DOIs have been corrected.

      Reviewer #2 (Recommendations For The Authors):

      Minor comments:

      (9) On line 174, the authors say "We used the CC3D feature Flip2DimRatio to control the number of times the Cellular-Potts algorithm runs per mcs." What does this mean? Isn't one Monte Carlo timestep one iteration of the Cellular Potts model? How does this relate to physical timescales?

      We appreciate your attention to detail and thoughtful question regarding the statement about the use of the CC3D feature Flip2DimRatio. Lines 175-177 were revised to simplify the meaning of Flip2DimRatio. That parameter alters the number of times the Cellular-Potts algorithm is run, which is the limiting factor for cell movement. The physical timescale is kept to a 15-minute timestep but a high Flip2DimRatio allows more flexibility and stability to allow the cells to move faster in one timestep.

      (10) Has the custom MATLAB script to process histology images into initial conditions been made available?

      The Matlab script along with CC3D code for histology initialization with documentation has been made available with the source code on the following sites:

      SimTK: https://simtk.org/docman/?group_id=2635

      Zenodo: https://zenodo.org/records/10403014

      GitHub: https://github.com/mh2uk/ABM-of-Muscle-Regeneration-with-MicrovascularRemodeling

      (11) Equation 5 is provided without a reference or derivation. Where does it come from and what does it mean?

      Thank you for highlighting the diffusion equation and seeking clarification on its origin and significance. Lines 318-326 were revised to clarify where the equation comes from. This is a previously published estimation method that we applied to calculate the diffusivity of the cytokines considering both collagen and glycosaminoglycans.

      (12) Line 326: "For CSA, experimental fold-change from pre-injury was compared with fold-change in model-simulated CSA". Does this step rely on the assumption that the fold change will not depend on the CSA? If so, is this something that is experimentally known, or otherwise, can it be confirmed by simulations?

      We appreciate the opportunity to clarify our rationale. The fold change was used as a method to normalize the model and experiment so that they could be compared on the same scale. Yes, this step relies on the assumption that fold change does not depend on pre-injury CSA. Experimentally, it is difficult to determine the impact of initial fiber morphology on the regeneration time course. The fold change allows us to compare percent recovery, which is a common metric used to assess muscle regeneration outcomes experimentally. Lines 340-343 were revised to clarify.
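
      The normalization described above amounts to dividing each timepoint by the pre-injury value. A minimal sketch follows; the function name and the CSA numbers are hypothetical and chosen only to show that two muscles of different absolute size can share the same fold-change trajectory.

```python
def fold_change(values, baseline):
    """Express a time course relative to its pre-injury baseline."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return [v / baseline for v in values]

# Hypothetical CSA time courses in arbitrary units; the first entry is pre-injury.
experiment_csa = [2000.0, 1000.0, 1500.0, 2000.0]
model_csa = [800.0, 400.0, 600.0, 800.0]

experiment_fc = fold_change(experiment_csa, experiment_csa[0])
model_fc = fold_change(model_csa, model_csa[0])
```

      Both series become [1.0, 0.5, 0.75, 1.0], so they can be compared directly even though the absolute areas differ. Whether recovery is truly independent of pre-injury CSA is the assumption the reviewer asks about, and this sketch does not test it.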

      (13) Line 355: "The final passing criteria were set to be within 1 SD for CSA recovery and 2.5 SD for SSC and fibroblast count" Does this refer to the experimental or the simulated SD?

      The model had to fit within those experimental SDs. Lines 371-372 were edited to specify that we are referring to the experimental SD.
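
      The passing criteria quoted in the comment (within 1 SD for CSA recovery and 2.5 SD for SSC and fibroblast count, measured against the experimental SD) can be expressed as a simple check. This sketch evaluates a single timepoint; the dictionary keys and all numeric values are hypothetical, and how the authors aggregate over timepoints is not specified here.

```python
# Allowed deviation from the experimental mean, in units of experimental SD.
N_SD = {"csa": 1.0, "ssc": 2.5, "fibroblast": 2.5}

def within_sd(simulated, exp_mean, exp_sd, n_sd):
    """True if a simulated value lies within n_sd experimental SDs of the mean."""
    return abs(simulated - exp_mean) <= n_sd * exp_sd

def run_passes(simulated, experiment):
    """Check every output of one run against its (mean, SD) experimental pair."""
    return all(
        within_sd(simulated[key], experiment[key][0], experiment[key][1], N_SD[key])
        for key in N_SD
    )

# Hypothetical experimental (mean, SD) pairs and two hypothetical model runs.
experiment = {"csa": (0.80, 0.05), "ssc": (40.0, 4.0), "fibroblast": (25.0, 2.0)}
good_run = {"csa": 0.83, "ssc": 48.0, "fibroblast": 21.0}
bad_run = {"csa": 0.90, "ssc": 48.0, "fibroblast": 21.0}
```

      Here `good_run` passes all three criteria, while `bad_run` fails because its CSA value lies two experimental SDs from the mean, outside the 1 SD band.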

      (14) "Following 8 iterations of narrowing the parameter space with CaliPro, we reached a set that had fewer passing runs than the previous iteration". Wouldn't one expect fewer passing runs with any narrowing of the parameter space? Why was this chosen as the stopping criterion for further narrowing?

      We appreciate your observation regarding the statement about narrowing the parameter space with CaliPro. We started with a wide parameter space, expecting that certain parameters would give outputs that fall outside of the comparable data. So, when the parameter space was narrowed to enrich parts that give passing output, initially the number of passing simulations increased.

      Once we have narrowed the set of possible parameters into an ideal parameter space, further narrowing will cut out viable parameters resulting in fewer passing runs. Therefore, we stopped narrowing once any fewer simulations passed the criteria that they had previously passed with the wider parameter set. Lines 375-379 have been updated to clarify this point.
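
      The stopping rule described above can be sketched as a scan over the number of passing runs per narrowing iteration. This does not implement CaliPro itself; the function name and the pass counts are hypothetical, and which iteration's parameter ranges are ultimately retained is not stated in the response.

```python
def first_decline(passing_counts):
    """Index of the first iteration with fewer passing runs than the one before.

    Returns None if the number of passing runs never decreases.
    """
    for i in range(1, len(passing_counts)):
        if passing_counts[i] < passing_counts[i - 1]:
            return i
    return None

# Hypothetical pass counts: narrowing initially enriches for passing runs,
# then begins to exclude viable parameter values.
counts = [12, 30, 41, 38]
stop_at = first_decline(counts)
```

      In this example narrowing stops at index 3, the first iteration where the passing count drops, which matches the logic that early narrowing removes failing regions while later narrowing cuts into viable ones.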

      (15) Line 516: "Our model could test and optimize combinations of cytokines, guiding future experiments and treatments." It is my understanding that this is communicated as a main strength of the model. Would it be possible to demonstrate that the sentence is true by using the model to make actual predictions for experiments or treatments?

      This is demonstrated by the combined cytokine alterations in Figure 7 and discussed in lines 509-513. We have also added in a suggested experiment to test the model prediction in lines 691-695.

      (16) Line 456, typo: I think 'expect' should be 'except'.

      Thank you for pointing that out. The typo has been corrected.

    2. eLife assessment

      This is so far the most comprehensive, spatially resolved in 2D, dynamical, multicellular model of murine muscle regeneration after injury. The work is an attempt to combine many contributors to muscle regeneration into one coherent calibrated framework. The presented analysis is solid and the model has the potential to be a very valuable tool in the areas of tissue morphogenesis, regenerative therapies, quantitative modeling and simulation.

    3. Reviewer #1 (Public Review):

      Summary:

      This work extends previous agent-based models of murine muscle regeneration by the authors (especially Westman et al., 2021) and by others (especially Khuu et al, 2023) by incorporating additional agent rules (altogether now based on over 100 published studies), threshold parameters and interactions with fields of cytokines and growth factors as well as capillaries (dynamically changing through damage and angiogenesis) and lymphatic vessels. The estimation of 52 unknown parameters against three time courses of tissue-scale observables (muscle cross-sectional area recovery, satellite stem cell count and fibroblast cell count) employs the CaliPro algorithm (Joslyn et al., 2021) and sensitivity analysis. The model is validated against additional time courses of tissue-scale observables and qualitative perturbation data, which match almost all conditions. This model is here used to predict (also non-monotonic) responses of (combinations of) cytokine perturbations but it moreover represents a valuable resource for further analysis of emergent behavior across multiple spatial scales in a physiologically relevant system.

      Strengths:

      This work (almost didactically) demonstrates how to develop, calibrate, validate and analyze a comprehensive, spatially resolved, dynamical, multicellular model. Testable model predictions of (also non-monotonic) emergent behaviors are derived and discussed. The computational model is based on a widely-used simulation platform and shared openly such that it can be further analyzed and refined by the community. The single parameter set used is a good starting point for future work that can, as outlined in the discussion section of the paper, analyze model results from the full distribution of matching parameter values and for a spectrum of realistic tissue configurations.

    4. Reviewer #2 (Public Review):

      Summary:

      In the paper, the authors use a cellular Potts model to investigate muscle regeneration. The model is an attempt to combine many contributors to muscle regeneration into one coherent framework. I believe the resulting model has the potential to be very useful in investigating the complex interplay of multiple actors contributing to muscle regeneration.

      Strengths:

      The manuscript identified relevant model parameters from a long list of biological studies. This collation of a large amount of literature into one framework has the potential to be very useful to other authors. The mathematical methods used for parameterization and validation are transparent.

      Comments on revised version:

      The authors have satisfactorily addressed my previous comments.

    1. eLife assessment

      This useful study identifies a population of CD81-positive fibroblasts showing senescence signatures that can activate neutrophils through the C3/C3aR1 axis, hence contributing to the inflammatory response in periodontitis. Solid evidence, combining in vitro and in vivo analyses and mouse and human data, supports these findings. The work could be of interest to researchers working in the senescence and oral medicine fields.

    2. Reviewer #1 (Public Review):

      Summary:

      In this manuscript, Liangliang Fu and colleagues propose that a population of CD81-positive fibroblasts exhibiting senescent features activate neutrophils via the C3/C3aR1 axis and contribute to maintaining the inflammatory response in periodontitis. The authors provide evidence that inhibition of cellular senescence by metformin treatment in murine models ameliorated periodontitis progression. This study provides some valuable insights into the impact of periodontitis-induced gingival damage and the significance of stromal senescence.

      Strengths:

      (1) The work combines a variety of models of periodontitis, including analyses of human samples, primary gingival fibroblast isolation and culture, and mouse models of ligature-induced periodontitis. The results are therefore solid in terms of the models used.

      (2) Comprehensive exhibition of methodologies incorporating histology procedures, micro-CT imaging, bulkRNAseq and scRNAseq transcriptomic profiles (the latter analyses of published datasets), and a number of computational analyses. The paper is robust at the technical level.

      (3) This paper is timely and interesting and it opens potential therapeutic avenues for the treatment of periodontitis. Although the interplay of senescence with periodontitis and the use of metformin has been previously reported (e.g. Kuang et al. Biogerontology 2020), I think the proposed mechanism of neutrophil activation by CD81-positive senescent fibroblasts and the inflammatory response is original. The paper is therefore at the forefront of the field, as senescence and its interplay with the immune system is a hot topic and reflects the current directions ("trending topics") of the field.

      Weaknesses:

      (1) The assessment of Cellular Senescence is limited and would benefit from additional biomarkers and not just p16 and p21, in particular in vivo.

      (2) This paper does not include original scRNAseq datasets in periodontitis, but analyses of already published datasets.

      (3) The authors claim that cellular senescence of CD81+ fibroblasts could be attributed to disturbances of lipid metabolism, resulting in differentiation arrest and higher expression of SASP factors in CD81+ fibroblast cells. Although the authors found that a series of pathways related to metabolism (metabolism of linoleic acid, linolenic acid, arachidonic acid, or steroid biosynthesis) are upregulated in CD81+ fibroblasts by transcriptomic analyses the hypothesis remains speculative and requires further validations.

      (4) Metformin has been reported to downregulate the SASP and lower senescent cell burden (e.g. for review see Kulkarni, Gubbi, and Barzilai. Cell Metab 2020). Although Metformin's senotherapeutic activities can be mediated by anti-inflammatory effects preventing NFkB translocation to the nucleus (Moiseeva et al. Aging Cell 2013), and it has been shown to prevent oxidative stress-induced senescence in human periodontal ligament cells (Kuang et al. Biogerontology 2020), it can also drive multiple pleiotropic effects unrelated to senescence.

      (5) Mechanistically, the proposed activation of neutrophils by senescent CD81+ fibroblasts via the C3/C3aR1 axis would be further supported by using a senolytic approach (e.g. Bcl2 inhibitor), allowing testing of whether eradication of senescent stromal cells results in reduced levels of CD81 and C3 positivity, and prevention of neutrophil infiltration.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors report the discovery of a population of gingival fibroblasts displaying the expression of cellular senescence markers P21 and P16 in human periodontitis samples and a murine ligature-induced periodontitis (LIP) model. They support this finding in the murine model through bulk RNA-sequencing and show that differentially expressed genes are significantly enriched in the SenMayo geneset of cellular senescence in aging. They then show that LIP mice treated with the senomorphic drug metformin display overall diminished bone damage, reduced histomorphic alterations, and a reduction in P21 and P16 immunostaining signal. To explore the cell types expressing cellular senescence markers in periodontitis, the authors make use of a combination of bioinformatic analyses on publicly available scRNA-seq data, immunostainings on patient samples and their LIP model, as well as in vitro culture of healthy human gingival fibroblasts treated with LPS. They found that fibroblasts are a cell population expressing P16 in periodontitis which are also enriched for SenMayo genes, suggesting they have a senescent phenotype. They then point to a subgroup of fibroblasts expressing CD81 with the highest enrichment for a SASP geneset in periodontitis. They also show that treatment of LIP mice and human LPS-treated gingival fibroblasts with metformin leads to a reduction of P21 and P16-positive cells, as well as the senescence-associated beta-galactosidase (SA-beta-gal) marker. Finally, they show evidence suggesting that CD81+ senescent fibroblasts are the source of C3 complement protein, which they postulate signals through the C3AR1 receptor present in neutrophils in periodontitis. The authors observed that both CD81+ fibroblast and C3AR1+ neutrophil populations are expanded in periodontitis, that both populations appear to be in close contact, and that treatment with metformin reduced both C3 and the neutrophil marker MPO in their mouse LIP model.

      Strengths:

      The study implements several different techniques and tools on human samples, mouse models, fibroblast cultures, and publicly available data to support their conclusions. In summary, the evidence suggests that in the context of periodontitis, there is an expansion of cells expressing the senescence markers P21 and P16, as well as members of the SASP, and that this includes CD81+ fibroblasts.

      Weaknesses:

      The manuscript appears to use as synonyms the terms "senescent cells" and "aging cells", as well as "senescence" and "aging", or "accelerated senescence" and "accelerated aging". This choice of words makes it difficult to understand the objectives of the study and the interpretations the authors are deriving from their results. The current understanding of the role of cellular senescence is that it is only one of the multiple biological aspects that characterize physiological aging. Although deeply intertwined, aging and cellular senescence are widely considered distinct phenomena, but the difference between these concepts seems blurry to me within the manuscript.

      After reading the manuscript, my understanding is that the authors are comparing the process of periodontitis to a form of accelerated aging, in which senescent cells are potential drivers or contributors. I believe this to be an interesting point of view. As the authors mention, periodontitis is more common in the elderly, and senescence is strongly implicated in aging. However, I am not entirely sure if the authors were trying to address such a question, and more importantly, the experiments conducted here cannot address the relationships between cellular senescence in periodontitis and aging as (1) they do not conduct an expanded analysis of molecular and cellular features of aging in the oral epithelium beyond cellular senescence, (2) they do not test this hypothesis in vitro and in vivo using models of accelerated or delayed aging (or publicly available datasets of such models), and (3) interpretations regarding the aging process are hindered by the fact that all human healthy patients were young adults, while all human periodontitis patients were middle-aged, while the mouse model did not include different age groups.

      The authors also refer to metformin as an "anti-aging" drug. Therefore, to me, it is not clear if the authors intended to use metformin as a senotherapeutic agent to show a correlation between senescence markers and the severity of periodontitis, or if they conceived their experiments and interpreted their results as "delaying the aging process". The latter would be more difficult to determine as cellular senescence is only one of the several aspects of the aging process in tissues. As none of the other molecular and cellular hallmarks that characterize the process of aging (epigenetic alterations, telomere shortening, immunosenescence, mitochondrial dysfunction, stem cell depletion, genomic instability, loss of proteostasis, nutrient sensing disruption, etc.) were studied, I believe this might be just a matter of semantics and rephrasing.

      On the other hand, and assuming the authors were only seeking to explore the role of cellular senescence in periodontitis (irrespective of the aging process), I have the following concerns:

      Major concerns:

      (1) A majority of the bioinformatic analyses regarding cellular senescence were conducted using only the SenMayo geneset reported by Dominik Saul et al. That geneset was developed by literature searching for genes associated with cellular senescence that had been studied in the context of human aging (in bone marrow). Thus, my understanding is that it is not an "aging" gene set as the authors describe it (and possibly interpret it) throughout the manuscript but a gene set of cellular senescence-associated genes that are overrepresented in aging tissues.

      The SenMayo geneset specifically excludes important genes like P21, P16, and RELA as they were used for validating that dataset against other datasets. Additionally, most of the genes that comprise SenMayo are cytokines and growth factors. This includes part of the SASP (and the authors also show enrichment for some SASP factors using the Coppé dataset in Figure 5) but excludes many of the core important processes that are known to define cellular senescence, including cell cycle inhibition, lack of cell proliferation, accumulation of DNA damage, activation of the lysosomal compartment and disruption of the nuclear envelope, among others. As the SenMayo geneset was developed for studying senescence in the context of aging, I believe it is important to conduct a more extensive analysis with other published gene sets of cellular senescence. Examples include the cellular senescence and SASP REACTOME pathways, the KEGG cellular senescence pathway, the cellular senescence GO term, the Fridman dataset, SeneQuest, CSGene, CellAge, etc. Most importantly, it will be important to show the enrichment of pathways related to hallmark pathways underlying cellular senescence such as cell cycle inhibition, the DNA damage response and repair pathways, NF-kB signaling, MTOR, and autophagy signaling, etc. Showing the enrichment level of these pathways in the CD81+ fibroblasts in periodontitis would be of utmost importance for backing up the conclusions of this study.
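
      As a rough illustration of what screening several gene sets could look like, the sketch below runs a simple over-representation (hypergeometric) test of a list of differentially expressed genes against more than one senescence-related gene set. This is not the enrichment method used in the manuscript, which is not described here. The toy gene sets, the gene universe size, and the function names are hypothetical placeholders; real analyses would load the published sets (REACTOME, KEGG, CellAge, etc.) and would normally correct for multiple testing.

```python
from math import comb


def hypergeom_sf(k, universe, set_size, n_drawn):
    """P(X >= k) when n_drawn genes are sampled without replacement from a
    universe of `universe` genes, of which `set_size` belong to the gene set."""
    upper = min(set_size, n_drawn)
    total = comb(universe, n_drawn)
    favourable = sum(
        comb(set_size, i) * comb(universe - set_size, n_drawn - i)
        for i in range(k, upper + 1)
    )
    return favourable / total


def enrichment(de_genes, gene_sets, universe_size):
    """Return {set_name: (overlap, p_value)} for each gene set."""
    de = set(de_genes)
    results = {}
    for name, members in gene_sets.items():
        members = set(members)
        overlap = len(de & members)
        p_value = hypergeom_sf(overlap, universe_size, len(members), len(de))
        results[name] = (overlap, p_value)
    return results


# Toy inputs for illustration only (hypothetical, not data from the study).
toy_sets = {
    "toy_sasp": {"IL6", "CXCL1", "CXCL12"},
    "toy_cell_cycle_arrest": {"CDKN1A", "CDKN2A", "TP53"},
}
toy_de_genes = ["IL6", "CXCL1", "CDKN1A", "ACTB"]
toy_results = enrichment(toy_de_genes, toy_sets, universe_size=20000)
```

      Running the same test over each published gene set would show whether the senescence signal depends on the SenMayo list alone.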

      (2) The most important aspect of the definition of cellular senescence is the absence of cell proliferation. Beyond the expression of the p21, p16, and SASP markers, any evidence showing that CD81+ fibroblasts are not proliferating in vivo in humans and mice, and in vitro in the case of LPS experiments, would be of great importance for defining these cells as senescent. Otherwise, conclusions should be toned down to refer to the expression of senescence markers or cells having a "senescent-like" phenotype.

      (3) The use of a "relative optic density" metric instead of positive cell counts as a measure for quantifying IHC stainings might pose challenges in reproducing these results, especially for the P21 and P16 stainings: although these proteins may also be found in the cytoplasm, they should be clearly present in the nucleus of positive cells. The quantification of the levels of these markers is of great importance for the conclusions of this study but I am concerned they would be too difficult to reproduce. In my opinion, cell counts and % of positive cells should be used, with a clear description of the total number of cells counted in the methodology. Otherwise, a strong justification for using OD in the methodology is required in addition to considering the following comments:

      a. There is no description in the methodology describing how this relative OD is measured and calculated. It is not clear if the data points shown in the graphs are biological replicates or OD means measured in different histological sections from the same sample.

      b. The images of P16 and P21 stainings in Figures 2E and 2F do not appear to resemble the differences in OD between conditions shown in the graphs of Figures 2Gd and 2Ge.

      c. The stainings shown for p16 in Figure 2E seem considerably different from those shown in Figure 1D. Additionally, the relative OD of P16 at 14D is around 0.08 in Figure 1D, while the mean for the control appears to be around 0.015 at 14D in Figure 2Gd. The use of OD as a measure is again worrying as this could be impacting interpretations: the difference between the ODs of LIP+MET (around 0.08) and LIP+ddH2O (around 0.015) is reported as significant but the difference in OD between LIP14D in Figure 1D (around 0.07) and LIP+ddH2O in Figure 2Gd (around 0.015) should not be significant as they are supposed to be similar control conditions.

      d. Irrespective of the measure used, the authors should state exact means and standard deviations, as well as exact P values, the statistical test used, and the number of biological replicates per group in parenthesis in the main text and figure legend.
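
      To make the suggested reporting concrete, here is a minimal sketch of quantification by percentage of positive cells, with the mean, standard deviation, number of biological replicates, and total cells counted reported per group. The function names and the example counts are hypothetical and are not taken from the manuscript.

```python
from statistics import mean, stdev


def percent_positive(positive, total):
    """Percentage of marker-positive cells in one biological replicate."""
    if total <= 0:
        raise ValueError("total cell count must be positive")
    return 100.0 * positive / total


def summarize_group(counts):
    """counts: list of (positive_cells, total_cells), one pair per biological replicate."""
    pcts = [percent_positive(p, t) for p, t in counts]
    return {
        "n": len(pcts),
        "mean_pct": mean(pcts),
        # The sample standard deviation needs at least two replicates.
        "sd_pct": stdev(pcts) if len(pcts) > 1 else float("nan"),
        "cells_counted": sum(t for _, t in counts),
    }


# Hypothetical counts for illustration only (not data from the study).
example = summarize_group([(30, 200), (50, 250), (45, 300)])
```

      Reporting these four numbers per group, together with the statistical test and exact P value, would let readers judge reproducibility directly.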

      (4) The conclusions derived from experiments with metformin in mice and cell cultures are not fully supported by the evidence.

      First, metformin has multiple molecular targets, as well as multiple organ and tissue targets. The experiments presented in mice do not consider/evaluate the systemic effects of metformin nor local effects in other gingival cell types and this should be discussed.

      As mentioned before, these experiments cannot be interpreted as testing metformin in the context of "anti-aging", as this study addresses cellular senescence in periodontitis. However, the results are still relevant as there is considerable evidence showing that metformin has senomorphic activity. In this regard, the authors could make use of a compound that has been more extensively characterized as a senolytic such as ABT-737, ABT-263 (Navitoclax), or the combination of Dasatinib + Quercetin, to show the effect of eliminating senescent cells in their LPS induction fibroblast model.

      They could also show the effect of metformin on the activation of other hallmark senescence pathways (such as the NF-kB pathway or the DNA damage response) and on the expression of SASP factors they identified as overexpressed in CD81+ fibroblasts through their analysis against the SenMayo dataset (e.g., IL6, CXCL1, CXCL12). This could be done in their samples from metformin-treated mouse experiments and in their LPS induction fibroblast model.

      (5) For the data produced on the authors' human samples, the difference in the age of patient groups is a significant confounding factor. This is because all their periodontitis patient samples came from middle-aged individuals (mean age above 50 years), while all healthy samples were obtained from young adults (mean age 25 years). The authors should justify this selection of age groups and discuss how differences in the age of each experimental group could impact the validity of their results regarding the accumulation of senescent cells. Showing the level of P21 and P16 positive cell accumulation in samples from healthy patients from a similar age group (middle-aged) is of great importance to support the validity of their results in humans.

    4. Reviewer #3 (Public Review):

      Summary:

      This work investigates the role of cellular senescence in the progression of Periodontitis using a combination of in vivo and in vitro mouse modelling experiments, human periodontitis samples, and transcriptomic analyses.

      The authors propose that gum fibroblasts from either patient periodontitis samples or a mouse model of periodontitis can enter a state of cellular senescence (Figure 1). Treatment of their periodontitis mouse model with the compound Metformin attenuated this senescent phenotype and mildly reduced symptom severity, thereby providing a potential mechanistic link between the senescent state and disease progression (Figure 2).

      Leveraging analysis of published single-cell RNA-sequencing datasets of human healthy and periodontitis gum samples, the authors identify CD81+ gum fibroblasts as the cell type with the greatest enrichment of senescence-associated gene expression (Figures 3 and 4) as well as possessing metabolic alterations (Figure 5). Finally, the authors propose that these senescent gum fibroblasts are able to recruit neutrophils through C3 signalling, generating a sustained inflammatory environment that promotes disease progression (Figure 6).

      The conclusions of this research are mostly well supported by the data. However, the characterisation of the senescent state and its causal involvement in disease progression could be further improved.

      Strengths:

      (1) The authors' use of both human and mouse samples provides important translational relevance to their research by finding analogous populations of putatively senescent fibroblasts in both systems.

      (2) The use of single-cell RNA-sequencing datasets derived from patient control and periodontitis samples provides a powerful system for interrogating specific cell types. Such an analysis allowed for the characterisation of fibroblast heterogeneity revealing the unique CD81-expressing subset as having the greatest senescent characteristics. Importantly, this result was validated by immunofluorescence in both mouse and human periodontitis systems.

      Weaknesses:

      (1) The assessment of cellular senescence induction during periodontitis is rather superficial, relying on p16 and p21 immunohistochemical staining and geneset enrichment analysis (Figure 1). This could be bolstered by their in vitro human fibroblast culture system utilising LPS stimulation. Specifically, their assessment could be more robust by including further markers of senescence such as (i) expression of DNA-damage markers, (ii) evidence of proliferative arrest, and (iii) assessment of an induced secretory phenotype. While a SASP signature was defined in Figure 5A, this was derived from a published single-cell RNA-sequencing dataset. Finding an analogous SASP signature in their human fibroblast cultures, or in a bulk RNA-sequencing comparison of mouse normal-versus-periodontitis tissue, would provide more compelling evidence for senescence induction.

      (2) While Metformin treatment has an existing basis in the literature as a therapeutic strategy for treating periodontitis, the authors of the current study provide novelty by proposing that Metformin acts by reducing the senescent cell burden during periodontitis. While Metformin treatment is able to significantly reduce the severity of bone damage in ligation-induced periodontitis, the effect is quite mild and the evidence presented does not compellingly show an effect on the putatively senescent p16+ and p21+ cell populations in the gum (Figures 2E and F). Moreover, while the authors show that Metformin treatment is able to attenuate senescence by reducing the expression of senescence-associated Beta-galactosidase (Supplementary Figure 2E), this raises several questions. Namely, (i) does Metformin prevent the acquisition of a senescent state, or (ii) is it acting as a senolytic by actively killing the senescent fibroblasts? It would be important to address these questions to better assess whether Metformin treatment is efficacious only prophylactically, or whether it can have an effect during disease progression. Furthermore, experimentally testing whether other widely utilised senolytic strategies (e.g. Navitoclax, Dasatinib+Quercetin, Fisetin) or a p16-/- genetic background are able to attenuate senescence and produce a similar protective response would provide more compelling evidence to support their conclusion that cellular senescence has a causal role in disease progression.

      (3) The authors' metabolic profiling of their senescent gum fibroblasts, through interrogation of the transcriptomic datasets, reveals an upregulated synthesis of arachidonic acid. From this they propose that it can be converted into prostaglandins and leukotrienes by COXs expressed by the fibroblasts, fuelling tissue inflammation. However, this mechanism promoting inflammation is speculative and lacks experimental demonstration. To support this mechanism it would be important to show (i) increased prostaglandin/leukotriene levels in periodontitis (relative to healthy control) and (ii) the ability to reduce this by attenuating the senescent phenotype (either by Metformin or other senolytic strategies).

    1. eLife assessment

      This valuable study presents a machine learning model to recommend effective antimicrobial drugs from patients' samples analysed with mass spectrometry. While the proposed approach of training a single model across different bacterial species and drugs seems promising, the comparison with baselines and related work is incomplete. With the evaluation part strengthened, this paper would be of interest to computational biologists, microbiologists, and clinicians.

    2. Reviewer #1 (Public Review):

      Summary:

      De Waele et al. reported a dual-branch neural network model for predicting antibiotic resistance profiles using matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry data. Neural networks were trained on the recently available DRIAMS database of MALDI-TOF mass spectrometry data and their associated antibiotic susceptibility profiles. The authors used a dual-branch neural network approach to simultaneously represent information about mass spectra and antibiotics for a wide range of species and antibiotic combinations. The authors showed consistent performance of their strategy to predict antibiotic susceptibility for different spectrum and antibiotic representations (i.e., embedders). Remarkably, the authors showed how small datasets collected at one location can improve the performance of a model trained with limited data collected at a second location. Despite these promising results, there are several analyses that the authors could incorporate to offer additional support to some of their claims (see weaknesses). In particular, this work would benefit from a more comprehensive comparison of the authors' single recommender model vs an ensemble of specialist models, and the inclusion of 1-2 examples that showcase how their model could be translated into the clinic.

      Strengths:

      • A single AMR recommender system could potentially facilitate the adoption of MALDI-TOF-based antibiotic susceptibility profiling into clinical practices by reducing the number of models to be considered, and the efforts that may be required to periodically update them.

      • Authors tested multiple combinations of embedders for the mass spectra and antibiotics while using different metrics to evaluate the performance of the resulting models. Models trained using different spectrum embedder-antibiotic embedder combinations had remarkably good performance for all tested metrics. The average ROC AUC scores for global and spectrum-level evaluations were above 0.9. Average ROC AUC scores for antibiotic-level evaluations were greater than 0.75.

      • Authors showed that data collected in one location can be leveraged to improve the performance of models generated using a smaller number of samples collected at a different location. This result may encourage researchers to optimize data integration to reduce the burden of data generation for institutions interested in testing this method.

      Weaknesses:

      • Although ROC AUC is a widely used metric, other metrics such as precision, recall, sensitivity, and specificity are not reported in this work. The last two metrics would help readers understand the model's potential implications in the context of clinical research.

      • The authors did not hypothesize or describe in any way what an acceptable performance of their recommender system should be in order to be adopted by clinicians.

      • Related to the previous comment, this work would strongly benefit from the inclusion of 1-2 real-life applications of their method that could showcase the benefits of their strategy for designing antibiotic treatment in a clinical setting.

      • The authors do not offer information about the model features associated with resistance. This information may offer insights about mechanisms of antimicrobial resistance and how conserved they are across species.

      • Comparison of AUC values across models lacks information regarding statistical significance. Without this information it is hard for a reader to figure out which differences are marginal and which ones are meaningful (for example, it is unclear if a difference in average AUC of 0.02 is significant). This applies to Figure 2, Figure 3, and Table 2 (and the associated supplementary figures).

      • One key claim of this work was that their single recommender system outperformed specialist (single species-antibiotic) models. However, in its current status, it is not possible to determine that in fact that is the case (see comment above). Moreover, comparisons to species-level models (that combine all data and antibiotic susceptibility profiles for a given species) would help to illustrate the putative advantages of the dual branch neural network model over species-based models. This analysis will also inform the species (and perhaps datasets) for which specialist models would be useful to consider.

      • Taking into account that the clustering of spectra embeddings seemed to be species-driven (Figure 4), one may hypothesize that there is limited transfer of information between species, and therefore the neural network model may be working as an ensemble of species models. Thus, this work would deeply benefit from a comparison between the authors' general model and an ensemble model in which the species is first identified and then the relevant species recommender is applied. If authors had identified cases to illustrate how data from one species positively influence the results for another species, they should include some of those examples.
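
      Two of the points above (reporting sensitivity and specificity, and attaching uncertainty to AUC differences between models) can be sketched in a few lines. The code below is an illustrative outline only, not the evaluation used in the manuscript: it assumes binary labels with 1 = resistant and 0 = susceptible, a fixed decision threshold chosen by the user, and a paired percentile bootstrap over test samples. All function names are hypothetical.

```python
import random


def roc_auc(labels, scores):
    """AUC as the probability that a positive sample outranks a negative one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def bootstrap_auc_difference(labels, scores_a, scores_b,
                             n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUC(model A) - AUC(model B),
    resampling the same test samples for both models (paired design)."""
    if len(set(labels)) < 2:
        raise ValueError("labels must contain both classes")
    rng = random.Random(seed)
    n = len(labels)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain a single class
        auc_a = roc_auc(ys, [scores_a[i] for i in idx])
        auc_b = roc_auc(ys, [scores_b[i] for i in idx])
        diffs.append(auc_a - auc_b)
    diffs.sort()
    lo = diffs[int((alpha / 2) * n_boot)]
    hi = diffs[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi
```

      If the resulting interval for a pair of models includes zero, the observed AUC difference is not clearly distinguishable from resampling noise. The same routine could be applied per species to compare the general recommender with species-level or specialist models.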

    3. Reviewer #2 (Public Review):

      The authors frame the MS-spectrum-based prediction of antimicrobial resistance as a drug recommendation task. Weis et al introduced the dataset this model is tested on and benchmark models which take as input a single species and are trained to predict resistance to a single drug. Instead here, a pair of drug and spectrum are fed to 2 neural network models to predict a resistance probability. In this manner, knowledge from different drugs and species can be shared through the model parameters. Three questions are asked: 1. what is the best way to encode the drugs? 2. does the dual NN outperform the single-spectrum drug?

      Overall the paper is well-written and structured. It presents a novel framework for a relevant problem. The work would benefit from more work on evaluation.

    1. eLife assessment

      The authors examined whether the frequency of alternative splicing across entire genomes correlates with measures of complexities across prokaryotes and eukaryotes. Although the question is very interesting and important for our general understanding of the evolution of life forms, the work is inadequate: the methods, data, and analyses do not support the primary claims. The measure of alternative splicing frequency used by the authors is problematic; the method is inappropriate; the observed correlations may also be explained by known population genetics principles; and parts of the manuscript are difficult to understand.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors collected genomic information from public sources covering 423 eukaryote genomes and around 650 prokaryote genomes. Based on pre-computed CDS annotation, they estimated the frequency of alternative splicing (AS) as a single average measure for each genome and computed correlations between this measure and other genomic properties such as genome size, percentage of coding DNA, gene and intergenic span, etc. They conclude that AS frequency increases with genome complexity in a somewhat directional trend from "lower" organisms to "higher" organisms.

      Strengths:

      The study covers a wide range of taxonomic groups, both in prokaryotes and eukaryotes.

      Weaknesses:

      The study is weak both methodologically and conceptually. Current high throughput sequencing technologies, coupled with highly heterogeneous annotation methods, can observe cases of AS with great sensitivity, and one should be extremely cautious of the biases and rates of false positives associated with these methods. These issues are not addressed in the manuscript. Here, AS measures seem to be derived directly from CDS annotations downloaded from public databases, and do not account for differing annotation methods or RNA sequencing depth and tissue sample diversity.

      There is no mention of the possibility that AS could be largely caused by random splicing errors, a possibility that could very well fit with the manuscript's data. Instead, the authors adopt early on the view that AS is regulated and functional, generally citing outdated literature.

      There is no question that some AS events are functional, as evidenced by strongly supported studies. However, whether all AS events are functional is questionable, and the relative fractions of functional and non-functional AS are unknown. With this in mind, the authors should be more cautious in interpreting their data. The "complexity" of organisms also correlates well (negatively) with effective population size. The power of selection to eliminate (slightly) deleterious mutations or errors decreases with effective population size. The correlation observed by the authors could thus easily be explained by a non-adaptive interpretation based on simple population genetics principles.
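
      One standard way to make the non-adaptive argument concrete is Kimura's diffusion approximation for the fixation probability of a new mutation. The sketch below is an illustration, not an analysis from the manuscript or the review: it assumes a diploid population whose census size equals its effective size, a single new mutant copy, and a selection coefficient s (negative for deleterious variants). The function name and parameter values are hypothetical.

```python
from math import expm1


def fixation_probability(n_e, s):
    """Kimura's approximation for a single new mutation in a diploid population,
    assuming census size equals effective size n_e:
        u = (1 - exp(-2 s)) / (1 - exp(-4 n_e s))
    """
    if s == 0:
        return 1.0 / (2 * n_e)  # neutral expectation
    exponent = -4.0 * n_e * s
    if exponent > 700:
        # exp() would overflow; the fixation probability is effectively zero.
        return 0.0
    # expm1(x) = exp(x) - 1, so the ratio below equals the formula above.
    return expm1(-2.0 * s) / expm1(exponent)


# Compare a slightly deleterious variant (hypothetical s = -0.001) with the
# neutral expectation in a small and a large population.
for n_e in (100, 10_000):
    relative = fixation_probability(n_e, -0.001) / fixation_probability(n_e, 0)
    print(n_e, relative)
```

      Under these assumptions, the slightly deleterious variant fixes at a rate close to neutral in the small population and is almost never fixed in the large one, which is the pattern the non-adaptive interpretation relies on.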

      The manuscript contains evidence that the authors might benefit from adopting a more modern view of how evolution proceeds. Sentences such as "... suggests that only sophisticated organisms optimize alternative splicing by increasing..." (L113), or "especially in highly evolved groups such as mammals" (L130), or the repeated use of "higher" and "lower" organisms need revising.

      Because of the lack of controls mentioned above, and because of the absence of discussion regarding an alternative non-adaptive interpretation, the analyses presented in the manuscript are of very limited use to other researchers in the field. In conclusion, the study does not present solid conclusions.

    3. Reviewer #2 (Public Review):

      Summary:

      In this contribution, the authors investigate the degree of alternative splicing across the evolutionary tree and identify a trend of increasing alternative splicing as you move from the base of the tree (here, only prokaryotes are considered) towards the tips of the tree. In particular, the authors investigate how the degree of alternative splicing (roughly speaking, the number of different proteins made from a single ORF (open reading frame) via alternative splicing) relates to three genomic variables: the genome size, the gene content (meaning the fraction of the genome composed of ORFs), and finally, the coding percentage of ORFs, meaning the ratio between exons and total DNA in the ORF. When correlating the degree of alternative splicing with these three variables, they find that the different taxonomic groups have a different correlation coefficient, and identify a "progressive pattern" among metazoan groups, namely that the correlation coefficient mostly increases when moving from flowering plants to arthropods, fish, birds, and finally mammals. They conclude that therefore the amount of splicing that is performed by an organismal group could be used as a measure of its complexity.

      Weaknesses:

      While I find the analysis of alternative splicing interesting, I also find that it is a very imperfect measure of organismal complexity and that the manuscript as a whole is filled with unsupported statements. First, I think it is clear to anyone studying evolution over the tree of life that it is the complexity of gene regulation that is at the origin of much of organismal structural and behavioral complexity. Arguably, creating different isoforms out of a single ORF is just one example of complex gene regulation. However, the complexity of gene regulation is barely mentioned by the authors. Further, it is clear that none of their correlation coefficients actually show a simple trend (see Table 3). According to these coefficients, birds are more complex than mammals for 3 out of 4 measures. It is also not clear why the correlation coefficient between alternative splicing ratio and genome length, gene content, and coding percentage should display such a trend, rather than the absolute value. There are only vague mechanistic arguments.

      Much more troubling, however, is the statement that the data supports "lineage-specific trends" (lines 299-300). Either this is just an ambiguous formulation, or the authors claim that you can see trends *within* lineages. The latter is clearly not the case. In fact, within each lineage, there is a tremendous amount of variation, to such an extent that many of the coefficients given in Table 3 are close to meaningless. Note that no error bars or p-values are presented for the values shown in Table 3. Figure 2 shows the actual correlation, and the coefficient for flowering plants there is given as 0.151, with a p-value of 0.193. Table 3 seems to quote r=0.174 instead. It should be clear that a correlation within a lineage or species is not a sign of a trend.
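
      The missing error bars can be approximated with a Fisher z-transform confidence interval for a Pearson correlation. The sketch below is illustrative only: r = 0.151 is the value quoted above from Figure 2, whereas the sample size n = 75 is a hypothetical placeholder, since the number of flowering plant genomes is not stated here. The function name is also an assumption.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist


def pearson_ci(r, n, level=0.95):
    """Approximate confidence interval for a Pearson correlation
    using the Fisher z-transform (requires n > 3)."""
    if n <= 3:
        raise ValueError("need more than 3 observations")
    z = atanh(r)
    se = 1.0 / sqrt(n - 3)
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    return tanh(z - crit * se), tanh(z + crit * se)


# r from Figure 2 as quoted in the review; n = 75 is hypothetical.
lower, upper = pearson_ci(0.151, 75)
print(lower, upper)
```

      With these inputs the interval spans zero, which is consistent with the non-significant p-value quoted for flowering plants. Reporting such intervals for every entry in Table 3 would show which coefficients are distinguishable from zero and from each other.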

      There are several wrong or unsupported statements in the manuscript. Early on, the authors state that the alternative splicing ratio (a number greater or equal to one that can be roughly understood as the number of different isoforms per ORF) "quantifies the number of different isoforms that can be transcribed using the same amount of information" (lines 51-52). But in many cases, this is incorrect, because the same sequence can represent different amounts of information depending on the context. So, if a changed context gives rise to a different alternative splice, it is because the genetic sequence has a different meaning in the changed context: the information has changed. In line 149, the authors state that "the energetic cost of having large genomes is high". No citation is given, and while such a statement seems logical, it does not have very solid support. If there was indeed a strong selective force to reduce genome size, we would not see the stunning diversity of genome sizes even within lineages. This statement is repeated (without support) several times in the manuscript, apparently in support of the idea that mammals had "no choice" to increase complexity via alternative splicing because they can't increase it by having longer genomes. I don't think this reasoning can be supported. Even more problematic is the statement that "the amount of protein-coding DNA seems to be limited to a size of about 10MB" (line 219). There is no evidence whatsoever for this statement. The reference that is cited (Choi et al 2020) suggests that there is a maximum of 150GB in total genome size due to physiological constraints. In lines 257-258, the authors write that "plants are less restricted in terms of storing DNA sequences compared to animals" (without providing evidence or a citation). I believe this statement is made due to the observation that plants tend to have large intergenic regions. 
But without examining the functionality of these intergenic regions (they might host long non-coding RNA stretches that are used to regulate the expression of other genes, for example) it is quite adventurous to use such a simple measure as being evidence that plants "are less restricted in terms of storing DNA sequences", whatever that even means. I do not think the authors mean that plants have better access to -80 freezers. The authors conclude that "plant's primary mechanism of genome evolution is by expanding their genome". This statement itself is empty: we know that plants are prone to whole genome duplication, but this duplication is not, as far as we understand, contributing to complexity. It is not a "primary mechanism of genome evolution". In lines 293-294, the authors claim that "alternative splicing is maximized in mammalian genomes". There is no evidence that this ratio cannot be increased. So, to conclude (on lines 302-303) that alternative splicing ratios are "a potential candidate to quantify organismal complexity" seems, based on this evidence, both far-fetched and weak at the same time.

      I am also not very comfortable with the data analysis. The authors, for example, say that they have eliminated from their analysis a number of "outlier species". They mention one: Emmer wheat because it has a genome size of 900 Mb (line 367). Since 900MB does not appear to be extreme, perhaps the authors meant to write 900 Gb. When I consulted the paper that sequenced Triticum dicoccoides, they noted that 14 chromosomes are about 10GB. Even a tetraploid species would then not be near 900Gb. But more importantly, such a study needs to state precisely which species were left out, and what the criteria are for leaving out data, lest they be accused of selecting data to fit their hypothesis.

      I understand that Methods are often put at the end of a manuscript, but the measures discussed here are so fundamental to the analysis that a brief description of what the different measures are (in particular, the "alternative splicing ratio") should be in the main text, even when the mathematical definition can remain in the Methods.

      Finally, a few words on presentation. I understand that the following comments might read differently after the authors change their presentation. This manuscript was at the border of being comprehensible. In many cases, I could discern the meaning of words and sentences in contexts but sometimes even that failed (as an example above, about "species-specific trends", illustrates). The authors introduced jargon that does not have any meaning in the English language, and they do this over and over again.

      Note that I completely agree with all the comments by the other reviewer, who alerted me to problems I did not catch, including the possible correlation with effective population size: a possible non-adaptive explanation for the results.

    4. Author response:

      Reviewer #1 (Public Review):

      Summary:

      The authors collected genomic information from public sources covering 423 eukaryote genomes and around 650 prokaryote genomes. Based on pre-computed CDS annotation, they estimated the frequency of alternative splicing (AS) as a single average measure for each genome and computed correlations with this measure and other genomic properties such as genome size, percentage of coding DNA, gene and intergenic span, etc. They conclude that AS frequency increases with genome complexity in a somewhat directional trend from "lower" organisms to "higher" organisms.

      Strengths:

      The study covers a wide range of taxonomic groups, both in prokaryotes and eukaryotes.

      Weaknesses:

      The study is weak both methodologically and conceptually. Current high throughput sequencing technologies, coupled with highly heterogeneous annotation methods, can observe cases of AS with great sensitivity, and one should be extremely cautious of the biases and rates of false positives associated with these methods. These issues are not addressed in the manuscript. Here, AS measures seem to be derived directly from CDS annotations downloaded from public databases, and do not account for differing annotation methods or RNA sequencing depth and tissue sample diversity.

      We are aware of the bias that may exist in annotation files. Since the source of noise can be highly variable, we have assumed that most of the data has a similar bias. However, we agree with the reviewer that we could perform some analysis to test for these biases and their association with different methodologies. Thus, we will measure the uncertainty present in the data. On the one hand, we will be more explicit about the limitations of the data and the biases they can generate in the results. On the other hand, while analyzing the false positives in the data is beyond our scope, we will perform a statistical test to detect possible biases related to different sequencing and annotation methods and to types of organisms (model or non-model). If biases are detected, we will, as far as possible, normalize the data or estimate a confidence interval.

      Here, AS measures seem to be derived directly from CDS annotations downloaded from public databases, and do not account for differing annotation methods or RNA sequencing depth and tissue sample diversity.

      Beyond taking into account the differential bias that may exist in the data, we do not consider our AS measure to be problematic. The NCBI database is one of the most reliable databases available to date and is continuously updated by the entire scientific community. So, the use of this data and the corresponding procedures for deriving the AS measure are perfectly acceptable for a comparative analysis on such a huge global scale. Furthermore, the proposal of a new genome-level measure of AS that allows comparison of species spanning the whole tree of life is part of the novelty of the study. We understand that small-scale studies require high specificity about the molecular processes involved. However, that is not the case here, as we are dealing with a large-scale problem. On the other hand, as we have previously mentioned, we agree with the reviewer that we should analyze the degree of uncertainty in the data to better interpret the results.

      There is no mention of the possibility that AS could be largely caused by random splicing errors, a possibility that could very well fit with the manuscript's data. Instead, the authors adopt early on the view that AS is regulated and functional, generally citing outdated literature.

      There is no question that some AS events are functional, as evidenced by strongly supported studies. However, whether all AS events are functional is questionable, and the relative fractions of functional and non-functional AS are unknown. With this in mind, the authors should be more cautious in interpreting their data.

      Many studies suggest that most of the AS events observed are the result of splicing errors and are therefore neither functional nor conserved. However, we still have limited knowledge about the functionality of AS. Just because we don’t have a complete understanding of its functionality doesn’t mean there isn’t a fundamental cause behind these events. AS is a highly dynamic process that can be associated with stochastic processes that are fundamental for phenotypic diversity and innovation. This is one of the reasons why we do not enter into a discussion of the functionality of AS and instead consider it as a potential measure of biological innovation. Nevertheless, we agree with the reviewer’s comments, so we will add a discussion of this issue with updated literature and check for any possible misinterpretation of the results.

      The "complexity" of organisms also correlates well (negatively) with effective population size. The power of selection to eliminate (slightly) deleterious mutations or errors decreases with effective population size. The correlation observed by the authors could thus easily be explained by a non-adaptive interpretation based on simple population genetics principles.

      We appreciate the reviewer's observation. We are well aware of M. Lynch’s theory on the role of the effective population size and its eventual correlation with genomic parameters, but we want to emphasize that our objective is not to find an adaptive or non-adaptive explanation of the evolution of AS, but rather to reveal it. Nevertheless, as the reviewer suggests, we will look at the correlation between AS and the effective population size and discuss a possible non-adaptive interpretation.

      The manuscript contains evidence that the authors might benefit from adopting a more modern view of how evolution proceeds. Sentences such as "... suggests that only sophisticated organisms optimize alternative splicing by increasing..." (L113), or "especially in highly evolved groups such as mammals" (L130), or the repeated use of "higher" and "lower" organisms need revising.

      As the reviewer suggests, we will proceed with the corresponding linguistic corrections.

      Because of the lack of controls mentioned above, and because of the absence of discussion regarding an alternative non-adaptive interpretation, the analyses presented in the manuscript are of very limited use to other researchers in the field. In conclusion, the study does not present solid conclusions.

      Reviewer #2 (Public Review):

      Summary:

      In this contribution, the authors investigate the degree of alternative splicing across the evolutionary tree and identify a trend of increasing alternative splicing as you move from the base of the tree (here, only prokaryotes are considered) towards the tips of the tree. In particular, the authors investigate how the degree of alternative splicing (roughly speaking, the number of different proteins made from a single ORF (open reading frame) via alternative splicing) relates to three genomic variables: the genome size, the gene content (meaning the fraction of the genome composed of ORFs), and finally, the coding percentage of ORFs, meaning the ratio between exons and total DNA in the ORF. When correlating the degree of alternative splicing with these three variables, they find that the different taxonomic groups have a different correlation coefficient, and identify a "progressive pattern" among metazoan groups, namely that the correlation coefficient mostly increases when moving from flowering plants to arthropods, fish, birds, and finally mammals. They conclude that therefore the amount of splicing that is performed by an organismal group could be used as a measure of its complexity.

      Weaknesses:

      While I find the analysis of alternative splicing interesting, I also find that it is a very imperfect measure of organismal complexity and that the manuscript as a whole is filled with unsupported statements. First, I think it is clear to anyone studying evolution over the tree of life that it is the complexity of gene regulation that is at the origin of much of organismal structural and behavioral complexity. Arguably, creating different isoforms out of a single ORF is just one example of complex gene regulation. However, the complexity of gene regulation is barely mentioned by the authors.

      We disagree with the reviewer that our measure of AS is imperfect. As in our response to the first reviewer, we will quantify the uncertainty in the data and correct for differential biases caused by annotation and sequencing methods. Thus, beyond correcting relevant biases in the data, we consider that our measure is adequate for a comparative analysis at a global scale. A novelty of our study is the proposal of a genome-level measure of AS that takes into account data from the entire scientific community.

      We also want to emphasize that we assume from the beginning that AS may reflect some kind of biological complexity; it is not a conclusion drawn from the results. An argument in favor of such an assumption is that AS is associated with stochastic processes that are fundamental for phenotypic diversity and innovation. Of course, we agree with the reviewer that it is not the only mechanism behind biological complexity, so we will emphasize this in the manuscript. We will also be more explicit about the assumptions and objectives, and will correct any unsupported statements.

      Further, it is clear that none of their correlation coefficients actually show a simple trend (see Table 3). According to these coefficients, birds are more complex than mammals for 3 out of 4 measures.

      An evolutionary trend is broadly defined as the gradual change in some characteristic of organisms as they evolve or adapt to a specific environment. In our context, we define an evolutionary trend as the gradual change in genome composition and its association with AS across the main taxonomic groups. If we look at Figure 4 and Table 3, we can conclude that there is a progressive trend. We will be more precise about how we define an evolutionary trend and correct any possible misinterpretation of the results. Moreover, we do not assume that mammals should be more complex than birds. First, we will emphasize that our results show that birds have the highest values of such a trend. Second, after reading the reviewer’s comments, we have decided to perform an additional analysis to correct for differences in the taxonomic group sizes, which will allow us to have more confidence in the results.

      It is also not clear why the correlation coefficient between alternative splicing ratio and genome length, gene content, and coding percentage should display such a trend, rather than the absolute value. There are only vague mechanistic arguments.

      The study analyzes the relationship of AS with genomic composition for the large taxonomic groups. We assume that significant differences in these relationships are indicators of the presence of different mechanisms of genome evolution. However, we agree with the reviewer that a correlation does not imply a causal relation, so we will be more cautious when interpreting the results.

      To quantify the relationships we use correlation coefficients, the slopes of such correlations, and the relation of variability. Although the absolute values of AS are also shown in Table 4, we consider them less informative on their own than when we include how AS relates to genomic composition. For example, we observe that plants have a different genome composition and a different relation with AS compared to animals, which suggests that they follow different mechanisms of genome evolution. On the other hand, we observe a trend in animals, where high values of AS are associated with a large percentage of introns and a percentage of intergenic DNA of about 50% of the genome.

      Much more troubling, however, is the statement that the data supports "lineage-specific trends" (lines 299-300). Either this is just an ambiguous formulation, or the authors claim that you can see trends *within* lineages.

      We agree with the reviewer that this statement is not correct, so we will proceed to correct it.

      The latter is clearly not the case. In fact, within each lineage, there is a tremendous amount of variation, to such an extent that many of the coefficients given in Table 3 are close to meaningless. Note that no error bars or p-values are presented for the values shown in Table 3. Figure 2 shows the actual correlation, and the coefficient for flowering plants there is given as 0.151, with a p-value of 0.193. Table 3 seems to quote r=0.174 instead. It should be clear that a correlation within a lineage or species is not a sign of a trend.

      The reviewer has not correctly understood the results in Table 3. It is precisely the variation of the genome variables that we are measuring. Having standardized these values by the group means, we compared the variability between groups, which is the result shown in Table 3. In this case, there are no associated error bars or p-values. On the other hand, we agree that a correlation is not a sign of a trend. However, the relations of variability, together with the results obtained in Figure 3, are indicators of a trend. As we mentioned before, we will analyze whether the variation in the group sizes is causing a bias in the results.
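
      As a hedged illustration of the point under dispute (not the authors' analysis), the uncertainty of a single correlation coefficient can be sketched with the Fisher z-transform. The value r = 0.151 is the one quoted above for flowering plants; the group size n = 75 and the function name `pearson_ci` are assumptions made only for this example.

```python
import math
from statistics import NormalDist


def pearson_ci(r, n, level=0.95):
    """Approximate confidence interval for a Pearson r (Fisher z-transform)."""
    if n <= 3:
        raise ValueError("the Fisher approximation needs n > 3")
    z = math.atanh(r)                       # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)             # approximate standard error of z
    zcrit = NormalDist().inv_cdf(0.5 + level / 2.0)
    return math.tanh(z - zcrit * se), math.tanh(z + zcrit * se)


# r is quoted in the review; n = 75 is a hypothetical group size.
low, high = pearson_ci(0.151, 75)
print(f"r = 0.151, 95% CI = [{low:.3f}, {high:.3f}]")
```

      Under these assumed numbers the interval includes zero, which is the kind of information an error bar on each coefficient would convey.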

      There are several wrong or unsupported statements in the manuscript. Early on, the authors state that the alternative splicing ratio (a number greater or equal to one that can be roughly understood as the number of different isoforms per ORF) "quantifies the number of different isoforms that can be transcribed using the same amount of information" (lines 51-52). But in many cases, this is incorrect, because the same sequence can represent different amounts of information depending on the context. So, if a changed context gives rise to a different alternative splice, it is because the genetic sequence has a different meaning in the changed context: the information has changed.

      We agree that some statements are not well supported, so we will revise them.

      In line 149, the authors state that "the energetic cost of having large genomes is high". No citation is given, and while such a statement seems logical, it does not have very solid support.

      We will also revise the bibliography and support our statements with updated references.

      If there was indeed a strong selective force to reduce genome size, we would not see the stunning diversity of genome sizes even within lineages. This statement is repeated (without support) several times in the manuscript, apparently in support of the idea that mammals had "no choice" to increase complexity via alternative splicing because they can't increase it by having longer genomes. I don't think this reasoning can be supported.

      We agree with the reviewer on this issue, so we will carefully revise the statements that indirectly (or directly) assume the action of selective forces on genome composition.

      Even more problematic is the statement that "the amount of protein-coding DNA seems to be limited to a size of about 10MB" (line 219). There is no evidence whatsoever for this statement.

      In Figure 1A we observe a one-to-one relationship between genome size and the amount of coding DNA. However, in multicellular organisms, although genome size increases, we observe that the amount of coding DNA does not increase by more than 10Mb, which suggests the presence of some genomic limitation. Of course, this is not an absolute or general statement, but rather a suggestion. We are only describing our results.

      The reference that is cited (Choi et al 2020) suggests that there is a maximum of 150GB in total genome size due to physiological constraints. In lines 257-258, the authors write that "plants are less restricted in terms of storing DNA sequences compared to animals" (without providing evidence or a citation).

      We will revise the bibliography and add updated references.

      I believe this statement is made due to the observation that plants tend to have large intergenic regions. But without examining the functionality of these intergenic regions (they might host long non-coding RNA stretches that are used to regulate the expression of other genes, for example) it is quite adventurous to use such a simple measure as being evidence that plants "are less restricted in terms of storing DNA sequences", whatever that even means. I do not think the authors mean that plants have better access to -80 freezers. The authors conclude that "plant's primary mechanism of genome evolution is by expanding their genome". This statement itself is empty: we know that plants are prone to whole genome duplication, but this duplication is not, as far as we understand, contributing to complexity. It is not a "primary mechanism of genome evolution".

      We will revise these statements.

      In lines 293-294, the authors claim that "alternative splicing is maximized in mammalian genomes". There is no evidence that this ratio cannot be increased. So, to conclude (on lines 302-303) that alternative splicing ratios are "a potential candidate to quantify organismal complexity" seems, based on this evidence, both far-fetched and weak at the same time.

      Our results show the highest values of AS in mammals, but we understand that the results are limited by the availability and accuracy of the data, which we will emphasize in the manuscript. As we previously mentioned, we will also analyze the uncertainty in the data and carry out the appropriate corrections.

      I am also not very comfortable with the data analysis. The authors, for example, say that they have eliminated from their analysis a number of "outlier species". They mention one: Emmer wheat because it has a genome size of 900 Mb (line 367). Since 900MB does not appear to be extreme, perhaps the authors meant to write 900 Gb. When I consulted the paper that sequenced Triticum dicoccoides, they noted that 14 chromosomes are about 10GB. Even a tetraploid species would then not be near 900Gb. But more importantly, such a study needs to state precisely which species were left out, and what the criteria are for leaving out data, lest they be accused of selecting data to fit their hypothesis.

      The reviewer is right; we wanted to say 900Mb, which is approximately 7.2Gb. We made a nomenclature error. This value is extreme compared to the typical values, so it generates large deviations when applying measures of central tendency and dispersion. We want to obtain mean values that are representative of most of the species composing the taxonomic groups, so we find it appropriate to exclude all outlier values from the study. Nevertheless, we will rigorously specify the criteria that we used to select the data.
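
      To make concrete what an explicitly stated exclusion criterion could look like, the sketch below applies Tukey fences to log10-transformed genome sizes. This is only one possible rule and is not claimed to be the criterion the authors used; the species names, the genome sizes, and the function name `tukey_outliers` are hypothetical.

```python
import math
from statistics import quantiles


def tukey_outliers(sizes_mb, k=1.5):
    """Return names whose log10 genome size lies outside the Tukey fences."""
    logs = {name: math.log10(size) for name, size in sizes_mb.items()}
    q1, _, q3 = quantiles(logs.values(), n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return sorted(name for name, v in logs.items() if v < lower or v > upper)


# Hypothetical genome sizes in megabases, for illustration only.
example = {
    "sp_a": 120, "sp_b": 135, "sp_c": 150, "sp_d": 160,
    "sp_e": 180, "sp_f": 200, "sp_g": 10000,
}
flagged = tukey_outliers(example)
print(flagged)
```

      Reporting the rule, its parameter `k`, and the resulting list of excluded species would address the reviewer's request that the selection be reproducible.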

      I understand that Methods are often put at the end of a manuscript, but the measures discussed here are so fundamental to the analysis that a brief description of what the different measures are (in particular, the "alternative splicing ratio") should be in the main text, even when the mathematical definition can remain in the Methods.

      We agree with the reviewer, so we will add a brief description of the genomic variables at the beginning of the Results section.

      Finally, a few words on presentation. I understand that the following comments might read differently after the authors change their presentation. This manuscript was at the border of being comprehensible. In many cases, I could discern the meaning of words and sentences in contexts but sometimes even that failed (as an example above, about "species-specific trends", illustrates). The authors introduced jargon that does not have any meaning in the English language, and they do this over and over again.

      Note that I completely agree with all the comments by the other reviewer, who alerted me to problems I did not catch, including the possible correlation with effective population size: a possible non-adaptive explanation for the results.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      Specific comments to improve the quality of the work:

      (1) The choice of subunits to tag is really not ideal. In the available structures of the human proteasome, the C-terminus of Rpn3/PSMD3 points directly toward the ATPase pore and is likely to disrupt the structure and/or dynamics of the proteasome during proteolysis (see comments regarding controls for functionality below). Similarly, the C-terminal tail of Rpt1/PSMC2 has a key role in the opening of the 20S core particle gate for substrate translocation and processing (see 2018 Nature Communications, 9:1360 and 2018 Cell Reports 24:1301-1315), and Alpha3/PSMA4 can be substituted by a second copy of Alpha4/PSMA7 in some conditions (although tagging Alpha3/PSMA4 would admittedly provide a picture of the canonical proteasome interactome while actively excluding the interactome of the non-canonical proteasomes that form via replacement of Alpha3/PSMA4). Comparison of these cell lines with lines harboring tags on subunits that are commonly used for tagging in the field because of a lack of impacts, such as the N-terminus of Rpn1/PSMD2, the C-terminus of Rpn11/PSMD14, and the C-terminus of Beta4/PSMB2 would help instill confidence that the interactome reported largely arises from mature, functional proteasomes rather than subcomplexes, defective proteasomes, or other species that may occur due to tagging at these positions.

      We thank the reviewer for pointing this out. The original purpose of our strategy was to establish proximity labeling of proteasomes to enable applications both in cell culture and in vivo. The choice of PSMA4 and PSMC2 was dictated by previous successful tagging with GFP in mammalian cells (Salomons et al., Exp Cell Res 2010; Bingol and Schuman, Nature 2006). However, the choice of the C-terminus of PSMC2 might not have been optimal. HEK293 cells overexpressing PSMC2-BirA show slower growth, and the BioID data retrieve a higher enrichment of assembly factors, suggesting slower assembly of this fusion protein into the proteasome. However, we did not observe a negative impact on overall proteasome activity, and PSMC2-BirA was (at least in part) incorporated into fully assembled proteasomes, as indicated by the enrichment of 20S proteins. We apologize for not making it clear that we labeled the N-terminus of PSMD3/Rpn3 and not the C-terminus (Figure 1a and S1a). Therefore, we included in Figure S1a of the revised manuscript structures of the proteasome where the tagged subunit termini are highlighted: C-terminus for PSMA4 and PSMC2 and N-terminus for PSMD3. Additionally, we would like to point out that, unlike PSMC2-BirA, cells expressing BirA-PSMD3 did not show slower growth, and BioID data showed a more homogeneous enrichment of both 19S and 20S proteins, as compared to PSMC2-BirA (Figure 1D and 1E). However, the overall level of enrichment of proteasome subunits was not comparable to PSMA4-BirA and, therefore, we opted to focus the rest of the manuscript on this construct.

      In support of this point, the data provided in Figure 1E, which show the change in the abundance of each proteasome subunit in the tagged line vs. the BirA control line, demonstrate substantial enrichment of the subcomplexes of the proteasome that are tagged in each case; this effect may represent the known feedback-mediated upregulation of new proteasome subunit synthesis that occurs when proteasomal proteolysis is impaired, or alternatively, the accumulation of subcomplexes containing the tagged subunit that cannot readily incorporate into mature proteasomes. Acknowledging this limitation in the text would be valuable to readers who are less familiar with the proteasome.

      We would like to clarify that the data shown in Figure 1E do not represent whole proteome data, but rather log2 fold changes vs. BirA* control calculated on streptavidin enrichment samples. The differences in the enrichment of the various subcomplexes between cell lines derive from the fact that the effect size of the enrichment depends both on protein abundance in the isolated complexes and on the efficiency of biotinylation. The latter will be higher for proteins located in closer proximity to the bait. A similar observation was made in a recent publication (PMID:36410438) that compared BioID and Co-IP for the same bait. When a component of the nuclear pore complex (Nup158) was analyzed by BioID, only the more proximal proteins were enriched, as compared to the whole complex in Co-IP data (Author response image 1):

      Author response image 1.

      Proteins identified in the NUP158 BioID or pulldown experiments are filled in red or light red for significance intervals A or B, respectively. The bait protein NUP158 is filled in yellow. Proteins enriched in the pulldown falling outside the SigA/B cutoff are filled in gray. NPC, nuclear pore complex. SigA, significant class A; SigB, significant class B. Reproduced from Figure 6 of PMID: 36410438.

      However, we would like to point out that despite quantitative differences between different proteasome subunits, both 19S and 20S proteins were found to be strongly enriched (typically >2 fold) in all the constructs compared to BirA* control line (Figure 1E). This indicates that at least a fraction of all the tagged subunits are incorporated into fully assembled proteasomes.
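
      For readers less familiar with the quantity being discussed, a log2 fold change of a bait line over a control line can be sketched as below. This is a minimal illustration and not the authors' pipeline; the intensity values, the pseudocount, and the function name `log2_fold_change` are assumptions. A value above 1 corresponds to the ">2 fold" enrichment mentioned above.

```python
import math


def log2_fold_change(bait, control, pseudocount=1.0):
    """log2 ratio of mean bait intensity over mean control intensity.

    The pseudocount (an assumption of this sketch) avoids division by
    zero when a protein is absent from the control enrichment.
    """
    mean_bait = sum(bait) / len(bait)
    mean_control = sum(control) / len(control)
    return math.log2((mean_bait + pseudocount) / (mean_control + pseudocount))


# Hypothetical streptavidin-enrichment intensities for one protein.
bait_replicates = [4200.0, 3900.0, 4100.0]
control_replicates = [500.0, 450.0, 520.0]
lfc = log2_fold_change(bait_replicates, control_replicates)
print(f"log2 fold change = {lfc:.2f}, enriched more than 2-fold: {lfc > 1}")
```

      Because the bait intensity reflects both protein abundance and biotinylation efficiency, two subunits present at equal stoichiometry can still yield different values if one sits closer to the tagged subunit.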

      Regarding the upregulation of proteasome subunits as a consequence of proteasome dysfunction, we did not find evidence of this, at least in the case of PSMA4. The immunoblot shown in Figure 2A and its quantification in S3A indicate no increased abundance of endogenous PSMA4 upon tetracycline induction of PSMA4-BirA*.

      (2) The use of myc as a substrate of the proteasome for demonstration that proteolysis is unaffected is perhaps not ideal. Myc is known to be degraded via both ubiquitin-dependent and ubiquitin-independent mechanisms, such that disruption of one means of degradation (e.g., ubiquitin-dependent degradation) via a given tag could potentially be compensated by another. A good example of this is that the C-terminal tagging of PSMC2/Rpt1 is likely to disrupt interaction between the core particle and the regulatory particle (as suggested in Fig. 1D); this may free up the core particle for ubiquitin-independent degradation of myc.

      Aside from using specific reporters for ubiquitin-dependent vs. independent degradation or a larger panel of known substrates, analysis of the abundance of K48-ubiquitinated proteins in the control vs. tag lines would provide additional evidence as to whether or not proteolysis is generally perturbed in the tag lines.

      We thank the reviewer for this suggestion. We have included an immunoblot analysis showing that the levels of K48 ubiquitylation (Figure S3d) are not affected by the expression of tagged PSMA4.

      (3) On pg. 8 near the bottom, the authors accidentally refer to ARMC6 as ARMC1 in one instance.

      We have corrected the mistake.

      (4) On pg. 10, the authors explain that they analyzed the interactome for all major mouse organs except the brain; although they explain in the discussion section why the brain was excluded, pg. 10 might be a better place for this explanation than the discussion.

      We moved the explanation from the Discussion to the Results section.

      Reviewer #2 (Recommendations For The Authors):

      (1) Perhaps the authors can quantify the fraction of unassembled PSMA4-BirA* from the SEC experiment (Fig. 2b) to give the readers a feeling for how large a problem this could be.

      The percentages based on Area Under the Curve calculations have been added to Figure S3b.
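
      The kind of calculation described here can be sketched with the trapezoidal rule. The elution profile, the window boundaries separating assembled from unassembled species, and the function names below are hypothetical; the authors' actual values are in their Figure S3b.

```python
def trapezoid_area(x, y):
    """Area under a sampled curve by the trapezoidal rule."""
    return sum(
        (x[i + 1] - x[i]) * (y[i] + y[i + 1]) / 2.0 for i in range(len(x) - 1)
    )


def percent_in_window(x, y, start, end):
    """Percentage of the total area contributed by samples with start <= x <= end."""
    window = [(xi, yi) for xi, yi in zip(x, y) if start <= xi <= end]
    wx = [p[0] for p in window]
    wy = [p[1] for p in window]
    return 100.0 * trapezoid_area(wx, wy) / trapezoid_area(x, y)


# Hypothetical SEC profile: elution volume vs. signal of the tagged subunit.
volume = [0.0, 1.0, 2.0, 3.0, 4.0]
signal = [0.0, 2.0, 0.0, 1.0, 0.0]
assembled_pct = percent_in_window(volume, signal, 0.0, 2.0)    # early-eluting peak
unassembled_pct = percent_in_window(volume, signal, 2.0, 4.0)  # late-eluting peak
print(f"assembled: {assembled_pct:.1f}%, unassembled: {unassembled_pct:.1f}%")
```

      In this sketch the two windows share a boundary point where the signal is zero, so the two percentages sum to 100; overlapping real peaks would need a deconvolution step that is not shown.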

      (2) Do the authors observe any difference in the enrichment scores between proteins that are known to interact with the proteasome vs proteins that the authors can justify as "interactors of interactors" vs the completely new potential interactors? This could be an interesting way to show that the potential new interactors are not simply because of poor false positive rate calibration, but that they behave in the same way as the other populations.

      We thank the reviewer for this suggestion. We analyzed the enrichment scores for 20S proteasome subunits, known PIPs, first neighbors and the remaining enriched proteins. The remaining proteins (potential new interactors) have very similar scores as the first neighbors of known interactors. This plot has been added to Figure S3g.

      (3) Did the authors try to train a logistic model for the miniTurbo experiments, like it was done for the BirA* experiments? Perhaps combining the results of both experiments would yield higher confidence on the proteasome interactors.

      Following the reviewer's suggestion, we applied the classifier to the dataset comparing miniTurbo and PSMA-miniTurbo. We found a clear separation between the FPR and the TPR with 136 protein groups enriched in PSMA-miniTurbo. We have added the classifier and corresponding ROC curve to Figure S4f and S4g.

      75 protein groups were found to be enriched for both PSMA4-BirA* and PSMA4-miniTurbo (Author response image 2), including the proteasome core particles, regulatory particles, known interactors and potential new interactors. As we focused more on the identification of substrates with PSMA4-miniTurbo, we did not pursue these overlapping protein groups further, but rather used the comparison to the mouse model to identify potential new interactors.

      Author response image 2.

      Overlap between ProteasomeID enriched proteins (fpr<0.05) between PSMA4-BirA* and PSMA4-miniTurbo.

      (4) Perhaps this is already known, but did the authors check if MG132 affect proteasome assembly? The authors could for example repeat their SEC experiments in the presence of MG132.

      We thank the reviewer for the suggestion; however, to our knowledge, there are no reports that MG132 has an effect on the assembly of the proteasome. MG132 is one of the most widely used proteasome inhibitors in basic research and as such has been extensively characterized over the last three decades. The small peptide aldehyde acts as a substrate analogue and binds directly to the active site of the protease PSMB5/β5. We therefore think it is unlikely that MG132 interferes with the assembly of the proteasome.

      (5) Minor comment: at the bottom of page 8, the authors probably mean ARMC6 and not ARMC1.

      We have corrected the mistake.

      (6) It would be interesting to expand the analysis of the already acquired in vivo data to try to identify tissue-specific proteasome interactors. Can the authors draw a four-way Venn diagram with the interactors of each tissue?

      We thank the reviewer for this suggestion. We have generated an UpSet plot showing the overlap of ProteasomeID enriched proteins in the four tissues that gave us meaningful results (Author response image 3). In order to investigate whether the observed differences in ProteasomeID enriched proteins could be meaningful in terms of proteasome biology, we have highlighted proteins belonging to the UPS that show tissue specific enrichments. We found proteasome activators such as PSME1/PA28alpha and PSME2/PA28beta to enrich preferentially in kidney and liver, respectively, as well as multiple deubiquitinases to enrich preferentially in the heart. These differences might be related to the specific cellular composition of the different tissues, e.g., number of immune cells present, or the tissue-specific interaction of proteasomes with enzymes involved in the ubiquitin cycle. Given the rather preliminary nature of these findings, we have opted for not including this figure in the main manuscript, but rather include it only in this rebuttal letter.

      Author response image 3.

      UpSet plot showing overlap between ProteasomeID enriched proteins in different mouse organs.

      Reviewer #3 (Recommendations For The Authors):

      (1) In the first paragraph of the Introduction, the authors link cellular senescence caused by partial proteasome inhibition with the efficacy of proteasome inhibitors in cancer therapy. Although this is an interesting hypothesis, I am not aware of any direct evidence for this; rather, I believe the efficacy of bortezomib/carfilzomib in haematological malignancies is most commonly attributed to these cells having adapted to high levels of proteotoxic stress (e.g., chronic unfolded protein response activation). I would suggest rephrasing this sentence.

      We thank the reviewer for the comment and have amended the introduction.

      (2) For the initial validation experiments (e.g., Fig. 1B), have the authors checked what level of Streptavidin signal is obtained with "+ bio, - tet" ? Although I accept that the induction of PSMA4-BirA* upon doxycycline addition is clear from the anti-Flag blots, it would still be informative to ascertain what level of background labelling is obtained without induction (but in the presence of exogenous biotin).

      We tested four different conditions +/- tet and +/- biotin (24h) in PSMA4-BirA* cell lines (Author response image 4). As expected, biotinylation was most pronounced when tet and biotin were added. When biotin was omitted, streptavidin signal was the lowest regardless of the addition of tet. Compared to the -biotin conditions, a slight increase of streptavidin signal could be observed when biotin was added but tet was not added. This could be either due to the promoter leaking (PMID: 12869186) or traces of tetracycline in the FBS we used, as we did not specifically use tet-free FBS for our experiments.

      Author response image 4.

      Streptavidin-HRP immunoblot following induction of BirA fusion proteins with tetracycline (+tet) and supplementation of biotin (+bio). For the sample used as expression control tetracycline was omitted (-tet). To test background biotinylation, biotin supplementation was omitted (-bio). Immunoblot against BirA and PSMA was used to verify induction of fusion proteins, while GAPDH was used as loading control.

      (3) For the proteasome structure models in Fig. 1D, a scale bar would be useful to inform the reader of the expected 10 nm labelling radius (as the authors have done later, in Fig. 2D).

      We have added 10 nm scale bars to Figure 1d.

      (4) In the "Identification of proteasome substrates by ProteasomeID" Results subsection, I believe there is a typo where the authors refer to ARMC1 instead of ARMC6.

      We have corrected the mistake.

      (5) I think Fig. S5 was one of the most compelling in the manuscript. Given the interest in confirming on-target efficacy of targeted degradation modalities, as well as identifying potential off-target effects early-on in development, I would consider promoting this out of the supplement.

      We thank the reviewer for the comment and share the excitement about using ProteasomeID for targeted degradation screening. We have moved the data on PROTACs (Figure S5) into a new main Figure 5.

      In addition, in relation to the comment of this reviewer regarding the detection of endogenous substrates, we have now included validation for one more hit emerging from our analysis (TIGD5) and included the results in Figure 4f, 4g and S4j.

    2. eLife assessment

      This study presents an important method and resource in cell lines and in mice for mass spectrometry-based identification of interactors of the proteasome, a multi-protein complex with a central role in protein turnover in almost all tissues and cell types. The method presented, including the experimental workflow and analysis pipeline, as well as the several lines of validation provided throughout, is convincing. Given the growing interest in protein aggregation and targeted protein degradation modalities, this work will be of interest to a broad spectrum of basic cell biologists and translational researchers.

    3. Reviewer #2 (Public Review):

      Summary

      In this work, Bartolome and colleagues develop a new approach to identify proteasome interacting proteins and substrates. The approach is based on fusing proteasome subunits with a biotin ligase that will label proteins that come in close physical distance of the ligase. These biotin-labeled proteins (or their resulting tryptic peptides) can be affinity purified using streptavidin and identified by mass spectrometry.

      This elegant solution was able to identify a large proportion of known proteasome interactors, as well as multiple potential new interactors. Combining this approach with a proteasome inhibitor allowed also for the enrichment of substrates, due to increased contact time between substrates and the proteasome. Again, the authors were able to identify novel substrates. Finally, the authors implemented this strategy in vivo, providing the hints for potential tissue-specific proteasome interactors.

      This novel strategy provides an additional approach to identify new proteasome substrates, which can be particularly powerful for low abundant proteins, e.g., transcription factors. The possibility to implement it in vivo in specific cell types opens the possibility for identifying proteasome interactors in small cell subpopulations or in subpopulations involved in disease.

      Strengths

      The authors carefully characterized their genetically engineered proteasome-biotin ligase fusions to ensure that proteasome structure and activity was not altered. This is key to ensure that the proteins identified to interact with the proteasome reflect interactions that occur under physiological conditions.

      The authors implemented an algorithm that controls the false positive rate of the identified interactors of the proteasome. This is an important aspect to avoid spending time on the characterization of potential interactors that are just an artifact of the experimental setup.

      The addition of a proteasome inhibitor allowed the authors to identify substrates of the proteasome. Although there are other strategies to do this (e.g., affinity purification of Gly-Gly modified peptides, which is a marker for ubiquitination), this additional approach can highlight currently unknown substrates. One example is low-abundance proteins, such as transcription factors.

      The overall strategy developed by the authors can be implemented in vivo, which opens up the possibility of determining cell type-specific proteasome interactors (and perhaps substrates).

      Weaknesses

      There is a proportion (approximately 38%) of the PSMA4-biotin ligase fusion that remains unassembled (i.e., not part of the functional proteasome) and that can contribute to a small proportion of false positive interactions.

    4. Reviewer #3 (Public Review):

      Summary:

      Bartolome et al. present ProteasomeID, a novel method to identify components, interactors, and (potentially) substrates of the proteasome in cell lines and mouse models. As a major protein degradation machine that is highly conserved across eukaryotes, the proteasome has historically been assumed to be relatively homogeneous across biological scales (with few notable exceptions, e.g., immunoproteasomes and thymoproteasomes). However, a growing body of evidence suggests that there is some degree of heterogeneity in the composition of proteasomes across cell tissues, and that this composition can be highly dynamic in response to physiologic and pathologic stimuli. This work provides a methodological framework for investigating such sources of variation. The authors start by adapting the increasingly popular biotin ligation strategy for labelling proteins coming into close proximity with one of three different subunits of the proteasome, before proceeding with PSMA4 for further development and analysis based on their preliminary labelling data. In a series of well-constructed and convincing validation experiments, the authors go on to show that the tagged PSMA4 construct can be incorporated into functional proteasomes, and is able to label a broad set of known proteasome components and interacting proteins in HEK293T cells. They also attempt to identify novel proteasomal degradation substrates with ProteasomeID; while this was convincing for known substrates with particularly short half-lives, the results for substrates with longer half-lives were less clear. One of the most compelling results was from a similar experiment to confirm proteasomal degradation induced by a BRD-targeting PROTAC, which I think is likely to be of keen interest to the targeted degradation community. Finally, the authors establish a ProteasomeID mouse model, and demonstrate its utility across several tissues.

      Strengths:

      (1) ProteasomeID itself is an important step forward for researchers with an interest in protein turnover across biological scales (e.g., in sub-cellular compartments, in cells, in tissues, and whole organisms). I especially see interest from two communities: those studying fundamental proteostasis in physiological and pathologic processes (e.g., ageing; tissue-specific protein aggregation diseases), and those developing targeted protein degradation modalities (e.g., PROTACs; molecular glues). All the datasets generated and deposited here are likely to provide a rich resource to both. The HEK293T cell line data are a valuable proof-of-concept to allow expansion into more biologically-relevant cell culture settings; however, I envision the greatest innovation here to be the mouse model. For example, in the targeted protein degradation space, two major hurdles in early-stage pre-clinical development are (i) evaluation of degradation efficacy across disease-relevant tissues, and (ii) toxicity and safety implications caused by off-target degradation, e.g., of newly-identified molecular glues and/or in particularly-sensitive tissues. The ProteasomeID mouse allows early in vivo assessment of both these questions. The results of the BRD PROTAC experiment in 293T cells provide an excellent in vitro proof-of-concept for this approach.

      (2) The mass-spectrometry-based proteomics workflows used and presented throughout the manuscript are robust, rigorous, and convincing. For example, the algorithm the authors use for defining enrichment score cut-offs is logical and based on rational models, rather than on arbitrary cut-offs that are common for similar proteomics studies. The construction (and subsequent validation) of both BirA*- and miniTurbo-tagged PSMA4 variants also increases the utility of the method, allowing researchers to choose the variant with the labelling time-scale required for their particular research question.

      (3) The optimised BioID and TurboID protocol the authors develop (summarised in Fig. S2A) and validate (Fig. S2B-D) is likely to be of broad interest to cell and molecular biologists beyond the protein degradation field, given that proximity labelling is a current gold-standard in global protein:protein interaction profiling.

      Limitations:

      I think the authors do an excellent job in highlighting the limitations of ProteasomeID throughout the Results and Discussion. I do have some specific comments that might provide additional context for the reader.

      (1) The authors do a good job in showing that a substantial proportion of PSMA4-BirA* is incorporated into functional proteasome particles; however, it is not immediately clear to me how much background (false-positive IDs) might be contributed by the ~40 % of PSMA4-BirA* that is not incorporated into the mature core particle (based on the BirA* SEC-MS traces in Fig. 2b and S3b, i.e., the large peak ~ fraction 20). Are there any bands lower down in the native gel shown in Fig. 2c, i.e., corresponding to lower molecular weight complexes or monomeric PSMA4-BirA*? The enrichment of proteasome assembly factors in all the ProteasomeID experiments might suggest the presence of assembly intermediates, which might themselves become substrates for proteasomal degradation (as has been shown for other incompletely-assembled protein complexes, e.g., the ribosome, TRiC/CCT).

      (2) Although the authors attempt to show that BirA* tagging of PSMA4 does not interfere with proteasome activity (Fig. 2e-f), I think the experimental evidence for this is incomplete. They show that the overall chymotrypsin-like activity (attributable to PSMB5) in cells expressing PSMA4-BirA* is not markedly reduced compared with control BirA*-expressing cells. However, they do not show that the activity of the specific proteasome sub-population that contains PSMA4-BirA* is unaffected (e.g., by purifying this sub-population via the Flag tag). The proteasome activity of the sub-population of wild-type proteasome complexes that do not contain the PSMA4-BirA* (~50%, based on the earlier immunoblots) could account for the entire chymotrypsin-like activity, especially in the context of HEK293T cells, where steady-state proteasome levels are unlikely to be limiting. It would also be useful to assess any changes in trypsin- and caspase-like activities, especially as tagging of PSMA4 could conceivably interfere with the activity of some PSMB subunits, but not others.

      (3) I was left slightly unsure as to the general utility of ProteasomeID for identifying novel proteasomal substrates in homeostatic conditions, especially for proteins with longer half-lives. The cycloheximide chases in Fig. 4g/S4j are clear for MYC and TIGD5 (which have short half-lives), but are not so clear for ARMC6 and BRAT1: the reductions in the bands are modest, and might have been clearer with longer "chase" time-points. Furthermore, classifying candidates based on enrichment following proteasome inhibition with MG-132 has the potential to lead to a high number of false positives. ProteasomeID's utility in identifying potential substrates in more targeted settings (e.g., molecular glues, off-target PROTAC substrates) is far more apparent.

    1. Author response:

      Overall recommendations.

      A brief summary of the main reviewers' recommendations that should be prioritized is listed below. Detailed recommendations as suggested by each individual reviewer are also included.

      -Better justification of the choice of the substitutions for the mutations should be added. In addition, authors should strongly consider adding more mutations to enable mechanistic tests of the proposed model for lipid conduction.

      We will characterize more mutations to the key residues at the TM4-TM6 interface. In addition to the TM4 lysine mutations shown in the original manuscript, we will include mutations to alanine and glutamate, and justify our choice of the substitutions in the revised manuscript. Furthermore, we will also test if introducing lysine mutations in TM6 will convert the ion channels into lipid scramblases. These additional experiments will greatly strengthen our conclusion.

      -Blockers to validate the concern that the recorded currents indeed arise from TMEM16A or OSCA/TMEM63 channels should be tested. Do the pore blockers also block scramblase activity in the gating mutants?

      TMEM16A and OSCA1.2 are readily expressed on the cell surface. OSCA1.2 also has a large conductance. This is why we can record large currents even with inside-out patches. We will include the TMEM16A inhibitor Ani9 and a non-specific inhibitor of OSCA channels for further validation. We have reported that Ani9 can inhibit a TMEM16A-derived lipid scramblase (L543K in TM4) in our previous publication (PMID: 31015464). We will test if Ani9 can also inhibit other TMEM16A scramblases reported in this study. We will also examine if Gd3+ is capable of blocking lipid scrambling of the OSCA1.2 gating mutations.

      -Include details of missing experimental conditions for scramblase activity.

      We will conduct a thorough revision to include detailed experimental conditions for scramblase activity measurement.

      -Additional mutants above and below the putative lysine gate as suggested by reviewer 3 to better assess the model.

      As we explained in Response #1, we will extend our mutations around the putative activation gate.

      -Concern about whether osmolarity changes are in fact activating OSCA and TMEM63, as raised by reviewers 1 and 3. This could be addressed by assessing scramblase activity and currents at different osmolarity levels.

      We will test the engineered OSCA1.2 scramblases in response to solutions with different osmolarity.

      Reviewer #1 (Public Review):

      Summary:

      TMEM16, OSCA/TMEM63, and TMC belong to a large superfamily of ion channels where TMEM16 members are calcium-activated lipid scramblases and chloride channels, whereas OSCA/TMEM63 and TMCs are mechanically activated ion channels. In the TMEM16 family, TMEM16F is a well-characterized calcium-activated lipid scramblase that plays an important role in processes like blood coagulation, cell death signaling, and phagocytosis. In a previous study, the group demonstrated that lysine mutation in TM4 of TMEM16A can enable the calcium-activated chloride channel to permeate phospholipids too. Based on this they hypothesize that the energy barrier for lipid scramblase in these ion channels is low, and that modification in the hydrophobic gate region by introducing a charged side chain between the TM4/6 interface in TMEM16 and OSCA/TMEM63 family can allow lipid scramblase. In this manuscript, using scramblase activity via Annexin V binding to phosphatidylserine, and electrophysiology, the authors demonstrate that lysine mutation in TM4 of TMEM16F and TMEM16A can cause constitutive lipid scramblase activity. The authors then go on to show that analogous mutations in OSCA1.2 and TMEM63A can lead to scramblase activity.

      Strengths:

      Overall, the authors introduce an interesting concept that this large superfamily can permeate ions and lipids.

      Weaknesses:

      The electrophysiology data does not entirely support their claims.

      We appreciate your positive comments. We will conduct more experiments including more electrophysiology characterizations as suggested.

      Reviewer #2 (Public Review):

      This concise and focused study by Lowry and colleagues identifies a motif in the pores of three families of channel/scramblase proteins that regulate exclusive ion permeation and lipid transport. These three ion channel families, which include the TMEM16s, the plant-expressed and stress-gated cation channel OSCA, and the mammalian homolog and mechanosensitive cation channel, TMEM63 share low sequence similarity between them and have seemingly differing functions, as anion (TMEM16s), or stress-activated cation channels (OSCA/TMEM63). The study finds that in all three families, mutating a single hydrophobic residue in the ion permeation pathway of the channels confers lipid transport through the pores of the channels, indicating that TMEM16 and the related OSCA and TMEM63 channels have a conserved potential for both ion and lipid permeation. The authors interpret the findings as revealing that these channel/scramblase proteins have a relatively low "energetic barrier for scramblase" activity. The experiments themselves seem to be done with a high level of rigor and the paper is well written. A weakness is the limited scope of the experiments which, if fixed, could open up a new line of inquiry.

      We appreciate the positive comments from the reviewer. We will conduct more experiments listed in our responses to the Overall Recommendations to improve the scope and quality of our study.

      Reviewer #3 (Public Review):

      This study was focused on the conserved mechanisms across the Transmembrane Channel/Scramblase superfamily, which includes members of the TMEM16, TMEM63/OSCA, and TMC families. The authors show that the introduction of lysine residues at the TM4-TM6 interface can disrupt gating and confer scramblase activity to non-scramblase proteins. Specifically, they show this to be true for conserved TM4 residues across TMEM16F, TMEM16A, OSCA1.2, and TMEM63A proteins. This breadth of data is a major strength of the paper and provides strong evidence for an underlying linked mechanism for ion conduction and phospholipid transport. Overall, the confocal imaging experiments, patch clamping experiments, and data analysis are performed well.

      However, there are several concerns regarding the scope of experiments supporting some claims in the paper. Although the authors propose that the TM4/TM6 interface is critical to ion conduction and phospholipid scramblase activity, in each case, there is very narrow evidence of support consisting of 1-3 lysine substitutions at specific residues on TM4. Given that the authors postulate that the introduction of a positive charge via the lysine side chain is essential to the constitutive activity of these proteins, additional mutation controls for side chain size (e.g. glutamine/methionine) or negative charge (e.g. glutamic acid), or a different positive charge (i.e. arginine) would have strengthened their argument. To more comprehensively understand the TM4/TM6 interface, mutations at locations one turn above and one turn below could be studied until there is no phenotype. In addition, the equivalent mutations on the TM6 side should be explored to rule out the effects of conformational changes that arise from mutating TM4 and to increase the strength of evidence for the importance of side-chain interactions at the TM6 interface. The experiments for OSCA1.2 osmolarity effects on gating and scramblase in Figure 4 could be improved by adding different levels of osmolarity in addition to time in the hypotonic solution.

      We appreciate the positive and constructive comments from the reviewer. As we outlined in our responses to the Overall Recommendations, we will include more mutations at the TM4 and TM6 interface to further strengthen our conclusion.

    2. Reviewer #3 (Public Review):

      This study was focused on the conserved mechanisms across the Transmembrane Channel/Scramblase superfamily, which includes members of the TMEM16, TMEM63/OSCA, and TMC families. The authors show that the introduction of lysine residues at the TM4-TM6 interface can disrupt gating and confer scramblase activity to non-scramblase proteins. Specifically, they show this to be true for conserved TM4 residues across TMEM16F, TMEM16A, OSCA1.2, and TMEM63A proteins. This breadth of data is a major strength of the paper and provides strong evidence for an underlying linked mechanism for ion conduction and phospholipid transport. Overall, the confocal imaging experiments, patch clamping experiments, and data analysis are performed well.

      However, there are several concerns regarding the scope of experiments supporting some claims in the paper. Although the authors propose that the TM4/TM6 interface is critical to ion conduction and phospholipid scramblase activity, in each case, there is very narrow evidence of support consisting of 1-3 lysine substitutions at specific residues on TM4. Given that the authors postulate that the introduction of a positive charge via the lysine side chain is essential to the constitutive activity of these proteins, additional mutation controls for side chain size (e.g. glutamine/methionine) or negative charge (e.g. glutamic acid), or a different positive charge (i.e. arginine) would have strengthened their argument. To more comprehensively understand the TM4/TM6 interface, mutations at locations one turn above and one turn below could be studied until there is no phenotype. In addition, the equivalent mutations on the TM6 side should be explored to rule out the effects of conformational changes that arise from mutating TM4 and to increase the strength of evidence for the importance of side-chain interactions at the TM6 interface. The experiments for OSCA1.2 osmolarity effects on gating and scramblase in Figure 4 could be improved by adding different levels of osmolarity in addition to time in the hypotonic solution.

    3. eLife assessment

      This manuscript finds evidence for a latent capability in several members of the TMEM16 and OSCA/TMEM63 family of ion channels for lipid scramblase activity. The authors demonstrate that the introduction of lysine mutations in evolutionarily conserved areas of TM4 can confer constitutive ion conduction and scramblase activity. Although the significance and scope of the work are important, the strength of the evidence is incomplete and could be improved.

    4. Reviewer #1 (Public Review):

      Summary:

      TMEM16, OSCA/TMEM63, and TMC belong to a large superfamily of ion channels where TMEM16 members are calcium-activated lipid scramblases and chloride channels, whereas OSCA/TMEM63 and TMCs are mechanically activated ion channels. In the TMEM16 family, TMEM16F is a well-characterized calcium-activated lipid scramblase that plays an important role in processes like blood coagulation, cell death signaling, and phagocytosis. In a previous study, the group demonstrated that lysine mutation in TM4 of TMEM16A can enable the calcium-activated chloride channel to permeate phospholipids too. Based on this they hypothesize that the energy barrier for lipid scramblase in these ion channels is low, and that modification in the hydrophobic gate region by introducing a charged side chain between the TM4/6 interface in TMEM16 and OSCA/TMEM63 family can allow lipid scramblase. In this manuscript, using scramblase activity via Annexin V binding to phosphatidylserine, and electrophysiology, the authors demonstrate that lysine mutation in TM4 of TMEM16F and TMEM16A can cause constitutive lipid scramblase activity. The authors then go on to show that analogous mutations in OSCA1.2 and TMEM63A can lead to scramblase activity.

      Strengths:

      Overall, the authors introduce an interesting concept that this large superfamily can permeate ions and lipids.

      Weaknesses:

      The electrophysiology data does not entirely support their claims.

    5. Reviewer #2 (Public Review):

      This concise and focused study by Lowry and colleagues identifies a motif in the pores of three families of channel/scramblase proteins that regulate exclusive ion permeation and lipid transport. These three ion channel families, which include the TMEM16s, the plant-expressed and stress-gated cation channel OSCA, and the mammalian homolog and mechanosensitive cation channel, TMEM63 share low sequence similarity between them and have seemingly differing functions, as anion (TMEM16s), or stress-activated cation channels (OSCA/TMEM63). The study finds that in all three families, mutating a single hydrophobic residue in the ion permeation pathway of the channels confers lipid transport through the pores of the channels, indicating that TMEM16 and the related OSCA and TMEM63 channels have a conserved potential for both ion and lipid permeation. The authors interpret the findings as revealing that these channel/scramblase proteins have a relatively low "energetic barrier for scramblase" activity. The experiments themselves seem to be done with a high level of rigor and the paper is well written. A weakness is the limited scope of the experiments which, if fixed, could open up a new line of inquiry.

    1. eLife assessment

      This valuable study is of relevance for those interested in the mechanism required for infections of humans by Klebsiella pneumoniae. The authors apply TraDIS (high-density TnSeq) to K. pneumoniae with the goal of identifying genes required for survival under various infection-relevant conditions and the gene sets identified, together with the raw sequence data, will be resources for the Klebsiella research community. The evidence to support the lists of essential and conditionally-essential genes is convincing. The study provides strong evidence that some genes are conditionally essential in urine because of iron limitation, but there is less mechanistic insight for genes that are conditionally essential in serum.

    1. Author response:

      The following is the authors’ response to the original reviews.

      The reviewer comments have been helpful, and we have revised the manuscript to address the concerns of reviewer 2. In addition to text changes, we also added a negative control to Figure 1 to address concerns about photobleaching or DNA unwrapping.

      Reviewer #1:

      This manuscript presents an extremely exciting and very timely analysis of the role that the nucleosome acidic patch plays in SWR1-catalyzed histone exchange. Intriguingly, SWR1 loses activity almost completely if any of the acidic patches are absent. To my knowledge, this makes SWR1 the first remodeler with such a unique and pronounced requirement for the acidic patch. The authors demonstrate that SWR1 affinity is dramatically reduced if at least one of the acidic patches is absent, pointing to a key role of the acidic patch in SWR1 binding to the nucleosome. The authors also pinpoint a specific subunit - Swc5 - that can bind nucleosomes, engage the acidic patch, and obtain a cryo-EM structure of Swc5 bound to a nucleosome. They also identify a conserved arginine-rich motif in this subunit that is critical for nucleosome binding and histone exchange in vitro and for SWR1 function in vivo. The authors provide evidence that suggests a direct interaction between this motif and the acidic patch.

      Strengths:

      The manuscript is well-written and the experimental data are of outstanding quality and importance for the field. This manuscript significantly expands our understanding of the fundamentally important and complex process of H2A.Z deposition by SWR1 and would be of great interest to a broad readership.

      We thank the reviewer for their enthusiastic and positive comments on our work.

      Reviewer #2:

      Summary:

      In this study, Baier et al. investigated the mechanism by which SWR1C recognizes nucleosomal substrates for the deposition of H2A.Z. Their data convincingly demonstrate that the nucleosome's acidic patch plays a crucial role in the substrate recognition by SWR1C. The authors presented clear evidence showing that Swc5 is a pivotal subunit involved in the interaction between SWR1C and the acidic patch. They pared down the specific region within Swc5 responsible for this interaction. However, two central assertions of the paper are less convincing. First, the data supporting the claim that the insertion of one Z-B dimer into the canonical nucleosome can stimulate SWR1C to insert the second Z-B dimer is somewhat questionable (see below). Given that this claim contradicts previous observations made by other groups, this hypothesis needs further testing to eliminate potential artifacts. Secondly, the claim that SWR1C simultaneously recognizes the acidic patch on both sides of the nucleosome also needs further investigation, as the assay used to establish this claim lacks the sensitivity necessary to distinguish any difference between nucleosomal substrates containing one or two intact acidic patches.

      Strengths:

      As mentioned in the summary, the authors presented clear evidence demonstrating the role of Swc5 in recognition of the nucleosome acidic patch. The identification of the specific region in Swc5 responsible for this interaction is important.

      We thank the reviewer for their careful critique of our work. Below we address each major concern.

      Major comments: (1) Figure 1B: It is unclear how much of the decrease in FRET is caused by the bleaching of fluorophores. The authors should include a negative control in which Z-B dimers are omitted from the reaction. In the absence of ZB dimers, SWR1C will not exchange histones. Therefore, any decrease in FRET should represent the bleaching of fluorophores on the nucleosomal substrate, allowing normalization of the FRET signal related to A-B eviction.

      In this manuscript, as well as in our two previous publications (Singh et al., 2019; Fan et al., 2022), we have presented the results of no enzyme controls, +/- ZB dimers, no ATP controls, or AMP-PNP controls for our FRET-based, H2A.Z deposition assay (see also Figure S3). We do not observe significant levels of photobleaching in this assay, either during ensemble measurements or in an smFRET experiment. To aid the reader, we have added the AMP-PNP data for the experiment shown in Figure 1B. The results show there is less than a 10% decrease in FRET over 30 min, and the signal from the double acidic patch disrupted nucleosome is identical to this negative control.

      (2) Figure S3: The authors use the decrease in FRET signal as a metric of histone eviction. However, Figure S3 suggests that the FRET signal decrease could be due to DNA unwrapping. Histone exchange should not occur when SWR1C is incubated with AMP-PNP, as histone exchange requires ATP hydrolysis (10.7554/eLife.77352). And since the insertion of Z-B dimer and the eviction of A-B dimer are coupled, the decrease of FRET in the presence of AMP-PNP is unlikely due to histone eviction or exchange. Instead, the FRET decrease is likely due to DNA unwrapping (10.7554/eLife.77352). The authors should explicitly state what the loss of FRET means.

      We agree with the reviewer that loss of FRET can be due to DNA unwrapping from the nucleosome. We have previously demonstrated this activity by SWR1C in our smFRET study (Fan et al., 2022). However, DNA unwrapping is highly reversible and has a time duration of only 1-3 seconds. We and others have not observed stable unwrapping of nucleosomes by SWR1C, but rather the stable loss of FRET reports on dimer eviction. We assume the reviewer is concerned about the rather large decrease in FRET signal shown in the AMP-PNP controls for Figure S3, panels A and D. For the other 7 panels, the decrease in FRET with AMP-PNP is minimal. In fact, if we average all of the AMP-PNP data points, the rate of FRET loss is not statistically different from no enzyme control reactions (nucleosome plus ZB dimers).

      Data for panels A and D used a 77N0 nucleosomal substrate, with Cy3 labeling the linker distal dimer. This is our standard DNA fragment, and it was used in Figure 1B. The only difference between data sets is that the data shown in Fig 1B used nucleosomes reconstituted with a Cy5-labelled histone octamer, rather than the hexasome assembly method used for Fig S3. Three points are important. First, for all of these substrates, we assembled 3 independent nucleosomes, and the results are highly reproducible. Second, we performed a total of 6 experiments for the 77N0-Cy5 substrates to ensure that the rates were accurate (+/-ATP). Third, and most important, we do not see this decrease in FRET signal in the absence of SWR1C (no enzyme control). These data were included in the data source file. Thus, it appears that there is significant SWR1C-induced nucleosome instability for these two hexasome-assembled substrates. We now note this in the legend to Figure S3. Key for this work, however, is that there is a large increase in the rate of FRET loss in the presence of ATP, and this rate is faster when a ZB dimer was present at the linker proximal location. In response to the last point, we state in the first paragraph of the results: “The dimer exchange activity of SWR1C is monitored by following the decrease in the 670 nm FRET signal due to eviction of the Cy5-labeled AB-Cy5 dimer (Figure 1A).”

      (3) Related to point 2. One way to distinguish nucleosomal DNA unwrapping from histone dimer eviction is that unwrapping is reversible, whereas A-B eviction is not. Therefore, if the authors remove AMP-PNP from the reaction chamber and a FRET signal reappears, then the initial loss of FRET was due to reversible DNA unwrapping. However, if the removal of AMP-PNP did not regain FRET, it means that the loss of FRET was likely due to A-B eviction. The authors should perform an AMP-PNP and/or ATP removal experiment to make sure the interpretation of the data is correct.

      See response to item 2 above

      (4) The nature of the error bars in Figure 1C is undefined; therefore, the statistical significance of the data is not interpretable.

      We apologize for not making this more explicit for each figure. The error bars report on 95% confidence intervals from at least 3 sets of experiments. This statement has been added to the legend.

      (5) The authors claim that the SWR1C requires intact acidic patches on both sides of the nucleosomes to exchange histone. This claim was based on the experiment in Figure 1C where they showed mutation of one of two acidic patches in the nucleosomal substrate is sufficient to inhibit SWR1C-mediated histone exchange activity. However, one could argue that the sensitivity of this assay is too low to distinguish any difference between nucleosomes with one (i.e., AB/AB-apm) versus two mutated acidic patches (i.e., AB-apm/AB-apm). The lack of sensitivity of the eviction assay can be seen when Figure 1B is taken into consideration. In the gel-shift assay, the AB-apm/AB-apm nucleosome exhibited a 10% SWR1C-mediated histone exchange activity compared to WT. However, in the eviction assay, the single AB/AB-apm mutant has no detectable activity. Therefore, to test their hypothesis, the authors should use the more sensitive in-gel histone exchange assay to see if the single AB/AB-apm mutant is more or equally active compared to the double AB-apm/AB-apm mutant.

      Our pincher model is based on three independent sets of data, not just Figure 1C. First, as noted by the reviewer, we find that disruption of either acidic patch cripples the dimer exchange activity of SWR1C in the FRET-based assay. Whether the defect is identical to that of the double APM mutant nucleosome does not seem pertinent to the model. In a second set of assays, we used fluorescence polarization to quantify the binding affinity of SWR1C for wildtype nucleosomes, a double APM nucleosome, or each single APM nucleosome. Consistent with the pincher model, each single APM disruption decreases binding affinity at least 10-fold (below the sensitivity of the assay). Finally, we monitored the ability of different nucleosomes to stimulate the ATPase activity of SWR1C. Consistent with the pincher model, a single APM disruption was sufficient to eliminate nucleosome stimulation.

      (6) The authors claim that the AZ nucleosome is a better substrate than the AA nucleosome. This is a surprising result as previous studies showed that the two insertion steps of the two Z-B dimers are not cooperative (10.7554/eLife.77352 and 10.1016/J.CELREP.2019.12.006). The authors' claim was based on the eviction assay shown in Fig 1C. However, I am not sure how much variation in the eviction assay is contributed by different preparations of nucleosomes. The authors should use the in-gel assay to independently test this hypothesis.

      For all data shown in our manuscript, at least three different nucleosome preparations were used. The impact of a ZB dimer on the rates of dimer exchange was highly reproducible among different nucleosome preparations and experiments. We also see reproducible ZB stimulation for three different substrates – with ZB on the linker proximal side, the linker distal side, and on one side of a core particle. We do not believe that our data are inconsistent with previous studies. First, the previous work referenced by the reviewer performed dimer exchange reactions with a large excess of nucleosomes to SWR1C (catalytic conditions), whereas we used single turnover reactions. Secondly, our study is the first to use a homogenous, ZA heterotypic nucleosome as a substrate for SWR1C. All previous studies used a standard AA nucleosome, following the first and second rounds of dimer exchange that occur sequentially. And finally, we observe only a 20-30% increase in rate by a ZB dimer (e.g. 77N0 substrates), and such an increase was unlikely to have been detected by previous gel-based assays.

      Minor comments:

      (1) Abstract line 4: To say 'Numerous' studies have shown acidic patch impact chromatin remodeling enzymes activity may be too strong.

      Removed

      (2) Page 15, line 15: The authors claim that swc5∆ was inviable on formamide media. However, the data in Figure 8 shows cell growth in column 1 of swc5∆.

      The term ‘inviable’ has been replaced with ‘poor’ or ‘slow growth’

      (3) The authors should use standard yeast nomenclature when describing yeast genes and proteins. For example, for Figure 8 and legend, Swc5∆ was used to describe the yeast strain BY4741; MATa; his3Δ1; leu2Δ0; met15Δ0; ura3Δ0; YBR231c::kanMX4. Instead, the authors should describe the swc5∆ mutant strain as BY4741 MAT a his3∆1 leu2∆0 met15∆0 ura3∆0 swc5∆::kanMX4. Exogenous plasmid should also be indicated in italics and inside brackets, such as [SWC5-URA3] or [swc5(R219A)-URA3].

      We apologize for missing this mistake in the Figure 8 legend. We had inadvertently copied this from the euroscarf entry and forgot to edit the entry. We decided not to add all the plasmid names to the figure, as it was too cluttered. We state in the figure legend that the panels show growth of swc5 deletion strains harboring the indicated swc5 alleles on CEN/ARS plasmids.

      (4) According to Lin et al. 2017 NAR (doi: 10.1093/nar/gkx414), there is only one Swc5 subunit per SWR1C. Therefore, the pincher model proposed by the authors would suggest that there is a missing subunit that recognizes the second acidic patch. The authors should point out this fact in the discussion. However, as mentioned in Major comment 6, I am not sure if the pincer model is substantiated.

      In our discussion, we had noted that the published cryoEM structure had suggested that the Swc2 subunit likely interacts with the acidic patch on the dimer that is not targeted for replacement, and we proposed that Swc5 interacts with the acidic patch on the exchanging H2A/H2B dimer. We have now made this more clear in the text.

    2. Reviewer #1 (Public Review):

      This manuscript presents an extremely exciting and very timely analysis of the role that the nucleosome acidic patch plays in SWR1-catalyzed histone exchange. Intriguingly, SWR1 loses activity almost completely if any of the acidic patches are absent. To my knowledge, this makes SWR1 the first remodeler with such a unique and pronounced requirement for the acidic patch. The authors demonstrate that SWR1 affinity is dramatically reduced if at least one of the acidic patches is absent, pointing to a key role of the acidic patch in SWR1 binding to the nucleosome. The authors also pinpoint a specific subunit - Swc5 - that can bind nucleosomes and engage the acidic patch and obtain a cryo-EM structure of Swc5 bound to a nucleosome. They also identify a conserved arginine-rich motif in this subunit that is critical for nucleosome binding and histone exchange in vitro and for SWR1 function in vivo. The authors provide evidence that suggests a direct interaction between this motif and the acidic patch.

      Strengths:

      The manuscript is well-written and the experimental data are of outstanding quality and importance for the field. This manuscript significantly expands our understanding of the fundamentally important and complex process of H2A.Z deposition by SWR1 and would be of great interest for a broad readership.

    3. eLife assessment

      This manuscript presents an important analysis of the role that the nucleosome acidic patch plays in SWR1-catalyzed histone exchange. This manuscript contains convincing data which significantly expands our understanding of the complex process of H2A.Z deposition by SWR1 and therefore would be of interest to a broad readership.

    1. eLife assessment

      This paper addresses an important topic (normative trajectory modelling), seeking to provide a method aiming to accurately reflect the individual deviation of longitudinal/temporal change compared to the normal temporal change characterized based on a pre-trained population normative model. The evidence provided for the new methods is, however, inadequate. There is a lack of simulation studies to formally evaluate the performance of the proposed method in making accurate estimations and inferences about the longitudinal changes, the novelty of the method is not sufficiently described, and the example provided is unsatisfactory.

    2. Reviewer #1 (Public Review):

      Summary:

      This paper provides a methodology for normative trajectory modeling, using cross-sectional data to set the "norms," and then applying these norms to longitudinal brain observations. An example of schizophrenia trajectories (two time points) is provided. The method assumes a Bayesian mixed effects model, which included some hyperparameters that need to be tuned. The longitudinal assumption is essentially a random intercept model, assuming that the age-based quantiles do not shift, and if they do that is a sign of disease-like trajectories.

      Strengths:

      Normative modeling of brain feature trajectories is an important topic. Bayesian models are a promising alternative to modeling these. Leveraging large-scale data to provide norms is also potentially useful.

      Weaknesses:

      The models described are not fundamentally novel, essentially a random intercept model (with a warping function), and some flexible covariate effects using splines (i.e., additive models). The assumption of constant quantiles is very strong, and limits the utility of the model to very short term data. The schizophrenia example leads to a counter-intuitive normalization of trajectories, which leads to suspicions that this is driven by some artifact of the data modeling/imaging pipelines. The method also assumes that the cross-sectional data is from a "healthy population" without describing what this population is (there is certainly every chance of ascertainment bias in large scale studies as well as small scale studies). This issue is completely elided over in the manuscript.

    3. Reviewer #2 (Public Review):

      Summary:

      In this manuscript, the authors provide a method aiming to accurately reflect the individual deviation of longitudinal/temporal change compared to the normal temporal change characterized based on pre-trained population normative model (i.e., a Bayesian linear regression normative model), which was built based on cross-sectional data. This manuscript aims to solve a recently identified problem of using normative models based on cross-sectional data to make inferences about longitudinal change.

      Although the proposed method was implemented with real data and shown to be more sensitive in capturing the differences confirmed by previous studies than conventional methods, there is still a lack of simulation studies to formally evaluate the performance of the proposed method in making accurate estimations and inferences about the longitudinal changes.

      Strengths:

      The efforts of this work make a good contribution to addressing an important question of normative modeling. With the greater availability of cross-sectional studies for normative modeling than longitudinal studies, and the inappropriateness of making inferences about longitudinal subject-specific changes using these cross-sectional data-based normative models, it's meaningful to try to address this gap from the perspective of methodological development.

      Weaknesses:

      • The organization and clarity of this manuscript need enhancement for better comprehension and flow. For example, in the first few paragraphs of the introduction, the wording is quite vague. A lot of information was scattered and repeated in the latter part of the introduction, and the actual challenges/motivation of this work were not introduced until the 5th paragraph.

      • There are no simulation studies to evaluate whether the adjustment of the cross-sectional normative model to longitudinal data can make accurate estimations and inferences regarding the longitudinal changes. Also, there are some assumptions involved in the modeling procedure, for example, the deviation of a healthy control from the population over time is purely caused by noise and constant variability of error/noise across x_n, and these seem to be quite strong assumptions. The presentation of this work's method development would be strengthened if the authors can conduct a formal simulation study to evaluate the method's performance when such assumptions are violated, and, ideally, propose some methods to check these assumptions before performing the analyses.

      • The proposed "z-diff score" still falls in the common form of z-score to describe the individual deviation from the population/reference level, but now is just specifically used to quantify the deviation of individual temporal change from the population level. The authors need to further highlight the difference between the "z-score" and "z-diff score", ideally at its first mention, in case readers get confused (I was confused at first until I reached the latter part of the manuscript). The z-score can also be called a measure of "standardized difference" which kind of collides with what "z-diff" implies by its name.

      • Explaining that one component of the variance is related to the estimation of the model and the other is due to prediction would be helpful for non-statistical readers.

      • It would be easier for the non-statistical reader if the authors consistently used precision or variance for all variance parameters. Probably variance would be more accessible.

      • The functions psi were never explicitly described. This would be helpful to have in the supplement with a reference to that in the paper.

      • What is the goal of equations (13) and (14)? The authors should clarify what the point of writing these equations is prior to showing the math. It seems like it is to obtain an estimate of \sigma_{\xi}^2, which the reader only learns at the end.

      • What is the definition of "adaption" as used to describe equation (15)? In this equation, I think norm on subsample was not defined.

      • "(the sandwich part with A)" - maybe call this an inner product so that it is not confused with a sandwich variance estimator. This is a bit unclear. Equation (8) does have the inner product involving A and \beta^{-1} does include variability of \eta. It seems like you mean that equation (8) incorrectly includes variability of \eta and does not have the right term vector component of the inner product involving A, but this needs clarifying.

      • One challenge with the z-diff score is that it does not account for whether a person sits above or below zero at the first time point. It might make it difficult to interpret the results, as the results for a particular pathology could change depending on what stage of the lifespan a person is in. I am not sure how the authors would address those challenges.

    4. Author response:

      We thank the reviewers for the feedback on our manuscript; we are planning to address the raised concerns in the following manner:

      We will be more explicit about the novelty of this method, framing it more concretely within the scope of current research. From some of the reviewers' comments, we understand that it is not clear that our method is an extension of an already existing method and model that has been extensively validated with pre-trained models brought online. Consequently, the details of the model as well as the training cohort are only covered briefly, referencing relevant published works on this topic. We will improve the clarity in this respect in the full responses. Nevertheless, we agree that the work would benefit from a simulation study that formally evaluates the performance of our method compared with more traditional approaches and will add it in our full responses. We will take care specifically to investigate the effect of assumptions like the centile-stability in healthy controls, as suggested by Reviewer 2.

      The novelty of this work lies in introducing a mathematically transparent method to use normative modelling for evaluating studies with a longitudinal design, using normative models trained on cross sectional data. We emphasise strongly that this is otherwise not possible using current methods. Furthermore, by building on a pre-trained model, this method enjoys the benefits of big (cross-sectional) data (by the pre-trained model being fitted on an extensive population sample) without the need to have direct access to them, or a ‘big’ longitudinal dataset from the cohort at hand. This is crucial in neuroimaging, where longitudinal data are much more scarce than cross-sectional data.

      We strongly disagree with the notion raised by Reviewer 1 that after the first episode cortical thickness alterations are expected to become more severe. There is now increasing evidence that: (i) trajectories of cortical thickness are highly variable across different individuals after the first psychotic episode and (ii) that individuals treated with second-generation antipsychotics and with careful clinical follow-up can show normalisation of cortical thickness atypicalities after the first episode. Indeed, we can provide evidence for this in an independent cohort, with different analytical methodologies, where precisely this occurs (https://www.medrxiv.org/content/10.1101/2024.04.19.24306008v1, https://pubmed.ncbi.nlm.nih.gov/36805840/). In the full revision, we would be happy to provide further discussion of evidence in support of this.

      We would also like to re-emphasise that the data were processed with the utmost rigour using state-of-the-art processing pipelines including quality control.

      We will take care to improve the flow of the manuscript with special attention to the theoretical part and sections highlighted by Reviewer 2.

      We agree with the challenge outlined by Reviewer 2 regarding the limitations in interpretation of overall trends when the position at visit one differs between subjects. However, this is a much broader challenge and is not specific to this study. The non-random sampling of large cohort studies is problematic for nearly all studies using such cohorts, regardless of the statistical approach used. We will explicitly acknowledge these limitations in the full response.

    1. Author response:

      Reviewer #1 (Public Review):

      Given that this is one of the first studies to report the mapping of longitudinal intactness of proviral genomes in the globally dominant subtype C, the manuscript would benefit from placing these findings in the context of what has been reported in other populations, for example, how decay rates of intact and defective genomes compare with that of other subtypes where known.

      Most published studies are from men living with HIV-1 subtype B and the studies are not from the hyperacute infection phase and therefore a direct head-to-head comparison with the FRESH study is difficult. However, we can cite/highlight and contrast our study with a few examples from other acute infection studies as follows.

      (1) Peluso et al., JCI, 2020, showed that in Caucasian men (SCOPE study) with subtype B infection who initiated ART during chronic infection, intact viral genomes decayed at a rate of 15.7% per year, while defective genomes decayed at a rate of 4% per year. In our study, we showed that in chronic treated participants, genomes decreased at a rate of 25% (intact) and 3% (defective) per month for the first 6 months of treatment.

      (2) White et al., PNAS, 2021, demonstrated that in a cohort of African, white and mixed-race American men treated during acute infection, the rate of decay of intact viral genomes in the first phase of decay was <0.3 log copies in the first 2-3 weeks following ART initiation. In the FRESH cohort, our data from acute treated participants show a comparable decay rate of 0.31 log copies per month for intact viral genomes.

      (3) A study in Thailand (Leyre et al., 2020, Science Translational Medicine) of predominantly HIV-1 CRF01-AE subtype compared HIV reservoir levels in participants starting ART at the earliest stages of acute HIV infection (in the RV254/SEARCH 010 cohort) and participants initiating ART during chronic infection (in SEARCH 011 and RV304/SEARCH 013 cohorts). In keeping with our study, they showed that the frequency of infected cells with integrated HIV DNA remained stable in participants who initiated ART during chronic infection, while there was a sharp decay in these infected cells in all acutely treated individuals during the first 12 weeks of therapy. Rates of decay were not provided and therefore a direct comparison with our data from the FRESH cohort is not possible.

      (4) A study by Bruner et al., Nat. Med. 2016, described the composition of proviral populations in an acute treated (within 100 days) and chronic treated (>180 days), predominantly male subtype B cohort. In comparison to the FRESH chronic treated group, they showed that in chronic treated infection 98% (87% in FRESH) of viral genomes were defective, 80% (60% in FRESH) had large internal deletions and 14% (31% in FRESH) were hypermutated. In acute treated infection, 93% (48% in FRESH) were defective and 35% (7% in FRESH) were hypermutated. The differences in the frequency of hypermutations could be explained by the differences in timing of infection, specifically in the acute treated groups, where FRESH participants initiate ART at a median of 1 day after infection. It is also possible that sex- or race-based differences in immunological factors that impact the reservoir may play a role.

      This study also showed that large deletions are non-random and occur at hotspots in the HIV-1 genome. The design of the subtype B IPDA assay (Bruner et al., Nature, 2019) is based on optimal discrimination between intact and deleted sequences, obtained with a 5′ amplicon in the Ψ region and a 3′ amplicon in Envelope. This suggests that Envelope is a hotspot for large deletions, while Ψ is the site of frequent small deletions and is included in larger 5′ deletions. In the FRESH cohort of HIV-1 subtype C, genome deletions were most frequently observed between Integrase and Envelope relative to Gag (p<0.0001–0.001).

      (5) In 2017, Heiner et al., in Cell Rep, also described genetic characteristics of the latent HIV-1 reservoir in 3 acute treated and 3 chronic treated male study participants with subtype B HIV. Their data were similar to those of Bruner et al. above, showing proportions of intact proviruses in participants who initiated therapy during acute/early infection at 6% (94% defective) and chronic infection at 3% (97% defective). In contrast, the frequencies in FRESH in acute treated participants were 52% intact and 48% defective, and in chronic infection were 13% intact and 87% defective. These differences could be attributed to the timing of treatment initiation, where in the aforementioned study early treatment ranged from 0.6-3.4 months after infection.
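
      The decay rates quoted in items (1) and (2) are given in different units (percent per year, percent per month, log copies per month), which makes them hard to compare by eye. The following is a minimal sketch, not part of the authors' analysis, of how such figures could be put on a common half-life scale. It assumes simple first-order (exponential) decay, reads "X% per period" as the fraction of genomes lost in one period, and reads "log copies" as log10; the function names are hypothetical.

```python
import math


def half_life_from_fraction(frac_lost, period=1.0):
    """Half-life, in the same units as `period`, when a fraction
    `frac_lost` of genomes is lost per period (assumed exponential decay)."""
    k = -math.log(1.0 - frac_lost) / period  # first-order rate constant
    return math.log(2) / k


def half_life_from_log10(log10_drop, period=1.0):
    """Half-life, in the same units as `period`, when the count falls by
    `log10_drop` log10 units per period (assumed exponential decay)."""
    k = log10_drop * math.log(10) / period  # first-order rate constant
    return math.log(2) / k


if __name__ == "__main__":
    # Figures as quoted in the response; the interpretation is an assumption.
    print("15.7% per year   ->", half_life_from_fraction(0.157), "years")
    print("25% per month    ->", half_life_from_fraction(0.25), "months")
    print("0.31 log10/month ->", half_life_from_log10(0.31), "months")
```

      Under these assumptions, the quoted rates can be compared as half-lives rather than as raw percentages. The comparison remains illustrative only, since the cohorts differ in treatment timing and follow-up window.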

      Indeed, in the abstract, the authors indicate that treatment was initiated before the peak. The use of the term 'peak' viremia in the hyperacute-treated group could perhaps be replaced with 'highest recorded viral load'. The statistical comparison of this measure in the two groups is perhaps more relevant with regards to viral burden over time or area under the curve viral load as these are previously reported as correlates of reservoir size.

      We will edit the manuscript text to describe the term peak viraemia in hyperacute treated participants more clearly. We will perform an analysis of area under the curve to compare viral burden in the two study groups.

      Reviewer #2 (Public Review):

      Other factors also deserve consideration and include age, and environment (e.g. other comorbidities and coinfections.)

      We agree that these factors could play a role; however, participants in this study were of similar age (18-23), and information on co-morbidities and coinfections is not known.

      Reviewer #3 (Public Review):

      The word reservoir should not be used to describe proviral DNA soon after ART initiation. It is generally agreed upon that there is still HIV DNA from actively infected cells (phase 1 & 2 decay of RNA) during the first 6-12 months of ART. Only after a full year of uninterrupted ART is it really safe to label intact proviral HIV DNA as an approximation of the reservoir. This should be amended throughout.

      We agree and will amend the use of the word reservoir to only refer to the proviral DNA load after full viral suppression, i.e., during undetectable viral load.

      All raw, individualized data should be made available for modelers and statisticians. It would be very nice to see the RNA and DNA data presented in a supplementary figure by individual to get a better grasp of intra-host kinetics.

      We will make all relevant data available and accessible to interested parties.

      The legend of Supplementary Figure 2 should list when samples were taken.

      The data in this figure represents an overall analysis of all sequences available for each participant at all time points. This will be explained more clearly in the manuscript and added to the figure legend.

    2. eLife assessment

      This important, clearly written, and timely manuscript links the timing of ART with the kinetics of total and intact proviral HIV DNA. The conclusions are interesting and somewhat novel, and the importance of the work is high because the focus is on African women and clade C virus, both of which are understudied in the HIV reservoir field. The strength of the evidence is convincing though some definitions could be more precise and in some places the data could be reported slightly more clearly. Overall, this work will be of very high interest to scientists and clinicians in the HIV cure/persistence fields.

    3. Reviewer #1 (Public Review):

      The authors sought to determine the impact of early antiretroviral treatment on the size, composition, and decay of the HIV latent reservoir. This reservoir represents the source of viral rebound upon treatment interruption and therefore constitutes the greatest challenge to achieving an HIV cure. A particular strength of this study is that it reports on reservoir characteristics in African women, a significantly understudied population, of whom some have initiated treatment within days of acute HIV diagnosis. With the use of highly sensitive and current technologies, including digital droplet PCR and near full-length genome next-generation sequencing, the authors generated a valuable dataset for investigation of proviral dynamics in women initiating early treatment compared to those initiating treatment in chronic infection. The authors confirm previous reports that early antiretroviral treatment restricts reservoir size, but further show that this restriction extends to defective viral genomes, where late treatment initiation was associated with a greater frequency of defective genomes. Furthermore, an additional strength of this study is the longitudinal comparison of viral dynamics post-treatment, wherein early treatment was shown to be associated with a more rapid rate of decay in proviral genomes, regardless of intactness, over a period of one year post-treatment. While it is indicated that intact genomes were not detected after one year following early treatment initiation, caution should be taken with interpretation where sequence numbers are low. Defective genomes are more abundant than intact genomes and are therefore more likely to be sampled. Early treatment was also associated with reduced proviral diversity and fewer instances of polymorphisms associated with cytotoxic T-lymphocyte immune selection. This is expected given that rapid evolution and extensive immune selection are synonymous with HIV infection in the absence of treatment, yet points to an additional benefit of early treatment in the context of immune therapies to restrict the reservoir.

      Given that this is one of the first studies to report the mapping of longitudinal intactness of proviral genomes in the globally dominant subtype C, the manuscript would benefit from placing these findings in the context of what has been reported in other populations, for example, how decay rates of intact and defective genomes compare with that of other subtypes where known. While not a primary outcome of the study, the comparisons of peak viremia in the hyperacute and chronic-treated groups may be confounded by the fact that peak viremia may have been pre-empted by early treatment i.e., the true peak was not reached in early-treated individuals. Indeed, in the abstract, the authors indicate that treatment was initiated before the peak. The use of the term 'peak' viremia in the hyperacute-treated group could perhaps be replaced with 'highest recorded viral load'. The statistical comparison of this measure in the two groups is perhaps more relevant with regards to viral burden over time or area under the curve viral load as these are previously reported as correlates of reservoir size. The analysis of clonal expansion of proviral genomes may be limited by higher sequence homogeneity in hyperacute infection i.e., cells with different proviral integration sites may have a higher likelihood of containing identical genomes than chronic infection.

      Overall, these data demonstrate the distinct benefits of early treatment initiation at reducing the barrier to a functional cure for HIV, not only by restricting viral abundance and diversity but also potentially through the preservation of immune function and limiting immune escape. It therefore provides clues to curative strategies even in settings where early diagnosis and treatment may be unlikely.

    4. Reviewer #2 (Public Review):

      HIV infection is characterized by viral integration into permissive host cells - an event that occurs very early in viral-host encounter. This constitutes the HIV proviral reservoir and is a feature of HIV infection that provides the greatest challenge for eradicating HIV-1 infection once an individual is infected.

      This study looks at how starting HIV treatment very early after infection, which substantially reduces the peak viral load detectable (compared to untreated infection), affects the amount and characteristics of the viral reservoir. The authors studied 35 women in South Africa who were at high risk of getting HIV. Some of these women started HIV treatment very soon after getting infected, while others started later. This study is well-designed and has as its focus a very well-characterized cohort. Comparison groups are appropriately selected to address reservoir characterization and dynamics in the context of acute and chronic treated HIV-1. The amount of HIV and various characteristics of the genetic makeup of the virus (intact/defective proviral reservoir) were evaluated over one year of treatment. Methods employed for reservoir characterization are state-of-the-art and provide in-depth insights into the reservoir in peripheral blood.

      While starting treatment early didn't reduce the amount of HIV DNA at the outset, it did lead to a gradual decrease in total HIV DNA quantity over time. In contrast, those who started treatment later didn't see much change in this parameter. Starting treatment early led to a faster decrease in intact provirus (a measure of replication-competence), compared to starting treatment later. Additionally, early treatment reduced the genetic diversity of the viral DNA and resulted in fewer immune escape variants within intact genomes. This suggests that collectively having a smaller intact replication-competent reservoir, less viral variability, and less opportunity for the virus to evade the immune system - are all features that are likely to facilitate more effective clearance of viral reservoir, especially when combined with other intervention strategies.

      Major strengths of the study include the cohort of very early treated persons with HIV and the depth of study. These are important findings, particularly as the study was conducted in HIV-1 subtype C infected women (more cure studies have focussed on men and on subtype B infection) and in populations most affected by HIV and in need of HIV cure interventions. This is highly relevant because it cannot be assumed that any interventions employed for reducing/clearing the HIV reservoir would perform similarly in men and women or across different populations. Other factors also deserve consideration and include age and environment (e.g. other comorbidities and coinfections).

    5. Reviewer #3 (Public Review):

      Summary:

      This paper assesses the size and clearance kinetics of proviral HIV DNA (intact and total) in women in South Africa with clade C virus who started ART at different time points of infection (very early vs late).

      Strengths:

      The cohort is excellent. The paper is easy to read. The methodology is appropriate. Some conclusions, particularly the differing kinetics of total HIV DNA despite a similar amount of virus in early vs late treated women are novel and thought-provoking. I really enjoyed reading this paper!

      Weaknesses:

      There are several areas in the paper that could be explicated a bit more accurately with more detailed references to past work.

      (1) The word reservoir should not be used to describe proviral DNA soon after ART initiation. It is generally agreed upon that there is still HIV DNA from actively infected cells (phase 1 & 2 decay of RNA) during the first 6-12 months of ART. Only after a full year of uninterrupted ART is it really safe to label intact proviral HIV DNA as an approximation of the reservoir. This should be amended throughout.

      (2) All raw, individualized data should be made available for modelers and statisticians. It would be very nice to see the RNA and DNA data presented in a supplementary figure by individual to get a better grasp of intra-host kinetics.

      (3) The legend of Supplementary Figure 2 should list when samples were taken.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This is a follow-up study to the authors' previous eLife report about the roles of an alpha-arrestin called protein thioredoxin interacting protein (Txnip) in cone photoreceptors and in the retinal pigment epithelium. The findings are important because they provide new information about the mechanism of glucose and lactate transport to cone photoreceptors and because they may become the basis for therapies for retinal degenerative diseases.

      Strengths:

      Overall, the study is carefully done and, although the analysis is fairly comprehensive with many different versions of the protein analyzed, it is clearly enough described to follow. Figure 4 greatly facilitated my ability to follow, understand and interpret the study. The authors have appropriately addressed a few concerns about statistical significance and the relationship between their findings and previous studies of the possible roles of Txnip on GLUT1 expression and localization on the surfaces of RPE cells.

      We are delighted that Reviewer #1 is satisfied with this revised version.

      Reviewer #2 (Public Review):

      The hard work of the authors is much appreciated. With overexpression of the α-arrestin Txnip in the RPE, in cones, and in both combined, the authors show a potential gene-agnostic treatment that can be applied to retinitis pigmentosa. Furthermore, since Txnip is related to multiple intracellular signaling pathways, this study is of value for research in the mechanism of secondary cone dystrophy as well.

      There are a few areas in which the article may be improved through further analysis and application of the data, as well as some adjustments that should be made to clarify specific points in the article.

      Strengths

      • The follow-up study builds on innovative ground by exploring the impact of TxnipC247S and its combination with HSP90AB1 knockdown on cone survival, offering novel therapeutic pathways.

      • Testing of different Txnip deletion mutants provides a nuanced understanding of its functional domains, contributing valuable insights into the mechanism of action in RP treatment.

      • The findings regarding GLUT1 clearance and the differential effects of Txnip mutants on cone and RPE cells lay the groundwork for targeted gene therapy in RP.

      Weaknesses

      • The focus on specific mutants and overexpression systems might overlook broader implications of Txnip interactions and its variants in the wider context of retinal degeneration.

      Txnip is not expressed in WT or RP cones, as described in our previous study (Xue et al., 2021, eLife), so we could not perform loss of function assays. We thus chose overexpression, and assayed various alleles, based upon the literature, as we describe in our manuscript.

      • The study's reliance on cell count and GLUT1 expression as primary outcomes misses an opportunity to include functional assessments of vision or retinal health, which would strengthen the clinical relevance.

      In our previous study, we demonstrated that the optomotor response of Txnip-treated RP mice improved (Xue et al., 2021, eLife). Also, as described in our previous Txnip study, as well as an independent study (Xue et al., 2021, eLife; Xue et al., 2023, PNAS), ERG assays of Txnip-treated RP cones were no different than the controls. Other therapies that prolong RP cone survival and the optomotor response in our lab also failed to save the ERG, suggesting that there are other pathways that need to be addressed, e.g. the visual cycle. A combination therapy addressing multiple problems is one of our goals.

      • The paper could benefit from a deeper exploration of why certain treatments (like Best1-146 Txnip.C247S) do not lead to cone rescue and the potential for these approaches to exacerbate disease phenotypes through glucose shortages.

      This system is more complicated than we currently understand, and more work needs to be done.

      • Minor inconsistencies, such as the missing space in text references and the need for clarification on data representation (retinas vs. mice), should be addressed for clarity and accuracy.

      The missing spaces are added.

      We described the strategy of injecting the same mouse in each eye, one eye with control and one with the experimental vector. However, the following sentence has been added to the Materials and Methods to better assist the reader:

      “In almost all experiments, other than as noted, one eye of the mouse was treated with control (AAV8-RedO-H2BGFP, 2.5 × 10⁸ vg/eye), and the other eye was treated with the experimental vector plus AAV8-RedO-H2BGFP, 2.5 × 10⁸ vg/eye.”

      • The observation of promoter leakage and potential vector tropism issues raise questions about the specificity and efficiency of the gene delivery system, necessitating further discussion and validation.

      The following sentences have been added to the Results. We do not think this phenomenon affects the practice of the experiments or the interpretation of the results in this study.

      “To enable automated cone counting and trace the infection, we co-injected an AAV (AAV8-RedO-H2BGFP-WPRE-bGHpA) encoding an allele of GFP fused to histone 2B (H2BGFP), which localized to the nucleus. As the red opsin promoter was used to express this gene, H2BGFP was seen in cone nuclei, but not in the RPE, if AAV8-RedO-H2BGFP-WPRE-bGHpA was injected alone. However, when an AAV that expressed in the RPE, i.e. AAV8-Best1-Sv40intron-(Gene)-WPRE-bGHpA, was co-injected with AAV8-RedO-H2BGFP-WPRE-bGHpA, H2BGFP was expressed in the RPE, along with expression in cones (Figure 2A). We speculate that this is due to concatenation or recombination of the two genomes, such that the H2BGFP comes under the control of the RPE promoter. This may be due to the high copy number of AAV in the RPE, as it did not happen in the reverse combination, i.e. AAV with an RPE promoter driving GFP and a cone promoter driving another gene, perhaps due to the observation that the AAV genome copy number is ≈10-fold lower in cones than in the RPE (Wang et al., 2020).”

    2. eLife assessment

      This fundamental study advances our understanding of the cell specific treatment of cone photoreceptor degeneration by Txnip. The evidence supporting the conclusions is compelling with rigorous genetic manipulation of Txnip mutations. The work will be of broad interest to vision researchers, cell biologists and biochemists.

    3. Reviewer #1 (Public Review):

      Summary:

      This is a follow-up study to the authors' previous eLife report about the roles of an alpha-arrestin called protein thioredoxin interacting protein (Txnip) in cone photoreceptors and in the retinal pigment epithelium. The findings are important because they provide new information about the mechanism of glucose and lactate transport to cone photoreceptors and because they may become the basis for therapies for retinal degenerative diseases.

      Strengths:

      Overall, the study is carefully done and, although the analysis is fairly comprehensive with many different versions of the protein analyzed, it is clearly enough described to follow. Figure 4 greatly facilitated my ability to follow, understand and interpret the study. The authors have appropriately addressed a few concerns about statistical significance and the relationship between their findings and previous studies of the possible roles of Txnip on GLUT1 expression and localization on the surfaces of RPE cells.

    4. Reviewer #2 (Public Review):

      The hard work of the authors is much appreciated. With overexpression of the α-arrestin Txnip in the RPE, in cones, and in both combined, the authors show a potential gene-agnostic treatment that can be applied to retinitis pigmentosa. Furthermore, since Txnip is related to multiple intracellular signaling pathways, this study is of value for research in the mechanism of secondary cone dystrophy as well.

      Strengths

      - The follow-up study builds on innovative ground by exploring the impact of TxnipC247S and its combination with HSP90AB1 knockdown on cone survival, offering novel therapeutic pathways.
      - Testing of different Txnip deletion mutants provides a nuanced understanding of its functional domains, contributing valuable insights into the mechanism of action in RP treatment.
      - The findings regarding GLUT1 clearance and the differential effects of Txnip mutants on cone and RPE cells lay the groundwork for targeted gene therapy in RP.

      Comments on revised version:

      The researchers answered our questions and included additional discussion in the manuscript.

    1. eLife assessment

      This important study suggests that capsaicin nanoparticle administration in rats activates the transcription factor Nrf2 by directly binding to its repressor KEAP1, leading to cytoprotective gene induction, and preventing alcohol-induced gastric damage, an avenue to treat alcoholism-related gastric disorders. The evidence is currently incomplete as there is no experimental proof that capsaicin exerts its cytoprotective effects via Nrf2, and not via any of its multiple known pharmacological effects. In particular, Nrf2-deficient mice should be used to show that Nrf2 is causal to the cytoprotective effect, and better controls should be provided for the direct KEAP1-capsaicin interaction, given the high concentrations used.

    2. Reviewer #1 (Public Review):

      Summary:

      This paper by Gao et al. describes the effect of capsaicin on the NRF2/KEAP1 pathway. The authors carried out a set of in vitro experiments that addressed the mechanisms of the protective effect of capsaicin on ethanol-induced cytotoxicity. They also conducted in vivo studies in rats focusing on ethanol-induced gastric mucosal oxidative damage. The authors conclude that capsaicin activates NRF2, which leads to the induction of cytoprotective genes, preventing oxidative damage. This effect has already been shown, and it is well established that capsaicin activates NRF2, but what can be novel in the paper is the demonstration that capsaicin may directly bind to KEAP1 and that it is a noncovalent modification of the Kelch domain. The authors also designed new albumin-coated capsaicin nanoparticles, which were tested for the therapeutic effect in vivo. Apart from novelty concerns, the manuscript may be potentially interesting, but in my opinion, it is not fully technically sound, which weakens the strength of the evidence.

      Major concerns:

      For studies investigating capsaicin binding to KEAP1, the authors used capsaicin concentrations that are toxic to cells (Figures S1D and 4F, G). In vivo studies were performed only in 3 rats per group. The T-test was used for the comparison of more than two groups. Given the well-known issues with the specificity of the NRF2 antibody, the authors should provide appropriate controls, especially for IF and IHC staining.
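
      To illustrate why repeated t-tests across more than two groups are a concern, the following is a minimal sketch of how the family-wise false-positive rate grows with the number of pairwise comparisons. It assumes each test is run at the same significance level and, as a simplification, that the tests are independent (pairwise tests that share a group are not strictly independent, so the figure is approximate). The function name is hypothetical.

```python
from math import comb


def familywise_error_rate(n_groups, alpha=0.05):
    """Approximate chance of at least one false positive across all
    pairwise comparisons, assuming independent tests at level alpha."""
    if n_groups < 2:
        raise ValueError("need at least two groups")
    n_tests = comb(n_groups, 2)  # number of distinct group pairs
    fwer = 1.0 - (1.0 - alpha) ** n_tests
    return n_tests, fwer


# Three groups give 3 pairwise tests; four groups give 6
tests_3, fwer_3 = familywise_error_rate(3)
tests_4, fwer_4 = familywise_error_rate(4)
```

      Under these assumptions the error rate already exceeds the nominal 5% with three groups, which is why an omnibus test such as ANOVA followed by a corrected post hoc comparison is commonly preferred.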

    3. Reviewer #2 (Public Review):

      Summary:

      In this paper, the authors wanted to show that capsaicin can disrupt the interaction between Keap1 and Nrf2 by directly binding to Keap1 at an allosteric site. The resulting stabilization of Nrf2 would protect CAP-treated gastric cells from alcohol-induced redox stress and damage as well as inflammation (both in vitro and in vivo).

      Strengths:

      One major strength of the study is the use of multiple methods (CoIP, SPR, BLI, deuterium exchange MS, CETSA, MS simulations, target gene expression) that consistently show for the first time that capsaicin can disrupt the Nrf2/Keap1 interaction at an allosteric site and lead to stabilization and nuclear translocation of Nrf2.

      Weaknesses:

      One major weakness of the study is that plausibility is taken as proof for causality. The finding that capsaicin directly binds to Keap1 and releases Nrf2 from its fate of degradation (in vitro) is taken for granted as the sole explanation for the observed improved gastric health upon alcohol exposure (in vivo). There is no consideration or exclusion of any potential unrelated off-target effect of capsaicin, or proteins other than Nrf2 that are also controlled by Keap1.

      Another point that hampers full appreciation of the capsaicin effect in cells is that capsaicin is not investigated alone, but mostly in combination with alcohol only.

      Bottom Line:

      Overall, the authors are convincing that capsaicin (although weakly) binds to Keap1 and releases Nrf2 from degradation. With this, the authors provide a significant finding with marked relevance for the redox/Nrf2 as well as natural products /hit discovery communities. Moreover, the employed toolbox of different complementary methodologies can set the bar for future PPI inhibitor studies. The translation of this novel finding in a biological setting (alcohol-stressed gastric cells) still leaves some open questions and concerns. These concerns mainly arise from lacking control experiments and/or somewhat biased conclusions from the obtained data sets.

    4. Reviewer #3 (Public Review):

      Summary:

      The paper by Gao et al. describes that capsaicin (CAP) might act as a novel NRF2 agonist capable of suppressing ethanol (EtOH)-induced oxidative damage in the gastric mucosa by disrupting the KEAP1-NRF2 interaction. Initially, the authors established and validated a cell model for EtOH-induced oxidative stress which they used to experiment with different CAP concentrations and to determine changes in a variety of parameters such as cell morphology, ROS production, status of redox balance to mitochondrial function, amongst others.

      The proposed mechanism by which CAP activates NRF2 and mitigates oxidative stress is thought to be via non-covalent binding to the Kelch domain of KEAP1. A variety of assays such as BLI, CETSA, Pull-down, Co-IP, and HDX-MS were employed to investigate the KEAP1 binding behavior of CAP both in vitro and in GES1 cells. Consequently, the authors developed in vivo nanoparticles harboring CAP and tested those in a rat model. They found that pretreatment with the CAP-nanoparticles led to significant upregulation of NRF2 and subsequent effects on pro- (suppression of IL-1β, TNF-α, IL-6, and CXCL1) and anti-inflammatory (activation of IL-10) cytokines pointing to a resolved state of inflammation and oxidative stress.

      Strengths:

      The work comprises a comprehensive approach with a variety of in vitro assays as well as cell culture experiments to investigate CAP binding behaviour to KEAP1. In addition, the authors employ in vivo validation by establishing an ethanol-induced acute gastric mucosal damage rat model and providing evidence of the potential therapeutic effect of CAP.

      The study further provides novel insights into the mode of CAP action by elucidating the mechanism by which CAP promotes NRF2 expression and downstream antioxidant target gene activation.

      The design of IR-Dye800 modified albumin-coated CAP nanoparticles for enhanced drug solubility and delivery efficiency demonstrates a valuable practical application of the research findings.

      In summary, the study's findings suggest that CAP could be a safe and novel NRF2 agonist with implications for the development of lead drugs for oxidative stress-related diseases. Collectively, the data support the significance and potential impact of CAP as a therapeutic agent for oxidative stress-related conditions.

      Weaknesses:

      While the study provides valuable insights into the molecular mechanisms and in vivo effects of CAP, further clinical studies are needed to validate its efficacy and safety in human subjects. The study primarily focuses on the acute effects of CAP on ethanol-induced gastric mucosa damage. Long-term studies are necessary to assess the sustained therapeutic effects and potential side effects of CAP treatment.

      Furthermore, the study primarily focuses on the interaction between CAP and the KEAP1-NRF2 axis in the context of ethanol-induced gastric mucosa damage. It may be beneficial to explore the broader effects of CAP on other pathways or conditions related to oxidative stress. CAP has been known for its interaction with the Transient Receptor Potential Vanilloid type 1 (TRPV1) channel and subsequent NRF2 signaling pathway activation. Those receptors are also expressed within the gastric mucosa and could potentially cross-react with CAP leading to the observed outcome. Including experiments to investigate this route of activation could strengthen the present study.

      While the design of CAP nanoparticles is innovative, further research is needed to optimize the nanoparticle formulation for enhanced efficacy and targeted delivery to specific tissues.

      Addressing these weaknesses through additional research and clinical trials can strengthen the validity and applicability of CAP as a therapeutic agent for oxidative stress-related conditions.

    1. eLife assessment

      This study presents useful findings about daily rhythm changes of the Drosophila melanogaster adult gut metabolome, which is shown to be dependent on the fly microbiota, diet, and genotype. The phenomena observed are supported by solid experimental evidence, however, there are limitations regarding the analysis and a deeper interpretation of results would improve the manuscript. An absence of mechanistic functional investigation limits the power of the proposed conclusions. The experiments are currently incomplete as the effect of food intake timing was not directly addressed by measuring the quantity and timing of food consumption. The authors should strongly consider including the model organism used in the study in the title of the manuscript to reflect the work. This study will be of interest to physiologists of circadian biology and nutrition.

    2. Reviewer #1 (Public Review):

      The authors build on their previous study that showed the midgut microbiome does not oscillate in Drosophila. Here, they focus on metabolites and find that these rhythms are in fact microbiome-dependent. Tests of time-restricted feeding, a clock gene mutant, and diet reveal additional regulatory roles for factors that dictate the timing and rhythmicity of metabolites. The study is well-written and straightforward, adding to a growing body of literature that shows the time of food consumption affects microbial metabolism which in turn could affect the host.

      Some additional questions and considerations remain:

      (1) The main finding that the microbiome promotes metabolite rhythms is very interesting. Which microbiota are likely to be responsible for these effects? The authors' previous work in this area may shed light on this question. Are specific microbiota linked to some of the metabolic pathways investigated in Figure 5?

      (2) TF increases the number of rhythmic metabolites in both microbiome-containing and abiotic flies in Figure 1. This is somewhat surprising given that flies typically eat during the daytime rather than at night, very similar to TF conditions. I would have assumed that in a clock-functioning animal, the effect of restricting food availability should not make a huge difference in the time of food consumption, and thus downstream impacts on metabolism and microbiome. Can the authors measure food intake directly to compare the ad-lib vs TF flies to see if there are changes in food intake? Would restricting feeding to other times of day shift the timing of metabolites accordingly?

      (3) In Figure 2, per loss of function reveals a change in the phase of rhythmic metabolites. In addition, the effect of the microbiome on these is very different: the per mutants show increased numbers of rhythmic metabolites when the microbiome is absent, unlike the controls. Is it possible that these changes are due to altered daily feeding rhythms in per mutants? Testing the time and amount of food consumed by the per mutant flies would address this question. Would TF in the per mutants rescue their metabolite rhythms and make them resemble clock-functioning controls?

      (4) The calorie content of each diet (normal vs high-protein vs high-sugar) is different. The possibility of a calorie effect rather than a difference in nutrition (protein/carbohydrate) should be discussed. Another issue worth considering is the effect of high protein/sugar on the microbiome itself. While the microbiome doesn't seem to affect rhythms in the high-protein diet, the high-sugar diet seems highly microbiome-dependent in Supplementary Fig 8C vs D. Does the diet impact the microbiome and thus metabolite rhythmicity downstream?

      (5) It would be good if a supplementary table was provided outlining the specific metabolites that are shown in the radial plots. It is not clear if the rhythms shown in the figures refer to the same metabolites peaking at the same time, or rather the overall abundance of completely different metabolites. This information would be useful for future research in this area.

    3. Reviewer #2 (Public Review):

      Summary:

      The paper addresses several factors that influence daily changes in concentration of metabolites in the Drosophila melanogaster gut. The authors describe metabolomes extracted from fly guts at four time-points during a 24-hour period, comparing profiles of primary metabolites, lipids, and biogenic amines. The study elucidates that the percentage of metabolites that exhibit a circadian cycle, peak phases of their increased appearance, and the cycling amplitude depends on the combination of factors (microbiome status, composition or timing of the diet, circadian clock genotype). Multiple general conclusions based on the data obtained with modern metabolomics techniques are provided in each part of the article. Descriptive analysis of the data supports the finding that microbiome increases the number of metabolites for which concentration oscillates during the day period. Results of the experiments show that timed feeding significantly enhanced metabolite cycling and changed its phase regardless of the presence of a microbiome. The authors suggest that the host circadian rhythm modifies both metabolite cycling period and the number of such metabolites.

      Strengths:

      The obvious strength of the study is the data on circadian cycling of the detected 843, 4510, and 4330 total primary metabolites, lipids, and biogenic amines respectively in iso31 flies and 623, 2245, and 2791 respective metabolites in per01 mutants. The comparison of the abundance of these metabolites, their cycling phase, and the ratio of cycling/non-cycling metabolites is well described and illustrated. The conditions tested represent significant experimental interest for contemporary chronobiology: with/without microbiota, wild-type/mutant period gene, ad libitum/time-restricted feeding, and high-sugar/high-protein diet. The authors conclude that the complex interaction between these factors exists, and some metabolic implications of combinations of these factors can be perceived as reminiscent of metabolic implications of another combination ("...the microbiome and time-restricted feeding paradigms can compensate for each other, suggesting that different strategies can be leveraged to serve organismal health"). Enrichment analysis of cycling metabolites leads to an interesting suggestion that oscillation of metabolites related to amino acids is promoted by the absence of microbiota, alteration of circadian clock, and time-restricted feeding. In contrast, association with microbiota induces oscillation of alpha-linolenic acid-related metabolites. These results provide the initial step for hypothesising about functional explanations of the uncovered observations.

      Weaknesses:

      Among the weaknesses of the study, one might point out overly general interpretations of the results, which propose hypothetical conclusions without mechanistic proof. The quantitative indices analysed are clearly of interest, but they are neither self-explanatory nor exhaustive. More specific biological examples would add valuable insights into the results of this study, making the conclusions clearer. More specific comments on the weaknesses are listed below:

      (1) The criterion of the percentage of cycling metabolites used for comparisons has its own limitations. It is not clear, whether the cycling metabolites are the same in the guts with/without microbiota, or whether there are totally different groups of metabolites that cycle in each condition. GO enrichment analysis gives only a partial assessment, but is still not quantitative enough.

      (2) The period of cycling data is based on only 4 time points during 24 hours in 4 replicates (>200 guts per replicate) on the fifth day of the experiment (10-12-day-old adults). It does not convincingly prove that these metabolites cycle the following days or more finely within the day. Moreover, it is not clear how peaks in polar histogram plots were detected in between the timepoints of ZT0, ZT6, ZT12, and ZT18.

      (3) Average expression and amplitude are analysed for groups of many metabolites, whereas the data on distinct metabolites is hidden behind these general comparisons. This kind of loss of information can be misleading, making interpretation of the mentioned parameters quite complicated for authors and their readers. Probably more particular datasets can be extracted to be discussed more thoroughly, rather than those general descriptions.

      (4) The metabolites' preservation is crucial for this type of analysis, and both proper sampling plus normalisation require more attention. More details about measures taken to avoid different degradation rates, different sizes of intestines, and different amounts of microbes inside them will be beneficial for data interpretation.

      (5) The data in the article describes formal phenomena, not directly connected with organism physiology. The parameters discussed obviously depend on the behaviour of flies. Food consumption, sleep, and locomotor activity could be additionally taken into account.

      (6) Division of metabolites into three classes limits functional discussion of found differences. Since the enrichment analysis provided insights into groups of metabolites of particular interest (for example, amino acid metabolism), a comparison of their cycling characteristics can be shown separately and discussed.
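
      To make comment (2) above concrete, the sketch below shows one common way in which a peak phase lying between sampled timepoints can be estimated: a single-harmonic cosinor fit. This is an illustration only; the review does not say which method the authors used, and the function name `cosinor_peak`, the fixed 24-hour period, and the synthetic values are all assumptions introduced here. The closed-form estimates are valid only for evenly spaced samples that span exactly one period, as ZT0, ZT6, ZT12, and ZT18 do.

```python
import math

def cosinor_peak(times_h, values, period_h=24.0):
    """Fit y = mesor + a*cos(w*t) + b*sin(w*t) for evenly spaced samples
    covering one full period, and return (mesor, amplitude, peak_time_h)."""
    n = len(values)
    w = 2.0 * math.pi / period_h
    # For evenly spaced samples over one period the regressors are orthogonal,
    # so the least-squares coefficients reduce to these simple sums.
    mesor = sum(values) / n
    a = 2.0 / n * sum(y * math.cos(w * t) for t, y in zip(times_h, values))
    b = 2.0 / n * sum(y * math.sin(w * t) for t, y in zip(times_h, values))
    amplitude = math.hypot(a, b)
    # The fitted curve peaks where w*t equals atan2(b, a).
    peak_time_h = (math.atan2(b, a) / w) % period_h
    return mesor, amplitude, peak_time_h

# Hypothetical metabolite whose true peak (ZT9) falls between two sampled timepoints.
zt = [0, 6, 12, 18]
synthetic = [10 + 2 * math.cos(2 * math.pi * (t - 9) / 24) for t in zt]
mesor, amplitude, peak = cosinor_peak(zt, synthetic)
print(round(mesor, 3), round(amplitude, 3), round(peak, 3))
```

      With four samples per cycle, such a fit has three free parameters and leaves a single residual degree of freedom per series, so an interpolated peak time is only as reliable as the assumption that the profile is sinusoidal with a 24-hour period.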

    4. Reviewer #3 (Public Review):

      Summary:

      The authors sought to quantify the influence of the gut microbiome on metabolite cycling in a Drosophila model with extensive metabolomic profiling over a 24-hour period. The major strength of the work is the production of a large dataset of metabolites that can be the basis for hypothesis generation for more specific experiments. There are several weaknesses that make the conclusions difficult to evaluate. Additional experiments to quantify food intake over time will be required to determine the direct role of the microbiome in metabolite cycling.

      Strengths:

      An extensive metabolomic dataset was collected with treatments designed to determine the influence of the gut microbiome on metabolite circadian cycling.

      Weaknesses:

      (1) The major strength of this study is the extensive metabolomic data, but as far as I can tell, the raw data is not made publicly available to the community. The presentation of highly processed data in the figures further underscores the need to provide the raw data (see comment 3).

      (2) Feeding times heavily influence the metabolome. The authors use timed feeding to constrain when flies can eat, but there is no measurement of how much they ate and when. That needs to be addressed.

      Since food is the major source of metabolites, the timing of feeding needs to be measured for each of the treatment groups. In the previous paper (Zhang et al 2023 PNAS), the feeding activity of groups of 4 male flies was measured for the wildtype flies. That is not sufficient to determine to what extent feeding controls the metabolic profile of the flies. Additionally, timed feeding opportunities do not equate to the precise time of feeding. They may also result in dietary restriction, leading to the loss of stress resistance in the TF flies. The authors need to measure food consumption over time in the exact conditions under which transcriptomic and metabolomic cycling are measured. I suggest using the EX-Q assay as it is much less effort than the CAFE assay and can be more easily adapted to the rearing conditions of the experiments.

      (3) The data on the cycling of metabolites is presented in a heavily analyzed form, making it difficult to evaluate the validity of the findings, particularly when a lack of cycling is detected. The normalization to calculate the change in cycling due to particular treatments is particularly unclear and makes me question whether it is affecting the conclusions. More presentation of the raw data to show when cycling is occurring versus not would help address this concern, as would a more thorough explanation of how the normalization is calculated - the brief description in the methods is not sufficient.

      For instance, the authors state that "timed feeding had less effect on flies containing a microbiome relative to sterile flies." One alternative interpretation of that result is that both treatments are cycling but that the normalization of one treatment to the other removes the apparent effect. This concern should be straightforward to address by showing the raw data for individual metabolites rather than the group.
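
      The alternative interpretation raised in comment (3) can be illustrated with a toy calculation. The sketch below assumes that the normalization amounts to dividing one treatment by the other at each timepoint, which is a guess made for illustration since the methods are described only briefly; all abundances are hypothetical.

```python
import math

def peak_to_trough(series):
    """Crude measure of cycling: difference between the highest and lowest value."""
    return max(series) - min(series)

zt = [0, 6, 12, 18]
# Hypothetical abundances: both conditions cycle with the same phase and the
# same relative amplitude, but at different overall levels.
sterile = [100 * (1 + 0.5 * math.cos(2 * math.pi * (t - 6) / 24)) for t in zt]
microbiome = [40 * (1 + 0.5 * math.cos(2 * math.pi * (t - 6) / 24)) for t in zt]

# Assumed normalization: express one treatment relative to the other.
ratio = [m / s for m, s in zip(microbiome, sterile)]

print(peak_to_trough(sterile), peak_to_trough(microbiome), peak_to_trough(ratio))
```

      In this toy case each raw series cycles clearly, yet the normalized series is essentially flat, which is why plotting the raw profiles of individual metabolites would settle the question.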

    1. eLife assessment

      This is a valuable paper that uses super-resolution microscopy to show the nanoclustering of the Nipah virus fusion protein on cell and viral membranes. Some of the conclusions regarding the clustering of viral fusion proteins are supported by solid biochemical and super-resolution imaging data, while other conclusions, such as the significance for viral fusion mechanisms, are not fully supported by the data provided.

    2. Reviewer #1 (Public Review):

      Summary:

      In this work by Wang et al., the authors use single-molecule super-resolution microscopy together with biochemical assays to quantify the organization of Nipah virus fusion protein F (NiV-F) on cell and viral membranes. They find that these proteins form nanoscale clusters that favor membrane fusion activation, and that the physical parameters of these clusters are unaffected by protein expression level and endosomal cleavage. Furthermore, they find that the cluster organization is affected by mutations in the trimer interface on the NiV-F ectodomain and the putative oligomerization motif on the transmembrane domain, and that the clusters are stabilized by interactions among NiV-F, the AP2-complex, and the clathrin coat assembly. This work improves our understanding of the NiV fusion machinery, which may also have implications for our understanding of the function of other viruses.

      Strengths:

      The conclusions of this paper are well-supported by the presented data. This study sheds light on the activation mechanisms underlying the NiV fusion machinery.

      Weaknesses:

      The authors provide limited details of the convolutional neural network they developed in this work. Even though the custom code is made available, a description of the network and specifications of how it was used in this work would aid readers in assessing its performance and applicability. The same holds for the custom-written OPTICS algorithm. Furthermore, limited details are provided for the imaging setup, oxygen scavenging buffer, and analysis of the single-molecule data, which limits reproducibility in other laboratories. The claim of 10 nm resolution is not backed up by data and seems low given the imaging conditions and fluorophores used. Fourier Ring Correlation analysis would have validated this claim. If the authors refer to localization precision rather than resolution, then this should be specified and appropriate data provided to support this claim.

    3. Reviewer #2 (Public Review):

      Summary:

      In this manuscript, Wang and co-workers employ single molecule light microscopy (SMLM) to detect Nipah virus Fusion protein (NiV-F) on the surface of cells. They corroborate that these glycoproteins form microclusters (previously seen and characterized together with the NiV-G and Nipah Matrix protein by Liu and co-workers (2018), also with super-resolution light microscopy). As also seen by Liu and co-workers, the authors show that neither the level of expression of NiV-F nor endosomal cleavage alters the identity of these microclusters. Moreover, mutations in the transmembrane domain or the hexamer-of-trimer interface seem to have a mild effect on the size of the clusters that the authors quantified. Importantly, it has also been shown that these particles tend to cluster in Nipah VLPs.

      Strengths:

      The authors have tried to perform SMLM in single VLPs and have shown partially the importance of NiV-F clustering.

      Weaknesses:

      The labelling strategy for the NiV-F is not sufficiently explained. The use of a FLAG tag in the extracellular domain should be validated and compared with the unlabelled WT NiV-F when expressed in functional pseudoviruses (for example HIV-1 based particles decorated with NiV-F). This experiment should also be carried out for both infection and fusion (including BlaM-Vpr as a readout for fusion). I would also suggest running a time-of-addition BlaM experiment to understand how this particular labelling strategy affects single virion fusion as compared to the WT. It would also be very important to compare the FLAG labelling approach with recent advances in the field (for instance incorporating noncanonical amino acids (ncAAs) into NiV-F by amber stop-codon suppression, followed by click chemistry).

      The correlation between the existence of microclusters of a particular size and their functionality is missing. Only cell-cell fusion assays are shown in supplementary figures, and clearly, single virus entry and fusion cannot be compared with the biophysics of cell-cell fusion. Not only is the environment completely different, but membrane curvature and the number of NiV-F molecules also vary drastically. Therefore, specific fusion assays (either single virus tracking and/or time-of-addition BlaM kinetics with functional pseudoviruses) are needed to substantiate this claim.

      The authors also claim they could not characterize the number of NiV-F particles per cluster. Another technique such as number and brightness (Digman et al., 2008) could support the current SMLM data and identify the number of single molecules per cluster. This technique also does not require a complex microscopy setup. I suggest they perform either confocal fluorescence fluctuation spectroscopy or TIRF-based number and brightness analysis to validate the clusters and identify how many molecules are present in them. It is also not clear how many cells the authors used for their statistics (at least 30-50 cells should be used, and the number of blinking events should not be taken as the sample size). I hope the authors are not considering only a single cell to run their stats... The differences between the mutants and the NiV-F are minor even if their statistical analyses give a difference (they should average the number and size of the clusters per cell for a total of 30-50 cells with experiments performed at least in three different cells following the same protocol). They should also compare the level of expression (with the number of molecules per cell provided by number and brightness) with the total number of clusters. Overall, it seems that the authors have only evaluated a very low number of cells.
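
      For readers unfamiliar with the number and brightness analysis suggested here, a minimal sketch of its core calculation follows. It uses the standard moment definitions of apparent brightness (variance divided by mean) and apparent number (mean squared divided by variance) for the intensity time series of a single pixel. The function name and the photon counts are hypothetical, and a real analysis would additionally need detector calibration and corrections for bleaching and immobile fractions, which are not shown.

```python
from statistics import mean, pvariance

def number_and_brightness(counts):
    """Return (apparent_number, apparent_brightness) for one pixel's
    intensity time series, using the first two moments of the counts."""
    k = mean(counts)          # average intensity over the frames
    var = pvariance(counts)   # variance of the intensity over the frames
    apparent_brightness = var / k
    apparent_number = k * k / var
    return apparent_number, apparent_brightness

# Hypothetical photon counts for one pixel across eight frames.
pixel_counts = [2, 6, 2, 6, 0, 8, 4, 4]
n_apparent, b_apparent = number_and_brightness(pixel_counts)
print(n_apparent, b_apparent)
```

      Comparing the apparent brightness of a cluster with that of a monomeric reference measured under the same conditions is what would allow an estimate of molecules per cluster.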

      The same applies to the VLP assay. I assume the authors have only taken VLPs expressing both NiV-M and NiV-F (and NiV-G). But even if this is not clearly stated I would urge the authors to show how many viruses were compared per condition (normally I would expect 300 particles per condition coming from three independent experiments). As a negative control to evaluate the cluster effect I would mix the different conditions. Clearly you have clusters with all conditions and the differences in clustering depending on each condition are minimal. Therefore you need to increase the n for all experiments.

    4. Reviewer #3 (Public Review):

      Summary:

      The manuscript by Wang and colleagues describes single molecule localization microscopy to quantify the distribution and organization of Nipah virus F expressed on cells and on virus-like particles. Notably the crystal structure of F indicated hexameric assemblies of F trimers. The authors propose that F clustering favors membrane fusion.

      Strengths:

      The manuscript provides solid data on imaging of F clustering with the main findings of:

      - F clusters are independent of expression levels
      - Proteolytic cleavage does not affect F clustering
      - Mutations that have been reported to affect the hexamer interface reduce clustering on cells and its distribution on VLPs
      - F nanoclusters are stabilized by AP

      Weaknesses:

      The relationship between F clustering and fusion is per se interesting, but looking at F clusters on the plasma membrane does not exclude that F clustering occurs for budding. Many viral glycoproteins cluster at the plasma membrane to generate micro domains for budding. This does not exclude that these clusters include hexamer assemblies or clustering requires hexamer assemblies.

      Assuming that the clusters are important for entry, hexameric clusters are not unique to Nipah virus F. Similar hexameric clusters have been described for the HEF on influenza virus C particles (Halldorsson et al 2021) and env organization on Foamy virus particles (Effantin et al 2016), both with specific interactions between trimers. What is the organization of F on Nipah virus particles? If F requires to be hexameric for entry, this should be easily imaged by EM on infectious or inactivated virus particles.

      AP stabilization of the F clusters is curious if the clusters are solely required for entry? Virus entry does not recruit the clathrin machinery. Is it possible that F clusters are endocytosed in the absence of budding?

      Other points:

      Fig. 3: Some of the V108D and L53D clusters look similar in size to wt clusters. It seems that the interaction is important but not absolutely essential? Would a double mutant abrogate clustering completely?

      Fig. 4: The distribution of F on VLPs should be confirmed by cryoEM analyses. This would also confirm the symmetry of the clusters.

      The manuscript by Chernomordik et al. JBC 2004 showed that influenza HA outside the direct contact zone affects fusion, which could be further elaborated in the context of F clusters and the fusion mechanism.

    1. Reviewer #1 (Public Review):

      Summary:

      The study "Endogenous oligomer formation underlies DVL2 condensates and promotes Wnt/β-catenin signaling" by Senem Ntourmas et al. contributes to the understanding of phase separation in Dishevelled (DVL) proteins, specifically focusing on DVL2. It builds upon existing research by investigating the endogenous complexes of DVL2 using ultracentrifugation and contrasting them with DVL1 and DVL3 behavior. The study identifies a DVL2-specific region involved in condensate formation and introduces the "two-step" concept of DVL2 condensate formation, enriching the field's knowledge.

      Strengths:

      A notable strength of this study is the validation of endogenous DVL2 complexes, providing insights into its behavior compared to DVL1 and DVL3. The functional validation of the DVL C-terminus (here termed conserved domain 2, CD2) and the identification of DVL2-specific regions (here termed LCR4) involved in condensate formation are significant contributions that complement the current knowledge on the importance of the DVL DIX domain, the DEP domain, and the intrinsically disordered regions between the DIX and PDZ domains. Additionally, the introduction of the concept where oligomerization (step 1) precedes condensate formation (step 2) is an interesting hypothesis, which can be further experimentally challenged in the future.

      Weaknesses:

      However, the applicability of the findings to full-length DVL2 protein, and hence their physiological relevance, is limited. This is mostly because the authors depend almost completely on a set of DVL2 mutants that lack (i) the DEP domain and (ii) the nuclear export signal (NES). These variants fail to establish DEP domain-mediated interactions, including those with FZD receptors. Of note, the DEP domain itself represents a dimerization/tetramerization interface, which could affect the protein condensate formation of these mutants. Possibly even more importantly, the mutants used localize to the nucleus, which has different biochemical and biophysical properties from the cytoplasm, where DVL typically resides, and this in turn affects condensate formation. In addition, most of the DVL binding partners, including relevant kinases that were reported to affect protein condensate formation, are missing in the nucleus.

      Second, the use of an overexpression system, while suitable for comparing DVL2 protein condensate features, falls short in functional assays. The study could benefit from employing established "rescue systems" using DVL1/2/3 knockout cells and re-expression of DVL variants for more robust functional assessments.

      Furthermore, the discussion and introduction overlook some essential aspects of DVL biology. One such example is the importance of the open/closed conformation of DVL and its effects on DVL phase separation and activity. In the context of this study, it is important to say that this conformational plasticity is mediated by the DVL C-terminus (CD2 in this study). The second example is the reported roles of DVL1 and DVL3, which can both mediate the Wnt3a signal. How can this be interpreted when DVL1 and DVL3 lack LCR4 and still form condensates?

      In order to increase the physiological relevance of the study, I would recommend analyzing several key mutants in the context of the full-length DVL2 protein using the rescue/complementation system. Further, a more thorough discussion and connections with the existing literature on DVL protein condensates/puncta/LLPS can improve the impact of the study.

    2. eLife assessment

      This valuable study contributes to the understanding of phase separation in Dishevelled (DVL) proteins, by investigating the endogenous complexes of DVL2 using ultracentrifugation and contrasting them with DVL1 and DVL3 behavior and the functional validation of the DVL2 intrinsically disordered regions mediating the protein condensate. The study is, however, incomplete due to the lack of several controls and its focus on overexpression and mutants lacking key domains.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors aimed to identify which regions of DVL2 contribute to its endogenous/basal clustering, as well as the relevance of such domains to condensate/phase separation and WNT activation.

      Strengths:

      A strength of the study is the focus on endogenous DVL2 to set up the research questions, as well as the incorporation of various techniques to tackle them. I also found it quite interesting that DVL2-CFR addition to DVL1 increased its MW in density gradients.

      Weaknesses:

      I think that several of the approaches of the manuscript are subpar to achieve the goals and/or support several of the conclusions. For example:

      (1) Although endogenous DVL2 indeed seems to form complexes (Figure 1A), neither the number of proteins involved nor whether those are homo-complexes can be determined with a density gradient. Super-resolution imaging or structural analyses are needed to support these claims.

      (2) Follow-up analyses of the relevance of the DVL2 domains solely rely on overexpressed proteins. However, there were previous questions arising from o/e studies that prompted the focus on endogenous, physiologically relevant DVL interactions, clustering, and condensate formation. Although the title, conclusions, and relevance all point to the importance of this study for understanding endogenous complexes, only Figures 1A and B deal with endogenous DVL2.

      (3) Mutants lacking activity/complex formation, e.g. DVL2_1-418, may need further validation. For instance, DVL2_1-506 (same mutant but with DEP) seems to form condensates and it is functional in WNT signalling (King et al., 20223). These differences could be caused by the lack of DEP domain in this particular construct and/or folding differences.

      (4) The key mutants, DeltaCFR and VV/FF, only show mild phenotypes. The authors' results suggest that these regions contribute to but are not necessary for (i) complex formation (density gradient, Figures 7A and B), (ii) condensate formation (Figures 7C and D), and (iii) WNT activity (Figure 7E). Of note, Figure 7C shows examples of the mutants with no condensates, while the quantification indicates that 50% of the cells do have condensates.

      (5) Most of the o/e analyses (including all reporter assays) should be performed in DVL1-3 KO cells in order to explore specifically the behaviour of the investigated mutants.

      (6) How comparable are condensates found in the cytoplasm (usually for wt DVL) with those located in the nucleus (DEP mutants)?

      Several studies in the last two decades have analysed the relevance of DVL homo- and hetero-clustering by relying on overexpressed proteins. Recent studies also explored the possibility of DVL undergoing liquid-liquid phase separation following similar principles. As highlighted by the authors in the introduction, there is a need to understand DVL dynamics under endogenous/physiological conditions. Recent super-resolution studies aimed at that question by characterising endogenously edited DVL2. The authors seemed to aim in the same direction with their initial findings (Figure 1A) but quickly moved to o/e proteins without going back to the initial question. This reviewer thinks that, to support their conclusions and make progress on this important question, the authors should introduce the relevant mutations in the endogenous locus (e.g. by Cas9+ donor template encoding the required 3' exons, as done by others before for WNT components, including DVL2) and determine their impact on the above-indicated processes.

    1. eLife assessment

      This study presents important new insights linking obesity to kidney disease using a Drosophila model. A series of compelling experiments demonstrated that a high-fat diet induces the excretion of a leptin-like JAK-STAT ligand from the fat body, driving the adipose-nephrocyte axis through activated JAK-STAT signaling and subsequently causing a functional defect in nephrocytes. While the combination of genetic tools and pharmacological intervention provides solid data and confirms the mechanistic link, the phenotypic analysis is restricted to tracer endocytosis and would benefit from immunofluorescence studies and higher animal numbers.

    2. Reviewer #1 (Public Review):

      Summary:

      Zhao and colleagues employ Drosophila nephrocytes as a model to investigate the effects of a high-fat diet on these podocyte-like cells. Through a highly focused analysis, they initially confirm previous research in their hands demonstrating impaired nephrocyte function and move on to observe the mislocalization of a slit diaphragm-associated protein (pyd). Employing a reporter construct, they identify the activation of the JAK/STAT signaling pathway in nephrocytes. Subsequently, the authors demonstrate the involvement of this pathway in nephrocyte function from multiple angles, using a gain-of-function construct, silencing of an inhibitor, and ectopic overexpression of a ligand. Silencing the effector Stat92E via RNAi or inhibiting JAK/STAT with Methotrexate effectively restored impaired nephrocyte function induced by a high-fat diet, while showing no impact under normal dietary conditions.

      Strengths:

      The findings establish a link between JAK/STAT activity and the impact of a high-fat diet on nephrocytes. This nicely underscores the importance of organ crosstalk for nephrocytes and supports a potential role for JAK/STAT in diabetic nephropathy, as previously suggested by other models.

      Weaknesses:

      The analysis is overly reliant on tracer endocytosis and single lines. Immunofluorescence of slit diaphragm proteins would provide a more specific assessment of the phenotypes.

    3. Reviewer #2 (Public Review):

      Summary:

      In their manuscript, Zhao et al. describe a link between a high-fat diet and JAK-STAT pathway activation in nephrocytes. Nephrocytes are the homologs of mammalian podocytes, and it has previously been shown that metabolic syndrome and obesity are associated with worse outcomes in chronic kidney disease. A study from 2021 (Lubojemska et al.) already confirmed a severe nephrocyte phenotype upon feeding Drosophila a high-fat diet and also linked lipid overflow to nephrocyte dysfunction by expressing adipose triglyceride lipase in the fat body. In this study, the authors identify a second pathway and mechanism by which lipid dysregulation impacts nephrocyte function. In detail, they show activation of JAK-STAT signaling in nephrocytes upon feeding flies a high-fat diet, which was induced by expression of Upd2 (a leptin-like hormone) in the fat body, the adipose tissue of Drosophila. Further, they show that genetic and pharmacological interventions can reduce JAK-STAT activation and thereby prevent the nephrocyte phenotype in the high-fat diet model.

      Strengths:

      The strength of this study is the combination of genetic tools and pharmacological intervention to confirm a mechanistic link between the fat body/adipose tissue and nephrocytes. Inter-organ communication is crucial in the development of several diseases, but the underlying mechanisms are only poorly understood. Using Drosophila, it is possible to investigate several players of one pathway, here JAK-STAT. This was done by investigating the functional role of Hop, Socs36E, and Stat92E in nephrocytes, and was combined with feeding a high-fat diet to assess restoration of nephrocyte function by inhibiting JAK-STAT signaling. A translational angle was added by inhibiting JAK-STAT signaling with methotrexate, which also resulted in attenuated nephrocyte dysfunction. Expression of the leptin-like hormone upd2 in the fat body is a good approach to studying inter-organ communication and the impact of other organs/tissues on nephrocyte function, and expands their findings from nephrocyte function towards whole animal physiology.

      Weaknesses:

      Although the general findings of this study are of great interest, there are some weaknesses in the study which should be addressed. Overall, the number of flies investigated for the majority of the experiments is very low (6 flies), and it is not clear whether the flies used are from independent experiments to exclude problems with food/diet. For the analysis, the mean value per fly should be calculated, as one fly can be considered a biological replicate, whereas individual cells cannot. By increasing the number of flies investigated, the statistical analysis will become more solid. In addition, the morphological assessment is rather preliminary, as it only uses a Pyd antibody. Duf or Sns should be visualized as well, and the investigation of the different transgenic fly strains studying the importance of JAK-STAT signaling in nephrocytes also needs to include a morphological assessment. Moreover, the expected effect of feeding a high-fat diet on nephrocytes needs to be shown (e.g. by lipid droplet formation), and whether upd2 is actually increased here should also be assessed. The time points of assessment vary between 1, 3, and 7 days and should be consistent throughout the study, or the authors should describe why they use different time points.
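
      The recommendation to treat each fly, rather than each cell, as the biological replicate can be sketched as follows. The function name, the fly labels, and the intensity values are hypothetical and serve only to show the aggregation step: cells are first averaged within a fly, and group statistics are then computed on the per-fly means.

```python
from statistics import mean, stdev

def per_fly_summary(cells_by_fly):
    """Average cell-level measurements within each fly, then summarize
    across flies so that n equals the number of flies, not cells."""
    fly_means = {fly: mean(cells) for fly, cells in cells_by_fly.items()}
    values = list(fly_means.values())
    return fly_means, mean(values), stdev(values), len(values)

# Hypothetical tracer-uptake intensities (arbitrary units), several nephrocytes per fly.
control = {
    "fly1": [10.0, 12.0, 11.0],
    "fly2": [9.0, 9.0],
    "fly3": [14.0, 13.0, 12.0, 13.0],
}
fly_means, group_mean, group_sd, n_flies = per_fly_summary(control)
print(fly_means, group_mean, group_sd, n_flies)
```

      Here nine cells collapse to n = 3, which makes explicit why six flies per condition leaves little statistical power regardless of how many nephrocytes are imaged.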

    1. Reviewer #2 (Public Review):

      This work deals with a very difficult physical problem: relating the assembly of building blocks on a molecular scale to the appearance of large, macroscopic assemblies. This problem is particularly difficult to treat because of the large number of units involved, and because of the complex way in which these units (monomers) interact with each other and with the solvent. In order to make the problem tractable, the authors resort to a number of approximations. Among these is the assumption that the system is spatially homogeneous, i.e., that its features are the same in all regions of space. The homogeneity assumption may not hold in biologically relevant systems such as cells, where the behavior close to the cell membrane may differ strongly from that in the bulk. As a result, this hypothesis calls for cautious consideration and interpretation of the results of this work. Another notable simplification introduced by the authors is the assumption that the system can only follow two possible behaviors. In the first, each monomer interacts equally with the solvent, no matter the size of the cluster of which it is part. In the second, monomers in the bulk of a cluster and monomers at the assembly boundary interact with the solvent in different ways. These two cases are considered not only because they simplify the problem, but also because they are inspired by biologically relevant proteins.

      With these simplifications, the authors trace the phase diagram of the system, characterizing its phases for different fractions of the volume occupied by the monomers and solvent, and for different values of the temperature. The results qualitatively reproduce some features observed in recent experiments, such as an anomalous distribution of cluster sizes below the system saturation threshold, and the gelation of condensed phases above such threshold.

    2. eLife assessment

      The authors present an important theoretical framework that describes the interplay between liquid-liquid phase separation and protein aggregation within a mean-field model. This work will be of high interest to the biophysics and molecular biology communities, as it will help to understand and analyse assembly within biomolecular condensates in cells or in vitro. Major strengths of this convincing work are the consideration of aggregates with various dimensionality and the possibility of protein gelation. A relative weakness is the lack of intuitive interpretation of some of the results; the work could be made more accessible to non-experts.

    3. Reviewer #1 (Public Review):

      Summary:

      The authors present a mean-field model that describes the interplay between (protein) aggregation and phase separation. Different classes of interaction complexity and aggregate dimensionality are considered, both in calculations concerning (equilibrium) phase behavior and kinetics of assembly formation.

      Strengths:

      The present work is, although purely theoretical, of high interest to understanding biological processes that occur as a result of a coupling between protein aggregation and phase separation. Of course, such processes are abundant, in the living cell as well as in in-vitro experiments. I appreciate the consideration of aggregates with various dimensionality, as well as the categorization into different "interaction classes", together with the mentioning of experimental observations from biology. The model is convincing and underlines the complexity associated with the distribution of proteins across phases and aggregates in the living cell.

      Weaknesses:

      There are a few minor weaknesses.

    4. Reviewer #3 (Public Review):

      Summary:

      The authors combine classical theories of phase separation and self-assembly to establish a framework for explaining the coupling between the two phenomena in the context of protein assemblies and condensates. By starting from a mean-field free energy for monomers and assemblies immersed in solvent and imposing conditions of equilibrium, the authors derive phase diagrams indicating how assemblies partition into different condensed phases as temperature and the total volume fraction of proteins are varied. They find that phase separation can promote assembly within the protein-rich phase, providing a potential mechanism for spatial control of assembly. They extend their theory to account for the possibility of gelation. They also create a theory for the kinetics of self-assembly within phase separated systems, predicting how assembly size distributions change with time within the different phases as well as how the volumes of the different phases change with time.

      Strengths:

      The theoretical framework that the authors present is an interesting marriage of classic theories of phase separation and self-assembly. Its simplicity should make it a powerful general tool for understanding the thermodynamics of assembly coupled to phase separation, and it should provide a useful framework for analyzing experiments on assembly within biomolecular condensates.

      The key advance over previous work is that the authors now account for how self-assembly can change the boundaries of the phase diagram.

      A second interesting point is the explicit theoretical consideration for the possibility that gelation (i.e. self-assembly into a macroscopic aggregate) could account for widely observed solidification of condensates. While this concept has been broadly discussed, to date I have yet to see a rigorous theoretical analysis of the possibility.

      The kinetic theory in sections 5 and 6 is also interesting as it extends on previous work by considering the kinetics of phase separation as well as those of self-assembly.

      Weaknesses:

      A key point the authors make about their theory is that, unlike previous research, it allows non-dilute limits to be studied. It is true that they consider gelation when the 3D assemblies become macroscopic. However, dilute solution theory assumptions seem to be embedded in many aspects of their theory, and it is not always clear where else the non-dilute limits are considered. Is it in the inter-species interaction \chi_{ij}? Why then do they never explore cases for which \chi_{ij} is nonzero in their analysis?

      The connection between this theory and biological systems is described in the introduction but lost along the main text. It would be very helpful to point out, for instance, that the presence of phase separation might induce aggregation of proteins. This point is described formally at the end of Section 3, but a more qualitative connection to biological systems would be very useful here.

      Building on the previous point, it would be helpful to give an intuitive sense of where the equations derived in the Appendices and presented in the main text come from and to spell out clear physical interpretations of the results. For example, it would be helpful to point out that Eq. 4 is a form of the law of mass action, familiar from introductory chemistry.
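      For orientation, the textbook law of mass action referred to above can be written as follows. This is a generic illustration for $i$ monomers assembling reversibly into an $i$-mer in an ideal dilute solution; it is not a reproduction of the manuscript's Eq. 4, and the symbols $c_1$, $c_i$, $K_i$ and $\Delta G_i^{0}$ are assumptions introduced here.

```latex
% Reversible assembly of i monomers A into an i-mer A_i:  i A <=> A_i
% c_1, c_i : concentrations of monomers and i-mers (in units of a standard concentration)
% \Delta G_i^0 : standard free energy change of forming one i-mer from i monomers
\begin{equation}
  \frac{c_i}{c_1^{\,i}} = K_i ,
  \qquad
  K_i = \exp\!\left(-\frac{\Delta G_i^{0}}{k_B T}\right)
\end{equation}
```

      Read this way, the equilibrium population of each assembly size is fixed by the monomer concentration raised to the power $i$ and by a Boltzmann factor of the assembly free energy.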

      It would be useful to better explain how the current work extends on existing previous work from these authors as well as others. Along these lines, closely related work by W. Jacobs and B. Rogers [O. Hedge et al. 2023, https://arxiv.org/abs/2301.06134; T. Li et al. 2023, https://arxiv.org/abs/2306.13198] should be cited in the introduction.

      The results discussed in the first paragraph of Section 3 on assembly size distributions in a homogeneous system are well-known from classic theories of self-assembly. This should be acknowledged and appropriate references should be added; see for instance Rev. Mod. Phys. 93, 025008 and Statistical Thermodynamics Of Surfaces, Interfaces, And Membranes by Sam Safran.

      Equation 14 for the kinetics of volume fractions is given with a reference to Bauermann et al 2022, but it should be accompanied by a better intuitive interpretation of its terms in the main text. In particular, how should one understand the third term in this equation? Why does the change in volume impact the change of volume fraction in this way?

      The discussion in the last paragraph of Section 6 should be clarified. How can the total amount of protein in both phases decrease? This would necessarily violate either mass or volume conservation. Also, the discussion of why the volume is non-monotonic in time is not clear.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Public Review):

      Summary:

      This paper provides a straightforward mechanism of how mycobacterial cAMP level is increased under stressful conditions and shows that the increase is important for the survival of the bacterium in animal hosts. The cAMP level is increased by decreasing the expression of an enzyme that degrades cAMP.

      We thank the reviewer for these extremely encouraging comments.

      Strengths:

      The paper shows that under different stresses the response regulator PhoP represses a phosphodiesterase (PDE) that degrades cAMP specifically. Identification of PhoP as a regulator of cAMP is significant progress in understanding Mtb pathogenesis, as increase in cAMP apparently increases bacterial survival upon infection. On the practical side, reduction of cAMP by increasing PDE can be a means to attenuate the growth of the bacilli. The results have wider implications since PhoP is implicated in controlling diverse mycobacterial stress responses and many bacterial pathogens modulate host cell cAMP level. The results here are straightforward, internally consistent, and of both theoretical and applied interests. The results also open considerable future work, especially how increases in cAMP level help to increase survival of the pathogen.

      Weaknesses:

      It is not clear whether PhoP-PDE Rv0805 is the only pathway to regulate cAMP level under stress.

      Reviewer 1 (Recommendations for the authors):

      (1) L.1: "maintenance of" or 'regulating'- I thought change in cAMP level upon stress is the whole point of the paper. Also, can replace "intracellular survival" with 'survival in host macrophages' if you want to be more specific.

      We agree with the reviewer, and therefore, we have now replaced “maintenance of” with “regulating cAMP level” in the title. However, we feel more comfortable with “intracellular survival” rather than being more specific with ‘survival in host macrophages’ as we have also shown animal experiments to demonstrate ‘in vivo’ effect in mice lung and spleen.

      (2) L.26: ---requires the bacterial virulence regulator –

      The suggested change has been made to the text.

      (3) L.30: Replace "phoP locus since the" with 'PhoP since this'. (The product, not the locus, is the regulator). The same comment for l.113.

      We agree with the reviewer. The suggested changes have been made to the text.

      (4) L.31: Change represtsor to repressor.

      We are sorry for the embarrassing spelling mistake. We have rectified the mistake in the revised version.

      (5) L.32: "hydrolytically degrades" or hydrolyses? (lytic and degrade sound like tautology). Same comment for l.117.

      We agree. The suggested change has been made to the text in both places of the revised manuscript.

      (6) L.35: I would also suggest changing "intra-mycobacterial" to 'intra bacterial' because you are talking about one bacterium here. The same change is recommended in l.29.

      Following reviewer’s recommendation, we have made the changes in the revised manuscript.

      (7) L.37: bacillus unless use of the plural form is the norm in the field.

      We agree. The suggested change has been made to the text.

      (8) L.43: Delete "intracellular" and change "intracellular" to host in l.44.

      The suggested changes have been made to the text.

      (9) L.66: --that a burst--

      We have corrected the mistake in the revised manuscript.

      (10) L.76: Receptor or receptor?

      We have corrected the mistake in the revised manuscript.

      (11) L.86: -- mechanisms of regulation of mycobacterial cAMP level. (homeostasis needs to be introduced first, and not used in the concluding statement for the first time).

      The suggested changes have been made to the text.

      (12) L.96: "essential" or 'a requirement'. (reduction is not the same as elimination)

      We understand the reviewer’s concern. However, several studies have independently established that phoPR remains an essential requirement for mycobacterial virulence.

      (13) L.97: Moreover, a mutant

      The suggested change has been made to the text.

      (14) L.113: --locus since PhoP has been –

      The suggested change has been made to the text.

      (15) L.119: mechanism or manner? (you are stating a fact, not a mechanism)

      We agree. We have now replaced ‘mechanism’ with ‘manner’ in the revised manuscript.

      (16) L.130: --lacking copies of both phoP and phoR (I am assuming you don't have two copies of each gene)

      We understand the reviewer’s concern. For better clarity, we have now clearly mentioned that the phoPR-KO mutant lacks both the single copies of phoP and phoR genes.

      (17) L.156: Indicate why GroEL2? - cells as another cytoplasmic protein, GroEL2 was also undetectable

      We have now mentioned it in the secretion experiments that mycobacterial cells did not undergo autolysis. To prove this point, we have used cytoplasmic GroEL2 as a marker protein. The absence of detectable GroEL2 in the culture filtrates (CFs) suggests absence of autolysis. To this end, we have modified the sentence in the revised manuscript (duplicated below):

      “Fig. 1C confirms absence of autolysis of mycobacterial cells as GroEL2, a cytoplasmic protein, was undetectable in the culture filtrates (CF).”

      (18) L.266: May delete "Together". Start with These data--, which would draw more attention to integrated view. In l.268-270, a reminder that intracellular pH is acidic in the normal course would enhance the physiological significance of the present results.

      We agree. We have made the suggested changes to the text. In view of the second comment of the reviewer, we have modified the text (duplicated below):

      “These data represent an integrated view of our results suggesting that PhoP-dependant repression of rv0805 regulates intra-mycobacterial cAMP level. In keeping with these results, activated PhoP under acidic pH conditions significantly represses rv0805, and intracellular mycobacteria most likely utilizes a higher level of cAMP to effectively mitigate stress for survival under hostile environment including acidic pH of the phagosome.”

      (19) L.272: Delete "and intracellular survival" (?) (I am assuming the survival is due to stress tolerance; also the section talks about stress only). No period in l.273.

      Following reviewer’s recommendations, the suggested changes have been made to the text.

      (20) L.295: Start the sentence thus: It appears that at least one of ---. (This would put more emphasis on the inference)

      We agree. We have now incorporated the recommended changes in the revised version.

      (21) L.301: No parenthesis.

      The parenthesis has been removed in the revised manuscript.

      (22) L.306: Together already implies these. Either delete Together (which I would prefer) or say 'Together, the results suggest that strains expressing wild type and mutant----properties, and the results are

      We agree. We have now deleted ‘Together’ in the revised manuscript.

      (23) L.311: These results support our view that higher---- (to avoid repetition of l.266)

      We agree. We have now incorporated the suggested change in the revised manuscript.

      (24) L.316: Using or with?

      We think “with” goes well with the statement.

      (25) L.329: Rephrase thus: Effect of intra-bacterial cAMP level on in vivo--

      The recommended change has been made to the text.

      (26) L.333: I would use ~, if you want to indicate about.

      We agree. We have now used ‘~’ in the revised version. Changes were incorporated in lines 328, 330 and 333 of the revised manuscript.

      (27) L.350: Change "somewhat functionally" to phenotypically?

      We thank the reviewer for this suggestion. We have changed “somewhat functionally” to “phenotypically” in the revised manuscript.

      (28) L.361: Change "is connected to" to 'regulates'.

      The suggested change has been made to the text.

      (29) L.365: ACs (to be parallel with PDEs)

      We agree. The suggested change has been made to the text.

      (30) L.366: delete "very" (let the readers decide how recent from the reference date).

      The suggested change has been made to the text.

      (31) L.382: level remained unknown before the present study.

      The recommended change has been made to the text.

      (32) L.399: add at the end of the sentence 'under stress'. Also, represent, not represents.

      The recommended changes have been made to the text.

      (33) L.560 and 571: Section headings formatted differently from the rest. Similar problem in l.900.

      We have rectified the issue and all of the section headings are now formatted in the same style.

      Reviewer #2 (Public Review):

      Summary:

      In the manuscript, the authors have presented new mechanistic details to show how intracellular cAMP levels are maintained linked to the phosphodiesterase enzyme which in turn is controlled by PhoP. Later, they showed the physiological relevance linked to altered cAMP concentrations.

      Strengths:

      Well-thought-out experiments. The authors planned the experiments carefully and diligently to uncover the molecular details.

      We thank the reviewer for these extremely encouraging comments.

      Weaknesses:

      Some fresh queries were made based on the author's previous responses and hope to get satisfactory answers this time.

      We provide below a point-by-point response to the fresh queries.

      (2) Line 134: please describe the complementation strain features as it is mentioned for the first time (plasmid, copy number, promoter etc.) in the manuscript. Especially under NO stress what could be the authors' justification regarding the high cAMP concentration in the complementation strain?

      As recommended by the reviewer, the details of construction of the complemented strain have been incorporated in the 'Materials and Methods' section of the revised manuscript (duplicated below): "To complement phoPR expression, pSM607 containing a 3.6-kb DNA fragment of M. tuberculosis phoPR including 200-bp phoP promoter region, a hygromycin resistance cassette, attP site and the gene encoding phage L5 integrase, as detailed earlier (Walters et al., 2006) was used to transform phoPR mutant to integrate at the L5 attB site."

      To address the reviewer's other concern, we have now included the following sentence in the 'Results' section of the revised manuscript (duplicated below): "A higher cAMP level in the complemented strain under NO stress is possibly attributable to reproducibly higher phoP expression in the complemented mutant under specific stress condition (Khan et al., 2022)."

      Reference: Khan et al. (2022) Convergence of two global regulators to coordinate expression of essential virulence determinants of Mycobacterium tuberculosis. eLife 2022, 11:e80965.

      New query: The complemented gene (in pSM607 plasmid) becomes a single copy after chromosomal integration, so it should ideally behave like a WT strain. How could authors still justify the high cAMP concentration under NO stress?

      We agree with the reviewer. We are unable to provide a cogent justification regarding this result. We speculate that PhoP is strikingly activated under NO stress by a non-canonical mechanism and strongly represses rv0805 expression. As a result, there is a significantly higher cAMP concentration in case of the complemented mutant under NO stress.

      (13) Line 292: There is a difference between red and green bars. Authors should do statistical analysis and then comment on whether overexpression of WT and mutant pde are different or similar, to me they are different; also, explain why the WT-Rv0805 strain is different than the phoPR-KO strain in the context of cell wall metabolism.

      As recommended by the reviewer, we have now included statistical significance of the data in the revised version, and modified the text accordingly in the manuscript.

      New query: Authors are asked to put a statistical significance test between WT-Rv0805 and WT-Rv0805M.

      We have included it in the modified figure. Also, to explain it we incorporated new text in the legend to Fig. 4C of the revised manuscript (duplicated below):

      “Note that similar to phoPR-KO, WT-Rv0805 shows a comparably higher sensitivity to CHP relative to WT bacilli. However, WT-Rv0805M expressing a mutant Rv0805, shows a significantly lower sensitivity to CHP relative to WT-Rv0805, as measured by the corresponding CFU values.”

      (14) Line 299-303: Authors should explain how the colocalization % are calculated. Also, in the figure 4D merge panel please highlight the difference.

      As suggested by the reviewer, we have now explained the methodology used to calculate percent colocalization in greater details. Also, we have modified Figure 4D to highlight the difference between samples shown in merge panel. Please see our response to comment # 33 from the Reviewer 1.

      New query: In the figure legend it should be mentioned that the white arrow indicates non-co-localization, which is visibly higher in WT and WT-Rv0805M.

      We thank the reviewer for this very important suggestion. We have now included the following text in the legend to Fig. 4D of the revised manuscript.

      “White arrowheads in the merge panels indicate non-colocalization, which remains higher in WT-H37Rv and WT-Rv0805M relative to phoPR-KO or WT-Rv0805.”

    2. eLife assessment

      This important study describes how PhoP regulates cyclic-AMP production in the human pathogen Mycobacterium tuberculosis. The authors provide convincing evidence that PhoP acts as a repressor of the cyclic-AMP-specific phosphodiesterase, Rv0805, which can degrade cyclic-AMP. The revised manuscript has addressed all outstanding comments and the work will be of interest to bacteriologists.

    3. Reviewer #1 (Public Review):

      Summary:

      This paper provides a straightforward mechanism of how mycobacterial cAMP level is increased under stressful conditions and shows that the increase is important for the survival of the bacterium in animal hosts. The cAMP level is increased by decreasing the expression of an enzyme that degrades cAMP.

      Strengths:

      The paper shows that under different stresses the response regulator PhoP represses a phosphodiesterase (PDE) that degrades cAMP specifically. Identification of PhoP as a regulator of cAMP is significant progress in understanding Mtb pathogenesis, as an increase in cAMP apparently increases bacterial survival upon infection. On the practical side, reduction of cAMP by increasing PDE can be a means to attenuate the growth of the bacilli. The results have wider implications since PhoP is implicated in controlling diverse mycobacterial stress responses and many bacterial pathogens modulate host cell cAMP levels. The results here are straightforward, internally consistent, and of both theoretical and applied interests. The results also open considerable future work, especially how increases in cAMP level help to increase survival of the pathogen.

      Weaknesses:

      It is not clear whether PhoP-PDE Rv0805 is the only pathway to regulate cAMP level under stress.

      Comments on revised submission:

      The authors have adequately addressed all but one of my comments. The remaining comment concerns the last line of the abstract. First, "genetic manipulation" usually means changing DNA. In Mtb pathogenesis I hope there is no DNA modification or change in the bacterial DNA. Also, the authors did not really inactivate the whole PhoP-rv0805-cAMP pathway. It would be best if the last line were made more fact-based: Thus, inactivation of PhoP decreases cAMP level, thereby stress tolerance and intracellular survival of the bacillus.

    4. Reviewer #2 (Public Review):

      Summary:

      In the manuscript, the authors have presented new mechanistic details to show how intracellular cAMP levels are maintained and linked to the phosphodiesterase enzyme which in turn is controlled by PhoP. Later, they showed the physiological relevance linked to altered cAMP concentrations.

      Strengths:

      Well-thought-out experiments. The authors planned the experiments carefully and diligently to uncover the molecular details.

      Weaknesses:

      None. The authors have meticulously responded to all my queries and concerns through multiple rounds of review.

    1. Reviewer #1 (Public Review):

      Using a combination of cutting-edge high-resolution approaches (expansion microscopy, SIM, and CLEM) and biochemical approaches (in vitro translocation of actin filaments, cargo uptake assays, and drug treatment), the authors revisit previous results about TbMyo1 and TbACT in the bloodstream form (BSF) of Trypanosoma brucei. They show that a great part of the myosin motor is cytoplasmic but the fraction associated with organelles is in proximity to the endosomal system. In addition, they show that TbMyo1 can move actin filaments in vitro and visualize for the first time this actomyosin system using specific antibodies, a "classical" antibody for TbMyo1, and a chromobody for actin. Finally, using latrunculin A, which sequesters G-actin and prevents F-actin assembly, the authors show the delocalization and eventually the loss of the filamentous actin signal as well as the concomitant loss of the endosomal system integrity. However, they do not assess the localization of TbMyo1 in the same conditions.

      Overall the work is well conducted and convincing. The conclusions are not over-interpreted and are supported by the experimental results.

    2. Reviewer #2 (Public Review):

      Summary:

      The study by Link et al. advances our understanding of the actomyosin system in T. brucei, focusing on the role of TbMyo1, a class I myosin, within the parasite's endosomal system. Using a combination of biochemical fractionation, in vitro motility assays, and advanced imaging techniques such as correlative light and electron microscopy (CLEM), this paper demonstrates that TbMyo1 is dynamically distributed across early and late endosomes, the cytosol, is associated with the cytoskeleton, and a fraction has an unexpected association with glycosomes. Notably, the study shows that TbMyo1 can translocate actin filaments at velocities suggesting an active role in intracellular trafficking, potentially higher than those observed for similar myosins in other cell types. This work not only elucidates the spatial dynamics of TbMyo1 within T. brucei but also suggests its broader involvement in maintaining the complex architecture of the endosomal network, underscoring the critical role of the actomyosin system in a parasite that relies on high rates of endocytosis for immune evasion.

      Strengths:

      A key strength of the study is its exceptional rigor and successful integration of a wide array of sophisticated techniques, such as in vitro motility assays, and advanced imaging methods, including correlative light and electron microscopy (CLEM) and immuno-electron microscopy. This combination of approaches underscores the study's comprehensive approach to examining the ultrastructural organization of the trypanosome endomembrane system. The application of functional data using inhibitors, such as latrunculin A for actin depolymerization, further strengthens the study by providing insights into the dynamics and regulatory mechanisms of the endomembrane system. This demonstrates how the actomyosin system contributes to cellular morphology and trafficking processes. Furthermore, the discovery of TbMyo1 localization to glycosomes introduces a novel aspect to the potential roles of myosin I proteins within the cell, particularly in the context of organelles analogous to peroxisomes. This observation not only broadens our understanding of myosin I functionality but also opens up new avenues for research into the cellular biology of trypanosomatids, marking a significant contribution to the field.

      Weaknesses:

      Certain limitations inherent in the study's design and scope render the narrative incomplete and make it challenging to reach definitive conclusions. One significant limitation is the reliance on spatial association data, such as colocalization of TbMyo1 with various cellular components (or the absence thereof), to infer functional relationships. Although these data suggest potential interactions, the authors do not confirm functional or direct physical interactions.

      While TbMyo1's localization is informative, the authors do not directly demonstrate its biochemical or mechanical activities in vivo, leaving its precise role in cellular processes speculative. Direct assays that manipulate TbMyo1 levels, activity, and/or function, coupled with observations of the outcomes on cellular processes, would provide more definitive evidence of the protein's specific roles in T. brucei. A multifaceted approach, including genetic manipulations, uptake assays, kinetic trafficking experiments, and imaging, would offer a more robust framework for understanding TbMyo1's roles. This comprehensive approach would elucidate not just the "what" and "where" of TbMyo1's function but also the "how" and "why," thereby deepening our mechanistic insights into T. brucei's biology.

    3. Reviewer #3 (Public Review):

      Summary:

      In this work, Link and colleagues have investigated the localization and function of the actomyosin system in the parasite Trypanosoma brucei, which represents a highly divergent and streamlined version of this important cytoskeletal pathway. Using a variety of cutting-edge methods, the authors have shown that the T. brucei Myo1 homolog is a dynamic motor that can translocate actin, suggesting that it may not function as a more passive crosslinker. Using expansion microscopy, iEM, and CLEM, the authors show that Myo1 localizes to the endosomal pathway, specifically the portion tasked with internalizing and targeting cargo for degradation, not the recycling endosomes. The glycosomes also appear to be associated with Myo1, which was previously not known. An actin chromobody was employed to determine the localization of filamentous actin in cells, which was correlated with the localization of Myo1. Interestingly, the pool of actomyosin was not always closely associated with the flagellar pocket region, suggesting that portions of the endolysosomal system may remain at a distance from the sole site of parasite endocytosis. Lastly, the authors used actin-perturbing drugs to show that disrupting actin causes a collapse of the endosomal system in T. brucei, which they have shown recently does not comprise distinct compartments but instead a single continuous membrane system with subdomains containing distinct Rab markers.

      Strengths:

      Overall, the quality of the work is extremely high. It contains a wide variety of methods, including biochemistry, biophysics, and advanced microscopy that are all well-deployed to answer the central question. The data is also well-quantitated to provide additional rigor to the results. The main premise, that actomyosin is essential for the overall structure of the T. brucei endocytic system, is well supported and is of general interest, considering how uniquely configured this pathway is in this divergent eukaryote and how important it is to the elevated rates of endocytosis that are necessary for this parasite to inhabit its host.

      Weaknesses:

      (1) Did the authors observe any negative effects on parasite growth or phenotypes like BigEye upon expression of the actin chromobody?

      (2) The Garcia-Salcedo EMBO paper cited included the production of anti-actin polyclonal antibodies that appeared to work quite well. The localization pattern produced by the anti-actin polyclonals looks similar to the chromobody, with perhaps a slightly larger labeling profile that could be due to differences in imaging conditions. I feel that the anti-actin antibody labeling should be expressly mentioned in this manuscript, and perhaps could reflect differences in the F-actin vs total actin pool within cells.

      (3) The authors showed that disruption of F-actin with LatA leads to disruption of the endomembrane system, which suggests that the unique configuration of this compartment in T. brucei relies on actin dynamics. What happens under conditions where endocytosis and endocytic traffic are blocked, such as 4 °C? Are there changes to the localization of the actomyosin components?

      (4) Along these lines, the authors suggest that their LatA treatments were able to disrupt the endosomal pathway without disrupting clathrin-mediated endocytosis at the flagellar pocket. Do they believe that actin is dispensable in this process? That seems like an important point that should be stated clearly or put in greater context.

    1. eLife assessment

      This study presents a useful characterization of 3D chromosome conformation changes in activated T lymphocytes, linking risk variants for autoimmune disease to putative target genes. The study employs solid methods and approaches and demonstrates the utility of using chromatin conformation to understand gene regulatory processes. However, the same data modality (chromatin conformation) was previously generated by another group in the same model system, and a more in-depth comparison of results would have improved the utility of this study.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors profile gene expression, chromatin accessibility, and chromosomal architecture (by Hi-C) in activated CD4 T cells and use this information to link non-coding variants associated with autoimmune diseases with putative target genes. They find over 1000 genes physically linked with autoimmune disease loci in these cells, many of which are upregulated upon T cell activation. Focusing on IL2, they dissect the regulatory architecture of this locus, including the allelic effects of GWAS variants. They also intersect their variant-to-gene lists with data from CRISPR screens for genes involved in CD4 T cell activation and expression of inflammatory genes, finding enrichments for regulators. Finally, they showed that pharmacological inhibition of some of these genes impacts T-cell activation.

      This is a solid study that follows a well-established canvas for variant-to-gene prioritisation using 3D genomics, applying it to activated T cells. The authors go some way in validating the lists of candidate genes, as well as exploring the regulatory architecture of a candidate GWAS locus. Jointly with data from previous studies performing variant-to-gene assignment in activated CD4 T cells (and other immune cells), this work provides a useful additional resource for interpreting autoimmune disease-associated genetic variation.

      Suggestions for improvement:

      Autoimmune disease variants were already linked with genes in CD28-stimulated CD4 T cells using chromosome conformation capture, specifically Promoter CHi-C and the COGS pipeline (Javierre et al., Cell 2016; Burren et al., Genome Biol 2017; Yang et al., Nat Comms 2020). The authors cite these papers and present a comparative analysis of their variant-to-gene assignments (in addition to scRNA-seq eQTL-based assignments). Furthermore, they find that the Burren analysis yields a higher enrichment for gold standard genes.

      The obvious question that the authors don't venture into is why the results are quite different. In principle, this could be due to the differences between:
      (a) the cell stimulation procedure
      (b) the GWAS datasets used
      (c) the types of assay (Hi-C vs Capture Hi-C)
      (d) approaches for defining gene-linked regions (loops vs neighbourhoods)
      (e) how the GWAS signals at gene-linked regions are aggregated (e.g., the flavours of COGS in Javierre and Burren vs the authors' approach).

      Re (a), I'm not sure the authors make it explicitly clear in the main text that the Capture Hi-C-based studies also use *stimulated* CD4 T cells, particularly in the section "Comparative predictive power...". So the cells used are pretty much the same, and the differences likely arise from points (b) to (e).

      It would be useful for the community to understand more clearly what is driving these differences, ideally with some added data. Could the authors, for example, take the PCHi-C data from Javierre/Burren and use their GWAS data and variant-to-gene assignment algorithms?

      In addition, given that the authors use Hi-C, a popular method for V2G prioritisation for this type of data is currently ABC (Nasser et al, Nature 2021). Could the authors provide a comparative analysis with respect to the V2G assignments in the paper and, if they see it appropriate, also run ABC-based GWAS integration on their own Hi-C data?

    3. Reviewer #2 (Public Review):

      Summary:

      There is significant interest in characterizing the mechanisms by which genetic mutations linked to autoimmunity perturb immune processes. Pahl et al. collect information on dynamic accessible regions, genes, and 3D contacts in primary CD4+ T cell samples that have been stimulated ex vivo. The study includes a variety of analyses characterizing these dynamic changes. With TF footprinting they propose factors linked to active regulatory elements. They compare the performance of their variant mapping pipeline that uses their data versus existing datasets. Most compelling was the in-depth study of regulatory elements near the IL2 gene. Finally, they perform a pharmacological screen targeting several genes they suggest are involved in T cell proliferation.

      Strengths:

      The work done characterizing elements at the IL2 locus is impressive.

      Weaknesses:

      - Missing critical context to evaluate claims. There are extensive studies performed on resting and activated immune cell states (CD4+ T cells and other cell types) and some at multiple time points or concentrations of stimuli that collect ATAC-seq and/or RNA-seq that have been ignored by this study. How do conclusions from previous studies compare to what the authors conclude here? It is impossible to evaluate the claims without this additional context. These are a few studies I am familiar with (the authors should perform a more comprehensive search to be sure they're not ignoring existing observations) that would be important to compare/contrast conclusions:
      o Alasoo, K. et al. Shared genetic effects on chromatin and gene expression indicate a role for enhancer priming in immune response. Nat. Genet. 50, 424-431 (2018).
      o Calderon, D., Nguyen, M.L.T., Mezger, A. et al. Landscape of stimulation-responsive chromatin across diverse human immune cells. Nat Genet 51, 1494-1505 (2019).
      o Gate, R.E., Cheng, C.S., Aiden, A.P. et al. Genetic determinants of co-accessible chromatin regions in activated T cells across humans. Nat Genet 50, 1140-1150 (2018).
      o Glinos, D.A., Soskic, B., Williams, C. et al. Genomic profiling of T-cell activation suggests increased sensitivity of memory T cells to CD28 costimulation. Genes Immun 21, 390-408 (2020).
      o Gutierrez-Arcelus, M., Baglaenko, Y., Arora, J. et al. Allele-specific expression changes dynamically during T cell activation in HLA and other autoimmune loci. Nat Genet 52, 247-253 (2020).
      o Kim-Hellmuth, S. et al. Genetic regulatory effects modified by immune activation contribute to autoimmune disease associations. Nat. Commun. 8, 266 (2017).
      o Ye, C. J. et al. Intersection of population variation and autoimmunity genetics in human T cell activation. Science 345, 1254665 (2014).

      - As a general point, I appreciate it when each claim includes a corresponding effect size and p-value, which helps me evaluate the strength of significance of supporting evidence.

    4. Reviewer #3 (Public Review):

      Summary:

      This paper used RNAseq, ATACseq, and Hi-C to assess gene expression, chromatin accessibility, and chromatin physical associations for naive CD4+ T cells as they respond to stimulation through TCR and CD28. With these data in hand, the authors linked 423 GWAS signals to their respective target genes, where most of these were not in the proximal promoter, but rather distal enhancers. The IL-2 gene was used as an example to identify new distal cis-regulatory regions required for optimal IL-2 gene transcription. These distal elements interact with the proximal IL2 promoter region. When the distal enhancer contained an autoimmune SNP, it affected IL-2 gene transcription. The authors also identified genetic risk variants that were associated with genes upon activation. Some of these regulate proliferation and cytokine production, but others are novel.

      Strengths:

      This paper provides a wealth of data related to gene expression after CD4 T cells are activated through the TCR and CD28. An important strength of this paper is that these data were intensively analyzed to uncover autoimmune disease SNPs in cis-acting regions. Many of these could be assigned to likely target genes even though they often are in distal enhancers. These findings help to provide a better understanding concerning the mechanism by which GWAS risk elements impact gene expression.

      Another strength of this study was the proof-of-principle studies examining the IL-2 gene. Not only were new cis-acting enhancers discovered, but they were functionally shown to be important in regulating IL-2 expression, including susceptibility to colitis. Their importance was also established with respect to such distal enhancers harboring disease-relevant SNPs, which were shown to affect IL-2 transcription.

      The data from this study were also mined against past CRISPR screens that identified genes that control aspects of CD4 T cell activation. From these comparisons, novel genes were identified that function during T cell activation.

      Weaknesses:

      A weakness of this study is that few individuals were analyzed, i.e., RNAseq and ATACseq (n=3) and HiC (n=2). Thus, the authors may have underestimated potentially relevant risk associations by their chromatin capture-based methodology. This might account for the low overlap of their data with the eQTL-based approach or the HIEI truth set.

      Impact:

      This study indicates that defining distal chromatin interacting regions helps to identify distal genetic elements, including relevant variants, that contribute to gene activation.

    1. eLife assessment

      This is a valuable manuscript describing the competitive binding between the RING2 and phosphorylated Ubl domains within Parkin involved in the regulation of Parkin activity. The evidence supporting this conclusion is incomplete, as it primarily relies on a single biochemical assay and does not utilize more stringent, quantitative biophysical approaches to probe this competitive binding. This work will be of interest to the research communities focused on the molecular basis of ubiquitin ligase regulation, PINK-PARKIN-regulated mitophagy, and mitochondrial quality control.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors used structural and biophysical methods to provide insight into Parkin regulation. The breadth of data supporting their findings was impressive and generally well-orchestrated. Still, the impact of their results builds on recent structural studies and the stated impact is based on these prior works.

      Strengths:

      (1) After reading through the paper, the major findings are:
      - RING2 and pUbl compete for binding to RING0.
      - Parkin can dimerize.
      - ACT plays an important role in enzyme kinetics.

      (2) The use of molecular scissors in their construct represents a creative approach to examining inter-domain interactions.

      (3) From my assessment, the experiments are well-conceived and executed.

      Weaknesses:

      (1) The manuscript, as written, is NOT for a general audience. Admittedly, I am not an expert on Parkin structure and function, but I had to do a lot of homework to try to understand the underlying rationale and impact. This reflects, I think, that the work generally represents an incremental advance on recent structural findings.

      (2) To this point, it is hard to understand the impact of this work without more information highlighting the novelty. There are several structures of Parkin in various auto-inhibited states, and it was hard to delineate how this is different.

      (3) As noted, I appreciated the use of protease sites in the fusion protein construct. It is unclear how the loop region might affect the protein structure and function. The authors worked to demonstrate that this did not introduce artifacts, but the biological context is missing.

      (4) While it is likely that the binding is competitive between the Ubl and RING2 domains, the data is not quantitative. Is it known whether the folding of the distinct domains is independent? Or are there interactions that alter folding? It seems plausible that conformational rearrangements may invoke an orientation of domains that would be incompatible. The biological context for the importance of this interaction was not clear to me.

      (5) What is the rationale for mutating Lys211 to Asn? Were other mutations tried? Glu? Ala? Just missing the rationale. I think this may have been identified previously in the field, but not clear what this mutation represents biologically.

      (6) I was confused about how the phospho-proteins were generated. After looking through the methods, there appear to be phosphorylation experiments, but it is unclear what the efficiency was for each protein (i.e. what % gets modified). In the text, the authors refer to phospho-Parkin (T270R, C431A), but not clear how these mutations might influence this process. I gather that these are catalytically inactive, but it is unclear to me how this is catalyzing the ubiquitination in the assay.

      (7) The authors note that "ACT can be complemented in trans; however, it is more efficient in cis", but it is unclear whether both would be important or if the favored interaction is dominant in a biological context.

      (8) The authors repeatedly note that this study could aid in the development of small-molecule regulators against Parkin to treat PD, but this is a long way off. And it is not clear from their manuscript how this would be achieved. As stated, this is conjecture.

    3. Reviewer #2 (Public Review):

      This manuscript uses biochemistry and X-ray crystallography to further probe the molecular mechanism of Parkin regulation and activation. Using a construct that incorporates cleavage sites between different Parkin domains to increase the local concentration of specific domains (i.e., molecular scissors), the authors suggest that competitive binding between the p-Ubl and RING2 domains for the RING0 domain regulates Parkin activity. Further, they demonstrate that this competition can occur in trans, with a p-Ubl domain of one Parkin molecule binding the RING0 domain of a second monomer, thus activating the catalytic RING1 domain. In addition, they suggest that the ACT domain can similarly bind and activate Parkin in trans, albeit at a lower efficiency than that observed for p-Ubl. The authors also suggest from crystal structure analysis and some biochemical experiments that the linker region between RING2 and repressor elements interacts with the donor ubiquitin to enhance Parkin activity.

      Ultimately this manuscript challenges previous work suggesting that the p-Ubl domain does not bind to the Parkin core in the mechanism of Parkin activation. The 'molecular scissors' strategy is an interesting way to probe this type of competitive binding. However, there are issues with the experimental approach that detract from the overall quality and potential impact of the work.

      The competitive binding between p-Ubl and RING2 domains for the Parkin core could have been better defined using biophysical and biochemical approaches that explicitly define the relative affinities that dictate these interactions. A better understanding of these affinities could provide more insight into the relative bindings of these domains, especially as it relates to the in trans interactions.

      I also have concerns about the results of using molecular scissors to 'increase local concentrations' and allow for binding to be observed. These experiments are done primarily using proteolytic cleavage of different domains followed by size exclusion chromatography. ITC experiments suggest that the binding constants for these interactions are in the µM range, although these experiments are problematic as the authors indicate in the text that protein precipitation was observed during these experiments. This type of binding could easily be measured in other assays. My issue relates to the ability of a protein complex (comprising the core and cleaved domains) with a Kd of 1 µM to be maintained in an SEC experiment. The off-rates for these complexes must be exceedingly slow, which doesn't really correspond to the low µM binding constants discussed in the text. How do the authors explain this? What is driving the Koff to levels sufficiently slow to prevent dissociation by SEC? Considering that the authors are challenging previous work describing the lack of binding between the p-Ubl domain and the core, these issues should be better resolved in this current manuscript. Further, it's important to have a more detailed understanding of relative affinities when considering the functional implications of this competition in the context of full-length Parkin. Similar comments could be made about the ACT experiments described in the text.
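
      The kinetic concern above can be illustrated with a back-of-the-envelope calculation. The sketch below assumes simple 1:1 binding (Kd = koff / kon), first-order dissociation with no rebinding on the column, association rate constants of 1e5 to 1e6 per M per s (an illustrative range, not a measured one), and a hypothetical 30-minute SEC run. None of these numbers come from the manuscript; only the 1 µM Kd is taken from the discussion above.

```python
import math

def koff_from_kd(kd_molar, kon_per_molar_per_s):
    """Dissociation rate constant for 1:1 binding, from Kd = koff / kon."""
    return kd_molar * kon_per_molar_per_s

def half_life_s(koff_per_s):
    """Half-life of the complex under first-order dissociation."""
    return math.log(2) / koff_per_s

def fraction_remaining(koff_per_s, time_s):
    """Fraction of complex left after time_s, assuming no rebinding."""
    return math.exp(-koff_per_s * time_s)

KD_MOLAR = 1e-6        # 1 µM, the order of magnitude quoted for the ITC data
SEC_RUN_S = 30 * 60    # hypothetical run length (assumption)

for kon in (1e5, 1e6):  # illustrative association rates (assumption)
    koff = koff_from_kd(KD_MOLAR, kon)
    print(
        f"kon = {kon:.0e} /M/s -> koff = {koff:.2g} /s, "
        f"half-life = {half_life_s(koff):.2g} s, "
        f"fraction left after SEC = {fraction_remaining(koff, SEC_RUN_S):.2g}"
    )
```

      Under these assumptions the complex would have a half-life of seconds, far shorter than a chromatography run, which is why directly measured on- and off-rates would help reconcile the SEC and ITC observations.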

      Ultimately, this work does suggest additional insights into the mechanism of Parkin activation that could contribute to the field. There is a lot of information included in this manuscript, giving it breadth, albeit at the cost of depth for the study of specific interactions. Further, I felt that the authors oversold some of their data in the text, and I'd recommend being a bit more careful when claiming an experiment 'confirms' a specific model. In many cases, there are other models that could explain similar results. For example, in Figure 1C, the authors state that their crystal structure 'confirms' that "RING2 is transiently displaced from the RING0 domain and returns to its original position after washing off the p-Ubl linker". However, it isn't clear to me that RING2 ever dissociated when prepared this way. While there are issues with the work that I feel should be further addressed with additional experiments, there are interesting mechanistic details suggested by this work that could improve our understanding of Parkin activation. However, the impact of this work won't be fully appreciated until there is a more thorough understanding of the regulation and competitive binding between p-Ubl and RING2 to RORB both in cis and in trans.

    4. Reviewer #3 (Public Review):

      Summary:

      In their manuscript "Additional feedforward mechanism of Parkin activation via binding of phospho-UBL and RING0 in trans", Lenka et al present data that could suggest an "in trans" model of Parkin ubiquitination activity. Parkin is an intensely studied E3 ligase implicated in mitophagy, whereby missense mutations in the PARK2 gene are known to cause autosomal recessive juvenile parkinsonism. From a mechanistic point of view, Parkin is extremely complex. Its activity is tightly controlled by several modes of auto-inhibition that must be released by cues of mitochondrial damage. While the general overview of Parkin activation has been mapped out in recent years, several details have remained murky. In particular, whether Parkin dimerizes as part of its feed-forward signaling mechanism, and whether said dimerization can facilitate ligase activation, has remained unclear. Here, Lenka et al. use various truncation mutants of Parkin in an attempt to understand the likelihood of dimerization (in support of an "in trans" model for catalysis).

      Strengths:

      The results are bolstered by several distinct approaches including analytical SEC with cleavable Parkin constructs, ITC interaction studies, ubiquitination assays, protein crystallography, and cellular localization studies.

      Weaknesses:

      As presented, however, the storyline is very confusing to follow and several lines of experimentation felt like distractions from the primary message. Furthermore, many experiments could only indirectly support the authors' conclusions, and therefore the final picture of what new features can be firmly added to the model of Parkin activation and function is unclear.

      Major concerns:

      (1) This manuscript solves numerous crystal structures of various Parkin components to help support their idea of in trans transfer. As presented, these structures more closely resemble models, and it is unclear from the figures that these are new complexes solved in this work, or what new insights can be gleaned from them.

      (2) There are no experiments that definitively show the in trans activation of Parkin. The binding experiments and size exclusion chromatography are a good start, but the way these experiments are performed, they'd be better suited as support for a stronger experiment showing Parkin dimerization. In addition, the rationale for an in trans activation model is not convincingly explained until the concept of Parkin isoforms is introduced in the Discussion. The authors should consider expanding this concept into other parts of the manuscript.

      2a. For the in trans activation experiment using wt Parkin and pParkin (T270R/C431A) (Figure 3D), there needs to be a large excess of pParkin to stimulate the catalytic activity of wt Parkin. This experiment has low cellular relevance as these point mutations are unlikely to occur together to create this nonfunctional pParkin protein. In the case of pParkin activating wt Parkin (regardless of artificial point mutations inserted to study specifically the in trans activation), if there needs to be much more pParkin around to fully activate wt Parkin, isn't it just more likely that the pParkin would activate in cis?

      2ai. Another underlying issue with this experiment is that the authors do not consider the possibility that the increased activity observed is a result of increased "substrate" for auto-ubiquitination, as opposed to any role in catalytic activation. Have the authors considered looking at Miro as a substrate in order to control for this?

      2b. The authors mention a "higher net concentration" of the "fused domains" with RING0, and use this to justify artificially cleaving the Ubl or RING2 domains from the Parkin core. This fact should be moot. In cells, it is expected there will only be a 1:1 ratio of the Parkin core with the Ubl or RING2 domains. To date, there is no evidence suggesting multiple pUbls or multiple RING2s can bind the RING0 binding site. In fact, the authors here even show that either the RING2 or pUbl needs to be displaced to permit the binding of the other domain. That being said, there would be no "higher net concentration" because there would always be the same molar equivalents of Ubl, RING2, and the Parkin core.

      2c. A larger issue remaining in terms of Parkin activation is the lack of clarity surrounding the role of the linker (77-140); particularly whether its primary role is to tether the Ubl to the cis Parkin molecule versus a role in permitting distal interactions to a trans molecule. The way the authors have conducted the experiments presented in Figure 2 limits the possible interactions that the activated pUbl could have by (a) ablating the binding site in the cis molecule with the K211N mutation; (b) further blocking the binding site in the cis molecule by keeping the RING2 domain intact. These restrictions to the cis Parkin molecule effectively force the pUbl to bind in trans. A competition experiment to demonstrate the likelihood of cis or trans activation in direct comparison with each other would provide stronger evidence for trans activation.

      (3) A major limitation of this study is that the authors interpret structural flexibility from experiments that do not report directly on flexibility. The analytical SEC experiments report on binding affinity and more specifically off-rates. By removing the interdomain linkages, the accompanying on-rate would be drastically impacted, and thus the observations are disconnected from a native scenario. Likewise, observations from protein crystallography can be consistent with flexibility, but certainly should not be directly interpreted in this manner. Rigorous determination of linker and/or domain flexibility would require alternative methods that measure this directly.

      (4) The analysis of the ACT element comes across as incomplete. The authors make a point of a competing interaction with Lys48 of the Ubl domain, but the significance of this is unclear. It is possible that this observation could be an overinterpretation of the crystal structures. Additionally, the rationale for why the ACT element should or shouldn't contribute to in trans activation of different Parkin constructs is not clear. Lastly, the conclusion that this work explains the evolutionary nature of this element in chordates is highly overstated.

      (5) The analysis of the REP linker element also seems incomplete. The authors identify contacts to a neighboring pUb molecule in their crystal structure, but the connection between this interface (which could be a crystallization artifact) and their biochemical activity data is not straightforward. The analysis of flexibility within this region using crystallographic and AlphaFold modeling observations is very indirect. The authors also draw parallels with linker regions in other RBR ligases that are involved in recognizing the E2-loaded Ub. Firstly, it is not clear from the text or figures whether the "conserved" hydrophobic residue within the linker region is involved in these alternative Ub interfaces. And secondly, the authors appear to jump to the conclusion that the Parkin linker region also binds an E2-loaded Ub, even though their original observation from the crystal structure seems inconsistent with this. The entire analysis feels very preliminary and also comes across as tangential to the primary storyline of in trans Parkin activation.

    1. eLife assessment

      This valuable work presents the latest version of CTFFIND, which is the most popular software for determination of the contrast transfer function (CTF) in cryo-electron microscopy. CTFFIND5 estimates and considers acquisition geometry and sample thickness, which leads to improved CTF determination. The paper describes convincing evidence that CTFFIND5 finds better CTF parameters than previous methods, in particular for tilted samples (e.g. for cryo-electron tomography) or where thickness is an issue (e.g. cellular samples, or electron microscopy at low voltages).

    2. Reviewer #1 (Public Review):

      This work presents CTFFIND5, a new version of the software for determination of the Contrast Transfer Function (CTF) that models the distortions introduced by the microscope in cryoEM images. CTFFIND5 can take acquisition geometry and sample thickness into consideration to improve CTF estimation.

      To estimate tilt (tilt angle and tilt axis), the input image is split into tiles and correlation coefficients are computed between their power spectra and a local CTF model that includes the defocus variation according to a tilted plane. As a final step, by applying a rescaling factor to the power spectra of the tiles, an average tilt-corrected power spectrum is obtained and used for diagnostic purposes and to estimate the goodness of fit. This global procedure and the rescaling factor resemble those used in Bsoft, Warp, etc., with determination of the tilt parameters being a feature specific to CTFFIND5 (and formerly CTFTILT). The performance of the algorithm is evaluated with tilted 2D crystals and tilt-series, demonstrating accurate tilt estimation in some cases and some limitations in others. Further analysis of CTF determination with tilt-series, particularly showing whether there is accurate or stable estimation at high tilts, might be helpful to show the robustness of CTFFIND5 in cryoET.
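
      The "defocus variation according to a tilted plane" mentioned above can be sketched geometrically. The snippet below is not CTFFIND5 code and does not reproduce its conventions. It assumes a tilt axis through the image centre, an axis angle measured counter-clockwise from the x-axis, and a sign convention in which tiles on the positive side of the axis have larger defocus. The function and parameter names are hypothetical.

```python
import math

def tile_defocus(defocus_center_A, x_A, y_A, tilt_angle_deg, axis_angle_deg):
    """Defocus at a tile centred on (x_A, y_A), for a specimen plane tilted
    by tilt_angle_deg about an axis through the origin (assumed convention)."""
    theta = math.radians(tilt_angle_deg)
    phi = math.radians(axis_angle_deg)
    # Signed perpendicular distance of the tile centre from the tilt axis.
    distance_A = -x_A * math.sin(phi) + y_A * math.cos(phi)
    # Height change along the beam direction for a plane tilted by theta.
    return defocus_center_A + distance_A * math.tan(theta)

# Tiles on the axis keep the central defocus; tiles off the axis shift with tan(theta).
for x, y in [(0.0, 0.0), (1000.0, 0.0), (0.0, 1000.0), (0.0, -1000.0)]:
    print((x, y), round(tile_defocus(20000.0, x, y, 30.0, 0.0), 1))
```

      A tilt search would then score candidate (tilt angle, axis angle) pairs by how well the per-tile CTF models built from such defocus values match the tile power spectra; that scoring step is not shown here.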

      CTFFIND5 represents the first CTF determination tool that considers the thickness-related modulation envelope of the CTF first described by McMullan et al. (2015) and experimentally confirmed by Tichelaar et al. (2020). To this end, CTFFIND5 uses a new CTF model that takes the sample thickness into account. CTFFIND5 thus provides more accurate CTF estimation and, furthermore, gives an estimation of the sample thickness, which may be a valuable resource to judge the potential for high resolution. To evaluate the accuracy of thickness estimation in CTFFIND5, the authors use the Lambert-Beer law on energy-filtered data and also tomographic data, thus demonstrating that the estimates are reasonable for images with exposure around 30 e/Å². While consideration of sample thickness in CTF determination sounds ideally suited for cryoET, practical application under the standard acquisition protocols in cryoET (exposure of 3-5 e/Å² per image) is still limited. In this regard, the authors are honest in the conclusions and clearly identify the areas where thickness-aware CTF determination will be valuable at present: e.g. in situ single particle analysis and in vitro single particle cryoEM of purified samples at low voltages.
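
      For readers unfamiliar with the Lambert-Beer approach used as the independent thickness estimate, a minimal sketch follows. It assumes the usual form t = L * ln(I_unfiltered / I_filtered), where L is an effective inelastic mean free path. The value of L depends on voltage, aperture and filter settings. The 300 nm used here is a placeholder assumption, not the value used in the manuscript.

```python
import math

def thickness_from_intensities(i_unfiltered, i_filtered, mean_free_path_nm):
    """Lambert-Beer thickness estimate from the mean image intensity without
    and with the energy filter (zero-loss) applied."""
    if i_filtered <= 0 or i_unfiltered < i_filtered:
        raise ValueError("expected 0 < i_filtered <= i_unfiltered")
    return mean_free_path_nm * math.log(i_unfiltered / i_filtered)

MEAN_FREE_PATH_NM = 300.0  # placeholder (assumption), instrument dependent

# Hypothetical intensities: a larger filtered/unfiltered ratio means a thinner sample.
for ratio in (0.9, 0.7, 0.5):
    t_nm = thickness_from_intensities(1.0, ratio, MEAN_FREE_PATH_NM)
    print(f"I_filtered / I_unfiltered = {ratio:.1f} -> t = {t_nm:.0f} nm")
```

      Comparing such values with the thickness reported by the CTF fit, image by image, is the kind of check the evaluation describes.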

      In conclusion, the manuscript introduces novel methods inside CTFFIND5 that improve CTF estimation, namely estimation of acquisition geometry and sample thickness. The evaluation demonstrates the performance of the new tool, with fairly accurate estimates of tilt axis, tilt angle and sample thickness and improved CTF estimation. The manuscript critically defines the current range of application of the new methods in cryoEM.

    3. Reviewer #2 (Public Review):

      Summary:

      This paper describes the latest version of the most popular program for CTF estimation for cryo-EM images: CTFFIND5. New features in CTFFIND5 are the estimation of tilt geometry, including for samples, like FIB-milled lamellae, that are pre-tilted along a different axis than the tilt axis of the tomographic experiment, plus the estimation of sample thickness from the expanded CTF model described by McMullan et al (2015). The results convincingly show the added value of the program for thicker and tilted images, such as are common in modern cryo-ET experiments. The program will therefore have a considerable impact on the field.

      I have only minor suggestions for improvement below:

      Abstract: "[CTF estimation] has been one of the key aspects of the resolution revolution" -> This is a bit over the top. Not much changed in the actual algorithms for CTF estimation during the resolution revolution.
      L34: "These parameters" -> Cs is typically given, only defocus (and if relevant phase shift) are estimated.
      L110-116: The text is ambiguous: are rotations defined clockwise or counter-clockwise? It would be good to explicitly state what subsequent rotations, in which directions and around which axes this transformation matrix (and the input/output angles in CTFFIND5) correspond to.
      L129-130: As a suggestion: it would be relatively easy, and possibly beneficial to the user, to implement a high-resolution limit that varies with the accumulated dose on the sample. One example of this exists in the tomography pipeline of RELION-5.
      Substituting Eq (7) into Eq (6) yields ksi=pi, which cannot be true. If t is the sample thickness, then how can this be a function of the frequency g of the first node of the CTF function? The former is a feature of the sample, the latter is a parameter of the optical system. This needs correction.

    4. Reviewer #3 (Public Review):

      In this manuscript, the authors detail improvements in the core CTFFIND (CTFFIND5 as implemented in cisTEM) algorithm that better estimates CTF parameters from tilted micrographs and those that exhibit signal attenuation due to ice thickness. These improvements typically yield more accurate CTF values that better represent the data. Although some of the improvements result in slower calculations per micrograph, these can be easily overcome through parallelization.

      There are some concerns outlined below that would benefit from further evaluation by the authors.

      For the examples shown in Figure 3b, given the small differences in estimated defocus1 and 2, what type of improvements would be expected in the reconstructed tomograms? Do such improvements in estimates manifest in better tilt-series reconstruction?

      Similarly, the data shown in Figure 3C shows minimal improvements in the CTF resolution estimate (e.g., 4.3 versus 4.2 Å), but exhibited several hundred Å difference in defocus values. How do such differences impact downstream processing? Is such a difference overcome by per-particle (local) CTF refinements (like the authors mention in the discussion, see below)?

      At which point does the thickness of the specimen preclude the ice thickness modulation to be included for "accurate" estimate? 500Å? 1000Å? 2000Å? Based on the data shown in Figure 3B, as high as 969 Å thick specimens benefit moderately (4.6 versus 3.4 Å fit estimate), but perhaps not significantly, from the ice thickness estimation. Considering the increased computational time for ice thickness estimation, such an estimate of when to incorporate for single-particle workflows would be beneficial.

      It would seem that this statement could be evaluated herein: "the analysis of images of purified samples recorded at lower acceleration voltages, e.g., 100 keV (McMullan et al., 2023), may also benefit since thickness-dependent CTF modulations will appear at lower resolution with longer electron wavelengths". There are numerous examples of 300kV, 200kV, and 100kV EMPIAR datasets to be compared and recommendations would be welcomed.
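
      To make the voltage comparison concrete, the relativistic electron wavelength at each acceleration voltage can be computed from standard physical constants. The sketch below is only an illustration of how much longer the wavelength is at 100 kV than at 300 kV; the function name `electron_wavelength_pm` is hypothetical, and the code is not taken from CTFFIND5 or the manuscript.

```python
import math

# CODATA constants (SI units)
PLANCK_H = 6.62607015e-34         # J s
ELECTRON_MASS = 9.1093837015e-31  # kg
ELEMENTARY_CHARGE = 1.602176634e-19  # C
SPEED_OF_LIGHT = 299792458.0      # m / s


def electron_wavelength_pm(voltage_volts):
    """Relativistic de Broglie wavelength of an electron, in picometres.

    lambda = h / sqrt(2 m0 e V (1 + e V / (2 m0 c^2)))
    """
    energy = ELEMENTARY_CHARGE * voltage_volts  # kinetic energy in joules
    rest_energy = ELECTRON_MASS * SPEED_OF_LIGHT ** 2
    momentum = math.sqrt(
        2.0 * ELECTRON_MASS * energy * (1.0 + energy / (2.0 * rest_energy))
    )
    return PLANCK_H / momentum * 1e12


# Compare the three voltages mentioned by the reviewer.
for kilovolts in (100, 200, 300):
    wavelength = electron_wavelength_pm(kilovolts * 1e3)
    print(f"{kilovolts} kV: {wavelength:.3f} pm")
```

      The wavelength at 100 kV is nearly twice that at 300 kV, which is the premise behind the statement quoted above; whether this translates into a practical benefit would still need to be tested on data.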

      Although logical, this statement is not supported by the data presented in this manuscript: "The improvements of CTFFIND5 will provide better starting values for this refinement, yielding better overall CTF estimation and recovery of high-resolution information during 3D reconstruction."

      Moreover, the lack of single-particle data evaluation does present a concern. Naively, these improvements would benefit all cryoEM data, regardless of modality.

    1. eLife assessment

      This work is of fundamental significance and has a compelling level of evidence for a new population that protects against obesity-induced hypothalamic inflammation. This topic will attract attention from a broad base of readers, from hypothalamic neuroscientists to immunologists with an interest in metabolism.

    2. Reviewer #1 (Public Review):

      Summary:

      The present work from Velloso and collaborators investigated the transcription profiles of resident and recruited hypothalamic microglia. They found sex-dependent differences between males and females and identified the protective role of chemokine receptor CXCR3 against diet-induced obesity.

      Strengths:

      (1) Novelty

      (2) Relevance, since this work provides evidence about a subset of recruited microglia that has a protective effect against DIO. This provides a new concept in hypothalamic inflammation and obesity.

      Weaknesses:

      (1) Lack of mechanistic insight into the sex-dependent effects.

      (2) Analysis of indirect calorimetry data requires more depth.

      (3) A deeper analysis of hypothalamic inflammation and ER stress pathways would strengthen the manuscript.

    3. Reviewer #2 (Public Review):

      Summary:

      This study by Mendes et al provides novel key insights into the role of chemotaxis and immune cell recruitment into the hypothalamus in the development of diet-induced obesity. Specifically, the authors reveal that although transcriptional changes in hypothalamic resident microglia following exposure to high-fat feeding are minor, there are compelling transcriptomic differences between resident microglia and microglia recruited to the hypothalamus, and these are sexually dimorphic. Using independent loss-of-function studies, the authors also demonstrate an important role of CXCR3 and hypothalamic CXCL10 in the hypothalamic recruitment of CCR2+ cells on metabolism following exposure to high-fat diet-feeding in mice. This manuscript puts forth conceptually novel evidence that inhibition of chemotaxis-mediated immune cell recruitment accelerates body weight gain in high-fat diet-feeding, suggesting that a subset of microglia that express CXCR3 may confer protective, anti-obesogenic effects.

      Strengths:

      The work is exciting and relevant given the prevalence of obesity and the consequences of inflammation in the brain on perturbations of energy metabolism and ensuant metabolic diseases. Hypothalamic inflammation is associated with disrupted energy balance, and activated microglia within the hypothalamus resulting from excessive caloric intake and saturated fatty acids are often thought to be mediators of impairment of hypothalamic regulation of metabolism. The present work reports a novel notion in which immune cells recruited into the hypothalamus that express chemokine receptor CXCR3 may have a protective role against diet-induced obesity. In vivo studies reported herein demonstrate that inhibition of CXCR3 exacerbates high-fat diet-induced body weight gain, increases circulating triglycerides and fasting glucose levels, worsens glucose tolerance, and increases the expression of orexigenic neuropeptides, at least in female mice.

      This work provides a highly interesting and needed overview of preclinical and clinical brain inflammation, which is relevant to readers with an interest in metabolism and immunometabolism in the context of obesity.

      Using flow cytometry, cell sorting, and transcriptomics including RNA-sequencing, the manuscript provides novel insights into transcriptional landscapes of resident and recruited microglia in the hypothalamus. Importantly, sex differences are investigated.

      Overall, the manuscript is perceived to be highly interesting, relevant, and timely. The discussion is thoughtful, well-articulated, and a pleasure to read and felt to be of interest to a broad audience.

      Weaknesses:

      There were no major weaknesses perceived. Some comments for potential textual additions to the results/discussion are listed in recommendations to authors.

    1. Author Response

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      Through an unbiased genomewide KO screen, the authors identified loss of DBT to suppress MG132-mediated death of cultured RPE cells. Further analyses suggested that DBT reduces ubiquitinated proteins by promoting autophagy. Mechanistic studies indicated that DBT loss promotes autophagy via AMPK and its downstream ULK and mTOR signaling. Furthermore, loss of DBT suppresses polyglutamine- or TDP-43-mediated cytotoxicity and/or neurodegeneration in fly models. Finally, the authors showed that DBT proteins are increased in ALS patient tissues, compared to non-neurological controls.

      Strengths:

      The idea is novel, the evidence is mostly convincing, and the data are clean. The findings have implications for human diseases.

      Reply: We thank the reviewer for the supportive comments.

      Weaknesses:

      More experiments are needed to establish the connections between DBT and autophagy. The mechanistic studies are somewhat biased, and it's unclear whether the same mechanism (i.e., AMPK-->mTOR) can be applied to TDP-43-mediated neurodegeneration. Also, some data interpretation has to be more accurate.

      Reply: We thank the reviewer for raising these questions, and we have provided additional evidence in the revised manuscript to support the model that DBTKO can enhance autophagy and induce resistance to TDP-43-associated toxicity. This is described in greater detail below.

      (1) To provide further evidence for the connection between DBT and autophagy, we have introduced additional controls. For the additional controls, we have included the AMPK shRNA and drug treatment controls (Fig.4D, Fig.S4B), and these results suggest that reducing the AMPK level renders DBTKO cells sensitive to MG132 toxicity. We also added the TSC1 shRNA and mTOR agonist treatment controls (Fig.5E, Fig.S4G), and the results show that increasing mTOR levels also makes the DBTKO cells sensitive to MG132.

      (2) To further confirm the roles of AMPK and mTOR in DBTKO cells, we introduced the AMPK agonist (EX229) and mTOR inhibitors (RAD001 and AZD8055) in co-treatment experiments with MG132 and then measured cell survival (Fig.S4D, S4G). The results indicate that promoting AMPK activation or inhibiting mTOR can enhance cell resistance to MG132-induced toxicity.

      (3) Additionally, we included the overexpression and rescue experiments for DBT and analyzed the AMPK-ULK1 signaling in WT RPE1 and DBTKO cells (Fig.S5D, S5E). The results indicate that the increase of DBT can significantly reduce the phosphorylation of AMPK/ULK1 and the levels of the autophagy marker LC3II. Together, these results suggest that DBT plays an important role in autophagy.

      (4) We had shown in the original version of the manuscript that DBTKO renders cells more resistant to TDP-43-associated toxicity, similar to the tolerance of MG132-induced toxicity. Here we further show that expression of TDP-43M337V enhances the phosphorylation of AMPK in the DBTKO cells (Fig. S7A), similar to the effect of the MG132 treatment. These results suggest that the resistance of DBTKO cells to MG132 or TDP-43-associated toxicity shares a similar mechanism of activating the AMPK signaling.

      Reviewer #2 (Public Review):

      Summary:

      Hwang, Ran-Der et al utilized a CRISPR-Cas9 knockout in human retinal pigment epithelium (RPE1) cells to evaluate for suppressors of toxicity by the proteasome inhibitor MG132 and identified that knockout of dihydrolipoamide branched chain transacylase E2 (DBT) suppressed cell death. They show that DBT knockout in RPE1 cells does not alter proteasome or autophagy function at baseline. However, with MG132 treatment, they show a reduction in ubiquitinated proteins but with no change in proteasome function. Instead, they show that DBT knockout cells treated with MG132 have improved autophagy flux compared to wildtype cells treated with MG132. They show that MG132 treatment decreases ATP/ADP ratios to a greater extent in DBT knockout cells, and in accordance causes activation of AMPK. They then show downstream altered autophagy signaling in DBT knockout cells treated with MG132 compared to wild-type cells treated with MG132. Then they express the ALS mutant TDP43 M337V or expanded polyglutamine repeats to model Huntington's disease and show that knockdown of DBT improves cell survival in RPE1 cells with improved autophagic flux. They also utilize a Drosophila model and show that utilizing either an RNAi or CRISPR-Cas9 knockout of DBT improves eye pigment in TDP43M337V and polyglutamine repeat-expressing transgenic flies. Finally, they show evidence for increased DBT in postmortem spinal cord tissue from patients with ALS via both immunoblotting and immunofluorescence.

      Strengths:

      This is a mechanistic and well-designed paper that identifies DBT as a novel regulator of proteotoxicity via activating autophagy in the setting of proteasome inhibition. Major strengths include careful delineation of a mechanistic pathway to define how DBT is protective. These conclusions are largely justified, but additional experiments and information would be useful to clarify and extend these conclusions.

      Reply: We thank the reviewer for the supportive comments.

      Weaknesses:

      The large majority of the experiments are evaluating suppression of drug (MG132) toxicity in an in vitro epithelial cell line, so the generalizability to disease is unclear. Indeed, MG132 itself has been shown to modulate autophagy, and off-target effects of MG132 are not addressed. While this paper is strengthened by the inclusion of mouse-induced motor neurons, Drosophila models, and postmortem tissue, the putative mechanisms are minimally evaluated in these models.

      Also, this effect is only seen with MG132 treatment, at a dose that causes markedly impaired cell survival. In this setting, it is certainly plausible that changes in autophagy could be the result of differences in cell survival, as opposed to an underlying mechanism for cell survival. Additional controls would be useful to increase confidence that DBT knockdown is protective via modulation of autophagy.

      While the authors report increased DBT in postmortem ALS tissue as suggestive that DBT may modulate proteotoxicity in neurodegeneration, this point would be better supported with the evaluation of overexpression of DBT in their model.

      Reply: We appreciate the reviewer for raising these questions, and we have provided further evidence in the revised manuscript to support the proposed mechanism that DBTKO confers resistance to MG132-induced toxicity through activation of autophagy. This is discussed in greater detail below.

      (1) To provide further mechanistic analysis, we have included additional controls for the analysis of AMPK signaling in Fig. 4D and Fig. S4B. These results demonstrate that using drugs or shRNAs to reduce AMPK activity can decrease DBTKO survival. We have also shown that increasing AMPK activity with an activator enhances the survival of both WT and DBTKO cells under MG132 treatment (Fig. S4D), suggesting that DBTKO cells resist MG132-induced toxicity through the activation of AMPK signaling.

      (2) We have included additional controls for the analysis of mTOR signaling in Fig. 5E and Fig. S4F. The results in Fig. 5E show that reducing TSC1 using shRNAs can decrease DBTKO survival. We also added the experiments with mTOR agonist MHY1485 as a control in Fig. S4F. These results indicate that mTOR activation can promote DBTKO cells' sensitivity to MG132 toxicity. To further confirm the importance of mTOR in DBTKO-mediated resistance to MG132 toxicity, we included the mTOR inhibitors RAD001 and AZD8055 in the co-treatment experiments with MG132, and then measured cell survival (Fig. S4G). The results show that both mTOR inhibitors can enhance cell resistance to MG132-induced toxicity (Fig. S4G). These findings suggest that mTOR inhibition is required for DBTKO-mediated cell survival under MG132 treatment.

      (3) To further test the hypothesis that DBT knockdown is protective via modulation of autophagy, we have introduced the overexpression of DBT and the rescue of DBT in DBTKO cells to analyze the AMPK signaling that regulates autophagy (Fig. S5E). The results demonstrate that overexpression of DBT significantly reduced the phosphorylation of AMPK and ULK1 (Fig. S5E). In the rescue experiment, the results mirror those of the overexpression experiment, showing a significant reduction in the phosphorylation of AMPK and ULK1 (Fig. S5E). We also analyzed the autophagy marker LC3II in both the overexpression and rescue experiments, and the results indicate that increasing the DBT level specifically reduces the LC3II level (Fig. S5D). These results support the model that loss of DBT promotes the activation of autophagy.

      (4) To test the hypothesis that DBT may modulate proteotoxicity in neurodegeneration, we included the studies with TDP-43M337V and found that the expression of the mutant TDP43 enhanced the phosphorylation of AMPK in the DBTKO cells (Fig. S7A), consistent with the observations made with MG-132 treatment. Together with other findings in the manuscript, these results indicate that DBTKO can sensitize the activation of the AMPK signaling and confer the resistance to TDP-43-associated toxicity.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Editor’s summary:

      This paper by Castello-Serrano et al. addresses the role of lipid rafts in trafficking in the secretory pathway. By performing carefully controlled experiments with synthetic membrane proteins derived from the transmembrane region of LAT, the authors describe, model and quantify the importance of transmembrane domains in the kinetics of trafficking of a protein through the cell. Their data suggest affinity for ordered domains influences the kinetics of exit from the Golgi. Additional microscopy data suggest that lipid-driven partitioning might segregate Golgi membranes into domains. However, the relationship between the partitioning of the synthetic membrane proteins into ordered domains visualised ex vivo in GPMVs, and the domains in the TGN, remain at best correlative. Additional experiments that relate to the existence and nature of domains at the TGN are necessary to provide a direct connection between the phase partitioning capability of the transmembrane regions of membrane proteins and the sorting potential of this phenomenon.

      The authors have used the RUSH system to study the traffic of model secretory proteins containing single-pass transmembrane domains that confer defined affinities for liquid ordered (lo) phases in Giant Plasma Membrane derived Vesicles (GPMVs), out of the ER and Golgi. A native protein termed LAT partitioned into these lo-domains, unlike a synthetic model protein termed LAT-allL, which had a substituted transmembrane domain. The authors' experiments provide support for the idea that ER exit relies on motifs in the cytosolic tails, but that accelerated Golgi exit is correlated with lo domain partitioning.

      Additional experiments provided evidence for segregation of Golgi membranes into coexisting lipid-driven domains that potentially concentrate different proteins. Their inference is that lipid rafts play an important role in Golgi exit. While this is an attractive idea, the experiments described in this manuscript do not provide a convincing argument one way or the other. It does however revive the discussion about the relationship between the potential for phase partitioning and its influence on membrane traffic.

      We thank the editors and scientific reviewers for thorough evaluation of our manuscript and for positive feedback. While we agree that our experimental findings present a correlation between trafficking rates and raft affinity, in our view, the synthetic, minimal nature of the transmembrane protein constructs in question makes a strong argument for involvement of membrane domains in their trafficking. These constructs have no known sorting determinants and are unlikely to interact directly with trafficking proteins in cells, since they contain almost no extramembrane amino acids. Yet, the LATTMD traffics through Golgi similarly to the full-length LAT protein, but quite differently from mutants with lower raft phase affinity. We suggest that these observations can be best rationalized by involvement of raft domains in the trafficking fates and rates of these constructs, providing strong evidence (beyond a simple correlation) for the existence and relevance of such domains.

      We have substantially revised the manuscript to address all reviewer comments, including several new experiments and analyses. These revisions have substantially improved the manuscript without changing any of the core conclusions and we are pleased to have this version considered as the “version of record” in eLife.

      Below is our point-by-point response to all reviewer comments.

      ER exit:

      The experiments conducted to identify an ER exit motif in the C-terminal domain of LAT are straightforward and convincing. This is also consistent with available literature. The authors should comment on whether the conservation of the putative COPII association motif (detailed in Fig. 2A) is significantly higher than that of other parts of the C-terminal domain.

      Thank you for this suggestion, this information has now been included as Supp Fig 2B. While there are other well-conserved residues of the LAT C-terminus, many regions have relatively low conservation. In contrast, the essential residues of the COPII association motif (P148 and A150) are completely conserved in LAT across all species analyzed.

      One cause of concern is that addition of a short cytoplasmic domain from LAT is sufficient to drive ER exit, and in its absence the synthetic constructs are all very slow. However, the argument presented that specific lo phase partitioning behaviour of the TMDs do not have a significant effect on exit from the ER is a little confusing. This is related to the choice of the allL-TMD as the 'non-lo domain' partitioning comparator. Previous data has shown that longer TMDs (23+) promote ER export (eg. Munro 91, Munro 95, Sharpe 2005). The mechanism for this is not, to my knowledge, known. One could postulate that it has something to do with the very subject of this manuscript- lipid phase partitioning. If this is the case, then a TMD length of 22 might be a poor choice of comparison. A TMD 17 Ls' long would be a more appropriate 'non-raft' cargo. It would be interesting to see a couple of experiments with a cargo like this.

      The basis for the claim that raft affinity has relatively minor influence on ER exit kinetics, especially in comparison to the effect of the putative COPII interaction motif, is in Fig 1G. We do observe some differences between constructs and they may be related to raft affinity, however we considered these relatively minor compared to the nearly 4-fold increase in ER efflux induced by COPII motifs.

      We have modified the wording in the manuscript to avoid the impression that we have ruled out an effect of raft affinity on ER exit.

      We believe that our observations are broadly consistent with those of Munro and colleagues. In both their work and ours, long TMDs were able to exit the ER. In our experiments, this was true for several proteins with long TMDs, either as full-length or as TMD-only versions (see Fig 1G). We intentionally did not measure shorter synthetic TMDs because these would not have been comparable with the raft-preferring variants, which all require relatively long TMDs, as demonstrated in our previous work1,2. Thus, because our manuscript does not make any claims about the influence of TMD length on trafficking, we did not feel that experiments with shorter non-raft constructs would substantively influence our conclusions.

      However, to address reviewer interest, we did complete one set of experiments to test the effect of shortening the TMD on ER exit. We truncated the native LAT TMD by removing 6 residues from the C-terminal end of the TMD (LAT-TMDd6aa). This construct exited the ER similarly to all others we measured, revealing that for this set of constructs, short TMDs did not accumulate in the ER. ER exit of the truncated variant was slightly slower than the full-length LAT-TMD, but somewhat faster than the allL-TMD. These effects are consistent with our previous measurements which showed that this shortened construct has slightly lower raft phase partitioning than the LAT-TMD but higher than allL2. While these are interesting observations, a more thorough exploration of the effect of TMD length would be required to make any strong conclusion, so we did not include these data in the final manuscript.

      Author response image 1.

      Golgi exit:

      For the LAT constructs, the kinetics of Golgi exit as shown in Fig. 3B are surprisingly slow. About half of the protein remains in the Golgi at 1 h after biotin addition. Most secretory cargo proteins would have almost completely exited the Golgi by that time, as illustrated by VSVG in Fig. S3. There is a concern that LAT may have some tendency to linger in the Golgi, presumably due to a factor independent of the transmembrane domain, and therefore cannot be viewed as a good model protein. For kinetic modeling in particular, the existence of such an additional factor would be far from ideal. A valuable control would be to examine the Golgi exit kinetics of at least one additional secretory cargo.

      We disagree that LAT is an unusual protein with respect to Golgi efflux kinetics. In our experiments, Golgi efflux of VSVG was similar to full-length LAT (t1/2 ~ 45 min), and both of these were similar to previously reported values3. Especially for the truncated (i.e. TMD) constructs, it is very unlikely that some factor independent of their TMDs affects Golgi exit, as they contain almost no amino acids outside the membrane-embedded TMD.

      Practically, it has proven somewhat challenging to produce functional RUSH-Golgi constructs. We attempted the experiment suggested by the reviewer by constructing SBP-tagged versions of several model cargo proteins, but all failed to trap in the Golgi. We speculate that the Golgin84 hook is much more sensitive to the location of the SBP on the cargo, being an integral membrane protein rather than the lumenal KDEL-streptavidin hook. This limitation can likely be overcome by engineering the cargo, but we did not feel that another control cargo protein was essential for the conclusions we presented, thus we did not pursue this direction further.

      Comments about the trafficking model

      (1) In Figure 1E, the export of LAT-TMD from the ER is fitted to a single-exponential fit that the authors say is "well described". This is unclear and there is perhaps something more complex going on. It appears that there is an initial lag phase and then similar kinetics after that - perhaps the authors can comment on this?

      This is a good observation. This effect is explainable by the mechanics of the measurement: in Figs 1 and 2, we measure not ‘fraction of protein in ER’ but ‘fraction of cells positive for ER fluorescence’. This is because the very slow ER exit of the TMD-only constructs presents a major challenge for live-cell imaging, so ER exit was quantified on a population level, by fixing cells at various time points after biotin addition and quantifying the fraction of cells with observable ER localization (rather than tracking a single cell over time).

      For fitting to the kinetic model (which attempts to describe ‘fraction in ER/Golgi’) we re-measured all constructs by live-cell imaging (see Supp Fig 5) to directly quantify relative construct abundance in the ER or Golgi. These data did not have the plateau in Fig 1E, suggesting that this is an artifact of counting “ER positive cells” which would be expected to have a longer lag than “fraction of protein in ER”. Notably however, t1/2 measured by both methods was similar, suggesting that the population measurement agrees well with single-cell live imaging.

      We have included all these explanations and caveats in the manuscript. We have also changed the wording from “well described” to “reasonably approximated”.

      (2) The model for Golgi sorting is also complicated and controversial, and while the authors' intention not to overinterpret their data in this regard must be respected, this data is in support of the two-phase Golgi export model (Patterson et al PMID:18555781).

      The reviewers are correct, our observations and model are consistent with Patterson et al and it was a major oversight that a reference to this foundational work was not included. We have now added a discussion regarding the “two phase model” of Patterson and Lippincott-Schwartz.

      Furthermore, contrary to the statement in lines 200-202, the kinetics of VSVG exit from the Golgi (Fig. S3) are roughly linear and so are NOT consistent with the previous report by Hirschberg et al.

      Regarding kinetics of VSVG, our intention was to claim that the timescale of VSVG efflux from the Golgi was similar to previously reported in Hirschberg, i.e. t1/2 roughly between 30-60 minutes. We have clarified this in the text. Minor differences in the details between our observations and Hirschberg are likely attributable to temperature, as those measurements were done at 32°C for the tsVSVG mutant.

      Moreover, the kinetics of LAT export from the Golgi (Fig. 3B) appear quite different, more closely approximating exponential decay of the signal. These points should be described accurately and discussed.

      Regarding linear versus exponential fits, we agree that the reality of Golgi sorting and efflux is far more complicated than accounted for by either the phenomenological curve fitting in Figs 1-3 or the modeling in Fig 4. In addition to the possibility of lateral domains within Golgi stacks, there is transport between stacks, retrograde traffic, etc. The fits in Figs 1-3 are not intended to model specifics of transport, but rather to be phenomenological descriptors that allowed us to describe efflux kinetics with one parameter (i.e. t1/2). In contrast, the more refined kinetic modeling presented in Figure 4 is designed to test a mechanistic hypothesis (i.e. coexisting membrane domains in Golgi) and describes well the key features of the trafficking data.
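
      As an illustration of what a one-parameter phenomenological descriptor of this kind involves, the following minimal sketch fits a single-exponential decay to a time course and reports t1/2. It is not the authors' analysis code: the function name `fit_half_time` is hypothetical, the time points and fractions are synthetic values generated for the example, and the fit is a simple log-linear least-squares fit that assumes the signal decays to zero with no lag phase.

```python
import math


def fit_half_time(times_min, fractions):
    """Fit fraction(t) = a * exp(-k * t) by least squares on ln(fraction).

    Returns (a, k, t_half), where t_half = ln(2) / k.
    Assumes all fractions are positive and the decay has no offset or lag.
    """
    logs = [math.log(f) for f in fractions]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))
    slope = sxy / sxx                 # slope of ln(fraction) versus time
    k = -slope                        # decay rate constant, per minute
    a = math.exp(mean_y - slope * mean_t)
    return a, k, math.log(2) / k


# Synthetic example: noiseless decay generated with a 45 min half-time.
times = [0, 15, 30, 45, 60, 90]
true_k = math.log(2) / 45.0
fractions = [math.exp(-true_k * t) for t in times]

a, k, t_half = fit_half_time(times, fractions)
print(f"fitted t1/2 = {t_half:.1f} min")
```

      A fit of this form summarizes the time course with a single number but, as noted above, says nothing about the underlying transport mechanism; a lag phase or a plateau in the data would violate its assumptions.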

      Relationship between membrane traffic and domain partitioning:

      (1) Phase segregation in the GPMV is dictated by thermodynamics given its composition and the measurement temperature (at low temperatures 4degC). However at physiological temperatures (32-37degC) at which membrane trafficking is taking place these GPMVs are not phase separated. Hence it is difficult to argue that a sorting mechanism based solely on the partitioning of the synthetic LAT-TMD constructs into lo domains detected at low temperatures in GPMVs provides a basis (or its lack) for the differential kinetics of traffic out of the Golgi (or ER). The mechanism in a living cell to form any lipid based sorting platforms naturally requires further elaboration, and by definition cannot resemble the lo domains generated in GPMVs at low temperatures.

      We thank the reviewers for bringing up this important point. GPMVs are a useful tool because they allow direct, quantitative measurements of protein partitioning between coexisting ordered and disordered phases in complex, cell-derived membranes. However, we entirely agree, that GPMVs do not fully represent the native organization of the living cell plasma membrane and we have previously discussed some of the relevant differences4,5. Despite these caveats, many studies have supported the cellular relevance of phase separation in GPMVs and the partitioning of proteins to raft domains therein 6-9. Most notably, elegant experiments from several independent labs have shown that fluorescent lipid analogs that partition to Lo domains in GPMVs also show distinct diffusive behaviors in live cells 6,7, strongly suggesting the presence of nanoscopic Lo domains in live cells. Similarly, our recent collaborative work with the lab of Sarah Veatch showed excellent agreement between raft preference in GPMVs and protein organization in living immune cells imaged by super-resolution microscopy10. Further, several labs6,7, including ours11, have reported nice correlations between raft partitioning in GPMVs and detergent resistance, which is a classical (though controversial) assay for raft association.

      Based on these points, we feel that GPMVs are a useful tool for quantifying protein preference for ordered (raft) membrane domains and that this preference is a useful proxy for the raft-associated behavior of these probes in living cells. We propose that this approach allows us to overcome a major reason for the historical controversy surrounding the raft field: nonquantitative and unreliable methodologies that prevented consistent definition of which proteins are supposed to be present in lipid rafts and why. Our work directly addresses this limitation by relating quantitative raft affinity measurements in a biological membrane with a relevant and measurable cellular outcome, specifically inter-organelle trafficking rates.

      Addressing the point about phase transition temperatures in GPMVs: this is the temperature at which macroscopic domains are observed. Based on physical models of phase separation, it has been proposed that macroscopic phase separation at lower temperatures is consistent with sub-microscopic, nanoscale domains at higher temperatures8,12. These smaller domains can potentially be stabilized / functionalized by protein-protein interactions in cells13 that may not be present in GPMVs (e.g. because of lack of ATP).

      (2) The lipid compositions of each of these membranes - PM, ER and Golgi are drastically different. Each is likely to phase separate at different phase transition temperatures (if at all). The transition temperature is probably even lower for Golgi and the ER membranes compared to the PM. Hence, if the reported compositions of these compartments are to be taken at face value, the propensity to form phase separated domains at a physiological temperature will be very low. Are ordered domains even formed at the Golgi at physiological temperatures?

      It is a good point that the membrane compositions and the resulting physical properties (including any potential phase behavior) will be very different in the PM, ER, and Golgi. Whether ordered domains are present in any of these membranes in living cells remains difficult to directly visualize, especially for non-PM membranes which are not easily accessible by probes, are nanoscopic, and have complex morphologies. However, the fact that raft-preferring probes / proteins share some trafficking characteristics, while very similar non-raft mutants behave differently argues that raft affinity plays a role in subcellular traffic.

      (3) The hypothesis of 'lipid rafts' is a very specific idea, related to functional segregation, and the underlying basis for domain formation has been also hotly debated. In this article the authors conflate thermodynamic phase separation mechanisms with the potential formation of functional sorting domains, further adding to the confusion in the literature. To conclude that this segregation is indeed based on lipid environments of varying degrees of lipid order, it would probably be best to look at the heterogeneity of the various membranes directly using probes designed to measure lipid packing, and then look for colocalization of domains of different cargo with these domains.

      This is a very good suggestion, and a direction we are currently following. Unfortunately, due to the dynamic nature and small size of putative lateral membrane domains, combined with the interior of a cell being filled with lipophilic environments that overlay each other, directly imaging domains in organellar membranes with lipid packing probes remains extremely difficult with current technology (or at least available to us). We argue that the TMD probes used in this manuscript are a reasonable alternative, as they are fluorescent probes with validated selectivity for membrane compartments with different physical properties.

      Ultimately, the features of membrane domains suggested by a variety of techniques – i.e. nanometric, dynamic, relatively similar in composition to the surrounding membrane, potentially diverse/heterogeneous – make them inherently difficult to microscopically visualize. This is one reason why we believe studies like ours, which use a natural model system to directly quantify raft-associated behaviors and relate them to cellular effects (in our case, protein sorting), are a useful direction for this field.

      We believe we have been careful in our manuscript to avoid confusing language surrounding lipid rafts, phase separation, etc. Our experiments clearly show that mammalian membranes have the capacity to phase separate, that some proteins preferentially interact with more ordered domains, and that this preference is related to the subcellular trafficking fates and rates of these proteins. We have edited the manuscript to emphasize these claims and avoid the historical controversies and confusions.

      (4) In the super-resolution experiments (by SIM, where the enhancement of resolution is around two fold or less compared to optical), the authors are able to discern a segregation of the two types of Golgi-resident cargo that have different preferences for the lo-domains in GPMVs. It should be noted that TMD-allL and the LATallL end up in the late endosome after exit of the Golgi. Previous work from the Bonifacino laboratory (PMID: 28978644) has shown that proteins (such as M6PR) destined to go to the late endosome bud from a different part of the Golgi in vesicular carriers, while those that are destined for the cell surface first (including TfR) bud with tubular vesicular carriers. Thus at the resolution depicted in Fig 5, the segregation seen by the authors could be due to an alternative explanation, that these molecules are present in different areas of the Golgi for reasons different from phase partitioning. The relatively high colocalization of TfR with the GPI probe in Fig 5E is consistent with this explanation. TfR and GPI prefer different domains in the GPMV assays yet they show a high degree of colocalization and also traffic to the cell surface.

      This is a good point. Even at microscopic resolutions beyond the optical diffraction limit, we cannot make any strong claims that the segregation we observe is due to lateral lipid domains and not several reasonable alternatives, including separation between cisternae (rather than within), cargo vesicles moving between cisternae, or lateral domains that are mediated by protein assemblies rather than lipids. We have explicitly included this point in the Discussion: “Our SIM imaging suggests segregation of raft from nonraft cargo in the Golgi shortly (5 min) after RUSH release (Fig 5B), but at this level of resolution, we can only report reduced colocalization, not intra-Golgi protein distributions. Moreover, segregation within a Golgi cisterna would be very difficult to distinguish from cargo moving between cisternae at different rates or exiting via Golgi-proximal vesicles.”

      We have also added a similar caveat in the Results section of the manuscript: “These observations support the hypothesis that proteins can segregate in Golgi based on their affinity for distinct membrane domains; however, it is important to emphasize that this segregation does not necessarily imply lateral lipid-driven domains within a Golgi cisterna. Reasonable alternative possibilities include separation between cisternae (rather than within), cargo vesicles moving between cisternae, or lateral domains that are mediated by protein assemblies rather than lipids.”

      Finally, while probes with allL TMD do eventually end up in late endosomes (consistent with the Bonifacino lab’s findings which we include), they do so while initially transiting the PM2,11.

      Minor concerns:

      (1) Generally, the quantitation is high quality from difficult experimental data. Although a lot appears to be manual, it appears appropriately performed and interpreted. There are some claims that are made based on this quantitation, however, where there are no statistics performed. For example, figure 1B. Any quantitation with an accompanying conclusion should be subject to a statistical test. I think this is particularly important for the quality of the model fits.

      We appreciate the thoughtful feedback. The quantifications and fits were not trivial, but we believe they are important. We have added statistical significance to Figure 1B and others where it was missing.

      (2) Modulation of lipid levels in Fig 4E shows a significant change for the trafficking rate for the LAT-TMD construct and a not so significant change for all-TMD construct. However, these data are not convincing and appear to depend on a singular data point that appears to lower the mean value. In general, the experiment with the MZA inhibitor (Fig. 4D-F) is hard to interpret because cells will likely be sick after inhibition of sphingolipid and cholesterol synthesis. Moreover, the difference in effects for LAT-TMD and allL-TMD is marginal.

      We disagree with this interpretation. Fig 4E shows the average of three experiments and demonstrates clearly that the inhibitors change the Golgi efflux rate of LAT-TMD but not allL-TMD. This is summarized in the t1/2 quantifications of Fig 4F, which show a statistically significant change for LAT-TMD but not allL-TMD. This is not an effect of a singular data point, but rather the trend across the dataset.

      Further, the inhibitor conditions were tuned carefully to avoid cells becoming “sick”: at higher concentrations, cells did adopt unusual morphologies and began to detach from the plates. We pursued only lower concentrations, which cells survived for at least 48 hrs and without major morphological changes.

      (3) Line 173: 146-AAPSA-152 should read either 146-AAPSA-150 or 146-AAPSAPA-152, depending on what the authors intended.

      Thanks for the careful reading, we intended the former and it has been fixed.

      (4) What is the actual statistical significance in Fig. 3C and Fig. 3E? There is a single asterisk in each panel of the figure but two asterisks in the legend.

      Apologies, a single asterisk representing p<0.05 was intended. It has been fixed.

      (5) The code used to calculate the model is not accessible. It is standard practice to host well-annotated code on Github or similar, and it would be good to have this publicly available.

      We have deposited the code in a public repository (doi: 10.5281/zenodo.10478607) and added a note to the Methods.

      (1) Lorent, J. H. et al. Structural determinants and functional consequences of protein affinity for membrane rafts. Nature communications 8, 1219 (2017). PMC5663905

      (2) Diaz-Rohrer, B. B., Levental, K. R., Simons, K. & Levental, I. Membrane raft association is a determinant of plasma membrane localization. Proc Natl Acad Sci U S A 111, 8500-8505 (2014). PMC4060687

      (3) Hirschberg, K. et al. Kinetic analysis of secretory protein traffic and characterization of golgi to plasma membrane transport intermediates in living cells. J Cell Biol 143, 1485-1503 (1998). PMC2132993

      (4) Levental, K. R. & Levental, I. Giant plasma membrane vesicles: models for understanding membrane organization. Current topics in membranes 75, 25-57 (2015).

      (5) Sezgin, E. et al. Elucidating membrane structure and protein behavior using giant plasma membrane vesicles. Nat Protoc 7, 1042-1051 (2012).

      (6) Komura, N. et al. Raft-based interactions of gangliosides with a GPI-anchored receptor. Nat Chem Biol 12, 402-410 (2016).

      (7) Kinoshita, M. et al. Raft-based sphingomyelin interactions revealed by new fluorescent sphingomyelin analogs. J Cell Biol 216, 1183-1204 (2017). PMC5379944

      (8) Stone, M. B., Shelby, S. A., Nunez, M. F., Wisser, K. & Veatch, S. L. Protein sorting by lipid phase-like domains supports emergent signaling function in B lymphocyte plasma membranes. eLife 6 (2017). PMC5373823

      (9) Machta, B. B. et al. Conditions that Stabilize Membrane Domains Also Antagonize n-Alcohol Anesthesia. Biophys J 111, 537-545 (2016).

      (10) Shelby, S. A., Castello-Serrano, I., Wisser, I., Levental, I. & S., V. Membrane phase separation drives protein organization at BCR clusters. Nat Chem Biol in press (2023).

      (11) Diaz-Rohrer, B. et al. Rab3 mediates a pathway for endocytic sorting and plasma membrane recycling of ordered microdomains. Proc Natl Acad Sci U S A 120, e2207461120 (2023).

      (12) Veatch, S. L. et al. Critical fluctuations in plasma membrane vesicles. ACS Chem Biol 3, 287-293 (2008).

      (13) Wang, H. Y. et al. Coupling of protein condensates to ordered lipid domains determines functional membrane organization. Science advances 9, eadf6205 (2023). PMC10132753

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Suggestions to the authors:

      • Please re-analyze findings by omitting from all Tables and Figures all data of comparators who were not randomized (BAC). I understand the difficulties of running this trial but the results of excess reduction of mortality do not allow the publication of a trial where comparators do not come from the randomized patient population.

      We wish to thank the editors and reviewers for their useful comments. Given that the study was designed with both randomised and CC participants, we cannot easily exclude the CC analysis from the paper. However, we do provide graphs for both randomised only and randomised and CC participants for the primary and secondary endpoints. The fact that the primary endpoint (CRP) results are mirrored in both instances is also informative from a trial design perspective and indicates that the effect of dornase alfa therapy on inflammation is robust enough to yield the same results with small and larger cohorts.

      We agree that there are potential drawbacks of using contemporary controls. To minimise these potential biases, we used CC patients recruited during the same time period at a single site using the same selection criteria as the randomised group. Moreover, the comparison of CRP in CC-BAC participants to concurrent randomised control R-BAC patients indicated that the two groups responded to BAC treatment in the same manner (Table 2, LS means log(CRP) 3.78 vs 3.53, P=0.386), whereas the R-BAC+DA vs R-BAC group comparison yielded significant differences (Table 2, LS means log(CRP) 3.1 vs 3.59, P=0.041). These comparisons mitigate these potential problems to a large degree.

      Still, to make the groups easy to distinguish, we now use the following unique nomenclature throughout the manuscript, which is clearly defined on ln. 111, and state that comparisons of treated participants were performed with both control groups separately and combined.

      R-BAC: Randomised BAC
      CC-BAC: Contemporary control BAC
      R-BAC+DA: Randomised BAC + dornase alfa
      T-BAC: R-BAC + CC-BAC

      In fact, the most important bias in our study might actually be the placebo effect, given that participants randomised to BAC did not receive a nebulized control substance. We now discuss these points in more detail in the manuscript and have modified the title by removing the reference to a randomised trial and clinical outcomes.

      • The presentation remains confusing and the manuscript should be critically revised for clarity. There is a repetition of methods (e.g. lines 176-187 repeat 160-175) and redundant results (e.g. Figure S2, Table 3).

      We apologise for the repetition. We removed the repeated text in the Exclusion criteria (lines 176-187 in the old manuscript).

      Figure S2 is not related to Table 3. Figure S2 depicts baseline characteristics, whereas Table 3 complements the graph in Figure 3A but lists the mean daily value of the primary endpoint as requested by Reviewer 1 in the first round of revision.

      At Table 4: the authors should select one method of illustration for lab results, either Table or figure, without repetitions

      We agree and have removed Table 4 leaving the graphs instead.

      • Regarding inclusion criteria, it is unclear whether high radiological suspicion is sufficient for inclusion or whether PCR based confirmation is required in all instances (differences in wording between lines 153 and 191), and under which oxygen requirements (lines 155 and 192)

      We thank the reviewer for pointing this out. Indeed, radiological suspicion was not sufficient and all participants in this study had a positive PCR test as part of their diagnosis prior to inclusion in the study. The entire eligibility section was rewritten to reflect this important point.

      • Table 1 should be merged with Table S2 and a better description of cohort baseline severity (P/F, SOFA, APACHE, organ support, number of patients in each point of the WHO severity score) and treatments should be made available.

      We thank the reviewer for this suggestion. We have now merged Table 1 and S2 and included WHO ordinal severity information in Table 1, with median, average, SD, min and max values which reflect the participant distribution. Unfortunately, although the additional requested information was recorded, it was not systematically collected for the analysis of the trial and it was not straightforward to compile at this stage.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendation for the authors):

      (1) On a few occasions, I found that the authors would introduce a concept, but provide evidence much later on. For example, in line 57, they introduced the idea that feedback timing modulates engagement of the hippocampus and striatum, but they provided the details much later on around line 99. There are a few instances like these, and the authors may want to go through the manuscript critically to bridge such gaps to improve the flow of reading.

      First, we thank the reviewer for acknowledging the contribution of our study and the methodological choices. We acknowledge the concern raised about the flow of information in the introduction. We have critically reviewed the manuscript, especially its writing style and overall structure, to ensure a smoother transition between the introduction of concepts and the provision of supporting evidence. In the case of the concept of feedback timing and memory systems, lines 46-58 first introduce the concept together with evidence regarding adults, and we then pick up the concept again around line 103 to relate it to children and their brain development and so motivate our research question. To further improve readability, we have included an outline of what to expect in the introduction. Specifically, we added a sentence in lines 66-68 that provides an overview of the different paragraphs: “We will introduce the key parameters in reinforcement learning and then we review the existing literature on developmental trajectories in reinforcement learning as well as on hippocampus and striatum, our two brain regions of interest.”

      This should better prepare the reader for when to expect more evidence regarding the concepts introduced. We included similar “road-marker” outline sentences on the other occasions the reviewer commented on, to enhance consistency and readability.

      (2) I am curious as to how they think the 5-second delay condition maps onto real-life examples, for example in a classroom setting feedback after 5 seconds could easily be framed as immediate feedback.

      The authors may want to highlight a few illustrative examples.

      Thank you for asking about the practical implications of a 5-second delay condition, which may be very relevant to the reader. We have modified the introductory example in lines 39-41 to address the role of feedback timing in the classroom and to point out its practical relevance early on: “For example, children must learn to raise their hand before speaking during class. The teacher may reinforce this behavior immediately or with a delay, which raises the question whether feedback timing modulates their learning”.

      We have also expanded the corresponding discussion point in lines 720-728 to pick up the classroom example and to illustrate how we think timescale differences may apply: “In scenarios such as in the classroom, a teacher may comment on a child’s behavior immediately after the action or some moments later, on par with our experimental manipulation of 1 second versus 5 seconds. Within such a short range of delay in teachers’ feedback, children’s learning ability during the first years of schooling may function equally well and depend on the striatal-dependent memory system. However, we anticipate that the reliance on the hippocampus will become even more pronounced when feedback is further delayed for a longer time. Children’s capacity for learning over longer timescales relies on the hippocampal-dependent memory system, which is still under development. This knowledge could help to better structure learning according to their development.”

      (3) In the methods section, there are a few instances of task description discrepancies which make things a little bit confusing, for example, line 173 reward versus punishment, or reward versus null elsewhere e.g. line 229. In the same section, line 175, there are a few instances of typos.

      We appreciate your attention to detail in pointing out discrepancies in task descriptions and typos in the method section. We have revised the section, corrected typos, and now phrased the learning outcomes consistently as “reward” and “punishment”.

      (4) I wasn't very clear as to why the authors did not compute choice switch probability directly from raw data but implemented this as a model that makes use of a weight parameter. The former would be much easier and more straightforward for data plotting, especially for uninformed readers, i.e., people who do not have backgrounds in computational modelling.

      Thank you for asking for clarification on the calculation of switching behavior. Indeed, in the behavioral results, switching behavior was directly calculated from the raw data. We now stress this in the methods in lines 230-235, also by naming win-stay and lose-shift as “proportions” instead of as “probabilities”: “As a first step, we calculated learning outcomes directly from the raw data, which were learning accuracy, win-stay and lose-shift behavior as well as reaction time.

      Learning accuracy was defined as the proportion to choose the more rewarding option, while win-stay and lose-shift refer to the proportion of staying with the previously chosen option after a reward and switching to the alternative choice after receiving a punishment, respectively.”
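
      The raw-data measures defined above can be sketched in a few lines of Python. This is a minimal illustration, not the analysis code used in the study: the function names, the coding of choices as integers, and the coding of outcomes as booleans (reward versus punishment) are all assumptions made for the example.

```python
from typing import Sequence, Tuple


def learning_accuracy(choices: Sequence[int], better_option: int) -> float:
    """Proportion of trials on which the more rewarding option was chosen."""
    if not choices:
        raise ValueError("need at least one trial")
    return sum(c == better_option for c in choices) / len(choices)


def win_stay_lose_shift(
    choices: Sequence[int], rewarded: Sequence[bool]
) -> Tuple[float, float]:
    """Return (win-stay, lose-shift) proportions for one trial sequence.

    choices[t]  : option chosen on trial t (hypothetical integer coding)
    rewarded[t] : True if trial t ended in reward, False if in punishment
    """
    if len(choices) != len(rewarded):
        raise ValueError("choices and rewarded must have the same length")
    wins = stays_after_win = 0
    losses = shifts_after_loss = 0
    # Each trial from the second one on is classified by the previous outcome.
    for t in range(1, len(choices)):
        stayed = choices[t] == choices[t - 1]
        if rewarded[t - 1]:
            wins += 1
            stays_after_win += int(stayed)
        else:
            losses += 1
            shifts_after_loss += int(not stayed)
    win_stay = stays_after_win / wins if wins else float("nan")
    lose_shift = shifts_after_loss / losses if losses else float("nan")
    return win_stay, lose_shift
```

      Because both measures are plain proportions over observed trials, they need no model fitting, which is what distinguishes them from the model-based switching weight.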

      In contrast to the raw data switching behavior, the computational heuristic strategy model indeed uses a weight for a relative tendency of switching behavior. We have also stressed the advantage of the computational measure and its difference from the raw data switching behavior in lines 248-252 and believe that the reader can now clearly distinguish between the raw data and the computational results: “Note that these model-based outcomes are not identical to the win-stay and lose-shift behavior that were calculated from the raw data. The use of such model-based measure offers the advantage in discerning the underlying hidden cognitive process with greater nuance, in contrast to classical approaches that directly use raw behavioral data.”

      (5) I agree with the authors' assertion that both inverse temperature and outcome sensitivity parameters may lead to non-identifiability issues, but I was not 100% convinced about their modelling approach exclusively assessing a different family of models (inv temperature versus outcome sensitivity). Here, I would like to make one mid-way recommendation. They may want to redefine the inverse temperature term in terms of reaction time, i.e., B = exp(s + g(RT - mean(RT))), where s and g are free parameters (see Webb, 2019), and keep the outcome sensitivity parameter in the model with bounds [0,2] so that the interpretation could be % increase or decrease in actual outcome. Personally, in tasks with binary outcomes i.e. [0,1: null vs reward] I do not think outcome sensitivity parameters higher than 2 are interpretable as these assign an inflated coefficient to outcomes.

      We appreciate the mid-way recommendation regarding the modeling approach for inverse temperature and outcome sensitivity parameters. We have carefully revised our analysis approach by considering alternative modeling choices. Regarding the suggestion to redefine the inverse temperature in terms of reaction time by B = exp(s + g(RT - mean(RT))), we unfortunately were not able to identify the reference Webb (2019), nor did we find references to the suggested modeling approach. Any further information that the reviewer could provide will be greatly appreciated. Regardless, we agree that including reaction times through the implementation of drift-diffusion modeling may be beneficial. However, changing the inverse temperature model in such a way would necessitate major changes in our modeling approach, which unfortunately would result in non-convergence issues in our MCMC pipeline using Rstan. Hence, this approach goes beyond the scope of the manuscript. Nonetheless, we have decided to mention the use of a drift-diffusion model, along with other methodological considerations, as a future recommendation for disentangling outcome sensitivity from inverse temperature in lines 711-712: “Future studies might shed new light by examining neural activations at both task phases, by additionally modeling reaction times using a drift-diffusion approach, or by choosing a task design that allows independent manipulations of these phases and associated model parameters, e.g., by using different reward magnitudes during reinforcement learning, or by studying outcome sensitivity without decision-making.”

      Regarding the upper bound of outcome sensitivity, we agree that traditionally, limiting the parameter values at 2 is the choice for the parameter to be best interpretable. During model fitting, we had experienced non-convergence issues and ceiling effects in the outcome sensitivity parameter when fixing the inverse temperature at 1. The non-convergence issue was not resolved when we fixed the inverse temperature at 15.47, which was the group mean of the winning inverse temperature family. Model convergence was only achieved after increasing the outcome sensitivity upper bound to 20, with inverse temperature again fixed at 1. Since this model also performed well during parameter and model recovery, we argue that the parameter is nevertheless meaningful, despite the more extreme trial-to-trial value fluctuations under higher outcome sensitivity. We described our choice for this model in the methods section in lines 282-288: “Even though outcome sensitivity is usually restricted to an upper bound of 2 to not inflate outcomes at value update, this configuration led to ceiling effects in outcome sensitivity and non-converging model results. Further, this issue was not resolved when we fixed the inverse temperature at the group mean of 15.47 of the winning inverse temperature family model. It may be that in children, individual differences in outcome sensitivity are more pronounced, leading to more extreme values. Therefore, we decided to extend the upper bound to 20, parallel to the inverse temperature, and all our models converged with Rhat < 1.1.”.
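
      The identifiability problem discussed in this exchange can be made concrete with a small sketch. It assumes a generic value-update model in which outcomes are scaled by an outcome sensitivity `rho` before a delta-rule update with learning rate `alpha`, and choices follow a softmax with inverse temperature `beta`; the exact model specification used in the manuscript may differ, and the function name, the zero starting values, and the +1/-1 outcome coding are illustrative assumptions. Under these assumptions the values scale linearly with `rho`, so only the product of `beta` and `rho` affects choice probabilities, which is why one of the two is fixed in each model family.

```python
import math
from typing import List, Sequence


def choice_probabilities(
    choices: Sequence[int],
    outcomes: Sequence[float],
    alpha: float,
    beta: float,
    rho: float,
    n_options: int = 2,
) -> List[List[float]]:
    """Trial-wise softmax choice probabilities of a simple value-learning model.

    alpha : learning rate
    beta  : inverse temperature of the softmax
    rho   : outcome sensitivity, scaling the outcome at value update
    """
    q = [0.0] * n_options  # assumed starting values
    probs = []
    for c, r in zip(choices, outcomes):
        # Softmax over beta * Q, shifted by the maximum for numerical stability.
        scaled = [beta * v for v in q]
        m = max(scaled)
        expv = [math.exp(s - m) for s in scaled]
        z = sum(expv)
        probs.append([e / z for e in expv])
        # Delta-rule update of the chosen option with a scaled outcome.
        q[c] += alpha * (rho * r - q[c])
    return probs
```

      For example, `beta=2, rho=1` and `beta=1, rho=2` give the same choice probabilities on every trial of the same choice and outcome sequence, so the data cannot tell the two settings apart.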

      (6) I think the authors reporting optimal parameters for the model is very important (line 464), but the learning rate they report under stable contingencies is much higher than LRs reported by for example Behrens et al 2007, LRs around 0.08 for the optimal learning behaviour. The authors may want to discuss why their task design calls for higher learning rates.

      Thank you for appreciating our optimal parameter analysis, and for the recommendation to discuss why optimal learning rates in our task design may call for higher learning rates compared to those reported in some other studies. As largely articulated in Zhang et al (2020; primer piece by one of our co-authors), the optimal parameter combination is determined by several factors, such as the reward schedule (e.g., 75:25 vs 80:20), the task design (e.g., no reversal, one reversal, vs multiple reversals), and the number of trials (e.g., 80 vs 100 vs 120). Notably, in these task-related regards, our task is different from Behrens et al. (2007), which hinders a quantitative comparison among the optimal parameters in the two tasks. We have now included more details in our discussion in lines 643-656: “However, the differences in learning rate across studies have to be interpreted with caution. The differences in the task and the analysis approach may limit their comparability. Task properties such as the trial number per condition differed across studies. Our study included 32 trials per cue in each condition, while in adult studies, the trials per condition ranged from 28 to 100. Optimal learning rates in a stable learning environment were at around 0.25 for 10 to 30 trials; another study reported a lower optimal learning rate of around 0.08 for 120 trials. This may partly explain why in our case of 32 trials per condition and cue, the task called for a relatively high optimal learning rate of 0.29, while in other studies, optimal learning rates may be lower. Regarding differences in the analysis approach, the hierarchical Bayesian estimation approach used in our study produces more reliable results in comparison to maximum likelihood estimation, which had been used in some of the previous adult studies and may have led to biased results towards extreme values.

      Taken together, our study underscores the importance of using longitudinal data to examine developmental change as well as the importance of simulation-based optimal parameters to interpret the direction of developmental change.”
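
      The idea of a simulation-based optimal learning rate can be sketched as follows. This is a hedged illustration only: the 75:25 reward schedule, the +1/-1 outcome coding, the fixed inverse temperature, the number of simulated agents, and the function names are assumptions chosen for the example and are not taken from the manuscript. The sketch simulates agents with a given learning rate in a stable two-option environment and picks the candidate learning rate with the highest mean accuracy.

```python
import math
import random
from typing import Sequence


def simulate_accuracy(
    alpha: float,
    beta: float,
    n_trials: int = 32,
    p_good: float = 0.75,
    n_agents: int = 500,
    seed: int = 0,
) -> float:
    """Mean proportion of choices of the better option across simulated agents."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(n_agents):
        q = [0.0, 0.0]  # option 0 is the better option throughout (stable task)
        for _ in range(n_trials):
            # Two-option softmax written as a logistic function.
            p0 = 1.0 / (1.0 + math.exp(-beta * (q[0] - q[1])))
            c = 0 if rng.random() < p0 else 1
            p_reward = p_good if c == 0 else 1.0 - p_good
            r = 1.0 if rng.random() < p_reward else -1.0
            q[c] += alpha * (r - q[c])  # delta-rule value update
            correct += int(c == 0)
    return correct / (n_agents * n_trials)


def best_learning_rate(alphas: Sequence[float], beta: float, **kwargs) -> float:
    """Candidate learning rate with the highest simulated accuracy."""
    return max(alphas, key=lambda a: simulate_accuracy(a, beta, **kwargs))
```

      Changing `n_trials` or `p_good` in such a simulation shifts which learning rate performs best, which is the reason optimal values reported for tasks with different trial numbers and reward schedules are hard to compare directly.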

      (7) The authors may want to report degrees of freedom in t-tests so that it would be possible to infer the final sample size for a specific analysis, for example, line 546.

      We appreciate the recommendation to include degrees of freedom, which are now added in all t-test results, for example in line 579: “Episodic memory, as measured by individual corrected object recognition memory (hits - false alarms) of confident (“sure”) ratings, showed at trend better memory for items shown in the delayed feedback condition (β = .009, SE = .005, t(df = 137) = 1.80, p = .074, see Figure 5A).”
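
      The corrected recognition score mentioned in this reply (hits minus false alarms, restricted to confident ratings) can be sketched as below. The record layout and names are hypothetical, and the sketch assumes that hits and false alarms are expressed as rates over old and new items respectively; the manuscript may compute them differently.

```python
from typing import Iterable, NamedTuple


class Response(NamedTuple):
    is_old: bool    # item was shown during learning
    said_old: bool  # participant judged the item as old
    sure: bool      # participant gave a confident ("sure") rating


def corrected_recognition(responses: Iterable[Response]) -> float:
    """Hit rate minus false-alarm rate, using only confident responses."""
    kept = [r for r in responses if r.sure]
    old = [r for r in kept if r.is_old]
    new = [r for r in kept if not r.is_old]
    if not old or not new:
        raise ValueError("need confident responses to both old and new items")
    hit_rate = sum(r.said_old for r in old) / len(old)
    false_alarm_rate = sum(r.said_old for r in new) / len(new)
    return hit_rate - false_alarm_rate
```

      A score of 0 corresponds to no discrimination between old and new items, and a score of 1 to perfect discrimination.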

      (8) I'm not sure why reductions in lose shift behaviour are framed as an improvement between 2 assessment points, e.g. line 578. It all depends on the strength of the contingency so a discussion around this point should be expanded.

      We acknowledge that a reduction in lose-shift behavior only reflects improvement under certain conditions where uncertainty is low and the learning contingencies are stable, which is the case in our task. We have added Supplementary Material 4 to illustrate the optimality of win-stay and lose-shift proportions from model simulation and to confirm that children’s longitudinal development was indeed towards more optimal switching behavior. In the manuscript, we refer to these results in lines 488-490: “We further found that the average longitudinal change in win-stay and lose-shift proportion also developed towards more optimal value-based learning (Supplementary Material 4).”

      (9) If I'm not mistaken, the authors reframe a trend-level association as weak evidence. I do not think this is an accurate framing considering the association is strictly non-significant, therefore should be omitted line 585.

      We thank the reviewer for the point regarding the interpretation of a trend-level association as weak evidence. We changed our interpretation, corrected in lines 581-585: “The inclusion of poor learners in the complete dataset may have weakened this effect because their hippocampal function was worse and was not involved in learning (nor encoding), regardless of feedback timing. To summarize, there was inconclusive support for enhanced episodic memory during delayed compared to immediate feedback, calling for future study to test the postulation of a selective association between hippocampal volume and delayed feedback learning.” as well as lines 622-623: “Contrary to our expectations, episodic memory performance was not enhanced under delayed feedback compared to immediate feedback.”

      Reviewer #2 (Public Review):

      We thank the reviewer for acknowledging the strength of our study and pointing out its weaknesses.

      Weaknesses:

      There were a few things that I thought would be helpful to clarify. First, what exactly are the anatomical regions included in the striatum here?

      We appreciate the clarification question regarding the anatomical regions included in the striatum. The striatum included ventral and dorsal regions, i.e., accumbens, caudate and putamen. We have now specified the anatomical regions that were included in the striatum in lines 211-212: “We extracted the bilateral brain volumes for our regions of interest, which were striatum and hippocampus. The striatum regions included nucleus accumbens, caudate and putamen.”

      Second, it was mentioned that for the reduced dataset, object recognition memory focused on "sure" ratings. This seems like the appropriate way to do it, but it was not clear whether this was also the case for the full analyses in the main text.

      Thank you for pointing out that in the full dataset analysis, the use of “sure” ratings for object recognition memory was previously not mentioned. Including only “sure” ratings was used consistently across analyses. This detail is now described under methods in lines 332-333: “Only confident (“sure”) ratings were included in the analysis, which were 98.1 % of all given responses.”

      Third, the children's fitted parameters were far from optimal; is it known whether adults would be closer to optimal on the task?

      Thank you for your question on whether adult learning rates in the task have been reported to be more optimal than those of the children in our study. This indeed seems to be the case, and we added this point in our discussion in lines 639-643: “Adult studies that examined feedback timing during reinforcement learning reported average learning rates ranging from 0.12 to 0.34, which are much closer to the simulated optimal learning rates of 0.29 than children’s average learning rates of 0.02 and 0.05 at wave 1 and 2 in our study. Therefore, it is likely that individuals approach adult-like optimal learning rates later during adolescence.”
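
      As a rough illustration of why a learning rate of 0.02 or 0.05 lags so far behind one near 0.29, the sketch below applies a standard delta-rule (Rescorla-Wagner-style) value update. This is an assumption made for illustration only: the function names `update_value` and `learn` are hypothetical, the reward sequence is made up, and the model fitted in the study may differ in its details.

```python
def update_value(value, reward, alpha):
    """One delta-rule step: move the estimate toward the reward by a fraction alpha."""
    return value + alpha * (reward - value)


def learn(rewards, alpha, initial_value=0.0):
    """Apply the delta rule over a sequence of rewards and return the final estimate."""
    value = initial_value
    for reward in rewards:
        value = update_value(value, reward, alpha)
    return value


if __name__ == "__main__":
    # Hypothetical sequence: ten consecutive rewarded trials (reward = 1.0).
    rewards = [1.0] * 10
    # Learning rates taken from the response above (children: 0.02, 0.05; simulated optimum: 0.29).
    for alpha in (0.02, 0.05, 0.29):
        print(f"alpha={alpha:.2f}: value={learn(rewards, alpha):.3f}")
```

      Under these assumptions, after ten rewarded trials the estimate with a learning rate of 0.29 is close to the true reward value, while the estimates with 0.02 or 0.05 remain far below it.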

      The main thing I would find helpful is to better integrate the differences between the main results reported and the many additional results reported in the supplement, for example from the reduced dataset when excluding non-learners. I found it a bit challenging to keep track of all the differences with all the analyses and parameters. It might be helpful to report some results in tables side-by-side in the two different samples. And if relevant, discuss the differences or their implication in the Discussion. For example, if the patterns change when excluding the poor learners, in particular for the associations between delayed feedback and hippocampal volume, and those participants were also those less well fit by the value-based model, is that something to be concerned about and does that affect any interpretations? What was not clear to me is whether excluding the poor learners at one extreme simply weakens the general pattern, or whether there is a more qualitative difference between learners and non-learners. The discussion points to the relevance of deficits in hippocampal-dependent learning for psychopathology and understanding such a distinction may be relevant.

      We appreciate the feedback that it might seem challenging to keep track of differences between the analyses of the full and the reduced dataset. We have now gathered all the analyses for the reduced dataset in Supplementary Material 6, with side-by-side tables for comparison to the full dataset results. Whenever there were differences between the results, they were pointed out in the results section, see lines 557-560: “In the results of the reduced dataset, the hippocampal association to the delayed learning score was no longer significant, suggesting a weakened pattern when excluding poor learners (Supplementary Material 6). It is likely that the exclusion reduced the group variance for hippocampal volume and delayed learning score in the model.” and lines 579-581: “Note that in the reduced dataset, delayed feedback predicted enhanced item memory significantly (Supplementary Material 6).”

      The differences found were further included in our discussion in lines 737-740 in the context of deficits in hippocampal-dependent learning and psychopathology: “Interestingly, poor learners showed relatively less value-based learning in favor of stronger simple heuristic strategies, and excluding them modulated the hippocampal-dependent associations to learning and memory in our results. More studies are needed to further clarify the relationship between hippocampus and psychopathology during cognitive and brain development.”

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      (1) There appears to be a flaw in the exploration of cortical inputs. The authors never show that HFS of cortical inputs has no effect in the absence of thalamic stimulation. It appears that there is a citation showing this, but I think it would be important to show this in this study as well.

      We understand that the reviewer would like us to induce an HFS protocol on cortical input and then test if there is any change in synaptic strength in thalamic input. We have done this experiment which shows that without a footshock, high-frequency stimulation (HFS) of the cortical inputs did not induce synaptic potentiation on the thalamic pathway (Extended Data Fig. 4d).

      (2) It is somewhat confusing that the authors refer to the cortical input as driving heterosynaptic LTP, but this is not shown until Figure 4J, that after non-associative conditioning (unpaired shock and tone) HFS of the cortex can drive freezing and heterosynaptic LTP of thalamic inputs.

      We agree with the reviewer that it is in Figure 4j and Figure 5b,c that we show electrophysiological evidence for cortical input driving heterosynaptic LTP. Only to remain consistent with our terminology did we initially use behavioral evidence as the proxy for heteroLTP (Figure 3c).

      …, the authors are 'surprised' by this outcome, which appears to be what they predict.

      We removed the phrase “To our surprise”.

      (3) 'Cortex' as a stimulation site is vague. The authors have the coordinates they used; it is unclear why they are not using standard anatomical nomenclature.

      We replaced “cortex” with “auditory/associative cortex”.

      (4) The authors' repeated use of homoLTP and heteroLTP to define the input that is being stimulated makes it challenging to understand the experimental detail. While I appreciate this is part of the goal, more descriptive words such as 'thalamic' and 'cortical' would make this much easier to understand.

      We agree with the reviewer that a phrase such as “an LTP protocol on thalamic and cortical inputs” would be more descriptive. We chose the words “homoLTP” and “heteroLTP” only to clarify (for the readers) the physiological relevance of these protocols. We thought that by using “thalamic” and “cortical”, readers may miss this point. However, where we first introduce the words “homoLTP” and “heteroLTP”, we describe which stimulated pathway each refers to.

      Reviewer #2 (Public Review):

      (1) …The experimental schemes in Figs. 1 and 3 (and Fig. 4e and extended data 4a,b) show that one group of animals was subjected to retrieval in the test context at 24 h, then received HFS, which was then followed by a second retrieval session. With this design, it remains unclear what the HFS impacts when it is delivered between these two 24 h memory retrieval sessions.

      We understand that the reviewer has raised the concern that the increase in freezing we observed after the HFS protocol (e.g., Fig. 1b, the bar labeled as Wth+24hHFSth) could be caused or modulated by the recall prior to the HFS (Fig. 1a, top branch). To address this concern, in a new group of mice, 24 hours after weak conditioning, we induced the HFS protocol, followed by testing (that is, no testing prior to the HFS protocol). We observed that homoLTP was as effective in mice that were tested prior to the induction protocol as in those that were not (Fig. 1b, Extended Data Fig. 1d,e).

      It would be nice to see these data parsed out in a clean experimental design for all experiments (in Figs 1, 3, and 4), that means 4 groups with different treatments that are all tested only once at 24 h, and the appropriate statistical tests (ANOVA). This would also avoid repeating data in different panels for different pairwise comparisons (Fig 1, Fig 3, Fig 4, and extended Fig 4).

      While we understand the benefit of the reviewer’s suggestion, the current presentation of the data was done to match the flow of the text and the delivery of the information throughout the manuscript. We think it is unlikely that the retrieval test prior to the HFS impacts its effectiveness, as confirmed by homosynaptic HFS data (Extended Data Fig. 1d,e). It is beyond the scope of the current manuscript to investigate the mechanisms and manipulations related to reconsolidation and retrieval effects.

      (2) … It would be critical to know if LFPs change over 24 h in animals in which memory is not altered by HFS, and to see correlations between memory performance and LFP changes, as two animals displayed low freezing levels. … They would suggest that thalamo-LA potentiation occurs directly after learning+HFS (which could be tested) and is maintained over 24 h.

      We have performed the experiment where we recorded the evoked LFP 2hrs and 24hrs following the weak conditioning protocol. We observed that a weak conditioning protocol that was not followed by an optical LTP protocol on the cortical inputs failed to produce synaptic potentiation of the thalamic inputs (tested 2hrs and 24hrs after the LTP protocol; Extended Data Fig. 5d,e).

      (3) The statistical analyses need to be clarified. All statements should be supported with statistical testing (e.g. extended data 5c, pg 7 stats are missing). The specific tests should be clearly stated throughout. For ANOVAs, the post-hoc tests and their outcomes should be stated. In some cases, 2-way ANOVAs were performed, but it seems there is only one independent variable, calling for one-way ANOVA.

      All the statistical analyses have been revised and the post-hoc tests performed after the ANOVAs are mentioned in the relevant figure legends.

      Reviewer #2 (Recommendations For The Authors):

      The wording "transient" and "persistent" used here in the context of memory seems a bit misleading, as only one timepoint was assessed for memory recall (24 h), at which the memory strength (freezing levels) seems to change.

      As the reviewer mentioned, we have tested memory recall only at one time point. For this reason, throughout the text we used “transient” exclusively to refer to the experience (receiving footshock) and not to the memory. We replaced “persistence” with “stabilization” where it refers to a memory (“the induction of plasticity influences the stabilization of the memory”).

      For the procedures in which the CS and US were not paired, the term "unpairing" is used (which is probably the more adequate one), but the term "non-associative conditioning" appears in the text, which seems a bit misleading, as this term may have another connotation. There is also literature that an unpairing of CS and US could lead to the formation of a safety memory to the CS, that may be disrupted by HFS stimulation.

      We replaced "non-associative" with “unpaired”.

      Validation of viral injection sites for all experiments: Only representative examples are shown, it would be nice to see all viral expression sites.

      For this manuscript, we have used 155 mice. For this reason, including the injection sites for all the animals in the manuscript is not feasible. Except for the mice that have been excluded (please see exclusion criteria added in the methods), the expression pattern we observed was consistent across animals and therefore the images shown are true representatives.

      Extended Data 1b: Please explain what N, U, W, and S behavioral groups mean. To what groups mentioned in the text (pg 2,3) do these correspond?

      The requested clarifications are implemented in the figure legend.

      Please elaborate on the following aspects of your methods and approaches:

      • Please explain if the protocol for HFS to manipulate behavior was the same as the one used for the LTP experiments (Fig 1d, Fig 4j) and was identical for homo/hetero inputs from thal and ctx?

      We used the same HFS protocol for all the HFS inductions. We included this information in the methods section.

      • Please state when the HFS was given with respect to the conditioning (what does "immediately before and after" mean?) and in which context it was given. Were animals subjected to HFS exposed to the context longer (either before or after the conditioning while receiving HFS) than the other groups? When the HFS was given in another context (for the 24 h group), how was this controlled for?

      Requested information has been added to the methods section. The control and intervention groups were treated in the same way.

      • When were the footshocks given in the anesthetized recordings (Fig. 4j) and what was the temporal relationship to the HFS? Was the timing the same as for the HFS in the behavioral experiments?

      Requested information has been added to the methods section.

      • Please add information on how the LFP was stimulated and how the LFP- EPSP slope was determined in in vivo recordings, likewise for the whole cell recordings of EPSPs in Fig. 5d-f.

      Requested information has been added to the methods section.

      Here, the y-Axis in Fig. 5e should be corrected to EPSP slope rather than fEPSP slope if these are whole-cell recordings.

      This has been corrected.

      • Please include information on whether the viral injections and opto-manipulations were done bilaterally or unilaterally, and if unilaterally, in which hemisphere. Likewise, indicate where the LFP recordings were done.

      Requested information has been added to the methods section.

      • Were there any exclusion criteria for animals (e.g. insufficient viral targeting or placement of fibers and electrodes), other than the testing of the optical CS for adverse effects?

      Requested information has been added to the methods section.

      Statistics: In addition to clarifying analytical statistics, please clarify n-numbers for slice recordings (number of animals, number of slices, and number of cells if applicable).

      Requested information has been added to the methods section.

      It would be nice to scrutinize the results in extended data 4b. The freezing levels with U+24h HFS show a strong trend towards an increase; the effect size may be similar to immediate HFS (Fig 4f and extended data 4a) if n was increased.

      We agree with the reviewer. To address this point, we added “HomoLTP protocol, when delivered 24hrs later, produced an increase in freezing; however, the value was not statistically significant.” To show this point, we used the same scale for freezing in Extended Data Fig. 4a and b.

      In the final experiment (Fig. 5a-c), Fig. 5b seems to show results from only one animal, but behavioral results are from 4 animals (Fig 5c). It would be helpful to see the quantification of potentiation in each animal.

      The results (now with error bars) include all mice.

      Please spell out the abbreviation "STC".

      Now, it is spelled out.

      Page 8 last sentence of the discussion does not seem to fit there.

      The sentence has been removed.

      Reviewer #3 (Recommendations For The Authors):

      (1) The authors did not determine how WTh affects Th-LA synapses, as field EPSPs were recorded only after HFS. WTh was required for the effects of HFS, as HFS alone did not produce CR in naïve and/or unpaired controls. As such the effects of the WTh protocol on synaptic strength must be investigated.

      We have performed the experiment where we recorded the evoked LFP 2hrs and 24hrs following the weak conditioning protocol. We observed that a weak conditioning protocol that was not followed by an optical LTP protocol on the cortical inputs failed to produce synaptic potentiation of the thalamic inputs (tested 2hrs and 24hrs after the LTP protocol; Extended Data Fig. 5d,e).

      (2) The authors provide some evidence that their dual opsin approach is feasible, particularly the use of sustained yellow light to block the effects of blue light on ChrimsonR. However, this validation was done using single pulses making it difficult to assess the effect of this protocol on Th input when HFS was used. Without strong evidence that the optogenetic methods used here are fault-proof, the main conclusions of this study are compromised. Why did the authors not use a protocol in which fibers were placed directly in the Ctx and Th while using soma-restricted opsins to avoid cross-contamination?

      We understand that the reviewer raises the possibility that our dual-opsin approach, although effective with single pulses, may fail in higher frequency stimulation protocols (10Hz and 85Hz). To address this concern, in a new group of mice we applied our approach to 10Hz and 85Hz stimulation protocols. We show that our approach is effective in single-pulse as well as in 10Hz and 85Hz stimulation protocols (Fig. 2d-h).

    1. Author response:

      Reviewer #1 (Public Review):

      Summary:

      Zhang et al. demonstrate that CD4+ single positive (SP) thymocytes, CD4+ recent thymic emigrants (RTE), and CD4+ T naive (Tn) cells from Cd11c-p28-flox mice, which lack IL-27p28 selectively in Cd11c+ cells, exhibit a hyper-Th1 phenotype instead of the expected hyper Th2 phenotype. Using IL-27R-deficient mice, the authors confirm that this hyper-Th1 phenotype is due to IL-27 signaling via IL-27R, rather than the effects of monomeric IL-27p28. They also crossed Cd11c-p28-flox mice with autoimmune-prone Aire-deficient mice and showed that both T cell responses and tissue pathology are enhanced, suggesting that SP, RTE, and Tn cells from Cd11c-p28-flox mice are poised to become Th1 cells in response to self-antigens. Regarding mechanism, the authors demonstrate that SP, RTE, and Tn cells from Cd11c-p28-flox mice have reduced DNA methylation at the IFN-g and Tbx21 loci, indicating 'de-repression', along with enhanced histone tri-methylation at H3K4, indicating a 'permissive' transcriptional state. They also find evidence for enhanced STAT1 activity, which is relevant given the well-established role of STAT1 in promoting Th1 responses, and surprising given IL-27 is a potent STAT1 activator. This latter finding suggests that the Th1-inhibiting property of thymic IL-27 may not be due to direct effects on the T cells themselves.

      Strengths:

      Overall the data presented are high quality and the manuscript is well-reasoned and composed. The basic finding - that thymic IL-27 production limits the Th1 potential of SP, RTE, and Tn cells - is both unexpected and well described.

      Weaknesses:

      A credible mechanistic explanation, cellular or molecular, is lacking. The authors convincingly affirm the hyper-Th1 phenotype at epigenetic level but it remains unclear whether the observed changes reflect the capacity of IL-27 to directly elicit epigenetic remodeling in developing thymocytes or knock-on effects from other cell types which, in turn, elicit the epigenetic changes (presumably via cytokines). The authors propose that increased STAT1 activity is a driving force for the epigenetic changes and resultant hyper-Th1 phenotype. That conclusion is logical given the data at hand but the alternative hypothesis - that the hyper-STAT1 response is just a downstream consequence of the hyper-Th1 phenotype - remains equally likely. Thus, while the discovery of a new anti-inflammatory function for IL-27 within the thymus is compelling, further mechanistic studies are needed to advance the finding beyond phenomenology.

      Thanks for the comments. Following the suggestions of the reviewer, further studies will be performed to test whether developing thymocytes are the direct targets of IL-27 using Cd4-IL-27ra knockout mice or mixed bone marrow chimeras of wildtype and IL-27ra knockout cells.

      To address the potential autocrine loop in the STAT1 hyperactivation, we added IFN-γ antibody into CD4+ T cell cultures and saw no obvious impact on STAT1 phosphorylation. If deemed necessary, we could further test this possibility in vivo using Cd4-Ifng and CD11c-p28 double knockout mice.

      The detailed mechanisms underlying the hyperactivation of STAT1 remain to be determined. IL-27p28 has recently been shown to act as an antagonist of gp130-mediated signaling. In addition, structural studies have demonstrated that IL-27p28 interfaces with EBI3, as well as with the two receptor subunits IL-27Rα and gp130. Taking these findings into consideration, along with the fact that p28 and IL-27ra deficiencies exhibit similar phenotypes, we speculate that deficiency in either p28 or IL-27ra makes more gp130 available to transduce signals elicited by other cytokines. We will next focus on gp130-related cytokines to search for the candidate(s) that ultimately lead to enhanced STAT1 activation in the absence of p28. Alternatively, release of EBI3 in the absence of p28 may facilitate its coupling with other cytokine subunits. IL-35, which is composed of EBI3 and p35, is of particular interest as IL-27Rα is also involved in its signaling.

      To narrow down the candidate cytokines, we will first examine the expression of IL-35 and gp130-related cytokines, including IL-6, IL-11, LIF, CT1, OSM, IL-31, CLCF1, and CNTF, in the thymus and thymocyte-depleted thymic stromal cells by mining public databases and by RT-PCR. Similarly, CD4+ thymocytes will be examined for the expression of receptor subunits which can couple with gp130, including IL-6R, IL-11R, LIFR, OSMRβ, IL-31Rα, CNTFRα, IL-23R, and IL-12Rβ2.

      We will next select those cytokines expressed in the thymus or thymic stromal cells with cognate receptor expression in CD4+ thymocytes and test their effect on STAT1 phosphorylation of wildtype and p28-deficient CD4+ thymocytes. If deemed necessary, double knockout mice will be used to rescue the hyper-Th1 phenotype.

      Reviewer #2 (Public Review):

      Summary:

      Naïve CD4 T cells in CD11c-Cre p28-floxed mice express highly elevated levels of proinflammatory IFNg and the transcription factor T-bet. This phenotype turned out to be imposed by thymic dendritic cells (DCs) during CD4SP T cell development in the thymus [PMID: 23175475]. The current study affirms these observations, first, by developmentally mapping the IFNg dysregulation to newly generated thymic CD4SP cells [PMID: 23175475], second, by demonstrating increased STAT1 activation being associated with increased T-bet expression in CD11c-Cre p28-floxed CD4 T cells [PMID: 36109504], and lastly, by confirming IL-27 as the key cytokine in this process [PMID: 27469302]. The authors further demonstrate that such dysregulated cytokine expression is specific to the Th1 cytokine IFNg, without affecting the expression of the Th2 cytokine IL-4, thus proposing a role for thymic DC-derived p28 in shaping the cytokine response of newly generated CD4 helper T cells. Mechanistically, CD4SP cells of CD11c-Cre p28-floxed mice were found to display epigenetic changes in the Ifng and Tbx21 gene loci that were consistent with increased transcriptional activities of IFNg and T-bet mRNA expression. Moreover, in autoimmune Aire-deficiency settings, CD11c-Cre p28-floxed CD4 T cells still expressed significantly increased amounts of IFNg, exacerbating the autoimmune response and disease severity. Based on these results, the investigators propose a model where thymic DC-derived IL-27 is necessary to suppress IFNg expression by CD4SP cells and thus would impose a Th2-skewed predisposition of newly generated CD4 T cells in the thymus, potentially relevant in autoimmunity.

      Strengths:

      Experiments are well-designed and executed. The conclusions are convincing and supported by the experimental results.

      Weaknesses:

      The premise of the current study is confusing as it tries to use the CD11c-p28 floxed mouse model to explain the Th2-prone immune profile of newly generated CD4SP thymocytes. Instead, it would be more helpful to (1) give full credit to the original study which already described the proinflammatory IFNg+ phenotype of CD4 T cells in CD11c-p28 floxed mice to be mediated by thymic dendritic cells [PMID: 23175475], and then, (2) build on that to explain that this study is aimed to understand the molecular basis of the original finding. In its essence, this study mostly rediscovers and reaffirms previously reported findings, but with different tools. While the mapping of epigenetic changes in the IFNg and T-bet gene loci and the STAT1 gene signature in CD4SP cells are interesting, these are expected results, and they only reaffirm what would be assumed from the literature. Thus, there is only incremental gain in new insights and information on the role of DC-derived IL-27 in driving the Th1 phenotype of CD4SP cells in CD11c-p28 floxed mice.

      Indeed, the present study is based on the finding of enhanced IFN-γ production by CD4+ T cells from CD11c-p28 floxed mice, which was originally reported by Zhang et al. and repeatedly cited in our manuscript. We revisited this phenomenon in the context of functional bias of newly generated CD4+ T cells and sought to reveal the mechanisms underlying the hyper-Th1 phenotype in the absence of thymic DC-derived IL-27. We showed that deletion of p28 resulted in an unexpected hyperactivation of STAT1, which was accompanied by epigenetic changes in favor of Th1 bias. However, the gap remains between p28 deficiency and STAT1 activation.

      Altogether, the major issues of this study remain unresolved:

      (1) It is still unclear why the p28-deficiency in thymic dendritic cells would result in increased STAT1 activation in CD4SP cells. Based on their in vitro experiments with blocking anti-IFNg antibodies, the authors conclude that it is unlikely that the constitutive activation of STAT1 would be a secondary effect due to autocrine IFNg production by CD4SP cells. However, this possibility should be further tested with in vivo models, such as Ifng-deficient CD11c-p28 floxed mice. Alternatively, is this an indirect effect by other IFNg producers in the thymus, such as iNKT cells? It is necessary to explain what drives the STAT1 activation in CD11c-p28 floxed CD4SP cells in the first place.

      Thanks for the suggestions. Further studies will be performed to test the potential autocrine loop for IFN-γ production in vivo using Cd4-Ifng and CD11c-p28 double knockout mice. This model should also help to exclude the possibility of an indirect role of IFN-γ production by cells such as iNKT cells.

      As pointed out by the reviewer, a critical unanswered question is what drives the STAT1 activation in CD11c-p28 floxed CD4SP cells. Several lines of evidence point to the possibility that p28 deficiency increases the responsiveness of developing thymocytes to STAT1-activating cytokines. Firstly, IL-27p28 has recently been shown to act as an antagonist of gp130-mediated signaling. Secondly, structural studies have demonstrated that IL-27p28 is centrally positioned in the complex formed with EBI3, as well as the two receptor subunits IL-27Rα and gp130. Thirdly, we observed a similar hyper-Th1 phenotype in the absence of either p28 or IL-27ra. Therefore, we speculate that more gp130 should be available to transduce signals elicited by other cytokines in such a scenario. We will next seek to determine the candidate cytokine(s) responsible for the enhanced STAT1 activation in the absence of p28 as outlined in the response to Reviewer 1.

      (2) It is also unclear whether CD4SP cells are the direct targets of IL-27 p28. The cell-intrinsic effects of IL-27 p28 signaling in CD4SP cells should be assessed and demonstrated, ideally by CD4SP-specific deletion of IL-27Ra, or by establishing bone marrow chimeras of IL-27Ra germline KO mice.

      Thanks for the suggestions. Further studies will be performed to test whether developing thymocytes are the direct targets of IL-27 using Cd4-IL-27ra knockout mice or mixed bone marrow chimeras of wildtype and IL-27ra knockout cells.

    1. Author Response:

      We thank the editors for their assessment of our manuscript. We appreciate the reviewers’ thoughtful comments and plan to incorporate their feedback into a revised manuscript. We agree that incorporating an additional, more common ablation tool would be highly complementary to our Kir2.1 ablation studies. We also agree that images across timepoints should be expanded for contact analyses, connectomics data can be better leveraged, additional quantifications can be performed as suggested by the reviewers to better support claims, and that the introduction and discussion can be revised to better position our work in the context of previous studies. We also strongly agree that providing data on receptor RNA and protein expression in the GF across timepoints would be extremely informative; however, we have found that acquiring these data at the necessary resolution would require new approaches and tools that may be outside the scope of the project.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Farhat-Younis and colleagues demonstrate tumor-specific IgM's capacity to induce tumor cell death in monocyte-derived dendritic cell cultures. They subsequently designed a chimeric receptor based on high-affinity FcγRI. However, the authors found that the transfection process was more efficient when either the variable light or heavy chain was transfected individually rather than the entire scFv. This scFv construct led to an endoplasmic reticulum (ER) stress response and scFv degradation. A considerable portion of the manuscript is dedicated to the negative scFv expression results. The authors pivoted to a modified FcgRI capable of transmitting IgM signals. Although this represents a tremendous amount of work in the development of this chimeric receptor, the critical experiment showing efficacy in vivo was not presented; instead, various in vitro assays are shown. Thus, this manuscript will markedly benefit from showing improved responses to tumors in vivo when macrophages express FcgRI-IgM.

      We deeply thank the reviewer for their thoughtful comments and overall favorable review of our manuscript.

      (1) In a mouse tumor model, the authors demonstrated that monocyte-derived dendritic cells (MoDCs) treated with IgG immune complexes (ICs) were more effective at preventing tumor growth compared to those treated with IgM ICs (as shown in Figure 1B). In Figure 1C, their in vitro experiments revealed that IgM resulted in tumor cell death, as well as increased production of nitric oxide (NO) and granzyme B. How do the authors reconcile IgG IC-treated MoDCs performing better in preventing tumors in vivo than IgM IC-treated MoDCs, despite the in vitro results with IgM-ICs? The authors speculate that IgG IC-treated MoDCs might trigger T cell immunity but do not show T cell involvement.

      We apologize for not making this point clearer. We have extensively studied this phenomenon and detailed the underlying mechanism in two consecutive papers (PMID: 27812544, PMID: 25924063). Briefly, we showed that DCs activated with IgM-IC undergo cell death concomitantly with their release of lytic granules and lysis of tumor cells. As a result, they do not migrate to the lymph nodes where they should induce reactive T cell clones. In contrast, DCs activated with IgG-IC do not elicit in vitro cytotoxicity but rather process the IC to present its derived antigens on MHC-II. We addressed that issue in the revised version and cited the relevant papers to further clarify it.

      (2) The authors report distinct functional consequences of MoDCs incubated with tumor-IgG complexes and tumor IgM complexes. Tumor growth was inhibited and T cell immunity induced with the former. The latter, however, elicited robust anti-tumor killing. What happens if MoDCs are incubated with both IgG and IgM complexes? If this combined treatment induces effective killing and T cell memory, would this impact the design of the chimeric receptor to include IgG responsiveness as well?

      This is a very interesting point. As mentioned above, our previous publications strongly suggest that tumor-binding IgG and IgM induce different processes in myeloid cells. Yet, since MoDCs naturally express the high-affinity receptor for IgG, FcγRI, we speculate that treating tumor-bearing mice with modified monocytes, alone or in combination with tumor-binding IgG, would shed some light on this. Indeed, such treatment elicits strong T cell immunity in these mice, and the data were added to Supplementary Data Figure S4J. With that being said, a complete analysis of this question is very complicated and extends beyond the scope of this work. We would like to emphasize that the purpose of this work is to highlight some of the challenges unique to genetic manipulation in myeloid cells and to suggest one alternative scaffold for integrating signaling in these cells. We do not argue that the specific solution presented here is the most potent one, and more work is required before promoting such treatment into the clinic. We have added a sentence to the Discussion section that stresses this issue.

      (3) In Figure 5H, the authors demonstrate the ability of the chimeric receptor construct to deplete tumor cells in vitro. The ms would improve if the authors could show the chimeric receptor construct results in tumor cell death and/or prevention in an in vivo model. Similarly, if combined stimulation with IgG and IgM complexes enhances tumor response, this should be incorporated into the therapeutic strategy.

      This is a wonderful suggestion. To address it, we challenged C57Bl/6 mice with B16F10 melanoma and allowed the tumors to grow until they reached a palpable size of approximately 25 mm². Concomitantly, we cultured bone marrow dendritic cells from syngeneic mice and transfected them with a linear mRNA of the alpha/mu construct. Tumor-bearing mice were then treated with alpha/mu- or sham-transduced BMDC, alone or in combination with an antibody against the melanoma antigen Trp1 (TA99). The results were added as Figure 5K and to Supplementary Figure S4H-S4I.

      Reviewer #2 (Public Review):

      Summary:

      While a significant portion of immunotherapy research has focused on the pivotal role of T cells in tumor immunity, their effectiveness may be limited by the suppressive nature of the tumor environment. On the other hand, myeloid cells are commonly found within tumors and can withstand these adverse conditions. However, these cells often adopt an immunosuppressive phenotype when infiltrating tumors. Therefore, manipulating myeloid cells could potentially enhance the anti-tumor potential of immunotherapy.

      In this manuscript, Farhat-Younes and colleagues have demonstrated that activating the IgM receptor signaling in myeloid cells induces an oxygen burst, the secretion of Granzyme B, and the lysis of adjacent tumor cells. Furthermore, they have outlined a strategy to utilize these features to generate CAR macrophages. However, they have identified a limitation: the expression of scFv in myeloid cells induces ER stress and the degradation of misfolded proteins. To address this issue, chimeric receptors were designed based on the high-affinity FcγRI for IgG. When macrophages transfected with these receptors were exposed to tumor-binding IgG, extensive tumor cell killing, and the release of reactive oxygen species and Granzyme B were observed.

      Strengths:

      In general, I consider this work to be significant, and the results are compelling. It emphasizes the specific considerations and requirements for successful manipulation in myeloid cells, which could further advance the field of cellular engineering for the benefit of immunotherapy.

      We thank the reviewer for his thoughtful comments and overall appreciation of our findings.

      Weaknesses:

      Nevertheless, there are several minor issues that should be addressed:

      (1) TCR fragments are commonly used to induce ER stress in non-immune cells. Therefore, it would be interesting to investigate whether TCR fragments can be expressed in myeloid cells and if they induce ER stress. Addressing this issue would support the notion that these cells lack the ER chaperones required for folding immunoglobulin variable chains.

      This is a wonderful suggestion. To assess that possibility, we cloned the alpha chain of the anti-Trp1 TCR and transfected it into RAW 264.7 macrophages. Importantly, we could not detect expression of this construct in macrophages, which further supports our findings with scFv in these cells. We added this result to Figure 4J and Supplementary Figure S3C.

      (2) It would be valuable to determine whether, after the degradation of scFv fragments by myeloid cells, they are presented on MHC-I and MHC-II.

      This is a very interesting point. To address it, we generated a genetic construct in which the anti-CD19 scFv is fused to a polypeptide composed of the MHCI and MHCII fragments of ovalbumin. Next, we transfected DC 2.4 cells with this construct and measured their capacity to stimulate the proliferation of CD8+ T cells from OT-I mice and CD4+ T cells from OT-II mice. DC transfected with this construct efficiently stimulated the proliferation of both T cell types, suggesting that both ovalbumin fragments are indeed presented on MHCI and MHCII. Nonetheless, DC transfected with the polypeptide of MHCI and MHCII ovalbumin fragments alone (with no scFv) were almost equally effective in stimulating OT-I and OT-II T cell proliferation. We added this result to Supplementary Figure S3D-S3E.

      (3) Some methodological details, such as the vaccination protocol and high-resolution microscopy procedures, are missing from the text.

      We thank the reviewer for pointing out these issues. We added the missing details to the revised version of the manuscript.