- Feb 2025
-
www.biorxiv.org
-
Reviewer #2 (Public review):
This manuscript describes the impact of deleting or enhancing the expression of the neuronal-specific kinase DLK in glutamatergic hippocampal neurons using clever genetic strategies, which demonstrates that DLK deletion had minimal effects while overexpression resulted in neurodegeneration in vivo. To determine the molecular mechanisms underlying this effect, ribotag mice were used to determine changes in active translation which identified Jun and STMN4 as DLK-dependent genes that may contribute to this effect. Finally, experiments in cultured neurons were conducted to better understand the in vivo effects. These experiments demonstrated that DLK overexpression resulted in morphological and synaptic abnormalities.
Strengths:
This study provides interesting new insights into the role of DLK in the normal function of hippocampal neurons. Specifically, the study identifies:
(1) CA1 vs CA3 hippocampal neurons have differing sensitivity to increased DLK signaling.
(2) DLK-dependent signaling in these neurons is similar to but distinct from the downstream factors identified in other cell types, highlighted by the identification of STMN4 as a downstream signal.
(3) DLK overexpression in hippocampal neurons results in signaling that is similar to that induced by neuronal injury.
The study also provides confirmatory evidence that supports previously published work through orthogonal methods, which adds additional confidence to our understanding of DLK signaling in neurons. Taken together, this is a useful addition to our understanding of DLK function.
Comments on the latest version:
The authors have sufficiently addressed all issues raised with the initial manuscript.
-
Author response:
The following is the authors’ response to the original reviews.
eLife Assessment
This manuscript describes the impact of modulating signaling by a key regulatory enzyme, Dual Leucine Zipper Kinase (DLK), on hippocampal neurons. The results are interesting and will be important for scientists interested in synapse formation, axon specification, and cell death. The methods and interpretation of the data are solid, but the study can be further strengthened with some additional studies and controls.
We greatly appreciate the thorough review and thoughtful suggestions from the reviewers and editors on our original manuscript. We provide point-by-point responses below. We added new studies on P10 mice and controls as suggested, and revised the figures and text for clarification. The revised manuscript includes three new supplemental figures; major text revisions are copied under each response.
Reviewer #1 (Public Review):
Summary:
In this work, Ritchie and colleagues explore functional consequences of neuronal over-expression or deletion of the MAP3K DLK that their labs and others have strongly implicated in axon degeneration, neuronal cell death, and axon regeneration. Their recent work in eLife (Li, 2021) showed that inducible over-expression of DLK (or the related LZK) induces neuronal death in the cerebellum. Here, they extend this work to show that inducible over-expression in Vglut1+ neurons also kills excitatory neurons in hippocampal CA1, but not CA3. They complement this very interesting finding with translatomics to quantify genes whose mRNAs are differentially translated in the context of DLK over-expression or knockout, the latter manipulation having little to no effect on the phenotypes measured. The authors note that several genes and pathways are differentially regulated according to whether DLK is over-expressed or knocked out. They note DLK-dependent changes in genes related to synaptic function and the cytoskeleton and ultimately relate this in cultured neurons to findings that DLK over-expression negatively impacts synapse number and changes microtubules and neurites, though with a less obvious correlation.
Strengths:
This work represents a conceptual advance in defining DLK-dependent changes in translation. Moreover, the finding that DLK may differentially impact neuronal death will become the basis for future studies exploring whether DLK contributes to differential neuronal susceptibility to death, which is a broadly important topic.
We thank the reviewer for the comments on the value of our work.
Weaknesses:
This seems like two works in parallel that the authors have not yet connected. First is that DLK affects the translation of an interesting set of genes, and second, that DLK(OE) kills some neurons, disrupts their synapses, and affects neurite growth in culture.
Specific questions:
(1) Is DLK effectively knocked out? The authors reference the floxed allele in their 2016 work (PMID: 27511108); however, the methods of this paper say that the mouse will be characterized in a future publication. Has this ever been published? The major concern is that here the authors show that Cre-mediated deletion results in a smaller molecular weight protein and the maintenance of mRNA levels.
We apologize for the out-of-date citation of the DLK(cKO)<sup>fl/fl</sup> mice. The DLK(cKO)<sup>fl/fl</sup> mice have been published in (Li et al., 2021; Saikia et al., 2022); excision of the floxed exon was verified using several Cre drivers (Pv-Cre, AAV-Cre, and VGlut1-Cre in this study). The floxed exon contains the initiation ATG and 148 amino acids. By western blot analysis using antibodies against C-terminal peptides of DLK on cerebellar extracts (in Li et al., 2021) and hippocampal extracts (this study), the full-length DLK protein was significantly reduced (Fig 1A-B). DLK is also expressed in hippocampal cells other than glutamatergic neurons, which explains the remaining full-length DLK detected.
Our Ribo-seq of VGlut1-Cre; DLK(cKO)<sup>fl/fl</sup> detected remaining Dlk mRNAs lacking the floxed exon (Fig.S1C), which have several candidate ATGs at amino acid 223 and after (Fig.S1C1). We detected a very faint band for smaller molecular weight proteins on western blots, only when the membrane was exposed 5X longer using Pico PLUS Chemiluminescent Substrate (Thermo Scientific, 34580) and a Licor Odyssey XF Imager (revised Fig. S1B). This smaller molecular weight protein might be produced from any of the candidate ATGs, but would represent an N-terminally truncated DLK protein lacking the ATP binding site and ~1/4 of the kinase domain, i.e. not a functional kinase.
The revised manuscript has an updated citation for DLK(cKO)<sup>fl/fl</sup>. Revised Fig.S1B includes western blot images under normal vs longer exposure using anti-DLK antibodies. New Fig.S1C1 shows the effects of the floxed exon on DLK.
(2) Why does DLK(OE) not kill CA3 neurons? The phenomenon is clear but there is no link to gene expression changes. In fact, the highlighted transcript in this work, Stmn4, changes in a DLK-dependent manner in CA3.
We agree that this is a very interesting question not answered by our gene expression analysis. While we verified that Stmn4 expression levels correlate with the levels of DLK, we do not think that increased Stmn4 per se in DLK(iOE) is a major factor accounting for CA1 death vs CA3 survival. Several published studies have also reported regulation of Stmn4 mRNAs in other cell types, in the contexts of cell death (Watkins et al., 2013; Le Pichon et al., 2017) and axon regeneration and cytoskeleton disruption (Asghari Adib et al., 2024; DeVault et al., 2024; Hu et al., 2019; Shin et al., 2019). As Stmns show significant redundancy in expression and function, conventional knockdown or overexpression of an individual Stmn generally does not lead to detectable effects on cellular function. As CA3 neurons are widely known for their dense connections and show resilience to NMDA-mediated neurotoxicity (Sammons et al., 2024; Vornov et al., 1991), we speculate that the differential vulnerability of CA1 and CA3 under DLK(iOE) reflects both intrinsic properties, such as gene expression, and their circuit connections.
In the revised manuscript, we have included the following statement on pg 18:
‘While our data does not pinpoint the molecular changes explaining why CA3 would show less vulnerability to increased DLK, we may speculate that DLK(iOE) induced signal transduction amplification may differ in CA1 vs CA3. CA1 genes appear to be more strongly regulated than CA3 genes, consistent with our observation that increased c-Jun expression in CA1 is greater than that in CA3. Other parallel molecular factors may also contribute to resilience of CA3 neurons to DLK(iOE), such as HSP70 chaperones, different JNK isoforms, and phosphatases, some of which showed differential expression in our RiboTag analysis of DLK(iOE) vs WT (shown in File S2. WT vs DLK(iOE) DEGs). Together with other genes that show dependency on DLK, the DLK and Jun regulatory network contributes to the regional differences in hippocampal neuronal vulnerability under pathological conditions.’
Further we state in ‘Limitation of our study’ on pg 20:
‘Our analysis also does not directly address why CA3 neurons are less vulnerable to increased DLK expression. Future studies using cell-type specific RiboTag profiling and other methods at a refined time window will be required to address how DLK dependent signaling interacts with other networks underlying hippocampal regional neuron vulnerability to pathological insults.’
We hope our data will stimulate continued interest in testable hypotheses for future studies.
(3) Why are whole hippocampi analyzed to IP ribosome-associated mRNAs? The authors nicely show a differential effect of DLK on CA1 vs CA3, but then - at least according to their methods - lyse whole hippocampi to perform IP/sequencing. Their data are therefore a mix of cells where DLK does and does not change cell death. The key issue is whether DLK does/does not have an effect based on the expression changes it drives.
At the time of planning the Ribo-Tag experiment several years ago, we focused on the hippocampal glutamatergic neurons. Due to the technical difficulty of micro-dissecting individual hippocampal regions at this early timepoint, we opted to use whole hippocampi to isolate ribosome-associated mRNAs. We agree with the reviewer that it is important to sort out DLK-dependent general gene expression changes vs those specific to a particular cell type where DLK impacts its survival. With emerging CA1, CA3 and other cell-type specific Cre drivers and advanced RNAseq technology, we hope that our work will stimulate broad interest in these questions in future studies.
In the revised manuscript, we have included new analysis comparing our Vglut1-RiboTag profiling (P15) with CamK2-RiboTag (for CA1) and Grik4-RiboTag (for CA3) (P42) published in Traunmüller et al., 2023 (GSE209870). We find that >80% of the top ranked genes in their CamK2-RiboTag (for CA1) and Grik4-RiboTag (for CA3) were detected in our VGlut1-RiboTag (revised methods and Supplemental Excel File S3). CA1-enriched genes tended to be expressed higher in DLK(cKO), compared to control, whereas CA3-enriched genes showed less significant correlation to DLK expression levels. Additionally, many genes known to specify CA1 fate do not show significant downregulation in DLK(iOE). This analysis, along with other data in our manuscript, is consistent with the idea that DLK does not regulate neuronal fate.
In the revised manuscript, we presented this additional analysis in Fig. S6K-L, and expanded text description on page 9:
‘Additionally, we compared our Vglut1-RiboTag datasets with CamK2-RiboTag and Grik4-RiboTag datasets from 6-week-old wild type mice reported by (Traunmüller et al., 2023; GSE209870). We defined a list of genes enriched in CamK2-expressing CA1 neurons relative to Grik4-expressing CA3 neurons (CA1 genes), and those enriched in Grik4-expressing CA3 neurons (CA3 genes) (File S3). When compared with the entire list of Vglut1-RiboTag profiling in our control and DLK(cKO), we found CA1 genes tended to be expressed more in DLK(cKO) mice, compared to control (Fig.S6K), while CA3 genes showed a slight enrichment in control though the trend was less significant, and were less clustered towards one genotype (Fig.S6L). Moreover, many CA1 genes related to cell-type specification, such as FoxP1, Satb2, Wfs1, Gpr161, Adcy8, Ndst3, Chrna5, Ldb2, Ptpru, and Ntm, did not show significant downregulation when DLK was overexpressed. These observations imply that DLK likely specifically down-regulates CA1 genes both under normal conditions and when overexpressed, with a stronger effect on CA1 genes, compared to CA3 genes. Overall, the informatic analysis suggests that decreased expression of CA1 enriched genes may contribute to CA1 neuron vulnerability to elevated DLK, although it is also possible that the observed down-regulation of these genes is a secondary effect associated with CA1 neuron degeneration’.
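The ">80% of the top ranked genes ... were detected" comparison above amounts to a set-overlap fraction between a reference top-ranked list and the genes detected in a second dataset. A minimal sketch of that arithmetic follows; the function name, the case-insensitive matching of gene symbols, and the placeholder gene lists are illustrative assumptions and are not taken from the manuscript or its methods.

```python
def fraction_detected(top_genes, detected_genes):
    """Return the fraction of top_genes that also appear in detected_genes.

    Gene symbols are compared case-insensitively (an assumption made here
    so that, e.g., 'FoxP1' and 'Foxp1' count as the same gene).
    """
    top = [g.upper() for g in top_genes]
    if not top:
        raise ValueError("top_genes is empty")
    detected = {g.upper() for g in detected_genes}
    return sum(g in detected for g in top) / len(top)


# Hypothetical placeholder lists, not data from the study.
top_ranked = ["GeneA", "GeneB", "GeneC", "GeneD", "GeneE"]
detected = ["genea", "GeneB", "GeneC", "GeneD", "GeneX"]

frac = fraction_detected(top_ranked, detected)
print(f"{frac:.0%} of top-ranked genes detected")  # prints: 80% of top-ranked genes detected
```

How the top-ranked list is chosen (rank cutoff, enrichment statistic) strongly affects such a fraction, so the cutoff would need to be stated alongside the result.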
(4) Is the subtle decrease in synapse number (Bassoon/Homer co-loc.) in the DLK (OE) simply a function of neurons (and their synapses, presumably) having died? At the P15 time point that the authors choose because cell death is minimal, there is still a ~25% reduction in CA1 thickness (Figure 2B), which is larger than the ~15% change in synapses (Figure 5H) they describe.
We thank the reviewer for the question. To address this, we have analyzed synapses in the CA1 region at P10 in DLK(iOE) mice when there was no detectable loss of neurons. At P10, we did not detect significant changes in Bassoon, Homer1, or colocalized puncta in CA1 (Fig.S11A-F). In P15 DLK(iOE) mice, Homer1 puncta were slightly smaller (Fig.5L) and showed a significant decrease in CA1 SR (Fig.5I).
In the revised manuscript we have also redone our statistical analysis of synapses, using mice rather than ROIs as the unit of analysis (revised Fig. 5), as recommended by R3. We also analyzed synapses in CA3, and found no significant differences at P10 or P15 (Fig.S12). We would interpret the data to mean that the effects of DLK(OE) on synapses in CA1 may represent an early step in neuronal death. We hope that future studies will shed light on this question.
Reviewer #2 (Public Review):
This manuscript describes the impact of deleting or enhancing the expression of the neuronal-specific kinase DLK in glutamatergic hippocampal neurons using clever genetic strategies, which demonstrates that DLK deletion had minimal effects while overexpression resulted in neurodegeneration in vivo. To determine the molecular mechanisms underlying this effect, ribotag mice were used to determine changes in active translation which identified Jun and STMN4 as DLK-dependent genes that may contribute to this effect. Finally, experiments in cultured neurons were conducted to better understand the in vivo effects. These experiments demonstrated that DLK overexpression resulted in morphological and synaptic abnormalities.
Strengths:
This study provides interesting new insights into the role of DLK in the normal function of hippocampal neurons. Specifically, the study identifies:
(1) CA1 vs CA3 hippocampal neurons have differing sensitivity to increased DLK signaling.
(2) DLK-dependent signaling in these neurons is similar to but distinct from the downstream factors identified in other cell types, highlighted by the identification of STMN4 as a downstream signal.
(3) DLK overexpression in hippocampal neurons results in signaling that is similar to that induced by neuronal injury.
The study also provides confirmatory evidence that supports previously published work through orthogonal methods, which adds additional confidence to our understanding of DLK signaling in neurons. Taken together, this is a useful addition to our understanding of DLK function.
We thank the reviewer for careful reading and positive comments.
Weaknesses:
There are a few weaknesses that limit the impact of this manuscript, most of which are pointed out by the authors in the discussion. Namely:
(1) It is difficult to distinguish whether the changes in the translatome identified by the authors are DLK-dependent transcriptional changes, DLK-dependent post-transcriptional changes or secondary gene expression changes that occur as a result of the neurodegeneration that occurs in vivo. Additional expression analysis at earlier time points could be one method to address this concern.
We appreciate the reviewer’s comment, and have performed new analysis on c-Jun and p-c-Jun levels in CA1, CA3, and DG in P10 DLK(OE) mice. Our data suggest that in CA3 elevations in p-c-Jun and c-Jun occur separately from cell death in a DLK-dependent manner, though the high elevation of both p-c-Jun and c-Jun in CA1 correlates with cell death.
The data are presented in revised Fig.S7A,B, and described in revised text on pg 9-10:
‘In control mice, glutamatergic neurons in CA1 had low but detectable c-Jun immunostaining at P10 and P15, but reduced intensity at P60; those in CA3 showed an overall low level of c-Jun immunostaining at P10, P15 and P60; and those in DG showed a low level of c-Jun immunostaining at P10 and P15, and an increased intensity at P60 (Fig.S7A,C,E). In Vglut1<sup>Cre/+</sup>;H11-DLK<sup>iOE/+</sup> mice at P10 when no discernable neuron degeneration was seen in any regions of hippocampus, only CA3 neurons showed a significant increase of immunostaining intensity of c-Jun, compared to control (Fig.S7A). In P15 mice, we observed further increased immunostaining intensity of c-Jun in CA1, CA3, and DG, with the strongest increase (~4-fold) in CA1, compared to age-matched control mice (Fig.S7C). The overall increased c-Jun staining is consistent with RiboTag analysis.’
Also, on pg.10:
‘In Vglut1<sup>Cre/+</sup>;H11-DLK<sup>iOE/+</sup> mice, we observed increased p-c-Jun positive nuclei in CA1 at P10, and strong increase in CA1 (~10-fold), CA3 (~6-fold), and DG (~8-fold) at P15 (Fig.S7B,D).’
(2) Related to the above, it is difficult to conclusively determine from the current data whether the changes in synaptic proteins observed in vivo are a secondary result of neuronal degeneration or a primary impact on synapse formation. The in vitro studies suggest this has the potential to be a primary effect, though the difference in experimental paradigm makes it impossible to determine whether the same mechanisms are present in vitro and in vivo.
We appreciate the comment, which is related to R1 point 4. We have performed further analysis and revised the text on pg.12 with the following text:
‘To assess effects of DLK overexpression on synapses, we immunostained hippocampal sections from both P10 and P15, with age-matched littermate controls. Quantification of Bassoon and Homer1 immunostaining revealed no significant differences in CA1 SR and CA3 SR and SL in P10 mice of Vglut1<sup>Cre/+</sup>;H11-DLK<sup>iOE/+</sup> and control (Fig.S11A-F, S12A-J). In P15, Bassoon density and size in CA1 SR were comparable in both mice (Fig 5G, H, K), while Homer1 density and size were reduced in DLK(iOE) (Fig.5G,I, L). Overall synapse number in CA1 SR was similar in DLK(iOE) and control mice (Fig.5J). Similar analysis on CA3 SR and SL detected no significant difference from control (Fig.S12M-V).’
We would interpret the data to mean that the effects of DLK(OE) on synapses in CA1 may represent an early step in neuronal death. We hope that future studies will shed clarity on this question.
Additionally, to address whether the same mechanisms are present in vitro, we have performed further analysis on cultured hippocampal neurons. As described in the Methods, we made hippocampal neuron cultures from P1 pups of the following crosses:
For control: Vglut1<sup>Cre/+</sup> X Rosa26<sup>tdT/+</sup>
For DLKcKO: Vglut1<sup>Cre/+</sup>;DLK(cKO)<sup>fl/fl</sup> X Vglut1<sup>Cre/+</sup>;DLK(cKO)<sup>fl/fl</sup>;Rosa26<sup>tdT/+</sup>
For DLKiOE: H11-DLK<sup>iOE/iOE</sup> X Vglut1<sup>Cre/+</sup>;Rosa26<sup>tdT/+</sup>
Dissociated cells from a given litter were pooled into the same culture. Because the proportion of neurons with our genotype of interest differed between cultures, it is not straightforward to determine whether DLK was causing significant cell death.
On pg 13, we stated our observation:
‘We did not notice an obvious effect of DLK(iOE) or DLK(cKO) on neuron density in cultures at DIV2. To assess neuronal type distribution in our cultures, we immunostained DIV14 neurons with antibodies for Satb2, as a CA1 marker (Nielsen et al., 2010), and Prox1, as a marker of DG neurons (Iwano et al., 2012). We did not observe significant differences in the proportion of cells labeled with each marker in DLK(cKO) or DLK(iOE) cultures (Fig.S13E). These data are consistent with the idea that DLK signaling does not have a strong role in neuron-type specification both in vivo and in vitro’.
(3) The phenotype of DLK cKO mice is very subtle (consistent with previous reports) and while the outcome of increased DLK levels is interesting, the relevance to physiological DLK signaling is less clear. What does seem possible is that increased DLK may phenocopy other neuronal injuries but there are no real comparisons to directly address this in the manuscript. It would be helpful for the authors to provide this analysis as well as a table with all of the translational changes along with fold changes.
Thank you for the suggestion. The fold changes of genes showing significantly altered expression in DLK(cKO) and DLK(iOE) are provided in the excel files (Supplementary excel File S1 WT vs DLK(cKO) DEGs and File S2. WT vs DLK(iOE) DEGs, highlighted columns B and F).
On pg 6, we revised the text as follows to include a comparison of DLK levels in other physiological conditions and our mice:
‘Several studies have reported that DLK protein levels increase under a variety of conditions, including optic nerve crush (Watkins et al., 2013), NGF withdrawal (~2 fold) (Huntwork-Rodriguez et al., 2013; Larhammar et al., 2017), and sciatic nerve injury (Larhammar et al., 2017). Induced human neurons show increased DLK abundance about ~4 fold in response to ApoE4 treatment (Huang et al., 2019). Increased expression of DLK can lead to its activation through dimerization and autophosphorylation (Nihalani et al., 2000)’.
And,
‘Additional analysis at the mRNA level (supplemental excel, File S2. WT vs DLK(iOE) DEGs) and at the protein level (Fig.S8E) suggest that the increase in DLK abundance was around 3 times the control level. The localization patterns of DLK protein appeared to vary depending on region of hippocampus and age of animals in both control and Vglut1<sup>Cre/+</sup>;H11-DLK<sup>iOE/+</sup> mice (Fig.S3C).’
In Discussion, we state (pg. 16): ‘The levels of DLK in our DLK(iOE) mice model appear comparable to those reported under traumatic injury and chronic stress.’
(4) For the in vivo experiments, it is unclear whether multiple sections from each animal were quantified for each condition. More information here would be helpful and it is important that any quantification takes multiple sections from each animal into account to account for natural variability.
We apologize that this was unclear in the original manuscript.
In the revised methods, under Confocal imaging and quantification (pg 33), we stated: “For brain tissue, three sections per mouse were imaged with a minimum of three mice per genotype for data analysis.”
In revised figure legends, we made it clear that multiple sections from each animal have been used for quantification in all instances, i.e. “Each dot represents averaged thickness from 3 sections per mouse, N≥4 mice/genotype per timepoint.”
In Fig.1F-H: “Each dot represents averaged intensity from 3 sections per mouse”
In Fig.S3B “Data points represent individual mice, averages taken across 3 sections per mouse”
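The quantification scheme described in these legends (three sections averaged per mouse, with each mouse contributing one data point per genotype) can be sketched as follows. This is only an illustration of the aggregation step under stated assumptions: the function name, the tuple layout, and the numeric values (arbitrary units) are hypothetical and do not come from the manuscript.

```python
from collections import defaultdict
from statistics import mean


def per_mouse_means(measurements):
    """Collapse per-section measurements to one value per mouse.

    measurements: iterable of (genotype, mouse_id, value), one tuple per section.
    Returns {genotype: [mean over that mouse's sections, ...]}, so that the
    mouse, not the section or ROI, is the unit used in downstream statistics.
    """
    by_mouse = defaultdict(list)
    for genotype, mouse_id, value in measurements:
        by_mouse[(genotype, mouse_id)].append(value)

    by_genotype = defaultdict(list)
    for (genotype, _mouse_id), values in sorted(by_mouse.items()):
        by_genotype[genotype].append(mean(values))
    return dict(by_genotype)


# Hypothetical values in arbitrary units; three sections per mouse.
sections = [
    ("control", "m1", 10.0), ("control", "m1", 12.0), ("control", "m1", 14.0),
    ("control", "m2", 11.0), ("control", "m2", 11.0), ("control", "m2", 11.0),
    ("DLK(iOE)", "m3", 8.0), ("DLK(iOE)", "m3", 9.0), ("DLK(iOE)", "m3", 10.0),
]
print(per_mouse_means(sections))
```

Averaging within each mouse first avoids treating sections from the same animal as independent samples, which is the concern the reviewer raised.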
Reviewer #3 (Public Review):
Dr Jin and colleagues revisit DLK and its established multifactorial roles in neuronal development, axonal injury, and neurodegeneration. The ambitious aim here is to understand the DLK-dependent gene network in the brain and, to pursue this, they explore the role of DLK in hippocampal glutamatergic neurons using conditional knockout and induced overexpression mice. They produce evidence that dorsal CA1 and dentate gyrus neurons are vulnerable to elevated expression of DLK, while CA3 neurons appear unaffected. Then they identify the DLK-dependent translatome featured by conserved molecular signatures and cell-type specificity. Their evidence suggests that increased DLK signaling is associated with possible STMN4 disruptions to microtubules, among else. They also produce evidence on cultured hippocampal neurons showing that expression levels of DLK are associated with changes in neurite outgrowth, axon specification, and synapse formation. They posit that downstream translational events related to DLK signaling in hippocampal glutamatergic neurons are a generalizable paradigm for understanding neurodegenerative diseases.
Strengths
This is an interesting paper based on a lot of work and a high number of diverse experiments that point to the pervasive roles of DLK in the development of select glutamatergic hippocampal neurons. One should applaud the authors for their work in constructing sophisticated molecular cre-lox tools and their expert Ribotag analysis, as well as technical skill and scholarly treatment of the literature. I am somewhat more skeptical of interpretations and conclusions on spatial anatomical selectivity without stereological approaches and also going directly from (extremely complex) Ribotag profiling patterns to relevance based on immunohistochemistry and no additional interventions to manipulate (e.g. by knocking down or blocking) their top Ribotag profile hits. Also, it seems to this reviewer that major developmental claims in the paper are based on gene translational profiling dependent on DLK expression, not DLK activation, despite some evidence in the paper that there is a correlation between the two. Therefore, observed patterns and correlations may or may not be physiologically or pathologically relevant. Generalizability to neurodegenerative diseases is an overreach not justified by the scope, approach, and findings of the paper.
We thank the reviewer for the encouraging and constructive comments on the manuscript.
Weaknesses and Suggestions:
The authors state that the rationale for the translatomic studies is "to gain molecular understanding of gene expression associated with DLK in glutamatergic neurons" and to characterize the "DLK-dependent molecular and cellular network". However, a problem with the experimental design is the selection of an anatomical region at a time point featured by active neurodegeneration. Therefore, it is not straightforward that the differentially expressed genes or pathways caused by DLK overexpression changes could be due to processes related to neurodegeneration. Indeed, the authors find enrichment of signals related to pathways involved in extracellular matrix organization, apoptosis, unfolded protein responses, the complement cascade, DNA damage responses, and depletion of signals related to mitochondrial electron transport, etc., all of which could be the consequence of neurodegeneration regardless of cause. A more appropriate design to discover DLK-dependent pathways might be to look at a region and/or a time point that is not confounded by neurodegeneration.
We appreciate the reviewer’s comment. We included our thoughts in ‘Limitation of the study’ (pg 20):
‘Future studies using cell-type specific RiboTag profiling and other methods at a refined time window will be required to address how DLK dependent signaling interacts with other networks underlying hippocampal regional neuron vulnerability to pathological insults.’
In a related vein, the authors ask "if the differentially expressed genes associated with DLK(iOE) might show correlation to neuronal vulnerability" and, to answer this question, they select the set of differentially expressed genes after DLK overexpression and assess their expression patterns in various regions under normal conditions. It looks to me that this selection is already confounded by neurodegeneration which could be the cause for their downregulation. Therefore, such gene profiles may not be directly linked to neuronal vulnerability. A similar issue also relates to the conclusion that "...the enrichment of DLK-dependent translation of genes in CA1 suggests that the decreased expression of these genes may contribute to CA1 neuron vulnerability to elevated DLK".
We agree with the reviewer’s concern that it is difficult to separate neurodegenerative consequences from changes caused by DLK solely based on our translatomics studies on P15 DLK(iOE) mice. As noted in our responses to reviewer 1 (point 4) and reviewer 2 (point 1), we have included new analysis of P10 mice (Fig.S7A,B), when neurons did not show detectable signs of degeneration.
Several lines of evidence support the view that some differentially expressed genes in DLK(iOE) vs control are likely specific to increased DLK signaling.
First, the genes identified in DLK(iOE) vs control represent a small set of genes (260), which is comparable to other DLK dependent datasets (Asghari Adib et al., 2024) but shows cell-type specificity.
Second, our analysis using rank-rank hypergeometric overlap (RRHO) detects a significant correlation between upregulated genes in DLK(iOE) and downregulated genes in DLK(cKO), and vice versa, suggesting that expression of a similar set of genes depends on DLK (Fig.3C, S6C-E). Consistently, GO term analysis using the list of genes coordinately regulated by DLK, derived from our RRHO analysis, identifies GO terms for up- and downregulated genes similar to those obtained using DLK(iOE)-RiboTag data alone. SynGO analysis of DLK(iOE) regulated genes and DLK(cKO) regulated genes also identified similar synaptic processes among the significantly regulated genes (Fig.3F and S6J).
Third, we performed additional analysis comparing our Vglut1-RiboTag dataset with CamK2-RiboTag and Grik4-RiboTag datasets from 6-week-old wild type mice reported by (Traunmüller et al., 2023; GSE209870). We observed >80% overlap among the top ranked genes (revised Methods). We described this analysis on pg 9 and Fig. S6K-L (and Supplemental Excel File S3):
‘Additionally, we compared our Vglut1-RiboTag datasets with CamK2-RiboTag and Grik4-RiboTag datasets from 6-week-old wild type mice reported by (Traunmüller et al., 2023; GSE209870). We defined a list of genes enriched in CamK2-expressing CA1 neurons relative to Grik4-expressing CA3 neurons (CA1 genes), and those enriched in Grik4-expressing CA3 neurons (CA3 genes) (File S3). When compared with the entire list of Vglut1-RiboTag profiling in our control and DLK(cKO), we found CA1 genes tended to be expressed more in DLK(cKO) mice, compared to control (Fig.S6K), while CA3 genes showed a slight enrichment in control though the trend was less significant, and were less clustered towards one genotype (Fig.S6L). Moreover, many CA1 genes related to cell-type specification, such as FoxP1, Satb2, Wfs1, Gpr161, Adcy8, Ndst3, Chrna5, Ldb2, Ptpru, and Ntm, did not show significant downregulation when DLK was overexpressed. These observations imply that DLK likely specifically down-regulates CA1 genes both under normal conditions and when overexpressed, with a stronger effect on CA1 genes, compared to CA3 genes. Overall, the informatic analysis suggests that decreased expression of CA1 enriched genes may contribute to CA1 neuron vulnerability to elevated DLK, although it is also possible that the observed down-regulation of these genes is a secondary effect associated with CA1 neuron degeneration.’
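For readers unfamiliar with the rank-rank hypergeometric overlap (RRHO) analysis cited in the second point above, the core idea is to slide thresholds down two ranked gene lists and, at each pair of thresholds, ask whether the overlap of the two top sets is larger than expected by chance under a hypergeometric model. The sketch below illustrates only that core step; it is not the authors' pipeline or any published RRHO package. The function names, the step size, and the assumption that both lists rank the same gene universe are choices made here for illustration, and multiple-testing correction and under-enrichment tests are omitted.

```python
from math import comb


def hypergeom_tail(k, population, successes, draws):
    """P(X >= k) for X ~ Hypergeometric(population, successes, draws)."""
    total = comb(population, draws)
    upper = min(successes, draws)
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return favourable / total


def rrho_grid(ranked_a, ranked_b, step):
    """One-sided overlap p-values for top-i of ranked_a vs top-j of ranked_b.

    Assumes both lists contain the same genes, each ranked by some signed
    statistic (e.g. most up-regulated first). To test for anti-correlation,
    such as up in one condition vs down in another, reverse one list first.
    """
    n = len(ranked_a)
    grid = {}
    for i in range(step, n + 1, step):
        top_a = set(ranked_a[:i])
        for j in range(step, n + 1, step):
            overlap = len(top_a & set(ranked_b[:j]))
            grid[(i, j)] = hypergeom_tail(overlap, n, i, j)
    return grid


# Hypothetical toy rankings over the same six placeholder genes.
list_a = ["g1", "g2", "g3", "g4", "g5", "g6"]
list_b = ["g2", "g1", "g3", "g6", "g5", "g4"]
for thresholds, p in rrho_grid(list_a, list_b, step=3).items():
    print(thresholds, round(p, 3))
```

In practice the grid of p-values is usually displayed as a heat map, with signal concentrated in the corners indicating concordant or discordant regulation between the two datasets.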
To understand the role and relevance of the DLK overexpression model, there should be a discussion of to what extent it corresponds to endogenous levels of DLK expression or DLK-MAPK pathway activation under baseline or pathological conditions.
We appreciate the suggestion, which is similar to R2 point 3. We have revised the text and discussion to include how DLK levels may be altered in other physiological conditions vs our mice.
Pg. 6: ‘Several studies have reported that DLK protein levels increase under a variety of conditions, including optic nerve crush (Watkins et al., 2013), NGF withdrawal (~2 fold) (Huntwork-Rodriguez et al., 2013; Larhammar et al., 2017), and sciatic nerve injury (Larhammar et al., 2017). Induced human neurons show increased DLK abundance about ~4 fold in response to ApoE4 treatment (Huang et al., 2019). Increased expression of DLK can lead to its activation through dimerization and autophosphorylation (Nihalani et al., 2000)’.
And,
‘Additional analysis at the mRNA level (supplemental excel, File S2. WT vs DLK(iOE) DEGs) and at the protein level (Fig.S8E) suggest that the increase in DLK abundance was around 3 times the control level. The localization patterns of DLK protein appeared to vary depending on region of hippocampus and age of animals in both control and Vglut1<sup>Cre/+</sup>;H11-DLK<sup>iOE/+</sup> mice (Fig.S3C).’
In Discussion (pg. 16): ‘The levels of DLK in our DLK(iOE) mice model appear comparable to those reported under traumatic injury and chronic stress.’
The authors posit that "dorsal CA1 neurons are vulnerable to elevated DLK expression, while neurons in CA3 appear largely resistant to DLK overexpression". This statement assumes that DLK expression levels start at a similar baseline among regions. Do the authors have any such data? Ideally, they should show whether DLK expression and p-c-Jun (as a marker of downstream DLK signaling) are the same or different across regions in both WT and overexpression mice. For example, what are the DLK/p-c-Jun expression levels in regions other than CA1 in Supplementary Figures 2-3 and how do they compare with each other? Normalization to baseline for each region does not allow such a comparison. Also, in Supplementary Figure 6, analyses and comparisons between regions are done at a time point when degeneration has already started. Ideally, these should be done at P10.
We thank the reviewer for raising these points. In the revised manuscript we have included protein expression analysis of DLK (Fig S3), c-Jun, and p-c-Jun at P10 (Fig. S7).
We provided a quantification of DLK immunostaining intensity in CA1 and CA3 in Fig.S3D,E and find roughly comparable levels between regions.
Pg. 6: ‘Additional analysis at the mRNA level (supplemental excel, File S2. WT vs DLK(iOE) DEGs) and at the protein level (Fig.S8E) suggest that the increase in DLK abundance was around 3 times the control level. The localization patterns of DLK protein appeared to vary depending on region of hippocampus and age of animals in both control and Vglut1<sup>Cre/+</sup>;H11-DLK<sup>iOE/+</sup> mice (Fig.S3C).’
We provided our quantifications without normalization to baseline in each region for c-Jun and p-c-Jun, and revised the text accordingly:
Pg. 9-10: ‘In control mice, glutamatergic neurons in CA1 had low but detectable c-Jun immunostaining at P10 and P15, but reduced intensity at P60; those in CA3 showed an overall low level of c-Jun immunostaining at P10, P15 and P60; and those in DG showed a low level of c-Jun immunostaining at P10 and P15, and an increased intensity at P60 (Fig.S7A,C,E). In Vglut1<sup>Cre/+</sup>;H11-DLK<sup>iOE/+</sup> mice at P10 when no discernable neuron degeneration was seen in any regions of hippocampus, only CA3 neurons showed a significant increase of immunostaining intensity of c-Jun, compared to control (Fig.S7A). In P15 mice, we observed further increased immunostaining intensity of c-Jun in CA1, CA3, and DG, with the strongest increase (~4-fold) in CA1, compared to age-matched control mice (Fig.S7C). The overall increased c-Jun staining is consistent with RiboTag analysis’.
Pg. 10: ‘In Vglut1<sup>Cre/+</sup>;H11-DLK<sup>iOE/+</sup> mice, we observed increased p-c-Jun positive nuclei in CA1 at P10, and a strong increase in CA1 (~10-fold), CA3 (~6-fold), and DG (~8-fold) at P15 (Fig.S7B,D).’
Illustration of proposed selective changes in hippocampal sector volume needs to be very carefully prepared in view of the substantial claims on selective vulnerability. In 2A under P15 and especially P60, it is difficult to see the difference - this needs lower magnification and a lot of care that anteroposterior levels are identical because hippocampal sector anatomy and volumes of sectors vary from level to level. One wonders if the cortex shrinks, too. This is important.
Thank you for raising this point. We have provided images showing the anteroposterior level in Fig.S2A-C. We noticed that the cortex in DLK(OE) mice becomes thinner, along with expansion of the ventricles in some animals at later timepoints (Fig.S2C).
One cannot be sure that there is selective death of hippocampal sectors with DLK overexpression versus, say, rearrangement of hippocampal architecture. One may need stereological analysis, otherwise this substantial claim appears overinterpreted.
We appreciate the comment.
In the revised manuscript, we included a new supplemental figure (Fig. S2) showing lower magnification images of coronal sections, and used cautionary wording, such as ‘CA3 is less vulnerable, compared to CA1’, to minimize the impression of over-interpretation. By NeuN staining, at P10, P15, P60, we did not observe detectable difference in overall hippocampus architecture, apart from noted cell death of CA1 and DG and associated thinning of each of the layers. At 46 weeks, some animals showed differences in the overall shape of dorsal hippocampus, though this appeared to reflect a disproportionately large CA3 region compared to other regions (Fig S2). Increased GFAP staining (Fig.S5A-C) was detected in CA1 but not in CA3, and microglia by IBA1 staining (Fig.S5E) also displayed less reactivity in CA3, compared to CA1. Thus, based on NeuN staining, GFAP staining, IBA1 staining and analysis of the differentially regulated genes, we infer that the effect of DLK(iOE) in CA1 is different than the effect on CA3.
Is the GFAP excess reflective of neuroinflammation? What do microglial markers show? The presence of neuroinflammation does not bode well with apoptosis. Speaking of which, TUNEL in one cell in Supplementary Figure 4E is not strong evidence of a more widespread apoptotic event in CA1.
We have included staining data for the microglia marker IBA1. Both GFAP and IBA1 showed evidence of reactivity particularly in the CA1 region (S5A-E), supporting the differential vulnerability in different regions, though whether cell death is primarily due to apoptosis is unclear.
We agree that our data of sparse TUNEL staining at P15 (Fig S5F,G) do not rule out that other mechanisms of cell death may also occur. We have included this in our limitations (pg.20): “While we find evidence for apoptosis, other forms of cell death may also occur.”
In several places in the paper (as illustrated in Figure 4B, Supplementary Figure 2B, etc.): the unit of biological observation in animal models is typically not a cell, but an organism, in which averaged measures are generated. This is a significant methodological problem because it is not easy to sample neurons without involving stereological methods. With the approach taken here, there is a risk that significance may be overblown.
We appreciate the reviewer’s point. We used the same region for quantification of RNAscope, and quantification was genotype-blind when possible. We revised the graphs to show mean values for individual mice in Fig.4B, 4C, and Fig.S3B (previously Fig.S2B).
Other Comments and Questions:
Supplementary Figure 9: The authors state that data points are shown for individual ROIs - ideally, they should also show averages for biological replicates. Can the authors confirm that statistical analyses are based on biological replicates (mice) and not ROIs?
We have revised the graphs to show averages from individual mice in Fig.5B-D, Fig.5E-F (previously Fig.S9G-I), Fig.5H-J, Fig.5K-L (previously Fig.S9J-L), and Fig.S10B,C,E,F (previously Fig.S9B,C,E,F). The statistical analyses are based on biological replicates (mice).
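The distinction between ROI-level and mouse-level observations can be made concrete with a short sketch. It collapses per-ROI measurements to one mean per mouse, so that the number of values entering a statistical test equals the number of animals. The function name, mouse identifiers, genotype labels, and values are hypothetical; the source does not describe how the averaging was implemented.

```python
from collections import defaultdict
from statistics import mean


def per_mouse_means(measurements):
    """Collapse ROI-level values to one mean per mouse, grouped by genotype.

    measurements: iterable of (mouse_id, genotype, value), one tuple per ROI.
    Returns {genotype: [mean for each mouse, ordered by mouse_id]}.
    """
    values_by_mouse = defaultdict(list)
    genotype_of = {}
    for mouse_id, genotype, value in measurements:
        values_by_mouse[mouse_id].append(value)
        genotype_of[mouse_id] = genotype

    grouped = defaultdict(list)
    for mouse_id in sorted(values_by_mouse):
        grouped[genotype_of[mouse_id]].append(mean(values_by_mouse[mouse_id]))
    return dict(grouped)


# Hypothetical ROI measurements: six ROIs from three mice.
rois = [
    ("m1", "control", 1.0), ("m1", "control", 3.0),
    ("m2", "control", 2.0),
    ("m3", "iOE", 4.0), ("m3", "iOE", 6.0), ("m3", "iOE", 8.0),
]
means = per_mouse_means(rois)
```

Here six ROIs reduce to three data points (two control mice, one iOE mouse), which is the sample size a test based on biological replicates would use.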
For in vitro experiments, what is the effect of DLK overexpression on neuronal viability and density? Could these variables confound effects on synaptogenesis/synapse maturation?
As described in the Methods, we made hippocampal neuron cultures from P1 pups of the following crosses:
For control: Vglut1<sup>Cre/+</sup> X Rosa26<sup>tdT/+</sup>
For DLKcKO: Vglut1<sup>Cre/+</sup>;DLK(cKO)<sup>fl/fl</sup> X Vglut1<sup>Cre/+</sup>;DLK(cKO)<sup>fl/fl</sup>;Rosa26<sup>tdT/+</sup>
For DLKiOE: H11-DLK<sup>iOE/iOE</sup> X Vglut1<sup>Cre/+</sup>;Rosa26<sup>tdT/+</sup>
Dissociated cells from a given litter were pooled into the same culture. Because there were different proportions of neurons with our genotype of interest in each culture, it is not simple to know whether DLK was causing significant cell death.
On pg 13, we stated our observation:
‘We did not notice an obvious effect of DLK(iOE) or DLK(cKO) on neuron density in cultures at DIV2. To assess neuronal type distribution in our cultures, we immunostained DIV14 neurons with antibodies for Satb2, as a CA1 marker (Nielsen et al., 2010), and Prox1, as a marker of DG neurons (Iwano et al., 2012). We did not observe significant differences in the proportion of cells labeled with each marker in DLK(cKO) or DLK(iOE) cultures (Fig.S13E). These data are consistent with the idea that DLK signaling does not have a strong role in neuron-type specification both in vivo and in vitro’.
We cannot rule out that variable factors in our cultures may confound effects on synaptogenesis/synapse maturation, and we hope that future studies will provide clarity.
Correlations between c-jun expression and phosphorylation are extremely important and need to be carefully and convincingly documented. I am a bit concerned about Supplementary Figure 6 images, especially 6B-CA1 (no difference between control and KO, too small images) and 6D (no p-c-Jun expression at all anywhere in the hippocampus at P15?).
At P10, P15, and P60 we stained for p-c-Jun using the Rabbit monoclonal p-c-Jun (Ser73) (D47G9) antibody from Cell Signaling (cat# 3270) at a 1:200 dilution and imaged using an LSM800 confocal microscope with a 20x objective. We observed p-c-Jun to be quite low generally in control animals. We have replaced the images in Fig.S7F (previously S6D), and adjusted the brightness/contrast to enable better visualization of the low signal in Fig.S7B,D,F (previously Fig.S6B,D).
We revised our text to present the data carefully as stated above:
Pg. 9-10: ‘In control mice, glutamatergic neurons in CA1 had low but detectable c-Jun immunostaining at P10 and P15, but reduced intensity at P60; those in CA3 showed an overall low level of c-Jun immunostaining at P10, P15 and P60; and those in DG showed a low level of c-Jun immunostaining at P10 and P15, and an increased intensity at P60 (Fig.S7A,C,E). In Vglut1<sup>Cre/+</sup>;H11-DLK<sup>iOE/+</sup> mice at P10 when no discernable neuron degeneration was seen in any regions of hippocampus, only CA3 neurons showed a significant increase of immunostaining intensity of c-Jun, compared to control (Fig.S7A). In P15 mice, we observed further increased immunostaining intensity of c-Jun in CA1, CA3, and DG, with the strongest increase (~4-fold) in CA1, compared to age-matched control mice (Fig.S7C). The overall increased c-Jun staining is consistent with RiboTag analysis’.
Pg. 10: ‘In Vglut1<sup>Cre/+</sup>;H11-DLK<sup>iOE/+</sup> mice, we observed increased p-c-Jun positive nuclei in CA1 at P10, and strong increase in CA1 (~10-fold), CA3 (~6-fold), and DG (~8-fold) at P15 (Fig.S7B,D).
Recommendations for the authors:
Several major and minor reservations were raised. The major issues are the need for more information about the over-expression of DLK and a need to extrapolate to an in vivo condition with DLK. A considerable amount of useful information is presented with some very nicely done experiments but it is not yet a coherent or integrated story. The lack of impact of DLK overexpression in some neurons is perhaps the most impactful observation of the study and would be great to have more information around the differential transcriptional/signaling response in these cell types. There is also a need for more experimental details and to address several questions about the mouse genetic and translatome analysis. They are valid concerns that require attention by the authors.
We thank the editors and reviewers for their thoughtful evaluation and suggestions. We hope that the editors and reviewers find that the new data and text changes in our revised manuscript, along with the above point-by-point response, have addressed the concerns and strengthened our findings.
Minor points:
(1) The authors state that deletion of DLK has no effect on CA1 at 1yr, however, the image of CA1 in Figure S1D shows substantially fewer NeuN+ neurons. Is this a representative field of view?
We have re-examined the images and observed no effect on hippocampal morphology at 1 yr. We have now included representative images in the revised Fig S1D.
(2) Is the DLK protein section staining in Figure 2C a real signal? The staining looks like speckles and is purely somatic. Axonal staining is widely expected based on the literature and the authors' own work. There should be a specificity control.
To our knowledge, axonal staining of DLK reported in the literature is mostly based on cultured DRG neurons. In addition to the reported axonal localization, DLK is present in the cell soma, near the golgi (Hirai et al., 2002), and in the post-synaptic density (Pozniak et al., 2013).
In the revised manuscript, we addressed this point by including controls with no primary antibody, and using an antibody against the closely related kinase, LZK. These additional data are shown in (Fig.S3C,D) (previously Fig.S2C), supporting that DLK protein staining represents real signal. At P10 and P15, DLK immunostaining around CA3 showed axonal staining of the mossy fibers, as well as in the soma and dendritic layers (Fig.S3C,D). A similar pattern was also seen in primary cultured neurons (Fig 6A).
(3) The protein expression of DLK in the transgenic overexpressor (Figure S7C) looks, to the resolution of this blot, to be at least 50kD heavier than 'WT' DLK. Can the authors explain this discrepancy?
The Cre-induced DLK(iOE) transgene has T2A and tdTomato in-frame at the C-terminus of DLK. It is known that T2A ‘self-cleavage’ is often incomplete. DLK-T2A-tdTomato would be about 50 kD bigger than WT DLK. We now include the transgene design in revised Fig S1D, and also state in the figure legend of Fig.S8C (previously S7C) that ‘Larger molecular weight band of DLK in Vglut1<sup>Cre/+</sup>;H11-DLK<sup>iOE/+</sup> would match the predicted molecular weight of DLK-T2A-tdTomato if T2A-peptide induced ‘self-cleavage’ due to ribosomal skipping is ineffective (Fig.S1D).’
(4) Expression changes in DLK affect various aspects of neurites in CA1 cultures (Figure 6), and changes in DLK also modestly affect STMN4 (and 2, perhaps indirectly) levels (Figure S7C), but there is no indication that DLK acts via STMN4 to cause these changes. It is not clear what to make of these data. Of note, Stmn4 levels change in response to DLK in CA3, without DLK affecting cell death in this region.
We appreciate and agree with the comment. Other studies (Asghari Adib et al., 2024; DeVault et al., 2024; Hu et al., 2019; Larhammar et al., 2017; Le Pichon et al., 2017; Shin et al., 2019; Watkins et al., 2013) reported expression changes in Stmn4 mRNAs in other cell types and cellular contexts, which appeared to depend on DLK. Hippocampal neurons express multiple Stmns (Fig.S8A). While we present our analysis on the effects of DLK dosage on Stmn4, and also Stmn2, we do not think that DLK-induced changes of Stmn4 expression per se are a major factor underlying CA1 cell death vs CA3 survival.
In the revised manuscript, we addressed this point in ‘Limitation of our study’ (pg 20):
‘Additional experiments will be needed to elucidate in vivo roles of STMN4 and its interaction with other STMNs’.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This study presents important findings on the function of enteric glia expressing proteolipid protein 1 (PLP1+ glia). The evidence supporting the claims of the authors is solid, although the inclusion of additional data showing the mechanisms by which PLP1+ enteric glia acts on Paneth cells would have strengthened the study. The work will be of interest to colleagues studying intestinal biology.
-
Reviewer #1 (Public review):
The role of enteric glial cells in regulating intestinal mucosal functions at steady state has been a matter of debate in recent years. Enteric glial cell heterogeneity and related methodological differences likely underlie the contrasting findings obtained by different laboratories. Here, Prochera and colleagues used Plp1-CreERT2 driver mice to deplete the vast majority of enteric glia from the gut, and performed an elegant set of transcriptomic, microscopic and biochemical assays to examine the impact of enteric glia loss. It was found that enteric glia depletion has very limited effects on the transcriptome of gut cells 11 days after tamoxifen treatment (used to induce Diphtheria Toxin A expression in the majority of enteric glia including those present in the mucosa), and by extension - more specifically, has only minimal impact on cells of the intestinal mucosa. Interestingly, in the colon (where Paneth cells are not present) they did observe transcriptomic changes related to Paneth cell biology. Although no overt gene expression alterations were found in the small intestine - also not in Paneth cells - morphological, ultrastructural and functional changes were detected in the Paneth cells of enteric glia-depleted mice. In addition, and likely related to impaired Paneth cell secretory activity, enteric glia-depleted mice also show alterations in intestinal microbiota composition. This is an excellent study that convincingly demonstrates a role for enteric glia in supporting Paneth cells of the intestinal mucosa, suggesting that enteric glial cells shape host-microbiome interactions via the regulation of Paneth cell homeostasis.
-
Reviewer #2 (Public review):
This is an excellent and timely study from the Rao lab investigating the interactions of enteric glia with the intestinal epithelium. Two early studies in the late 90's and early 2000's had previously suggested that enteric glia play a pivotal role in control of the intestinal epithelial barrier, as their ablation using mouse models resulted in severe and fatal intestinal inflammation. However, it was later identified that these inflammatory effects could have been an indirect product of the transgenic mouse models used, rather than due to the depletion of enteric glia. In previous studies from this lab, the authors had identified expression of PLP1 in enteric glia, and its use in CRE driver lines to label and ablate enteric glia.
In the current paper, the authors carefully examine the role of enteric glia by first identifying that PLP1-creERT2 is the most useful driver to direct enteric glial ablation, in terms of the quantity of glial cells targeted, their proximity to the intestinal epithelium, and the relevance for human studies (GFAP expression is rather limited in human samples in comparison). They examined gene expression changes in different regions of the intestine using bulk RNA-seq following ablation of enteric glia by driving expression of diphtheria toxin A (PLP1-creERT2;Rosa26-DTA). Alterations in gene expression were observed in different regions of the gut, with specific effects in different regions. Interestingly, while there were gene expression changes in the epithelium, there were limited changes to the proportions of different epithelial cell types identified using immunohistochemistry in control vs glial-ablated mice. The authors then focused on investigation of Paneth cells in the ileum, identifying changes in the ultrastructural morphology and lysozyme activity. In addition, they identified alterations in gut microbiome diversity. As Paneth cells secrete antimicrobial peptides, the authors conclude that the changes in gut microbiome are due to enteric glia-mediated impacts on Paneth cell activity.
Overall, the study is excellent and delves into the different possible mechanisms of action, including investigation of changes in enteric cholinergic neurons innervating the intestinal crypts. The use of different CRE-drivers to target enteric glial cells has led to varying results in the past, and the authors should be commended on how they address this in the Discussion.
Comments on the latest version:
Thanks to the authors for addressing my concerns. The additional stratification of male vs female microbiome data was very helpful.
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public Review):
The role of enteric glial cells in regulating intestinal mucosal functions at a steady state has been a matter of debate in recent years. Enteric glial cell heterogeneity and related methodological differences likely underlie the contrasting findings obtained by different laboratories. Here, Prochera and colleagues used Plp1-CreERT2 driver mice to deplete the majority of enteric glia from the gut. They found that glial loss has very limited effects on the transcriptome of gut cells 11 days after tamoxifen treatment (used to induce DTA expression), and by extension - more specifically, has only minimal impact on cells of the intestinal mucosa. Interestingly, in the colon (where Paneth cells are not present) they did observe transcriptomic changes related to Paneth cell biology. Although no overt gene expression alterations were found in the small intestine - also not in Paneth cells - morphological, ultrastructural, and functional changes were detected in the Paneth cells of enteric glia-depleted mice. In addition, and possibly related to Paneth cell dysfunction, enteric glia-depleted mice also show alterations in intestinal microbiota composition.
In their analyses of enteric glia from existing single-cell transcriptomic data sets, it is stated that these come from 'non-diseased' humans. However, the data on the small intestine is obtained from children with functional gastrointestinal disorders (Zheng 2023). Data on colonic enteric glia was obtained from colorectal cancer patients (Lee 2020). Although here the cells were isolated from non-malignant regions, saying that the large intestines of these patients are non-diseased is probably an overstatement.
In the Zheng et al. dataset, “functional GI disorders” refers to biopsies from children that do not have any histopathologic evidence of digestive disease. The children do, however, have at least one GI symptom that prompted a diagnostic endoscopy with biopsies, leading to the designation of “functional” disorder. Given that diagnostic endoscopies are invasive procedures that necessitate anesthesia, obtaining biopsies from asymptomatic children without any clinical indication would not be allowable per most institutional review boards, leading the authors of that study to use these samples as a control group. We had thus used the “non-diseased” label to encompass these samples as well as those from the unaffected regions of large intestine from colorectal cancer patients. We now recognize, however, that this label could be misleading, so we have revised the Results and Figure Legends to more accurately reflect details of control tissue origin for this and the Lee et al. (2020) datasets. Per the reviewer’s suggestion, we have removed the term “non-diseased”.
Another existing dataset including human mucosal enteric glia of healthy subjects is presented in Smillie et al (2019). It would be interesting to see how the current findings relate to the data from Smillie et al.
Per the reviewer’s suggestion, we have now added an analysis of the Smillie et al. dataset in Supp. Fig. 1B. This dataset derives from colonic mucosal biopsies from 12 healthy adults (8480 stromal cells) and 18 adults with ulcerative colitis (10,245 stromal cells from inflamed bowel segments and 13,147 from uninflamed), all between the ages of 20-77 years. These data show that SOX10, PLP1, and S100B are selectively expressed within the putative glial cluster from colonic mucosa of both healthy adults and individuals with ulcerative colitis, whereas GFAP is not detected (Supp. Fig. 1B). These observations are consistent with our observations from the two other human datasets already included in our manuscript in Fig. 1 and Supp. Fig. 1.
The time between enteric glia depletion and analyses (mouse sacrifice) must be a crucial determinant of the type of effects, and the timing thereof. In the current study 11 days after tamoxifen treatment was chosen as the time point for analyses, which is consistent with earlier work by the lab using the same model (Rao et al 2017). What would happen when they wait longer than 11 days after tamoxifen treatment? Data, not necessarily for all parameters, on later time points would strengthen the manuscript significantly.
This is an excellent question, particularly given the longer-lived nature of Paneth cells relative to other epithelial cell types. As detailed in our previous study, Cre<sup>+</sup> mice in the Plp1CreER-DTA model are well-appearing and indistinguishable from their Cre-negative control littermates through 11dpt. Unfortunately, a limitation of the model is that beyond 11dpt, Cre<sup>+</sup> mice become anorexic, lose body weight, and have signs of neurologic debility such as hindlimb weakness and uncoordinated gait. These deficits are overt by 14dpt and likely due to targeting Plp1<sup>+</sup> glia outside the gut, such as Schwann cells and oligodendrocytes (as described in another study which used a similar model to study demyelination in the central nervous system, PMID: 20851998). Given these CNS effects and that starvation is well known to affect Paneth cell phenotypes (PMIDs: 1167179, 21986443), we elected not to examine timepoints beyond 11dpt. Technological advances that enable more selective cell depletion will allow study of chronic effects of enteric glial loss in the future.
The authors found transcriptional dysregulation related to Paneth cell biology in the colon, where Paneth cells are normally not present. Given the bulk RNA sequencing approach, the cellular identity in which this shift is taking place cannot be determined. However, it would be useful if the authors could speculate on which colonic cell type they reckon this is happening in.
Per the reviewer’s suggestion, we have added a paragraph to the Discussion addressing one plausible hypothesis to explain this observation. Paneth-like cells have been described in the large intestine and are known, particularly in humans, to express markers typical of Paneth cells, such as lysozyme and defensins (PMID: 27573849, 31753849). These cells could represent the source of the Paneth cell-like transcriptional signature observed in our model. Alternatively, ectopic expression of Paneth cell-associated genes in the colon has been documented in certain pathological conditions, such as colorectal cancer models (e.g., PMID: 15059925), where changes in the local microenvironment appear to trigger activation of Paneth cell genes. Similar, yet unidentified changes in our model could potentially underlie the transcriptional dysregulation related to Paneth cell biology observed here.
On the other hand, enteric glia depletion was found to affect Paneth cells structurally and functionally in the small intestine, where transcriptional changes were initially not identified. Only when performing GSEA with the in silico help of cell type-specific gene profiles, differences in Paneth cell transcriptional programs in the small intestine were uncovered. A comment on this discrepancy would be helpful, especially for the non-bioinformatician readers among us.
Standard differential gene expression (DEG) analysis of the effects of glial loss revealed significant differences only in the colon, and even then, only a handful of genes were changed. These changes were not accompanied by corresponding changes at the protein level, at least as detectable by IHC. In the small intestine, there were no significant differences by standard DEG thresholds. Unlike DEG analysis, gene set enrichment analysis (GSEA) provides a significance value based on whether a higher-than-chance number of genes in a set change in a uniform direction, without consideration for the significance of the magnitude of change. Therefore, the GSEA detected that a significant number of genes in the curated Paneth cell gene list exhibited a positive fold change difference in the bulk RNA sequencing data. This prompted us to examine Paneth cells and other epithelial cell types in more detail by IHC, functional and ultrastructural analyses, which all converged on the observation that Paneth cells were relatively selectively disrupted in the epithelium of glial depleted mice.
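For readers less familiar with the method, the idea can be sketched as follows. This is a simplified, unweighted running-sum enrichment score with a permutation p-value, written only to show why a gene set can be significant even when no single gene passes a DEG threshold. It is not the software or the weighting used in the study, and all gene names are placeholders.

```python
import random


def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score (Kolmogorov-Smirnov-like).

    ranked_genes: genes ordered from most positive to most negative fold change.
    Walks down the list, stepping up at set members and down otherwise, and
    returns the running-sum value with the largest absolute deviation from zero.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")

    running, best = 0.0, 0.0
    for is_hit in hits:
        running += 1.0 / n_hits if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


def permutation_p_value(ranked_genes, gene_set, n_perm=1000, seed=0):
    """Compare the observed score against scores of random same-size gene sets."""
    rng = random.Random(seed)
    observed = enrichment_score(ranked_genes, gene_set)
    set_size = sum(gene in gene_set for gene in ranked_genes)
    extreme = 0
    for _ in range(n_perm):
        random_set = set(rng.sample(ranked_genes, set_size))
        if abs(enrichment_score(ranked_genes, random_set)) >= abs(observed):
            extreme += 1
    return observed, (extreme + 1) / (n_perm + 1)


# Placeholder ranking: 20 genes, with the hypothetical set clustered at the top.
ranked = ["g%d" % i for i in range(20)]
toy_set = {"g0", "g1", "g2", "g3"}
score, p_value = permutation_p_value(ranked, toy_set)
```

The score depends only on where the set members fall in the ranking, not on whether any individual fold change is itself significant, which is the property the response describes.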
From looking at Figure 3B it is clear that Paneth cells are not the only epithelial cell type affected (after less stringent in silico analyses) by enteric glial cell depletion. Although the authors show that this does not translate into ultrastructural or numerical changes of most of these cell types, this makes one wonder how specific the enteric glia - Paneth cell link is. Besides possible indirect crosstalk (via neurons), it is not clear if enteric glia more closely associate with Paneth cells as compared to these other cell types. Immunofluorescence stainings of some of these cells in the Plp1-GFP mice would be informative here.
Enteric glia have long been reported to closely associate with crypts, the sites of residence for Paneth cells and intestinal stem cells (PMID: 7043279, 16423922). Consistent with these reports, our observations from Plp1-eGFP mice confirm that enteric glia often appose the entire base of small intestinal crypts (see Author response image 1 below). Given this reproducible observation, we did not pursue histological quantification to compare preferential glial apposition to specific epithelial cell types. Enteric glia have been reported to form close associations with enteroendocrine cells as well (PMID: 24587096), which is not surprising because these cells are highly innervated; however, our analyses did not reveal changes in the abundance and morphology of these cells or other epithelial cell types.
Author response image 1.
(A) Immunohistochemical staining of a small intestinal cross-section from a Vil1<sup>Cre</sup>Rosa26<sup>tdTomato/+</sup> Plp1<sup>eGFP</sup> transgenic mouse in which enteric glia are labeled with green fluorescent protein (GFP) and intestinal epithelial cells are labeled with tdTomato. (B) Mucosal glia closely associate with epithelial cells in intestinal crypts. Scale bar – 20µm.
The authors mention IL-22 as a possible link, but do Paneth cells express receptors for transmitters commonly released by enteric glia? Maybe they can have a look at putative cell-cell interactions by mapping ligand-receptor pairs in the scRNAseq datasets they used.
Beyond IL-22R, it is established that Paneth cells express receptors for secreted WNT proteins, which enteric glia have been shown to express (PMID: 34727519). This interaction could potentially be involved in glial regulation of Paneth cells, but mice lacking glia do not exhibit the same phenotypes as mouse models with disrupted WNT signaling. For example, animals lacking the WNT receptor Frizzled-5 in Paneth cells have mislocalization of Paneth cells to the villi (PMID: 15778706), which we do not readily observe in Plp1CreER-DTA mice. Furthermore, while mucosal enteric glia have been proposed as a source of WNT ligands, this role has been specifically attributed to GFAP+ cells, which may or may not be glia in the mucosa. Moreover, several other cell types in the mucosa around crypts have also been identified as significant sources of WNT ligands (PMID: 16083717, 22922422). We have now added these ideas to the Discussion.
Per the reviewer’s suggestion to use bioinformatics to explore other potential ligand-receptor pairings that might underlie glial regulation of Paneth cells, we conducted a CellPhoneDB analysis focused on these two cell types with a collaborator. This analysis highlighted a handful of potential ligand-receptor interactions, but none of these pathways could be clearly linked to the observed Paneth cell phenotype. Furthermore, virtually all the candidate interactions were not specific to glia, with the candidate ligands expressed by many other more abundant cell types in the mucosa. For these reasons, we decided not to include this analysis in the revised manuscript.
Previously the authors showed that enteric glia regulation of intestinal motility is sex-dependent (Rao et al 2017). While enteric glia depletion caused dysmotility in female mice, it did not affect motility in males. For this reason, most experiments in the current study were conducted in male mice only. However, for the experiments focusing on the effect of enteric glia depletion on host-microbiome interactions and intestinal microbiota composition, both male and female mice were used. In Figure 8A male and female mice are distinctly depicted but this was not done for Figure 8C. Separate characterization of the microbiome of male and female mice would have helped to figure out how much intestinal dysmotility (in females) contributes to the effect on gut microbial composition. This is an important exercise to confirm that the effect on the microbiome is indeed a consequence of altered Paneth cell function, as suggested by the authors (in the results and discussion, and in the abstract).
In our microbiome analysis, we initially analyzed males and females separately but did not observe significant differences between the two sexes. Thus, we merged the data to increase the statistical power of the genotype comparisons. It was an oversight on our part to not label the datapoints by sex as we did for the other data in the manuscript. We have now revised the figures related to microbiome characterization (Fig. 5D-E and Supp. Fig. 8C) to indicate the sexes of the mice used. Stratifying the data by sex within-sample revealed no major sex-specific differences in microbiome diversity or enriched/depleted biomarkers in the core genotype-dependent observations.
In this context, it would also be interesting to compare the bulk sequencing data after enteric glia depletion between female and male mice.
Our bulk sequencing analysis of the effects of glial loss was conducted in male mice only in order to assess the effects independent of colonic dysmotility, a phenotype observed only in female Plp1CreER-DTA animals (PMID: 28711628). Given that we found rather muted transcriptional changes in male mice, we chose not to perform subsequent transcriptional analyses in female mice, further reasoning that any changes identified would most likely be attributable to dysmotility rather than direct glial effects. Future studies focusing on sex differences in the small intestine, where motility in the Plp1CreER-DTA model is unaffected by glial loss, could provide additional insights, especially in light of the recently reported sex differences in the gene expression and activity levels of enteric glia in the myenteric plexus (PMID: 34593632, 38895433).
Reviewer #1 (Recommendations For The Authors):
- Intro 2nd paragraph: please add to the sentence: "They found no major defects in epithelial properties AT STEADY STATE (or during homeostasis)."
Revised as suggested.
- There seems to be a word missing in the 2nd sentence of paragraph 2 on page 4. "... but xxx consistent...".
Reviewed and there were no missing words.
- In the 2nd paragraph on page 8, when discussing GFAP expression in IBD patients, a reference is missing. Also, here it should be GFAP, not Gfap (in italics).
Revised as suggested.
Reviewer #2 (Public Review):
This is an excellent and timely study from the Rao lab investigating the interactions of enteric glia with the intestinal epithelium. Two early studies in the late 1990s and early 2000s had previously suggested that enteric glia play a pivotal role in control of the intestinal epithelial barrier, as their ablation using mouse models resulted in severe and fatal intestinal inflammation. However, it was later identified that these inflammatory effects could have been an indirect product of the transgenic mouse models used, rather than due to the depletion of enteric glia. In previous studies from this lab, the authors had identified expression of PLP1 in enteric glia, and its use in CRE driver lines to label and ablate enteric glia.
In the current paper, the authors carefully examine the role of enteric glia by first identifying that PLP1-creERT2 is the most useful driver to direct enteric glial ablation, in terms of the number of glial cells targeted, their proximity to the intestinal epithelium, and the relevance for human studies (GFAP expression is rather limited in human samples in comparison). They examined gene expression changes in different regions of the intestine using bulk RNA-seq following ablation of enteric glia by driving expression of diphtheria toxin A (PLP1-creERT2;Rosa26-DTA). Alterations in gene expression were observed in different regions of the gut, with specific effects in different regions. Interestingly, while there were gene expression changes in the epithelium, there were limited changes to the proportions of different epithelial cell types identified using immunohistochemistry in control vs glial-ablated mice. The authors then focused on the investigation of Paneth cells in the ileum, identifying changes in the ultrastructural morphology and lysozyme activity. In addition, they identified alterations in gut microbiome diversity. As Paneth cells secrete antimicrobial peptides, the authors conclude that the changes in gut microbiome are due to enteric glia-mediated impacts on Paneth cell activity.
Overall, the study is excellent and delves into the different possible mechanisms of action, including the investigation of changes in enteric cholinergic neurons innervating the intestinal crypts. The use of different CRE drivers to target enteric glial cells has led to varying results in the past, and the authors should be commended on how they address this in the Discussion.
We thank the reviewer for this positive feedback.
Reviewer #2 (Recommendations For The Authors):
I have a few minor comments:
Changes in bacterial diversity - the authors make a very compelling case that changes in the proportions of various intestinal microbiome species were impacted by the change in Paneth cell secretions resulting from the depletion of enteric glia. Another potential mechanism of action could be alterations in gut motility resulting from loss of enteric glia. It appears that faecal samples were collected from both male and female mice, and hence changes in colonic motility could be involved. This should be addressed in the Results and Discussion.
We agree with the reviewer that GI dysmotility could influence microbial composition. To address this, we initially analyzed microbiome data separately for male and female mice, because only female Plp1CreER-Rosa26DTA mice exhibit dysmotility. We found no significant sex-specific differences in microbiome composition, however, which suggested to us that dysmotility was unlikely to be the primary driver of the observed microbial changes. Based on these findings, we opted to combine data from male and female mice in our final microbiome analysis. We have now revised the Results, Discussion, and Methods sections to clarify this.
Supplementary Figure 2: it would be helpful to include some labels of landmarks on the tissues, and arrows pointing to immunoreactive cells.
We have added labels and arrows to images in Supplementary Figure 2 per the reviewer’s suggestion.
Figure 4B: It's hard to tell the difference in ultrastructural morphology of the Paneth cells between Cre- and Cre+ mice in the EM images. Heterogeneous granules (PG) seem to be labelled in cells from both genotypes of mice. Some outlines of cells or arrows pointing to errant granules would be helpful.
We have added arrows indicating errant granules to images in Figure 4 per the reviewer’s suggestion.
Reviewer #3 (Public Review):
In this study, Prochera, et al. identify PLP1+ cells as the glia that most closely interact with the gut epithelium and show that genetic depletion of these PLP1+ glia in mice does not have major effects on the intestinal transcriptome or the cellular composition of the epithelium. Enteric glial loss, however, causes dysregulation of Paneth cell gene expression that is associated with morphological disruption of Paneth cells, diminished lysozyme secretion, and altered gut microbial composition.
Overall, the authors need to first prove whether the Plp1CreER Rosa26DTA/+ mice system is viable.
In previous work, we discovered that the gene Plp1 is broadly expressed by enteric glia and, within the mouse intestine, is quite specific to glial cells (PMID: 26119414). We characterized the Plp1CreER mouse line as a genetic tool in detail in this initial study. Then in a subsequent manuscript, we used Plp1CreER-DTA mice to genetically deplete enteric glia and study the consequences on epithelial barrier integrity, crypt cell proliferation, enteric neuronal health and gastrointestinal motility (PMID: 28711628). In this second study, we performed extensive validation of the Plp1CreER-DTA mouse model including detailed quantification of glial depletion in the small and large intestines across the myenteric, intramuscular and mucosa compartments by immunohistochemical (IHC) staining of whole tissue segments to sample thousands of cells. We found that the majority of S100B<sup>+</sup> enteric glia were depleted within 5 days in both sexes, including more than 88% loss of mucosal glia, and that this loss was stable at 3 subsequent timepoints (7, 9 and 14 days post-tamoxifen induction of Cre activity). Glial loss was further confirmed by IHC for GFAP in the myenteric plexus, and by ultrastructural analysis of the small intestine to ensure cell depletion rather than simply loss of marker expression. Our group was the first to use this model to study enteric glia, and since then similar models and our key observations have been replicated by other groups (PMID: 33282743, 34550727). Thus, we consider this model to be well established.
Also, most experimental systems have been evaluated by immunohistochemistry, scRNAseq, and electron microscopy, but need quantitative statistical processing.
RNA-sequencing and microbiome analyses are inherently quantitative (Figures 1A-B, Supp. Figure 1, Figure 2, Supp. Figure 4A, Figure 3A-B, Supp. Figure 5, Figure 5, and Supp. Figure 8C). Virtually all our other observations are also supported by quantitative analysis including analysis of mucosal glial markers (Fig. 1C-D), validation of Paneth cell transcript expression in the colon (Supp. Fig. 4B), measurement of epithelial cell type composition (Figure 3C, D), assessment of crypt innervation (Supp. Fig. 7E), and measurement of bacteria-to-crypt distance (Supp. Fig. 8A-B). The only observation that was not quantified was that of morphological abnormalities of Paneth cells. Given the inherently low sampling rate of EM studies, we felt that functional assays (explant secretion assays, effects on microbial composition) would be more meaningful for interrogation of a potential Paneth cell phenotype and thus elected to focus our quantitative analyses on those functional assays rather than further histological measurements.
In addition, the value of the paper would be enhanced if the significance of why the phenotype appeared in the large intestine rather than the small intestine when PLP1 is deficient for Paneth cells is clarified.
Please see detailed response to Reviewer 1 that addresses this comment and the corresponding addition to the Discussion.
Major Weaknesses:
(1) Supplementary Figure 2; Cannot be evaluated without quantification.
Supplemental Figure 2 shows qualitative IHC observations that were highly reproducible across all the subjects indicated for each marker and align well with the quantitative transcriptional data from human subjects shown in Figure 1 and Supplemental Figure 1. The DAB staining in Supplemental Figure 2 could theoretically be quantified by staining intensity or counting cell number but we felt this would be arbitrary and difficult to achieve in a meaningful way with a single chromogen. The DAB reaction is associated with a non-linear relationship between amount of an antigen and staining intensity, especially at higher levels (PMID: 16978204, 19575836), because it is not a direct conjugate and relies upon an enzymatic reaction. The amplification step required for DAB staining using Horseradish Peroxidase (HRP) introduces variability, particularly with cytoplasmic markers and in complex tissue structures like the plexuses, where proteins are distributed throughout the glial network. Counting cell number also would not lead to fair comparisons between markers because while SOX10 shows a clear nuclear signal suitable for quantification, the other markers are all membrane or cytoplasmic proteins, making accurate counting nearly impossible in dense ganglia. Finally, quantifying cell number in 5-micron paraffin sections which have major differences in sampling from one subject to another in terms of presence of ganglia and ganglia size, would also make this prone to inaccuracy. Given these limitations and the robust qualitative data we have shown that aligns completely with the quantitative transcriptional analyses, we respectfully disagree with the reviewer’s comment.
(2) Figure 2A; Is Plp1CreER Rosa26DTA/+ mice system established correctly? S100B immunohistology picture is not clear. A similar study is needed for female Plp1CreER Rosa26DTA/+ mice. What is the justification for setting 5 dpt, 11 dpt? Any consideration of changes to organs other than the intestine? Wouldn't it be clearer to introduce Organoid technology?
Please see the detailed response to the first comment. The Plp1CreER-DTA mouse model is well-established and there are detailed experimental justifications for the 5 and 11dpt timepoints as well as the focus on male mice for RNA-sequencing analyses. As described in our previous work (PMID: 28711628), Plp1<sup>+</sup> cells throughout the animal would be affected, including Schwann cells and oligodendrocytes, which is why we limit our analyses to the first 11dpt, when there are fewer confounding variables. The S100B immunohistology picture in Figure 2A was intended to be a schematic graphical representation of the paradigm of glial loss, not a data figure. Extensive validation of glial loss in this model was shown in our previous study. To improve clarity, we have now enlarged the picture for the reader.
Regarding the suggestion to use organoid technology, standard intestinal epithelial organoids do not incorporate any elements of the enteric nervous system (ENS), which is the focus of this study. Some groups have made heroic efforts to incorporate ENS components into intestinal organoids by introducing neural crest progenitor cells and grafting the hybrid organoids under the renal capsule in mice (example PMID: 27869805); but these studies are still limited, and it remains unclear how much the preparations reflect functional, natively innervated intestine. Our ex vivo explant assay preserves native ENS-epithelial interactions, providing a more effective model for studying the relationship between enteric glia and Paneth cells.
(3) Figure 2B; Need an explanation for the 5 genes that were altered in the colon. Five genes should be evaluated by RT-qPCR. Why was there a lack of change in the duodenum and ileum?
While RT-qPCR validation of differentially expressed genes was once common practice, especially with microarray data, there is now robust evidence for strong correlations between RNA sequencing (RNAseq) results and RT-qPCR measurements of gene expression (PMID: 26208977, 28484260). Notably, Rajkumar et al. (PMID: 26208977) demonstrated that RNAseq analyzed using DESeq2 (a method which we employed in our study) yields highly accurate results. They reported a 0% false positive rate and a 100% positive predictive value for DESeq2, rendering additional RT-qPCR validation redundant. We only performed RT-qPCR analysis of colonic Lyz1 expression because our IHC analyses failed to show any ectopic expression of the protein in the colons of Cre<sup>+</sup> mice (Supp. Figure 4D) and we wished to validate the gene expression change seen by RNAseq in an independent cohort to be absolutely sure. Per the detailed response to Reviewer 1, we do not have a mechanistic explanation for why there is selective transcriptional induction of Paneth cell-related genes in the colon upon glial depletion. We have elaborated on this in the revised Discussion.
(4) Supplementary Figure 3; Top 3 genes should be evaluated by RT-qPCR.
Given that none of the changes included in Supplementary Figure 3 for the duodenum or ileum reach the standard threshold for statistical significance and in view of the findings by Rajkumar, et al. (2015) described above, we don’t believe that evaluating expression of these genes by RT-qPCR would be informative in interpreting these negative results.
(5) Supplementary Figure 4B, C, and D; Why not show analysis in the small intestine?
We chose to focus on the colon for this analysis because this was the only region of the intestine that exhibited statistically significant differences in transcriptional profiles as assessed by DEG.
(6) Supplementary Figure 4D; Cannot be evaluated without quantification.
As shown in the representative images, no LYZ1 or DEFA5 signal was detected in the colons of Cre<sup>-</sup> or Cre<sup>+</sup> mice (n=3 mice per genotype; >100 crypts/mouse assessed), though it was readily detectable in the ileums of both genotypes. We have now added the number of crypts assessed to the figure legend.
(7) Figure 3D; Cannot be evaluated without quantification.
Please see Fig. 3C for quantification of each cell type marker shown in Figure 3D.
(8) Supplementary Figure 5B and C; Top 3 genes should be evaluated by RT-qPCR.
Please see detailed explanation to comments #3 and #4 above.
(9) Supplementary Figure 6; Top 3 genes should be evaluated by RT-qPCR.
This comment was likely made in error because Supplementary Fig. 6 does not show any gene expression data.
(10) Figure 4A; Cannot be evaluated without quantification.
We appreciate the reviewer’s comment here and strived very hard to add quantification of the Paneth cell granule phenotype seen by light microscopy to our study. IHC for LYZ1 is typically the gold standard for assessment of Paneth cell granules by light microscopy. In our hands, however, we encountered persistent issues with IHC for this protein. While it very reproducibly detected Paneth cells with sufficient specificity to enable quantification of number of immunoreactive cells (as shown in Figure 3C), it did not enable quantification of granule morphology because it consistently exhibited diffuse staining throughout the cell (see Author response image 2 below). This appearance persisted regardless of extensive titration of fixation parameters (time, temperature, fixative supplier, 10% NBF vs 4% PFA), tissue preparation (fixed as intact tubes versus “swiss-rolls”), permeabilization conditions, operator, antibody used, and other variables. Upon subsequently surveying the literature, it seems that similar diffuse staining patterns for LYZ1 have been observed by numerous other groups and this may simply be an experimental limitation.
Author response image 2.
Representative IHC images showing LYZ1 staining optimization. Ileal tissues from 8-10-week-old mice were prepared as either 'swiss-rolls' (A-D) or tubes (E-F) and fixed using different protocols: 10% neutral buffered formalin (NBF) from Epredia (#5710-LP) (A-B, E), 10% NBF from G-Biosciences (#786-1057) (C-D), or 4% paraformaldehyde (PFA) from VWR (#100503-917) (F). Fixations were conducted at room temperature (A, C) or at 4°C (B, D-F). Diffuse cytoplasmic LYZ1 staining is observed within Paneth cells, regardless of conditions of tissue preparation.
As an alternative approach to detecting Paneth cell granules, we tried UEA-I lectin staining. This labeling approach was sufficient to reveal qualitative differences in Paneth granule morphology in Cre<sup>+</sup> mice, as shown in Fig. 4A. However, the transient nature of this lectin labeling made it very difficult to systematically quantify granule morphology in a blinded manner, as we did for our other analyses. Given these persistent challenges, we decided to present qualitative data on morphology by two orthogonal approaches (UEA-I staining by light microscopy and ultrastructure by EM) and focus on functional read-outs for quantitative analyses (explant secretion assays and microbiome analyses). In aggregate, we feel that these data provide robust and complementary evidence of the observed phenotype from independent experimental approaches.
(11) Figure 4D; Cannot be evaluated without quantification.
This comment was likely made in error because there is no Figure 4D.
(12) Additional experiments on in vivo infection systems comparing Plp1CreER Rosa26DTA/+ mice and controls would be great.
We agree that in vivo infection experiments would be very interesting to pursue, particularly given the potential role of Paneth cells in innate immunity. These studies are beyond the scope of the current manuscript, but we hope to report on them in the future.
Reviewer #3 (Recommendations For The Authors):
Patients with inflammatory bowel disease (IBD); UC or CD.
Revised per reviewer suggestion.
-
eLife Assessment
This study on mouse Ly49 receptors expressed on natural killer (NK) cells shows that Ly49A, in the presence of the corresponding MHC Class I allele, can lead to NK cell licensing, thereby providing valuable insights into the mechanisms of NK cell modulation by Ly49 receptors. The work may have significant implications for studies of human killer-cell immunoglobulin-like receptor (KIR)-expressing NK cells and other NK cells. Overall, the study was well-developed with convincing evidence.
-
Reviewer #1 (Public review):
Summary:
The article by Piersma et al. aims to reduce the complex process of NK cell licensing to the action of a single inhibitory receptor for MHC class I. This is achieved using a mouse strain lacking all of the Ly49 receptors expressed by NK cells and inserting the Ly49a gene into the Ncr1 locus, leading to expression on the majority of NK cells.
Strengths:
The mouse model used represents a precise deletion of all NK-expressed genes within the Ly49 cluster. Re-introduction of the Ly49a gene into the Ncr1 locus allows expression by most NK cells. Convincing effects of Ly49a expression on in vitro activation and in vivo killing assay are shown.
Weaknesses:
The choice of Ly49a provides a clear picture of H-2Dd recognition by this Ly49. It would be valuable to perform additional studies investigating Ly49c and Ly49i receptors for H-2b. This is of interest because there are reports indicating that Ly49c may not be a functional receptor in B6 mice due to strong cis interactions. Investigation of the Ly49c and Ly49i receptors in this model would be the basis of future studies that are beyond the scope of the current report.
This work generates an excellent mouse model for the study of NK cell licensing by inhibitory Ly49s that will be useful for the community. It provides a platform whereby the functional activity of a single Ly49 can be assessed.
Comments on revisions: No additional concerns
-
Reviewer #2 (Public review):
Piersma et al. continue to work on deciphering the role and function of Ly49 NK cell receptors. This manuscript shows that a single inhibitory Ly49 receptor is sufficient to license NK cells and eliminate MHC-I-deficient target cells in mice. In short, they refined the mouse model ∆Ly49-1 (Parikh et al., 2020) into the Ly49KO model in which all Ly49 genes are disrupted. Using this model, they confirmed that NK cells from Ly49KO mice cannot be licensed, produce lower levels of IFN-gamma, and cannot reject MHC-I-deficient cells. To study the effect of a single Ly49 receptor in the function of NK cells, the authors backcrossed Ly49KO mice to H-2Dd transgenic KODO (D8-KODO) Ly49A knock-in mice in which a single inhibitory Ly49A receptor that recognizes H-2Dd ligands is expressed. By doing so, they demonstrate that a single inhibitory Ly49 receptor expressed by all NK cells is sufficient for licensing and missing-self killing.
While the results of the study are largely consistent with the conclusions, it is important to address some discrepancies. For instance, in the title of Figure 1, the authors state that NK cells in Ly49KO mice compared to WT mice have a less mature phenotype, which is not consistent with the corresponding text in the Results section (lines 170-171) that states there is no difference in maturation. These differences are not evident in Figure 1, panel D. It is crucial to acknowledge these inconsistencies to ensure a comprehensive understanding of the research findings.
In the legend of Figure 2, the text related to panel C indicates the use of dyes to label the splenocytes, and CFSE, CTV, and CTFR were mentioned. However, only CTV and CTFR are shown on the plots and mentioned in the corresponding text in the Results section. Similarly, in the legend of Figure 4, which is related to panel C, the authors write that splenocytes were differentially labeled with CFSE and CTV as indicated; however, in Figure 4C and the Results section text, there is no mention of CFSE.
The authors should clarify why they assume that KLRG1 expression is influenced by the expression of inhibitory Ly49 receptors and not by manipulations on chromosome 6, where the genes for both KLRG1 and Ly49 receptors are located. However, a better explanation for the possible influence of other inhibitory NK cell receptors still needs to be included. In the study by Zhang et al. (doi: 10.1038/s41467-019-13032-5), the authors showed the synergized regulation of NK cell education by the NKG2A receptor and the specific Ly49 family members. Although in this study, Piersma and colleagues show the control of MHC-I deficient cells by Ly49A+ NKG2A- NK cells in Figure 4, this receptor is not mentioned in the Results or in the Discussion section, so its role in this story needs to be clarified. Therefore, the reader would benefit from more information regarding NKG2A receptor and NKG2A+/- populations in their results.
Comments on revisions: The authors have successfully answered all my questions and edited the manuscript accordingly.
-
Reviewer #3 (Public review):
Summary:
In this study, Piersma et al. successfully generated a mouse model with all Ly49 genes knocked out, resulting in the complete absence of Ly49 receptor expression on the cell surface. The absence of Ly49 expression led to the loss of NK cell education/licensing and consequently, a failure in responsiveness against missing-self target cells. The authors demonstrate the restoration of NK cell licensing by knocking in a single Ly49 gene, Ly49A, in a mouse expressing the H-2Dd ligand for this receptor, which is a novel and important finding.
Strengths:
The authors established a novel mouse model enabling them to have a clean and thorough study on the function of Ly49 on NK cell licensing. Also, by knocking in a single Ly49, they were able to investigate the function of a given Ly49 receptor excluding the "contamination" of co-expression of any other Ly49 genes. The experimental design and data interpretation were logically clear and the evidence was solid.
Weaknesses:
The mouse model was somewhat genetically similar to that of a previous study. The experimental work and findings partially overlap with the previous work by Zhang et al. (2019), who also performed knockout of the entire Ly49 locus in mice and demonstrated that loss of NK responsiveness was due to the removal of inhibitory, and not activating, Ly49 genes.
Potential achievements and discussions: The mouse model developed by the authors holds great potential for advancing NK cell functional studies, particularly regarding the regulation of NK cell functions through receptor-ligand interactions. Moreover, it provides a valuable tool for investigating NK cell education and the development of checkpoint inhibitors. These applications could significantly contribute to the broader research efforts in cancer therapy utilizing NK cells.
Comments on revisions: The authors have successfully addressed all the concerns raised in my previous feedback. They have significantly improved the logical structure, making it clearer and more coherent. Additionally, they have ensured consistency in the use of specific terminology throughout the manuscript. The substantial revisions and re-writing efforts are commendable and have greatly enhanced the overall quality of the manuscript.
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public review):
Summary:
The article by Piersma et al. aims to reduce the complex process of NK cell licensing to the action of a single inhibitory receptor for MHC class I. This is achieved using a mouse strain lacking all of the Ly49 receptors expressed by NK cells and inserting the Ly49a gene into the Ncr1 locus, leading to expression on the majority of NK cells.
Strengths:
The mouse model used represents a precise deletion of all NK-expressed genes within the Ly49 cluster. The re-introduction of the Ly49a gene into the Ncr1 locus allows expression by most NK cells. Convincing effects of Ly49a expression on in vitro activation and in vivo killing assay are shown.
Weaknesses:
The choice of Ly49a provides a clear picture of H-2D<sup>d</sup> recognition by this Ly49. It would be valuable to perform additional studies investigating Ly49c and Ly49i receptors for H-2b. This is of interest because there are reports indicating that Ly49c may not be a functional receptor in B6 mice due to strong cis interactions.
We agree with the reviewer that it will be important to extend our findings to H-2b haplotypes with individual cognate Ly49 receptors (Ly49C and Ly49I). While these experiments are the subject of our ongoing studies, they are beyond the scope of the current manuscript considering the significant time, effort and cost to generate these new Ly49C and Ly49I knock-in mice.
This work generates an excellent mouse model for the study of NK cell licensing by inhibitory Ly49s that will be useful for the community. It provides a platform whereby the functional activity of a single Ly49 can be assessed.
Reviewer #2 (Public review):
Piersma et al. continue to work on deciphering the role and function of Ly49 NK cell receptors. This manuscript shows that a single inhibitory Ly49 receptor is sufficient to license NK cells and eliminate MHC-I-deficient target cells in mice. In short, they refined the mouse model ∆Ly49-1 (Parikh et al., 2020) into the Ly49KO model in which all Ly49 genes are disrupted. Using this model, they confirmed that NK cells from Ly49KO mice cannot be licensed, produce lower levels of IFN-gamma, and cannot reject MHC-I-deficient cells. To study the effect of a single Ly49 receptor in the function of NK cells, the authors backcrossed Ly49KO mice to H-2D<sup>d</sup> transgenic KODO (D8-KODO) Ly49A knock-in mice in which a single inhibitory Ly49A receptor that recognizes H-2D<sup>d</sup> ligands is expressed. By doing so, they demonstrate that a single inhibitory Ly49 receptor expressed by all NK cells is sufficient for licensing and missing-self killing.
While the results of the study are largely consistent with the conclusions, it is important to address some discrepancies. For instance, in the title of Figure 1, the authors state that NK cells in Ly49KO mice compared to WT mice have a less mature phenotype, which is not consistent with the corresponding text in the Results section (lines 170-171) that states there is no difference in maturation. These differences are not evident in Figure 1, panel D. It is crucial to acknowledge these inconsistencies to ensure a comprehensive understanding of the research findings.
We thank the reviewer for pointing this out. We have corrected the figure legend title to: “Mice generated to lack all NK-related Ly49 molecules using CRISPR have NK cells that display alterations in select surface molecules.”
In the legend of Figure 2, the text related to panel C indicates the use of dyes to label the splenocytes, and CFSE, CTV, and CTFR were mentioned. However, only CTV and CTFR are shown on the plots and mentioned in the corresponding text in the Results section. Similarly, in the legend of Figure 4, which is related to panel C, the authors write that splenocytes were differentially labeled with CFSE and CTV as indicated; however, in Figure 4C and the Results section text, there is no mention of CFSE.
We thank the reviewer for pointing out these inconsistencies. We did label target cells with CFSE to distinguish them from host cells. To clarify, we have done the following:
We have removed CFSE from figure legends of Figure 2 and 4.
We included the following on CFSE labeling in the Materials and Methods section: “Target splenocytes were additionally labeled with CFSE to identify transferred target splenocytes from host cells.”
The authors should clarify why they assume that KLRG1 expression is influenced by the expression of inhibitory Ly49 receptors and not by manipulations on chromosome 6, where the genes for both KLRG1 and Ly49 receptors are located.
The effect on KLRG1 expression is phenocopied in the Ly49A KI mice (on a Ly49 KO background). The Ly49A KI allele is encoded in the Ncr1 locus, which is located on chromosome 7 and not on chromosome 6 where KLRG1 is located, thus excluding involvement of cis-regulatory elements encoded by the Ly49 locus on chromosome 6.
We have clarified this in the discussion section (lines 350-358):
“The Ly49 gene family as well as Klrg1 is located within the NKC on chromosome 6 (Yokoyama and Plougastel, 2003) …. expression of only Ly49A, encoded in the Ncr1 locus located on chromosome 7, in Ly49KO mice on a H-2D<sup>d</sup> background restored KLRG1 expression”
However, a better explanation for the possible influence of other inhibitory NK cell receptors still needs to be included. In the study by Zhang et al. (doi: 10.1038/s41467-019-13032-5), the authors showed the synergized regulation of NK cell education by the NKG2A receptor and the specific Ly49 family members. Although in this study Piersma and colleagues show the control of MHC-I deficient cells by Ly49A+ NKG2A- NK cells in Figure 4, this receptor is not mentioned in the Results or in the Discussion section, so its role in this story needs to be clarified. Therefore, the reader would benefit from more information regarding the NKG2A receptor and NKG2A+/- populations in their results.
We agree with the reviewer that it is important to describe our results in the context of other inhibitory receptors. To clarify the role of NKG2A and potentially other inhibitory receptors we have made the following improvements to our manuscript:
We discuss the role of NKG2A in the discussion section, which now includes (lines 259-266):
“While our results did not interrogate licensing by inhibitory receptors outside of the Ly49 receptor family, such as has been reported for NKG2A (Anfossi et al., 2006; Zhang et al., 2019), they do demonstrate that expression of Ly49A without other Ly49 family members can mediate NK cell licensing. Moreover, we found that Ly49 receptors are required and sufficient for missing-self rejection under steady-state conditions. However, these observations do not rule out involvement of other inhibitory receptors under specific inflammatory conditions. For example, NKG2A contributes to rejection of missing-self targets in poly(I:C)-treated mice (Zhang et al., 2019).”
We also added the following to the result section (lines 179-182):
NKG2A has been implicated in NK cell licensing by the non-classical MHC-I molecule Qa1 (Anfossi et al., 2006), to eliminate potential confounding effects by this interaction, effector functions of NKG2A- NK cells were evaluated as described before (Bern et al., 2017).
Reviewer #3 (Public review):
Summary:
In this study, Piersma et al. successfully generated a mouse model with all Ly49 genes knocked out, resulting in the complete absence of Ly49 receptor expression on the cell surface. The absence of Ly49 expression led to the loss of NK cell education/licensing and consequently, a failure in responsiveness against missing-self target cells. The experimental work and findings are partially overlapping with the previous work by Zhang et al. (2019), who also performed knockout of the entire Ly49 locus in mice and demonstrated that loss of NK responsiveness was due to the removal of inhibitory, and not activating Ly49 genes. The authors demonstrate the restoration of NK cell licensing by knocking in a single Ly49 gene, Ly49A, in a mouse expressing the H-2D<sup>d</sup> ligand for this receptor, which is a novel and important finding.
Strengths:
The authors established a novel mouse model enabling a clean and thorough study of the function of Ly49 in NK cell licensing. Also, by knocking in a single Ly49, they were able to investigate the function of a given Ly49 receptor while excluding the "contamination" of co-expression of any other Ly49 genes. Their idea and method were novel, though the mouse model was somewhat genetically similar to that of a previous study. The experimental design and data interpretation were logically clear and the evidence was solid.
Weaknesses:
The paper is very poorly written and confusing. The authors should be more accurate in the usage of terminology, provide more details on experimental procedures, and revise much of the text to improve clarity and coherence. A thorough revision aiming to clarify the paper would be helpful.
We regret that the manuscript was confusing to the reviewer. We have made thorough revisions to the different sections, which we hope will improve the clarity of the manuscript.
We have made changes to all sections of the manuscript, including the title. These revisions include improved clarity on description of NK cell licensing and consistent usage throughout the manuscript, per the reviewer recommendations. We hope that all our improvements help the clarity of the manuscript.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
I was confused by lines 262-270 in the discussion. The data from Hanke et al. is presented as contradictory to the observation that Ly49s bind more efficiently to H2-Kb than -Db, but they showed that Ly49c/i did not bind Kb-deficient cells, supporting the preferred binding to Kb.
We have clarified this issue and the paragraph now reads: “This is further supported by early studies using Ly49 transfectants binding to Con A blasts showing that Ly49C and Ly49I can bind to H-2D<sup>b</sup>-deficient but not H-2K<sup>b</sup>-deficient cells (Hanke et al., 1999), despite the caveat of testing binding to cells overexpressing Ly49s in these studies.”
Reviewer #2 (Recommendations for the authors):
The authors' conclusion that one type of inhibitory Ly49 receptor expressed on NK cells is sufficient for successful licensing and rejection of missing self-cells is a significant step forward. However, it would be beneficial to complement this with additional data. For instance, exploring the role of a single inhibitory Ly49 receptor responsible for licensing in a mouse model with a different haplotype (e.g. Ly49C or Ly49I on H-2b MHC I haplotype in C57BL/6J mice) could provide valuable insights and open new avenues for research in the field.
We agree with the reviewer that it will be important to extend our findings to additional MHC-I haplotypes with single cognate Ly49 receptors. While these experiments are the subject of our ongoing studies, they are beyond the scope of the current manuscript considering the significant effort, time, and cost to generate these new Ly49C and Ly49I knockin mice.
Reviewer #3 (Recommendations for the authors):
Specific issues that should be addressed are as follows:
(1) The title of the paper: "Expression of a single inhibitory Ly49 receptor is sufficient to license NK cells for effector functions" is ambiguous. When I first read the title, I thought the authors meant that only a single Ly49 molecule on the NK cell surface was necessary to induce licensing. It might be better to replace "single inhibitory receptor" with "single member of Ly49 receptor family".
We have changed the title to: “Expression of a single inhibitory member of the Ly49 receptor family is sufficient to license NK cells for effector functions”
(2) In the abstract, introduction, and results, the authors distinguish "licensing" and "rejection of missing-self targets" as two distinct phenomena. An example includes Abstract, lines 51-53: "Herein, we showed mice lacking expression of all Ly49s were unable to reject missing-self target cells in vivo, were defective in NK cell licensing, and displayed lower KLRG1 on the surface of NK cells". Similarly, the title of the second subsection of the Results states: "Ly49-deficient NK cells are defective in licensing and rejection of cognate MHC-I deficient target cells" (line 176). In these instances, it seems that by "licensing", they mean only response to plate-bound anti-NK1.1 stimulation and not a response to missing-self targets. Alternatively, in the first paragraph of the Discussion, it sounds as if licensing includes both anti-NK1.1 and missing-self responses (lines 258-260): "...NK cells were fully licensed in terms of their functional phenotype, including the capacity to be activated by an activation receptor in vitro and efficient rejection of MHC-I deficient target cells in vivo". Please define the terms and use the terms consistently throughout the paper.
We were the first to describe the term licensing and have defined this as acquisition of NK cell functional competence by self-MHC molecules (Kim et al., 2005), which is characterized by increased NK cell effector functions to activating signals. Thus, licensed NK cells are prevented from attacking normal MHC-I<sup>+</sup> cells by the same self-MHC-I-specific receptor that conferred licensing, while unlicensed NK cells without appropriate Ly49 receptors are functionally incompetent.
To clarify we made changes throughout the manuscript including the following:
Lines 91-101:
“In addition to effector function in missing-self, Ly49 receptors that recognize their cognate MHC-I ligands are involved in licensing or education of NK cells to acquire functional competence. NK cell licensing is characterized by potent effector functions including IFNγ production and degranulation in response to activation receptor stimulation (Elliott et al., 2010; Kim et al., 2005). Like missing-self recognition, inhibitory Ly49s require SHP-1 for NK cell licensing which interacts with the ITIM-motif encoded in the cytosolic tail of inhibitory Ly49s (Bern et al., 2017; Kim et al., 2005; Viant et al., 2014). Moreover, lower expression of SHP-1, particularly within the immunological synapse, is associated with licensed NK cells (Schmied et al., 2023; Wu et al., 2021). Thus, inhibitory Ly49s have a second function that licenses NK cells to self-MHC-I thereby generating functionally competent NK cells but it has not been possible to exclude contributions from other co-expressed Ly49s.”
Lines 268-271 (previously 258-260):
“Yet the NK cells were fully licensed in terms of IFNγ production and degranulation in vitro and efficiently rejected MHC-I deficient target cells in vivo. Thus, a single Ly49 receptor is capable to confer the licensed phenotype and missing-self rejection in vitro and in vivo.”
Lines 309-312:
“In conclusion, these data show that expression of a single inhibitory Ly49 receptor is necessary and sufficient to license NK cells and mediate missing self-rejection under steady state conditions in vivo.”
(3) Introduction, lines 76-79. Please provide the C57BL/6 MHC-I genotype. It is difficult to follow the text here without this information. In general, please provide information to help the reader who may not be working in this precise field.
We thank the reviewer for pointing this out. We have included this and the lines now read: “For example, in the C57BL/6 background, Ly49C and Ly49I can recognize H-2<sup>b</sup> MHC-I molecules that include H-2K<sup>b</sup> and H-2D<sup>b</sup>, while Ly49A and Ly49G cannot recognize H-2<sup>b</sup> molecules and instead they recognize H-2<sup>d</sup> alleles.”
(4) Introduction, lines 85-97. Please use commas: "...the MHC-I specificities of other Ly49s have been primarily studied with MHC tetramers containing human b2m, which is not recognized by Ly49A, on cells overexpressing Ly49s" in order to clarify the sentence.
Commas have been added as suggested by the reviewer.
(5) Introduction, lines 91-101. The whole paragraph starting with the following sentence does not make sense and should be re-written. "In addition to effector function in missing-self, when inhibitory Ly49 receptors recognize their cognate MHC-I ligands in vivo, they license or educate NK cells for potent effector functions including IFNγ production and degranulation in response to activation receptor stimulation".
We regret that this paragraph was not clear to the reviewer. We have changed this paragraph to:
“In addition to effector function in missing-self, Ly49 receptors that recognize their cognate MHC-I ligands are involved in licensing or education of NK cells to acquire functional competence. NK cell licensing is characterized by potent effector functions including IFNγ production and degranulation in response to activation receptor stimulation (Elliott et al., 2010; Kim et al., 2005). Like missing-self recognition, inhibitory Ly49s require SHP-1 for NK cell licensing which interacts with the ITIM-motif encoded in the cytosolic tail of inhibitory Ly49s (Bern et al., 2017; Kim et al., 2005; Viant et al., 2014). Moreover, lower expression of SHP-1, particularly within the immunological synapse, is associated with licensed NK cells (Schmied et al., 2023; Wu et al., 2021). Thus, inhibitory Ly49s have a second function that licenses NK cells to self-MHC-I thereby generating functionally competent NK cells but it has not been possible to exclude contributions from other co-expressed Ly49s.”
(6) Results, line 181. Please edit: "...MHC-I-deficient H-2K<sup>b</sup> x H-2D<sup>b</sup> deficient (KODO) mice".
This sentence now reads “... NK cells from H-2K<sup>b</sup> and H-2D<sup>b</sup> double deficient (KODO) mice”
(7) Results, line 192. Please re-word the following phrase: "missing-self is dominated by H-2K<sup>b</sup> in the C57BL/6 background", as it is unclear. Do you mean that H-2K<sup>b</sup> is protected from lysis as opposed to H-2D<sup>b</sup>?
We thank the reviewer for pointing this out, line 192 now reads: “..missing-self recognition in the C57BL/6 background depends on the absence of H-2K<sup>b</sup> rather than H-2D<sup>b</sup>.”
(8) Please briefly describe the Ncr1-Ly49A knockin procedure so that the reader understands the link between NKp46 and Ly49A expression without going to the earlier paper. Also, it needs to be mentioned that Ncr1 is the gene encoding NKp46.
Lines 201-205 now read: “To investigate the potential of a single inhibitory Ly49 receptor on mediating NK cell licensing and missing-self rejection, the Ly49KO mice were backcrossed to H-2D<sup>d</sup> transgenic KODO (D8-KODO) Ly49A KI mice that express Klra1 cDNA encoding the inhibitory Ly49A receptor in the Ncr1 locus encoding NKp46 and its cognate ligand H-2D<sup>d</sup> but not any other classical MHC-I molecules (Parikh et al., 2020).”
In the Materials and Methods section, the following has been added (lines 324-326):
“In Ly49A KI mice the stop codon of Ncr1 encoding NKp46 is replaced with a P2A peptide-cleavage site upstream of the Ly49A cDNA, while maintaining the 3’ untranslated region.”
(9) Figure 4C, legend. There is no CFSE staining in this experiment. Please correct.
We did label target cells with CFSE to distinguish them from host cells. To clarify, we have done the following:
We have removed CFSE from figure legends of Figure 2 and 4.
We included the following on CFSE labeling in the Materials and Methods section (lines 377-379): “Target splenocytes were additionally labeled with CFSE to identify transferred target splenocytes from host cells.”
(10) Discussion, lines 262-270. This paragraph sounds as if data by Hanke et al. does not agree with the data presented in the paper. On the contrary, Hanke et al. demonstrate that Ly49C and Ly49I detectably bind to H-2K<sup>b</sup>, but poorly to H-2D<sup>b</sup>, supporting observations shown in Figure 2C.
We have clarified this issue and the paragraph now reads: “This is further supported by early studies using Ly49 transfectants binding to Con A blasts showing that Ly49C and Ly49I can bind to H-2D<sup>b</sup>-deficient but not H-2K<sup>b</sup>-deficient cells (Hanke et al., 1999), despite the caveat of testing binding to cells overexpressing Ly49s in these studies.”
-
-
www.biorxiv.org www.biorxiv.org
-
Author response:
The following is the authors’ response to the current reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
Zanetti et al use biophysical and cellular assays to investigate the interaction of the birnavirus VP3 protein with the early endosome lipid PI3P. The major novel finding is that association of the VP3 protein with an anionic lipid (PI3P) appears to be important for viral replication, as evidenced through a cellular assay on FFUs.
Strengths:
Supports previously published claims that VP3 associates with early endosome membrane, potentially through binding to PI3P. The finding that mutating a single residue (R200) critically affects early endosome binding and that the same mutation also inhibits viral replication suggests a very important role for this binding in the viral life cycle.
Weaknesses:
The manuscript is relatively narrowly focused: the specifics of the bi-molecular interaction between the VP3 of an unusual avian virus and a host cell lipid (PI3P). Further, the affinity of this interaction is low and its specificity relative to other PIPs is not tested, leading to questions about whether VP3-PI3P binding is relevant.
Regarding the manuscript’s focus, we challenge the notion that studying a single bi-molecular interaction makes the scope of the paper overly narrow. This interaction—between VP3 and PI3P—plays a critical role in the replication of the birnavirus, which is the central theme of our work. Moreover, identifying and understanding such distinct interactions is a fundamental aspect of molecular virology, as they shed light on the precise mechanisms that viruses exploit to hijack the host cell machinery. Consequently, far from being narrowly focused, we believe our work contributes to the broader understanding of host-pathogen interactions.
As for the low affinity of the VP3-PI3P interaction, we argue that this is not a limitation but rather a biologically relevant feature. As discussed in the manuscript, the moderate strength of this interaction is likely critical for regulating the turnover rate of VP3/endosomal PI3P complexes, which in turn could optimize viral replication efficiency. A stronger affinity might trap VP3 on the endosomal membrane, whereas weaker interactions might reduce its ability to efficiently target PI3P. Thus, the observed affinity may reflect a fine-tuned balance that supports the viral life cycle.
With regard to specificity, we emphasize that in the context of the paper, we refer to biological specificity, which is not necessarily the same as chemical specificity. The binding of VP3 to early endosomes is “biologically” preconditioned by the distribution of PI3P within the cell. PI3P is predominantly localized in endosomal membranes, which “biologically precludes” interference from other PIPs due to their distinct cellular distributions. Moreover, while early endosomes also contain other anionic lipids, our work demonstrates that among these, PI3P plays a distinctive role in VP3 binding. This highlights its functional relevance in the context of early endosome dynamics.
Reviewer #3 (Public review):
Summary:
Infectious bursal disease virus (IBDV) is a birnavirus and an important avian pathogen. Interestingly, IBDV appears to be a unique dsRNA virus that uses early endosomes for RNA replication, a strategy more common among +ssRNA viruses such as SARS-CoV-2. This work builds on previous studies showing that IBDV VP3 interacts with PIP3 during virus replication. The authors provide further biophysical evidence for the interaction and map the interacting domain on VP3.
Strengths:
Detailed characterization of the interaction between VP3 and PIP3 identified R200D mutation as critical for the interaction. Cryo-EM data show that VP3 leads to membrane deformation.
We thank the reviewer for the feedback.
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
Summary:
Zanetti et al. use biophysical and cellular assays to investigate the interaction of the birnavirus VP3 protein with the early endosome lipid PI3P. The major novel finding is that the association of the VP3 protein with an anionic lipid (PI3P) appears to be important for viral replication, as evidenced through a cellular assay on FFUs.
Strengths:
Supports previously published claims that VP3 may associate with early endosomes and bind to PI3P-containing membranes. The claim that mutating a single residue (R<sub>200</sub>) critically affects early endosome binding and that the same mutation also inhibits viral replication suggests a very important role for this binding in the viral life cycle.
Weaknesses:
The manuscript is relatively narrowly focused: one bimolecular interaction between a host cell lipid and one protein of an unusual avian virus (VP3-PI3P). Aspects of this interaction have been described previously. Additional data would strengthen claims about the specificity and some technical issues should be addressed. Many of the core claims would benefit from additional experimental support to improve consistency.
Indeed, our group has previously described aspects of the VP3-PI3P interaction, as indicated in lines 100-105 of the manuscript. In this manuscript, however, we present biochemical and biophysical details that have not been reported before about how VP3 connects with early endosomes, showing that it interacts directly with PI3P. Additionally, we have now identified a critical residue in VP3, R<sub>200</sub>, for binding to PI3P and its key role in the viral life cycle. Furthermore, the molecular dynamics simulations helped us propose a mechanism for VP3 to connect with PI3P in early endosomes. This constitutes a big step forward in our understanding of how these "non-canonical" viruses replicate.
We have now incorporated new experimental and simulation data and have carefully revised the manuscript in accordance with the reviewers’ recommendations. We are confident that these improvements have further strengthened the manuscript.
Reviewer #2 (Public Review):
Summary:
Birnavirus replication factories form alongside early endosomes (EEs) in the host cell cytoplasm. Previous work from the Delgui lab has shown that the VP3 protein of the birnavirus strain infectious bursal disease virus (IBDV) interacts with phosphatidylinositol-3-phosphate (PI3P) within the EE membrane (Gimenez et al., 2018, 2020). Here, Zanetti et al. extend this previous work by biochemically mapping the specific determinants within IBDV VP3 that are required for PI3P binding in vitro, and they employ in silico simulations to propose a biophysical model for VP3-PI3P interactions.
Strengths:
The manuscript is generally well-written, and much of the data is rigorous and solid. The results provide deep knowledge into how birnaviruses might nucleate factories in association with EEs. The combination of approaches (biochemical, imaging, and computational) employed to investigate VP3-PI3P interactions is deemed a strength.
Weaknesses:
(1) Concerns about the sources, sizes, and amounts of recombinant proteins used for co-flotation: Figures 1A, 1B, 1G, and 4A show the results of co-flotation experiments in which recombinant proteins (control His-FYVE v. either full length or mutant His VP3) were either found to be associated with membranes (top) or non-associated (bottom). However, in some experiments, the total amounts of protein in the top + bottom fractions do not appear to be consistent in control v. experimental conditions. For instance, the Figure 4A western blot of His-2xFYVE following co-flotation with PI3P+ membranes shows almost no detectable protein in either top or bottom fractions.
Liposome-based methods, such as the co-flotation assay, are well-established and widely regarded as the preferred approach for studying protein-phosphoinositide interactions. However, this approach is rather qualitative, as density gradient separation reveals whether the protein is located in the top fractions (bound to liposomes) or the bottom fractions (unbound). Our quantifications aim to demonstrate differences in the bound fraction between liposome populations with and without PI3P. Given the setting of the co-flotation assays, each protein-liposome system [2xFYVE-PI3P(-), 2xFYVE-PI3P(+), VP3-PI3P(-), or VP3-PI3P(+)] is assessed separately, and even if the experimental conditions are homogeneous, it is not surprising to observe differences in the protein level between different experiments. Indeed, the revised version of the manuscript includes membranes with more similar band intensities, as depicted in the new versions of Figures 1 and 4.
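The bound-fraction comparison described above can be illustrated with a minimal sketch. It assumes, as an illustration only, that the bound fraction is computed per lane as top / (top + bottom) from densitometry readings; the function name and the band intensities are hypothetical and are not data from the study.

```python
def bound_fraction(top: float, bottom: float) -> float:
    """Fraction of protein recovered in the top (liposome-bound) fraction.

    Assumption for illustration: both inputs are densitometry readings
    from the same blot, in the same arbitrary units.
    """
    total = top + bottom
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return top / total


# Hypothetical band intensities (arbitrary units), not taken from the manuscript.
lanes = {
    "PI3P(-)": {"top": 10.0, "bottom": 90.0},
    "PI3P(+)": {"top": 60.0, "bottom": 40.0},
}

fractions = {name: bound_fraction(**bands) for name, bands in lanes.items()}

for name, fraction in fractions.items():
    print(f"{name}: bound fraction = {fraction:.2f}")
```

Because each lane is normalized to its own total, a blot with a lower overall protein signal gives the same bound fraction as a stronger one when the top-to-bottom ratio is unchanged. Very faint lanes remain harder to quantify reliably.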
Reading the paper, it was difficult to understand which source of protein was used for each experiment (i.e., E. coli or baculovirus-expressed), and this information is contradicted in several places (see lines 358-359 v. 383-384). Also, both the control protein and the His-VP3-FL proteins show up as several bands in the western blots, but they don't appear to be consistent with the sizes of the proteins stated on lines 383-384. For example, line 383 states that His-VP3-FL is ~43 kDa, but the blots show triplet bands that are all below the 35 kDa marker (Figures 1B and 1G). Mass spectrometry information is shown in the supplemental data (describing the different bands for His-VP3-FL) but this is not mentioned in the actual manuscript, causing confusion. Finally, the results appear to differ throughout the paper (see Figures 1B v. 1G and 1A v. 4A).
Thank you for pointing out these potentially confusing points in the previous version of the manuscript. Indeed, we were able to produce recombinant VP3 from the two sources: baculovirus and Escherichia coli. Initially, we opted for the baculovirus system, based on evidence from previous studies showing that it was suitable for ectopic expression of VP3. Subsequently, we successfully produced VP3 using Escherichia coli. On the other hand, the fusion proteins His-2xFYVE and GST-2xFYVE were only produced in the prokaryotic system, also following previously reported evidence. We confirmed that VP3, produced in either system, exhibited similar behavior in our co-flotation and bio-layer interferometry (BLI) assays. However, the co-flotation and BLI assays shown in Figs. 1 and 4 were performed using the His-VP3 FL, His-VP3 FL R<sub>200</sub>D and His-VP3 FL DCt fusion proteins produced from the corresponding baculoviruses. We have clarified this in the revised version of our manuscript. Please see lines 430-432.
Additionally, we have made clear that the His-VP3 FL protein purification yielded four distinct bands, and we confirmed their VP3 identity through mass spectrometry in the revised version of the manuscript. Please see lines 123-124.
Finally, we replaced the membranes for Figs. 4A and 1G (left panel) with ones showing more similar band intensities. Please see the new version of Figures 1 and 4.
(2) Possible "other" effects of the R<sub>200</sub>D mutation on the VP3 protein. The authors performed mutagenesis to identify which residues within patch 2 on VP3 are important for association with PI3P. They found that a VP3 mutant with an engineered R<sub>200</sub>D change (i) did not associate with PI3P membranes in co-flotation assays, and (ii) did not co-localize with EE markers in transfected cells. Moreover, this mutation resulted in the loss of IBDV viability in reverse genetics studies. The authors interpret these results to indicate that this residue is important for "mediating VP3-PI3P interaction" (line 211) and that this interaction is essential for viral replication. However, it seems possible that this mutation abrogated other aspects of VP3 function (e.g., dimerization or other protein/RNA interactions) aside from or in addition to PI3P binding. Such possibilities are not mentioned by the authors.
The arginine amino acid at position 200 of VP3 is not located in any of the protein regions associated with its other known functions: VP3 has a dimerization domain located in the second helical domain, where different amino acids across the three helices form a total of 81 interprotomeric close contacts; however, R<sub>200</sub> is not involved in these contacts (Structure. 2008 Jan;16(1):29-37, doi:10.1016/j.str.2007.10.023); VP3 has an oligomerization domain mapped within the 42 C-terminal residues of the polypeptide, i.e., the segment of the protein composed by the residues at positions 216-257 (J Virol. 2003 Jun;77(11):6438–6449, doi: 10.1128/jvi.77.11.6438-6449.2003); VP3’s ability to bind RNA is facilitated by a region of positively-charged amino acids, identified as P1, which includes K<sub>99</sub>, R<sub>102</sub>, K<sub>105</sub>, and K<sub>106</sub> (PLoS One. 2012;7(9):e45957, doi: 10.1371/journal.pone.0045957). Furthermore, our findings indicate that the R<sub>200</sub>D mutant retains a folding pattern similar to the wild-type protein, as shown in Figure 4B. All these lead us to conclude that the loss of replication capacity of R<sub>200</sub>D viruses results from impaired, or even loss of, VP3-PI3P interaction.
We agree with the reviewer that this is an important point and have accordingly addressed it in the Discussion section of the revised manuscript. Please see lines 333-346.
(3) Interpretations from computational simulations. The authors performed computational simulations on the VP3 structure to infer how the protein might interact with membranes. Such computational approaches are powerful hypothesis-generating tools. However, additional biochemical evidence beyond what is presented would be required to support the authors' claims that they "unveiled a two-stage modular mechanism" for VP3-PI3P interactions (see lines 55-59). Moreover, given the biochemical data presented for R<sub>200</sub>D VP3, it was surprising that the authors did not perform computational simulations on this mutant. The inclusion of such an experiment would help tie together the in vitro and in silico data and strengthen the manuscript.
We acknowledge that the wording used in the previous version of the manuscript may have overstated the "unveiling" of the two-stage binding mechanism of VP3. Our intention was to propose a potential mechanism that is consistent with both the biophysical experiments and the molecular simulations. In the revised version of the manuscript, we have tempered these claims and framed them more appropriately.
Regarding the simulations for the R<sub>200</sub>D VP3 mutant, these simulations were indeed performed and included in the original manuscript as part of Figure S14 in the Supplementary Information. However, we realize that this was not sufficiently emphasized in the main text, an oversight on our part. We have now revised the manuscript to highlight these results more clearly.
Additionally, to further strengthen the connection between experimental and simulation trends, we have now included a new figure in the Supplementary Information (Figure S15). This figure depicts the binding energy of VP3 ΔNt and two of its mutants, VP3 ΔNt R<sub>200</sub>D and VP3 ΔNt P2 Mut, as a function of salt concentration. The results show that as the number of positively charged residues in VP3 is systematically reduced, the binding of the protein to the membrane becomes weaker. The effect is more pronounced at lower salt concentrations, which highlights the weight of electrostatic forces on the adsorption of VP3 on negatively charged membranes. Please, see Supplementary Information (Figure S15).
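The stronger effect at low salt is the behaviour expected from electrostatic screening. As a rough, textbook-level illustration (this is not the molecular theory used for Figure S15), the Debye screening length in water at 25 °C for a monovalent salt is approximately 0.304/√I nm, where I is the ionic strength in mol/L. The function name and the salt concentrations below are arbitrary example choices, not values taken from the manuscript.

```python
import math


def debye_length_nm(ionic_strength_molar: float) -> float:
    """Approximate Debye screening length (nm) for a 1:1 electrolyte
    in water at 25 degrees C, using the rule of thumb 0.304 / sqrt(I)."""
    if ionic_strength_molar <= 0:
        raise ValueError("ionic strength must be positive")
    return 0.304 / math.sqrt(ionic_strength_molar)


# Hypothetical salt concentrations (mol/L), chosen only for illustration.
for salt in (0.05, 0.15, 0.50):
    print(f"{salt:.2f} M salt: Debye length ~ {debye_length_nm(salt):.2f} nm")
```

The screening length shrinks as salt increases, so charge-charge attraction between a positively charged protein patch and a negatively charged membrane acts over a shorter range at high salt.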
Reviewer #3 (Public Review):
Summary:
Infectious bursal disease virus (IBDV) is a birnavirus and an important avian pathogen. Interestingly, IBDV appears to be unique among dsRNA viruses in using early endosomes for RNA replication, a strategy that is more common among +ssRNA viruses such as SARS-CoV-2.
This work builds on previous studies showing that IBDV VP3 interacts with PIP3 during virus replication. The authors provide further biophysical evidence for the interaction and map the interacting domain on VP3.
Strengths:
Detailed characterization of the interaction between VP3 and PIP3 identified R<sub>200</sub>D mutation as critical for the interaction. Cryo-EM data show that VP3 leads to membrane deformation.
Weaknesses:
The work does not directly show that the identified R<sub>200</sub> residues are directly involved in VP3-early endosome recruitment during infection. The majority of work is done with transfected VP3 protein (or in vitro) and not in virus-infected cells. Additional controls such as the use of PIP3 antagonizing drugs in infected cells together with a colocalization study of VP3 with early endosomes would strengthen the study.
In addition, it would be advisable to include a control for cryo-EM using liposomes that do not contain PIP3 but are incubated with HIS-VP3-FL. This would allow ruling out any unspecific binding that might not be detected on WB.
The authors also do not propose how their findings could be translated into drug development that could be applied to protect poultry during an outbreak. The title of the manuscript is broad and would improve with rewording so that it captures what the authors achieved.
In previous work from our group, we demonstrated the crucial role of the VP3 P2 region in targeting early endosomal membranes and in viral replication. This included the use of PI3K inhibitors to deplete PI3P, which showed that both the control RFP-2xFYVE and VP3 lost their ability to associate with early endosomal membranes and that the production of infective viral progeny was reduced (J Virol. 2018 May 14;92(11):e01964-17, doi: 10.1128/jvi.01964-17; J Virol. 2021 Feb 24;95(6):e02313-20, doi: 10.1128/jvi.02313-20). In the present work, to further characterize the role of R<sub>200</sub> in binding to early endosomes and in viral replication, we show that: i) the transfected VP3 R<sub>200</sub>D protein loses the ability to bind to early endosomes in immunofluorescence assays (Figure 2E and Figure 3); ii) the recombinant His-VP3 FL R<sub>200</sub>D protein loses the ability to bind to liposomes PI3P(+) in co-flotation assays (Figure 4A); and iii) the mutant virus R<sub>200</sub>D loses replication capacity (Figure 4C).
Regarding the cryo-electron microscopy observation, we verified that there is no binding of gold particles to liposomes PI3P(-) when they are incubated solely with the gold-particle reagent, or when they are pre-incubated with the gold-particle reagent with either His-2xFYVE or His-VP3 FL. We have incorporated a new panel in Figure 1C showing a representative image of these results. Please, see lines 143-144 in the revised version of our manuscript and our revised version of Figure 1C.
We have replaced the title of the manuscript with a more specific one. The current title is "On the Role of VP3-PI3P Interaction in Birnavirus Endosomal Membrane Targeting".
Regarding the question of how our findings could be translated into drug development, indeed, VP3-PI3P binding constitutes a good potential target for drugs that counteract infectious bursal disease. However, we did not mention this idea in the manuscript, first because it is somewhat speculative and second because infected farms do not implement any specific treatment. The control is based on vaccination.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
Critical issues to address:
(1) The citations in the important paragraph on lines 101-5 are not identifiable. These references are described as showing that VP3 is associated with EEs via P2 and PI3P, which is basically what this paper also shows. The significant advance here is unclear.
We apologize for this mistake. These citations are identifiable in the revised version of the manuscript (lines 100-105). As mentioned before, in this manuscript we present biochemical and biophysical details that have not been reported before about how VP3 connects with early endosomes, showing that it interacts directly with the PI3P. Additionally, we have now identified a critical residue in VP3 P2—the R<sub>200</sub>—for binding to PI3P and its key role in the viral life cycle. Furthermore, the molecular dynamics simulations helped us come up with a mechanism for VP3 to connect with PI3P in early endosomes. This constitutes a big step forward in our understanding of how these "non-canonical" viruses replicate.
(2) Even if all the claims were to be clearly supported through major revamping, authors should make the significance of knowing that this protein binds to early endosomes through PI3P more clear?
Thank you for the recommendation, which aligns with a similar suggestion from Reviewer #2. In response, we have revised the significance paragraph to emphasize the mechanistic aspects of our findings. Please refer to lines 62–67 in the revised manuscript.
(3) Flotation assay shows binding, but this is not quantitative. An estimate of a Kd would be useful. BLI experiments suggest that half of the binding disappears at 0.5 mM, implying a very low binding affinity.
We agree with the reviewer that our biophysical and molecular simulation results suggest a specific but weak interaction of VP3 with PI3P-bearing membranes. Indeed, the previous version of the manuscript already contained a paragraph on this point. Please see lines 323-332 in the revised version of the manuscript.
From a biological point of view, a low binding affinity of VP3 for the endosomes may constitute an advantage for the virus, in the sense that its traffic through the endosomes may be short lived during its infectious cycle. Indeed, VP3 has been demonstrated to be a "multifunctional" protein involved in several processes of the viral cycle (detailed in lines 84-90), and in our laboratory we have shown that the Golgi complex and the endoplasmic reticulum are organelles where further viral maturation occurs. Taking all of this into account, a high binding affinity of VP3 for endosomes could result in the protein becoming trapped on the endosomal membrane, potentially hindering the progression of the viral infection within the host cell.
(4) There are some major internal inconsistencies in the data: Figure 1B quantifies VP3-FL T/B ratio ~4 (which appears inconsistent with the image shown, as the T lanes are much lighter than the B) whereas apparently the same experiment in Figure 1G shows it to be ~0.6. With the error bars shown, these results would appear dramatically different from each other, despite supposedly measuring the same thing. The same issue with the FYVE domain between Figures 1A and 4A.
We appreciate the reviewer’s comment, as it made us aware of an error in Figure 1B. There, the mean value for the VP3-FL Ts/B ratio is 3.0786 for liposomes PI3P(+) and 0.4553 for liposomes PI3P(-) (please see the new bar graph in Figure 1B). The error may have occurred because, given the significance of these experiments, we performed multiple rounds of quantification in search of the most suitable procedure for our observations, which led to a mix-up of data sets. Even so, these corrected values may still seem inconsistent, given that the T lanes are much lighter than the B lanes for VP3-FL in the image shown. Flotation assays are quite labor-intensive and, at least in our experience, yield fairly variable results in terms of quantification. To illustrate this point, the following image shows the three experiments conducted for Figure 1B, where it is clear that, despite producing visually distinct images, all three yielded the same qualitative observation. For Figure 1B, we chose to present the results from experiment #2. However, all three experiments contributed to the Ts/B ratio of 3.0786 for His-VP3 FL, which may account for the apparent inconsistency when focusing solely on the image in Figure 1B.
Author response image 1.
We acknowledge that, at first glance, some inconsistencies may appear in the results, and we have thoroughly discussed the best approach for quantification. However, we believe the observations are robust in terms of reproducibility and reliable, as the VP3-PI3P interaction was consistently validated by comparison with liposomes lacking PI3P, where no binding was observed.
(5) Comparison of PA (or PI) to PI3P at the same molar concentration is inappropriate because PI3P has at least double charge. The more interesting question about specificity would be whether PI45P2 (or even better PI35P2) binds or not. Without this comparison, no claim to specificity can be made.
For us, "specificity" refers to the requirement of a phosphoinositide in the endosomal membrane for VP3 binding. Phosphoinositides have a conspicuous distribution among cellular compartments, and knowing that VP3 associates with early endosomes, our specificity assays aimed to demonstrate that PI3P is strictly required for the binding of VP3. To validate this, we used PI (lacking the phosphate group) and PA (lacking the inositol group) despite their similar charges. In spite of the potential chemical interactions between VP3 and various phosphoinositides, our experimental results suggest that the virus specifically targets endosomal membranes by binding to PI3P, a phosphoinositide present only in early endosomes.
That said, we agree with the reviewer’s point and consider it appropriate to soften our specificity claim in the manuscript as follows: “We observed that His-VP3 FL bound to liposomes PI3P(+), but not to liposomes PA or PI, reinforcing the notion that a phosphoinositide is required since neither a single negative charge nor an inositol ring are sufficient to promote VP3 binding to liposomes (SI Appendix, Fig. S2)” (Lines 136-139).
(6) In the EM images, many of the gold beads are inside the vesicles. How do they cross the membranes?
They do not cross the membrane. Our EM images are two-dimensional projections, meaning that the gold particles located on top or beneath the plane appear to be inside the liposome.
(7) Images in Figure 2D are very low quality and do not show the claimed difference between any of the mutants. All red signal looks basically cytosolic in all images. It is not clear what criteria were used for the quantification in Figure 2E. The same issue is in Figure 2E, where no red WT puncta are observable at all. Consistently, there is minimal colocalization in the quantification in Figure S3, which appears to show no significant differences between any of the mutants, in direct contradiction to the claim in the manuscript.
We apologize for the poor quality of panels in Figures 2D and 2E. Unfortunately, this was due to the PDF conversion of the original files. Please, check the high-quality version of Figure 2. As suggested by reviewers #2 and #3, we have incorporated zoomed panels, which help the reader to better see the differences in distribution.
As mentioned in the legend to Figure 2, the quantification in Figure 2D was performed by calculating the percentage of cells with a punctate red fluorescent signal (showing VP3 distribution) for each protein. The data were then normalized to the P2 WT protein, which is the VP3 wild type.
Figure S3 shows a tendency that correlates positively with the results shown in Figure 3, where we used FYVE to detect PI3P on endosomes and observed significantly less co-localization when VP3 bears its P2 region all reversed or lacks the R<sub>200</sub>.
(8) The only significant differences in colocalization are in Figure 3B, whose images look rather dramatically different from the rest of the manuscript, leading to some concern about repeatability. Also, it is unclear how colocalization is quantified, but this number typically cannot be above 1. Finally, it is unclear what is being colocalized here: with three fluorescent components, there are 3 possible binary colocalizations and an additional ternary colocalization.
We thank the reviewer for pointing out these aspects of Figure 3. The experiments for Figure 3B were conducted by a collaborator abroad who handled the purified GST-2xFYVE, which recognizes endogenous PI3P, while the rest of the cell biology experiments were conducted in our laboratory in Argentina. This is why the images look different. We have made an effort to homogenize their appearance in the revised version of the manuscript. Please see the new version of Figure 3.
For quantification of the co-localization of VP3 and EGFP-2xFYVE (Figure 3A), the Manders M2 coefficient was calculated from approximately 30 cells per construct and experiment. The M2 coefficient, which reflects co-localization of signals, is defined as the ratio of the summed intensities of magenta-channel pixels for which the intensity in the blue channel is above zero to the total intensity in the magenta channel. The JACoP plugin was used to determine M2. For VP3 puncta co-distributing with EEA1 and GST-FYVE (Figure 3B), the number of puncta co-distributing for the three signals was determined manually per 200 µm² from approximately 40 cells per construct and experiment. We understand that the Manders and Pearson coefficients, which typically range between 0 and 1, are the most commonly used measures for quantifying co-localizing immunofluorescent signals; however, this “manual” method has been used and validated in previously published manuscripts [Figures 3 and 7 from (Morel et al., 2013); Figure 7 in (Khaldoun et al., 2014); and Figure 4 in (Boukhalfa et al., 2021)].
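To make the M2 definition above concrete, the following is a minimal sketch in plain Python. It is not the JACoP implementation; the function name, the `threshold` parameter, and the toy pixel values are hypothetical and serve only to illustrate the ratio described in the text.

```python
def manders_m2(magenta, blue, threshold=0.0):
    """Fraction of the total magenta intensity found in pixels where
    the blue-channel intensity is above `threshold`.

    Both images are given as equally sized lists of rows of intensities.
    """
    total = 0.0
    colocalized = 0.0
    for magenta_row, blue_row in zip(magenta, blue):
        for m, b in zip(magenta_row, blue_row):
            total += m
            if b > threshold:
                colocalized += m
    if total == 0:
        raise ValueError("magenta channel contains no signal")
    return colocalized / total


# Toy 2x3 images (hypothetical values, for illustration only).
magenta = [[0, 10, 20],
           [5, 0, 15]]
blue = [[0, 3, 0],
        [1, 0, 7]]

print(manders_m2(magenta, blue))  # 30 of 50 intensity units overlap -> 0.6
```

Because the numerator is a subset of the denominator, the coefficient is bounded between 0 and 1, unlike the manual puncta counts reported for Figure 3B.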
(9) SegA/B plasmids are not introduced, and it is not clear what these are or how this assay is meant to work. Where are the foci forming units in the images of Figure 4C? How does this inform on replication? Again, this assay is not quantitative, which is essential here: does the R<sub>200</sub> mutant completely kill activity (whatever that is here)? Or reduce it somewhat?
We apologize for the missing information. Segments A and B are the components of the IBDV reverse genetics system. For their construction, we used a modification of the system described by Qi and coworkers (Qi et al., 2007), in which the full-length sequences of the IBDV RNA segments A and B, flanked by a hammerhead ribozyme at the 5’-end and the hepatitis delta ribozyme at the 3’-end, were expressed under the control of an RNA polymerase II promoter within the plasmids pCAGEN.Hmz.SegA.Hdz (SegA) and pCAGEN.Hmz.SegB.Hdz (SegB). For this specific experiment we generated a third plasmid, pCAGEN.Hmz.SegA.R<sub>200</sub>D.Hdz (SegA.R<sub>200</sub>D), harboring a mutant version of the segment A cDNA containing the R<sub>200</sub>D substitution. QM7 cells were then transfected with the plasmids SegA, SegB or SegA.R<sub>200</sub>D alone (as controls) or with a mixture of plasmids SegA+SegB (wild-type situation) or SegA.R<sub>200</sub>D+SegB (mutant situation). At 8 h post transfection (p.t.), when new viruses have been able to assemble from the two RNA segments, the cells were recovered and re-plated onto fresh non-transfected cells to reveal the presence (or absence) of infective viruses. At 72 h post-plating, the generation of foci forming units (FFUs) was revealed by Coomassie staining. As expected, single transfections of SegA, SegB or SegA.R<sub>200</sub>D did not produce FFUs. As shown in Figure 4C, transfection of SegA+SegB produced detectable FFUs (the three circles in the upper panel), while no FFUs were detected after transfection of SegA.R<sub>200</sub>D+SegB (the three circles in the lower panel). This system is quantitative, since the FFUs detected 72 h post-plating can simply be counted. However, since no FFUs were detected after the transfection of SegA.R<sub>200</sub>D+SegB, as evidenced by a complete monolayer of cells stained blue, quantification was not meaningful in this case.
In turn, this drastic observation indicates that viruses bearing the VP3 R<sub>200</sub>D mutation lose their replication ability (i.e., they are “dead”), demonstrating the crucial role of this residue in the infectious cycle.
We agree with the reviewer that a better explanation was needed in the manuscript, so we have incorporated a paragraph in the results section of our revised version of the manuscript (lines 209-219).
(10) Why pH 8 for simulation?
The Molecular Theory calculations were performed at pH 8 for consistency with the experimental conditions used in our biophysical assays. These biophysical experiments were also performed at pH 8, following the conditions established in the original study where VP3 was first purified for crystallization (DOI: 10.1016/j.str.2007.10.023).
(11) There is minimal evidence for the sequential binding model described in the abstract. The simulations do not resolve this model, nor is truly specific PI3P binding shown.
In response to your concerns, we would like to emphasize that our simulations provide robust evidence supporting the two most important aspects of the sequential binding model: 1) Membrane Approach: In all simulations, VP3 consistently approaches the membrane via its positively charged C-terminal (Ct) region. 2) PI3P Recruitment: Once the protein is positioned flat on the membrane surface, PI3P is unequivocally recruited to the positively charged P2 region. The enrichment of PI3P in the proximity of the protein is clearly observed and has been quantified via radial distribution functions, as detailed in the manuscript and supplementary material.
While we understand that opinions may vary on the sufficiency of the data to fully validate the model, we believe the results offer meaningful insights into the proposed binding mechanism. That said, we acknowledge that the specificity of VP3 binding may not be restricted solely to PI3P but could extend to phosphoinositides in general. To address this, we performed the new set of co-flotation experiments which are discussed in detail in our response to point 5.
Reviewer #2 (Recommendations For The Authors):
(1) Line 1: Consider changing the title to better reflect the mostly biochemical and computational data presented in the paper: "Mechanism of Birnavirus VP3 Interactions with PI3P-Containing Membranes". There are no data to show hijacking by a virus presented.
We appreciate this recommendation, which was also made by reviewer #3, and we thank the reviewer for the suggested title. We have replaced the title of the manuscript with a more specific one. The current title is
"On the Role of VP3-PI3P Interaction in Birnavirus Endosomal Membrane Targeting".
(2) Lines 53-54 and throughout: Consider rephrasing "demonstrate" to "validate" to give credit to Gimenez et al., 2018, 2022 for discovery.
Thanks for the suggestion. We have followed it accordingly. Please see line 52 from our revised version of the manuscript.
(3) Line 56-59 and throughout: Consider tempering and rephrasing these conclusions that are based mostly on computational data. For example, change "unveil" to "suggest" or another term.
We have now modified the wording throughout the manuscript.
(4) The abstract could also emphasize that this study sought to map the residues within VP3 that are important for PI3P interaction.
Thanks for the suggestion. We have followed it accordingly. Please, see lines 53-55 from our revised version of the manuscript.
(5) Lines 63-69: This Significance paragraph seems tangential. The findings in this paper aren't at all related to the evolutionary link between birnaviruses and positive-strand RNA viruses. The significance of the work for me lies in the deep biochemical/biophysical insights into how a viral protein interacts with membranes to nucleate its replication factory.
We have re-written the significance paragraph highlighting the mechanistic aspect of our findings. Please, see lines 62-67 in our revised version of the manuscript.
(6) Line 74: Please define "IDBV" abbreviation.
We apologize for the missing information. We have defined the IBDV abbreviation in our revised version of the manuscript (please, see line 73).
(7) Line 88: Please define "pVP2" abbreviation.
We apologize for the missing information. We have defined the pVP2 abbreviation in our revised version of the manuscript (please, see line 87).
(8) Lines 101-105: Please change references (8, 9, 10) to be consistent with the rest of the manuscript (names, year).
We apologize for this mistake. These citations are identifiable and consistent in the revised version of the manuscript (lines 100-105).
(9) Line 125: For a broad audience, consider explaining that recombinant His-2xFYVE domain is known to exhibit PI3P-binding specificity and was used as a positive control.
Thanks for the recommendation. We have incorporated a brief explanation supporting the use of His-2xFYVE as a positive control in our revised version of the manuscript. Please, see lines 127-129.
(10) Lines 167-171: The quantitative data in Figure S3 shows that there was a non-significant co-localization coefficient of the R<sub>200</sub>D mutant. For transparency, this should be stated in the Results section when referenced.
We agree with this recommendation and have stated this clearly in the revised version of the manuscript. Please see lines 177-179. We have also referred to this fact when introducing the assays performed using the purified GST-2xFYVE, shown in Figure 3. Please see lines 182-184.
(11) Lines 156 and 173: These Results section titles have nearly identical wording. Consider rephrasing to make it distinct.
We agree with the reviewer’s observation. In fact, we did this on purpose, intending the titles as a “wordplay”, but we understand that it could result in an awkward redundancy. So, in the revised version of the manuscript, the two titles are:
Role of VP3 P2 in the association of VP3 with the EE membrane (line 163).
VP3 P2 mediates VP3-PI3P association to EE membranes (line 182).
(12) Line 194: Is it alternatively possible that the R<sub>200</sub>D mutant lost its capacity to dimerize, and that in turn impacted PI3P interaction?
Thanks for the relevant question. VP3 was crystallized and its structure reported in (Casañas et al., 2008) (DOI: 10.1016/j.str.2007.10.023). In that report, the authors showed that the two VP3 subunits associate in a symmetrical manner by using the crystallographic two-fold axes. Each subunit contributes 30% of its total surface to form the dimer, with 81 interprotomeric close contacts, including polar bonds and van der Waals contacts. The authors identified the group of residues involved in these interactions, which does not include R<sub>200</sub>. Additionally, the authors determined that the interface of the VP3 dimer in crystals is biologically meaningful (not due to crystal packing).
To confirm that the lack of binding was not due to misfolding of the mutant, we compared the circular dichroism spectra of the mutant and wild-type proteins and detected no significant differences (shown in Figure 4B). These observations do not exclude the possibility mentioned by the reviewer but, we believe, constitute solid evidence in support of our conclusions.
(13) Lines 231-243: Consider changing verbs to past tense (i.e., change "is" to "was") for the purposes of consistency and tempering.
Thanks for the recommendation, we have proceeded as suggested. Please, see lines 249-262 in our revised version of the manuscript.
(14) Lines 306-308: Is there any information about whether it is free VP3 (v. VP3 complexed in RNP) that binds to membrane? I am just trying to wrap my head around how these factories form during infection.
Thanks for pointing this out. We first observed that, in infected cells, all the components of the RNPs [VP3, VP1 (the viral polymerase) and the dsRNA] were associated with the endosomes. Since by that time it had already been shown that VP3 "wraps" the dsRNA within the RNPs (Luque et al., 2009) (DOI: 10.1016/j.jmb.2008.11.029), we reasoned that VP3 was most probably leading this association. We confirmed this after studying its distribution when ectopically expressed, which was also endosome-associated. These results were published in (Delgui et al., 2013) (DOI: 10.1128/jvi.03152-12).
Thus, in our subsequent studies, we have worked with both infection-derived and ectopically expressed VP3 to advance in elucidating the mechanism by which VP3 hijacks the endosomal membranes and its relevance for viral replication, as reported in the current manuscript.
(15) Lines 320-334: This last paragraph discussing evolutionary links between birnaviruses and positive-strand RNA viruses seems tangential and distracting. Consider reducing or removing.
Thanks for highlighting this aspect of our work. Although it may be difficult to follow, in the context of other evidence reported for the Birnaviridae family, we strongly believe that there is an evolutionary dimension to the observation that these dsRNA viruses replicate in association with membranous organelles, a hallmark of +RNA viruses. However, we agree with the reviewer that this might not be the main point of our manuscript, so we have shortened this paragraph accordingly. Please see lines 358-367 in our revised version of the manuscript.
(16) Lines 322-324: Change "RdRd" to "RdRp" if keeping paragraph.
Thanks. We have corrected this mistake in lines 360 and 361.
(17) Figures 1A, 1B, and throughout: Again, please check and explain protein sizes and amounts. This would improve the clarity of the manuscript.
All our flotation assays were performed using 1 mM concentration of purified protein in a final volume of 100 mL (mentioned in M&M section). The complete fusion protein His-2xFYVE (shown in Figs. 1A and 4A left panel) is encoded by a 954-bp sequence and contains 317 residues (~35 kDa). The complete fusion protein His-VP3 FL (shown in Figs. 1B and 1G left panel) is encoded by an 861-bp sequence and contains 286 residues (~32 kDa). The complete fusion protein His-VP3 ΔCt (shown in Fig. 1G, right panel) is encoded by a 753-bp sequence and contains 250 residues (~28 kDa). The complete fusion protein His-VP3 FL R<sub>200</sub>D (shown in Fig. 4A right panel) is encoded by an 861-bp sequence and contains 286 residues (~32 kDa). This information was incorporated in our revised version of the manuscript. Please see lines 381-382, 396-397 and 399-400 from the M&M section, and the corresponding figure legends.
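As a quick consistency check on the sizes quoted above, the sketch below converts a coding-sequence length into a residue count and an approximate mass. It assumes that each quoted length includes one stop codon and uses the common rule of thumb of about 110 Da per residue; both are assumptions made for illustration, and the function names are hypothetical.

```python
AVERAGE_RESIDUE_MASS_KDA = 0.110  # rule-of-thumb average, an assumption


def residues_from_bp(coding_length_bp: int) -> int:
    """Residue count for a coding sequence that ends with one stop codon."""
    if coding_length_bp % 3 != 0:
        raise ValueError("coding length must be a multiple of 3")
    return coding_length_bp // 3 - 1


def approx_mass_kda(n_residues: int) -> float:
    """Rough protein mass estimate from the residue count."""
    return n_residues * AVERAGE_RESIDUE_MASS_KDA


# Coding-sequence lengths (bp) as quoted in the response above.
constructs = {"His-2xFYVE": 954, "His-VP3 FL": 861, "His-VP3 dCt": 753}

for name, length_bp in constructs.items():
    n = residues_from_bp(length_bp)
    print(f"{name}: {n} residues, ~{approx_mass_kda(n):.1f} kDa")
```

Under these assumptions the residue counts (317, 286 and 250) match the quoted values, and the estimated masses (about 34.9, 31.5 and 27.5 kDa) are close to the stated ~35, ~32 and ~28 kDa.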
(18) Figures 1B and 1G show different results for PI3P(+) membranes. I see protein associated with the top fraction in 1B, but I don't see any such result in 1G.
As already mentioned, liposome-based methods, such as the co-flotation assay, are well established and widely regarded as the preferred approach for studying protein-phosphoinositide interactions. However, this approach is rather qualitative, as density gradient separation reveals whether the protein is located in the top fractions (bound to liposomes) or the bottom fractions (unbound). Our quantifications aim to demonstrate differences in the bound fraction between liposome populations with and without PI3P. Given the setup of the co-flotation assays, each protein-liposome system [2xFYVE-PI3P(-), 2xFYVE-PI3P(+), VP3-PI3P(-), or VP3-PI3P(+)] is assessed separately, and even if the conditions are homogeneous, it is not surprising to observe differences in protein levels between them. The revised version of the manuscript includes a new membrane for Figure 1G in which the association of His-VP3 FL with the top fraction is clearer. Please see the new version of Figure 1G.
(19) Figure 1C: Please include cryo-EM images of the liposome PI3P(-) variables to assess the visual differences of the liposomal membranes under these conditions.
Thanks for the recommendation. We verified that there is no binding of gold particles to liposomes PI3P(-) when they are incubated solely with the gold-particle reagent, or when they are pre-incubated with the gold-particle reagent with either His-2xFYVE or His-VP3 FL. We have incorporated a new panel in Figure 1C showing a representative image of these results. Please see lines 143-144 in the revised version of our manuscript and our revised version of Figure 1C.
(20) Figures 2D, 2E, and 3A: The puncta are not obvious in these images. Consider adding Zoomed panels.
We apologize for this aspect of Figures 2 and 3, also highlighted by reviewer #1. We believe that this was due to the low quality resulting from the PDF conversion of the original files. For Figure 3A, we have homogenized its aspect with those from 3B. Regarding Figure 2, we have incorporated zoomed panels, as suggested. Please, see the revised versions of both Figures.
(21) Figure 4A: There is almost no protein in the control PI3P(+) blot. Why? Also, the quantification shows no significant membrane association for this control. This result is different from Figure 1A and very confusing (and concerning).
We apologize for the confusion. We replaced the membranes for Figure 4A (left panel) with ones whose band intensities are more similar to those shown in Figure 1A. Please see our new version of Figure 4. It is true that the quantification shows no significant difference in the association with liposomes PI3P(+) compared to liposomes PI3P(-); this is due, once more, to the intrinsic lack of homogeneity of co-flotation assays. However, the control shown in Figure 4A is redundant (it has already been shown in Figure 1A), and we believe that the new membrane is qualitatively clear.
Reviewer #3 (Recommendations For The Authors):
(1) Overall, the title is general and does not summarize the study. I recommend making the title more specific. The current title is better suited for a review as opposed to a research article. This study provides further biophysical details on the interaction. This should be reflected in the title.
We appreciate this recommendation, which was also expressed by reviewer #2. We have chosen a new title for the manuscript: “On the Role of VP3-PI3P Interaction in Birnavirus Endosomal Membrane Targeting”.
(2) References 8,9,10 are important but they were not correctly cited in the work, this should be corrected.
We apologize for this mistake. These citations are identifiable in our revised version of the manuscript. See lines 100-105.
(3) Flotation experiments and cryo-EM convincingly show that VP3 binds to membranes in a PIP3-dependent manner. However, it would be advisable to include a control for cryo-EM using liposomes that do not contain PIP3 but are incubated with HIS-VP3-FL. This would allow us to rule out any unspecific binding that might not be detected on WB.
Thanks for the advice, also given by reviewer #2. We confirmed that no gold particles were bound to liposomes PI3P(-), either when incubated with the Ni-NTA reagent alone or when pre-incubated with His-2xFYVE or His-VP3 FL. We have incorporated a new panel in Figure 1C showing a representative image of these results. Please see lines 143-144 in the revised version of the manuscript and the revised version of Figure 1C.
(4) It is not clear what is the difference between WB in B and WB in G. Figure 1G seems to show the same experiment as shown in B, is this a repetition? In both cases, plots next to WBs show quantification with bars, do they represent STD or SEM? Legend A mentions significance p>0.01 (**) but the plot shows ***. This should be corrected.
The Western blot membrane in Figure 1B shows the result of a co-flotation assay using His-VP3 FL protein, while the Western blot membrane in Figure 1G (left panel) shows a co-flotation assay using His-VP3 FL protein as a positive control. In other words, in 1B the His-VP3 FL protein is the question, while in 1G (left panel) it is the co-flotation positive control for His-VP3 DCt. The bar plots next to the Western blots show the quantification as the mean and the STD. Thanks for highlighting this inconsistency. We have now corrected it in the revised version of the manuscript.
(5) It would be useful to indicate positively charged residues and P2 on the AF2 predicted structure in Fig 1.
These are indicated in panels A and B of Figure 2.
(6) Figure 1 legend: Change cryo-fixated liposomes to cryo-fixation or better to "liposomes were vitrified". There is a missing "o" in the cry-fixation in the methods section.
Thanks for the recommendation. We have modified the Figure 1 legend to "liposomes were vitrified" (line 758), and fixed the word cryo-fixation in the methods section (line 512).
(7) Figure 2B. It is not clear how the punctated phenotype was unbiasedly characterized (Figure 2D). I see no difference in the representative images. Magnified images should be shown. This should be measured as colocalization (Pearson's and Mander's coefficient) with an early endosomal marker Rab5. Perhaps this figure could be consolidated with Figure 3.
Unfortunately, the lack of clarity in Figure 2D was due to the PDF conversion of the original files. Please see the high-quality original image above in response to reviewer #1, where we have additionally included zoomed panels, as also suggested by the other reviewers. For quantification of the co-localization of VP3 and either EGFP-Rab5 or EGFP-2xFYVE, the Manders M2 coefficient was calculated from approximately 30 cells per construct and experiment; the results were shown in Figure S3 and Figure 3A, respectively, in our previous version of the manuscript.
(8) PI3P antagonist drugs should be used to further substantiate the results. If PI3P specifically recruits VP3, this interaction should be abolished in the presence of a PI3P drug and VP3 should show a diffuse signal.
We certainly agree with this point. These experiments were performed and the results were reported in (Gimenez et al., 2020). Briefly, in that work, we blocked the synthesis of PI3P in QM7 cells in a stable cell line overexpressing VP3, QM7-VP3, with either the pan-PI3Kinase (PI3K) inhibitor LY294002, or the specific class III PI3K Vps34 inhibitor Vps34-IN1. In Figure 4, we showed that 98% of the cells treated with these inhibitors had the biosensor GFP-2FYVE dissociated from EEs, evidencing the depletion of PI3P in EEs (Figure 4A). In QM7-VP3 cells, we showed that the depletion of PI3P by either inhibitor caused the dissociation of VP3 from EEs and the disaggregation of VP3 puncta toward a cytosolic distribution (Figure 4B). Moreover, since this observation was crucial for our hypothesis, these results were further confirmed with an alternative strategy to deplete PI3P in EEs. We employed a system to inducibly hydrolyze endosomal PI3P through rapamycin-induced recruitment of the PI3P phosphatase myotubularin 1 (MTM1) to endosomes in cells expressing MTM1 fused to the FK506 binding protein (FKBP) and the rapamycin-binding domain fused to Rab5, using the fluorescent proteins mCherry-FKBP-MTM1 and iRFP-FRB-Rab5, as described in (Hammond et al., 2014). These results, shown in Figures 5, 6 and 7 in the same manuscript, further reinforced the notion that PI3P mediates and is necessary for the association of VP3 protein with EEs.
(9) The authors should show the localization of VP3 in IBDV-infected cells and treat cells with PI3P antagonists. The fact that R<sub>200</sub> is not rescued does not necessarily mean that this is because of the failed interaction with PI3P. As the authors wrote in the discussion: VP3 bears multiple essential roles during the viral life cycle (line 305).
Indeed, after confirming that VP3 lost its endosome-associated localization after treatment of the cells with PI3P antagonists, we demonstrated that depletion of PI3P significantly reduced the production of IBDV progeny. To this end, we used two approaches: the inhibitor Vps34-IN1 and an siRNA against Vps34. In both cases, we observed significantly reduced production of IBDV progeny (Figures 9 and 10). Specifically related to the reviewer’s question, the localization of VP3 in IBDV-infected cells treated with PI3P antagonists was shown and quantified in Figure 9a.
(10) Could you provide adsorption-free energy profiles and MD simulations also for the R<sub>200</sub> mutant?
Following the reviewer’s suggestion, we have added a new figure to the supplementary information (Figure S15). Instead of presenting a full free-energy profile for each protein, we focused on the adsorption free energy (i.e., the minimum of the adsorption free-energy profile) for VP3 ΔNt and its mutants, VP3 ΔNt R<sub>200</sub>D and VP3 ΔNt P2 Mut, as a function of salt concentration. The aim was to compare the adsorption free energy of the three proteins and evaluate the effect of electrostatic forces on it, which become increasingly screened at higher salt concentrations. As shown in the referenced figure, reducing the number of positively charged residues from VP3 ΔNt to VP3 ΔNt P2 Mut systematically weakens the protein’s binding to the membrane. This effect is particularly pronounced at lower salt concentrations, underscoring the importance of electrostatic interactions in the adsorption of the negatively charged VP3 onto the anionic membrane.
(11) Liposome deformations in the presence of VP3 are interesting (Figure 6G), were these also observed in Figure 1C?
Good question. The liposome deformations in the presence of VP3 shown in Figure 6G were a robust observation since, as mentioned, they were detectable in 36% of the PI3P(+) liposomes, while they were completely absent in PI3P(-) liposomes. However, and unfortunately, the same deformations were not detectable in the experiments performed using gold particles shown in Figure 1C. In this regard, we think it possible that the gold-particle incubation procedure itself, or even the presence of the gold particles in the images, somehow “masks” the deformation effect.
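For readers unfamiliar with the colocalization measure mentioned in the response to point (7), the following is a minimal sketch of how Manders coefficients are commonly computed from two intensity channels. It is not the authors' analysis pipeline: the function name, the flattened-list input, the threshold parameters, and the assignment of channels to "1" and "2" are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def manders_coefficients(
    ch1: Sequence[float],
    ch2: Sequence[float],
    thr1: float = 0.0,
    thr2: float = 0.0,
) -> Tuple[float, float]:
    """Return (M1, M2) for two flattened, equally sized intensity images.

    M1: fraction of channel-1 intensity found in pixels where channel 2 > thr2.
    M2: fraction of channel-2 intensity found in pixels where channel 1 > thr1.
    The thresholds are hypothetical background cut-offs, not values from the study.
    """
    if len(ch1) != len(ch2):
        raise ValueError("both channels must have the same number of pixels")
    total1 = sum(ch1)
    total2 = sum(ch2)
    if total1 <= 0 or total2 <= 0:
        raise ValueError("each channel needs some positive intensity")
    # Sum each channel's intensity only over pixels where the other channel is present.
    coloc1 = sum(a for a, b in zip(ch1, ch2) if b > thr2)
    coloc2 = sum(b for a, b in zip(ch1, ch2) if a > thr1)
    return coloc1 / total1, coloc2 / total2


# Illustrative pixel values only (not data from the manuscript).
m1, m2 = manders_coefficients([0, 10, 20, 0], [5, 5, 0, 0])
print(f"M1 = {m1:.3f}, M2 = {m2:.3f}")
```

In practice the per-cell coefficients would be computed on background-corrected images within a cell mask and then summarized across cells; those steps are omitted here.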
Bibliography
Boukhalfa A, Roccio F, Dupont N, Codogno P, Morel E. 2021. The autophagy protein ATG16L1 cooperates with IFT20 and INPP5E to regulate the turnover of phosphoinositides at the primary cilium. Cell Rep 35:109045. doi:10.1016/j.celrep.2021.109045
Casañas A, Navarro A, Ferrer-Orta C, González D, Rodríguez JF, Verdaguer N. 2008. Structural Insights into the Multifunctional Protein VP3 of Birnaviruses. Structure 16:29–37. doi:10.1016/j.str.2007.10.023
Delgui LR, Rodriguez JF, Colombo MI. 2013. The Endosomal Pathway and the Golgi Complex Are Involved in the Infectious Bursal Disease Virus Life Cycle. J Virol 87:8993–9007. doi:10.1128/JVI.03152-12
Gimenez MC, Issa M, Sheth J, Colombo MI, Terebiznik MR, Delgui LR. 2020. Phosphatidylinositol 3-Phosphate Mediates the Establishment of Infectious Bursal Disease Virus Replication Complexes in Association with Early Endosomes. J Virol 95:e02313-20. doi:10.1128/jvi.02313-20
Hammond GRV, Machner MP, Balla T. 2014. A novel probe for phosphatidylinositol 4-phosphate reveals multiple pools beyond the Golgi. J Cell Biol 205:113–126. doi:10.1083/jcb.201312072
Khaldoun SA, Emond-Boisjoly MA, Chateau D, Carrière V, Lacasa M, Rousset M, Demignot S, Morel E. 2014. Autophagosomes contribute to intracellular lipid distribution in enterocytes. Mol Biol Cell 25:118. doi:10.1091/mbc.E13-06-0324
Luque D, Saugar I, Rejas MT, Carrascosa JL, Rodríguez JF, Castón JR. 2009. Infectious Bursal Disease Virus: Ribonucleoprotein Complexes of a Double-Stranded RNA Virus. J Mol Biol 386:891–901. doi:10.1016/j.jmb.2008.11.029
Morel E, Chamoun Z, Lasiecka ZM, Chan RB, Williamson RL, Vetanovetz C, Dall’Armi C, Simoes S, Point Du Jour KS, McCabe BD, Small SA, Di Paolo G. 2013. Phosphatidylinositol-3-phosphate regulates sorting and processing of amyloid precursor protein through the endosomal system. Nat Commun 4:1–13. doi:10.1038/ncomms3250
Qi X, Gao Y, Gao H, Deng X, Bu Z, Wang Xiaoyan, Fu C, Wang Xiaomei. 2007. An improved method for infectious bursal disease virus rescue using RNA polymerase II system. J Virol Methods 142:81–88. doi:10.1016/j.jviromet.2007.01.021
-
eLife Assessment
Zanetti et al use convincing biophysical and cellular assays to investigate the interaction of the birnavirus VP3 protein with the early endosome lipid PI3P. The study provides valuable insights and will be of interest to virologists. In future studies, it would be interesting to demonstrate that VP3-PI3P is a specific interaction and not a general interaction with other PIPs.
-
Reviewer #1 (Public review):
Summary:
Zanetti et al use biophysical and cellular assays to investigate the interaction of the birnavirus VP3 protein with the early endosome lipid PI3P. The major novel finding is that association of the VP3 protein with an anionic lipid (PI3P) appears to be important for viral replication, as evidenced through a cellular assay on FFUs.
Strengths:
The study supports previously published claims that VP3 associates with the early endosome membrane, potentially through binding to PI3P. The finding that mutating a single residue (R200) critically affects early endosome binding and that the same mutation also inhibits viral replication suggests a very important role for this binding in the viral life cycle.
Weaknesses:
The manuscript is relatively narrowly focused: the specifics of the bi-molecular interaction between the VP3 of an unusual avian virus and a host cell lipid (PI3P). Further, the affinity of this interaction is low and its specificity relative to other PIPs is not tested, leading to questions about whether VP3-PI3P binding is relevant.
-
Reviewer #3 (Public review):
Summary:
Infectious bursal disease virus (IBDV) is a birnavirus and an important avian pathogen. Interestingly, IBDV appears to be a unique dsRNA virus that uses early endosomes for RNA replication, a strategy that is more common among +ssRNA viruses such as SARS-CoV-2.
This work builds on previous studies showing that IBDV VP3 interacts with PI3P during virus replication. The authors provide further biophysical evidence for the interaction and map the interacting domain on VP3.
Strengths:
Detailed characterization of the interaction between VP3 and PI3P identified the R200D mutation as critical for the interaction. Cryo-EM data show that VP3 leads to membrane deformation.
Comments on revisions:
I have no further comments. The authors have addressed my questions and concerns. I congratulate the authors on their work!
-
-
www.biorxiv.org www.biorxiv.org
-
Reviewer #2 (Public Review):
When people help others is an important psychological and neuroscientific question. It has received much attention from the psychological side, but comparatively less from neuroscience. The paper translates some ideas from the social psychology domain to neuroscience using a neuroeconomically oriented computational approach. In particular, the paper is concerned with the idea that people help others based on perceptions of merit/deservingness, but also because they require/need help. To this end, the authors conduct two experiments with an overlapping participant pool:
(1) A social perception task in which people see images of people that have previously been rated on merit and need scales by other participants. In a blockwise fashion, people decide whether the depicted person a) deserves help, b) needs help, or c) uses both hands (control condition).
(2) In an altruism task, people make costly helping decisions by deciding between giving a certain amount of money to themselves or another person. It is manipulated how much the other person needs and deserves the money.
The authors use a sound and robust computational modelling approach for both tasks, based on evidence accumulation models. They analyse behavioural data for both tasks, showing that the behaviour is indeed influenced, as expected, by the deservingness and the need of the shown people. Neurally, the authors use a block-wise analysis approach to find differences in activity levels across conditions of the social perception task. The authors do find large activation clusters in areas related to theory of mind. Interestingly, they also find that activity in TPJ that relates to the deservingness condition correlates with people's deservingness ratings while they do the task, and also with computational parameters related to helping others in the second task, which was conducted many months later. Some behavioural parameters also correlate across the two tasks, suggesting that how deserving of help others are perceived to be reflects a relatively stable feature that translates into concrete helping decisions later on.
The conclusions of the paper are overall well supported by the data.
(1) I found that the modelling was done very thoroughly for both tasks. Overall, I had the impression that the methods are very solid with many supplementary analyses. The computational modelling is done very well.
(2) A slight caveat regarding this aspect, however, is that, in my view, the tasks are relatively simplistic, so that even the complex computational models cannot contribute as much as they could with more complex paradigms. For example, the bias term in the model seems to correspond to the mean response rate in a very direct way (please correct me if I am wrong).
(3) Related to the simple tasks: The fMRI data is analysed in a simple block-fashion. This is in my view not appropriate to discern the more subtle neural substrates of merit/need-based decision making or person perception. Correspondingly, the neural activation patterns (merit > control, need > control) are relatively broad and unspecific. They do not seem to differ in the classic theory of mind regions, that are the focus of the analyses.
(4) However, the relationship between neural signal and behavioural merit sensitivity in TPJ is noteworthy.
(5) The latter is even more the case, as the neural signal and aspects of the behaviour are correlated across subjects with the second task that is conducted much later. Such a correlation is very impressive and suggests that the tasks are sensitive to important individual differences in helping perception/behaviour.
(6) That being said, the number of participants in the latter analyses is at the lower end of what is used these days for across-participant correlations.
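The caveat in point (2) can be illustrated with a small calculation. The sketch below assumes a standard two-boundary diffusion (Wiener) model, which may differ from the evidence accumulation model the authors actually fitted; the function name and parameter names are hypothetical. It uses the textbook first-passage result for Brownian motion with drift to show that, when drift is zero, the probability of choosing the upper option equals the relative starting-point bias exactly, which is the sense in which a bias term can map directly onto the mean response rate.

```python
import math


def prob_upper(drift: float, bias: float, boundary: float = 1.0, noise: float = 1.0) -> float:
    """Probability that a diffusion process hits the upper boundary first.

    Assumed model (not taken from the manuscript): evidence starts at
    bias * boundary, with 0 < bias < 1, and diffuses between 0 and boundary
    with constant drift and noise standard deviation.
    """
    if not 0.0 < bias < 1.0:
        raise ValueError("bias must lie strictly between 0 and 1")
    if boundary <= 0.0 or noise <= 0.0:
        raise ValueError("boundary and noise must be positive")
    if abs(drift) < 1e-12:
        # Driftless case: the choice probability is the starting point itself.
        return bias
    k = 2.0 * drift / noise**2
    # (1 - exp(-k * z * a)) / (1 - exp(-k * a)), written with expm1 for accuracy.
    return math.expm1(-k * bias * boundary) / math.expm1(-k * boundary)


for z in (0.3, 0.5, 0.7):
    print(z, round(prob_upper(0.0, z), 3), round(prob_upper(0.5, z), 3))
```

With non-zero drift the mapping is no longer the identity, but the choice probability still increases monotonically with the bias, so in a task with few stimulus levels the two can be hard to separate.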
-
Reviewer #3 (Public Review):
Summary:
The paper aims to provide a neurocomputational account of how social perception translates into prosocial behaviors. Participants first completed a novel social perception task during fMRI scanning, in which they were asked to judge the merit or need of people depicted in different situations. Second, a separate altruistic choice task was used to examine how the perception of merit and need influences the weights people place on themselves, others and fairness when deciding to provide help. Finally, a link between perception and action was drawn in those participants who completed both tasks.
Strengths:
The paper is overall very well written and presented, leaving the reader at ease when describing complex methods and results. The approach used by the author is very compelling, as it combines computational modeling of behavior and neuroimaging data analyses. Despite not being able to comment on the computational model, I find the approach used (to disentangle sensitivity and biases, for merit and need) very well described and derived from previous theoretical work. Results are also clearly described and interpreted.
Weaknesses:
In the social perception task, merit and need are evaluated by means of very different cues that rely on different cognitive processes (more abstract thinking for merit than need). Despite this limitation of the task, the authors were able to argue convincingly in the revised version about the solidity of their findings. Sample size is quite small for study 2, nevertheless the results provide convincing evidence.
-
eLife assessment
These important findings stand out from other similar studies via some convincing demonstration of behavioural and neural relationships between two helping tasks – one focusing more on social perception, one more on its influence on social behaviour – that were performed more than 300 days apart. The claims however would be stronger with a larger sample size.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This study presents a useful finding that targeting amino acid metabolism can overcome Trastuzumab resistance in HER2+ breast cancer. The evidence supporting the claims of the authors is solid and the authors may want to validate their results in additional cell lines to strengthen their conclusions. Moreover, the authors should clarify the source of patient samples and why the manuscript focused on epigenetic regulations instead of major transcription factors. The work will be of interest to scientists working in the field of breast cancer.
-
Reviewer #1 (Public review):
Summary:
Hua et al show how targeting amino acid metabolism can overcome Trastuzumab resistance in HER2+ breast cancer.
Strengths:
The authors used metabolomics, transcriptomics and epigenomics approaches in vitro and in preclinical models to demonstrate how trastuzumab-resistant cells utilize cysteine metabolism.
Weaknesses:
However, there are some key aspects that need to be addressed.
Major:
(1) Patient Samples for Transcriptomic Analysis: It is unclear from the text whether tumor tissues or blood samples were used for the transcriptomic analysis. This distinction is crucial, as these two sample types would yield vastly different inferences. The authors should clarify the source of these samples.
(2) The study only tested one trastuzumab-resistant and one trastuzumab-sensitive cell line. It is unclear whether these findings are applicable to other HER2-positive tumor cell lines, such as HCC1954. The authors should validate their results in additional cell lines to strengthen their conclusions.
(3) Relevance to Metastatic Disease: Trastuzumab resistance often arises in patients during disease recurrence, which is frequently associated with metastasis. However, the mouse experiments described in this paper were conducted only in the primary tumors. This article would have more impact if the authors could demonstrate that the combination of Erastin or cysteine starvation with trastuzumab can also improve outcomes in metastasis models.
Minor:
(1) The figures lack information about the specific statistical tests used. Including this information is essential to show the robustness of the results.
(2) Figure 3K Interpretation: The significance asterisks in Figure 3K do not specify the comparison being made. Are they relative to the DMSO control? This should be clarified.
-
Reviewer #2 (Public review):
In this manuscript, Hua et al. proposed SLC7A11, a protein facilitating cellular cystine uptake, as a potential target for the treatment of trastuzumab-resistant HER2-positive breast cancer. If this claim holds true, the finding would be of significance and might be translated to clinical practice. Nevertheless, this reviewer finds that the conclusion was poorly supported by the data.
Notably, most of the data (Figures 2-6) were based on two cell lines - JIMT1 as a representative of trastuzumab-resistant cell line, and SKBR3 as a representative of trastuzumab sensitive cell line. As such, these findings could be cell-line specific while irrelevant to trastuzumab sensitivity at all. Furthermore, the authors claimed ferroptosis simply based on lipid peroxidation (Figure 3). Cell viability was not determined, and the rescuing effects of ferroptosis inhibitors were missing. The xenograft experiments were also suspicious (Figure 4). The description of how cysteine starvation was performed on xenograft tumors was lacking, and the compound (i.e., erastin) used by the authors is not suitable for in vivo experiments due to low solubility and low metabolic stability. Finally, it is confusing why the authors focused on epigenetic regulations (Figures 5 & 6), without measuring major transcription factors (e.g., NRF2, ATF4) which are known to regulate SLC7A11.
To sum up, this reviewer finds that the most valuable data in this manuscript is perhaps Figure 1, which provides unbiased information concerning the metabolic patterns in trastuzumab-sensitive and primary resistant HER2-positive breast cancer patients.
-
Author response:
Public Reviews:
Reviewer #1 (Public review):
Summary:
Hua et al show how targeting amino acid metabolism can overcome Trastuzumab resistance in HER2+ breast cancer.
Strengths:
The authors used metabolomics, transcriptomics and epigenomics approaches in vitro and in preclinical models to demonstrate how trastuzumab-resistant cells utilize cysteine metabolism.
Thank you for your valuable comments. We would like to extend our appreciation for your efforts. Your constructive suggestions will help improve our research.
Weaknesses:
However, there are some key aspects that need to be addressed.
Major:
(1) Patient Samples for Transcriptomic Analysis: It is unclear from the text whether tumor tissues or blood samples were used for the transcriptomic analysis. This distinction is crucial, as these two sample types would yield vastly different inferences. The authors should clarify the source of these samples.
Thank you for your valuable comments. In the transcriptomic analysis, we included the data of HER2-positive breast cancer patients who received trastuzumab in the I-SPY2 trial (GSE181574). Tumor tissues were used in this dataset.
(2) The study only tested one trastuzumab-resistant and one trastuzumab-sensitive cell line. It is unclear whether these findings are applicable to other HER2-positive tumor cell lines, such as HCC1954. The authors should validate their results in additional cell lines to strengthen their conclusions.
Thank you for your valuable comments. We agree with your opinion that exploring multiple cell lines would make our research findings more comprehensive. This is a limitation of our study, and we will continue to improve our design and methods in future experiments.
(3) Relevance to Metastatic Disease: Trastuzumab resistance often arises in patients during disease recurrence, which is frequently associated with metastasis. However, the mouse experiments described in this paper were conducted only in the primary tumors. This article would have more impact if the authors could demonstrate that the combination of Erastin or cysteine starvation with trastuzumab can also improve outcomes in metastasis models.
Thank you for your valuable comments. We agree with your suggestions. The exploration of metastatic disease would make our research more meaningful and help better address clinical key issues. In our future studies, we will continue to investigate the association between the invasive and metastatic capabilities of trastuzumab resistant HER2 positive breast cancer and cysteine metabolism.
Minor:
(1) The figures lack information about the specific statistical tests used. Including this information is essential to show the robustness of the results.
Thank you for your valuable comments. We will include the statistical information in our figure legends.
(2) Figure 3K Interpretation: The significance asterisks in Figure 3K do not specify the comparison being made. Are they relative to the DMSO control? This should be clarified.
Thank you for your valuable comments. We will clarify the comparison information in our figure legends.
Reviewer #2 (Public review):
In this manuscript, Hua et al. proposed SLC7A11, a protein facilitating cellular cystine uptake, as a potential target for the treatment of trastuzumab-resistant HER2-positive breast cancer. If this claim holds true, the finding would be of significance and might be translated to clinical practice. Nevertheless, this reviewer finds that the conclusion was poorly supported by the data.
Notably, most of the data (Figures 2-6) were based on two cell lines - JIMT1 as a representative of trastuzumab-resistant cell line, and SKBR3 as a representative of trastuzumab sensitive cell line. As such, these findings could be cell-line specific while irrelevant to trastuzumab sensitivity at all. Furthermore, the authors claimed ferroptosis simply based on lipid peroxidation (Figure 3). Cell viability was not determined, and the rescuing effects of ferroptosis inhibitors were missing. The xenograft experiments were also suspicious (Figure 4). The description of how cysteine starvation was performed on xenograft tumors was lacking, and the compound (i.e., erastin) used by the authors is not suitable for in vivo experiments due to low solubility and low metabolic stability. Finally, it is confusing why the authors focused on epigenetic regulations (Figures 5 & 6), without measuring major transcription factors (e.g., NRF2, ATF4) which are known to regulate SLC7A11.
To sum up, this reviewer finds that the most valuable data in this manuscript is perhaps Figure 1, which provides unbiased information concerning the metabolic patterns in trastuzumab-sensitive and primary resistant HER2-positive breast cancer patients.
Thank you for your valuable comments. We agree with your suggestions. Your feedback will help enhance the quality of our research.
(1) Our research was mainly conducted in JIMT1 (trastuzumab resistant) and SKBR3 (trastuzumab sensitive), and this is a limitation of our study. The experimental validation using different cell lines will make our research findings more persuasive. In our future research, we will continuously optimize experimental design and methods to make our findings more comprehensive.
(2) The detection of ferroptosis in our research was mainly performed by evaluating the lipid peroxidation. Experiments measuring cell viability and rescuing effects would help provide more evidence.
(3) In the xenograft experiments, cysteine starvation was performed by feeding a cysteine-free diet. The drug dissolution and other conditions were optimized by referring to previous relevant literature. We will clarify more details in our article.
(4) Epigenetic modifications have been recognized as crucial factors in drug resistance formation. An increasing number of studies have emphasized the importance of epigenetic changes in regulating the abnormal expression of oncogenes and tumor suppressor genes related to drug resistance. Currently, the role of epigenetic changes in the development of trastuzumab resistance in HER2 positive breast cancer is still in exploration. We tried to investigate the dysregulation of histone modifications and DNA methylation in trastuzumab resistant HER2 positive breast cancer. Our findings indicated that targeting H3K4me3 and DNA methylation could decrease SLC7A11 expression and induce ferroptosis. This would provide more evidence in exploring trastuzumab resistance mechanisms. We will provide a more detailed discussion in the article.
We would like to extend our appreciation for your constructive suggestions and continue to improve our research in future experiments.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This is an important study reporting that activation of the presynaptic GPR55 receptor suppresses synaptic transmission by modulating GABA release through the reduction of the readily releasable pool without affecting the presynaptic AP waveform and calcium influx. The evidence supporting this claim is compelling and based on an impressive array of techniques including patch-clamp recordings from the axon terminals of cerebellar Purkinje cells and fluorescent imaging of vesicular exocytosis. However, a few technical issues leave some questions open, these include uncertainty regarding the specificity of pharmacological agents and the nature of the endogenous process that would activate this pathway in vivo. In the current form, the evidence indicating that synaptic vesicles become insensitive to VGCC activation in the presence of GPR55 is weak and would need to be supported with additional experimental data.
-
Reviewer #1 (Public review):
In this manuscript, the authors report that GPR55 activation in presynaptic terminals of Purkinje cells decreases GABA release at the PC-DCN synapse. The authors use an impressive array of techniques (including highly challenging presynaptic recordings) to show that GPR55 activation reduces the readily releasable pool of vesicles without affecting presynaptic AP waveform and presynaptic Ca2+ influx. This is an interesting study, which is seemingly well-executed and proposes a novel mechanism for the control of neurotransmitter release. However, the authors' main conclusions are heavily, if not solely, based on pharmacological agents that more often than not demonstrate affinity at multiple targets. Below are points that the authors should consider in a revised version.
Major points:
(1) There is no clear evidence that GPR55 is specifically expressed in presynaptic terminals at the PC-DCN synapse. The authors cited Ryberg 2007 and Wu 2013 in the introduction, mentioning that GPR55 is potentially expressed in PCs. Ryberg (2007) offers no such evidence, and the expression in PC suggested by Wu (2013) does not necessarily correlate with presynaptic expression. The authors should perform additional experiments to demonstrate the presynaptic expression of GPR55 at PC-DCN synapse.
(2) The authors' conclusions rest heavily on pharmacological experiments, with compounds that are sometimes not selective for single targets. Genetic deletion of GPR55 would be a more appropriate control. The authors should also expand their experiments with occlusion experiments, showing if the effects of LPI are absent after AM251 or O-1602 treatment. In addition, the authors may want to consider AM281 as a CB1R antagonist without reported effects at GPR55.
(3) It is not clear how long the different drugs were applied, and at what time the recordings were performed during or following drug application. It appears that GPR55 agonists can have transient effects (Sylantyev, 2013; Rosenberg, 2023), possibly due to receptor internalization. The timeline of drug application should be reported, where IPSC amplitude is shown as a function of time and drug application windows are illustrated.
(4) A previous investigation on the role of GPR55 in the control of neurotransmitter release is neither cited nor discussed: Sylantyev et al. (2013, PNAS, Cannabinoid- and lysophosphatidylinositol-sensitive receptor GPR55 boosts neurotransmitter release at central synapses). Similarities and differences should be discussed.
Minor point:
(1) What is the source of LPI? What isoform was used? The multiple isoforms of LPI have different affinities for GPR55.
-
Reviewer #2 (Public review):
Summary:
This paper investigates the mode of action of GPR55, a relatively understudied type of cannabinoid receptor, in presynaptic terminals of Purkinje cells. The authors use demanding techniques of patch clamp recording of the terminals, sometimes coupled with another recording of the postsynaptic cell. They find a lower release probability of synaptic vesicles after activation of GPR55 receptors, while presynaptic voltage-dependent calcium currents are unaffected. They propose that the size of a specific pool of synaptic vesicles supplying release sites is decreased upon activation of GPR55 receptors.
Strengths:
The paper uses cutting-edge techniques to shed light on a little-studied, potentially important type of cannabinoid receptor. The results are clearly presented, and the conclusions are for the most part sound.
Weaknesses:
The nature of the vesicular pool that is modified following activation of GPR55 is not definitively characterized.
-
Reviewer #3 (Public review):
Summary:
Inoshita and Kawaguchi investigated the effects of GPR55 activation on synaptic transmission in vitro. To address this question, they performed direct patch-clamp recordings from axon terminals of cerebellar Purkinje cells and fluorescent imaging of vesicular exocytosis utilizing synapto-pHluorin. They found that exogenous activation of GPR55 suppresses GABA release at Purkinje cell to deep cerebellar nuclei (PC-DCN) synapses by reducing the readily releasable pool (RRP) of vesicles. This mechanism may also operate at other synapses.
Strengths:
The main strength of this study lies in combining patch-clamp recordings from axon terminals with imaging of presynaptic vesicular exocytosis to reveal a novel mechanism by which activation of GPR55 suppresses inhibitory synaptic strength. The results strongly suggest that GPR55 activation reduces the RRP size without altering presynaptic calcium influx.
Weaknesses:
The study relies on the exogenous application of GPR55 agonists. It remains unclear whether endogenous ligands released due to physiological or pathological activities would have similar effects. There is no information regarding the time course of the agonist-induced suppression. There is also little evidence that GPR55 is expressed in Purkinje cells. This study would benefit from using GPR55 knockout (KO) mice. The downstream mechanism by which GPR55 mediates the suppression of GABA release remains unknown.
-
Author response:
Public Reviews:
Reviewer #1 (Public review):
In this manuscript, the authors report that GPR55 activation in presynaptic terminals of Purkinje cells decreases GABA release at the PC-DCN synapse. The authors use an impressive array of techniques (including highly challenging presynaptic recordings) to show that GPR55 activation reduces the readily releasable pool of vesicles without affecting presynaptic AP waveform and presynaptic Ca2+ influx. This is an interesting study, which is seemingly well-executed and proposes a novel mechanism for the control of neurotransmitter release. However, the authors' main conclusions are heavily, if not solely, based on pharmacological agents that more often than not demonstrate affinity at multiple targets. Below are points that the authors should consider in a revised version.
We thank the reviewer for the encouraging comments, and will fully address the reviewer’s concerns as detailed below.
Major points:
(1) There is no clear evidence that GPR55 is specifically expressed in presynaptic terminals at the PC-DCN synapse. The authors cited Ryberg 2007 and Wu 2013 in the introduction, mentioning that GPR55 is potentially expressed in PCs. Ryberg (2007) offers no such evidence, and the expression in PCs suggested by Wu (2013) does not necessarily correlate with presynaptic expression. The authors should perform additional experiments to demonstrate the presynaptic expression of GPR55 at the PC-DCN synapse.
We agree with the reviewer that the present manuscript lacks evidence for the localization of GPR55 at PC axon terminals. Frankly, our previous attempts to immunolabel GPR55 did not work well. We now realize that other antibodies are commercially available, and we are going to test them. We hope to show, in the revised manuscript, immunocytochemical images of GPR55 at PC terminals.
(2) The authors' conclusions rest heavily on pharmacological experiments, with compounds that are sometimes not selective for single targets. Genetic deletion of GPR55 would be a more appropriate control. The authors should also expand their experiments with occlusion experiments, showing if the effects of LPI are absent after AM251 or O-1602 treatment. In addition, the authors may want to consider AM281 as a CB1R antagonist without reported effects at GPR55.
We thank the reviewer for pointing out the essential issue of the specificity of GPR55 activation in our study. Regarding direct manipulation of GPR55, such as genetic deletion, we will try acute knock-down of its expression, given the possibility of compensation, which sometimes occurs after complete knock-out. In addition, following the reviewer’s suggestion, we will examine whether the effects of LPI and AM251 occlude each other, and we will perform control experiments to show the lack of CB1R involvement.
(3) It is not clear how long the different drugs were applied, and at what time the recordings were performed during or following drug application. It appears that GPR55 agonists can have transient effects (Sylantyev, 2013; Rosenberg, 2023), possibly due to receptor internalization. The timeline of drug application should be reported, where IPSC amplitude is shown as a function of time and drug application windows are illustrated.
As suggested, the timing and duration of drug application will be indicated together with the time course of changes of IPSC amplitudes. This change will make things much clearer. Thank you for the suggestion.
(4) A previous investigation of the role of GPR55 in the control of neurotransmitter release is neither cited nor discussed: Sylantyev et al. (2013, PNAS), "Cannabinoid- and lysophosphatidylinositol-sensitive receptor GPR55 boosts neurotransmitter release at central synapses". Similarities and differences should be discussed.
We apologize for omitting this important study from our citations and discussion. In the revised version, we will discuss their findings in relation to our data.
Minor point:
(1) What is the source of LPI? What isoform was used? The multiple isoforms of LPI have different affinities for GPR55.
We apologize for the insufficient description of the LPI used in our study. We used LPI derived from soy (Merck, catalog #L7635), which was estimated to contain 58% C16:0 and 42% C18:0 or C18:2 LPI. This information will be added to the Materials and Methods in the revised manuscript.
Reviewer #2 (Public review):
Summary:
This paper investigates the mode of action of GPR55, a relatively understudied type of cannabinoid receptor, in presynaptic terminals of Purkinje cells. The authors use demanding techniques of patch clamp recording of the terminals, sometimes coupled with another recording of the postsynaptic cell. They find a lower release probability of synaptic vesicles after activation of GPR55 receptors, while presynaptic voltage-dependent calcium currents are unaffected. They propose that the size of a specific pool of synaptic vesicles supplying release sites is decreased upon activation of GPR55 receptors.
Strengths:
The paper uses cutting-edge techniques to shed light on a little-studied, potentially important type of cannabinoid receptor. The results are clearly presented, and the conclusions are for the most part sound.
We are really happy to hear the encouraging comments from the reviewer.
Weaknesses:
The nature of the vesicular pool that is modified following activation of GPR55 is not definitively characterized.
During revision, we will perform further analysis and additional experiments to gain as much insight as possible into the vesicle pools affected by GPR55.
Reviewer #3 (Public review):
Summary:
Inoshita and Kawaguchi investigated the effects of GPR55 activation on synaptic transmission in vitro. To address this question, they performed direct patch-clamp recordings from axon terminals of cerebellar Purkinje cells and fluorescent imaging of vesicular exocytosis utilizing synapto-pHluorin. They found that exogenous activation of GPR55 suppresses GABA release at Purkinje cell to deep cerebellar nuclei (PC-DCN) synapses by reducing the readily releasable pool (RRP) of vesicles. This mechanism may also operate at other synapses.
Strengths:
The main strength of this study lies in combining patch-clamp recordings from axon terminals with imaging of presynaptic vesicular exocytosis to reveal a novel mechanism by which activation of GPR55 suppresses inhibitory synaptic strength. The results strongly suggest that GPR55 activation reduces the RRP size without altering presynaptic calcium influx.
We thank the reviewer for the positive evaluation of our conclusions.
Weaknesses:
The study relies on the exogenous application of GPR55 agonists. It remains unclear whether endogenous ligands released due to physiological or pathological activities would have similar effects. There is no information regarding the time course of the agonist-induced suppression. There is also little evidence that GPR55 is expressed in Purkinje cells. This study would benefit from using GPR55 knockout (KO) mice. The downstream mechanism by which GPR55 mediates the suppression of GABA release remains unknown.
We agree with the reviewer on all the points raised as weaknesses. Most of these issues will be clarified by the additional experiments and analyses described above in response to the other reviewers. Whether endogenous GPR55 ligands cause synaptic depression, and through which downstream mechanism, are very important questions; we will discuss these points in the revised manuscript and would like to address them in future studies.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
Overall, this is an important work: the new methodology of hamFISH is a key additional tool for the assessment of the expression of multiple genes simultaneously. The authors provide convincing evidence of the utility of this approach on Medial Amygdala (MeA) tissue, leveraging a previous transcriptomic dataset for gene selection. The authors also present a deeper dive into putative relationships between the on-tissue expression of subsets of genes and connectivity and behavioral regulation. The putative biological insights are intriguing, although preliminary, but notably they set up questions for future studies.
-
Reviewer #1 (Public review):
In their paper entitled "Combined transcriptomic, connectivity, and activity profiling of the medial amygdala using highly amplified multiplexed in situ hybridization (hamFISH)" Edwards et al. present a new method designated as hamFISH (highly amplified multiplexed in situ hybridization) that enables sequential detection of ≤32 genes using multiplexed branched DNA amplification. As proof-of-principle, the authors apply the new technique - in conjunction with connectivity, and activity profiling - to the medial amygdala (MeA) of the mouse, which is a critical nucleus for innate social and defensive behaviors.
As mentioned by Edwards et al., hamFISH could prove beneficial as an affordable alternative to other in situ transcriptomic methods, including commercial platforms, that are resource-intensive and require complex analysis pipelines. Thus, the authors envision that the method they present could democratize in situ cell-type identification in individual laboratories.
The data presented by Edwards et al. is convincing. The authors use the appropriate and validated methodology in line with the current state-of-the-art. The paper makes a strong case for the benefits of hamFISH when combining transcriptomics studies with connectivity tracing and immediate early gene-based activity profiling. Notably, the authors also discuss the caveats and limitations of their study/approach in an open and transparent manner.
In its current state, the manuscript touches upon a number of most intriguing, yet rather preliminary findings. For example, the roles of inhibitory neuron cluster i3 or of the selective and apparently MeA neuron-specific projections (Figure 3 - Figure Supplement 2D) remain elusive. As it is the authors' prime intent to provide "a proof-of-principle example of overlaying transcriptomic types, projection, and activity in a behaviorally relevant manner and demonstrates the usefulness of hamFISH in multiplexed in situ gene expression profiling", such studies might be beyond the scope of the present manuscript. The absence of such more in-depth hypothesis-based analysis, however, prevents an even more enthusiastic overall assessment.
-
Reviewer #2 (Public review):
Summary:
The authors describe the development and implementation of hamFISH, a sensitive multiplexed ISH method. They leverage a pre-existing scRNA-seq dataset for the MeA to design 32 probes that combinatorically represent MeA neuronal populations - ~80% of MeA neurons express three of these markers. Using these markers to assess the spatial organization of the MeA, the authors identify a novel population of Ndnf+ projection neurons and characterize their connectivity with anterograde and retrograde labeling. They additionally combine hamFISH with CTB labeling of three principal MeA projection sites to show that 75% of MeA neurons have only a single projection target. Finally, they engage adult male mice in encounters with other adult males (aggression), females (mating), and pups (infanticide), followed by hamFISH and c-fos labeling to relate cell identity to behavior. Their overall conclusion is that hamFISH-defined cell types are broadly active to multiple sensory stimuli. However, the data presented are not sufficient to conclude that no selectivity exists within the MeA. A weakness of the study is that the selected hamFISH genes contain only Lhx6 as a lineage-marking transcription factor. Instead, the authors predominately use neuropeptides as markers. Genes such as Tac1, Cartpt, Adcyap1, Calb1, and Gal are expressed throughout the MeA, and many other brain regions; they are not restricted to a single transcriptomic cell type and they do not denote any developmental origins. By design, the panel has low cell type specificity as all MeA neurons express at least three of the genes. Therefore, the authors' conclusions may not hold with a more stringent classification of cell type or cell identity.
-
Reviewer #3 (Public review):
Summary:
In this manuscript, Edwards et al. describe hamFISH, a customizable and cost-efficient method for performing targeted spatial transcriptomics. hamFISH utilizes highly amplified multiplexed branched DNA amplification, and the authors extensively describe hamFISH development and its advantages over prior variants of this approach.
The authors then used hamFISH to investigate an important circuit in the mouse brain for social behavior, the medial amygdala (MeA). To develop a hamFISH probe set capable of distinguishing MeA neurons, the authors mined published single-cell RNA-sequencing datasets of the MeA, ultimately creating a panel of 32 hamFISH probes that mostly cover the identified MeA cell types. They evaluated over 600,000 MeA cells and classified neurons into 16 inhibitory and 10 excitatory types, many of which are spatially clustered. The authors combined hamFISH with viral and other circuit tracer injections to determine whether the identified MeA cell populations sent and/or received unique inputs from connected brain regions, finding evidence that several cell types had unique patterns of input and output. Finally, the authors performed hamFISH on the brains of male mice that were placed in behavioral conditions that elicit aggressive, infanticidal, or mating behaviors, finding that some cell populations are selectively activated (as assessed by c-fos mRNA expression) in specific social contexts.
Strengths:
(1) The authors developed an optimized tissue preparation protocol for hamFISH and implemented oligopools instead of individually synthesized oligonucleotides to reduce costs. The branched DNA amplification scheme improved smFISH signal compared to previous methods, and multiple variants provide additional improvements in signal intensity and specificity. Compared to other spatial transcriptomics methods, the pipeline for imaging and analysis is streamlined and is compatible with other techniques like fluorescence-based circuit tracing. This approach is cost-effective and has several advantages that make it a valuable addition to the list of spatial transcriptomics toolkits.
(2) Using 31 probes, hamFISH was able to detect 16 inhibitory and 10 excitatory neuron types in the MeA subregions, including the vast majority of cell types identified by other transcriptomics approaches. The authors quantified the distributions of these cell types along the anterior-posterior, dorsal-ventral, and medial-lateral axes, finding spatial segregation among some, but not all, MeA excitatory and inhibitory cell types. The authors additionally identified a class of inhibitory neurons expressing Ndnf (and a subset of these that express Chrna7) that project to multiple social chemosensory circuits.
(3) The authors combined hamFISH with MeA input and output mapping, finding cell-type biases in the projections to the MPOA, BNST, and VMHvl, and inputs from multiple regions.
(4) The authors identified excitatory and inhibitory cell types, and patterns of activity across cell types, that were selectively activated during various social behaviors, including aggression, mating, and infanticide, providing new insights and avenues for future research into MeA circuit function.
Weaknesses:
(1) Gene selection for hamFISH is likely to still be a limiting factor, even with the expanded (32-probe) capacity. This may have contributed to the lack of ability to identify sexually dimorphic cell types (Figure S2B). This is an expected tradeoff for a method that has major advantages in terms of cost and adaptability.
(2) Adapting hamFISH, for example to other brain regions or tissues, may require extensive optimization.
(3) Pairing this method with behavioral experiments is likely to require further optimization, as c-fos mRNA expression is an indirect and incomplete survey of neuronal activity (e.g. not all cell types upregulate c-fos when electrically active). As such, there is a risk of false negative results that limit its utility for understanding circuit function.
(4) The limited compatibility of hamFISH with thicker tissue samples and lack of optical sectioning introduce additional technical limitations. For example, it would be difficult to densely sample larger neural circuits using serial 20 micron sections. Also, because the imaging modality is not clear from the methods, it is difficult to know whether the analysis methods introduce the risk of misattributing gene expression to overlapping cells.
-
Author response:
Reviewer #1:
In their paper entitled "Combined transcriptomic, connectivity, and activity profiling of the medial amygdala using highly amplified multiplexed in situ hybridization (hamFISH)" Edwards et al. present a new method designated as hamFISH (highly amplified multiplexed in situ hybridization) that enables sequential detection of ≤32 genes using multiplexed branched DNA amplification. As proof-of-principle, the authors apply the new technique - in conjunction with connectivity, and activity profiling - to the medial amygdala (MeA) of the mouse, which is a critical nucleus for innate social and defensive behaviors.
As mentioned by Edwards et al., hamFISH could prove beneficial as an affordable alternative to other in situ transcriptomic methods, including commercial platforms, that are resource-intensive and require complex analysis pipelines. Thus, the authors envision that the method they present could democratize in situ cell-type identification in individual laboratories.
The data presented by Edwards et al. is convincing. The authors use the appropriate and validated methodology in line with the current state-of-the-art. The paper makes a strong case for the benefits of hamFISH when combining transcriptomics studies with connectivity tracing and immediate early gene-based activity profiling. Notably, the authors also discuss the caveats and limitations of their study/approach in an open and transparent manner.
In its current state, the manuscript touches upon a number of most intriguing, yet rather preliminary findings. For example, the roles of inhibitory neuron cluster i3 or of the selective and apparently MeA neuron-specific projections (Figure 3 - Figure Supplement 2D) remain elusive. As it is the authors' prime intent to provide "a proof-of-principle example of overlaying transcriptomic types, projection, and activity in a behaviorally relevant manner and demonstrates the usefulness of hamFISH in multiplexed in situ gene expression profiling", such studies might be beyond the scope of the present manuscript. The absence of such more in-depth hypothesis-based analysis, however, prevents an even more enthusiastic overall assessment.
We thank the reviewer for their positive assessment and agree that further studies are needed to explore and understand the MeA circuit further.
Reviewer #2:
The authors describe the development and implementation of hamFISH, a sensitive multiplexed ISH method. They leverage a pre-existing scRNA-seq dataset for the MeA to design 32 probes that combinatorically represent MeA neuronal populations - ~80% of MeA neurons express three of these markers. Using these markers to assess the spatial organization of the MeA, the authors identify a novel population of Ndnf+ projection neurons and characterize their connectivity with anterograde and retrograde labeling. They additionally combine hamFISH with CTB labeling of three principal MeA projection sites to show that 75% of MeA neurons have only a single projection target. Finally, they engage adult male mice in encounters with other adult males (aggression), females (mating), and pups (infanticide), followed by hamFISH and c-fos labeling to relate cell identity to behavior. Their overall conclusion is that hamFISH-defined cell types are broadly active to multiple sensory stimuli. However, the data presented are not sufficient to conclude that no selectivity exists within the MeA. A weakness of the study is that the selected hamFISH genes contain only Lhx6 as a lineage-marking transcription factor. Instead, the authors predominately use neuropeptides as markers. Genes such as Tac1, Cartpt, Adcyap1, Calb1, and Gal are expressed throughout the MeA, and many other brain regions; they are not restricted to a single transcriptomic cell type and they do not denote any developmental origins. By design, the panel has low cell type specificity as all MeA neurons express at least three of the genes. Therefore, the authors' conclusions may not hold with a more stringent classification of cell type or cell identity.
We agree with the reviewer that a deeper level of cell type classification may reveal the selectivity of cell types that may have been missed. The design of our hamFISH bridge-readout probes allows modification to be compatible with a barcoded readout system such as MERFISH, which would substantially increase the number of genes that can be included in the gene panel. This would, however, increase the complexity of the analysis pipeline and reduce throughput, but would be a potential avenue to explore to define MeA cell types at a deeper level. An advantage of hamFISH is the ease of including and reading out alternative gene panels. For example, one panel could examine developmental-lineage-specific genes. Overall, our panel captures the highest hierarchical level (similar to the subclass level of the Allen taxonomy) of MeA transcriptomic types, based on published data available at the time of our gene panel design. Genes including Tac1, Cartpt, Adcyap1, Calb1, and Gal are expressed in specific patterns within the MeA and are useful for classification. In the original manuscript, we also included our rationale for dropping Foxp2, a lineage-specific marker gene in the MeA.
Reviewer #3:
In this manuscript, Edwards et al. describe hamFISH, a customizable and cost-efficient method for performing targeted spatial transcriptomics. hamFISH utilizes highly amplified multiplexed branched DNA amplification, and the authors extensively describe hamFISH development and its advantages over prior variants of this approach.
The authors then used hamFISH to investigate an important circuit in the mouse brain for social behavior, the medial amygdala (MeA). To develop a hamFISH probe set capable of distinguishing MeA neurons, the authors mined published single-cell RNA-sequencing datasets of the MeA, ultimately creating a panel of 32 hamFISH probes that mostly cover the identified MeA cell types. They evaluated over 600,000 MeA cells and classified neurons into 16 inhibitory and 10 excitatory types, many of which are spatially clustered. The authors combined hamFISH with viral and other circuit tracer injections to determine whether the identified MeA cell populations sent and/or received unique inputs from connected brain regions, finding evidence that several cell types had unique patterns of input and output. Finally, the authors performed hamFISH on the brains of male mice that were placed in behavioral conditions that elicit aggressive, infanticidal, or mating behaviors, finding that some cell populations are selectively activated (as assessed by c-fos mRNA expression) in specific social contexts.
Strengths:
(1) The authors developed an optimized tissue preparation protocol for hamFISH and implemented oligopools instead of individually synthesized oligonucleotides to reduce costs. The branched DNA amplification scheme improved smFISH signal compared to previous methods, and multiple variants provide additional improvements in signal intensity and specificity. Compared to other spatial transcriptomics methods, the pipeline for imaging and analysis is streamlined and is compatible with other techniques like fluorescence-based circuit tracing. This approach is cost-effective and has several advantages that make it a valuable addition to the list of spatial transcriptomics toolkits.
(2) Using 31 probes, hamFISH was able to detect 16 inhibitory and 10 excitatory neuron types in the MeA subregions, including the vast majority of cell types identified by other transcriptomics approaches. The authors quantified the distributions of these cell types along the anterior-posterior, dorsal-ventral, and medial-lateral axes, finding spatial segregation among some, but not all, MeA excitatory and inhibitory cell types. The authors additionally identified a class of inhibitory neurons expressing Ndnf (and a subset of these that express Chrna7) that project to multiple social chemosensory circuits.
(3) The authors combined hamFISH with MeA input and output mapping, finding cell-type biases in the projections to the MPOA, BNST, and VMHvl, and inputs from multiple regions.
(4) The authors identified excitatory and inhibitory cell types, and patterns of activity across cell types, that were selectively activated during various social behaviors, including aggression, mating, and infanticide, providing new insights and avenues for future research into MeA circuit function.
Weaknesses:
(1) Gene selection for hamFISH is likely to still be a limiting factor, even with the expanded (32-probe) capacity. This may have contributed to the lack of ability to identify sexually dimorphic cell types (Figure S2B). This is an expected tradeoff for a method that has major advantages in terms of cost and adaptability.
We recognise that 32-plex gene detection might not be sufficient to address key questions in the transcriptomic organization of innate social behavior circuits, and that the study fell short of addressing more quantitative gene expression differences between sexes. Detecting sexually dimorphic gene expression likely requires a more targeted approach, because the dimorphism lies in quantitative expression differences rather than in binary expression of marker genes, and the gene panel needs to be specifically configured for this purpose.
(2) Adapting hamFISH, for example to other brain regions or tissues, may require extensive optimization.
We have successfully performed hamFISH on at least two other mouse brain regions without needing to optimize further, suggesting that compatibility with other mouse brain regions is not an issue. We recognise, however, that optimization of hamFISH may be required for its application in other types of tissue or species. Human brain tissue, for example, typically suffers from high autofluorescence, and different tissue preparation methods may need to be employed. We note that the signal boost provided by the hamFISH v2 amplifiers may be useful to this end.
(3) Pairing this method with behavioral experiments is likely to require further optimization, as c-fos mRNA expression is an indirect and incomplete survey of neuronal activity (e.g. not all cell types upregulate c-fos when electrically active). As such, there is a risk of false negative results that limit its utility for understanding circuit function.
We acknowledge that c-fos is not the only readout of neuronal activity and that a panel of immediate early genes would allow a more comprehensive readout of activity-dependent gene expression. We fully agree that immediate early gene induction is an indirect readout of neural activity, and alternative methods such as in vivo physiology would provide a complementary insight into the selectivity of MeA neuron responses.
(4) The limited compatibility of hamFISH with thicker tissue samples and lack of optical sectioning introduce additional technical limitations. For example, it would be difficult to densely sample larger neural circuits using serial 20 micron sections. Also, because the imaging modality is not clear from the methods, it is difficult to know whether the analysis methods introduce the risk of misattributing gene expression to overlapping cells.
We agree that the use of hamFISH as described here is restricted to thin (<20 µm) sections. We have shown, however, that our encoding probe and bridge-readout probe design are compatible with HCR-based mRNA detection, which is compatible with thicker sections. Regarding the misattribution of gene expression to overlapping cells in the z-axis, we used epifluorescence microscopy with 14 × 500 nm z-steps to collect our raw data and generated maximum intensity projections for further analysis. Because thin sections (10 µm) were used for imaging, the overlap between cells in z is expected to be minimal. Regarding throughput, we agree that hamFISH is likely not suitable for brain-wide questions that require large volume coverage, but its major advantage is that it allows routine use of low-level multiplexing for targeted brain areas.
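For readers unfamiliar with the projection step mentioned in this response, the following is a minimal sketch of a maximum intensity projection, assuming the z-stack is held as a NumPy array of shape (z, y, x). The function name `max_intensity_projection` and the synthetic data are hypothetical illustrations and are not taken from the authors' analysis pipeline.

```python
import numpy as np


def max_intensity_projection(stack: np.ndarray) -> np.ndarray:
    """Collapse a (z, y, x) stack to (y, x) by keeping the brightest value along z."""
    if stack.ndim != 3:
        raise ValueError("expected a 3-D array of shape (z, y, x)")
    return stack.max(axis=0)


# Synthetic stand-in for real data: 14 z-planes (the number quoted in the
# response) over a tiny 4 x 5 pixel field of view. Values are made up.
rng = np.random.default_rng(0)
stack = rng.integers(0, 100, size=(14, 4, 5), dtype=np.uint16)
stack[6, 2, 3] = 1000  # one bright spot placed in a single z-plane

mip = max_intensity_projection(stack)
```

Because the projection keeps only the brightest value at each (y, x) position, the depth at which a signal originated is lost. Signals from cells stacked along z would therefore be merged, which is why the section thickness matters for the misattribution concern raised by the reviewer.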
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
In this valuable contribution, the authors present an approach based on a complex systems theoretical framework to characterize diet-host-microbe interactions and to develop targeted bacteriotherapies using a three-phase workflow. Overall, the solid results provide a reference for microbial community research and insights to guide future studies. However, the theoretical systems approach would benefit from further description, and some claims regarding oxalate bacterial metabolism in complex microbial communities could be strengthened. This study will interest researchers working on gut microbiomes, specifically those seeking to modulate host-microbial interactions.
-
Reviewer #1 (Public review):
Summary:
This study experimentally examined diet-microbe-host interactions through a complex systems framework, centered on dietary oxalate. Multiple independent molecular, animal, and in vitro experimental models were used in this research. The authors found that microbiome composition influenced multiple oxalate-microbe-host interfaces. Oxalobacter formigenes was effective only against a microbiota background with poor oxalate-degrading capacity, a finding that gives critical new insight into why clinical intervention trials with this species exhibit variable outcomes. The data suggest that, while heterogeneity in the microbiome impacts multiple diet-host-microbe interfaces, metabolic redundancy among diverse microorganisms in specific diet-microbe axes is a critical variable that may impact the efficacy of bacteriotherapies, which can help guide patient and probiotic selection criteria in probiotic clinical trials.
Strengths:
The paper has made significant progress in both the depth and breadth of scientific research by systematically comparing multiple experimental methods across multiple dimensions. Particularly through in-depth analysis from the enzymatic perspective, it has not only successfully identified several key strains and redundant genes, which is of great significance for understanding the functions of enzymes, the characteristics of strains, and the mechanisms of genes in microbial communities, but also provided a valuable reference for subsequent experimental design and theoretical research.
More importantly, the establishment of a novel research approach to probiotics and gut microbiota in this paper represents a major contribution to the current research field. This new approach not only breaks through the limitations of traditional research but also offers new perspectives and strategies for the screening and optimization of probiotics and for the regulation of gut microbiota balance. This holds potentially significant value for improving human health and for the prevention and treatment of related diseases.
Weaknesses:
While the study has excellently examined the overall changes in microbial community structure and the functions of individual bacteria, it lacks a focused investigation on the metabolic cross-feeding relationships between oxalate-degrading bacteria and related microorganisms, failing to provide a foundational microbial community or model for future research. Although this paper conducts a detailed study on oxalate metabolism, it would be beneficial to visually present the enrichment of different microbial community structures in metabolic pathways using graphical models.
Furthermore, the authors have done a commendable job in studying the roles of key bacteria. If the interactions and effects of upstream and downstream metabolically related bacteria could be integrated, it would provide readers with even more meaningful information. By illustrating how these bacteria interact within the metabolic network, readers can gain a deeper understanding of the complex ecological and functional relationships within microbial communities. Such an integrated approach would not only enhance the scientific value of the study but also facilitate future research in this area.
-
Reviewer #2 (Public review):
Summary:
Using the well-studied oxalate-microbiome-host system, the authors propose a novel conceptual and experimental framework for developing targeted bacteriotherapies using a three-phase pre-clinical workflow. The third phase is based on a 'complex system theoretical approach' in which multi-omics technologies are combined in independent in vivo and in vitro models to successfully identify the most pertinent variables that influence specific phenotypes in diet-host-microbe systems. The innovation relies on the third phase since phase I and phase II are the dominant approaches everyone in the microbiome field uses.
Strengths:
The authors used a multidisciplinary approach which included:
(1) fecal transplant of two distinct microbial communities into Swiss-Webster mice (SWM) to characterize the host response (hepatic response-transcriptomics) and microbial activity (untargeted metabolomics of the stool samples) to different oxalate concentrations;
(2) longitudinal analysis of the N. albigula gut microbiome composition in response to varying concentrations of oxalate by shotgun metagenomics, with deep bioinformatic analyses of the genomes assembled; and
(3) development of synthetic microbial communities around oxalate metabolism and evaluation of these communities' activity in oxalate degradation in vivo.
Weaknesses:
However, I have concerns about the frame the authors tried to provide for a 'complex system theoretical approach' and how the data are interpreted within this frame. Several of the conclusions the authors provide do not seem to have sufficient data to support them.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This important study investigates the function of a critical regulator of human early cardiac development. The convincing examination of GATA6 function is thorough and well-executed. The study will be of interest to scientists working on how the human heart acquires its identity.
-
Reviewer #1 (Public review):
Summary:
This is a comprehensive study that clearly and deeply investigates the function of GATA6 in human early cardiac development.
Strengths:
This study combines hESC engineering, differentiation, detailed gene expression, genome occupancy, and pathway modulation to elucidate the role of GATA6 in early cardiac differentiation. The work is carefully executed and the results support the conclusions. The use of publicly available data is well integrated throughout the manuscript. The RIME experiments are excellent.
Weaknesses:
Much has been known about GATA6 in mesendoderm development, and this is acknowledged by the authors.
Comments on revised version:
The authors have addressed my comments appropriately.
-
Reviewer #2 (Public review):
Summary:
This manuscript by Bisson et al. describes the role of GATA6 in regulating cardiac progenitor cell (CPC) specification and cardiomyocyte (CM) generation using human embryonic stem cells (hESCs). The authors found that GATA6 loss-of-function hESCs exhibit early defects in mesendoderm and lateral mesoderm patterning stages. Using RNA-seq and CUT&RUN assays, the genes of the Wnt and BMP programs were found to be affected by the loss of GATA6 expression. Modulating Wnt and BMP during early cardiac differentiation can partially rescue CPC and CM defects in GATA6 hetero- and homozygous mutant hESCs.
Strengths:
The studies performed were rigorous and the rationale for the experimental design was logical. The results obtained were clear and support the conclusions that the authors made regarding the role of GATA6 on Wnt and BMP pathway gene expression.
Weaknesses:
Given the wealth of studies that have been performed in this research area previously, the amount of new information provided in this study is relatively modest. Nevertheless, the results are quite clear and should make a strong contribution to the field.
Comments on revised version:
The authors have addressed the prior request to assess gene expression representing each stage of development/differentiation from mesoderm to cardiac progenitor to cardiomyocytes and confirmed that the differentiation defect lies at the cardiac progenitor and cardiomyocyte stages and not in mesodermal differentiation. This work has significantly improved the robustness of the study.
-
Reviewer #3 (Public review):
In this study, Bisson et al. analyzed the role of the GATA6 transcription factor in patterning the early mesoderm and generating cardiomyocytes, using human embryonic stem cell differentiation assays and patient-derived hiPSCs with heart defects associated with mutations in the GATA6 gene. They identified a novel role for GATA6 in regulating genes involved in the WNT and BMP pathways. Modulation of the WNT and BMP pathways partially rescues early cardiac mesoderm defects in GATA6 mutant hESCs. These results provide significant insights into how GATA6 loss-of-function and heterozygous mutations contribute to heart defects.
Comments on revised version:
The authors have addressed all the concerns, using new data and modifications to the text to further strengthen the manuscript.
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
This is a comprehensive study that clearly and deeply investigates the function of GATA6 in human early cardiac development.
Strengths:
This study combines hESC engineering, differentiation, detailed gene expression, genome occupancy, and pathway modulation to elucidate the role of GATA6 in early cardiac differentiation. The work is carefully executed and the results support the conclusions. The use of publicly available data is well integrated throughout the manuscript. The RIME experiments are excellent.
Weaknesses:
Much has been known about GATA6 in mesendoderm development, and this is acknowledged by the authors.
We appreciate the comments and have tried to highlight both the early role of GATA6 in cardiac progenitor biology and the haploinsufficiency relevant to human congenital heart disease, which we believe adds value to other recently published work, including Sharma et al., eLife 2020.
Reviewer #2 (Public review):
Summary:
This manuscript by Bisson et al. describes the role of GATA6 in regulating cardiac progenitor cell (CPC) specification and cardiomyocyte (CM) generation using human embryonic stem cells (hESCs). The authors found that GATA6 loss-of-function hESCs exhibit early defects in mesendoderm and lateral mesoderm patterning stages. Using RNA-seq and CUT&RUN assays, the genes of the Wnt and BMP programs were found to be affected by the loss of GATA6 expression. Modulating Wnt and BMP during early cardiac differentiation can partially rescue CPC and CM defects in GATA6 hetero- and homozygous mutant hESCs.
Strengths:
The studies performed were rigorous and the rationale for the experimental design was logical. The results obtained were clear and supported the conclusions that the authors made regarding the role of GATA6 on Wnt and BMP pathway gene expression.
Weaknesses:
Given the wealth of studies that have been performed in this research area previously, the amount of new information provided in this study is relatively modest. Nevertheless, the results are quite clear and should make a strong contribution to the field.
Likewise for reviewer 2, we appreciate the comments and have tried to highlight both the early role of GATA6 in cardiac progenitor biology as well as the haploinsufficiency for relevance to human congenital heart disease.
Reviewer #3 (Public review):
In this study, Bisson et al. analyzed the role of the GATA6 transcription factor in patterning the early mesoderm and generating cardiomyocytes, using human embryonic stem cell differentiation assays and patient-derived hiPSCs with heart defects associated with mutations in the GATA6 gene. They identified a novel role for GATA6 in regulating genes involved in the WNT and BMP pathways, findings not previously noted in earlier analyses of GATA6 mutant hiPSCs during early cardiac mesoderm specification (Sharma et al., 2020). Modulation of the WNT and BMP pathways may partially rescue early cardiac mesoderm defects in GATA6 mutant hESCs. These results provide significant insights into how GATA6 loss-of-function and heterozygous mutations contribute to heart defects.
I have the following comments:
(1) Throughout the manuscript, Bisson et al. alternate between different protocols to generate cardiomyocytes, which creates some confusion (e.g., Figure 1 vs. Supplemental Figure 2A). The authors should provide a clear justification for using alternative protocols.
We agree and clarified this issue in the revision (p. 6). The reviewer is correct that there are two widely used protocols for directed differentiation of PSCs to cardiac fate. One is a cytokine-based protocol (Fig. 1A) and the other uses small molecules to manipulate the WNT pathway (CHIR protocol, Supplemental Fig. 2B). In our study, we used the CHIR protocol only for experiments in Supplemental Figure 2B-E. Since our data implicated BMP and WNT as mediators of the GATA6-dependent program, we did this mainly to confirm that the phenotype we observed with the cytokine-based protocol was not biased by the differentiation protocol. However, we found the CHIR protocol to be overall relatively inefficient for cardiac differentiation using the parental H1 hESCs and the various isogenic lines. The in vitro cardiac differentiation protocols for hPSCs are known to be variable depending on lines and sometimes require extensive optimization for various media components and concentrations, cell seeding densities, and batch variations for crucial reagents. The cytokine-based protocol we optimized worked most efficiently with our hPSC lines to generate cardiomyocytes; therefore, we committed to using it for the bulk of experiments in this study.
(2) The authors should characterise the mesodermal identity and cardiomyocyte subtypes generated with the activin/BMP-induction protocol thoroughly and clarify whether defects in the expression of BMP and WNT-related genes affect the formation of specific cardiomyocyte subtypes in a chamber-specific manner. This analysis is important, as Sharma et al. suggested a role for GATA6 in orchestrating outflow tract formation, and Bisson et al. similarly identified decreased expression of NRP1, a gene involved in outflow tract septation, in their GATA6 mutant cells.
We agree it is important that the mesodermal identities are thoroughly characterized; see, for example, Fig. 2 (K+P+, Brachyury, EOMES) and Fig. 3G&H (lateral mesoderm, cardiac mesoderm RNA-seq & GSEA comparing datasets from Koh et al.). The capacity of the cytokine-based protocol to generate both FHF- and SHF-derived subtypes has been rigorously evaluated by Keller and colleagues, which we now cite (Yang et al. 2022). Since the null cells do not generate CMs, chamber-specific subtypes cannot be evaluated; whether the GATA6 heterozygous mutants are biased is an interesting question. Indeed, the top GO term identified by CUT&RUN analysis for GATA6 at day 2 of differentiation is outflow tract morphogenesis, which is consistent with the interpretation by Sharma et al., but implicates this program at a much earlier developmental stage, long before cardiomyocyte differentiation. We think this is one of the most important findings of our study and appreciate the chance to highlight this in the revision (p. 9, 17). When we evaluated chamber-specificity for differentiated cardiomyocytes, we did not find significant differences, as indicated for the reviewer in the panel below (day 20 of differentiation). Since our study focuses on early stages of progenitor specification rather than cardiomyocyte differentiation, we agree that a more rigorous analysis would be of value, and indicated this as a limitation of our current study (p. 18).
Author response image 1.
(3) The authors developed an iPSC line derived from a congenital heart disease (CHD) patient with an atrial septal defect and observed that these cells generate cTnnT+ cells less efficiently. However, it remains unclear whether atrial cardiomyocytes (or those localised specifically at the septum) are being generated using the activin/BMP-induction protocol and the patient-derived iPSC line.
As indicated above, our study is focused on cardiac progenitor specification, and we found similar differences with the patient-derived iPSC-CMs compared to using hESC heterozygous targeted mutants. While we did not note any major differences in expression of cardiomyocyte markers, whether the mutants show any biases toward sub-types of cardiomyocytes is an interesting question to be pursued in subsequent work.
(4) The authors should also justify the necessity of using the patient-derived line to further analyse GATA6 function.
This is a good point, and as suggested we provided the justification (p. 5-6). This is the first patient-derived iPSC line published with a heterozygous GATA6 mutation along with an isogenic mutation-corrected control generated for cardiac directed differentiation. Patients with congenital heart disease (CHD) associated with GATA6 mutations are typically heterozygous (also true for many other CHD variants; presumably homozygous null embryos would not survive). It is important to query if phenotypes found using targeted mutations in hESCs (or iPSCs) model the human disease, since the patient cells (or the hESCs) likely have additional genetic variants that might interact with the GATA6 mutation. The fact that both types of heterozygous cells (patient-derived iPSCs and targeted hESCs) generate similar defects in CM differentiation provides evidence supporting the use of these human cellular models to study the genetic and cellular basis for congenital heart disease. This is particularly important, since other models, such as heterozygous mice, do not show such phenotypes.
(5) Figure 3 suggests an enrichment of paraxial mesoderm genes in the context of GATA6 loss-of-function, which is intriguing given the well-established role of GATA6 in specifying cardiac versus pharyngeal mesoderm lineages in model organisms. Could the authors expand their analysis beyond GO term enrichment to explore which alternative fates GATA6 mutant cells may acquire? Additionally, how does the potential enrichment of paraxial mesoderm, rather than pharyngeal mesoderm, relate to the initial mesodermal induction from their differentiation protocol? Could the authors also rule out the possibility of increased neuronal cell fates?
We need to interpret our in vitro differentiation data cautiously in relation to what has been shown in vivo, since we are unlikely to be reproducing all the complex signaling taking place in the embryo. Yet we do see modest increases in gene expression levels including signatures of paraxial mesoderm and ECM/mesenchymal at days 2 or 3 of differentiation in the GATA6 mutant cells. Therefore, we now include a heatmap showing enriched paraxial mesoderm gene expression in the mutant cells, new Fig. 3I (see page 10).
A caveat of this result is that the cells are being differentiated toward cardiac fate, so a bias for alternative fates might be suppressed. We modified the protocol to favor paraxial fate by adding CHIR at day 2 (rather than XAV) and performing qPCR assays at day 3. We found this successfully induced paraxial mesoderm gene expression, but to an equal extent in wildtype, heterozygous, and null cells, so we do not feel it warrants highlighting further.
Recommendations for the authors:
Reviewing Editor (Recommendations for the authors):
Incorporation of marker analysis for various stages of iPSC to CM differentiation (mesoderm, cardiac progenitor, CM subtypes) would increase the significance and support for the findings presented. Further data on the link (direct or indirect) between GATA6 and Wnt/BMP signalling would also add to the significance of this study. A number of textual changes/clarifications are also suggested to improve the manuscript.
We appreciate the feedback and provide responses for issues raised for markers, direct or indirect interactions, and textual changes/clarifications in the following sections. As indicated above, we did not find obvious alterations in cardiac subtypes, but since our study is focused on early progenitor specification, this is an interesting question that we think should be more rigorously evaluated in subsequent work.
Reviewer #1 (Recommendations for the authors):
Minor details:
(1) On p6 "Principal component analysis (PCA) showed that the cells derived from each genotype were well separated from each other (Supplemental Figure 2C)". All genotypes should be in one PCA plot to better evaluate the three genotypes.
We prepared the new plot as suggested, presented as new Supplemental Fig. 2C.
(2) p10: "Chia et al.22 and found a significantly decreased enrichment in GATA6-/- cells relative to WT at day 2" decreased enrichment of what? Direct target genes?
Thank you for catching this. Yes, the text was changed to indicate a “decreased enrichment in GATA6-/- cells relative to WT at day 2 for putative direct GATA6 target genes.”
Reviewer #2 (Recommendations for the authors):
Overall, this is an interesting study that addresses the early developmental roles of GATA6 on cardiac differentiation. While the identification of Wnt and BMP pathway genes to be involved in GATA6 regulation is not entirely unexpected, the authors do bring forth some useful knowledge that helps to further elucidate the mechanism of pre-cardiac mesoderm regulation. Some suggestions for improvement are included below -
Major points:
(1) Since the loss of Gata6 in this study is global (either heterozygous or homozygous), it is likely that the very early requirement of Gata6 (e.g. at the mesodermal stage of differentiation) is responsible for the cardiac transcriptional phenotype observed, rather than a specific role of Gata6 in the cardiac lineage, which would need to be addressed using a conditional knockout of Gata6 in an hPSC model. The authors should be more explicit when discussing the results as disruption of mesodermal differentiation leading to loss of downstream cardiac lineage cells. For example, I would change the title "GATA6 loss-of-function impairs CM differentiation" to "GATA6 loss-of-function impairs mesodermal (or mesodermal lineage) differentiation" and show the changes in cardiac progenitor cell genes (Isl1, Tbx1, Hand1, and BAF60c/Smarcd3) in addition to cardiomyocyte genes but no change in mesodermal (e.g. Brachyury, T, Eomes, Mesp1/2, etc) genes.
We agree with the reviewer’s interpretation. The title for the section was changed as suggested. In Fig. 1, we show changes in cardiac progenitor cell genes (Isl1, Hand1, and BAF60c/Smarcd3) while not seeing changes in mesodermal genes in Fig. 2 (e.g. Brachyury, Eomes, Mesp1/2). We note that the defect may be specific to cardiac (or anterior lateral) mesoderm, as the ability to express paraxial mesoderm markers was not impaired.
(2) The use of NKX2.5, TBX5, TBX20, and GATA4 as markers for CPC is not ideal. These markers are also expressed in differentiated cardiomyocytes. ISL1 or TBX1 for second heart field progenitors and HAND1 or BAF60c/Smarcd3 for first heart field progenitors would be ideal.
As suggested, we included an additional day 6 qPCR panel (new Fig. 1E) to evaluate the heart field progenitor markers.
(3) Much of the findings described in this study have been known in the field including the requirement of Wnt and BMP to induce mesodermal and subsequently cardiomyocyte differentiation. The key new information here is that Gata6 knockout disrupts Wnt and BMP signaling. It would help to further validate experimentally some of the Wnt and BMP genes as either direct or indirect targets of Gata6 using reporter assays.
While reporter assays are feasible and do provide relevant outputs, we feel that the use of any one or even several response elements in a reporter assay adds relatively little value compared to comprehensive analysis of bona fide network components. To address the reviewer's concern, we have included profiling heat maps for WNT and BMP pathway components to more rigorously and specifically evaluate the disruption in the signaling networks caused by loss of GATA6. Proving direct targets of endogenous genes is challenging, but we mapped many binding peaks for GATA6 to putative enhancers of WNT/BMP pathway genes (based on histone marks). We provide a list of these genes (new Fig. 4F) and distinguish these from WNT/BMP pathway genes that were not bound by GATA6 yet are down-regulated in the GATA6 mutant cells and are likely to be indirect targets (p. 12).
Minor points:
(1) Figures 1 and 2 - in the figure legend the labels w2, w4, m2, m5, m11, and m14 should be explained as the name of the clones of targeted hESC.
The legends were edited to provide this information.
(2) Supplemental Figure 3A - the resolution of the FACS plot is suboptimal.
We apologize and have corrected the plot resolution in the revised manuscript.
(3) Supplemental Table 1 - it's intriguing that amongst all the SWI/SNF factors, the one that is known to be cardiac-specific (SMARCD3) did not come up in the GATA6-RIME-enriched proteins. Is this a reflection of the early stage in which GATA6 plays a role in development (e.g. mesendoderm development but not precardiac mesoderm development when SMARCD3 is expressed)?
We agree and have noted this feature in the revised manuscript (p. 17). We note that SMARCD3 is expressed in the RNA-seq data as early as day 2. Although speculative, it may be that GATA6 primarily interacts with SWI/SNF complexes prior to the role for SMARCD3 in cardiac specification.
Reviewer #3 (Recommendations for the authors):
(1) Figures 3G and 3H, as well as others, have resolution issues. The gene names are unreadable, and higher-resolution images should be provided.
We apologize for the resolution issues and these have been fixed in the revised version.
(2) In their early manipulation of the WNT and BMP pathways (Figure 6A), it is unclear whether the activin/BMP protocol shown in Figure 1A was used. If this is the case, the authors should compare their results to a wild-type + DOX EV condition for consistency.
We clarified in the revision (Fig. 6A) that all the experiments in Fig. 6 use the cytokine protocol. In the revised figure, we included the wild-type + DOX EV condition as suggested.
(3) In Figures 6C and 6D, the authors should include an analysis of a wild-type isogenic line under their new CHIR/LB condition for comparison.
As suggested, we included the WT isogenic line in the comparison. For Fig. 6C these are shown on a separate graph because the Y-axis values are very different. Note that the CHIR/LB treatments that improve mutant cell differentiation impact the WT cells in the opposite manner.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
The study by Pudlowski et al. shows that a previously-identified protein complex, composed of delta- and epsilon-tubulin together with TEDC1 and TEDC2, functions in generating centriolar triplet microtubules, and that this is crucial for the proper formation of centriolar subdomains and the stability of centrioles throughout the cell cycle. This is an important study that advances our understanding of centriole biogenesis and structure and is supported by convincing evidence based on knockout cell lines, immunoprecipitation, and ultrastructure expansion microscopy. The work is of interest to cell biologists, in particular researchers with interest in centrosome biology.
-
Reviewer #1 (Public review):
Summary:
The study by Pudlowski et al. investigates how the intricate structure of centrioles is formed by studying the role of a complex formed by delta- and epsilon-tubulin and the TEDC1 and TEDC2 proteins. For this, they employ knockout cell lines, EM, and ultrastructure expansion microscopy as well as pull-downs. Previous work has indicated a role of delta- and epsilon-tubulin in triplet microtubule formation. Without triplet microtubules centriolar cylinders can still form, but are unstable, resulting in futile rounds of de novo centriole assembly during S phase and disassembly during mitosis. Here the authors show that all four proteins function as a complex and knockout of any of the four proteins results in the same phenotype. They further find that mutant centrioles lack inner scaffold proteins and contain an extended proximal end including markers such as SAS6 and CEP135, suggesting that triplet microtubule formation is linked to limiting proximal end extension and formation of the central region that contains the inner scaffold. Finally, they show that mutant centrioles seem to undergo elongation during early mitosis before disassembly, although it is not clear if this may also be due to prolonged mitotic duration in mutants.
Strengths:
Overall this is a well-performed study, well presented, with conclusions supported by convincing data based on knockout cell lines, rescue experiments, and detailed quantifications.
Weaknesses:
Most weaknesses have been addressed in the revised version. The precise mapping of TED complex proteins to centrioles remains challenging with the available tools but has been addressed through the use of several complementary super-resolution techniques.
-
Reviewer #2 (Public review):
Summary:
In this article, the authors study the function of TEDC1 and TEDC2, two proteins previously reported to interact with TUBD1 and TUBE1. Previous work by the same group had shown that TUBD1 and TUBE1 are required for centriole assembly and that human cells lacking these proteins form abnormal centrioles that only have singlet microtubules that disintegrate in mitosis. In this new work, the authors demonstrate that TEDC1 and TEDC2 depletion results in the same phenotype with abnormal centrioles that also disintegrate in mitosis. In addition, they were able to localize these proteins to the proximal end of the centriole, a result not previously achieved with TUBD1 and TUBE1, providing a better understanding of where and when the complex is involved in centriole growth.
Strengths:
The results are very convincing, particularly the phenotype, which is the same as previously observed for TUBD1 and TUBE1. The U-ExM localization is also convincing: despite a signal that's not very homogeneous, it's clear that the complex is in the proximal region of the centriole and procentriole. The phenotype observed in U-ExM on the elongation of the cartwheel is also spectacular and opens the question of the regulation of the size of this structure. The authors also report convincing results on direct interactions between TUBD1, TUBE1, TEDC1, and TEDC2, and an intriguing structural prediction suggesting that TEDC1 and TEDC2 form a heterodimer that interacts with the TUBD1-TUBE1 heterodimer.
Comments on revisions:
I would like to thank the authors for their work and for thoroughly addressing most of my questions. I extend my congratulations to the authors for this excellent and impactful article.
-
Reviewer #3 (Public review):
Summary:
Human cells deficient in delta-tubulin or epsilon-tubulin form unstable centrioles, which lack triplet microtubules and undergo a futile formation and disintegration cycle. In this study, the authors show that human cells lacking the associated proteins TEDC1 or TEDC2 have these identical phenotypes. They use genetics to knockout TEDC1 or TEDC2 in p53-negative RPE-1 cells and expansion microscopy to structurally characterize mutant centrioles. Biochemical methods and AlphaFold-multimer prediction software are used to investigate interactions between tubulins and TEDC1 and TEDC2.
The study shows that mutant centrioles are built only of A tubules, which elongate and extend their proximal region, fail to incorporate structural components, and finally disintegrate in mitosis. In addition, they demonstrate that delta-tubulin or epsilon-tubulin and TEDC1 and TEDC2 form one complex and that TEDC1 and TEDC2 can interact independently of tubulins. Finally, they show that localization of the four proteins is mutually dependent.
Strengths:
The results presented here are convincing, exciting, and important, and the manuscript is well-written. The study shows that delta-tubulin, epsilon-tubulin, TEDC1, and TEDC2 function together to build a stable and functional centriole, significantly contributing to the field and our understanding of the centriole assembly process.
Weaknesses:
The ultrastructural characterization of TEDC1 and TEDC2 in centrosomes remains challenging. Nevertheless, it is evident that these proteins occupy growing centrioles and the proximal parts of mother centrioles.
Comments on revisions:
The authors have done a great job extending the original experiments and measurements and answering outstanding questions.
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
Summary:
The study by Pudlowski et al. investigates how the intricate structure of centrioles is formed by studying the role of a complex formed by delta- and epsilon-tubulin and the TEDC1 and TEDC2 proteins. For this, they employ knockout cell lines, EM, and ultrastructure expansion microscopy as well as pull-downs. Previous work has indicated a role of delta- and epsilon-tubulin in triplet microtubule formation. Without triplet microtubules centriolar cylinders can still form, but are unstable, resulting in futile rounds of de novo centriole assembly during S phase and disassembly during mitosis. Here the authors show that all four proteins function as a complex and knockout of any of the four proteins results in the same phenotype. They further find that mutant centrioles lack inner scaffold proteins and contain an extended proximal end including markers such as SAS6 and CEP135, suggesting that triplet microtubule formation is linked to limiting proximal end extension and formation of the central region that contains the inner scaffold. Finally, they show that mutant centrioles seem to undergo elongation during early mitosis before disassembly, although it is not clear if this may also be due to prolonged mitotic duration in mutants.
Strengths:
Overall this is a well-performed study, well presented, with conclusions mostly supported by the data. The use of knockout cell lines and rescue experiments is convincing.
Weaknesses:
In some cases, additional controls and quantification would be needed, in particular regarding cell cycle and centriole elongation stages, to make the data and conclusions more robust.
We thank the reviewer for these comments and have improved our analyses of these as detailed below.
Reviewer #2 (Public Review):
Summary:
In this article, the authors study the function of TEDC1 and TEDC2, two proteins previously reported to interact with TUBD1 and TUBE1. Previous work by the same group had shown that TUBD1 and TUBE1 are required for centriole assembly and that human cells lacking these proteins form abnormal centrioles that only have singlet microtubules that disintegrate in mitosis. In this new work, the authors demonstrate that TEDC1 and TEDC2 depletion results in the same phenotype with abnormal centrioles that also disintegrate into mitosis. In addition, they were able to localize these proteins to the proximal end of the centriole, a result not previously achieved with TUBD1 and TUBE1, providing a better understanding of where and when the complex is involved in centriole growth.
Strengths:
The results are very convincing, particularly the phenotype, which is the same as previously observed for TUBD1 and TUBE1. The U-ExM localization is also convincing: despite a signal that's not very homogeneous, it's clear that the complex is in the proximal region of the centriole and procentriole. The phenotype observed in U-ExM on the elongation of the cartwheel is also spectacular and opens the question of the regulation of the size of this structure. The authors also report convincing results on direct interactions between TUBD1, TUBE1, TEDC1, and TEDC2, and an intriguing structural prediction suggesting that TEDC1 and TEDC2 form a heterodimer that interacts with the TUBD1-TUBE1 heterodimer.
Weaknesses:
The phenotypes observed in U-ExM on cartwheel elongation merit further quantification, enabling the field to appreciate better what is happening at the level of this structure.
We thank the reviewer for these comments and have improved our analyses of cartwheel elongation as detailed below.
Reviewer #3 (Public Review):
Summary:
Human cells deficient in delta-tubulin or epsilon-tubulin form unstable centrioles, which lack triplet microtubules and undergo a futile formation and disintegration cycle. In this study, the authors show that human cells lacking the associated proteins TEDC1 or TEDC2 have these identical phenotypes. They use genetics to knock out TEDC1 or TEDC2 in p53-negative RPE-1 cells and expansion microscopy to structurally characterize mutant centrioles. Biochemical methods and AlphaFold-multimer prediction software are used to investigate interactions between tubulins and TEDC1 and TEDC2.
The study shows that mutant centrioles are built only of A tubules, which elongate and extend their proximal region, fail to incorporate structural components, and finally disintegrate in mitosis. In addition, they demonstrate that delta-tubulin or epsilon-tubulin and TEDC1 and TEDC2 form one complex and that TEDC1 and TEDC2 can interact independently of tubulins. Finally, they show that the localization of four proteins is mutually dependent.
Strengths:
The results presented here are mostly convincing, the study is exciting and important, and the manuscript is well-written. The study shows that delta-tubulin, epsilon-tubulin, TEDC1, and TEDC2 function together to build a stable and functional centriole, significantly contributing to the field and our understanding of the centriole assembly process.
Weaknesses:
The ultrastructural characterization of TEDC1 and TEDC2 obtained by U-ExM is inconclusive. Improving the quality of the signals is paramount for this manuscript.
We thank the reviewer for these comments and have improved our imaging of TEDC1 and TEDC2 localization, as detailed below.
Recommendations for the authors:
Reviewing Editor (Recommendations For The Authors):
The reviewers agreed that the conclusions are largely supported by solid evidence, but felt that improving the following aspects would make some of the data more convincing:
(1) The UExM localizations of TEDC1/2 are not very convincing and the reviewers suggest complementing these with alternative super-resolution approaches (e.g. SIM) and/or different labeling techniques such as pre-expansion labeling using STAR red/orange secondaries (also robust for SIM and STED), use of the Halo tag, different tag antibodies, etc.
We thank the reviewers for these recommendations and have adapted two of these strategies to improve our imaging of TEDC1 and TEDC2 localization. First, we used an alternative super-resolution approach, a Yokogawa CSU-W1 SoRA confocal scanner (resolution = 120 nm) and imaged cells grown on coverslips (not expanded). We found that TEDC1 and TEDC2 localize to procentrioles and the proximal end of parental centrioles (Fig 2 – Supplementary Figure 1a, b). Second, we used a recently described expansion gel chemistry (Kong et al., Methods Mol Biol 2024) combined with Abberior Star red and orange secondary antibodies. This technique resulted in robust signal at centrosomes and in the cytoplasm and indicated that TEDC1 and TEDC2 localize near the centriole walls of procentrioles and the proximal region of parental centrioles, near CEP44 (Fig 2 – Supplementary Figure 1c, d). These results complement and support our initial observations (Fig 2C, D) and we have edited the text to reflect this (lines 157-163). We also note that these Flag tag and V5 tag primary antibodies are specific and have little background signal in all applications (Fig 2 – Supplementary Fig 1E-J), while other commercially available antibodies against these tags did exhibit non-specific signal.
(2) The cell cycle classifications of centrioles would strongly benefit, apart from a better description, from adding quantifications of average centriole length at a given stage based on tubulin staining (not acTub).
We thank the reviewers for these recommendations. We have added an improved description of our cell cycle analyses (lines 234-237). We have also added new analyses for centriole length as measured by staining with alpha-tubulin (Fig 4 – Supp 3 and Fig 4 – Supp 4). We find that in all mutants, acetylated tubulin elongates along with alpha-tubulin in a similar way as control centrioles.
Reviewer #1 (Recommendations For The Authors):
Specific points:
(1) The introduction is a bit oddly structured. About halfway through it summarizes what is going to be presented in the study, giving the impression that it is about to conclude, but then continues with additional, detailed introduction paragraphs. Overall, the authors may also want to consider making it more concise.
We thank the reviewer for these suggestions and have shortened and restructured the introduction for clarity and conciseness.
(2) The text should explain to the non-expert reader why endogenous proteins are not detected and why exogenously expressed, tagged versions are used. Related to this, the authors state overexpression, but what is this assessment based on? Does expression at the endogenous level also rescue? At least by western blot, these questions should be addressed.
In the text, we have added clarification about why endogenous proteins were not detected for immunofluorescence (lines 149-151). To quantify the overexpression, we have added Western blots of TEDC1 and TEDC2 to Fig 1 – Supplementary Figure 1E,F. We note that endogenous levels of both proteins are very low, and the rescue constructs are overexpressed 20 to 70 fold above endogenous levels.
(3) The figures should clearly indicate when tagged proteins are used and detected. Currently, this info is only found in the legends but should be in the figure panels as well.
We have made these changes to the figure panels in Fig 2, Fig 2 – Supp 1, and Fig 3.
(4) I could not find a description and reference to Figure 2 Supplement 2 and 3.
We have replaced these supplements with new supplementary figures for TEDC1 and TEDC2 localization (Fig 2 – Supp 1).
(5) The multiple bands including unspecific (?) bands should be labeled to guide the reader in the western blots.
We have labeled nonspecific bands in our Western blots with asterisks (Fig 1 – Supp 1, Fig 3).
(6) The AlphaFold prediction suggests that TUBD1 can bind to the TED complex in the absence of TUBE1. Can this be shown? This would be a nice validation of the predicted architecture of the complex. I also missed a bit of a discussion of the predicted architecture. How could it be linked to triplet microtubule formation? Is the latest AlphaFold version 3 adding anything to this analysis?
In our pulldown experiments, we found that TUBD1 cannot bind to TEDC1 or TEDC2 in the absence of TUBE1 (Fig 3C, D, IB: TUBD1). We performed this experiment with three biological replicates and found the same result. It is possible that TUBD1 and TUBE1 form an intact heterodimer, similar to alpha-tubulin and beta-tubulin, and this will be an exciting area of future research.
We have added new analysis from AlphaFold3 (Fig 3 – Supp 1B). AlphaFold3 predicts a similar structure as AlphaFold Multimer.
We have also added additional discussion about the AlphaFold prediction to the text (lines 220-222, 365-367). Thanks to the reviewer for pointing out this oversight.
(7) I suggest briefly explaining in the text how cells and centrioles at different cell cycle stages were identified. I found some info in the legend of Figure 1, but no info for other figures or in the text. Related to this, how are procentrioles defined in de novo formation? There is no parental centriole to serve as a reference.
We have added a brief explanation of the synchronization and identification in lines 234-237. We have also clarified the text regarding de novo centrioles, and now term these “de novo centrioles in the first cell cycle after their formation” (lines 271-272).
(8) Related to point 7: using acetylated tubulin as a universal length and width marker seems unreliable since it is a PTM. The authors should use general tubulin staining to estimate centriole dimensions, or at least establish that acetylated tubulin correlates well with the overall tubulin signal in all mutants.
We have added two supplementary data figures (Fig 4 – supp 3 and Fig 4 – supp 4) in which we co-stain control and mutant centrioles with alpha-tubulin. We found that acetylated tubulin marked mutant centrioles well and as alpha-tubulin length increased, acetylated tubulin length also increased.
(9) Presence and absence of various centriolar proteins. These analyses lack a clear reference for the precise centriole elongation stage. This is particularly problematic for proteins that are recruited at specific later stages (such as inner scaffold proteins). The staining should be correlated with centriole length measurements, ideally using general tubulin staining.
As described for point 8, we have added two supplementary data figures in which we co-stain control and mutant centrioles with alpha-tubulin and found that acetylated tubulin also increases as overall tubulin length increases in all mutants. We note that inner scaffold proteins are absent in all our mutant centrioles at all stages of the cell and centriole cycle, as also previously reported for POC5 in Wang et al., 2017.
Reviewer #2 (Recommendations For The Authors):
Here's a list of points I think could be improved:
- As the authors previously published, the centriole appears to have a smaller internal diameter than mature centrioles. Could the authors measure to see if the phenotype is identical? Is the centriole blocked in the bloom phase (Laporte et al. 2024)?
We have added an additional supplementary figure (Fig 4 – supp 5) to show that mutant centrioles have smaller diameters than mature centrioles, as we previously reported for the delta-tubulin and epsilon-tubulin mutant centrioles by EM. We thank the reviewers for the additional question of the bloom phase. Given the comparatively smaller number of centrioles we analyzed in this paper compared to Laporte et al (50 to 80 centrioles per condition here, versus 800 centrioles in Laporte et al), it is difficult to definitively conclude whether there is a block in bloom phase. This would be an interesting area for future research.
- The images of the centrioles in EM are beautiful. Would it be possible to apply a symmetrisation on it to better see the centriolar structures? For example, is the A-C linker present?
We thank the reviewer for this excellent suggestion. Using CentrioleJ, we find that the A-C linker is absent from mutant centrioles. The symmetrized images have been added to Fig 1 – Supplementary Fig 2, and additional discussion has been added to the text (lines 143-144, lines 368-374).
- How many EM images were taken? Did the centrioles have 100% A-microtubule only or sometimes with B-MT?
For TEM, we focused on centrioles that were positioned to give perfect cross-section images of the centriolar microtubules, and thus did not take images of off-angle or rotated centrioles. Given the difficulty of this experiment (centrioles are small structures within the cell, centrosomes are single-copy organelles, and off-angle centrioles were not imaged), we were lucky to image 3 centrioles that were in perfect cross-section – 2 for Tedc1<sup>-/-</sup> and 1 for Tedc2<sup>-/-</sup>. Our images indicate that these centrioles only have A-tubules (Fig 1 – Supp Fig 2).
- In Figure 2 - it would be preferable to write TEDC2-flag or TEDC1-flag and not TEDC2/1.
We have made this change.
- It seems that Figures 2C and D aren't cited, and some of the data in the supplemental data are not described in the main text.
We have replaced these supplements with new supplementary figures for TEDC1 and TEDC2 localization (Fig 2 – Supp 1).
- The signal in U-ExM with the anti-Flag antibody is heterogeneous. Did the authors test several anti-FLAG antibodies in U-ExM?
We tested several anti-Flag and anti-V5 antibodies for our analyses, and chose these because they have little background signal in all applications (Fig 2 – Supplementary Fig 1E-J). Other commercially available antibodies against these tags did exhibit non-specific signal.
- The AlphaFold prediction is difficult to interpret, the authors should provide more views and the PDB file.
We have added 2 additional views of the AlphaFold prediction in Fig 3 – Supp 1A.
- In general, but particularly for Figure 4: the length doesn't seem to be divided by the expansion factor, it is therefore difficult to compare with known EM dimensions. Can the authors correct the scale bars?
We have corrected the scale bars for all figures to account for the expansion factor.
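The correction requested here amounts to dividing each post-expansion measurement by the gel expansion factor, so that lengths can be compared with EM dimensions. The following is only a minimal sketch of that arithmetic: the function name, the measured length, and the expansion factor of 4.0 are hypothetical illustration values, not numbers taken from the manuscript.

```python
def to_pre_expansion_nm(measured_nm: float, expansion_factor: float) -> float:
    """Convert a length measured in the expanded gel to its pre-expansion size."""
    if expansion_factor <= 0:
        raise ValueError("expansion factor must be positive")
    return measured_nm / expansion_factor


# Hypothetical values for illustration only (not from the manuscript).
measured_length_nm = 2000.0   # length measured on the expanded sample
expansion_factor = 4.0        # assumed linear expansion factor of the gel

corrected_length_nm = to_pre_expansion_nm(measured_length_nm, expansion_factor)
print(f"Corrected length: {corrected_length_nm:.0f} nm")
```

The same division applies to scale bars: a bar drawn on the expanded image represents its drawn length divided by the expansion factor.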
- Concerning Gamma-tubulin that is "recruited to the lumen of centrioles by the inner scaffold, had localization defects in mutant centrioles. However, we were unable to reliably detect gamma-tubulin within the lumen of control or de novo-formed centrioles in S or G2-phase (Figure 4 - Supplement 1E), and thus were unable to test this hypothesis". In Laporte et al 2024, Gamma-tubulin arrives later than the inner scaffold and only on mature centrioles, so this result appears to be in line with previous observation. However, the authors should be able to detect a proximal signal under the microtubules of the procentriole, is this the case?
We agree that this is an exciting question. However, in our expansion microscopy staining, we frequently observe that gamma-tubulin surrounds centrioles, corresponding to its role in the pericentriolar material (PCM). In our hands, we find it difficult to distinguish between centriolar gamma-tubulin at the base of the A-tubule from gamma-tubulin within the PCM.
- In the signal elongation of SAS-6, STIL, CEP135, CPAP, and CEP44, would it be possible to quantify the length of these signals (with dimensions divided by the expansion factor for comparison with known TEM distances)?
We have quantified the lengths of SAS-6 and CEP135 in new Fig 4 – Supp 3 and Fig 4 – Supp 4.
- The authors observe that centrin is present, but only as a SFI1 dot-like localization (which is another protein that would be interesting to look at), and not an inner scaffold localization. Can the authors elaborate? These results suggest that the distal part is correctly formed with only a microtubule singlet.
We agree with the reviewer’s interpretation that the centriole distal tip is likely correctly formed with only singlet microtubules, as both distal centrin and CP110 are present. We have added this point to the discussion (line 415).
- The authors observe that CPAP is elongated, but CPAP has two locations, proximal and distal. Is it distal or proximal elongation? Is the proximal signal of CPAP longer than that of CEP44 in the mutants? The authors discuss that the elongation could come from overexpression of CPAP, but here it seems that the centriole is not overlong, just the structures around the cartwheel.
We thank the reviewer for this point. It is difficult for us to conclude whether the proximal or distal region is extended in the mutants, as our mutant centrioles lack a visible separation between these two regions. It would be interesting to probe this question in the future by testing whether subdomains of CPAP may be differentially regulated in our mutants.
Reviewer #3 (Recommendations For The Authors):
It isn't apparent to me what was counted in Figure 1C. Were all centrioles (mother centrioles and procentrioles) counted? Where is the 40% in control cells coming from? Can this set of data be presented differently?
We apologize for the confusion. In this figure, all centrioles were counted. We have updated the figure legend for clarity. We performed this analysis in a similar way as in Wang et al., 2017 to better compare phenotypes.
Figure 2C and the text, lines 182-187: The ultrastructural characterization of TEDC1 and TEDC2 suffers from the low quality of the TEDC1 and TEDC2 signals obtained post-expansion. In comparison with robust low-resolution immunosignal, it appears that most of the signal cannot be recovered after expansion. Another sub-resolution imaging method to re-analyze TEDC1 and TEDC2 localization would be essential. The same concern applies to Figures 2 - Supplement 2 and 3. Also, Figure 2 - Supplement 2 and Supplement 3 do not seem to be cited.
We thank the reviewer for these recommendations. As also mentioned above, we used an alternative super-resolution approach, a Yokogawa CSU-W1 SoRA confocal scanner (resolution = 120 nm), and found that TEDC1 and TEDC2 localize to procentrioles and the proximal end of parental centrioles (Fig 2 – Supplementary Figure 1a, b). Second, we used a recently described expansion gel chemistry (Kong et al., Methods Mol Biol 2024) combined with Abberior Star red and orange secondary antibodies. This technique resulted in robust signal at centrosomes and in the cytoplasm and indicated that TEDC1 and TEDC2 localize near the centriole walls of procentrioles and the proximal region of parental centrioles, near CEP44 (Fig 2 – Supplementary Figure 1c, d). These stainings complement and support our initial observations (Fig 2C, D) and we have edited the text to reflect this (lines 157-163). We have also removed the supplementary figures that were uncited in the text.
TUBD1 and TUBE1 form a dimer and TEDC2 and TEDC1 can interact. Any speculation as to why TEDC2 does not pull down both TUBE1 and TUBD1?
We apologize for the confusion. TEDC2 does pull down both TUBE1 and TUBD1 (Fig 3D, pull-down, second column, Tedc2-V5-APEX2 rescuing the Tedc2<sup>-/-</sup> cells pulls down TUBD1, TUBE1, and TEDC1).
Figure 4A and B. The authors use acetylated tubulin to determine the length of procentrioles in the S and G2 phases. However, procentrioles are not acetylated on their distal ends in these cell cycle phases (as the authors also mention further in the text). Why has alpha tubulin not been used since it works well in U-ExM? The average size of the control, G2 procentrioles, seems too small in Figure 4A and not consistent with other imaging data (for instance, in Figure 4 - Supplement 1 C, Cp110, and CPAP staining). There is no statistical analysis in Figure 4A.
We have added two supplementary data figures (Fig 4 – supp 3 and Fig 4 – supp 4) in which we co-stain control and mutant centrioles with alpha-tubulin. We found that acetylated tubulin correlates well with overall tubulin signal in all mutants. We have added statistical analysis to the figure legend of Fig 4A.
Lines 260 - 262: "These results indicate that centrioles with singlet microtubules can elongate to the same length as controls, and therefore that triplet microtubules are not essential for regulating centriole length." It is hard to agree with this statement. Mutant procentrioles show aberrantly elongated proximal signals of several tested proteins. In addition, in lines 326 - 328, the authors state that "Together, these results indicate that centrioles lacking compound microtubules are unable to properly regulate the length of the proximal end."
We thank the reviewer and have clarified the statement to state that these results indicate that centrioles with singlet microtubules can elongate to the same overall length as control centrioles in G2 phase.
Line 353: The authors suggest that elongated procentriole structure in mitosis may represent intermediates in centriole disassembly. Another interpretation, more in line with the EM data from Wang et al., 2017, would be that these mutant procentrioles first additionally elongate before they disassemble in late mitosis. The aberrant intermediate structure concept would need further exploration. For instance, anti-alpha/beta-tubulin antibodies could be used to investigate centriole microtubules.
We apologize for the confusion and have edited this section for clarity (lines 341-343): “We conclude that in our mutant cells, centrioles elongate in early mitosis to form an aberrant intermediate structure, followed by fragmentation in late mitosis.”
References need to be included in lines 122, 277, 279.
We have added these references.
Line 281: Add references PMID: 30559430 and PMID: 32526902.
We have added these references (lines 265-266).
Line 289: "Moreover, our results suggest that centriole glutamylation is a multistep process, in which long glutamate side chains are added later during centriole maturation." This does not seem like an original observation. For instance, see PMID: 32526902.
We have added this reference (lines 273-274).
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This manuscript describes an important finding of the transcriptional control of a chimeric gene transfer agents (GTA) cluster in Bartonella by a processive anti-termination factor (BrrG). The evidence provided is solid. This manuscript will interest researchers working on transcriptional regulation, horizontal gene transfer, and phages.
-
Reviewer #1 (Public review):
Summary:
The gene transfer agent (GTA) from Bartonella is a fascinating chimeric GTA that evolved from the domestication of two phages. Not much is known about how the expression of the BaGTA is regulated. In this manuscript, Korotaev et al noted the structural similarity between BrrG (a protein encoded by the ror locus of BaGTA) and a well-known transcriptional anti-termination factor, 21Q, from phage P21. This sparked the investigation into the possibility that the BaGTA cluster is also regulated by anti-termination. Using a suite of cell biology, genetics, and genome-wide techniques (ChIP-seq), Korotaev et al convincingly showed that this is most likely the case. The findings offer the first insight into the regulation of the GTA cluster (and GTA-mediated gene transfer), particularly in the pathogen Bartonella. Note that anti-termination is a well-known and well-studied mechanism of transcriptional control. It is a very common mechanism for gene expression control of prophages, phages, bacterial gene clusters, and other GTAs, so in this sense, the impact of the findings in this study is limited to Bartonella.
Strengths:
Convincing results that overall support the main claim of the manuscript.
Weaknesses:
A few important controls are missing.
-
Reviewer #2 (Public review):
Summary:
In this study, the authors identified and characterized a regulatory mechanism based on transcriptional anti-termination that connects the two gene clusters, capsid and run-off replication (ROR) locus, of the bipartite Bartonella gene transfer agent (GTA). Among genes essential for GTA functionality identified in a previous transposon sequencing project, they found a potential antiterminator of phage origin within the ROR locus. They employed fluorescence reporter and gene transfer assays of overexpression and knockout strains in combination with ChIP-seq and promoter fusions to convincingly show that this protein indeed acts as an antiterminator counteracting attenuation of the capsid gene cluster expression.
Impact on the field:
The results provide valuable insights into the evolution of the chimeric BaGTA, a unique example of phage co-domestication by bacteria. A similar system found in the other broadly studied Rhodobacterales/Caulobacterales GTA family suggests that antitermination could be a general mechanism for GTA control.
Strengths:
Results of the selected and carefully designed experiments support the main conclusions.
Weaknesses:
It remains open why overexpression of the antiterminator does not increase the gene transfer frequency.
-
Author response:
Reviewer 1:
(1) Provide RMSD and DALI scores to show how similar the AlphaFold-predicted structures of BrrG are to other anti-termination factors. This should be done for Fig1B and also for Suppl. Fig 1 to support the claim that BrrG, GafA, GafZ, Q21 share structural features.
In the revised manuscript we will provide RMSD and DALI scores.
(2) Throughout the manuscript, flow cytometry data of gfp expression was used and shown as a single replicate. Korotaev et al wrote in the legends that error bars are shown (that is not true for e.g. Figs. 3, 4, and 5). It is difficult for reviewers/readers to gauge how reliable their experiments are.
As stated in the manuscript all flow cytometry data were performed in triplicate. In the revised manuscript we will include the two replicates not presented in the main figures as supplementary information.
(3) I am unsure how ChIP-seq in Fig. 2A was performed (with anti-FLAG or anti-HA antibodies? I cannot tell from the Materials & Methods). More importantly, I did not see the control for this ChIP-seq experiment. If a FLAG-tagged BrrG was used for ChIP-seq, then a WT non-tagged version should be used as a negative control (not sequencing INPUT DNA); this is especially important for an anti-terminator that can co-travel with RNA polymerase. Please also report the number of replicates for ChIP-seq experiments.
Fig. 2A presents a coverage plot from the ChIP-Seq of ∆brrG +pTet:brrG-3xFLAG (N’). A replicate of this N-terminally tagged construct will be added as supplementary data in the revised version. As anticipated by the referee, we had used ∆brrG +pTet:brrG (untagged) as control.
(4) Korotaev et al mentioned that BrrG binds to DNA (as well as to RNA polymerase). With the availability of existing ChIP-seq data, the authors should be able to locate the DNA-binding element of BrrG, this additional information will be useful to the community.
We will mine the ChIP-Seq data to define the BrrG binding site as closely as possible and include the analysis in the revised version of the manuscript.
(5) Mutational experiments to break the potential hairpin structure are required to strengthen the claim that this putative hairpin is the potential transcriptional terminator.
We did not claim that the identified hairpin is a terminator but rather suggested it as a candidate terminator. We agree with the referee that the proposed experiment would be necessary to definitively prove its terminator function. However, our primary aim was to demonstrate that BrrG acts as a processive antiterminator, which we have shown by replacing the putative terminator with a well-characterized synthetic terminator that BrrG successfully overcame. Therefore, we prefer not to conduct the proposed experiment and will instead further tone down our conclusions regarding the putative terminator function of the identified hairpin structure.
Reviewer 2:
(1) The authors wrote "GTAs are not self-transmitting because the DNA packaging capacity of a GTA particle is too small to package the entire gene cluster encoding it" (page 3). I thought that at least the Bartonella capsid gene cluster should be self-transmissible within the 14 kb packaged DNA (https://doi.org/10.1371/journal.pgen.1003393, https://doi.org/10.1371/journal.pgen.1000546). This was also concluded by Lang et al (https://doi.org/10.1146/annurev-virology-101416-041624). In this case the presented results would have important implications. As the gene cluster and the anti-terminator required for its expression are separated on the chromosome, it would not be possible to transfer an active GTA gene cluster, although the DNA coding for the genes required for making the packaging agent itself, theoretically fits into a BaGTA particle. Could the authors comment on that? I think it would be helpful to add the sizes of the different gene clusters and the distance between them in Fig. 2A. The ROR amplified region spans 500kb, is the capsid gene cluster within this region?
We thank the reviewer for bringing up this interesting point. The bgt cluster (capsid cluster) is approximately 9.2 kb in size and could feasibly be packaged in its entirety into a GTA particle. In contrast, the ror gene cluster, which encodes the antiterminator BrrG, is approximately 20 kb in size—exceeding the packaging limit of GTA particles—and is separated from the bgt cluster by approximately 35 kb. Consequently, if the bgt cluster is transferred via a GTA particle into a recipient host that does not encode the ror gene cluster, the bgt cluster would not be expressed.
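The packaging argument above can be checked with simple arithmetic. The sketch below is illustrative only: it takes the cluster sizes quoted in this response and the 14 kb packaged DNA size mentioned in the reviewer's comment, and it assumes (hypothetically) that the 35 kb separation is measured between the nearest ends of the two clusters.

```python
# Sizes in kb, as quoted in the response; capacity as cited by the reviewer.
PACKAGING_CAPACITY_KB = 14.0
BGT_CLUSTER_KB = 9.2    # capsid (bgt) cluster
ROR_CLUSTER_KB = 20.0   # ror cluster encoding the antiterminator BrrG
GAP_KB = 35.0           # separation; assumed here to be end-to-end distance


def fits_in_particle(size_kb: float, capacity_kb: float = PACKAGING_CAPACITY_KB) -> bool:
    """Return True if a contiguous DNA segment of this size fits in one particle."""
    return size_kb <= capacity_kb


bgt_fits = fits_in_particle(BGT_CLUSTER_KB)
ror_fits = fits_in_particle(ROR_CLUSTER_KB)
# A single segment carrying both clusters must span both plus the gap.
both_fit = fits_in_particle(BGT_CLUSTER_KB + GAP_KB + ROR_CLUSTER_KB)

print(f"bgt cluster fits: {bgt_fits}")
print(f"ror cluster fits: {ror_fits}")
print(f"both clusters in one particle: {both_fit}")
```

Under these assumptions only the bgt cluster fits in a single particle, which is consistent with the conclusion that a transferred bgt cluster would arrive without the antiterminator needed for its expression.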
(2) Another side-note regarding the introduction: On page three the authors write: "GTAs encode bacteriophage-like particles and in contrast to phages transfer random pieces of host bacterial DNA". While packaging is not specific, certain biases in the packaging frequency are observed in both studied GTA families. For Bartonella this is ROR. In the two GTA-producing strains D. shibae and C. crescentus origin and terminus of replication are not packaged and certain regions are overrepresented (https://doi.org/10.1093/gbe/evy005, https://doi.org/10.1371/journal.pbio.3001790). Furthermore, D. shibae plasmids are not packaged but chromids are. I think the term "random" does not properly describe these observations. I would suggest using "not specific" instead.
We thank the reviewer for this suggestion and will adjust the wording accordingly.
(3) Page 5: Remove "To address this". It is not needed as you already state "To test this hypothesis" in the previous sentence.
We will adjust the wording accordingly.
(4) I think the manuscript would greatly benefit from a summary figure to visualize the Q-like antiterminator-dependent regulatory circuit for GTA control and its four components described on pages 15 and 16.
We thank the reviewer for this valuable suggestion and will include a summary figure illustrating the deduced regulatory mechanism in the revised manuscript.
(5) Page 17: It might be worth noting that GafA is highly conserved along GTAs in Rhodobacterales (https://doi.org/10.3389/fmicb.2021.662907) and so, probably, is its regulatory integration into the ctrA network (https://doi.org/10.3389/fmicb.2019.00803). It's an old mechanism. It would also be interesting to know if it is a common feature of the two archetypical GTAs that the regulator is not part of the cluster itself.
We agree with the points raised by the reviewer and will address them in the revised manuscript. Specifically, we will highlight the high conservation of GafA in GTAs across Rhodobacterales and its regulatory integration within the ctrA network. Additionally, we will analyze whether the GafA-like antitermination regulator is typically located outside the regulated gene cluster, as we have demonstrated for BrrG of BaGTA in the Bartonellae.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This valuable study presents findings on DNA methylation as an efficient epigenetic transcriptional regulating strategy in bacteria. The authors utilized single-molecule real-time sequencing to profile the DNA methylation landscape across three model pathovars of Pseudomonas syringae, identifying significant epigenetic mechanisms through the Type-I restriction-modification system, which includes a conserved sequence motif associated with N6-methyladenine. The evidence presented is solid and the study provides novel insights into the epigenetic mechanisms of P. syringae, expanding the understanding of bacterial pathogenicity and adaptation.
-
Reviewer #1 (Public review):
Summary:
In this work, Huang et al. used SMRT sequencing to identify methylated nucleotides (6mA, 4mC, and 5mC) in the Pseudomonas syringae genome. They show that the most abundant modification is 6mA, and they identify the enzymes required for this modification: when they mutate HsdMSR, they observe a decrease in 6mA. Interestingly, the mutant also displays changes in pathogenicity, biofilm formation, and translation activity due to a change in gene expression likely linked to the loss of 6mA.
Overall, the paper represents an interesting set of new data that can bring forward the field of DNA modification in bacteria.
Comments on revisions:
Thank you for the additional work. The authors have now addressed all my concerns.
-
Reviewer #2 (Public review):
In the present manuscript, Huang et al. revealed the significant roles of the DNA methylome in regulating virulence and metabolism within Pseudomonas syringae, with a particular focus on the HsdMSR system in this model strain. The authors used SMRT-seq to profile the DNA methylation patterns (6mA, 5mC, and 4mC) in three P. syringae strains (Psph, Pss, and Pst) and displayed the conservation among them. They further identified the type I restriction-modification system (HsdMSR) in P. syringae, including its specific motif sequence. HsdMSR participated in metabolism and virulence (T3SS and biofilm formation), as demonstrated through RNA-seq analyses. Additionally, the authors revealed the mechanisms of transcriptional regulation by 6mA. Strictly from the point of view of the interest of the question and the work carried out, this is a worthy and timely study that uses third-generation sequencing technology to characterize DNA methylation in P. syringae. The experimental approaches were solid, and the results obtained were interesting and provided new information on how epigenetics influences transcription in P. syringae. The conclusions of this paper are mostly well supported by data.
Comments on revisions:
The authors have successfully addressed all the comments from the reviewers in their revised manuscript.
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public Review):
Summary:
In this work, Huang et al. used SMRT sequencing to identify methylated nucleotides (6mA, 4mC, and 5mC) in the Pseudomonas syringae genome. They show that the most abundant modification is 6mA, and they identify the enzymes required for this modification: when they mutate HsdMSR, they observe a decrease in 6mA. Interestingly, the mutant also displays changes in pathogenicity, biofilm formation, and translation activity due to a change in gene expression likely linked to the loss of 6mA. Overall, the paper represents an interesting set of new data that can bring forward the field of DNA modification in bacteria.
Thank you for your valuable feedback on our paper exploring the impact of 6mA modification in P. syringae.
Major Concerns:
Most of the authors' data concern the Psph pathovar. I am not sure that the authors' conclusions are supported by the two other pathovars they used in the initial 2 figures. If the authors want to broaden their conclusions to Pseudomonas syringae and not restrict them to Psph, the authors should have stronger methylation data using replicates. Additionally, they should discuss why Pss is so different from Pst and Psph. Could they do a blot to confirm it is really the case and not a sequencing artifact? Is the change of methylation during bacterial growth conserved between the pathovars? The authors should obtain mutants in the other pathovars to see if they have the same phenotype. The authors have a nice set of data concerning Psph, but the broadening of the results to other pathovars requires further investigation.
We appreciate the reviewer’s insightful comments. While the majority of our data focuses on Psph, we recognize the importance of validating these findings in Pss and Pst. To this end, we have performed additional experiments using dot blot and mutant construction to strengthen our conclusions for the other pathovars.
We agree that we should discuss why Pss is different from Psph and Pst. We performed a dot blot assay using genomic DNA from Pss and Pst, presented in Figure S5A. Meanwhile, we compared the 6mA modification level of Pss and Pst in different growth phases. As shown in Figure S5A, the change of methylation during bacterial growth is conserved in Pst. The change was not obvious in Pss, which might be due to the lack of a type I R-M system.
“In accordance with previous studies showing that growth conditions affect the bacterial methylation status, we applied dot blot experiments using the same amount of DNA (1 μg) from these three P. syringae strains to detect the 6mA levels during both logarithmic and stationary phases. The results revealed that 6mA levels in the stationary phase were much higher compared to the logarithmic phase in Psph and Pst, but showed no significant change in Pss. Additionally, we found that during the stationary phase, 6mA methylation levels in Psph and Pst were higher than those in Pss. These findings were consistent with the MTase predictions for these three strains, since Pss does not harbor any type I R-M systems, which are important for 6mA modification in bacteria.”
Please see Figure S5A and Lines 220-228 in the revised manuscript.
We also tried to construct an HsdM mutant in Pst to explore whether the influence of 6mA methylation was conserved in P. syringae, but it failed after multiple attempts. We did not construct a Pss mutant because no type I R-M system was predicted, and few methylation sites were identified via SMRT-seq in this strain. Therefore, we overexpressed HsdM in Pst instead. We have performed additional experiments in WT and the HsdM overexpression strains, including dot blot and growth curve assays.
Please see Figures S5B-C and Lines 228-232 in the revised manuscript.
The authors should include proper statistical analysis of their data. A lot of terms are descriptive but not supported by a deeper analysis to sustain the conclusions. For example, in Figure 4E, we do not know if the overlap is significant or not. Are DEGs more overlapping to 6mA sites than non-DEGs? Here is a non-exhaustive list of terms that need to be supported by statistics: different level (L145), greater conservation (L162), significant conservation (L165), considerable similarity (L175), credible motifs (L189), Less strong (L277) and several "lower" and "higher" throughout the text.
Thank you for the insightful feedback. We have made the following revisions in the manuscript to ensure that the terms are more precise and do not require statistical significance testing.
(1) Statistical analysis: We have added a statistical test for the overlap between DEGs and 6mA sites in Figure 4E. We performed Fisher's exact test and found that the overlap was not significant (p > 0.05); neither DEGs nor non-DEGs overlapped significantly with 6mA sites. Please see Figure 4E and Lines 261-262.
“Less strong” was used to describe the influence of HsdM on biofilm formation in Figure 5D. All figures with “*” labels were analyzed using two-tailed Student's t-tests, with “*” indicating a significant change (p < 0.05).
(2) Revised wording: For terms used to describe comparisons, we have revised the wording to be clearer and ensure that the terminology used did not imply the need for statistical significance testing when not required. For example:
“Different level” has been removed. Please see Lines 143-144.
“Greater conservation” has been revised to “more conserved functional terms”. Please see Lines 161-162.
“Significant conservation” has been revised to “notable conservation”. Please see Line 165.
“Credible motifs” has been revised to “identified motifs”. Please see Line 186.
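The overlap test mentioned in item (1) can be illustrated with a minimal sketch. It assumes a one-sided Fisher's exact test on a 2×2 table (DEG vs. non-DEG against genes with or without a 6mA site); the response does not state which variant or software the authors used, and the function name and table layout here are hypothetical.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Hypothetical layout (an assumption, not taken from the manuscript):
        a = DEGs with a 6mA site      b = DEGs without a 6mA site
        c = non-DEGs with a 6mA site  d = non-DEGs without a 6mA site
    Returns P(X >= a) under the hypergeometric null, i.e. the probability
    of seeing at least this much overlap by chance.
    """
    row1 = a + b          # number of DEGs
    col1 = a + c          # number of genes with a 6mA site
    total = a + b + c + d
    max_k = min(row1, col1)
    numerator = sum(
        comb(col1, k) * comb(total - col1, row1 - k)
        for k in range(a, max_k + 1)
    )
    return numerator / comb(total, row1)
```

A p-value above 0.05 from such a test would correspond to the "not significant" overlap reported in the response; the actual gene counts are in the manuscript and are not reproduced here.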
The authors performed SMRT sequencing of the delta hsdMSR showing a reduction of 6mA. Could they include a description of their results similar to Figures 1-2? How reduced is the 6mA level? Is it everywhere in the genome? Does it affect other methylation marks? This analysis would strengthen their conclusions.
Yes, we agree. We have provided additional analysis and descriptions to strengthen the conclusions in response to these valuable comments. We determined the three types of methylation sites in the HsdMSR mutant strain and compared the overlapping genes among these modification patterns. Specifically, we focused on the 6mA sites in Psph WT, the HsdMSR mutant, and the HsdM motif CAGCN<sub>(6)</sub>CTC. As expected, we found that almost all of the 6mA sites lost in ΔhsdMSR were from the motif CAGCN<sub>(6)</sub>CTC. We also noticed that the 5mC and 4mC sites in the mutant were relatively similar to those in WT, and the slight difference might be caused by sequencing errors. Overall, we propose that HsdMSR catalyzes only the 6mA located on the motif CAGCN<sub>(6)</sub>CTC, and does not affect other 6mA sites or other modification types.
Please see Figures S4D-E and Lines 212-218 in the revised manuscript.
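To make the motif notation concrete, the following is a minimal sketch of how occurrences of CAGCN<sub>(6)</sub>CTC could be located on both strands of a sequence. It is an illustration only, not the authors' pipeline (motif calling in the study came from SMRT-seq analysis); the function names are hypothetical, and N is assumed to mean any of A, C, G, or T.

```python
import re

# Motif from the response: CAGC, six unspecified bases, then CTC.
MOTIF = "CAGC" + "N" * 6 + "CTC"

IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T, N only)."""
    return seq.translate(COMPLEMENT)[::-1]


def motif_to_regex(motif):
    """Compile a motif into a regex; the lookahead reports overlapping hits."""
    body = "".join(IUPAC[base] for base in motif)
    return re.compile("(?=(" + body + "))")


def find_motif_sites(genome, motif=MOTIF):
    """Return sorted (0-based start, strand) pairs for every motif occurrence.

    Minus-strand hits are found by searching the forward sequence for the
    reverse complement of the motif.
    """
    genome = genome.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        for match in motif_to_regex(pattern).finditer(genome):
            hits.append((match.start(), strand))
    return sorted(hits)
```

Comparing such a site list against the 6mA calls in WT and ΔhsdMSR is one way to check that the lost sites fall on this motif, as the response describes.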
In Figure 6E to conclude that methylation is required on both strands, the authors are missing the control CAGCN6CGC construct otherwise the effect could be linked to the A on the complementary strand.
Thank you for your valuable suggestions. We have provided the control result on the complementary strand. Please see Figure 6C. The new result supports the conclusion that 6mA methylation regulates gene transcription based on methylation on both strands.
Please see Figure 6C and Lines 329-330 in the revised manuscript.
Reviewer #2 (Public Review):
In the present manuscript, Huang et al. revealed the significant roles of the DNA methylome in regulating virulence and metabolism within Pseudomonas syringae, with a particular focus on the HsdMSR system in this model strain. The authors used SMRT-seq to profile the DNA methylation patterns (6mA, 5mC, and 4mC) in three P. syringae strains (Psph, Pss, and Pst) and displayed the conservation among them. They further identified the type I restriction-modification system (HsdMSR) in P. syringae, including its specific motif sequence. HsdMSR participated in metabolism and virulence (T3SS and biofilm formation), as demonstrated through RNA-seq analyses. Additionally, the authors revealed the mechanisms of transcriptional regulation by 6mA. Strictly from the point of view of the interest of the question and the work carried out, this is a worthy and timely study that uses third-generation sequencing technology to characterize DNA methylation in P. syringae. The experimental approaches were solid, and the results obtained were interesting and provided new information on how epigenetics influences transcription in P. syringae. The conclusions of this paper are mostly well supported by data, but some aspects of data analysis and discussion need to be clarified and extended.
Thank you for your positive feedback and recognition of the importance of our study. We appreciate the suggestions for further clarification and extension of some aspects of data analysis and discussion. We added further analysis of the SMRT-seq result of the ΔhsdMSR and overexpressed HsdM in Pst to provide more information on conservation. We added these contents to the discussion in the revised manuscript. Please see Figure 6C and Figure S5.
Reviewer #3 (Public Review):
Summary:
The article by Huang et al. presents an in-depth study on the role of DNA methylation in regulating virulence and metabolism in Pseudomonas syringae, a model phytopathogenic bacterium. This comprehensive research utilized single-molecule real-time (SMRT) sequencing to profile the DNA methylation landscape across three model pathovars of P. syringae, identifying significant epigenetic mechanisms through the Type-I restriction-modification system (HsdMSR), which includes a conserved sequence motif associated with N6-methyladenine (6mA). The study provides novel insights into the epigenetic mechanisms of P. syringae, expanding the understanding of bacterial pathogenicity and adaptation. The use of SMRT sequencing for methylome profiling, coupled with transcriptomic analysis and in vivo validation, establishes a robust evidence base for the findings.
Strengths:
The results are presented clearly, with well-organized figures and tables that effectively illustrate the study's findings.
Weaknesses:
It would be helpful to add more details, especially in the methods, which make it easy to evaluate and enhance the manuscript's reproducibility.
Thank you for the positive evaluation of our study, as well as the constructive feedback provided. We have added more details in methods for RNA-seq analysis and Ribo-seq analysis. Please see Lines 484-515.
“Briefly, bacteria were cultured to an OD<sub>600</sub> of 0.4, at which point chloramphenicol was added to a final concentration of 100 µg/mL for 2 minutes. Cells were then pelleted and washed with pre-chilled lysis buffer [25 mM Tris-HCl, pH 8.0; 25 mM NH4Cl; 10 mM MgOAc; 0.8% Triton X-100; 100 U/mL RNase-free DNase I; 0.3 U/mL Superase-In; 1.55 mM chloramphenicol; and 17 mM GMPPNP]. The pellet was resuspended in lysis buffer, followed by three freeze-thaw cycles using liquid nitrogen. Sodium deoxycholate was then added to a final concentration of 0.3% before centrifugation. The resulting supernatant was adjusted to 25 A260 units and mixed with 2 mL of 500 mM CaCl<sub>2</sub> and 12 µL MNase, making up a total volume of 200 µL. After the digestion, the reaction was quenched with 2.5 mL of 500 mM EGTA. Monosomes were isolated using Sephacryl S400 MicroSpin columns, and RNA was purified using the miRNeasy Mini Kit (Qiagen). rRNA was removed using the NEBNext rRNA Depletion Kit, and the final library was constructed with the NEBNext Small RNA Library Prep Kit. For each sample, ribosome footprint reads were mapped to the Psph 1448A reference genome, and the translational efficiency was calculated by dividing the normalized Ribo-seq counts by the normalized RNA counts. Two biological replicates were performed for all experiments.”
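The final step of the quoted method, dividing normalized Ribo-seq counts by normalized RNA counts, can be sketched as follows. The normalization scheme is not specified in the response, so counts-per-million scaling is an assumption made here for illustration; the function names and the handling of zero-RNA genes are likewise hypothetical.

```python
def counts_per_million(counts):
    """Scale raw per-gene read counts to counts per million (assumed scheme)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("library has no mapped reads")
    return {gene: count * 1e6 / total for gene, count in counts.items()}


def translational_efficiency(ribo_counts, rna_counts):
    """Per-gene TE = normalized Ribo-seq count / normalized RNA-seq count.

    Genes with no RNA-seq reads are skipped, since the ratio is undefined.
    """
    ribo = counts_per_million(ribo_counts)
    rna = counts_per_million(rna_counts)
    efficiency = {}
    for gene, rna_cpm in rna.items():
        if rna_cpm == 0:
            continue  # avoid division by zero
        efficiency[gene] = ribo.get(gene, 0.0) / rna_cpm
    return efficiency
```

With two biological replicates, as stated in the method, one reasonable choice would be to compute this ratio per replicate and then compare the values between WT and mutant.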
Recommendations For The Authors:
Reviewer #1 (Recommendations For The Authors):
I would recommend the authors limit their manuscript to Psph pathovar and include statistical analysis supporting their conclusions.
Thank you for your suggestion.
Minor
• L104: "significantly" please add a p-value and explain the analysis.
Sorry for the confusion. We have added the p-value and explained the analysis in the method section. The p-value used for SMRT-seq was the modification quality value (QV) score, which was used to call the modified bases A (QV=50) and C (QV=100). Please see Lines 452-454.
• Figures 1B, D, F, and Figure 2A: make the Venn diagram to scale
Yes, we have revised.
• L110-111: missing p-value to say that the authors observe a bigger overlap in Pst than Psph as they observed more modified sites in Pst
Sorry for the confusion. We described the overlap as bigger in Pst because the value of 17.7 in Pst was larger than the value of 15 in Psph. To avoid misunderstanding, we revised the wording to “more genes equipped with all three modification types were detected in Pst than Psph”. Please see Lines 110-111.
• L112: missing description of their Pss analysis (IDP, sites...)
We have added the information for Pss in the revised manuscript.
“Additionally, the methylome atlas of Pss revealed a lower incidence of methylation than those of Psph and Pst, particularly in terms of 6mA modifications, which were only seen in 457 significant 6mA occurrences under the same threshold (IPD > 1.5) and a total of 2,853 and 1,438 methylation sites were detected as 5mC and 4mC, respectively”. Please see Lines 114-116.
• L118: "modification" to "modified "
We have revised. Please see Line 119.
• L120: "modification sites" to "modified nucleotides"
We have revised. Please see Line 121.
• L142: correct the title "Methylated genes revealed highly functional conservation among three P. syringae strains" maybe to "Methylated genes are functionally conserved among ..."
We have revised. Please see Line 142.
• Figure 2C: not easy to read and interpret
Sorry for the confusion. Figure 2C revealed the significantly enriched functional pathways in GO and KEGG databases among three P. syringae strains. The specific names of each pathway were listed on the left, and each column with dots indicated the number of genes within one kind of methylation in one of three P. syringae strains. The larger the size, the bigger the number.
We have revised the legend of Figure 2C. Please see Lines 575-579.
“The dot plot revealed the significantly enriched functional pathways in GO and KEGG databases among three P. syringae strains. The specific names of each pathway were listed on the left, and each column with dots indicated the number of genes within one kind of methylation in one of three P. syringae strains. The size of the dots indicates the number of related genes.”
• Figure 6B-C: what is the difference between B 24h and C?
Figure 6B revealed the expression difference between WT and mutant during 24 hours. Figure 6C only showed a time point in 24 hours. To avoid repetition, we have removed Figure 6C.
• Figure 6C-D: if the same maybe remove Figure 2C
We have removed Figure 6D.
Reviewer #2 (Recommendations For The Authors):
The manuscript could be improved by addressing the following concerns:
(1) In line 146: How to understand the percentage conserved in "more than two of the strains"?
Sorry for the confusion. We intended to describe the genes that were conserved in either two or all three strains. We have revised it to: “Notably, about 25% to 45% of methylated genes were conserved in two or three strains”. Please see Line 145.
(2) In line 178: Five conserved sequence motifs should be replaced by "Six conserved sequence motifs".
We have revised. Please see Line 176.
(3) In Figure 2B, specify the C1, C2 and C3. "m6A" should be replaced by "6mA".
Yes, we have revised.
(4) In Figure S2, "m6A" should be replaced by "6mA".
Yes, we have revised.
(5) In line 212, please add references for the previous studies showing that growth conditions affect bacterial methylation status.
Thank you for your suggestion. We have added the relevant references (Gonzalez and Collier, 2013), (Krebes et al., 2014), (Sanchez-Romero and Casadesus, 2020).
(6) In line 217, "illustrate" should be "illustrated".
Yes, we have revised. Please see Line 210.
(7) There are some genes colored in grey, revealing bigger differences between the two strains than those related to ribosomal protein, T3SS, and alginate synthesis in Fig. 4A. Do they have important functional roles as well?
Thank you for your suggestion. Apart from genes related to ribosomal proteins, T3SS, and alginate synthesis, a total of 116 genes showed bigger differences (|Log<sub>2</sub>FC| > 2). Among these genes, 31 were annotated as hypothetical proteins and 4 as transcription factors with unknown functions, and the remaining genes mostly encoded metabolism-related enzymes. These enzymes might contribute to the growth defects in ΔhsdMSR. We added this information to the revised manuscript. Please see Lines 249-254.
(8) The authors should discuss what will be the potential signals or factors that can regulate the activity of HsdMSR. In other words, what can decide the extent of methylation through activating or suppressing the expression of HsdMSR?
Thank you for your valuable suggestion. We have added this part to the Discussion. Please see Lines 404-415.
“Apart from the established roles of 6mA and HsdMSR in P. syringae, certain signals or factors may influence HsdMSR expression. For instance, we confirmed that the growth phase affects methylation levels in P. syringae. Previous studies have shown that increased temperatures can reduce methylation levels, as observed in PAO1(Doberenz et al., 2017). These findings suggest that HsdMSR expression may be responsive to both intrinsic cellular states and extrinsic environmental conditions. To further explore potential upstream TFs regulating the expression of HsdMSR, we searched for TF-binding sites in the HsdMSR promoter using our own databases (Fan et al., 2020; Shao et al., 2021; Sun et al., 2024). As a result, we found three candidate TFs (PSPPH_0061, PSPPH_3268, and PSPPH_3504) that might directly bind and regulate HsdMSR expression. Future studies on these TFs and their interactions with the HsdMSR promoter would help clarify the regulatory network governing HsdMSR activity.”
Reviewer #3 (Recommendations For The Authors):
(1) Some figures contain dense information, which may be overwhelming for readers. Streamlining the legend for Figure 1 and resizing the Venn diagrams within it could enhance clarity and visual appeal.
Thank you for your suggestion. We have scaled all the Venn plots in the revised version.
(2) Incorporating a discussion about the role of the restriction-modification (RM) system in bacterial defense against phage infection into the discussion section could enrich the manuscript's context and relevance.
Thank you for your valuable suggestion. We have added this part in the Discussion part. Please see Lines 416-427.
“RM systems are known for their intrinsic role as innate immune systems against phage infection, and are present in around 90% of bacterial genomes (Oliveira et al., 2014). RM systems protect bacteria by recognizing and degrading foreign phage DNA via methylation-specific sites and restriction endonucleases (REases) (Loenen et al., 2014). In addition, other phage-resistance systems are similar to RM systems but carry extra genes. One is called the phage growth limitation (Pgl) system, which modifies and cleaves phage DNA. However, Pgl only modifies the phage DNA in the first infection cycle and cleaves phage DNA in subsequent infections, which warns the neighboring cells (Hampton et al., 2020; Hoskisson et al., 2015). To counteract RM and RM-like systems, phages have evolved strategies, including unusual modifications such as hydroxymethylation, glycosylation, and glucosylation. They can also encode their own MTases to protect their DNA or employ strategies to evade restriction systems and other anti-RM defenses (Iida et al., 1987; Murphy et al., 2013; Vasu and Nagaraja, 2013).”
(3) In line 152: What is the importance of the mentioned example of Cro/CI family TF?
Thank you for your comments. Cro and CI are important TFs present in phages. The interaction between Cro and CI affects bacterial immunity status in Enterohemorrhagic Escherichia coli (EHEC) strains (Jin et al., 2022). RM systems are known as a kind of phage-defense system, and hence we mentioned it here. We have added this description in the revised manuscript. Please see Lines 152-154.
References:
(1) Doberenz, S., Eckweiler, D., Reichert, O., Jensen, V., Bunk, B., Sproer, C., Kordes, A., Frangipani, E., Luong, K., Korlach, J., et al. (2017). Identification of a Pseudomonas aeruginosa PAO1 DNA Methyltransferase, Its Targets, and Physiological Roles. mBio 8. 10.1128/mBio.02312-16.
(2) Fan, L., Wang, T., Hua, C., Sun, W., Li, X., Grunwald, L., Liu, J., Wu, N., Shao, X., Yin, Y., et al. (2020). A compendium of DNA-binding specificities of transcription factors in Pseudomonas syringae. Nat Commun 11, 4947. 10.1038/s41467-020-18744-7.
(3) Gonzalez, D., and Collier, J. (2013). DNA methylation by CcrM activates the transcription of two genes required for the division of Caulobacter crescentus. Mol Microbiol 88, 203-218. 10.1111/mmi.12180.
(4) Hampton, H.G., Watson, B.N., and Fineran, P.C. (2020). The arms race between bacteria and their phage foes. Nature 577, 327-336.
(5) Hoskisson, P.A., Sumby, P., and Smith, M.C. (2015). The phage growth limitation system in Streptomyces coelicolor A3(2) is a toxin/antitoxin system, comprising enzymes with DNA methyltransferase, protein kinase and ATPase activity. Virology 477, 100-109.
(6) Iida, S., Streiff, M.B., Bickle, T.A., and Arber, W. (1987). Two DNA antirestriction systems of bacteriophage P1, darA, and darB: characterization of darA− phages. Virology 157, 156-166.
(7) Jin, M., Chen, J., Zhao, X., Hu, G., Wang, H., Liu, Z., and Chen, W.-H. (2022). An engineered λ phage enables enhanced and strain-specific killing of enterohemorrhagic Escherichia coli. Microbiology Spectrum 10, e01271-22.
(8) Krebes, J., Morgan, R.D., Bunk, B., Sproer, C., Luong, K., Parusel, R., Anton, B.P., Konig, C., Josenhans, C., Overmann, J., et al. (2014). The complex methylome of the human gastric pathogen Helicobacter pylori. Nucleic Acids Res 42, 2415-2432. 10.1093/nar/gkt1201.
(9) Loenen, W.A., Dryden, D.T., Raleigh, E.A., Wilson, G.G., and Murray, N.E. (2014). Highlights of the DNA cutters: a short history of the restriction enzymes. Nucleic Acids Res 42, 3-19.
(10) Murphy, J., Mahony, J., Ainsworth, S., Nauta, A., and van Sinderen, D. (2013). Bacteriophage orphan DNA methyltransferases: insights from their bacterial origin, function, and occurrence. Appl Environ Microb 79, 7547-7555.
(11) Oliveira, P.H., Touchon, M., and Rocha, E.P. (2014). The interplay of restriction-modification systems with mobile genetic elements and their prokaryotic hosts. Nucleic Acids Res 42, 10618-10631.
(12) Sanchez-Romero, M.A., and Casadesus, J. (2020). The bacterial epigenome. Nature reviews. Microbiology 18, 7-20. 10.1038/s41579-019-0286-2.
(13) Shao, X., Tan, M., Xie, Y., Yao, C., Wang, T., Huang, H., Zhang, Y., Ding, Y., Liu, J., Han, L., et al. (2021). Integrated regulatory network in Pseudomonas syringae reveals dynamics of virulence. Cell Rep 34, 108920. 10.1016/j.celrep.2021.108920.
(14) Sun, Y., Li, J., Huang, J., Li, S., Li, Y., Lu, B., and Deng, X. (2024). Architecture of genome-wide transcriptional regulatory network reveals dynamic functions and evolutionary trajectories in Pseudomonas syringae. bioRxiv, 2024.01.18.576191.
(15) Vasu, K., and Nagaraja, V. (2013). Diverse functions of restriction-modification systems in addition to cellular defense. Microbiol Mol Biol Rev 77, 53-72. 10.1128/MMBR.00044-12.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This manuscript offers valuable theoretical predictions on how horizontal gene transfer (HGT) can lead to alternative stable states in microbial communities. Using a modeling framework, solid theoretical evidence is provided to support the claimed role of HGT. However, given that the model has many degrees of freedom, a more comprehensive analysis of the role of different parameters could strengthen the study. Additionally, potential interactions between plasmids that carry out HGT are not discussed in the model. This paper would be of interest to researchers in microbiology, ecology, and evolutionary biology.
-
Reviewer #2 (Public review):
Summary:
In this work, the authors use a theoretical model to study the potential impact of horizontal gene transfer on the number of alternative stable states of microbial communities. For this, they use a modified version of the competitive Lotka-Volterra model, which accounts for the effects of pairwise competitive interactions on species growth. The modified model incorporates terms for an added death (dilution) rate acting on all species and for the rates of horizontal transfer of mobile genetic elements, which can, in turn, affect species growth rates. The authors analyze the impact of horizontal gene transfer in different scenarios (such as bistability between pairs of species and multistability in communities) over an extended range of parameter values. In almost all these cases, the authors report an increase in either the number of alternative stable states or the parameter region (e.g. growth rate values) in which they occur.
Understanding the origin of alternative stable states in microbial communities and how often they may occur is an important challenge in microbial ecology and evolution. Shifts between these alternative stable states can drive transitions between e.g. a healthy microbiome and dysbiosis. A better understanding of how horizontal gene transfer can drive multistability could help predict alternative stable states in microbial communities, as well as inspire novel treatments to steer communities towards the most desired (e.g. healthy) stable states. In my opinion, this manuscript is a solid theoretical approach to the subject.
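As a point of reference for the summary above, the following is a minimal sketch of bistability in a two-species competitive Lotka-Volterra model with a shared dilution term. It deliberately omits the horizontal gene transfer terms, whose exact form is not given in this review, so it is not the authors' model; the equations, parameter values, and function name are assumptions chosen only to show how the initial composition can decide the final state.

```python
def simulate_two_species(x1, x2, mu=(1.0, 1.0), gamma=(1.5, 1.5),
                         dilution=0.2, dt=0.01, steps=20000):
    """Forward-Euler integration of an assumed two-species model:

        dx1/dt = mu1 * x1 * (1 - x1 - g12 * x2) - D * x1
        dx2/dt = mu2 * x2 * (1 - x2 - g21 * x1) - D * x2

    With strong mutual competition (g12, g21 > 1) each species can exclude
    the other, so the outcome depends on the starting abundances.
    """
    mu1, mu2 = mu
    g12, g21 = gamma
    for _ in range(steps):
        dx1 = mu1 * x1 * (1.0 - x1 - g12 * x2) - dilution * x1
        dx2 = mu2 * x2 * (1.0 - x2 - g21 * x1) - dilution * x2
        # Clamp at zero so numerical error cannot produce negative abundances.
        x1 = max(x1 + dt * dx1, 0.0)
        x2 = max(x2 + dt * dx2, 0.0)
    return x1, x2


# Same parameters, different starting compositions.
state_a = simulate_two_species(0.6, 0.1)  # species 1 starts ahead
state_b = simulate_two_species(0.1, 0.6)  # species 2 starts ahead
```

In this parameter regime the species that starts ahead excludes the other and settles at 1 - D/mu, which is the kind of alternative stable state the manuscript counts; the study asks how gene transfer changes the number and range of such states.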
Strengths:
- Generality of the model: the work is based on a phenomenological model that has been extensively used to predict the dynamics of ecological communities in many different scenarios.
- The question of how horizontal gene transfer can drive alternative stable states in microbial communities is important and there are very few studies addressing it.
Weaknesses:
- In the revised version of the manuscript, the authors significantly extended the analyzed region of parameter values. Still, the model has many parameters and the analysis is typically done by changing one or two parameters at a time. Thus, the work shows how HGT can indeed promote multistability, but it remains hard to grasp whether it consistently does so across a large region of the parameter space.
-
Reviewer #3 (Public review):
Hong et al. used a model they previously developed to study the impact of plasmid transfer on microbial multispecies communities. They investigated the effect of plasmid transfer on the existence of alternative stable states in a community. The model most closely resembles plasmid conjugation, where the transferred genes confer independent growth-related fitness effects and different plasmids do not affect each other's transfer or growth effects. For this process, the authors find that increasing the rate of plasmid transfer leads to an increasing number of stable states, as long as the model includes a constant death/dilution term.
This is an interesting and important topic, and I welcome the authors' efforts to explore these topics with mathematical modeling. The addition of sensitivity analyses also strengthens the usefulness for quantitative microbial ecologists. However, the additional sections have made the main text harder to read. Between the effect of the dilution rate, the increase in subpopulations with HGT, and the modulation of interspecies competition, the reviewers have suggested a number of factors that may explain the way plasmid transfer modulates multistability. I think it would be helpful if the authors could summarize some of these effects/interactions between different parameters in their model more. I personally continue to find the model very unintuitive, especially in the way it averages over subpopulations carrying more than one foreign plasmid. Additional sentences that give the reader intuition for the sensitivity analyses and how these interplay with the results would be good.
Specific points
(1) The model makes strong assumptions about the biology of HGT, that could be spelled out even more. Since the model is primarily applicable to HGT driven by the exchange of plasmids, I believe the abstract (and perhaps even the title of the paper) should be updated to reflect that.
(2) I am not surprised that a mechanism that creates diversity will lead to more alternative stable states. Specifically, the null model for the absence of HGT is to set gamma to zero, resulting in pij=0 for all subpopulations (line 454). This means that a model with N^2 classes is effectively reduced to N classes. It seems intuitive that an LV-model with many more species would also allow for more alternative stable states. For a fair comparison one would really want to initialize these subpopulations in the model (with the same growth rates - e.g. mu1(1+lambda2)) but without gene mobility.<br /> [Update:] It is good to see that initializing pij with non-zero abundance did not seem to affect the conclusion that higher rates of HGT increase multistability. However, rather than listing it as one control for a specific condition, I would argue that this is the appropriate null model across the board (where HGT rate is varied from 0 to a high value), including figures S9 and S10.
(3) The possibility that the same cell may be counted in different pij runs counter to all intuition that researchers coming from a background of compartmental/epidemiological modeling may have. The associated assumption that plasmids do not affect each other's dynamics or (growth/interaction) effects at all is also a very strong assumption. This should be signaled much earlier in the manuscript, possibly already in line 106 when the model is introduced.
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public review):
Summary:
The authors present a modelling study to test the hypothesis that horizontal gene transfer (HGT) can modulate the outcome of interspecies competition in microbiomes, and in particular promote bistability in systems across scales. The premise is a model developed by the same authors in a previous paper where bistability happens because of a balance between growth rates and competition for a mutual resource pool (common carrying capacity). They show that introducing a transferrable element that gives a "growth rate bonus" expands the region of parameter space where bistability happens. The authors then investigate how often (in terms of parameter space) this bistability occurs across different scales of complexity, and finally under selection for the mobile element (framed as ABR selection).
Strengths:
The authors tackle an important, yet complex, question: how do different evolutionary processes impact the ecology of microbial ecosystems? They do a nice job at increasing the scales of heterogeneity and asking how these impact their main observable: bistability.
We appreciate the reviewer for agreeing with the potential value of our analysis. We are also grateful for the constructive comments and suggestions on further analyzing the influence of the model structure and the associated assumptions. We have fully addressed the raised issues in the updated manuscript and below.
Weaknesses:
The authors' starting point is their interaction LV model and the manuscript then explores how this model behaves under different scenarios. Because the structure of the model and the underlying assumptions essentially dictate these outcomes, I would expect to see much more focus on how these two aspects relate to the specific scenarios that are discussed. For example:
A key assumption is that the mobile element conveys a multiplicative growth rate benefit (1+lambda). However, the competition between the species is modelled as a factor gamma that modulates the competition for overall resource and thus appears in the saturation term (1+ S1/Nm + gamma2*S2/Nm). This means that gamma changes the perceived abundance of the other species (if gamma > 1, then from the point of view of S1 it looks like there are more S2 than there really are). Most importantly, the relationship between these parameters dictates whether or not there will be bistability (as the authors state).
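To make the baseline concrete, the following is a minimal sketch of a two-species competitive LV model with dilution and no gene transfer. The equation form, the sign convention of the saturation term, the function name `simulate`, and all parameter values are assumptions chosen for illustration, not the authors' implementation. Abundances are expressed in units of the carrying capacity Nm.

```python
def simulate(s1, s2, mu1=1.0, mu2=1.0, g12=1.5, g21=1.5, D=0.2,
             dt=0.01, steps=20000):
    """Forward-Euler integration of an assumed two-species LV model.

    dS1/dt = mu1*S1*(1 - S1 - g12*S2) - D*S1
    dS2/dt = mu2*S2*(1 - S2 - g21*S1) - D*S2

    g12 is the (hypothetical) competition coefficient of species 2 acting
    on species 1, g21 the reverse; D is the death/dilution rate.
    """
    for _ in range(steps):
        d1 = mu1 * s1 * (1 - s1 - g12 * s2) - D * s1
        d2 = mu2 * s2 * (1 - s2 - g21 * s1) - D * s2
        s1, s2 = s1 + dt * d1, s2 + dt * d2
    return s1, s2


# Same parameters, different starting abundances.
start_high_s1 = simulate(0.6, 0.1)
start_high_s2 = simulate(0.1, 0.6)
```

With competition coefficients above 1, the two runs should settle on different winners (the initially more abundant species), which is the bistability the comment refers to. How a mobile element shifts this picture depends on which of `mu` and `g` it modifies.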
This decoupling between the transferred benefit and the competition can have different consequences. One of them is that - from the point of view of the mobile element - the mobile element competes at different strengths within the same population compared to between. To what degree introducing such a mobile element modifies the baseline bistability expectation thus strongly depends on how it modifies gamma and lambda.
Thus, this structural aspect needs to be much more carefully presented to help the reader follow how much of the results are just trivial given the model assumptions and which have more of an emergent flavour. From my point of view, this has an important impact on helping the reader understand how the model that the authors present can contribute to the understanding of the question "how microbes competing for a limited number of resources stably coexist". I do appreciate that this changes the focus of the manuscript from a presentation of simulation results to more of a discussion of mathematical modelling.
We thank the reviewer for the insightful suggestions. We agree with the reviewer that the model structure and the underlying assumptions need to be carefully discussed, in order to understand the generality of the theoretical predictions. In particular, the reviewer emphasized that how HGT affects bistability might depend on how mobile genetic elements modified growth rates and competition. In the main text, we have shown that when mobile genes only influence species growth rates, HGT is expected to promote multistability (Fig. 1 and 2). However, when mobile genes modify species interactions, the effect of HGT on multistability is dependent on how mobile genes change competition strength (Fig. 3a to f). When mobile genes increase competition, HGT promotes multistability (Fig. 3c and e). In contrast, when mobile genes relax competition, HGT is expected to reduce multistability (Fig. 3d and f).
In light of the reviewer’s comments, we have further generalized the model structure by accounting for the scenario where mobile genes simultaneously modify growth rates and competition. The effect of mobile genes on growth rates is represented by the magnitude of the 𝜆 values, and the influence on competition is described by another parameter, 𝛿. By varying these two parameters, we can evaluate how the model structure and the underlying assumptions affect the baseline expectation. We performed additional simulations with broad ranges of 𝜆 and 𝛿 values. In particular, we analyzed whether HGT would promote the likelihood of bistability in two-species communities compared with the scenario without gene transfer (Fig. 3g-i). Our results suggested that: (1) with or without HGT, reducing 𝜆 (increasing neutrality) promotes bistability; (2) with HGT, increasing 𝛿 promotes bistability; (3) compared with the population without HGT, gene transfer promotes bistability when 𝛿 is zero or positive, but reduces bistability when 𝛿 is strongly negative. These results agree with the reviewer’s comment that the baseline bistability expectation depends on how HGT modifies gamma and lambda. In the updated manuscript, we have thoroughly discussed how the model structure and the underlying assumptions can influence the predictions (line 238-253).
We further expanded our analysis by calculating how other parameters, including competition strength, growth rate ranges, and death/dilution rate, would affect the multistability of communities undergoing horizontal gene transfer (Fig. S2, S3, S9, S10, S11, S12, S13, S15). Together with the results presented in the first draft, these analyses enable a more comprehensive understanding of how different mechanisms, including but not limited to HGT, collectively shape community multistability. In the updated manuscript, the reviewer can see the change of focus from exploring the effects of HGT to a more thorough discussion of the mathematical model. The revised text highlighted in blue and the supplemented figures reflect this change.
Reviewer #2 (Public review):
Summary:
In this work, the authors use a theoretical model to study the potential impact of Horizontal Gene Transfer on the number of alternative stable states of microbial communities. For this, they use a modified version of the competitive Lotka-Volterra model (which accounts for the effects of pairwise, competitive interactions on species growth) that incorporates terms for the effects of both an added death (dilution) rate acting on all species and the rates of horizontal transfer of mobile genetic elements, which can in turn affect species growth rates. The authors analyze the impact of horizontal gene transfer in different scenarios: bistability between pairs of species, multistability in communities, and a modular structure in the interaction matrix to simulate multiple niches. They also incorporate additional elements to the model, such as spatial structure to simulate metacommunities and modification of pairwise interactions by mobile genetic elements. In almost all these cases, the authors report an increase in either the number of alternative stable states or the parameter region (e.g. growth rate values) in which they occur.
In my opinion, understanding the role of horizontal gene transfer in community multistability is a very important subject. This manuscript is a useful approach to the subject, but I'm afraid that a thorough analysis of the role of different parameters under different scenarios is missing in order to support the general claims of the authors. The authors have extended their analysis to increase their biological relevance, but I believe that the analysis still lacks comprehensiveness.
Understanding the origin of alternative stable states in microbial communities and how often they may occur is an important challenge in microbial ecology and evolution. Shifts between these alternative stable states can drive transitions between e.g. a healthy microbiome and dysbiosis. A better understanding of how horizontal gene transfer can drive multistability could help predict alternative stable states in microbial communities, as well as inspire novel treatments to steer communities towards the most desired (e.g. healthy) stable states.
Strengths:
(1) Generality of the model: the work is based on a phenomenological model that has been extensively used to predict the dynamics of ecological communities in many different scenarios.
(2) The question of how horizontal gene transfer can drive alternative stable states in microbial communities is important and there are very few studies addressing it.
We thank the reviewer for the positive comments on the potential novelty and conceptual importance of our work. We are also grateful for the constructive suggestions on the generality and comprehensiveness of our analysis. In particular, we agree with the reviewer that a thorough analysis of the role of different parameters could further improve the rigor of this work. We have fully addressed the raised issues in the updated manuscript and below.
Weaknesses:
(1) There is a need for a more comprehensive analysis of the relative importance of the different model parameters in driving multistability. For example, there is no analysis of the effects of the added death rate on multistability. This parameter has been shown to determine whether a given pair of interacting species exhibits bistability or not (see e.g. Abreu et al 2019 Nature Communications 10:2120). Similarly, each scenario is analyzed for a single value of interspecies interaction strength, with the exception of the case for mobile genetic elements affecting interaction strength, which considers three specific values. Considering heterogeneous interaction strengths (e.g. sampling from a random distribution) could also lead to more realistic scenarios; the authors generally considered that all species pairs interact with the same strength. Analyzing a larger range of growth rate effects of mobile genetic elements would also help generalize the results. In order to achieve a more generic assessment of the impact of horizontal gene transfer in driving multistability, its role should be systematically compared to the effects of the rest of the parameters of the model.
We appreciate the suggestions. For each of the parameters that the reviewer mentioned, we have performed additional simulations to evaluate its importance in driving multistability.
For the added death rate, we have calculated the bistability feasibility of two-species populations under different values of 𝐷. Our results suggested that (1) varying the death rate indeed changed the bistability probability of the system; (2) when the death rate was zero, mobile genetic elements that only modify growth rates would have no effect on the system’s bistability. These results highlighted the importance of the added death rate in driving multistability (Fig. S2, line 136-142).
For the interspecies interaction strength, we first extended our analysis on two-species populations. By calculating the bistability probability under different values of 𝛾, we showed that when interspecies interaction strength was smaller than 1, the influence of HGT on population bistability became weak (Fig. S3, line 143-147). We also considered heterogeneous interaction strengths in multispecies communities, by randomly sampling 𝛾<sub>ij</sub> values from uniform distributions. While our results suggested the heterogeneous distribution of 𝛾<sub>ij</sub> didn’t fundamentally change the main conclusion, the mean value and variance of 𝛾<sub>ij</sub> affected the influence of HGT on multistability. The effects of HGT on community multistability become stronger when the mean value of 𝛾<sub>ij</sub> gets larger than 1 and the variance of 𝛾<sub>ij</sub> is small (Fig. S12, line 190-196).
We also analyzed different ranges of growth rate effects of mobile genetic elements. In particular, we sampled 𝜆<sub>ij</sub> values from uniform distributions with given widths. A greater width led to a larger range of growth rate effects. We used five-species populations as an example and tested different ranges. Our results suggested that multistability was more feasible when the growth rate effects of MGEs were small. The qualitative relationship between HGT and community multistability was not dependent on the range of growth rate effects (Fig. S13, line 197-205).
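The role of the death rate can be illustrated with a mutual-invasibility check on a plain two-species LV model without gene transfer. This is a hedged sketch: the equation form, the function name `is_bistable`, and the example values are assumptions, not the authors' code. In this assumed model each monoculture settles at an abundance of 1 - D/mu (in units of the carrying capacity), and the pair is bistable when neither species can invade the other's monoculture.

```python
def is_bistable(mu1, mu2, g12, g21, D):
    """Mutual non-invasibility test for the assumed model

    dS1/dt = mu1*S1*(1 - S1 - g12*S2) - D*S1
    dS2/dt = mu2*S2*(1 - S2 - g21*S1) - D*S2
    """
    # Monoculture abundances after accounting for dilution.
    k1 = 1 - D / mu1
    k2 = 1 - D / mu2
    if k1 <= 0 or k2 <= 0:
        return False  # at least one species washes out on its own
    # Species 2 cannot invade resident 1, and species 1 cannot invade resident 2.
    return g21 * k1 > k2 and g12 * k2 > k1


# Unequal growth rates, identical competition coefficients.
no_dilution = is_bistable(1.0, 0.5, 1.2, 1.2, D=0.0)
with_dilution = is_bistable(1.0, 0.5, 1.2, 1.2, D=0.2)
```

When D is zero, both monoculture abundances equal 1 and the criterion reduces to both competition coefficients exceeding 1, so growth rates drop out entirely. With D above zero the growth rates enter through 1 - D/mu, which is one way to see why a growth-rate-modifying element can only matter for bistability when there is a death or dilution term.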
(2) The authors previously developed this theoretical model to study the impact of horizontal gene transfer on species coexistence. In this sense, it seems that the authors are exploring a different (stronger interspecies competition) range of parameter values of the same model, which could potentially limit novelty and generality.
We appreciate the comment. In a previous work (PMID: 38280843), we developed a theoretical model that incorporated the horizontal gene transfer process into the classic LV framework. This model provides opportunities to investigate the role of HGT in different open questions of microbial ecology. In the previous work, we considered one fundamental question: how competing microbes coexist stably. In this work, however, we focused on a different problem: how alternative stable states emerge in complex communities. While the basic theoretical tool that we applied in the two works was similar, the scientific questions, application contexts and the implications of our analysis were largely different. The novelty of this work arose from the fact that it revealed the conceptual linkage between alternative stable states and a ubiquitous biological process, horizontal gene transfer. This linkage is largely unknown in previous studies. Exploring such a linkage naturally required us to consider stronger interspecies competition, which in general would diminish coexistence but give rise to multistability. We believe that the analysis performed in this work provides novel and valuable insights for the field of microbial ecology.
With all the supplemented simulations that we carried out in light of all the reviewer’s comments, we believe the updated manuscript also provides a unified framework to understand how different biological processes collectively shape the multistability landscape of complex microbiota undergoing horizontal gene transfer. The comprehensive analyses performed and the diverse scenarios considered in this study also contribute to the novelty and generality of this work.
(3) The authors analyze several scenarios that, in my opinion, naturally follow from the results and parameter value choices in the first sections, making their analysis not very informative. For example, after showing that horizontal gene transfer can increase multistability both between pairs of species and in a community context, the way they model different niches does not bring significantly new results. Given that the authors showed previously in the manuscript that horizontal gene transfer can impact multistability in a community in which all species interact with each other, one might expect that it will also impact multistability in a larger community made of (sub)communities that are independent of (not interacting with) each other, which is the proposed way of modelling niches. A similar argument can be made regarding the analysis of (spatially structured) metacommunities. It is known that, for small enough dispersal rates, space can promote regional diversity by enabling each local community to remain in a different stable state. Therefore, in conditions in which horizontal gene transfer drives multistability, it will also drive regional diversity in a metacommunity.
Thanks. Based on the reviewer’s comments, we have moved Fig. 3 and 4 to Supplementary Information. In the updated manuscript, we have focused more on analyzing the roles of different parameters in shaping community multistability.
(4) In some cases, the authors consider that mobile genetic elements can lead to ~50% growth rate differences. In the presence of an added death rate, this can be a relatively strong advantage that makes the fastest grower easily take over their competitors. It would be important to discuss biologically relevant examples in which such growth advantages driven by mobile genetic elements could be expected, and how common such scenarios might be.
We appreciate the suggestion. Mobile genetic elements can drive large growth rate differences when they encode adaptive traits like antibiotic resistance (line 197-198).
We also analyzed different ranges of growth rate effects of mobile genetic elements, by sampling 𝜆<sub>ij</sub> values from uniform distributions with given widths. Our results suggested that multistability was more feasible when the fitness effects of MGEs were small (Fig. S13b). The qualitative relationship between HGT and community multistability was not dependent on the range of growth rate effects (Fig. S13a and b). We discussed these results in line 197-205 of the updated main text.
Reviewer #3 (Public review):
Hong et al. used a model they previously developed to study the impact of horizontal gene transfer (HGT) on microbial multispecies communities. They investigated the effect of HGT on the existence of alternative stable states in a community. The model most closely resembles HGT through the conjugation of incompatible plasmids, where the transferred genes confer independent growth-related fitness effects. For this type of HGT, the authors find that increasing the rate of HGT leads to an increasing number of stable states. This effect of HGT persists when the model is extended to include multiple competitive niches (under a shared carrying capacity) or spatially distinct patches (that interact in a grid-like fashion). Instead, if the mobile gene is assumed to reduce between-species competition, increasing HGT leads to a smaller region of multistability and fewer stable states. Similarly, if the mobile gene is deleterious an increase in HGT reduces the parameter region that supports multistability.
This is an interesting and important topic, and I welcome the authors' efforts to explore these topics with mathematical modeling. The manuscript is well written and the analyses seem appropriate and well-carried out. However, I believe the model is not as general as the authors imply and more discussion of the assumptions would be helpful (both to readers + to promote future theoretical work on this topic). Also, given the model, it is not clear that the conclusions hold quite so generally as the authors claim and for biologically relevant parameters. To address this, I would recommend adding sensitivity analyses to the manuscript.
We thank the reviewer for agreeing that our work addressed an important topic and was well conducted. We are also grateful for the suggestion on sensitivity analysis, which is very helpful to improve the rigor and generality of our conclusion. All the raised issues have been fully addressed in the updated manuscript and below.
Specific points
(1) The model makes strong assumptions about the biology of HGT, that are not adequately spelled out in the main text or methods, and will not generally prove true in all biological systems. These include:
a) The process of HGT can be described by mass action kinetics. This is a common assumption for plasmid conjugation, but for phage transduction and natural transformation, people use other models (e.g. with free phage that adsorb to all populations and transfer in bursts).
b) A subpopulation will not acquire more than one mobile gene, subpopulations can not transfer multiple genes at a time, and populations do not lose their own mobilizable genes. [this may introduce bias, see below].
c) The species internal inhibition is independent of the acquired MGE (i.e. for p1 the self-inhibition is by s1).
These points are in addition to the assumptions explored in the supplementary materials, regarding epistasis, the independence of interspecies competition from the mobile genes, etc. I would appreciate it if the authors could be more explicit in the main text about the range of applicability of their model, and in the methods about the assumptions that are made.
We are grateful for the reviewer’s suggestions. In the main text and methods of the updated manuscript, we have made clear the assumptions underlying our analysis. For point (a), we have clarified that our model primarily focused on plasmid transfer dynamics (line 74, 101, 517). Therefore, the process of HGT can be described by mass action kinetics, which is commonly assumed for plasmid transfer (line 537-538). For point (b), our model allows a cell to acquire more than one mobile gene. Please see our response to point (3) for details. We have also made it clear that we assumed the populations would not lose their own mobile gene completely (line 526-527). For (c), we have also clarified it in the updated manuscript (line 111-112, 527-528).
We have also performed a series of additional simulations to show the range of applicability of our model. In particular, we discuss the role of other mechanisms, including interspecies interaction strength, the growth rate effects of MGEs, MGE epistasis and microbial death rates in shaping the multistability of microbial communities undergoing HGT. These results were provided in Fig. S2, S3, S9, S10, S11, S12, S13 and S15.
(2) I am not surprised that a mechanism that creates diversity will lead to more alternative stable states. Specifically, the null model for the absence of HGT is to set gamma to zero, resulting in pij=0 for all subpopulations (line 454). This means that a model with N^2 classes is effectively reduced to N classes. It seems intuitive that an LV-model with many more species would also allow for more alternative stable states. For a fair comparison, one would really want to initialize these subpopulations in the model (with the same growth rates - e.g. mu1(1+lambda2)) but without gene mobility.
We appreciate the insightful comments. The reviewer was right that in our model HGT created additional subpopulations in the community. However, with or without HGT, we calculated the species diversity and multistability based on the abundances of the 𝑁 species (s<sub>i</sub> in our model), instead of all the p<sub>ij</sub> subpopulations. Therefore, although there exist more ‘classes’ in the model with HGT, the number of ‘classes’ considered when we calculated community diversity and multistability was equal. In light of the reviewer’s suggestion, we have also performed additional simulations, where we initialized the subpopulations in the model with nonzero abundances. Our results suggested that initializing the p<sub>ij</sub> subpopulations with non-zero abundances didn’t change the main conclusion (Fig. S11, line 188-189).
(3) I am worried that the absence of double gene acquisitions from the model may unintentionally promote bistability. This assumption is equivalent to an implicit assumption of incompatibility between the genes transferred from different species. A highly abundant species with high HGT rates could fill up the "MGE niche" in a species before any other species have reached appreciable size. This would lead to greater importance of initial conditions and could thus lead to increased multistability.
This concern also feels reminiscent of the "coexistence for free" literature (first described here http://dx.doi.org/10.1016/j.epidem.2008.07.001 ) which was recently discussed in the context of plasmid conjugation models in the supplementary material (section 3) of https://doi.org/10.1098/rstb.2020.0478 .
We appreciate the comments. Our model didn’t assume incompatibility between MGEs transferred from different species. Instead, it allows a cell to acquire more than one MGE. In our model, p<sub>ij</sub> described the subpopulation in the 𝑖-th species that acquired the MGE from the 𝑗-th species. Here, p<sub>ij</sub> can overlap with p<sub>ik</sub> (𝑗 ≠ 𝑘). In other words, a cell can belong to p<sub>ij</sub> and p<sub>ik</sub> at the same time. The p<sub>ij</sub> subpopulation is allowed to carry the MGEs from the other species. In the model, we used
to describe the influence of the other MGEs on the growth of p<sub>ij</sub>.
We also thank the reviewer for bringing two papers to our attention. We have cited and discussed these papers in the updated manuscript (line 355-362).
(4) The parameter values tested seem to focus on very large effects, which are unlikely to occur commonly in nature. If I understand the parameters in Figure 1b correctly for instance, lambda2 leads to a 60% increase in growth rate. Such huge effects of mobile genes (here also assumed independent from genetic background) seem unlikely except for rare cases. To make this figure easier to interpret and relate to real-world systems, it could be worthwhile to plot the axes in terms of the assumed cost/benefit of the mobile genes of each species.
Thanks for the comments. In the main text, we presented simulation results that assumed relatively large effects of MGEs on species fitness, as the reviewer pointed out. In the updated manuscript, we have supplemented numerical simulations that considered different ranges of fitness effects, including fitness effects as small as 10% (Fig. S13a). We have also plotted the relationship between community multistability and the assumed fitness effects of MGEs, as the reviewer suggested (Fig. S13b). Our results suggested that multistability was more feasible when the fitness effects of MGEs were small, and changing the range of MGE fitness effects didn’t fundamentally change our main conclusion. These results were discussed in line 197-205 of the updated main text.
Something similar holds for the HGT rate (eta): given that the population of E. coli or Klebsiella in the gut is probably closer to 10^9 than 10^12 (they make up only a fraction of all cells in the gut), the assumed rates for eta are definitely at the high end of measured plasmid transfer rates (e.g. F plasmid transfers at a rate of 10^-9 mL/CFU h-1, but it is derepressed and considered among the fastest - https://doi.org/10.1016/j.plasmid.2020.102489 ). To adequately assess the impact of the HGT rate on microbial community stability it would need to be scanned on a log (rather than a linear) scale. Considering the meta-analysis by Sheppard et al. it would make sense to scan it from 10^-7 to 1 for a community with a carrying capacity around 10^9.
We thank the reviewer for the constructive suggestion. We have carried out additional simulations by scanning the 𝜂 value from 10<sup>-7</sup> to 1. The results suggested that increasing HGT rates started to promote multistability when the 𝜂 value exceeded 10<sup>-2</sup> per hour (Fig. S9, line 337-346). This corresponds to a conjugation efficiency of 10<sup>-11</sup> cell<sup>-1</sup> ∙ hr<sup>-1</sup> ∙ mL when the maximum carrying capacity equals 10<sup>9</sup> cells ∙ mL<sup>-1</sup>, or a conjugation efficiency of 10<sup>-14</sup> cell<sup>-1</sup> ∙ hr<sup>-1</sup> ∙ mL when the maximum carrying capacity equals 10<sup>12</sup> cells ∙ mL<sup>-1</sup>.
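The conversion in the response above can be reproduced with a few lines of arithmetic. This sketch assumes that the per-hour rate 𝜂 equals the conjugation efficiency multiplied by the maximum carrying capacity, which is the relationship implied by the two quoted figures; the function and variable names are hypothetical.

```python
def conjugation_efficiency(eta_per_hour, capacity_cells_per_ml):
    """Convert a normalized HGT rate (hr^-1) into a conjugation
    efficiency (mL cell^-1 hr^-1), assuming eta = efficiency * capacity."""
    return eta_per_hour / capacity_cells_per_ml


# Log-scale scan of eta from 1e-7 to 1, one value per decade.
eta_scan = [10.0 ** k for k in range(-7, 1)]

# Threshold reported in the response, for two carrying capacities.
eff_gut_like = conjugation_efficiency(1e-2, 1e9)    # about 1e-11
eff_dense = conjugation_efficiency(1e-2, 1e12)      # about 1e-14
```

Scanning one value per decade is only one possible grid; a finer logarithmic grid would locate the threshold more precisely.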
(5) It is not clear how sensitive the results (e.g. Figure 2a on the effect of HGT) are to the assumption of the fitness effect distribution of the mobile genes. This is related to the previous point that these fitness effects seem quite large. I think some sensitivity analysis of the results to the other parameters of the simulation (also the assumed interspecies competition varies from figure to figure) would be helpful to put the results into perspective and relate them to real biological systems.
We appreciate the comments. In light of the reviewer’s suggestion, we have changed the range of the fitness effects and analyzed the sensitivity of our predictions to this range. As shown in Fig. S13, changing the range of MGE fitness effects didn’t alter the qualitative interplay between HGT and community multistability. We have also examined the sensitivity of the results to the strength of interspecies competition (Fig. S3, S10, S12). These results suggested that while the strength of interspecies interactions played an important role in shaping community multistability, the relationship between HGT rate and multistability was not fundamentally changed by varying interaction strength. In addition, we examined the role of death rates (Fig. S2). In the updated manuscript, we discussed the sensitivity of our prediction to these parameters in line 136-147, 190-205, 335-354.
Recommendations for the authors:
Reviewer #2 (Recommendations for the authors):
Please find below a few suggestions that, in my opinion, could help improve the manuscript.
TITLE
It might not be clear what 'gene exchange communities' are. Perhaps it could be rewritten for more specificity (e.g. '...communities undergoing horizontal gene transfer').
We have updated the title as the reviewer suggested.
ABSTRACT
The abstract could also be edited to improve clarity and specificity. Terms like 'complicating factors' are vague, and enumerating specific factors would be better. The results are largely based on simulations, no analytical results are plotted, so I find that the sentence starting with 'Combining theoretical derivation and numerical simulations' can be a bit misleading.
We appreciate the suggestions. We have enumerated the specific factors and scenarios in the updated abstract (line 18-26). We have also replaced 'Combining theoretical derivation and numerical simulations' with ‘Combining mathematical modeling and numerical simulations’.
INTRODUCTION
- Line 42, please revise this paragraph. The logical flow is not so clear, it seems a bit like a list of facts, but the main message might not be clear enough. Also, it would be good to define 'hidden' states or just rewrite this sentence.
We appreciate the suggestion. In the updated manuscript, we have rewritten this paragraph to improve the logical flow and clarity (line 46-52).
- Line 54, there is little detail about both theoretical models and HGT in this paragraph, and mixing the two makes the paragraph less focused. I suggest to divide into two paragraphs and expand its content. For example, you could explain a bit some relevant implications of MGE.
We appreciate the suggestion. In the updated manuscript, we have divided this paragraph into two paragraphs, focusing on theoretical models and HGT, respectively (line 55-71). In particular, we have added explanations on the implications of MGEs (line 66-69), as the reviewer suggested.
- Line 72, as mentioned in the abstract, it would be better to explicitly mention which confounding factors are going to be discussed.
Thanks for the suggestion. We have rewritten this part as “We further extended our analysis to scenarios where HGT changed interspecies interactions, where microbial communities were subjected to strong environmental selections and where microbes lived in metacommunities consisting of multiple local habitats. We also analyzed the role of different mechanisms, including interspecies interaction strength, the growth rate effects of MGEs, MGE epistasis and microbial death rates in shaping the multistability of microbial communities. These results created a comprehensive framework to understand how different dynamic processes, including but not limited to HGT rates, collectively shaped community multistability and diversity” (line 75-82).
RESULTS
- The basic concepts (line 77) should be explained with more detail, keeping the non-familiar reader in mind. The reader might not be familiar with the concept of bistability in terms of species abundance. Also, note that mutual inhibition does not necessarily lead to positive feedback, as an interaction strength between 0 and 1 might still be considered inhibition. In any case, in Figure 1 it is not obvious how the positive feedback is represented, the caption should explain it. Note that neither the main text nor the caption explains the metaphor of the landscape and the marble that you are using in Figure 1a.
We have rewritten this paragraph to provide more details on the basic concepts (line 86-99). We have removed the statement about ‘mutual inhibition’ to avoid being misleading. We have also updated the caption of Fig. 1a to explain the metaphor of the landscape and the marble (line 389-396).
- In the classical LV model, bistability does not depend on growth rates, but only on interaction strength. Therefore, I think that much of the results are significantly influenced by the added death rate. I believe that if the death rate is set to zero, mobile genetic elements that only modify growth rates will have no effect on the system's bistability. Because of this, I think that a thorough analysis of the role of the added death (dilution) rate and the distribution of growth rates is especially needed.
We are grateful for the reviewer’s insightful comments. In the updated manuscript, we have thoroughly analyzed the role of the added death (dilution) rate on the bistability of communities composed of two species (Fig. S2). Indeed, as the reviewer pointed out, if the death rate equals zero, mobile genetic elements that only modify growth rates will have no effect on the system's bistability. We have discussed the role of death rate in line 136-142 of the updated manuscript.
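The dependence on the death rate discussed above can be illustrated with a minimal sketch. It assumes a normalized two-species Lotka-Volterra model of the form ds_i/dt = mu_i * s_i * (1 - s_i - gamma_ij * s_j) - D * s_i, which is a simplification without MGEs and not necessarily the exact model of the manuscript. The symbol `death` (D) and the index convention (gamma21 is the effect of species 1 on species 2) are assumptions made here for illustration. The pair is called bistable when each single-species state resists invasion by the other species.

```python
def is_bistable(mu1, mu2, gamma12, gamma21, death=0.0):
    """Check bistability of a two-species competitive LV model with dilution.

    Assumed model (illustrative, not taken from the manuscript):
        ds1/dt = mu1*s1*(1 - s1 - gamma12*s2) - death*s1
        ds2/dt = mu2*s2*(1 - s2 - gamma21*s1) - death*s2
    """
    # Abundance each species reaches when growing alone.
    k1 = 1.0 - death / mu1
    k2 = 1.0 - death / mu2
    if k1 <= 0 or k2 <= 0:
        return False  # at least one species cannot persist even alone

    # Species 2 cannot invade species 1 alone when its per-capita growth
    # mu2*(1 - gamma21*k1) - death is negative, i.e. gamma21*k1 > k2.
    species1_resists = gamma21 * k1 > k2
    # Symmetric condition for species 1 invading species 2 alone.
    species2_resists = gamma12 * k2 > k1
    return species1_resists and species2_resists


# With death = 0 the conditions reduce to gamma12 > 1 and gamma21 > 1,
# so the growth rates drop out entirely.
print(is_bistable(0.5, 0.3, 1.1, 1.1, death=0.0))
# With a positive death rate, unequal growth rates can remove bistability.
print(is_bistable(0.5, 0.3, 1.1, 1.1, death=0.2))
```

Under these assumptions, setting the death rate to zero makes the outcome depend only on the interaction strengths, which is consistent with the reviewer's point that growth-rate changes alone would then have no effect on bistability.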
We have also expanded our analysis on the distribution of growth rates. In particular, we considered different ranges of growth rate effects of mobile genetic elements, by sampling 𝜆<sub>ij</sub> values from uniform distributions with given widths (Fig. S13). Greater width led to a larger range of growth rate effects. We used five-species populations as an example and tested different ranges.
Our results suggested that multistability was more feasible when the growth rate effects of MGEs were small (Fig. S13b). The qualitative relationship between HGT and community multistability was not dependent on the range of growth rate effects (Fig. S13a). These results are discussed in line 197-205 of the updated manuscript.
- The analysis uses gamma values that, in the absence of an added death rate, render a species pair bistable. Therefore, multistability would be quite expected for a 5 species community. Note that, multistability is possible in communities of more than 2 species even if all gamma values are smaller than 1. Analyzing a wide range of interaction strength distributions would really inform on the relative role of HGT in multistability across different community scenarios.
We are grateful for the reviewer’s suggestion. In light of the reviewer’s comments, in the updated manuscript, we have performed additional analysis by focusing on a broader range of interaction strengths (Fig. S3, S10, S12), especially the gamma values below 1 (Fig. S10). Our results agreed with the reviewer’s notion that multistability was possible in communities of more than 2 species even if all gamma values were smaller than 1 (Fig. S10).
- I would recommend the authors extend the analysis of the model used for Figures 1 and 2. Figures 3 and 4 could be moved to the supplement (see my point in the public review), unless the authors extend the analysis to explain some non-intuitive outcomes for niches and metacommunities.
Thanks. In the updated manuscript we have performed additional simulations to extend the analysis in Figure 1 and 2. These results were presented in Fig. S2, S3, S9, S10, S11, S12, and S13. We have also moved Figure 3 and 4 to SI as the reviewer suggested.
- The authors seem to refer to fitness and growth rates as the same thing. This could lead to confusion - the strongest competitor in a species pair could also be interpreted as the fittest species despite being the slowest grower. I think there's no need to use fitness if they refer to growth rates. In any case, they should define fitness if they want to use this concept in the text.
We are grateful for the insightful suggestion. To avoid confusion, we have used ‘growth rate’ throughout the updated manuscript.
- Across the text, the language needs some revision for clarity, specificity, and scientific style. In lines 105 - 109 there are some examples, like the use of 'in a lot of systems', and ' interspecies competitions' (I believe they mean interspecies interaction strengths).
We appreciate the reviewer for pointing them out. We have thoroughly checked the text and made the revisions whenever applicable to improve the clarity and specificity.
- Many plots present the HGT rate on the horizontal axis. Could the authors explain why is it that the rate of HGT is relatively important for the number of alternative stable states? I understand how from zero to a small positive number there is a qualitative change. Beyond that, it shouldn't affect bistability too much, I think. If I am right, then other parameters could be more informative to plot in the horizontal axis. If I am wrong, I think that providing an explanation for this would be valuable.
Thanks. To address the reviewer’s comment, we have systematically analyzed the effects of HGT on community multistability, by scanning the HGT rate from 10<sup>-7</sup> to 10<sup>0</sup>hr<sup>-1</sup> . In communities of two or multiple species, our simulation results showed that multistability gradually increased with HGT rate when HGT rate exceeded 10<sup>2</sup>hr<sup>-1</sup>. These results, presented in Fig. S9 and discussed in line 337-346, provided a more quantitative relationship between multistability and HGT rate.
While in this work we showed the potential role of HGT in modulating community multistability, our results didn’t exclude the role of the other parameters. Motivated by the comments raised by the reviewers, in the updated manuscript, we have performed additional simulations to analyze the influence of other parameters in shaping community multistability. These parameters include death or dilution rate (Fig. S2), interaction strength (Fig. S3, S9, S10, S11, S12, S14, S15), 𝜆 range (Fig. S13, S15) and 𝛿 value (Fig. 3g, h, i). In many of the supplemented results (Fig. S2b, S3b, S13b, Fig. 3g, 3h and 3i), we have also plotted the data by using these parameters as the x axis. We believe the updated work now provided a more comprehensive framework to understand how different mechanisms, including but not limited to HGT, might shape the multistability of complex microbiota. These points were discussed in line 136-147, 190-205, 238-253, 334-354 of the updated main text.
- My overall thoughts on the case of antibiotic exposure are similar to those of previous sections. Very few of the different parameters of the model are analyzed and discussed. In this case, the authors increased the interaction strength to ~0.4 times higher compared to previous sections. Was this necessary, and why?
Thanks for the comments. In the previous draft, the interaction strength 𝛾=1.5 was tested as an example. Motivated by the reviewer’s comments, in the updated manuscript, we have examined different interaction strengths, including the strength ( 𝛾 = 1.1 ) commonly tested in other scenarios. The prediction equally held for different 𝛾 values (Fig. S15). We have also analyzed different 𝜆 ranges (Fig. S15). These results, together with the analyses presented in the earlier version of the manuscript, suggested the potential role of HGT in promoting multistability for communities under strong selection. The supplemented results were presented in Fig. S15 and discussed in line 293-295 of the updated manuscript.
- Line 195, if a gene encodes for the production of a public good, why would its HGT reduce interaction strength? I can think of the opposite scenario: the gene is a public good, and without HGT there is only one species that can produce it. Let's imagine that the public good is an enzyme that deactivates an antibiotic that is present in the environment, and then the species that produces has a positive interaction with another species in a pairwise coculture. If HGT happens, the second species becomes a producer and does not need the other one to survive in the presence of antibiotics anymore. The interaction can then become more competitive, as e.g. competition for resources could become the dominant interaction.
We are grateful for pointing it out. In the updated manuscript, we have removed this statement.
DISCUSSION
- L 267 "by comparison with empirical estimates of plasmid conjugation rates from a previous study [42], the HGT rates in our analysis are biologically relevant in a variety of natural environments". The authors are using a normalized model and the relevance of other parameter values is not discussed. If the authors want to claim that they are using biologically relevant HGT, they should also discuss whether the rest of the parameter values are biologically relevant. I recommend relaxing this statement about HGT rates.
We appreciate the suggestion. We agree with the reviewer that other parameters including the death/dilution rate, interaction strength and 𝜆 ranges are also important in shaping community multistability. We have performed additional analysis to show the effects of these parameters. In light of the reviewer’s suggestion, we have relaxed this statement and thoroughly discussed the context-dependent effect of HGT as well as the roles of different parameters (line 334-354).
- Last sentence: "Therefore, inhibiting the MGE spread using small molecules might offer new opportunities to reshape the stability landscape and narrow down the attraction domains of the disease states". It is not clear what procedure/technique the authors are suggesting. If they want to keep this statement, the authors should give more details on how small molecules can be/are used to inhibit MGE.
We appreciate the comments. Previous studies have shown that some small molecules, such as unsaturated fatty acids, can inhibit the conjugative transfer of plasmids. By binding the type IV secretion traffic ATPase TrwD, these compounds limit pilus biogenesis and DNA translocation. We have provided more details regarding this statement in the updated manuscript (line 376-379).
METHODS
- Line 439, mu_i should be presented as the maximum 'per capita' growth rate.
We have updated the definition of 𝜇i following the suggestion (line 529).
- Line 444, this explanation is hard to follow, please expand it to provide more details. You could provide an example, like explaining that all individuals from S1 have the MGE1 and therefore they have mu_1 = mu_01 ... After HGT, their fitness changes if they get the plasmid from S2, so a term lambda2 appears.
Thanks. In the updated manuscript, we have expanded the explanation by providing an example as the reviewer suggested (line 534-537).
- The normalization assumes a common carrying capacity Nm (Eqs 1-4) and then it's normalized (Eqs. 5-8). It would be better to start from a more general scenario in which each species has a different carrying capacity and then proceed with the normalization.
We appreciate the suggestion. In the updated manuscript, we have started our derivation from the scenario where each species has a different carrying capacity before proceeding with the normalization (section 1 of Methods, line 516-554). The same equations can be obtained after normalization.
- I think that the meaning of kappa (the plasmid loss rate) is not explained in the text.
Thanks for pointing it out. We have explained the meaning of kappa in the updated text (line 108, 154, 539-541, 586-587, 607).
SUPPLEMENT
- Figure S4, what are the different colors in panel b?
In panel b of Fig. S4, the different colors represent the simulation results repeated with randomized growth rates. We have made it clear in the updated SI.
Reviewer #3 (Recommendations for the authors):
(1) Please extend your description of the model, so it is easier to understand for readers who have not read the first paper. Especially the choice to describe the model as species and subpopulations, as opposed to writing it as MGE-carrying and MGE-free populations of each species makes it quite complicated to understand which parameters influence each other.
Thanks for the suggestion. We have extended the model description in the updated manuscript, which provides a more detailed introduction on model configurations and parameter definitions (line 86-99, 101-113, 151-159). We have also updated the Methods to extend the model description.
(2) Please define gamma_ji in equation 13 and eta_jki in equation 14 (how to map the indices onto the assumed directionality of the interaction).
We have defined these two parameters in the updated manuscript (line 584-586, 630-632).
(3) Line 511: please add at the beginning of this paragraph that you are assuming a grid-like arrangement of patches which will be captured by dispersal term H.
We have updated this paragraph to make this assumption clear (line 636-637).
(4) Line 540: "used in our model" (missing a word).
We have corrected it in the updated manuscript.
(5) Currently the analyses looking at the types of growth effects HGT brings (Figures 5-7) feel very "tacked on". These are not just "confounding factors", but rather scenarios that are much more biologically realistic than the assumption of independent effects. I would introduce them earlier in the text, as I think many readers may not trust your results until they know this was considered (+ how it changes the conclusions).
We are grateful for the suggestion. We agree with the reviewer that these biologically realistic scenarios should be introduced earlier in the text. In the updated manuscript, we have moved these analyses forward, as sections 3, 4 and 5. We have also avoided the term “confounding factors”. Instead, in the updated manuscript, we have separated these analyses into different sections, and clearly described each scenario in the section title (line 217-218, 254, 275).
(6) In some places the manuscript refers to HGT, in others to MGE presence (e.g. caption of Figure 6). These are not generally the same thing, as HGT could also occur due to extracellular vesicles or natural transformation etc. Please standardize the nomenclature and make it clearer which type of processes the model describes.
We appreciate the comment. The model in this work primarily focused on the process of plasmid transfer. We have made it clear throughout the main text.
(7) In many figures the y-axis starts at a value other than 0. This is a bit misleading. In addition, I would recommend changing the title "Area of bistability region" to "Area of bistability" or perhaps even "Area of multistability" (since more than two species are considered).
Thanks for the suggestion. We have updated all the relevant figures to make sure that their y-axes start at 0. We have also changed the title “Area of bistability region” to “Area of multistability”, whenever it is applicable.
(8) Figure 7: what are the assumed fitness effects of the mobile genes in the simulation? Which distribution were they drawn from? Please add this info to the figure caption here and elsewhere.
In Figure 7, we explored an extreme scenario of the fitness effects of the mobile genes, where the population was subjected to strong environmental selection and only cells carrying the mobile gene could grow. Therefore, the carriage of the mobile gene changed the species growth rate from 0 to a positive value µ<sub>i</sub>. When calculating the number of stable states in the communities, we randomly drew the µ<sub>i</sub> values from a uniform distribution between 0.3 and 0.7 hr<sup>-1</sup>. We have added this information in the figure caption (line 505-508) and Methods (line 615-617) of the updated manuscript.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This important study combines virology experiments and mathematical modeling to determine the nuclear export rate of each of the eight RNA segments of the influenza A virus, leading to the proposal that a specific retention of mRNA within the nucleus delays the expression of antigenic viral proteins. The proposed model for explaining the differential rate of export is compelling, going beyond the state of the art, but the experimental setup is only in partial support and further studies will be needed to confirm the proposed mechanism.
-
Reviewer #1 (Public review):
The authors studied why the two more antigenic proteins of the influenza A virus, hemagglutinin (HA) and neuraminidase (NA), are expressed later during the infection. They set an experimental approach consisting of a 2-hour-long infection at a multiplicity of infection of 2 with the viral strain WSN. They used cells from the lung carcinoma cell line A549. They used the FISH technique to detect the mRNAs in situ and developed an imaging-based assay for mathematically modeling and estimating the nuclear export rate of each of the eight viral segments. They propose that the delay in the expression of HA and NA is based on the retention of their mRNA within the nucleus.
Strength
The study addresses an unexplored mechanism in the influenza A virus infectious cycle, the late expression of HA and NA, by creating a workflow that combines mRNA detection (FISH) with mathematical calculations to arrive at a model, which could additionally be useful for general biological processes where transcription occurs in a burst-like manner.
Weakness
The authors built on several assumptions regarding the viral infection to "quantify" the transcripts' export rate, and these assumptions lack experimental support. The study would greatly improve if more precise experiments could be performed and/or if the assumptions made were demonstrated (i.e., empirically demonstrating that cRNA production does not occur within the first 2 hours of infection, and the late expression of HA and NA proteins).
-
Reviewer #2 (Public review):
In this study the authors developed a framework to investigate the export rates of influenza viral RNAs translocating from the nucleus to the cytoplasm. This model suggests that the influenza virus may control gene expression at the RNA export level: the retention of certain transcripts in the nucleus for longer times allows the generation of other viral encoded proteins that are exported regularly, and only later on do certain mRNAs get exported. These encode proteins that alert the cell to the presence of viral molecules; hence, keeping their emergence to the very end might help the virus to avoid detection until as late as possible in the infection cycle.
The study is of limited scope. The notion that some mRNAs are retained in the nucleus after transcription is concluded early on from the FISH data. The model does not contribute much to the understanding and is mostly confirming the FISH data. The export rate is an ambiguous number and this part is not elaborated upon. One is left with more questions since no mechanistic knowledge emerges, and no additional experimentation is attempted to try to drive toward a deeper understanding.
Comments on revisions:
The authors have implemented the comments that required textual rewriting, which does make the paper clearer. On the experimental side, very little was done. It is fine to answer that the suggested experiments are not relevant or feasible for one reason or another, but one would expect to see some effort in providing other experimental sets to address key comments, and not only to modify a sentence in the text. So in my mind this round of revision feels more like some kind of intellectual discussion, which is fine, but I would have expected more, particularly after so much time has passed. I am still not satisfied with the way the analysis is presented in Fig. 2B, and writing a line about what is not analyzed in the legend, does not seem clear enough.
-
Author response:
The following is the authors’ response to the original reviews.
We thank the editors and reviewers for the comments and suggestions on our manuscript. The main point that we wished to convey in this paper was the concept and the kinetic model that enabled the estimation of nuclear export rate from an image of single mRNAs localised in single cells. By studying the influenza viral transcripts with this model, we report the variation in the mRNA nuclear export rate of the eight viral segments. Of note, the hemagglutinin and neuraminidase mRNAs were the slowest among the eight segments in exiting the nucleus. We agree that the potential mechanism and the biological impact of this observation require further validation, as the reviewers pointed out. We revised our manuscript to describe these points separately (Lines 21-25, Abstract; Lines 86-91, Introduction; Lines 316-320, Results; Lines 372-381, Discussion). We also highlight below, the revisions that we made to address the specific points raised by the reviewers.
Influenza viral transcription
The authors used specific settings for their virology experiments and several assumptions regarding their mathematical modelling, so it's extremely important that the reader has the viral life cycle clearly understood before immersing themselves in the results. Thus, a detailed explanation of the viral life cycle, including the kinetics of each step, would be extremely helpful if included in the introduction section. Reviewer #1
We have included the molecular composition of influenza vRNP and the mechanism of viral transcription in the revised manuscript (Lines 46-53).
Line 45: "Eight viral RNA segments are transcribed by the same set of molecular machinery" (Ref. 7). What's known about the arrival of the viral RNA segments in the nucleus? Is it synchronized? The authors will understand that my concern is related to the fact that a differential arrival would indeed impact the transcription and export processes. Reviewer #1
The arrival of eight vRNPs in the nucleus is not synchronised, with each of the eight vRNPs arriving independently (Chou et al. PLOS Pathogens 2013) (Lakadamyali et al, PNAS 2003). This does not compromise our model, as our model estimates the export rate of each mRNA species individually (also please see our response in Model assumption below). This is included in the second paragraph of the Discussion section (Lines 390-400).
Model assumption
Even though I do not have the expertise to assess the authors' mathematical model, I do not doubt its robustness. Even so, I find some virological concerns related to the set-up of their experiments. According to what I understand, the authors performed non-synchronized 2 h-long infections with the WSN strain of influenza A virus. They did this to avoid cRNA production (and cross-reaction of the probes), which they claim to occur "much later than mRNA synthesis". Then they omit the degradation of the mRNAs for their model without giving an explanation for having done so. So, taking all these into account, it seems to me that too many assumptions are made without a strong argument. I understand that they are made in order to simplify their model, but I strongly consider that the model would gain strength if some of these events were experimentally considered. Thus, would it be possible to perform synchronized infections? Would it be possible to empirically demonstrate that cRNA production does not occur within the first 2 hours of infection and/or separate transcription and replication? Would it be possible to incorporate a degradation inhibitor of the mRNAs into their infections? If all these could be achieved, then the results coming out of the mathematical model would be enormously reinforced. Reviewer #1
* The study lacks experimental data that would help support the conclusions. For instance, perturbations are many times used to prove a point related to gene expression. An example for Fig. 2 for such an experiment could be to treat the cells with transcription inhibitors (e.g. DRB, 5,6-dichloro-1-beta-D-ribofuranosylbenzimidazole). Preventing transcription leaves only mature RNAs in the nucleus, and then using this system one can compare the export rate of different RNAs. Reviewer #2
We agree that the primary concern in our model was the assumption that the mRNA degradation could be omitted. Synchronised infection is not necessary; in fact, non-synchronised infection is preferred, as we explain later in our response. Additionally, the dominance of mRNA production over the cRNA production has been documented elsewhere. To address mRNA degradation and validate our model estimation, we performed a time-course measurement using baloxavir. Baloxavir efficiently blocks the viral transcription by inhibiting the nuclease activity in PA. DRB, suggested by the reviewer, allows influenza viral transcription and causes viral transcripts to accumulate in the nucleus through unknown mechanisms (Amorim et al. Traffic 2007 and our observation using smFISH, not shown). The additional experiment, now presented in Fig. 5 in the revised manuscript, indicated that the mRNA degradation is minimal, and the export rate estimated in our model and the time-course experiment agreed well for the HA segment. The experiment raised the possibility that the time-course measurement underestimates the export rate of transcripts that exit the nucleus rapidly, such as NP. Real-time imaging of single transcripts would be necessary to directly measure the true nuclear export rate; however, this is beyond the scope of our paper. The new result is now presented in Fig. 5, Supplementary figures 3 and 4, and in the main text (Lines 322-360). An alteration was also made in Line 286 to point to Fig. 5. The Materials and Methods section was updated (Lines 478-482).
We note that our model does not require synchronised infection. Even under synchronised infection, such as incubating cells with the virus at 4°C to facilitate attachment and subsequently shifting to 37°C to allow viral entry, the inherent heterogeneity in vRNP migration to the nucleus still remains. This randomness does not compromise our model; rather, our model exploits this random arrival of each vRNP in each cell in the system. This variation, in turn, generates cells carrying varying amounts of transcripts, enabling the estimation of nuclear export rate. Importantly, more variation ensures the broader distribution of transcript levels, enabling more precise parameter fitting in our model. It is also important to note that our model does not require the correlation between segments. Our model estimates the export rate of each mRNA species individually. These important points were explained in the Discussion section (Lines 390-400).
* There is no concrete value given for the export rates and what they might mean biologically (e.g. time present/stuck in the nucleus) - Fig. 4D. This leaves the reader in the dark. Reviewer #2
The export rate lambda (previously denoted as k) in our model (Fig. 4) and the decay constant k in the time-course measurement (Fig. 5) represent the proportion of mRNAs exported from the nucleus in an infinitesimal time, defining the nuclear export rate. This has been clarified in the revised manuscript (Lines 314-316), with some alterations to make the parameter use clearer.
- The Greek letter k previously used in Fig. 4 and the associated equations was consistently replaced with lambda to avoid the confusion with the parameter k that is subsequently used for the exponent decay in Fig. 5 in the revised manuscript.
- The Greek letter epsilon (previously used to represent export) was replaced with mu, slightly more common for representing the rate of transport.
- The term “velocity” was consistently replaced with “rate” in the context of the nuclear export (Lines 163, 215, 320, 441).
- The phrase “molar concentrations of mRNAs” was corrected to “molecules of mRNAs” (Line 282).
Also, we have now described our model in two sections: “Conceiving the model” and “Implementing a kinetic model to estimate the nuclear export rate” in the Results. The first section outlines the conceptual framework of the model, and the second focuses on its implementation and the parameter extraction (Lines 227 and 277).
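To relate an export rate to the time a transcript spends in the nucleus, the following minimal sketch may help. It assumes simple first-order export with no degradation and no ongoing transcription (as after a transcription block), so that the nuclear fraction decays exponentially with constant k. The function names and the example value k = 0.5 hr⁻¹ are hypothetical and are not taken from the manuscript.

```python
import math


def nuclear_fraction(k, t):
    """Fraction of transcripts still in the nucleus at time t,
    assuming first-order export: dN/dt = -k*N, so N(t)/N(0) = exp(-k*t)."""
    return math.exp(-k * t)


def mean_residence_time(k):
    """Mean time a transcript stays in the nucleus under first-order export."""
    return 1.0 / k


def half_life(k):
    """Time after which half of the transcripts have left the nucleus."""
    return math.log(2) / k


# Illustrative value only (not a measured rate), in units of hr^-1.
k_example = 0.5
print(mean_residence_time(k_example))  # mean nuclear residence time, in hours
print(half_life(k_example))            # nuclear half-life, in hours
```

Under these assumptions, a smaller rate constant translates directly into a longer mean nuclear residence time (1/k), which is one way to read the per-segment rates in biological terms.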
Applicability of the model
Lines 27-29. "Our framework presented in this study can be widely used for investigating the nuclear retention of nascent transcripts produced in a transcription burst." In my opinion, this is the strongest point of the manuscript: developing a mathematical model to analyze nuclear export retention as a mechanism of protein expression control, which could lay the foundation for further biological processes. The authors revisit this idea in the Discussion section. However, which would be those processes for which the model could be helpful? I consider that a more conspicuous discussion on this topic would broaden the readers' scope, a crucial point under the eLife scope. Reviewer #1
* Could this framework be used to quantify the nuclear export rate of cellular RNAs? According to the explanation in the Discussion, it would seem that this approach is limited to quantifying the export rate of influenza RNAs. Reviewer #2
Our model is not limited to influenza virus infection. It is applicable to systems where transcription is initiated concurrently, such as when stimuli trigger the activation of a certain set of genes for transcription. This makes it particularly valuable for quantifying the nuclear retention of mRNAs in a transcription burst. This point is reiterated in Lines 383-390.
Potential mechanisms for differential nuclear export rate of viral segments
* There is no mechanistic insight in the study. The idea driven by this study is that gene expression is regulated by the RNA export rate. But how is that explained? Is there any molecular pathway or explanation for this model? If the transcripts are ready for export, why do the mRNAs stay inside the nucleus? One option to consider is the export factors. Viral RNAs are exported by different pathways as mentioned (line 362), or by TREX2 (Bhat P et al Nat Comm 2023). The data show that there is no difference observed in the export rate of different pathways. How about knocking down an important export factor to show how this affects the export rates? Or the opposite: if a certain factor is overexpressed, would this change the nucleus/cytoplasm distribution of the retained RNAs? Reviewer #2
As we discussed in the paper, we are beginning to consider that each viral segment has an intrinsic sequence that determines its nuclear export rate, because previous studies on the export factors do not fully explain the variation in the nuclear export rate observed in our study. As the reviewer suggested, a recent study (Bhat et al. Nature Communications 2023) specifically pointed out the internal sequence in the HA segment, aligning with our working hypothesis. This point is discussed and their work (Bhat et al. 2023) has been cited in the Discussion section in the revised manuscript (Lines 446-449).
Biological impact of the nuclear retention
The authors mention several times throughout the manuscript that the virus might use the nuclear retention of mRNA for HA and NA to postpone the expression of these antigenic molecules. At this point, I need to admit that a great question mark appeared in my mind, maybe related to the fact that some knowledge is lacking in my analysis. Lines 328-330: "On the other hand, pushing back the expression of viral antigens HA and NA would be beneficial for the virus to delay the host immune response against the infected cells in which the virus is being replicated." As I understand it, the host immune response recognizes HA and NA within the viral particle. If so, and independently of the time at which HA and NA arrive at the virus assembly step, the complete progeny viral particles extruded from the cells would be those triggering the host immune response. If this is right, how would the delayed export of those proteins from the nucleus (and their late expression) be beneficial for delaying the immune response? I would appreciate an explanation for this point, and if I am wrong, then there could exist a relationship between nuclear export rate and the pathogenicity of different strains of influenza A virus. If so, could the authors challenge their model with additional viral strains showing a differential immune response pattern? A deeper analysis in this direction would greatly strengthen the message in their manuscript. Reviewer #1
* Is the timing of viral protein appearance in accordance with the time the mRNA is exported to the cytoplasm? It is logical that the first mRNA to go to the cytoplasm would be the first to become a protein. Can the authors show that nuclear retention of mRNA would push back the expression of the viral antigens HA and NA? Reviewer #2
Three types of immune reactions are being studied extensively. The first is the humoral immune response, where antibodies target the viral antigens HA and NA on the viral envelope, coating and inactivating the viral particles. The second is the cytotoxic T cell response. There is growing evidence that cytotoxic T cells react against NP, eliciting cross-reaction to a broader range of influenza viral strains. This reaction is not specific to HA and NA, and antigens are processed in the cytoplasm and presented on the MHC. The third is antibody-dependent cellular cytotoxicity (ADCC), where antibodies recognise the viral proteins (HA and NA) on the surface of infected cells, facilitating their elimination by the NK cells. Although protein translation may begin as soon as the first mRNA exits the nucleus, the virus may delay the peak of antigen production and therefore postpone the NK-mediated ADCC. This specific point, along with references to ADCC in influenza virus infection, has been clarified in the Discussion section (Lines 377-381).
Data analysis and presentation
Lines 99-101. "Viral mRNAs were detected as single diffraction-limited spots in the three-dimensional image stacks, allowing for absolute mRNA quantification (Fig. 1B)". What do the authors mean to say by "absolute mRNA quantification"? Do they refer to the total spots or the total mRNAs? Is it assumed that one spot corresponds to a single mRNA transcript? This is not clear at all for this reviewer, which could be the situation for a potential reader. Since it's the beginning of the story, this should be clearly stated in the manuscript. Reviewer #1
Each spot of fluorescent signal corresponds to a single molecule of viral mRNA. We quantified the absolute number of transcripts in each cell. This is clarified in the revised manuscript (Lines 104-106).
* Line 151: does the baseline change according to the RNA in question? The authors say that the "baseline is defined by the median of the Z distribution of peripheral mRNAs" - it seems that the number 0.731 refers only to one type of RNA (which is mentioned neither in the text nor in the legend). Reviewer #2
The baseline was set using the NP mRNAs in the cytoplasm because the NP mRNA showed the widest distribution across the cytoplasm (Line 157).
* Also, what is all the signal that is seen outside the marked cells in Fig. 2B? There seems to be significant background in the field, does this mean much false-positive in the multiplex FISH? If so, then how do the authors know that the staining inside the cells isn't to some degree non-specific? It would be necessary to back this up with some other type of quantitative assay like qRT-PCR. Reviewer #2
The cells were removed from the analysis if the cytoplasmic boundary touched any edge of the field-of-view, while the signals were recovered across the entire field-of-view. This is clarified in the figure legend (Lines 194-195).
Others
* The meaning and explanation for Figure 1H -are unclear. Rephrase and make the legend more reader friendly. Reviewer #2
We made alterations to the legend (Lines 132-134) and the relevant lines in the main text (Lines 148-151).
* Fig. 2E: Is this the total transcript count or only in the nucleus? Would it be possible to find some correlation between the segments if a pair-wise analysis is performed according to nuclear-cytoplasm distribution? Reviewer #2
The total counts are presented. This is clarified in the legend (Lines 199-200).
* Abstract -"A mathematical modelling indicated that the relationship between the nuclear ratio and the total count of mRNAs in single cells is dictated by a proxy for the nuclear export rate." - this sentence is very unclear. Reviewer #2
The sentence was removed in the revised manuscript (Line 21). This removal did not affect the overall meaning in the abstract. We also made an alteration to Line 279 that contained a similar phrase.
* The use of the word "acutely" (lines 16 and 35) is strange. Reviewer #2
They have been removed (now Lines 15, 33).
* Line 157 - "This result indicates that the velocity of viral mRNA export from the nucleus varies according to the viral segments." - not velocity, maybe timing. Reviewer #2
We consistently replaced “velocity” with “rate” (Lines 163, 215, 320, 441).
* Reference for line 41. Reviewer #2
A reference (Waker et al. Trends Microbiol. 2019) has been cited (Line 39).
* Reference for lines 105-106. Reviewer #2
The gene length of each segment was indicated in the sentence (Line 137).
* Line 264- why here is 0.02 M.O.I used compared to line 97 where 2 is used? Reviewer #2
We used M.O.I. of 0.02 to allow for spot quantification over longer periods of observation (Lines 269-270).
* NS1 is expressed at late infection times and might alter the nuclear export of viral mRNAs (line 352). Need to show that indeed it is not expressed in the experiments done here. Reviewer #2
It is not possible to definitively prove that NS1 is not expressed, owing to sensitivity limitations. However, we minimised its impact by investigating at the early time point (Lines 415-416).
* Line 459- 30% formamide? Is this correct or should it be 10%? Reviewer #2
This is correct. The probes used were longer than the others for smFISH. Therefore, we washed away the probes under more stringent conditions.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This study reports a model of 8 somatosensory areas of the rat cortex consisting of 4.2 million morphologically and electrically detailed neurons. The authors carry out simulation experiments aimed at understanding how multiscale organization of the cortical network shapes neural activity. While the reviewers found the results to be solid, they note that they could have likely been obtained using a much smaller portion of the model. Nonetheless, the release of the modeling platform represents a significant contribution to the field by providing a valuable resource for the scientific community.
-
Reviewer #1 (Public review):
This paper presents a model of the whole somatosensory non-barrel cortex of the rat, with 4.2 million morphologically and electrically detailed neurons, with many aspects of the model constrained by a variety of data. The paper focuses on simulation experiments, testing a range of observations. These experiments are aimed at understanding how multiscale organization of the cortical network shapes neural activity.
Strengths
• The model is very large and detailed. With 4.2 million neurons and 13.2 billion synapses, as well as the level of biophysical realism employed, it is a highly comprehensive computational representation of the cortical network.
• Large scope of work - the authors cover a variety of properties of the network structure and activity in this paper, from dendritic and synaptic physiology to multi-area neural activity.
• Direct comparisons with experiments, shown throughout the paper, are laudable.
• The authors make a number of observations, like describing how high-dimensional connectivity motifs shape patterns of neural activity, which can be useful for thinking about the relations between the structure and the function of the cortical network.
• Sharing the simulation tools and a "large subvolume of the model" is appreciated.
Weaknesses
• A substantial part of this paper - the first few figures - focuses on single-cell and single-synapse properties, with high similarity to what was shown in Markram et al., 2015. Details may differ, but overall it is quite similar.
• Although the paper is about the model of the whole non-barrel somatosensory cortex, out of all figures, only one deals with simulations of the whole non-barrel somatosensory cortex. Most figures focus on simulations that involve one or a few "microcolumns". Again, it is rather similar to what was done in Markram et al., 2015 and constitutes relatively incremental progress.
• With a model like this, one has an opportunity to investigate computations and interactions across an extensive cortical network in an in vivo-like context. However, the simulations presented are not addressing realistic specific situations corresponding to animals performing a task or perceiving a relevant somatosensory stimulus. This makes the insights into roles of cell types or connectivity architecture less interesting, as they are presented for relatively abstract situations. It is hard to see their relationship to important questions that the community would be excited about - theoretical concepts like predictive coding, biophysical mechanisms like dendritic nonlinearities, or circuit properties like feedforward, lateral, and feedback processing across interacting cortical areas. In other words, what do we learn from this work conceptually, especially, about the whole non-barrel somatosensory cortex?
• Most of comparisons with in vivo-like activity are done using experimental data for whisker deflection (plus some from the visual stimulation in V1). But this model is for the non-barrel somatosensory cortex, so exactly the part of the cortex that has less to do with whiskers (or vision). Is it not possible to find any in vivo neural activity data from non-barrel cortex?
• The authors almost do not show raw spike rasters or firing rates. I am sure most readers would want to decide for themselves whether the model makes sense, and for that the first thing to do is to look at raster plots and distributions of firing rates. Instead, the authors show comparisons with in vivo data using highly processed, normalized metrics.
• While the authors claim that their model with one set of parameters reproduces many experimentally established metrics, that is not entirely what one finds. Instead, they provide different levels of overall stimulation to their model (adjusting the target "P_FR" parameter, with values from 0 to 1, and other parameters), and that influences results. If I get this right (the figures could really be improved with better organization and labeling), simulations with P_FR closer to 1 provide more realistic firing rate levels for a few different cases, however, P_FR of 0.3 and possibly above tends to cause highly synchronized activity - what the authors call bursting, but which also could be called epileptic-like activity in the network.
• The authors mention that the model is available online, but the "Resource availability" section does not describe that in substantial detail. As they mention in the Abstract, it is only a subvolume that is available. That might be fine, but more detail in appropriate parts of the paper would be useful.
Comments on revisions:
The authors addressed all my comments by revising and adding text as well as revising and adding some figures and videos. The limitations described in my previous review (above) mostly remain, but they are much better acknowledged and described now. These limitations can be addressed in the future work, whereas the current paper represents a step forward relative to the state of the art and provides a useful resource for the community.
Two minor points about the new additions to the paper:
(1) Something does not seem right in the sentence, "Unlike the Markram et al. (2015) model, the new model can also be exploited by the community and has already been used in a number of follow up papers studying (Ecker et al., 2024a,b; ...)". Should the authors remove "studying"?
(2) It is great that the authors added more plots and videos of the firing rates, but most of them show maximum-normalized rates, which sort of defeats the purpose. No scale on the y-axis is shown (it can be useful even for normalized data). And it is impossible to see anything for inhibitory populations.
These are minor points that may not need to be addressed. Overall, it is a nice study that is certainly useful for the field.
A great improvement is that the model is made fully available to the public.
-
Author response:
The following is the authors’ response to the previous reviews.
Public Reviews:
Reviewer #1 (Public review):
This paper presents a model of the whole somatosensory non-barrel cortex of the rat, with 4.2 million morphologically and electrically detailed neurons, with many aspects of the model constrained by a variety of data. The paper focuses on simulation experiments, testing a range of observations. These experiments are aimed at understanding how the multiscale organization of the cortical network shapes neural activity.
Strengths:
(1) The model is very large and detailed. With 4.2 million neurons and 13.2 billion synapses, as well as the level of biophysical realism employed, it is a highly comprehensive computational representation of the cortical network.
(2) Large scope of work - the authors cover a variety of properties of the network structure and activity in this paper, from dendritic and synaptic physiology to multi-area neural activity.
(3) Direct comparisons with experiments, shown throughout the paper, are laudable.
(4) The authors make a number of observations, like describing how high-dimensional connectivity motifs shape patterns of neural activity, which can be useful for thinking about the relations between the structure and the function of the cortical network.
(5) Sharing the simulation tools and a "large subvolume of the model" is appreciated.
We thank the reviewer for these comments and are pleased they appreciated these aspects of the work.
Weaknesses:
(1) A substantial part of this paper - the first few figures - focuses on single-cell and single-synapse properties, with high similarity to what was shown in Markram et al., 2015. Details may differ, but overall it is quite similar.
We thank the reviewer for this useful comment and agree that it is important to better highlight the incremental improvements to the model’s low-level physiology. The validity of any model can continuously be improved at all spatial scales and the validity of emergent network activity increases with improved validity at lower levels. For this reason, we felt it was valuable to improve the low-level physiology of the model.
Regarding neuron physiology, we have added the following in Section 2.1 on page 5:
“2.1 Improved modeling and validation of neuron physiology
Similarly to Markram et al. (2015), electrical properties of single neurons were modelled by optimizing ion channel densities in specific compartment-types (soma, axon initial segment (AIS), basal dendrite, and apical dendrite) (Figure 2B) using an evolutionary algorithm (IBEA; Van Geit et al., 2016) so that each neuron recreates electrical features of its corresponding electrical type (e-type) under multiple standardized protocols. Compared to Markram et al. (2015), electrical models were optimized and validated using 1) additional in vitro data, features and protocols, 2) ion channel and electrophysiological data corrected for the liquid junction potential, and 3) stochastic channels (StochKv3) now including inactivation profiles. The methodology and resulting electrical models are described in Reva et al. (2023) (see Methods), and generated quantitatively more accurate electrical activity, including improved attenuation of excitatory postsynaptic potentials (EPSPs) and back-propagating action potentials.”
And page 8:
“The new neuron models saw a 5-fold improvement in generalizability compared to Markram et al. (2015) (Reva et al., 2023).”
We have also made the descriptions of the improvements to synaptic physiology more explicit in Section 2.2 on page 9:
“2.2 Improved modeling and validation of synaptic physiology
The biological realism of synaptic physiology was improved relative to Markram et al. (2015) using additional data sources and by extending the stochastic version of the Tsodyks-Markram model (Tsodyks and Markram, 1997; Markram et al., 1998; Fuhrmann et al., 2002; Loebel et al., 2009) to feature multi-vesicular release, which in turn improved the accuracy of the coefficient of variations (CV; std/mean) of postsynaptic potentials (PSPs) as described in Barros-Zulaica et al. (2019) and Ecker et al. (2020). The model assumes a pool of available vesicles that is utilized by a presynaptic action potential, with a release probability dependent on the extracellular calcium concentration ([Ca2+]o; Ohana and Sakmann, 1998; Rozov et al., 2001; Borst, 2010). Additionally, single vesicles spontaneously release as an additional source of variability with a low frequency (with improved calibration relative to Markram et al. (2015)). The utilization of vesicles leads to a postsynaptic conductance with bi-exponential kinetics. Short-term plasticity (STP) dynamics in response to sustained presynaptic activation are either facilitating (E1/I1), depressing (E2/I2), or pseudo-linear (I3). E synaptic currents consist of both AMPA and NMDA components, whilst I currents consist of a single GABAA component, except for neurogliaform cells, whose synapses also feature a slow GABAB component. The NMDA component of E synaptic currents depends on the state of the Mg2+ block (Jahr and Stevens, 1990), with the improved fitting of parameters to cortical recordings from Vargas-Caballero and Robinson (2003) by Chindemi et al. (2022).”
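The release mechanism described in this passage can be illustrated with a small sketch. This is not the authors' implementation: it is a minimal event-driven stochastic Tsodyks-Markram synapse with several independent release sites, and the function name `simulate_tm_synapse` and all parameter values (`n_sites`, `U`, `tau_rec`, `tau_fac`) are hypothetical. It leaves out spontaneous release, the calcium dependence of the release probability, and the postsynaptic conductance kinetics.

```python
import numpy as np

def simulate_tm_synapse(spike_times, n_sites=4, U=0.5,
                        tau_rec=0.2, tau_fac=0.05, seed=0):
    """Return the number of vesicles released at each presynaptic spike.

    Illustrative sketch only; all parameter values are assumptions.
    Times are in seconds.
    """
    rng = np.random.default_rng(seed)
    occupied = np.ones(n_sites, dtype=bool)  # each site holds at most one vesicle
    u = 0.0                                   # running utilization (release probability)
    last_t = None
    released = []
    for t in spike_times:
        if last_t is not None:
            dt = t - last_t
            # Facilitation decays back towards zero between spikes.
            u *= np.exp(-dt / tau_fac)
            # Each empty site refills with an exponential waiting time.
            p_rec = 1.0 - np.exp(-dt / tau_rec)
            occupied |= rng.random(n_sites) < p_rec
        # A spike increments utilization; on the first spike u equals U.
        u += U * (1.0 - u)
        # Each occupied site releases independently with probability u.
        release = occupied & (rng.random(n_sites) < u)
        occupied &= ~release
        released.append(int(release.sum()))
        last_t = t
    return np.array(released)
```

Repeating the simulation with different seeds and taking std/mean of the first-spike release count gives a CV of the kind the passage refers to; with a single release site that CV is larger than with several sites, which is the effect multi-vesicular release is meant to capture.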
(2) Although the paper is about the model of the whole non-barrel somatosensory cortex, out of all figures, only one deals with simulations of the whole non-barrel somatosensory cortex. Most figures focus on simulations that involve one or a few "microcolumns". Again, it is rather similar to what was done by Markram et al., 2015 and constitutes relatively incremental progress.
We thank the reviewer for this comment and have added the following text to the Discussion on page 33 to explain our rationale:
“In keeping with the philosophy of compartmentalization of parameters and continuous model refinement (see Introduction), it was essential to improve validity at the columnar scale (relative to Markram et al. (2015)) as part of demonstrating validity of the full nbS1. Indeed, improved parametrization and validation at smaller scales was essential to parameterizing background input which generated robust nbS1 activity within realistic [Ca<sup>2+</sup>]<sub>o</sub> and firing rate ranges. We view this as a major achievement, as it was unknown whether the model would achieve a stable and meaningful regime at the start of our investigation. Whilst we would have liked to go further, our primary goal was to publish a well characterized model as an open resource that others could use to undertake further in-depth studies. In this regard, we are pleased that the parametrization of the nbS1 model has already been used to study EEG signals (Tharayil et al., 2024), as well as propagation of activity between two subregions (Bolaños-Puchet and Reimann, 2024).”
We also make it clearer in the Introduction on page 4 that the improved validation of the emergent columnar regime was essential to stable activity at the larger scale:
“These initial validations demonstrated that the model was in a more accurate regime compared to Markram et al. (2015) – an essential step before testing more complex or larger-scale validations. For example, under the same parameterization we then observed selective propagation of stimulus-evoked activity to downstream areas, and…”
(3) With a model like this, one has an opportunity to investigate computations and interactions across an extensive cortical network in an in vivo-like context. However, the simulations presented are not addressing realistic specific situations corresponding to animals performing a task or perceiving a relevant somatosensory stimulus. This makes the insights into the roles of cell types or connectivity architecture less interesting, as they are presented for relatively abstract situations. It is hard to see their relationship to important questions that the community would be excited about - theoretical concepts like predictive coding, biophysical mechanisms like dendritic nonlinearities, or circuit properties like feedforward, lateral, and feedback processing across interacting cortical areas. In other words, what do we learn from this work conceptually, especially, about the whole non-barrel somatosensory cortex?
We thank the reviewer for this comment and agree that it would be very interesting to explore such topics. In the Introduction on page 4, we have updated the list of papers which have so far used the model for more in-depth studies:
“…propagation of activity between cortical areas (Bolaños-Puchet and Reimann, 2024), the role of non-random connectivity motifs on network activity (Pokorny et al., 2024) and reliability (Egas Santander et al., 2024), the composition of high-level electrical signals such as the EEG (Tharayil et al., 2024), and how spike sorting biases population codes (Laquitaine et al., 2024).”
In the Discussion on page 33 we also add our additional thoughts on this topic:
“Whilst we would have liked to go further, our primary goal was to publish a well characterized model as an open resource that others could use to undertake further in-depth studies. In this regard, we are pleased that the parametrization of the nbS1 model has already been used to study EEG signals (Tharayil et al., 2024), as well as propagation of activity between two subregions (Bolaños-Puchet and Reimann, 2024). Investigation, improvement and validation must be continued at all spatial scales in follow up papers with detailed description, figures and analysis, which cannot be covered in this manuscript. Each new study increases the scope and validity of future investigations. In this way, this model and paper act as a stepping stone towards more complex questions of interest to the community such as perception, task performance, predictive coding and dendritic processing. This was similar for Markram et al. (2015) where the initial paper was followed by more detailed studies. Unlike the Markram et al. (2015) model, the new model can also be exploited by the community and has already been used in a number of follow up papers studying (Ecker et al., 2024a,b; Bolaños-Puchet and Reimann, 2024; Pokorny et al., 2024; Egas Santander et al., 2024; Tharayil et al., 2024; Laquitaine et al., 2024). We believe that the number of use cases for such a general model is vast, and is made larger by the increased size of the model.”
(4) Most comparisons with in vivo-like activity are done using experimental data for whisker deflection (plus some from the visual stimulation in V1). But this model is for the non-barrel somatosensory cortex, so exactly the part of the cortex that has less to do with whiskers (or vision). Is it not possible to find any in vivo neural activity data from the non-barrel cortex?
We agree with the reviewer that this is a weakness. We have expanded our discussion of the need to mix data sources to also consider our view for network level activity:
“This paper and its companion paper serve to present a methodology for modeling micro- and mesoscale anatomy and physiology, which can be applied for other cortical regions and species. With the rapid increase in openly available data, efforts are already in progress to build models of mouse brain regions with reduced reliance on data mixing thanks to much larger quantities of available atlas-based data. This also includes data for the validation of emergent network level activity. Here we chose to compare network-level activity to data mostly from the barrel cortex, as well as a single study from primary visual cortex. Whilst a lot of the data used to build the model was from the barrel cortex, the barrel cortex also represents a very well characterized model of cortical processing for simple and controlled sensory stimuli. The initial comparison of population-wise responses in response to accurate thalamic input for single whisker deflections was essential to demonstrating that the model was closer to in vivo, and we were unaware of similar data for non-barrel somatosensory regions. Moreover, our optogenetic and lesion study demonstrated the capacity to compare and extend studies of canonical cortical processing in the whisker system.”
(5) The authors almost do not show raw spike rasters or firing rates. I am sure most readers would want to decide for themselves whether the model makes sense, and for that, the first thing to do is to look at raster plots and distributions of firing rates. Instead, the authors show comparisons with in vivo data using highly processed, normalized metrics.
We thank the reviewer for this comment and agree that better visualizations of the network activity under different conditions are essential for helping the reader assess the work. In addition to raster plots in Video 1, Video 3, Fig 6, Fig 5C, Fig S9a, S16a, we have:
a) Changed the histograms of spontaneous activity in Fig 4G on page 13 to raster plots for the seven column subvolume for two contrasting meta-parameter regimes.
b) Added 4 new videos (Video 6a,b and 8a,b) showing all spontaneous and evoked meta-parameter combinations in hex0 and hex39 of the nbS1:
We have added improved plots showing the distributions of firing rates in the seven column subvolume on page 74:
With more detailed consideration in the Results on page 15:
“Long-tailed population firing rate distributions with means ∼ 1Hz
To study the firing rate distributions of different subpopulations and m-types, we ran 50s simulations for the meta-parameter combinations: [Ca<sup>2+</sup>]<sub>o</sub>: 1.05mM, R<sub>OU</sub>: 0.4, P<sub>FR</sub>: 0.3, 0.7 (Figure S4). Different subpopulations showed different sparsity levels (proportion of neurons spiking at least once) ranging from 6.6 to 42.5%. Wohrer et al. (2013) considered in detail the biases and challenges in obtaining ground truth firing rate distributions in vivo, and discuss the wide heterogeneity of reports in different modalities using different recording techniques. They conclude that most evidence points towards long-tailed distributions with peaks just below 1Hz. We confirmed that spontaneous firing rate distributions were long-tailed (approximately lognormally distributed) with means on the order of 1Hz for most subpopulations. Importantly, the layer-wise means were just below 1Hz in all layers for the P<sub>FR</sub> = 0.3 meta-parameter combination. Moreover, our recent work applying spike sorting to extracellular activity using this meta-parameter combination found spike sorted firing rate distributions to be lognormally distributed and very similar to in vivo distributions obtained using the same probe geometry and spike sorter (Laquitaine et al., 2024).”
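The two summary quantities in this passage, sparsity and the shape of the firing rate distribution, can be computed from per-neuron spike counts roughly as follows. This is an illustrative sketch rather than the authors' analysis code; the function name `summarize_firing_rates` is hypothetical, and it assumes that the mean rate is taken over all neurons while the log-scale statistics are taken over active neurons only.

```python
import numpy as np

def summarize_firing_rates(spike_counts, duration_s):
    """Summarize per-neuron firing rates for one subpopulation.

    spike_counts: number of spikes per neuron over the simulation.
    duration_s:   simulation length in seconds (50 s in the passage above).
    """
    counts = np.asarray(spike_counts, dtype=float)
    rates = counts / duration_s
    active = rates > 0
    summary = {
        # Sparsity as defined in the text: proportion spiking at least once.
        "sparsity": float(active.mean()),
        # Assumption: the mean includes silent neurons.
        "mean_rate_hz": float(rates.mean()),
    }
    if active.any():
        # For a lognormal distribution, log-rates of active neurons
        # are approximately normally distributed.
        log_rates = np.log10(rates[active])
        summary["log10_mean"] = float(log_rates.mean())
        summary["log10_std"] = float(log_rates.std())
    return summary
```

For example, `summarize_firing_rates([0, 50, 0, 100], 50.0)` describes four neurons of which two are silent, giving a sparsity of 0.5 and a mean rate of 0.75 Hz.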
(6) While the authors claim that their model with one set of parameters reproduces many experimentally established metrics, that is not entirely what one finds. Instead, they provide different levels of overall stimulation to their model (adjusting the target "P_FR" parameter, with values from 0 to 1, and other parameters), and that influences results. If I get this right (the figures could really be improved with better organization and labeling), simulations with P<sub>FR</sub> closer to 1 provide more realistic firing rate levels for a few different cases, however, P<sub>FR</sub> of 0.3 and possibly above tends to cause highly synchronized activity - what the authors call bursting, but which also could be called epileptic-like activity in the network.
We thank the reviewer for this comment. We can now see that the motivation for the P<sub>FR</sub> parameter was introduced very briefly in the Results and that the results of the calibration and analysis of the spontaneous activity regime were not interpreted in relation to this parameter.
To address this, we have given more detail where it is first introduced in the Results on page 12:
“to account for uncertainty in the firing rate bias during spontaneous activity from extracellular spike sorted recordings…”
We then reconsider that it represents an unknown bias when interpreting the calibration and spontaneous activity results on page 15:
“We reemphasize that the [Ca<sup>2+</sup>]<sub>o</sub>, R<sub>OU</sub> and P<sub>FR</sub> meta-parameters account for uncertainty of in vivo extracellular calcium concentration, the nature of inputs from other brain regions and the bias of extracellularly recorded firing rates. Whilst estimates for [Ca<sup>2+</sup>]<sub>o</sub> are between 1.0 - 1.1mM (Jones and Keep, 1988; Massimini and Amzica, 2001; Amzica et al., 2002; Gonzalez et al., 2022) and estimates for P<sub>FR</sub> are in the range of 0.1 - 0.3 (Olshausen and Field, 2006), combinations of these parameters supporting in vivo-like stimulus responses in later sections will offer a prediction for the true values of these parameters. Both these later results and our recent analysis of spike sorting bias using this model (Laquitaine et al., 2024) predict a spike sorting bias corresponding to P<sub>FR</sub> ∼ 0.3, confirming the prediction of Olshausen and Field (2006).”
And in relation to the stimulus evoked responses on page 17:
“Specifically, simulations with P<sub>FR</sub> from 0.1 to 0.5 robustly support realistic stimulus responses, with the middle of this range (0.3) corresponding with estimates of in vivo recording bias; both the previous estimates of Olshausen and Field (2006) and from a spike sorting study using this model (Laquitaine et al., 2024).”
Following these considerations, the remainder of the experiments using the seven column subvolume only use a single meta-parameter on page 19.
For the full nbS1 we further discuss the importance of a P<sub>FR</sub> value between 0.1 and 0.3 in the Results on page 26:
“Stable spontaneous activity only emerges in nbS1 at predicted in vivo firing rates
After calibrating the model of extrinsic synaptic input for the seven column subvolume, we tested to what degree the calibration generalizes to the entire nbS1. Notably, this included the addition of mid-range connectivity (Reimann et al., 2024). The total number of local and mid-range synapses in the model was 9138 billion and 4075 billion, i.e., on average full model simulations increased the number of intrinsic synapses onto a neuron by 45%. Particularly, we ran simulations for P<sub>FR</sub> ∈ [0.1, 0.15, ..., 0.3] using the OU parameters calibrated for the seven column subvolume for [Ca<sup>2+</sup>]<sub>o</sub> = 1.05mM and R<sub>OU</sub> = 0.4. Each of these full nbS1 simulations produced stable non-bursting activity (Figure 8A), except for the simulation for P<sub>FR</sub> = 0.3, which produced network-wide bursting activity (Video 6). Activity levels in the simulations of spontaneous activity were heterogeneous (Figure 8B, Video 7). In some areas, firing rates were equal to the target P<sub>FR</sub>, whilst in others they increased above the target (Figure 8C). In the more active regions, mean firing rates (averaged over layers) were on the order of 30-35% of the in vivo references for the maximum non-bursting P<sub>FR</sub> simulation (target P<sub>FR</sub> : 0.25). This range of firing rates again fits with the estimate of firing rate bias from our paper studying spike sorting bias (Laquitaine et al., 2024) and the meta-parameter range supporting realistic stimulus responses in the seven column subvolume. This also predicts that the nbS1 cannot sustain higher firing rates without entering a bursting regime.”
Finally, we also added to our discussion of biases in extracellular firing rates in the Discussion on page 32:
“This is also in line with our recent work using the model, which estimated a spike sorting bias corresponding to P<sub>FR</sub> = 0.3 using virtual extracellular electrodes (Laquitaine et al., 2024).”
We also thank the reviewer for pointing out that we did not define the term “bursting” in the main text. We have added the following definition and discussion in the Results on page 15:
“Note that the most correlated meta-parameter combination [Ca<sup>2+</sup>]<sub>o</sub>: 1.1mM, R<sub>OU</sub>: 0.2, P<sub>FR</sub>: 1.0 produced network-wide “bursting” activity, which we define as highly synchronous all or nothing events (Video 1). Such activity, which may be characteristic of epileptic activity, can be studied with the model but is not the focus of this study.”
(7) The authors mention that the model is available online, but the "Resource availability" section does not describe that in substantial detail. As they mention in the Abstract, it is only a subvolume that is available. That might be fine, but more detail in appropriate parts of the paper would be useful.
Firstly, we are pleased to say that the full nbS1 model is now available to download, in addition to the seven hexagon subvolume. In the manuscript, we have:
a) Added to the Introduction at the bottom of page 4:
“To provide a framework for further studies and integration of experimental data, the full model is made available with simulation tools, as well as a smaller subvolume with the optional new connectome capturing inhibitory targeting rules from electron microscopy”.
b) Updated the open source panel of Figure 1:
Secondly, we thank the reviewer for noticing that the available model is not well described in the “Resource availability” statement and have addressed this by:
a) Adding the following to the “Resource availability” statement on page 36:
“Both the full nbS1 model and smaller seven hexagon subvolume are available on Harvard Dataverse and Zenodo respectively in SONATA format (Dai et al., 2020) with simulation code. DOIs are listed under the heading “Final simulatable models” in the Key resources table. An additional link is provided to the SM-Connectome with instructions on how to use it with the seven hexagon subvolume model.”
b) Creating a new subheading in the “Key resources table” titled: “Final simulatable models” to make it clearer which links refer to the final models.
Reviewer #2 (Public review):
Summary:
This paper is a companion to Reimann et al. (2022), presenting a large-scale, data-driven, biophysically detailed model of the non-barrel primary somatosensory cortex (nbS1). To achieve this unprecedented scale of a bottom-up model, approximately 140 times larger than the previous model (Markram et al., 2015), they developed new methods to account for inputs from missing brain areas, among other improvements. Isbister et al. focus on detailing these methodological advancements and describing the model's ability to reproduce in vivo-like spontaneous, stimulus-evoked, and optogenetically modified activity.
Strengths:
The model generated a series of predictions that are currently impossible in vivo, as summarized in Table S1. Additionally, the tools used in this study are made available online, fostering community-based exploration. Together with the companion paper, this study makes significant contributions by detailing the model's constraints, validations, and potential caveats, which are likely to serve as a basis for advancing further research in this area.
We thank the reviewer for these comments, and are pleased they appreciate these aspects of the work.
Weaknesses:
That said, I have several suggestions to improve clarity and strengthen the validation of the model's in vivo relevance.
Major:
(1) For the stimulus-response simulations, the authors should also reference, analyze, and compare data from O'Connor et al. (2010; https://pubmed.ncbi.nlm.nih.gov/20869600/) and Yu et al. (2016; https://pubmed.ncbi.nlm.nih.gov/27749825/) in addition to Yu et al. 2019, which is the only data source the authors consider for an awake response. The authors mentioned bias in spike rate measurements, but O'Connor et al. used cell-attached recordings, which do not suffer from activity-based selection bias (in addition, they also performed Ca2+ imaging of L2/3). This was done in the exact same task as Yu et al., 2019, and they recorded from over 100 neurons across layers. Combining this data with Yu et al., 2019 would provide a comprehensive view of activity across layers and inhibitory cell types. Additionally, Yu et al. (2016) recorded VPM neurons in the same task, alongside whole-cell recordings in L4, showing that L4 PV neurons filter movement-related signals encoded in thalamocortical inputs during active touch. This dataset is more suitable for extracting VPM activity, as it was collected under the same behavior and from the same species (unlike Diamond et al., 1992, which used anesthetized rats). Furthermore, this filtering is an interesting computation performed by the network the authors modeled. The validation would be significantly strengthened and more biologically interesting if the authors could also reproduce the filtering properties, membrane potential dynamics, and variability in the encoding of touch across neurons, not just the latency (which is likely largely determined by the distance and number of synapses).
We thank the reviewer for pointing out these very useful studies. We have taken on board this suggestion for a future model of the mouse barrel cortex.
(2) The authors mention that in the model, the response of the main activated downstream area was confined to L6. Is this consistent with in vivo observations? Additionally, is there any in vivo characterization of the distance dependence of spiking correlation to validate Figure 8I?
We are not aware of data confirming the propagation of activity to downstream areas being confined to layer 6 but have considered the connectivity further between these two regions on page 27, as well as studying this further in follow up work:
“Stable propagation of evoked activity through mid-range connectivity only emerges in nbS1 at predicted in vivo firing rates
We repeated the previous single whisker deflection evoked activity experiment in the full model, providing a synchronous thalamic input into the forelimb sub-region (S1FL; Figure 8E; Video 8 & 9). Responses in S1FL were remarkably similar to the ones in the seven column subvolume, including the delays and decays of activity (Figure 8F). However, in addition to a localized primary response in S1FL within 350μm of the stimulus, we found several secondary responses at distal locations (Figure 8E; Video 9), which was suggestive of selective propagation of the stimulus-evoked signal to downstream areas efferently connected by mid-range connectivity. The response of the main activated downstream area (visible in Figure 8E) was confined to L6 (Figure 8G). In a follow up study using the model to explore the propagation of activity between cortical regions (Bolaños-Puchet and Reimann, 2024), it is described how the model contains both a feedforward projection pattern, which projects principally to synapses in L1 & L23, and a feedback type pattern, which principally projects to synapses in L1 & L6. On visualizing the innervation profile from the stimulated hexagon to the downstream hexagon we can see that we have stimulated a feedback pathway (Figure S16).”
With referenced Figure S16 on page 85:
We did find in vivo evidence of similar layer-wise and distance dependence of correlations in the somatosensory cortex discussed on page 27 of the Results:
“The distance dependence of correlations followed a similar profile to that observed in a dataset characterizing spontaneous activity in the somatosensory cortex (Reyes-Puerta et al., 2015a) (compare red line in Figure 8I with Figure S16). In the in vivo dataset spiking correlation was also low but highest in lower layers, with short “up-states” in spiking activity constrained to L5 & 6 (see Figure 1E,F in (Reyes-Puerta et al., 2015a)). In the model, they are constrained to L6.”
With Figure S16a on page 85 showing the distance dependence of correlations in the anaesthetized barrel cortex during spontaneous activity (digitization from the reference paper):
(3) Across the figures, activity is averaged across neurons within layers and E or I cell types, with a limited description of single-cell type and single-cell responses. Were there any predictions regarding the responses of particular cell types that significantly differ from others in the same layer? Such predictions could be valuable for future investigations and could showcase the advantages of a data-driven, biophysically detailed model.
We thank the reviewer for this comment. In addition to new analyses at higher granularity addressed in other comments, we have added the following comparison of stimulus-evoked membrane potential dynamics in different subpopulations for the original connectome and SM-connectome in Figure 7 on page 24.
This gave interesting results discussed in a new subsection on page 26:
“EM targeting trends hyperpolarize Sst+ and 5HT3aR+ late response, and disinhibit L5/6 E
Studying somatic membrane potentials for different subpopulations in response to whisker deflections shows that PV+, L23E and L4E subpopulations are largely unaffected in the SM-connectome (Figure 7E). Interestingly, Sst+ and 5HT3aR+ subpopulations show a strong hyperpolarization in the late response that isn’t present in the original connectome. Interestingly, this corresponds with a stronger late response in L5/6 E populations, which could be caused by disinhibition due to the Sst+ and 5HT3aR+ hyperpolarization. This could be explored further in follow up studies using our connectome manipulator tool (Pokorny et al., 2024).”
(4) 2.4: Are there caveats to assuming the OU process as a model for missing inputs? Inputs to the cortex are usually correlated and low-dimensional (i.e., communication subspace between cortical regions), but the OU process assumes independent conductance injection. Can (weakly) correlated inputs give rise to different activity regimes in the model? Can you add a discussion on this?
We agree with the reviewer that there are caveats to assuming an OU process for the model of missing inputs and have added the following to the Discussion on page 31:
“The calibration framework could optimize per population parameters for other compensation methods, whilst still offering an interpretable spectrum of firing rate regimes at different levels of P<sub>FR</sub>. For example, more realistic compensation schemes could be explored which introduce a) correlations between the inputs received by different neurons and b) compensation distributed across dendrites, as well as at the soma. We predict that such changes would make spontaneous activity more correlated at the lower spontaneous firing rates which supported in vivo like responses (P<sub>FR</sub> : 0.1 − 0.5), which would in turn make stimulus-responses more noise correlated.”
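As a rough illustration of the caveat discussed above, the following Python sketch contrasts independent OU noise with OU noise that shares a common component across neurons. It is a toy example under stated assumptions and not the compensation scheme used in the model: the function names, the shared-variance fraction `c`, and all numeric defaults are hypothetical choices made for illustration only.

```python
import math
import random


def ou_step(x, mu, tau, sigma, dt, xi):
    """Advance an OU process by dt using the exact discretization.

    xi is a standard normal sample; sigma is the stationary standard deviation.
    """
    decay = math.exp(-dt / tau)
    return mu + (x - mu) * decay + sigma * math.sqrt(1.0 - decay ** 2) * xi


def simulate_ou_inputs(n_neurons, n_steps, c, mu=0.0, tau=1.0, sigma=1.0,
                       dt=0.1, seed=0):
    """Return one OU trace per neuron.

    c in [0, 1] is the (assumed) fraction of noise variance shared by all
    neurons: c = 0 gives independent inputs, c = 1 gives identical inputs.
    """
    rng = random.Random(seed)
    x = [mu] * n_neurons
    traces = [[] for _ in range(n_neurons)]
    for _ in range(n_steps):
        shared = rng.gauss(0.0, 1.0)  # common drive seen by every neuron
        for i in range(n_neurons):
            private = rng.gauss(0.0, 1.0)  # neuron-specific drive
            # Mix shared and private noise so the total variance stays 1.
            xi = math.sqrt(c) * shared + math.sqrt(1.0 - c) * private
            x[i] = ou_step(x[i], mu, tau, sigma, dt, xi)
            traces[i].append(x[i])
    return traces


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length traces."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)
```

Because every neuron's noise passes through the same linear OU filter, the long-run correlation between two traces should approach `c`, so sweeping `c` from 0 upwards gives a simple way to ask how weakly correlated inputs might change the activity regime.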
(5) 2.6: The network structure is well characterized in the companion paper, where the authors report that correlations in higher dimensions were driven by a small number of neurons with high participation ratios. It would be interesting to identify which cell types exhibit high node participation in high-dimensional simplices and examine the spiking activity of cells within these motifs. This could generate testable predictions and inform theoretical cell-type-specific point neuron models for excitatory/inhibitory balanced networks and cortical processing.
We thank the reviewer for this suggestion. We have added two supplementary figures to address this suggestion, which are discussed in the Results on page 16:
“Additionally, we studied the structural effect on the firing rate (here measured as the inverse of the inter-spike interval, ISI, which can be thought of as a proxy of non-zero firing rate). We found that for the connected circuit, the firing rate increases with simplex dimension; in contrast with the disconnected circuit, where this relationship remains flat (see Figure S6 red vs. blue curves and Methods).
This also demonstrates high variability between neurons, in line with biology, both structurally (Towlson et al., 2013; Nigam et al., 2016) and functionally (Wohrer et al., 2013; Buzsáki and Mizuseki, 2014). We next identified the cell types that are overexpressed in the group of neurons that have the 5% highest values of node participation across dimensions (Figure S7). This could inform theoretical point neuron models with cell-type specificity, for example. We found that while in dimension one (i.e., node degree) this consists mostly of inhibitory cells, in higher dimensions the cell types concentrate in layers 4, 5 and 6, especially for TPC neurons. This is in line with our structural layer-wise findings in Figure 8B in Reimann et al. (2024).”
Which reference new Figures S6 and S7:
With the methodology for S6 described on page 49 of the Methods:
“For any numeric property of neurons, e.g., firing rate, we evaluate the effect of dimension on it by taking weighted averages across dimensions. That is, for each dimension k, we take the weighted average of the property across neurons, where the weights are given by node participation on dimension k. More precisely, let N be the number of neurons and $\vec{V} \in \mathbb{R}^N$ be a vector of a property on all the neurons, e.g., the vector of firing rates. Then in each dimension k we compute
$$\frac{\vec{V} \cdot \overrightarrow{\mathrm{Par}}_k}{\sum_{i=1}^{N} \left(\overrightarrow{\mathrm{Par}}_k\right)_i},$$
where $\overrightarrow{\mathrm{Par}}_k$ is the vector of node participation on dimension k for all neurons and $\cdot$ is the dot product.
To measure the over- and underexpression of the different m-types among those with the highest 5% of values of node participation, we used the hypergeometric distribution to determine the expected distribution of m-types in a random sample of the same size. More precisely, for each dimension k and m-type m, let N<sub>total</sub> be the total number of neurons in the circuit, N<sub>m</sub> be the number of neurons of m-type m in the circuit, C<sub>top</sub> be the number of neurons with the highest 5% values of node participation in dimension k, C<sub>m</sub> the number of neurons of m-type m among these, and let P = hypergeom(N<sub>total</sub>, N<sub>m</sub>, C<sub>top</sub>) be the hypergeometric distribution.
By definition, P(x) describes the probability of sampling x neurons of m-type m in a random sample of size C<sub>top</sub>. Therefore, using the cumulative distribution F(x) = P(Counts ≤ x), we can compute the p-values as follows:
Small values indicate under and over representation respectively….”
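The two Methods steps quoted above (participation-weighted averaging and the hypergeometric test) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the analysis code used in the study: the function names are hypothetical, and the convention for the over-representation p-value, P(Counts ≥ C<sub>m</sub>), is an assumption, since the exact expression is not reproduced in the text.

```python
from math import comb


def weighted_average_by_participation(values, participation):
    """Average a per-neuron property, weighting each neuron by its node
    participation in a single dimension k."""
    total = sum(participation)
    if total == 0:
        raise ValueError("no neuron participates in this dimension")
    return sum(v * p for v, p in zip(values, participation)) / total


def hypergeom_pmf(x, n_total, n_m, c_top):
    """Probability of drawing exactly x neurons of a given m-type in a
    random sample of size c_top (without replacement)."""
    lowest = max(0, c_top - (n_total - n_m))
    highest = min(n_m, c_top)
    if x < lowest or x > highest:
        return 0.0
    return comb(n_m, x) * comb(n_total - n_m, c_top - x) / comb(n_total, c_top)


def expression_p_values(c_m, n_total, n_m, c_top):
    """Return (p_under, p_over) for an observed count c_m.

    p_under = P(Counts <= c_m); p_over = P(Counts >= c_m) (assumed convention).
    """
    p_under = sum(hypergeom_pmf(x, n_total, n_m, c_top)
                  for x in range(0, c_m + 1))
    p_over = sum(hypergeom_pmf(x, n_total, n_m, c_top)
                 for x in range(c_m, c_top + 1))
    return p_under, p_over
```

Under this convention, a small `p_under` would flag an m-type as underrepresented among the top 5% of neurons and a small `p_over` would flag it as overrepresented. For large circuits, a library implementation of the hypergeometric distribution would be a more practical choice than exact binomial coefficients.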
Minor:
(1) Since the previous model was published in 2015, the neuroscience field has seen significant advancements in single-cell and single-nucleus sequencing, leading to the clustering of transcriptomic cell types in the entire mouse brain. For instance, the Allen Institute has identified ~10 distinct glutamatergic cell types in layer 5, which exceeds the number incorporated into the current model. Could you discuss 1) the relationship between the modeled me-types and these transcriptomic cell types, and 2) how future models will evolve to integrate this new information? If there are gaps in knowledge in order to incorporate some transcriptome cell types into your model, it would be helpful to highlight them so that efforts can be directed toward addressing these areas.
We thank the reviewer for this suggestion, particularly the idea to describe what types of data would be valuable towards improving the model in future. We have added the following to the Discussion on page 33:
“In our previous work (Roussel et al., 2023) we linked mouse inhibitory me-models to transcriptomic types (t-types) in a whole mouse cortex transcriptomic dataset (Gouwens et al., 2019). This can provide a direct correspondence in future large-scale mouse models. As we model only a single electrical type for pyramidal cells there is no one-to-one correspondence between our me-models and the 10 different pyramidal cell types identified there. We are not currently aware of any method which can recreate the electrical features of different types of pyramidal cells using only generic ion channel models. To achieve the firing pattern behavior of more specific electrical types, usually ion channel kinetics are tweaked, and this would violate the compartmentalization of parameters. In future we hope to build morpho-electric-transcriptomic type (met-type) models by selecting gene-specific ion channel models (Ranjan et al., 2019, 2024) based on the met-type’s gene expression. Data specific to different neuron sections (i.e. soma, AIS, apical/basal dendrites) of different met-types, such as gene expression, distribution of ion channels, and voltage recordings under standard single cell protocols would be particularly useful.”
(2) For the optogenetic manipulation, it would be interesting if the model could reproduce the paradoxical effects (for example, Mahrach et al. reported paradoxical effects caused by PV manipulation in S1; https://pubmed.ncbi.nlm.nih.gov/31951197/). This seems a more relevant and non-trivial network phenomenon than the V1 manipulation the authors attempted to replicate.
We thank the reviewer for this valuable idea. Indeed, our model is able to reproduce paradoxical effects under certain conditions. We added the following new supplementary Figure S12 demonstrating this finding (black arrows).
Which we discuss in the Results on page 22:
“However, at high contrasts, we observed a paradoxical effect of the optogenetic stimulation on L6 PV+ neurons, reducing their activity with increasing stimulation strength (Figure S12B; cf. Mahrach et al. (2020)). This effect did not occur under grey screen conditions (i.e., at contrast 0.0) with a constant background firing rate of 0.2 Hz or 5 Hz respectively (not shown). The individual…”
and added to the Discussion on page 32:
“Also, we predicted a paradoxical effect of optogenetic stimulation on L6 PV+ interneurons, namely a decrease in firing with increased stimulus strength. This is reminiscent of the paradoxical responses found by Mahrach et al. (2020) in the mouse anterior lateral motor cortex (in L5, but not in L2/3) and barrel cortex (no layer distinction) respectively. While Mahrach et al. (2020) conducted their recordings in awake mice not engaged in any behavior, we observed this effect only when drifting grating patterns with high contrast were presented. Nevertheless, consistent with their findings, we found the effect only in deep but not in superficial layers, and only for PV+ interneurons but not for PCs. Our model could therefore be used to improve the understanding of this paradoxical effect in follow up studies. These examples demonstrate that the approach of modeling entire brain regions can be used to further probe the topics of the original articles and cortical processing.”
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
My specific comments are in the Public Review. The summarizing point is that this is a sprawling paper, and it is easy for readers to get confused. Focusing on specific connections between known functional properties and findings in this model, especially for the full-scale model, will be helpful.
We thank the reviewer for this comment and for their related recommendation (4) below, and have added subheadings throughout the Results.
Reviewer #2 (Recommendations for the authors):
(1) P4. What are the 10 free parameters?
We thank the reviewer for pointing out that it would be useful to summarize the 10 parameters at this stage of the text, and have adjusted the sentence to:
“As a result, the emerging in-vivo like activity is the consequence of only 10 free parameters representing the strength of extrinsic input from other brain regions into 9 layer-specific excitatory and inhibitory populations, and a parameter controlling the noise structure of this extrinsic input.”
(2) Table 1 and S1 are extremely useful. Could you provide a table summarizing the major assumptions or gaps in the model, their potential influence on the results, and possible ways to collect data that could support or challenge these assumptions? Currently, this information is scattered throughout the manuscript.
We thank the reviewer for this very useful suggestion and have added a Table S8 on page 68:
(3) Figure 4F is important, but the legend is unclear. What is the unit on the x-axis? The values seem too large to represent per-neuron measurements.
Thank you to the reviewer for raising this. Indeed, the values are estimated mean numbers of missing synapses per neuron, by population. Such numbers are difficult to estimate, but we have further discussed our rationale, justification and consideration of whether these numbers are accurate in the Results, as follows:
“Heterogeneity in synaptic density within and across neuron classes and sections makes estimating the number of missing synapses challenging (DeFelipe and Fariñas, 1992). Changing the assumed synaptic density value of 1.1 synapses/μm would only change the slope of the relationship, however. Estimates of mean number of existing and missing synapses per population were within reasonable ranges; even the larger estimate for L5 E (due to higher dendritic length; Figure S3) was within biological estimates of 13,000 ± 3,500 total afferent synapses (DeFelipe and Fariñas, 1992).”
This text references the new supplementary Figure S3:
Moreover, these numbers represent the number of synapses, rather than the number of connections. The number of connections is usually used for quantifications such as indegree, and is usually much lower.
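To make the density-based reasoning above concrete, here is a small Python sketch of how a per-neuron estimate of missing synapses could follow from dendritic length and an assumed synaptic density. It is an illustration under stated assumptions, not the procedure used for Figure 4F: the function name is hypothetical, and only the 1.1 synapses/μm default is taken from the quoted text.

```python
def estimate_missing_synapses(dendritic_length_um, existing_synapses,
                              density_per_um=1.1):
    """Estimate missing afferent synapses for one neuron.

    The expected total is dendritic length times an assumed synaptic density;
    synapses already present in the model are subtracted. Values are clipped
    at zero so an over-connected neuron does not yield a negative estimate.
    """
    expected_total = dendritic_length_um * density_per_um
    return max(0.0, expected_total - existing_synapses)
```

Since the expected total is linear in dendritic length, changing `density_per_um` only rescales the slope of that relationship, which matches the point made in the quoted Results text.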
We have also updated the caption and axis labels of the original figure:
(4) Including additional subsections or improving the indexing in the Results section could be beneficial. In its current format, it's difficult to distinguish where the model description ends and where the validation begins. Some readers may want to focus more on the validation than other parts, so clearer segmentation would improve readability.
We have addressed this comment in our response to the opening comment under “Recommendations for the authors”.
(5) P4. 2nd paragraph. Original vs rewired connectome. The term "rewired connectome" may give the impression that it refers to an artificial manipulation rather than a modification based on the latest data. It might be helpful to use a different term (e.g., SM-connectome as described later in the paper?).
We have adjusted the text in the introduction:
“Additionally, we generated a new connectome which captured recently characterized spatially-specific targeting rules for different inhibitory neuron types (Schneider-Mizell et al., 2023) in the MICrONS electron microscopy dataset (MICrONS-Consortium et al., 2021), such as increased perisomatic targeting by PV+ neurons, and increased targeting of inhibitory populations by VIP+ neurons. Comparing activity to the original connectome gave predictions about the role of these additional targeting rules.”
(6) Figures 7 B, C, D: what is v1/v2? Original vs SM-Connectome?
We thank the reviewer for noticing this and have corrected the figure to use “Orig” and “SM” consistent with the rest of the figure.
(7) Page 23, 2.10: what is phi?
We thank the reviewer for noticing this inconsistency with the earlier text, and have updated the text to read: “Particularly, we ran simulations for P<sub>FR</sub> ∈ [0.1, 0.15, ..., 0.3] using the OU parameters calibrated for the seven column subvolume for [Ca<sup>2+</sup>] = 1.05 mM and R<sub>OU</sub> = 0.4.”
-
eLife Assessment
This important study investigates the implications of human endogenous retrovirus (HERV) activity in myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) and fibromyalgia (FM). These findings indicate significant associations that coincide with previous literature, which has suggested roles for differential HERV activity in degenerative, inflammatory, and aging-related pathologies of the central nervous system (CNS), as well as neurotropic infections. These seminal studies can be strengthened with minor improvements to the methodologies of characterizing differential HERV activity, further characterizing downstream mechanisms by which HERV activity impacts disease and by an expansion of the datasets utilized to include additional cohorts. These compelling findings are of immediate importance to clinicians, policymakers, and researchers interested in the underlying etiology of human health and disease.
-
Reviewer #1 (Public review):
Summary:
Giménez-Orenga et al. investigate the origin and pathophysiology of myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) and fibromyalgia (FM). Using RNA microarrays, the authors compare the expression profiles and evaluate the biomarker potential of human endogenous retroviruses (HERV) in these two conditions. Altogether, the authors show that HERV expression is distinct between ME/CFS and FM patients, and HERV dysregulation is associated with higher symptom intensity in ME/CFS. HERV expression in ME/CFS patients is associated with impaired immune function and higher estimated levels of plasma cells and resting CD4 memory T cells. This work provides interesting insights into the pathophysiology of ME/CFS and FM, creating opportunities for several follow-up studies.
Strengths:
(1) Overall, the data is convincing and supports the authors' claims. The manuscript is clear and easy to understand, and the methods are generally well-detailed. It was quite enjoyable to read.
(2) The authors combined several unbiased approaches to analyse HERV expression in ME/CFS and FM. The tools, thresholds, and statistical models used all seem appropriate to answer their biological questions.
(3) The authors propose an interesting alternative to diagnosing these two conditions. Transcriptomic analysis of blood samples using an RNA microarray could allow a minimally invasive and reproducible way of diagnosing ME/CFS and FM.
Weaknesses:
(1) The cohort analysed in this study was phenotyped by a single clinician. As ME/CFS and FM are diagnosed based on unspecific symptoms and are frequently misdiagnosed, this raises the question of whether the results can be generalised to external cohorts.
(2) The analyses performed to unravel the causes and effects of HERV expression in ME/CFS and FM are solely based on sequencing data. Experimental approaches could be used to validate some of the transcriptomic observations.
-
Reviewer #2 (Public review):
Summary:
Giménez-Orenga et al. carried out this study to assess whether human endogenous retroviruses (HERVs) could be used to improve the diagnosis of Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (ME/CFS) and Fibromyalgia (FM). To this end, they used the HERV-V3 array developed previously, to characterize the genome-wide changes in the expression of HERVs in patients suffering from ME/CFS, FM, or both, compared to controls. In turn, they present a useful repertoire of HERVs that might characterize ME/CFS and FM. For the most part, the paper is written in a manner that allows a natural understanding of the workflow and analyses carried out, making it compelling. The figures and additional tables present solid support for the findings. However, some statements made by the authors seem incomplete and would benefit from a more thorough literature review. Overall, this work will be of interest to the medical community seeking a better understanding of the co-occurrence of these pathologies, hinting at a novel angle by integrating HERVs, which are often overlooked, into their assessment.
Strengths:
(1) The work is well-presented, allowing the reader to understand the overall workflow and how the specific aims contribute to filling the knowledge gap in the field.
(2) The analyses carried out to understand the potential impact on gene expression mediated by HERVs are in line with previous works, making it solid and robust in the context of this study.
Weaknesses:
(1) The authors claim to obtain genome-wide HERV expression profiles. However, the array used was developed using hg19, while the genomic analyses in this work are carried out using a liftover to hg38. It would improve the statement and findings to include a comparison of the differences in HERVs available in hg38, and how this could impact the "genome-wide" findings.
(2) At some points, the authors are not thorough with the cited literature. Two examples are:
a) Lines 396-397 the authors say "the MLT1, usually found enriched near DE genes (Bogdan et al., 2020)". I checked the work by Bogdan, and they studied bacterial infection. A single work in a specific topic is not sufficient to support the statement that MLT1 is "usually" in close vicinity to differentially expressed genes. More works are needed to support this.
b) After the previous statement, the authors go on to mention "contributing to the coding of conserved lncRNAs (Ramsay et al., 2017)". First, lnc = long non-coding, so this doesn't make sense. Second, in the work by Ramsay they mention "that contributed a significant amount of sequence to primate lncRNAs whose expression was conserved", which is different from what the authors in this study are trying to convey. Again, additional work and a rephrasing might help to support this idea.
(3) When presenting the clusters, the authors overlook the fact that cluster 4 is clearly control-specific, and fail to discuss what this means. Could this subset of HERV be used as bona fide markers of healthy individuals in the context of these diseases? Are they associated with DE genes? What could be the impact of such associations?
Appraisals on aims:
The authors set specific questions and presented the results to successfully answer them. The evidence is solid, with some weaknesses discussed above that will methodologically strengthen the work.
Likely impact of work on the field:
This work will be of interest to the medical community looking for novel ways to improve clinical diagnosis. Although future works with a greater population size, and more robust techniques such as RNA-Seq, are needed, this is the first step in presenting a novel way to distinguish these pathologies.
It would be of great benefit to the community to provide a table/spreadsheet indicating the specific genomic locations of the HERVs specific to each condition. This will allow proper provenance for future researchers interested in expanding on this knowledge, as these genomic coordinates will be independent of the technique used (as was the array used here).
-
Reviewer #3 (Public review):
The authors find that HERV expression patterns can be used as new criteria for differential diagnosis of FM and ME/CFS and patient subtyping. The data are based on transcriptome analysis by microarray for HERVs using patient blood samples, followed by differential expression of ERVs and bioinformatic analyses. This is a standard and solid data processing pipeline, and the results are well presented and support the authors' claim.
-
Author response:
Public Reviews:
Reviewer #1 (Public review):
Summary:
Giménez-Orenga et al. investigate the origin and pathophysiology of myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) and fibromyalgia (FM). Using RNA microarrays, the authors compare the expression profiles and evaluate the biomarker potential of human endogenous retroviruses (HERV) in these two conditions. Altogether, the authors show that HERV expression is distinct between ME/CFS and FM patients, and HERV dysregulation is associated with higher symptom intensity in ME/CFS. HERV expression in ME/CFS patients is associated with impaired immune function and higher estimated levels of plasma cells and resting CD4 memory T cells. This work provides interesting insights into the pathophysiology of ME/CFS and FM, creating opportunities for several follow-up studies.
Strengths:
(1) Overall, the data is convincing and supports the authors' claims. The manuscript is clear and easy to understand, and the methods are generally well-detailed. It was quite enjoyable to read.
(2) The authors combined several unbiased approaches to analyse HERV expression in ME/CFS and FM. The tools, thresholds, and statistical models used all seem appropriate to answer their biological questions.
(3) The authors propose an interesting alternative to diagnosing these two conditions. Transcriptomic analysis of blood samples using an RNA microarray could allow a minimally invasive and reproducible way of diagnosing ME/CFS and FM.
Weaknesses:
(1) The cohort analysed in this study was phenotyped by a single clinician. As ME/CFS and FM are diagnosed based on unspecific symptoms and are frequently misdiagnosed, this raises the question of whether the results can be generalised to external cohorts.
Thank you for your comment. Studies of larger cohorts will certainly be needed to determine the external validity of these results in a clinical scenario. However, this pilot study, the first of its kind, was designed to maximize homogeneity across participants, which was primarily ensured by including only female participants, all diagnosed by a single experienced observer.
(2) The analyses performed to unravel the causes and effects of HERV expression in ME/CFS and FM are solely based on sequencing data. Experimental approaches could be used to validate some of the transcriptomic observations.
Certainly, experimental approaches may add robustness to our findings. We are in fact considering this avenue to explore the observations presented here in greater depth. However, the limited knowledge of HERV-mediated physiological functions may hinder the task of revealing causes and effects of HERV expression in ME/CFS and FM in the short term.
Reviewer #2 (Public review):
Summary:
Giménez-Orenga et al. carried out this study to assess whether human endogenous retroviruses (HERVs) could be used to improve the diagnosis of Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (ME/CFS) and Fibromyalgia (FM). To this end, they used the previously developed HERV-V3 array to characterize the genome-wide changes in the expression of HERVs in patients suffering from ME/CFS, FM, or both, compared to controls. In turn, they present a useful repertoire of HERVs that might characterize ME/CFS and FM. For the most part, the paper is written in a manner that allows a natural understanding of the workflow and analyses carried out, making it compelling. The figures and additional tables present solid support for the findings. However, some statements made by the authors seem incomplete and would benefit from a more thorough literature review. Overall, this work will be of interest to the medical community seeking a better understanding of the co-occurrence of these pathologies, hinting at a novel angle by integrating HERVs, which are often overlooked, into their assessment.
Strengths:
(1) The work is well-presented, allowing the reader to understand the overall workflow and how the specific aims contribute to filling the knowledge gap in the field.
(2) The analyses carried out to understand the potential impact on gene expression mediated by HERVs are in line with previous work, making them solid and robust in the context of this study.
Weaknesses:
(1) The authors claim to obtain genome-wide HERV expression profiles. However, the array used was developed using hg19, while the genomic analyses of this work are carried out using a liftover to hg38. It would improve the statement and findings to include a comparison of the differences in HERVs available in hg38, and how this could impact the "genome-wide" findings.
This is an important point. However, the number of probes excluded from our analysis for lack of correspondence with hg38 was low (fewer than 100 among the 1,290,800 probesets), and we interpreted this loss as insignificant for "genome-wide" claims. This aspect will be detailed in the revised version of this manuscript.
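To put the figures quoted in this response in proportion, the short sketch below computes the upper bound on the share of probesets lost in the liftover. It assumes that "less than 100" can be treated as at most 100; the function and variable names are illustrative and do not come from the authors' pipeline.

```python
# Upper bound on the fraction of probesets lost in the hg19 -> hg38 liftover.
TOTAL_PROBESETS = 1_290_800  # array size quoted in the response
MAX_EXCLUDED = 100           # assumption: "less than 100" treated as an upper bound


def excluded_fraction(excluded: int, total: int) -> float:
    """Return the share of probesets that could not be mapped."""
    if total <= 0:
        raise ValueError("total must be positive")
    return excluded / total


fraction = excluded_fraction(MAX_EXCLUDED, TOTAL_PROBESETS)
print(f"At most {fraction:.4%} of probesets were excluded")
```

Under that assumption the excluded share stays below one probeset in ten thousand, which is the scale the response relies on when calling the loss insignificant.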
(2) The authors in some points are not thorough with the cited literature. Two examples are:
a) Lines 396-397 the authors say "the MLT1, usually found enriched near DE genes (Bogdan et al., 2020)". I checked the work by Bogdan, and they studied bacterial infection. A single work in a specific topic is not sufficient to support the statement that MLT1 is "usually" in close vicinity to differentially expressed genes. More works are needed to support this.
b) After the previous statement, the authors go on to mention "contributing to the coding of conserved lncRNAs (Ramsay et al., 2017)". First, lnc = long non-coding, so this doesn't make sense. Second, in the work by Ramsay they mention "that contributed a significant amount of sequence to primate lncRNAs whose expression was conserved", which is different from what the authors in this study are trying to convey. Again, additional work and a rephrasing might help to support this idea.
Certainly, these two sentences need rephrasing to better align the statements with current evidence, and they will be replaced in the revised version of this manuscript.
(3) When presenting the clusters, the authors overlook the fact that cluster 4 is clearly control-specific, and fail to discuss what this means. Could this subset of HERV be used as bona fide markers of healthy individuals in the context of these diseases? Are they associated with DE genes? What could be the impact of such associations?
Using control DE HERVs as bona fide markers of healthy individuals seems like an interesting possibility worth exploring. Control DE HERVs (cluster 4) are indeed associated with DE genes involved in apoptosis, T cell activation and cell-cell adhesion (modules 1 and 6) (Figure 3A). The impact of these associations deserves further study.
Appraisals on aims:
The authors set specific questions and presented the results to successfully answer them. The evidence is solid, with some weaknesses discussed above that will methodologically strengthen the work.
Likely impact of work on the field:
This work will be of interest to the medical community looking for novel ways to improve clinical diagnosis. Although future works with a greater population size, and more robust techniques such as RNA-Seq, are needed, this is the first step in presenting a novel way to distinguish these pathologies.
It would be of great benefit to the community to provide a table/spreadsheet indicating the specific genomic locations of the HERVs specific to each condition. This will allow proper provenance for future researchers interested in expanding on this knowledge, as these genomic coordinates will be independent of the technique used (as was the array used here).
We agree with the reviewer that sharing the genomic locations of DE HERVs in these pathologies would contribute to further development of our findings. Unfortunately, we do not hold the rights to share probe coordinates from this custom HERV-V3 microarray, which we used under a material transfer agreement (MTA) with its developer.
Reviewer #3 (Public review):
The authors find that HERV expression patterns can be used as new criteria for differential diagnosis of FM and ME/CFS and patient subtyping. The data are based on transcriptome analysis by microarray for HERVs using patient blood samples, followed by differential expression of ERVs and bioinformatic analyses. This is a standard and solid data processing pipeline, and the results are well presented and support the authors' claim.
-
-
www.medrxiv.org www.medrxiv.org
-
eLife Assessment
This study investigated the influence of genomic information and timing of vaccine strain selection on the accuracy of influenza A/H3N2 forecasting. The authors utilised appropriate statistical methods and have provided solid evidence that is an important contribution to the evidence base. While the study addresses a key aspect of public health, the impact is rather limited by its exclusive reliance on predictive methods using genomic information, without incorporating phenotypic data.
-
Reviewer #1 (Public review):
Summary:
In the paper, the authors investigate how the availability of genomic information and the timing of vaccine strain selection influence the accuracy of influenza A/H3N2 forecasting. The manuscript presents three key findings:
(1) Using real and simulated data, the authors demonstrate that shortening the forecasting horizon and reducing submission delays for sharing genomic data improve the accuracy of virus forecasting.
(2) Reducing submission delays also enhances estimates of current clade frequencies.
(3) Shorter forecasting horizons, for example those allowed by the proposed use of "faster" vaccine platforms such as mRNA, result in the most significant improvements in forecasting accuracy.
Strengths:
The authors present a robust analysis, using statistical methods based on previously published genetic-based techniques to forecast influenza evolution. Optimizing prediction methods is crucial from both scientific and public health perspectives. The use of simulated as well as real genetic data (collected between April 1, 2005, and October 1, 2019) to assess the effects of shorter forecasting horizons and reduced submission delays is valuable and provides a comprehensive dataset. Moreover, the accompanying code is openly available on GitHub and is well-documented.
Weaknesses:
While the study addresses a critical public health issue related to vaccine strain selection and explores potential improvements, its impact is somewhat constrained by its exclusive reliance on predictive methods using genomic information, without incorporating phenotypic data. The analysis remains at a high level, lacking a detailed exploration of factors such as the genetic distance of antigenic sites.
Another limitation is the subsampling of the available dataset, which reduces several tens of thousands of sequences to just 90 sequences per month with even sampling across regions. This approach, possibly due to computational constraints, might overlook potential effects of regional biases in clade distribution that could be significant. The effect of dataset sampling on presented findings remains unexplored. Although the authors acknowledge limitations in their discussion section, the depth of the analysis could be improved to provide a more comprehensive understanding of the underlying dynamics and their effects.
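The subsampling described above (a fixed monthly quota spread evenly across regions) could be sketched roughly as follows. This is an illustrative round-robin scheme under stated assumptions, not the authors' actual workflow: the record layout `(strain_id, month, region)`, the function name, and the toy data are all hypothetical.

```python
import random
from collections import defaultdict


def subsample_evenly(records, per_month=90, seed=0):
    """Pick up to `per_month` strains per month, spread evenly over regions.

    `records` is an iterable of (strain_id, month, region) tuples; this layout
    is a hypothetical stand-in for real sequence metadata.
    """
    rng = random.Random(seed)
    by_month = defaultdict(lambda: defaultdict(list))
    for strain_id, month, region in records:
        by_month[month][region].append(strain_id)

    selected = {}
    for month, regions in by_month.items():
        # Shuffle each region's pool so the choice within a region is random.
        pools = []
        for region in sorted(regions):
            pool = sorted(regions[region])
            rng.shuffle(pool)
            pools.append(pool)
        picked = []
        # Round-robin over regions until the quota is met or every pool is empty.
        while len(picked) < per_month and any(pools):
            for pool in pools:
                if pool and len(picked) < per_month:
                    picked.append(pool.pop())
        selected[month] = picked
    return selected


# Toy data: one month with a heavily sampled region and a sparse one.
toy_records = [(f"A{i}", "2019-01", "region_a") for i in range(200)]
toy_records += [(f"B{i}", "2019-01", "region_b") for i in range(10)]
toy_sample = subsample_evenly(toy_records, per_month=90)
print(len(toy_sample["2019-01"]))
```

In the toy month, the sparse region contributes all 10 of its strains and the dense region fills the remaining 80 slots. This shows how an even-sampling quota caps the influence of heavily sequenced regions, and also why different random draws could in principle change which clades are represented.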
-
Reviewer #2 (Public review):
Summary:
The authors have examined the effects of two parameters that could improve their clade forecasting predictions for A(H3N2) seasonal influenza viruses based solely on analysis of haemagglutinin gene sequences deposited on the GISAID EpiFlu database. Sequences were analysed from viruses collected between April 1, 2005 and October 1, 2019. The first parameter was the lag period (0, 1, or 3 months) for sequences to be deposited in GISAID from the time the viruses were sequenced. The second parameter was how far forward the forecast projected (3, 6, 9, or 12 months). Their conclusion (not surprisingly) was that "the single most valuable intervention we could make to improve forecast accuracy would be to reduce the forecast horizon to 6 months or less through more rapid vaccine development". This is not practical using conventional influenza vaccine production and regulatory procedures. Nevertheless, this study does identify some practical steps that could improve the accuracy and utility of forecasting, including modifications suggested by the authors such as "..... changing the start and end times of our long-term forecasts. We could change our forecasting target from the middle of the next season to the beginning of the season, reducing the forecast horizon from 12 to 9 months."
Strengths:
The authors are very familiar with the type of forecasting tools used in this analysis (LBI and mutational load models) and with the processes currently used for influenza vaccine virus selection by the WHO committees, having participated in a number of WHO Influenza Vaccine Consultation meetings for both the Southern and Northern Hemispheres.
Weaknesses:
Among current influenza vaccine production platforms, limiting the forecast horizon to 6 months would only be achievable with mRNA. However, there are no currently approved mRNA influenza vaccines, and mRNA influenza vaccines have yet to demonstrate their real-world efficacy, longevity, and cost-effectiveness; they are therefore only a potential platform for a future influenza vaccine. Hence other avenues to improve the forecasting should be investigated.
While it is inevitable that more influenza HA sequences will become available over time, a better understanding of where new influenza variants emerge would enable a higher weighting to be used for those countries rather than giving an equal weighting to all HA sequences.
Also, other groups are considering neuraminidase sequences and how these contribute to the emergence of new or potentially predominant clades.
-
Author response:
Thank you to the reviewers and editors for their positive and constructive comments. Based on this feedback, we can see that we need to clarify that the primary goal of this paper is a test of potential changes in public health policy rather than a test of technical improvements to forecasting models. We briefly summarize the primary goal below to address these public reviews and list our proposed revisions to the manuscript based on reviewer feedback.
All real-time forecasting models contend with 2 major constraints:
(1) How far into the future they have to predict
(2) How rapidly the data used for predictions become available in real time
In the case of evolutionary influenza forecasts, the current values of these constraints are 1) 12 months into the future and 2) an average lag of ~3 months for hemagglutinin (HA) sequences to become available after sample collection. Regardless of the predictors we use in these models (genetic or phenotypic), our units of prediction always depend on HA: the HA protein is the primary target of our immunity, HA is the only gene whose composition is determined by the vaccine selection process, and influenza diversity is historically defined by clades in HA phylogenies.
Our primary goal of this study was to understand the relative effect sizes of these two common constraints on forecasting while holding all other variables as constant as possible. With this understanding, we hoped to better inform public health priorities and set realistic expectations for current and future forecasting efforts regardless of the technical specifications of each forecasting model. In other words, the goal of this study was not to optimize prediction methods but to estimate the effects of potential policy changes on forecast accuracy.
We found that reducing how far into the future we need to predict consistently reduced our forecasting error in simulated populations (where we knew the true fitness of each virus) and in natural populations (where we either estimated fitness from genetic predictors or we knew the true fitness of each virus based on its future success). Figure 6 and its first supplemental figure show these effect sizes for natural and simulated populations, respectively, when the future fitness of each virus is known at the time of prediction. By definition, we cannot hope to improve our estimates of viral fitness for these forecasts by using other genetic or phenotypic information.
Figure 6 shows that reducing how far into the future we need to predict from 12 to 6 months improves our forecasting accuracy 3 times as much as reducing the lag between sample collection and HA sequence submission to public databases. The impact of this finding is the confirmation that a faster vaccine development process would improve our forecast accuracy substantially more than faster turnaround between sample collection and sequence submission. If our public health goal is to make better predictions of future influenza populations, then this result indicates that our main priority is to speed up the vaccine development process.
If our public health goal is to better understand the composition of currently circulating influenza populations (the units of our forecasts), then Figure 3 shows that reducing the lag between sample collection and HA sequence submission from ~3 months on average to 1 month on average reduces our uncertainty in current clade frequency estimates by half. This impact is also independent of the predictors we use in our forecasting models and is not lessened by the lack of other genetic or phenotypic information in our analyses.
We realize that neither a 6-month vaccine development process nor a 1-month average sequence submission lag exist yet, but we believe that these are realistic and achievable goals for scientific and public health communities. We also realize that these public health goals are not mutually exclusive. By measuring the effects of these realistic changes to current policy through our forecasting experiments, we hope to inspire and motivate researchers and decision-makers who are empowered to make both of these goals a reality.
Finally, we want to emphasize that the use of phenotypic data in forecasts introduces additional delays caused by the lag between when genetic sequences become available and when serological experiments can be performed. Most WHO influenza collaborating centers use a "sequence-first" approach where they characterize the genetic sequence and use available sequences to prioritize phenotypic experiments with serology. This additional lag in availability of phenotypic data means that a forecasting model based on genetic and phenotypic data will necessarily have a greater lag in data availability than a model based on genetic data only. This lag is important for practical forecasts, too, but because the lag reflects specific characteristics of each collaborating center and not a global policy change, we believe this topic falls outside of the scope of this study.
Based on these public reviews and the private recommendations from reviewers, we plan to make the following revisions to this manuscript.
● Clarify the introduction, discussion, and abstract to emphasize the primary goal for this study to test effects of realistic changes to public health policy and note that this study does not cover improvements to forecasting models. As part of these changes, we will include a rationale for our choice of a genetic-information-only approach rather than a model that integrates phenotypic data. We will also refine Figure 1 to more clearly communicate the two factors we tested in this study.
● Provide a clearer explanation for the subsampling approach we use, include supplemental materials to communicate the geographic and temporal biases that exist in available HA sequence data, and discuss potential effects of different subsampling strategies.
● Evaluate the robustness of our results to different randomly subsampled data. We will perform additional technical replicates of our analysis workflow for natural populations, and summarize the effects of realistic interventions across replicates in a supplemental figure and the main text of the results.
● Investigate time-dependent effects of forecast horizons and submission lags on model accuracy to identify any potential biases in accuracy during specific historical epochs or any seasonal trends in accuracy associated with predicting future populations for the Northern or Southern Hemispheres.
● In the discussion, clarify how reducing submission lags would practically improve the WHO's ability to select vaccine candidate viruses and minimize jargon that currently makes the discussion less accessible to the average reader.
● Investigate how changes in forecast horizons and submission lags change the distance between predicted and observed future populations at antigenic positions (i.e., "epitope" positions) to understand whether we see the same effects with that subset of positions as we see across all HA positions.
-
-
www.biorxiv.org www.biorxiv.org
-
Author Response:
We greatly appreciate the feedback provided by reviewers on this manuscript. One of our key objectives was to provide a comprehensive, detailed resource for researchers using single-cell transcriptomics to study arthritis, especially immune cells like macrophages. We strived to perform thorough, wide-ranging analyses that are both accessible and useful to other scientists in the field, and that we hope will serve as the basis for many future avenues of study. As such, we acknowledge that this work is a “first step”, providing a strong descriptive foundation with some mechanistic insight that we and others will continue pursuing. Preliminary studies in our laboratory seeking to dissect signaling mechanisms associated with the M-CSF pathway have illuminated how complex and context-dependent this signaling is, which is an important consideration for future in vivo investigations. Further, it is indeed true that attempting to harmonize transcriptomic data across studies, models, laboratories, and dissection/processing methods is fraught with difficulty and prone to misinterpretation – and we made an effort to highlight this in our manuscript, particularly with respect to where synovial immune cells were recovered from, and how. We encourage healthy discussion within the field for developing shared, unified protocols for harvests and processing upstream of transcriptomic experiments.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
The authors report how a previously published method, ReplicaDock, can be used to improve predictions from AlphaFold-multimer (AFm) for protein docking studies. The level of improvement is modest for cases where AFm is successful; for cases where AFm is not as successful, the improvement is more significant, although the accuracy of prediction is also notably lower. The evidence for the ReplicaDock approach being more predictive than AFm is particularly convincing for the antibody-antigen test case. Overall, the study makes a valuable contribution by combining data- and physics-driven approaches.
-
Reviewer #1 (Public review):
Summary:
The authors wanted to use AlphaFold-multimer (AFm) predictions to reduce the challenge of physics-based protein-protein docking.
Strengths:
They found two features of AFm predictions that are very useful. 1) pLDDT is predictive of flexible residues, which they could target for conformational sampling during docking; 2) the interface-pLDDT score is predictive of the quality of AFm predictions, which allows the authors to decide whether to do local or global docking.
Weaknesses:
(1) As admitted by the authors, the AFm predictions for the main dataset are undoubtedly biased because these structures were used for AFm training. Could the authors find a way to assess the extent of this bias?
(2) For the CASP15 targets where this bias is absent, the presentation was very brief. In particular, I'm interested in seeing how AFm helped with the docking. They may even want to do a direct comparison with docking results w/o the help of AFm.
Comments on revisions:
This revision has addressed my previous comments.
-
Reviewer #2 (Public review):
Summary:
In short, this paper uses a previously published method, ReplicaDock, to improve predictions from AlphaFold-multimer. The method generated about 25% more acceptable predictions than AFm, but more importantly it improves results on an antibody-antigen set, where more than 50% of the models are improved.
When looking at the results in more detail, it is clear that for the models where the AFm models are good, the improvement is modest (or not at all). See, for instance, the blue dots in Fig 6. However, in the cases where AFm fails, the improvement is substantial (red dots in Fig 6), but no models reach a very high accuracy (Fnat ~0.5 compared to 0.8 for the good AFm models). So the paper could be summarized by claiming, "We apply ReplicaDock when AFm fails", instead of trying to sell the paper as an utterly novel pipeline. I must also say that I am surprised by the excellent performance of ReplicaDock - it seems to be a significant step ahead of other (not AlphaFold) docking methods, and from reading the original paper, that was unclear. Having a better benchmark of it alone (without AFm) would be very interesting.
These results also highlight several questions I try to describe in the weakness section below. In short, they boil down to the fact that the authors must show how good/bad ReplicaDock is at all targets (not only the ones where AFm fails). In addition, I have several more technical comments.
Strengths:
Impressive increase in performance on AB-AG set (although a small set and no proteins).
Weaknesses:
The presentation is a bit hard to follow. The authors mix several measures (Fnat, iRMS, RMSDbound, etc). In addition, it is not always clear what is shown. For instance, in Fig 1, is the RMSD calculated for a single chain or the entire protein? I would suggest that the author replace all these measures with two: TM-score when evaluating the quality of a single chain and DockQ when evaluating the results for docking. This would provide a clearer picture of the performance. This applies to most figures and tables. For instance, Fig 9 could be shown as a distribution of DockQ scores.
The improvements on the models where AFm is good are minimal (if at all), and it is unclear how global docking would perform on these targets, nor exactly why the pLDDT<0.85 cutoff was chosen. To better understand the performance of ReplicaDock, the authors should therefore (i) run global and local docking on all targets and report the results, (ii) report the results if AlphaFold (not multimer) models of the chains were used as input to ReplicaDock (I would assume it is similar). These models can be downloaded from AlphaFoldDB.
Further, it would be interesting to see if ReplicaDock could be combined with AFsample (or any other model to generate structural diversity) to improve performance further.
The estimates of computing costs for AFsample are incorrect (check what is presented in their paper). What are the computational costs for ReplicaDock global docking?
It is unclear exactly which sequences were used as input to the modelling. The authors should use full-length UniProt sequences if that was not done.
The antibody-antigen dataset is small. It could easily be expanded to thousands of proteins. It would be interesting to know the performance of ReplicaDock on a more extensive set of Antibodies and nanobodies.
Using pLDDT on the interface region to identify good/bad models is likely suboptimal. It was acceptable (as a part of the score) for AlphaFold-2.0 (monomer), but AFm behaves differently. Here, AFm provides a direct score to evaluate the quality of the interaction (ipTM or Ranking Confidence). The authors should use these to separate good/bad models (for global/local docking), or at least show that these scores are less good than the one they used.
Comments on revisions:
The inclusion of the DockQ improved the paper. No further comments.
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public Review)
Summary:
The authors wanted to use AlphaFold-multimer (AFm) predictions to reduce the challenge of physics-based protein-protein docking.
Strengths:
They found that two features of AFm predictions are very useful. 1) pLDDT is predictive of flexible residues, which they could target for conformational sampling during docking; 2) the interface-pLDDT score is predictive of the quality of AFm predictions, which allows the authors to decide whether to do local or global docking.
Weaknesses:
(1) As admitted by the authors, the AFm predictions for the main dataset are undoubtedly biased because these structures were used for AFm training. Could the authors find a way to assess the extent of this bias?
Indeed, AFm training included most of the structures in the DB5 benchmark, as many structures (either unbound or bound) were deposited before the training cut-off period. One of the challenges of estimating this bias is the availability of new structures, both bound and unbound, deposited after the training cut-off. Estimating the extent of training bias is therefore conditional on these factors and difficult. A few studies have attempted to address this bias (Yin et al, 2022, https://doi.org/10.1002/pro.4379).
In our study, we assess this bias by comparing the AFm structures to the bound and unbound forms and calculating their Cα RMSDs and TM-scores (new addition). We now elaborate on this in the Results: Dataset curation section, and we have added a figure comparing the TM-scores in the supplement.
We added a clarifying text and a note about the TM-score calculation in the manuscript as follows:
“Since most of the benchmark targets in DB5.5 were included in AlphaFold training, there would be training bias associated with their predictions (i.e. our measured success rates are an upper bound).”
“We also calculated the TM-scores of the AFm predicted complex structures with respect to the bound and the unbound crystal structures (Supplementary Figure S2). As TM-scores reflect a global comparison between structures and are less sensitive to local structural deviations, no strong conclusions could be derived. This is in agreement with our intuition that since both unbound and bound states of proteins will share a similar fold, and AlphaFold can predict structures with high TM-scores in most cases, gauging the conformational deviations with TM-scores would be inconclusive.”
(2) For the CASP15 targets where this bias is absent, the presentation was very brief. In particular, it would be interesting to see how AFm helped with the docking. The authors may even want to do a direct comparison with docking results without the help of AFm.
Unfortunately, since this was a CASP-CAPRI round, the structures of the unbound antigen and the nanobodies were unavailable. Thus, we cannot perform a comparison without using AF2 at all, since we need a structure prediction tool to produce the unbound nanobody and the nanobody-antigen complex template structure to dock. This has been clarified in the main text for readers.
“Since the nanobody-antigen complexes were CASP targets, we did not have unbound structures, rather only the sequences of individual chains. Therefore, for each target, we employed the AlphaRED strategy as described in Fig 7.”
Reviewer #1 (Recommendations For The Authors):
For suggestions for major improvements, see comments under weaknesses. One additional suggestion: the authors found that pLDDT is predictive of flexible residues. Can they try to find AFm features that are predictive of the interface site? Such information may guide their docking to a local site.
This is a great idea that we and others have been thinking about considerably. Prior work by Burke et al. (Towards a structurally resolved human protein interaction network) examines AlphaFold’s ability to predict PPIs. For high-confidence predicted models of interacting protein complexes, the authors showed that pDockQ correlated reasonably well with correct protein interactions.
That being said, binding site identification, particularly in a partner-agnostic fashion, i.e. determining binding patches on a given protein, is an area of ongoing research. We hope a future study examines AlphaFold3 or ESM3 specifically for this task.
“Further, we tested multiple thresholds to estimate the optimum cut-off for distinguishing near-native structures (defined as an interface-RMSD < 4 Å) from the predictions. Figure 3.B summarizes the performance with a confusion matrix for the chosen interface-pLDDT cutoff of 85. 79 % of the targets are classified accurately with a precision of 75%, thereby validating the utility of interface-pLDDT as a discriminating metric to rank the docking quality of the AFm complex structure predictions. With AlphaFold3 and ESM3 being released, investigating features that could predict flexible residues or interface site would be valuable, as this information may guide local docking.”
Minor:
Page 3, lines 73-77, state how many targets were curated from DB5.5.
We have now clarified this in the manuscript. All 254 targets were curated from DB5.5 at the time of this benchmark study.
“For each protein target, we extracted the amino acid sequences from the bound structure and predicted a corresponding three-dimensional complex structure with the ColabFold implementation of the AlphaFold multimer v2.3.0 (released in March 2023) for the 254 benchmark targets from DB5.5.”
In Figure 1, the color used for medium is too difficult to distinguish from the grey color used for rigid.
We thank you for this suggestion. We have updated the color to olive. Further, based on Reviewer 2’s suggestions, we have moved this plot to the Supplementary.
Reviewer #2 (Public Review):
Summary:
In short, this paper uses a previously published method, ReplicaDock, to improve predictions from AlphaFold-multimer. The method generated about 25% more acceptable predictions than AFm, but more important is improving an Antibody-antigen set, where more than 50% of the models become improved.
When looking at the results in more detail, it is clear that for the models where the AFm models are good, the improvement is modest (or not at all). See, for instance, the blue dots in Figure 6. However, in the cases where AFm fails, the improvement is substantial (red dots in Figure 6), but no models reach a very high accuracy (Fnat ~0.5 compared to 0.8 for the good AFm models). So the paper could be summarized by claiming, "We apply ReplicaDock when AFm fails", instead of trying to sell the paper as an utterly novel pipeline. I must also say that I am surprised by the excellent performance of ReplicaDock - it seems to be a significant step ahead of other (not AlphaFold) docking methods, and from reading the original paper, that was unclear. Having a better benchmark of it alone (without AFm) would be very interesting.
We thank the reviewer for highlighting the performance of ReplicaDock. ReplicaDock alone is benchmarked in the original paper (10.1371/journal.pcbi.1010124), with full details on the 2022 version of DB5.5 in the supplement. Indeed ReplicaDock2 achieves the highest reported success rates on flexible docking targets reported in the literature (until this AlphaRED paper!).
Regarding this statement about “the paper could be summarized…” it might be helpful to give more context. ReplicaDock is a replica exchange Monte Carlo sampling approach for protein docking that incorporates flexibility in an induced-fit fashion. However, the choice of which backbone residues to move is solely dependent on contacts made during each docking trajectory. In the last section of the ReplicaDock paper, we introduced “Directed Induced-fit” where we biased the backbone sampling only towards those residues where we knew the backbone is flexible (this information is obtained because for the benchmark set, we had both unbound and bound structures and hence could cherry-pick the specific residues which are mobile). We agree with the reviewers that AlphaRED is essentially a derivative of ReplicaDock, however, the two major claims that we make in this paper are:
(1) AlphaFold pLDDT is an effective predictor of backbone flexibility for practical use in docking.
(2) We can automate the Directed InducedFit approach within ReplicaDock by utilizing this pLDDT information per residue for conformational sampling in protein docking; and in doing so, create a pipeline that would allow us to go from sequence-to-structure-to-complex, specifically capturing conformational changes.
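As an illustration only, the idea in claim (2) of turning per-residue pLDDT into a list of residues for biased backbone sampling can be sketched as below. The function name and the cutoff argument are hypothetical; the response does not state which per-residue threshold the pipeline uses, so the cutoff is left for the caller to supply.

```python
def select_flexible_residues(per_residue_plddt, cutoff):
    """Return 0-based indices of residues whose pLDDT falls below `cutoff`.

    Assumption: low pLDDT is treated as a proxy for backbone flexibility,
    as argued in claim (1). `cutoff` is a user-supplied value, not a number
    taken from the manuscript.
    """
    return [i for i, score in enumerate(per_residue_plddt) if score < cutoff]


# Toy pLDDT values for a four-residue stretch (made-up numbers).
toy_plddt = [95.0, 60.0, 82.0, 40.0]
flexible = select_flexible_residues(toy_plddt, cutoff=80.0)
print(flexible)
```

The returned indices would then be handed to whatever sampler moves the backbone; that hand-off is outside the scope of this sketch.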
To conclude these claims, we pose the following questions in the Introduction:
“(1) Do the residue-specific estimates from AF/AFm relate to potential metrics demonstrating conformational flexibility?
(2) Can AF/AFm metrics deduce information about docking accuracy?
(3) Can we create a docking pipeline for in-silico complex structure prediction incorporating AFm to convert sequence-to-structure-to-docked complexes?”
This work requires a pipeline, the center of which lies in ReplicaDock as a docking method, but has functionalities that were absent in prior work. The goal is also to develop a one-stop shop without manual intervention (a prerequisite for biasing backbone sampling in ReplicaDock) that could be utilized by structural biologists efficiently.
We clarify these points in the abstract and main text as follows:
Abstract: “In this work, we combine AlphaFold as a structural template generator with a physics-based replica exchange docking algorithm to better sample conformational changes.”
Introduction:
“The overarching goal is to create a one-stop, fully-automated pipeline for simple, reproducible, and accurate modeling of protein complexes. We investigate the aforementioned questions and create a protocol to resolve AFm failures and capture binding-induced conformational changes. We first assess the utility of AFm confidence metrics to detect conformational flexibility and binding site confidence.”
These results also highlight several questions I try to describe in the weakness section below. In short, they boil down to the fact that the authors must show how good/bad ReplicaDock is at all targets (not only the ones where AFm fails). In addition, I have several more technical comments.
Strengths:
Impressive increase in performance on AB-AG set (although a small set and no proteins).
We thank the reviewer for their comments.
Weaknesses:
The presentation is a bit hard to follow. The authors mix several measures (Fnat, iRMS, RMSDbound, etc). In addition, it is not always clear what is shown. For instance, in Figure 1, is the RMSD calculated for a single chain or the entire protein? I would suggest that the author replace all these measures with two: TM-score when evaluating the quality of a single chain and DockQ when evaluating the results for docking. This would provide a clearer picture of the performance. This applies to most figures and tables.
We apologize for the lack of clarity owing to different metrics. Irms and fnat are standard performance metrics in the docking field, but we agree that DockQ would be simpler when the details of the other metrics are not required. We have updated Figures 5 and 8 to also show DockQ comparisons.
Regarding Figure 1, as highlighted in Line 90 of the main text, “Figure 1 shows the Ca-RMSD of all protein partners of the AFm predicted complex structures with respect to the bound and the unbound.” As suggested by the reviewer in their further comments, we have moved this Figure to the Supplementary. We have also included a TM-score comparison in the Supplementary (SupFig S2) and included clarifying statements in the main text:
“We also tested TM-scores to measure the structural deviations of the AFm predicted complex structures with respect to the bound and unbound structures (Supplementary Figure S2). However, this metric is not sensitive enough to detect the subtle, local conformational changes upon binding.”
For instance, Figure 9 could be shown as a distribution of DockQ scores.
We have now updated Figure 5 to include DockQ scores in Panel D. Since DockQ is a function of iRMSD, fnat and L-RMSD, it shows the cumulative improvement in performance. However, it hides some nuanced details: for example, the protocol improves iRMSD considerably while the improvement in fnat is lacking, a distinction that can highlight whether backbone sampling or side-chain refinement is the challenge. Therefore, we need to retain the iRMSD and fnat metrics in panels A-C. We have incorporated this in the main text as follows:
“Finally, to evaluate docking success rates, we calculate DockQ for top predictions from AFm and AlphaRED respectively (Figure 5D). AlphaRED demonstrates a success rate (DockQ>0.23) for 63% of the benchmark targets. Particularly for Ab-Ag complexes, AFm predicted acceptable or better quality docked structures in only 20% of the 67 targets. In contrast, the AlphaRED pipeline succeeds in 43% of the targets, a significant improvement.”
Further, we have reevaluated success rates in Figure 8 (previously Figure 9) and have updated the manuscript to report these updated success rates.
“By utilizing the AlphaRED strategy, we show that failure cases in AFm predicted models are improved for all targets (lower Irms for 97 of 254 failed targets) with CAPRI acceptable-quality or better models generated for 62% of targets overall (Fig 8)”.
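For readers unfamiliar with DockQ, the following minimal sketch shows how the score combines fnat, iRMSD and L-RMSD into a single number. It follows the formula and scaling constants (1.5 Å and 8.5 Å) of the original DockQ publication as commonly cited; the function name and example values are illustrative and are not taken from the manuscript.

```python
def dockq(fnat, irmsd, lrmsd, d_irmsd=1.5, d_lrmsd=8.5):
    """Combine fnat, interface RMSD and ligand RMSD into a DockQ score in [0, 1].

    Each RMSD (in angstroms) is mapped to (0, 1] with 1 / (1 + (rmsd / d)^2),
    then the three terms are averaged.
    """
    def scaled(rmsd, d):
        return 1.0 / (1.0 + (rmsd / d) ** 2)

    return (fnat + scaled(irmsd, d_irmsd) + scaled(lrmsd, d_lrmsd)) / 3.0


# Made-up example: a large iRMSD improvement with an unchanged fnat still
# raises DockQ, which is why the individual metrics remain informative.
before = dockq(fnat=0.10, irmsd=6.0, lrmsd=20.0)
after = dockq(fnat=0.10, irmsd=2.0, lrmsd=8.5)
print(round(before, 3), round(after, 3))
```

Because the three terms are averaged, the same DockQ value can arise from quite different combinations of fnat and RMSD, which is consistent with the decision to keep the iRMSD and fnat panels.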
The improvements on the models where AFm is good are minimal (if at all), and it is unclear how global docking would perform on these targets, nor exactly why the plDDT<0.85 cutoff was chosen.
We agree with the reviewers that the improvement on the models with good AFm predictions is minimal. We acknowledge this in the text now as follows:
“Most of the improvements in the success rates are for cases where AFm predictions are worse. For targets with good AFm predictions, AlphaRED refinement results in minimal improvements in docking accuracy.”
The choice of pLDDT cutoff = 85 is elaborated in the “Interface-pLDDT correlates with DockQ and discriminates poorly docked structures” section, paragraph 3. Briefly, we tested multiple metrics and the interface pLDDT had the highest AUC, indicating that it is the best metric for this task. For interface-pLDDT we tested multiple thresholds, and the cutoff of 85 resulted in the highest percentage of true-positive and true-negative rates. This is illustrated with the confusion matrix in Figure 3.B with the precision scores. We now clarify this in the text as follows:
“With interface-pLDDT as a discriminating metric, we tested multiple thresholds to estimate the optimum cut-off for distinguishing near-native structures (defined as an interface-RMSD < 4 Å) from the predictions. Figure 3B summarizes the performance with a confusion matrix for the chosen interface-pLDDT cutoff of 85. 79% of the targets are classified accurately with a precision of 75%, thereby validating the utility of interface-pLDDT as a discriminating metric to rank the docking quality of the AFm complex structure predictions.”
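The threshold evaluation described above can be sketched as follows. This is a minimal illustration on made-up numbers, not the benchmark data: the function name is hypothetical, and treating a value exactly at the cutoff as "predicted near-native" is an assumption.

```python
def confusion_at_cutoff(interface_plddt, interface_rmsd,
                        plddt_cutoff=85.0, rmsd_cutoff=4.0):
    """Score interface-pLDDT as a classifier for near-native models.

    A model is predicted near-native if interface-pLDDT >= plddt_cutoff
    (assumed convention) and is truly near-native if interface-RMSD is
    below rmsd_cutoff (in angstroms), as defined in the quoted text.
    """
    tp = fp = tn = fn = 0
    for plddt, irms in zip(interface_plddt, interface_rmsd):
        predicted_good = plddt >= plddt_cutoff
        actually_good = irms < rmsd_cutoff
        if predicted_good and actually_good:
            tp += 1
        elif predicted_good:
            fp += 1
        elif actually_good:
            fn += 1
        else:
            tn += 1
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else float("nan")
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    return {"tp": tp, "fp": fp, "tn": tn, "fn": fn,
            "accuracy": accuracy, "precision": precision}


# Six made-up targets: interface-pLDDT and interface-RMSD (angstroms).
toy_plddt = [92.0, 88.0, 86.0, 70.0, 60.0, 90.0]
toy_irms = [1.2, 2.5, 6.0, 9.0, 3.0, 3.9]
result = confusion_at_cutoff(toy_plddt, toy_irms)
print(result)
```

Repeating the call over a range of `plddt_cutoff` values and comparing the resulting counts is one way to reproduce the kind of threshold scan the response describes.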
To better understand the performance of ReplicaDock, the authors should therefore (i) run global and local docking on all targets and report the results, (ii) report the results if AlphaFold (not multimer) models of the chains were used as input to ReplicaDock (I would assume it is similar). These models can be downloaded from AlphaFoldDB.
The performance of ReplicaDock on DB5.5 is tabulated in our prior work (https://doi.org/10.1371/journal.pcbi.1010124) and we direct the reviewers there for the detailed performance and results. In our opinion, the benchmark suggested by the reviewer would be redundant and not worth the computational expense.
The scope of this paper is to highlight a structure prediction + physics-based modeling pipeline for docking to adapt to the accuracy of up-and-coming structure prediction tools.
Using AlphaFold monomer chains as input and benchmarking on that, albeit interesting scientifically, will not be useful for either the pipeline or biologists who would want a complex structure prediction. We thank the reviewer for their comments but want to reemphasize that the end goal of this work is to increase the accuracy of complex structure predictions and PPIs obtained from computational tools.
Further, it would be interesting to see if ReplicaDock could be combined with AFsample (or any other model to generate structural diversity) to improve performance further.
We would like to highlight that ReplicaDock is a stand-alone tool for protein docking and here we demonstrate the ability of adapting it with metrics derived from AlphaFold or other structure prediction tools (say ESMFold) such as pLDDT for conformational sampling and improving docking accuracy. We definitely agree that adapting it to use with tools such as AFSample will be interesting but it is out of scope of this work.
The estimates of computing costs for the AFsample are incorrect (check what is presented in their paper). What are the computational costs for ReplicaDock global docking?
The authors of the AFSample paper report that “AFsample requires more computational time than AF2, as it generates 240 models, and including the extra recycles, the overall timing is 1000 more costly than the baseline.” We have reported these exact numbers in our manuscript.
The computational costs of ReplicaDock are 8-72 CPU hours on a single node with 24 processors as reported in our prior work.
For AlphaRED, the costs are slightly higher owing to the structure prediction module in the beginning and are up to 100 CPU hrs for our largest (max Nres) target.
It is unclear strictly what sequences were used as input to the modelling. The authors should use full-length UniProt sequences if they were not done.
We report this in the methods section of the manuscript as well as in Figure 5. Full length complex sequences were used for the models that we extracted from DB5.5.
“As illustrated in Fig. 5, given a sequence of a protein complex, we use the ColabFold implementation of AF2-multimer to obtain a predictive template.”
We clarify this in the methods section as:
“For each target in the DB5.5 dataset, we first extracted the corresponding FASTA sequence for the bound complex and then obtained AlphaFold predicted models with the ColabFold v1.5.2 implementation of AlphaFold and AlphaFold-multimer (v.2.3.0).”
The antibody-antigen dataset is small. It could easily be expanded to thousands of proteins. It would be interesting to know the performance of ReplicaDock on a more extensive set of Antibodies and nanobodies.
This work demonstrates the performance on the docking benchmark, i.e. given the unbound structures, can one predict the bound complex? In this regard, our analysis has focussed on targets where both the unbound and bound structures are available, so that we could evaluate the ability of AlphaRED to model protein flexibility and docking accuracy. For antibody-antigen complexes, there are only 67 structures with both unbound and bound complexes available, and they constituted our dataset. Benchmarking AlphaRED on all antibody-antigen targets can give biased results, as most Ab-Ag complexes are in the AlphaFold training set. Further, our work is aimed more towards predicting conformational flexibility in docking than rigid-body docked complexes, so benchmarking on existing bound Ab-Ag structures is out of scope for this work.
Using pLDDT on the interface region to identify good/bad models is likely suboptimal. It was acceptable (as a part of the score) for AlphaFold-2.0 (monomer), but AFm behaves differently. Here, AFm provides a direct score to evaluate the quality of the interaction (ipTM or Ranking Confidence). The authors should use these to separate good/bad models (for global/local docking), or at least show that these scores are less good than the one they used.
We thank the reviewers for this suggestion.
Reviewer #2 (Recommendations For The Authors):
Some Figures could be skipped/improved
Fig 1: Use TM-score instead, a much better measure (and the figure is not necessary).
Figure 1 compares the bias of AlphaFold towards unbound or bound forms of the proteins. We believe that this figure highlights the slight inherent bias of AlphaFold towards bound structures over unbound.
As the reviewers have suggested we have included a plot comparing the TM-scores for the structures. Further, we have moved this figure to the Supplementary.
Fig 2. Skip B (why compare RMSD with pLDDT?). Add a figure to see how this correlates over all targets not just two.
RMSD and LDDT are both metrics for evaluating conformational variability between two structures, such as the bound and unbound forms of the same protein. Whereas RMSD measures the overall deviation of residues, LDDT allows the estimation of relative domain orientations and concerted proteins. We have elaborated on this in Methods as well as in the Results section titled “AlphaFold pLDDT provides a predictive confidence measure for backbone flexibility”.
The data for the benchmark targets is now included in the Supplementary (Supplementary Figures S3-S4).
Fig 3. Color the different chains of a protein differently. Thereby the Receptor/Ligand/Bound labels can be omitted.
We thank the reviewers for this suggestion. However, the color scheme is chosen to highlight (1) the relative orientation of protein partners relative to each other. We have ensured that the alignment is over one partner (Receptor) so that you could see the relative orientation of the other partner (Ligand) in the modeled protein over the bound structure (in one color). (2) The coloring of the receptor and ligand chain is by pLDDT (from red to blue) to highlight that for decoys with incorrectly predicted interfaces, the pLDDT scores of the interface residues are indeed lower and can be a discriminating metric. We elaborate this in the caption of Figure 3 as well as in the section “Interface-pLDDT correlates with DockQ and discriminates poorly docked structures”. Coloring the chains of a protein differently will obfuscate the point that we are aiming to make and will be inconclusive for the readers as they would need to rely only on quantitative metrics (Irms and DockQ) reported but won’t be able to visualize the interface pLDDT of the incorrectly bound structures. We hope that this justifies the choice of our color scheme.
Fig 4. Include RankConf, ipTM, pDockQ, and other measures in the plots (they are likely better). Include DockQ for the top targets. It is difficult to estimate for multi chain complexes.
We thank the reviewer for this suggestion. We have now included the DockQ performances for all targets in Figure 5 (previously Figure 6) as well as re-evaluated our final success rates based on the DockQ calculations in Figure 8 (previously Figure 9).
Fig 5. use a better measure to split (see above).
We have elaborated on the choice of the split for the comments above and the interface pLDDT threshold of 85 is a decision made post observation on the docking benchmark. We do want to highlight that the cut-off is arbitrary and in our online server (ROSIE) as well as in custom scripts, this cut-off can be tuned by the user as required. We would suggest a cut-off of 85 based on our observations but the users are welcome to tune this as per their needs.
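A tunable cut-off of this kind might look like the sketch below. It is an assumption-laden illustration, not the ROSIE server's code or the authors' scripts: the function names are hypothetical, interface-pLDDT is assumed to be the mean pLDDT over a caller-supplied set of interface residues, and models at or above the cutoff are assumed to go to local refinement while the rest go to global docking.

```python
from statistics import mean


def interface_plddt(per_residue_plddt, interface_residues):
    """Mean pLDDT over the given 0-based interface residue indices (assumed definition)."""
    return mean(per_residue_plddt[i] for i in interface_residues)


def choose_docking_mode(score, cutoff=85.0):
    """Route a model to 'local' refinement or 'global' docking.

    The default of 85 is the value suggested in the response; the direction
    of the comparison and the mode labels are assumptions of this sketch.
    """
    return "local" if score >= cutoff else "global"


# Made-up per-residue pLDDT values and interface residue indices.
toy_plddt = [95.0, 90.0, 70.0, 60.0, 88.0]
toy_interface = [2, 3, 4]
score = interface_plddt(toy_plddt, toy_interface)
print(round(score, 2), choose_docking_mode(score), choose_docking_mode(score, cutoff=70.0))
```

Exposing the cutoff as an argument, rather than a constant, mirrors the point made above that users may tune it to their needs.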
Fig 6. Replace lrms/fnat with DockQ.
We have now included DockQ scores in our manuscript.
Fig 7. Color the different chains of a protein differently.
We have colored the protein chains differently. AlphaFold models are in Orange, Bound complexes are in Gray, and predicted proteins from AlphaRED are in Blue-Green indicating the two partners. All models are aligned over the receptor so relative orientations of the ligand protein can be observed.
Fig 8 Color the different chains of a protein differently.
The chains are colored differently. We would like the reviewer to elaborate more on what they would like to observe as we believe our color scheme makes intuitive sense for readers.
Fig 9. Use DockQ instead of CAPRI criteria.
The figure has been updated based on DockQ. To elaborate, the CAPRI criteria are set based on DockQ scores, as explained in the figure caption.
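A mapping from DockQ to CAPRI-style quality classes can be sketched as below. The bin edges (0.23, 0.49, 0.80) are the ones commonly used with DockQ; whether the figure uses exactly these edges, and how values lying exactly on an edge are handled, are assumptions here, and the figure caption remains the authoritative source.

```python
def capri_class_from_dockq(score):
    """Map a DockQ score in [0, 1] to a CAPRI-style quality label.

    Assumed bins: incorrect < 0.23 <= acceptable < 0.49 <= medium < 0.80 <= high.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("DockQ score must lie between 0 and 1")
    if score >= 0.80:
        return "high"
    if score >= 0.49:
        return "medium"
    if score >= 0.23:
        return "acceptable"
    return "incorrect"


# Made-up DockQ values, one per class.
labels = [capri_class_from_dockq(s) for s in (0.10, 0.30, 0.60, 0.90)]
print(labels)
```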
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This manuscript reports important findings that the methyltransferase METTL3 is involved in the repair of abasic sites and uracil in DNA, mediating resistance to floxuridine-driven cytotoxicity. Convincing evidence shows the involvement of m6A in DNA based on single cell imaging and mass spec data. The authors present evidence that the m6A signal does not result from bacterial contamination or RNA, but the text does not make this overly clear.
-
Reviewer #1 (Public review):
Summary:
The authors sought to identify unknown factors involved in the repair of uracil in DNA through a CRISPR knockout screen.
Strengths:
The screen identified both known and unknown proteins involved in DNA repair resulting from uracil or modified uracil base incorporation into DNA. The conclusion is that the protein activity of METTL3, which converts A nucleotides to 6mA nucleotides, plays a role in the DNA damage/repair response. The importance of METTL3 in DNA repair, and its colocalization with a known DNA repair enzyme, UNG2, is well characterized.
Weaknesses:
This reviewer identified no major weaknesses in this study. The manuscript could be improved by tightening the text throughout, and more accurate and consistent word choice around the origin of U and 6mA in DNA. The dUTP nucleotide is misincorporated into DNA, and 6mA is formed by methylation of the A base present in DNA. Using words like 6mA "deposition in DNA" seems to imply it results from incorporation of a methylated dATP nucleotide during DNA synthesis.
-
Reviewer #2 (Public review):
Summary:
In this work, the authors performed a CRISPR knockout screen in the presence of floxuridine, a chemotherapeutic agent that incorporates uracil and fluoro-uracil into DNA, and identified unexpected factors, such as the RNA m6A methyltransferase METTL3, as required to overcome floxuridine-driven cytotoxicity in mammalian cells. Interestingly, the observed N6-methyladenosine was embedded in DNA, which has been reported as DNA 6mA in mammalian genomes and is currently confirmed with mass spectrometry in this model. Therefore, this work consolidated the functional role of mammalian genomic DNA 6mA, and supported with solid evidence to uncover the METTL3-6mA-UNG2 axis in response to DNA base damage.
Strengths:
In this work, the authors took an unbiased, genome-wide CRISPR approach to identify novel factors involved in uracil repair with potential clinical interest.
The authors designed elegant experiments to confirm the METTL3 works through genomic DNA, adding the methylation into DNA (6mA) but not the RNA (m6A), in this base damage repair context. The authors employ different enzymes, such as RNase A, RNase H, DNase, and liquid chromatography coupled to tandem mass spectrometry to validate that METTL3 deposits 6mA in DNA in response to agents that increase genomic uracil.
They also have the Mettl3-KO and the METTL3 inhibition results to support their conclusion.
Weaknesses:
Although this study demonstrates that METTL3-dependent 6mA deposition in DNA is functionally relevant to DNA damage repair in mammalian cells, there are still several concerns and issues that need to be improved to strengthen this research.
First, in the whole paper, the authors never claim or mention the mammalian cell lines contamination testing result, which is the fundamental assay that has to be done for the mammalian cell lines DNA 6mA study.
Second, in the whole work, the authors have not supplied any genomic sequencing data to support their conclusions. Although the sequencing of DNA 6mA in mammalian models is challenging, recent breakthroughs in sequencing techniques, such as DR-Seq or NT/NAME-seq, have lowered the bar and improved a lot in the 6mA sequencing assay. Therefore, the authors should consider employing the sequencing methods to further confirm the functional role of 6mA in base repair.
Third, the authors used the METTL3 inhibitor and Mettl3-KO to validate the METTL3-6mA-UNG2 functional roles. However, the catalytic mutant and rescue of Mettl3 may be the further experiments to confirm the conclusion.
-
Reviewer #3 (Public review):
Summary:
The authors are showing evidence that they claim establishes the controversial epigenetic mark, DNA 6mA, as promoting genome stability.
Strengths:
The identification of a poorly understood protein, METTL3, and its subsequent characterization in DDR is of high quality and interesting.
Weaknesses:
(1) The very presence of 6mA (DNA) in mammalian DNA is still highly controversial and numerous studies have been conclusively shown to have reported the presence of 6mA due to technical artifacts and bacterial contamination. Thus, to my knowledge there is no clear evidence for 6mA as an epigenetic mark in mammals, and consequently, no evidence of writers and readers of 6mA. None of this is mentioned in the introduction. Much of the introduction can be reduced, but a paragraph clearly stating the controversy and lack of evidence for 6mA in mammals needs to be added, otherwise, the reader is given an entirely distorted view of the field.
These concerns must also be clearly in the limitations section and even in the results section which fails to nuance the authors' findings.
(2) What is the motivation for using HT-29 cells? Moreover, the materials and methods do not state how the authors controlled for bacterial contamination, which has been the most common cause of erroneous 6mA signals to date. Did the authors routinely check for mycoplasma?
(3) The single cell imaging of 6mA in various cells is nice. The results are confirmed by mass spec as an orthogonal approach. Another orthogonal and quantitative approach to assessing 6mA levels would be PacBio. Similarly, it is unclear why the authors have not performed dot-blots of 6mA for genomic DNA from the given cell lines.
(4) The results of Figure 3 need further investigation and validation. If the results are correct, the authors are suggesting that the majority of 6mA in their cell lines is present in the DNA, and not the RNA, which is completely contrary to every other study of 6mA in mammalian cells that I am aware of. This could suggest that the antibody is not, in fact, binding to 6mA, but to unmodified adenine, which would explain why the signal disappears after DNase treatment. Indeed, binding of 6mA antibodies to unmethylated DNA is a commonly known problem with most 6mA antibodies and is well described elsewhere.
(5) Given the lack of orthogonal validation of the observed DNA 6mA and the lack of evidence supporting the presence of 6mA in mammalian DNA and consequently any functional role for 6mA in mammalian biology, the manuscript's conclusions need to be toned down significantly, and the inherent difficulty in assessing 6mA accurately in mammals acknowledged throughout.
-
Author response:
eLife Assessment
This manuscript reports important findings that the methyltransferase METTL3 is involved in the repair of abasic sites and uracil in DNA, mediating resistance to floxuridine-driven cytotoxicity. The presented evidence for the involvement of m6A in DNA is incomplete and requires further validation with orthogonal approaches to conclusively show the presence of 6mA in the DNA and exclude that the source is RNA or bacterial contamination.
We thank the editors for recognizing the importance of our work and the relevance of METTL3 in DNA repair. However, we wholly disagree with the second sentence in the eLife assessment, and we want to clarify why our evidence for the involvement of 6mA in DNA is complete.
The identification of 6mA in DNA, upon DNA damage, is based first on immunofluorescence observations using an anti-m6A antibody. In this setting, removal of RNA with RNase treatment fails to reduce the 6mA signal, excluding the possibility that the source of signal is RNA. In contrast, removal of DNA with DNase treatment removes all 6mA signal, strongly suggesting that the species carrying the N6-methyladenosine modification is DNA (Figure 3D, E). Importantly, in Figure 3F, we provide orthogonal, quantitative mass spectrometry data that independently confirm this finding. Mass spectrometry-liquid chromatography of DNA analytes, conclusively shows the presence of 6mA in DNA upon treatment with DNA damaging agents and excludes that the source is RNA, based on exact mass. Reviewer #2 recognized the strengths of this approach to generate solid evidence for 6mA in DNA.
Cells only show the 6mA signal when treated with DNA damaging agents, and the 6mA is absent from untreated cells (Figure 3D, E, F). This provides strong evidence that the 6mA signal is not a result of bacterial contamination in our cell lines. Moreover, our cell lines are routinely tested for mycoplasma contamination. It could be possible that stock solutions of DNA damaging agents may be contaminated, but this would need to be true for all individual drugs and stocks tested. The data showing 6mA signal is not significantly different from untreated cells when a DNA damaging agent is combined with a METTL3 inhibitor (Figure 3G, H) provides strong evidence against bacterial contamination in our stocks.
In summary, we provide conclusive evidence, based on orthogonal methods, that the METTL3-dependent N6-methyladenosine modification is deposited in DNA, not RNA, in response to DNA damage.
Public Reviews:
Reviewer #1 (Public review):
Summary:
The authors sought to identify unknown factors involved in the repair of uracil in DNA through a CRISPER knockout screen.
Typo above: “CRISPER” should be “CRISPR”.
Strengths:
The screen identified both known and unknown proteins involved in DNA repair resulting from uracil or modified uracil base incorporation into DNA. The conclusion is that the protein activity of METTL3, which converts A nucleotides to 5mA nucleotides, plays a role in the DNA damage/repair response. The importance of METTL3 in DNA repair, and its colocalization with a known DNA repair enzyme, UNG2, is well characterized.
Typo above: “5mA” should be “6mA”.
Weaknesses:
This reviewer identified no major weaknesses in this study. The manuscript could be improved by tightening the text throughout, and more accurate and consistent word choice around the origin of U and 6mA in DNA. The dUTP nucleotide is misincorporated into DNA, and 6mA is formed by methylation of the A base present in DNA. Using words like 6mA "deposition in DNA" seems to imply it results from incorporation of a methylated dATP nucleotide during DNA synthesis.
The increased presence of 6mA during DNA damage could result from methylation at the A base itself (within DNA) or from incorporation of pre-modified 6mA during DNA synthesis. Our data do not directly discriminate between these two mechanisms, and we will clarify this point in the discussion.
Reviewer #2 (Public review):
Summary:
In this work, the authors performed a CRISPR knockout screen in the presence of floxuridine, a chemotherapeutic agent that incorporates uracil and fluoro-uracil into DNA, and identified unexpected factors, such as the RNA m6A methyltransferase METTL3, as required to overcome floxuridine-driven cytotoxicity in mammalian cells. Interestingly, the observed N6-methyladenosine was embedded in DNA, which has been reported as DNA 6mA in mammalian genomes and is currently confirmed with mass spectrometry in this model. Therefore, this work consolidated the functional role of mammalian genomic DNA 6mA, and supported with solid evidence to uncover the METTL3-6mA-UNG2 axis in response to DNA base damage.
Strengths:
In this work, the authors took an unbiased, genome-wide CRISPR approach to identify novel factors involved in uracil repair with potential clinical interest.
The authors designed elegant experiments to confirm the METTL3 works through genomic DNA, adding the methylation into DNA (6mA) but not the RNA (m6A), in this base damage repair context. The authors employ different enzymes, such as RNase A, RNase H, DNase, and liquid chromatography coupled to tandem mass spectrometry to validate that METTL3 deposits 6mA in DNA in response to agents that increase genomic uracil.
They also have the Mettl3-KO and the METTL3 inhibition results to support their conclusion.
Weaknesses:
Although this study demonstrates that METTL3-dependent 6mA deposition in DNA is functionally relevant to DNA damage repair in mammalian cells, there are still several concerns and issues that need to be improved to strengthen this research.
First, in the whole paper, the authors never claim or mention the mammalian cell lines contamination testing result, which is the fundamental assay that has to be done for the mammalian cell lines DNA 6mA study.
Our cell lines are routinely tested for bacterial contamination, specifically mycoplasma, and we plan to state this information in a revised version of the manuscript.
Importantly, we do not observe 6mA in untreated cells, strongly suggesting that the 6mA signal observed is dependent on the presence of DNA damage and not caused by contamination in the cell lines (Figure 3D, E, F). While it could be possible that stock solutions of DNA damaging agents may be contaminated, this would need to be the case for all individual drugs and stocks tested that induce 6mA, which seems very unlikely. Finally, the data showing 6mA signal is not significantly different from untreated cells when a DNA damaging agent is combined with a METTL3 inhibitor (Figure 3 G, H) provides strong evidence against bacterial contamination in our drug stocks.
Second, in the whole work, the authors have not supplied any genomic sequencing data to support their conclusions. Although the sequencing of DNA 6mA in mammalian models is challenging, recent breakthroughs in sequencing techniques, such as DR-Seq or NT/NAME-seq, have lowered the bar and improved a lot in the 6mA sequencing assay. Therefore, the authors should consider employing the sequencing methods to further confirm the functional role of 6mA in base repair.
While we agree that it could be important to understand the precise genomic location of 6mA in relation to DNA damage, this is outside the scope of the current study. Moreover, this exercise may prove unproductive. If 6mA is enriched in DNA at damage sites or as DNA is replicated, the genomic mapping of 6mA is likely to be stochastic. If stochastic, it would be impossible to obtain the read depth necessary to map 6mA accurately.
Third, the authors used the METTL3 inhibitor and Mettl3-KO to validate the METTL3-6mA-UNG2 functional roles. However, the catalytic mutant and rescue of Mettl3 may be the further experiments to confirm the conclusion.
We believe this to be an excellent suggestion from Reviewer #2 but we are unable to perform the proposed experiment at this time. We encourage future studies to explore the rescue experiment.
Reviewer #3 (Public review):
Summary:
The authors are showing evidence that they claim establishes the controversial epigenetic mark, DNA 6mA, as promoting genome stability.
Strengths:
The identification of a poorly understood protein, METTL3, and its subsequent characterization in DDR is of high quality and interesting.
Weaknesses:
(1) The very presence of 6mA (DNA) in mammalian DNA is still highly controversial and numerous studies have been conclusively shown to have reported the presence of 6mA due to technical artifacts and bacterial contamination. Thus, to my knowledge there is no clear evidence for 6mA as an epigenetic mark in mammals, and consequently, no evidence of writers and readers of 6mA. None of this is mentioned in the introduction. Much of the introduction can be reduced, but a paragraph clearly stating the controversy and lack of evidence for 6mA in mammals needs to be added, otherwise, the reader is given an entirely distorted view of the field.
These concerns must also be clearly in the limitations section and even in the results section which fails to nuance the authors' findings.
We agree with the reviewer that the presence and potential function of 6mA in mammalian DNA has been debated. Importantly, the debate regarding the presence and quantity of 6mA in DNA has been previously restricted to undamaged, baseline conditions. In complete agreement with this notion, we do not detect appreciable levels of 6mA in untreated cells. We will revise the introduction to introduce the debate about 6mA in DNA. We, however, want to highlight that our study provides for the first time, convincing evidence (based on orthogonal methods) that 6mA is present in DNA in response to a stimulus, DNA damage.
(2) What is the motivation for using HT-29 cells? Moreover, the materials and methods do not state how the authors controlled for bacterial contamination, which has been the most common cause of erroneous 6mA signals to date. Did the authors routinely check for mycoplasma?
HT-29 is a cell line of colorectal origin and chemotherapeutic agents that introduce uracil and uracil derivatives in DNA, as those used in this study, are relevant for the treatment of colorectal cancer. As indicated above, we do not observe 6mA in untreated cells, strongly suggesting that the 6mA signal observed is dependent on DNA damage and not caused by a potential bacterial contamination (Figure 3D, E, F). Additionally, our cell lines are routinely tested for bacterial contamination, specifically mycoplasma.
(3) The single-cell imaging of 6mA in various cells is nice but must be confirmed by orthogonal approaches. PacBio would provide an alternative and quantitative approach to assessing 6mA levels. Similarly, it is unclear why the authors have not performed dot-blots of 6mA for genomic DNA from the given cell lines.
We are confused by this point since an orthogonal approach to detect 6mA, mass spectrometry-liquid chromatography, was employed. This method does not use an antibody and confirms the increase of 6mA in DNA when cells were treated with DNA damaging agents. This data is presented in Figure 3F.
It is sensible to hypothesize that the localization of 6mA is consistent with DNA replication (like uracil deposition). In this event, the genomic mapping of 6mA is likely to be stochastic. This would make quantification with PacBio sequencing difficult because it would be very challenging to achieve the appropriate read depth to call a modified base.
Dot blots rely on an antibody and thus are not truly orthogonal to our immunofluorescence-based measurements. We preferred the mass spectrometry-liquid chromatography approach we took as a true orthogonal approach.
(4) The results of Figure 3 need further investigation and validation. If the results are correct the authors are suggesting that the majority of 6mA in their cell lines is present in the DNA, and not the RNA, which is completely contrary to every other study of 6mA in mammalian cells that I am aware of. This could suggest that the antibody is not, in fact, binding to 6mA, but to unmodified adenine, which would explain why the signal disappears after DNase treatment. Indeed, binding to unmethylated DNA is a commonly known problem with most 6mA antibodies and is well described elsewhere.
Based on this and the following comment, we are convinced that Reviewer #3 has overlooked two critical elements of our study:
First, the immunofluorescence work presented in Figure 3, showing 6mA signal in response to DNA damage, uses cells that were pre-extracted to remove excess cytoplasmic RNA. This method is often used in immunofluorescence experiments of this kind. The pre-extraction method removes most of the cytoplasmic content, and the majority of the cytoplasmic m6A RNA signal. Supplementary Figure 3D shows cells that have not been pre-extracted prior to staining. These images show the cytoplasmic m6A signal is abundant if we do not perform the pre-extraction step.
If the antibody used to label 6mA significantly reacted with unmodified adenine, we would expect a large signal in untreated or untreated and denatured conditions. In contrast, an increase in 6mA is not observed in either case.
Second, the orthogonal approach we employed, mass spectrometry coupled with liquid chromatography, measures 6mA DNA analytes specifically by exact mass. This approach does not depend on an antibody and yields results consistent with those from the immunofluorescence experiments.
(5) Given the lack of orthogonal validation of the observed DNA 6mA and the lack of evidence supporting the presence of 6mA in mammalian DNA and consequently any functional role for 6mA in mammalian biology, the manuscript's conclusions need to be toned down significantly, and the inherent difficulty in assessing 6mA accurately in mammals acknowledged throughout.
As discussed in response to prior comments, Figure 3 does provide two independent and orthogonal methods that demonstrate 6mA presence in DNA specifically, and not RNA, in response to DNA damage. Complementary and orthogonal datasets are presented using either immunofluorescence microscopy or mass spectrometry-liquid chromatography of extracted DNA. The latter method does not rely on an antibody and can discriminate 6mA DNA versus RNA based on exact mass. We will revise the text to clarify that Figure 3F is a completely orthogonal approach.
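The exact-mass argument can be illustrated with a small calculation. The sketch below assumes the analytes are the free nucleosides obtained after digestion (N6-methyl-2'-deoxyadenosine for DNA, N6-methyladenosine for RNA) measured as protonated ions; the response does not state instrument settings or transitions, so the function names and the choice of the [M+H]+ ion are illustrative assumptions only. It shows that the DNA and RNA forms differ by one oxygen atom, roughly 16 Da.

```python
# Monoisotopic masses (Da) of the most abundant isotopes.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
PROTON = 1.00727647  # mass of a proton, used for the [M+H]+ ion

# Elemental compositions of the two nucleosides.
DNA_6MA = {"C": 11, "H": 15, "N": 5, "O": 3}  # N6-methyl-2'-deoxyadenosine
RNA_M6A = {"C": 11, "H": 15, "N": 5, "O": 4}  # N6-methyladenosine


def monoisotopic_mass(formula):
    """Sum the monoisotopic masses of all atoms in an elemental formula."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def protonated_mz(formula):
    """m/z of the singly protonated ion [M+H]+ (hypothetical ion choice)."""
    return monoisotopic_mass(formula) + PROTON


if __name__ == "__main__":
    for name, formula in (("DNA 6mA", DNA_6MA), ("RNA m6A", RNA_M6A)):
        print(f"{name}: [M+H]+ = {protonated_mz(formula):.4f}")
    gap = monoisotopic_mass(RNA_M6A) - monoisotopic_mass(DNA_6MA)
    print(f"Mass difference (one oxygen): {gap:.4f} Da")
```

A difference of this size is far larger than the mass accuracy of a typical mass spectrometer, which is why an exact-mass readout can separate the two species without relying on an antibody.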
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This valuable study marks a significant advancement in brain aging research by centering on Asian populations (Chinese, Malay, and Indian Singaporeans), a group frequently underrepresented in such studies. It unveils solid evidence for anatomical differences in brain aging predictors between the young and old age groups. Overall, this study broadens our understanding of brain aging across diverse ethnicities.
-
Joint Public Review:
Summary:
The authors of the study investigated the generalization capabilities of a deep learning brain age model across different age groups within the Singaporean population, encompassing both elderly individuals aged 55 to 88 years and children aged 4 to 11 years. The model, originally trained on a dataset primarily consisting of Caucasian adults, demonstrated a varying degree of adaptability across these age groups. For the elderly, the authors observed that the model could be applied with minimal modifications, whereas for children, significant fine-tuning was necessary to achieve accurate predictions. Through their analysis, the authors established a correlation between changes in the brain age gap and future executive function performance across both demographics. Additionally, they identified distinct neuroanatomical predictors for brain age in each group: lateral ventricles and frontal areas were key in elderly participants, while white matter and posterior brain regions played a crucial role in children. These findings underscore the authors' conclusion that brain age models hold the potential for generalization across diverse populations, further emphasizing the significance of brain age progression as an indicator of cognitive development and aging processes.
Strengths:
(1) The study tackles a crucial research gap by exploring the adaptability of a brain age model across Asian demographics (Chinese, Malay, and Indian Singaporeans), enriching our knowledge of brain aging beyond Western populations.
(2) It uncovers distinct anatomical predictors of brain aging between elderly and younger individuals, highlighting a significant finding in the understanding of age-related changes and ethnic differences.
In summary, this paper underscores the critical need to include diverse ethnicities in model testing and estimation.
Comments on revisions:
The previously mentioned weaknesses were addressed in the revision process. As stated earlier, the paper tackles a crucial research gap by exploring the adaptability of a brain-age model across Asian demographics (Chinese, Malay, and Indian Singaporeans), enriching our knowledge of brain aging beyond Western populations.
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public Review):
The authors of the study investigated the generalization capabilities of a deep learning brain age model across different age groups within the Singaporean population, encompassing both elderly individuals aged 55 to 88 years and children aged 4 to 11 years. The model, originally trained on a dataset primarily consisting of Caucasian adults, demonstrated a varying degree of adaptability across these age groups. For the elderly, the authors observed that the model could be applied with minimal modifications, whereas for children, significant fine-tuning was necessary to achieve accurate predictions. Through their analysis, the authors established a correlation between changes in the brain age gap and future executive function performance across both demographics. Additionally, they identified distinct neuroanatomical predictors for brain age in each group: lateral ventricles and frontal areas were key in elderly participants, while white matter and posterior brain regions played a crucial role in children. These findings underscore the authors' conclusion that brain age models hold the potential for generalization across diverse populations, further emphasizing the significance of brain age progression as an indicator of cognitive development and aging processes.
Strengths:
(1) The study tackles a crucial research gap by exploring the adaptability of a brain age model across Asian demographics (Chinese, Malay, and Indian Singaporeans), enriching our knowledge of brain aging beyond Western populations.
(2) It uncovers distinct anatomical predictors of brain aging between elderly and younger individuals, highlighting a significant finding in the understanding of age-related changes and ethnic differences.
Weaknesses:
(1) Clarity in describing the fine-tuning process is essential for improved comprehension.
(2) The analysis often limits its findings to p-values, omitting the effect sizes crucial for understanding the relationship with cognition.
(3) Employing a predictive framework for cognition using brain age could offer more insight than mere statistical correlations.
(4) Expanding the study's scope to evaluate the model's generalisability to unseen Caucasian samples is vital for establishing a comparative baseline.
In summary, this paper underscores the critical need to include diverse ethnicities in model testing and estimation.
Reviewer #1 (Recommendations for the authors):
Comment #1 - Fine-Tuning Process Clarity: Enhanced clarity in the fine-tuning process documentation is crucial for understanding how models are adapted to new datasets. This involves explaining parameter adjustments and choices, which facilitates replication and application in further research.
We thank Reviewer #1 for this pertinent point. As advised, we have added a Supplementary Methods section with more details on the finetuning process. This includes the addition of Supplementary Figure S6, which shows examples of learning curves that helped inform our parameter adjustments and choices. We have added a reference to this section in Section 5.2 of the Methods.
Comment #2 - Effect Sizes Reporting: The emphasis on reporting effect sizes alongside p-values addresses the need to quantify the strength of observed effects, particularly the relationship between brain age and cognition. Effect sizes provide insights into the practical significance of findings, crucial for clinical and practical applications.
We thank Reviewer #1 for raising this important comment. As suggested, we have added standardized regression coefficients (as measures of effect size) alongside p-values in Figures 3 – 4, Supplementary Figures S2 – S4, Supplementary Tables S4 – S15, and the text of Sections 2.2 – 2.3 of the Results. We have additionally added 95% confidence intervals to Supplementary Tables S4 – S15.
Comment #3 - Predictive Framework for Cognition: Adopting a predictive framework for cognition using brain age moves the research from mere correlation to actionable prediction, offering potentials based on predictive analytics.
We thank Reviewer #1 for this insightful suggestion. Adopting a predictive framework would certainly be a useful and exciting avenue for the application of brain age. However, we note that the current study was primarily interested in the generalizability and interpretability of brain age in Asian children and older adults, as well as the added value of longitudinal measures of brain age. Thus, we believe our correlation-based analysis effectively demonstrated that deviations of brain age from chronological age were not merely random errors, but were informative of cognition. Furthermore, ongoing changes to these deviations were informative of future cognition. This helps to establish the brain age gap as a biomarker for aging, independent of chronological age. Additionally, we expect that the accurate prediction of future cognition would require a multitude of factors, in addition to T1-based brain age, as well as a large sample size to train and test. We believe such a dataset would be a promising avenue for future work, but it is outside the scope of the current study.
Nonetheless, we were able to conduct a preliminary analysis using the current longitudinal data from SLABS and GUSTO. We extracted the same variables used in the original analyses of future cognition, corresponding to Figures 3D and 4B in the main text. To implement a predictive framework, we split the data into 10 stratified cross-validation folds. We also used kernel ridge regression (KRR) as the predictive model, as it has previously shown promising performance in behavioral and cognitive prediction [1]. We used a cosine kernel and nested 5-fold cross-validation to pick the optimal regularization strength (alpha).
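As a rough illustration of the modelling step just described, the sketch below implements kernel ridge regression with a cosine kernel and an inner 5-fold search over the regularization strength. It is a minimal pure-Python sketch and not the analysis code used for the response: the helper names (`krr_fit`, `krr_predict`, `select_alpha`), the interleaved fold assignment, and the use of squared error as the selection criterion are all assumptions made for illustration. In a nested design, `select_alpha` would be called on the training portion of each outer fold.

```python
import math


def cosine_kernel(a, b):
    """Cosine similarity between two (non-zero) feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def krr_fit(X, y, alpha):
    """Return dual coefficients solving (K + alpha * I) c = y."""
    K = [[cosine_kernel(a, b) for b in X] for a in X]
    for i in range(len(X)):
        K[i][i] += alpha
    return solve(K, y)


def krr_predict(X_train, dual, X_new):
    """Predict as a kernel-weighted sum of the dual coefficients."""
    return [sum(c * cosine_kernel(x, xt) for c, xt in zip(dual, X_train))
            for x in X_new]


def select_alpha(X, y, alphas, n_folds=5):
    """Inner cross-validation: pick the alpha with the lowest squared error."""
    best_alpha, best_err = None, float("inf")
    for alpha in alphas:
        err = 0.0
        for k in range(n_folds):
            test_idx = list(range(k, len(X), n_folds))
            train_idx = [i for i in range(len(X)) if i % n_folds != k]
            X_tr = [X[i] for i in train_idx]
            y_tr = [y[i] for i in train_idx]
            dual = krr_fit(X_tr, y_tr, alpha)
            preds = krr_predict(X_tr, dual, [X[i] for i in test_idx])
            err += sum((p - y[i]) ** 2 for p, i in zip(preds, test_idx))
        if err < best_err:
            best_alpha, best_err = alpha, err
    return best_alpha
```

Because the cosine kernel normalizes each feature vector, predictions depend only on the direction of the input, not its overall scale.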
To investigate the added value of BAG and longitudinal changes in BAG, we compared 3 predictive models for each cognitive domain. The baseline model consisted of the demographic covariates used in the original analyses (i.e. chronological age, sex, and years of education for older adults). A second model combined demographics with baseline BAG, and the third model incorporated demographics, baseline BAG, and the (early) annual rate of change in BAG. Predictions were extracted from each test fold, and performance was measured by the correlation between test predictions and actual values of future cognition (or change in cognition). Models were statistically compared using the corrected resampled t-test for machine learning models [1], [2], [3]. The Benjamini-Hochberg procedure was used to correct for multiple comparisons.
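The two statistical steps named above can be sketched as follows. This is an illustrative pure-Python sketch, not the code used for the response: it computes the corrected resampled t-statistic in the form given by Nadeau and Bengio [2], where the variance of the per-fold performance differences is inflated by the ratio of test to training set size, and then applies the Benjamini-Hochberg adjustment to a list of p-values. The function names are hypothetical, and converting the t-statistic into a p-value (a t-distribution with one fewer degree of freedom than the number of folds) is left to a statistics library.

```python
import math
from statistics import mean, variance


def corrected_resampled_t(diffs, n_train, n_test):
    """Corrected resampled t-statistic for per-fold performance differences.

    diffs   -- difference in performance between two models, one per fold
    n_train -- number of training samples in each fold
    n_test  -- number of test samples in each fold
    """
    n_folds = len(diffs)
    var = variance(diffs)  # sample variance (divides by n_folds - 1)
    # The n_test / n_train term corrects for overlapping training sets.
    return mean(diffs) / math.sqrt((1.0 / n_folds + n_test / n_train) * var)


def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

The correction always shrinks the statistic relative to a naive paired t-test on the fold differences, which makes the comparison more conservative.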
Author response image 1 shows the prediction results for SLABS and GUSTO. Notably, adding the early change in BAG significantly improves the prediction of future change in executive function in SLABS. There is also an improvement in predicting the future inhibition score in GUSTO, but this is not significant after multiple comparison correction. Encouragingly, these are the same domains that showed significant associations with the change in BAG in the original analyses. This suggests that longitudinal brain age continues to contribute information, independent of baseline factors, in a predictive framework. We hope that future work can expand on this analysis with, for instance, larger sample sizes, more varied and informative predictors, and state-of-the-art prediction methods, in order to establish actionable predictions of future cognition.
Author response image 1.
Predictive framework for cognition similarly suggests value of longitudinal change in BAG. Prediction performance (Pearson's correlation) of KRR across future cognitive outcomes. Each boxplot shows the distribution of performance over cross-validation folds. Model performances are statistically compared for each outcome. Significant outcomes from the original analyses are bolded. (A) Results for SLABS using the early change in BAG and future change in cognitive scores (non-overlapping). Early change in BAG again shows benefit for predicting future change in executive function. (B) Results for GUSTO using the early change in BAG (from 4.5-7.5 years old) and future cognitive score (at 8.5 years old). Early change in BAG again shows benefit for predicting future inhibition, but it is not significant after multiple comparison correction. Key - **: p < 0.01; * (ns): p < 0.05 but p_corr > 0.05 after multiple comparison correction; ns: p > 0.05
Comment #4 - Generalizability to Unseen Caucasian Samples: Evaluating the model's performance on unseen (longitudinal) Caucasian samples is important for benchmarking.
We thank Reviewer #1 for this important comment. We agree that generalizability should be benchmarked against performance on unseen Caucasian samples. In the SFCN model paper [4], the authors conducted an out-of-sample test on unseen Caucasian samples from ages 13 to 95. In this age range, they reported a high correlation (r = 0.975) and low MAE (MAE = 3.90). This favorable generalization performance was verified in adults by independent evaluations [5], [6]. This is also in line with what we observed in Asian older adults, taking into account the different age ranges and sample sizes involved [7].
However, this also highlights the difficulty in evaluating on younger ages in the range of GUSTO (4.5 – 10.5 years old). Most accessible developmental datasets (e.g. HBN, PING) were already included in model training, preventing an unbiased evaluation on these samples. Datasets such as PNC and ABCD were not included in training, but they primarily consist of an older age range than GUSTO. Holm et al. [8] previously tested the SFCN model in ABCD and reported satisfactory performance (low MAE) from 9 – 13 years old. However, to the best of our knowledge, there are no reported generalization results (for any ethnicity) from 4.5 – 7.5 years old, which is where we found the most performance degradation in GUSTO. We are also not aware of any datasets in this age range we could access to test this, unfortunately, but it would be an important area for future work.
While benchmarking in Caucasian children is difficult, we were able to conduct a preliminary analysis with older adults using the ADNI dataset (which was not included in the model training [4]). We selected a longitudinal subset with cognitive data available and no dementia at baseline (N = 137). We used composite cognitive scores covering memory, executive function, language, and visuospatial function [9], [10], [11]. We followed the same methodology (e.g. preprocessing, finetuning, statistical analysis) as the main analyses on EDIS, SLABS, and GUSTO. To maximize the data available, we tested associations with future cognition (taken at the last available time point), similar to GUSTO. We again included chronological age, sex, and years of education as demographic covariates.
Author response image 2 shows the brain age predictions for the pretrained and finetuned models on ADNI. Similar to Singaporean older adults, the pretrained model performs well, producing a high correlation (r = 0.8053; compared to r = 0.7389 for EDIS and r = 0.8136 for SLABS) and somewhat low MAE (MAE = 4.9735; compared to MAE = 3.9895 for EDIS and MAE = 3.4668 for SLABS). After finetuning, the MAE improves (MAE = 3.6837; compared to MAE = 3.3232 for EDIS and MAE = 3.2653 for SLABS) with a similar correlation (r = 0.7854; compared to r = 0.7445 for EDIS and r = 0.8138 for SLABS). This suggests that generalization to unseen Singaporean older adults is in line with the generalization to unseen Caucasian older adults.
Author response image 2.
Brain age predictions on unseen Caucasian sample of older adults. Predictions from the A) pretrained and B) finetuned brain age models on ADNI participants. Compare to Figure 2 of the main text.
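For readers less familiar with the metrics quoted above, the following sketch shows how they are commonly computed. It assumes the usual convention that the brain age gap (BAG) is predicted age minus chronological age, and it estimates the annual rate of change in BAG as a simple two-point slope; both are assumptions for illustration, and the function names are hypothetical rather than taken from the study's code.

```python
import math


def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and chronological age."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def brain_age_gap(predicted, chronological):
    """BAG per participant: predicted brain age minus chronological age."""
    return [p - c for p, c in zip(predicted, chronological)]


def annual_bag_change(bag_first, bag_second, years_between):
    """Two-point estimate of the yearly change in BAG between two visits."""
    return (bag_second - bag_first) / years_between


if __name__ == "__main__":
    ages = [60.0, 70.0, 80.0]          # toy chronological ages
    predictions = [63.0, 69.0, 84.0]   # toy model outputs
    print("MAE:", mean_absolute_error(predictions, ages))
    print("r:", pearson_r(predictions, ages))
    print("BAG:", brain_age_gap(predictions, ages))
```

Correlation and MAE capture different things: a model can rank participants well (high r) while being systematically offset (higher MAE), which is one reason both are reported before and after finetuning.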
For the associations with future cognition, we again find that baseline BAG does not associate with future cognition (Author response tables 1 and 2). However, encouragingly, we find that the early annual rate of change in BAG does associate with future memory, which is significant after multiple comparison correction for the finetuned model (Author response tables 3 and 4). This suggests a degree of replicability to the original results, but interestingly, in a different domain (memory vs. executive function). In contrast to SLABS, which consists of healthy older adults recruited from the community, ADNI consists of participants at risk of AD recruited from memory clinics. Thus, this difference in domain could be due to factors such as a stronger signal for memory in the testing battery or greater variations in memory function and decline. However, it could also reflect other population differences between ADNI and SLABS. This is an intriguing area for future study, ideally with larger sample sizes and more diverse populations included.
Author response table 1.
Linear relationship between pretrained baseline BAG and future cognitive score in ADNI. Compare to Supplementary Tables S4 – S15 of the original text.
Author response table 2.
Linear relationship between finetuned baseline BAG and future cognitive score in ADNI. Compare to Supplementary Tables S4 – S15 of the original text.
Author response table 3.
Linear relationship between pretrained change in BAG and future cognitive score in ADNI. Compare to Supplementary Tables S4 – S15 of the original text.
Author response table 4.
Linear relationship between finetuned change in BAG and future cognitive score in ADNI. Compare to Supplementary Tables S4 – S15 of the original text.
References
(1) L. Q. R. Ooi et al., “Comparison of individualized behavioral predictions across anatomical, diffusion and functional connectivity MRI,” NeuroImage, vol. 263, p. 119636, Nov. 2022, doi: 10.1016/j.neuroimage.2022.119636.
(2) C. Nadeau and Y. Bengio, “Inference for the Generalization Error,” Mach. Learn., vol. 52, no. 3, pp. 239–281, Sep. 2003, doi: 10.1023/A:1024068626366.
(3) R. R. Bouckaert and E. Frank, “Evaluating the Replicability of Significance Tests for Comparing Learning Algorithms,” in Advances in Knowledge Discovery and Data Mining, H. Dai, R. Srikant, and C. Zhang, Eds., Berlin, Heidelberg: Springer, 2004, pp. 3–12. doi: 10.1007/978-3-540-24775-3_3.
(4) E. H. Leonardsen et al., “Deep neural networks learn general and clinically relevant representations of the ageing brain,” NeuroImage, vol. 256, p. 119210, Aug. 2022, doi: 10.1016/j.neuroimage.2022.119210.
(5) R. P. Dörfel et al., “Prediction of brain age using structural magnetic resonance imaging: A comparison of accuracy and test-retest reliability of publicly available software packages,” Neuroscience, preprint, Jan. 2023. doi: 10.1101/2023.01.26.525514.
(6) J. L. Hanson, D. J. Adkins, E. Bacas, and P. Zhou, “Examining the reliability of brain age algorithms under varying degrees of participant motion,” Brain Inform., vol. 11, no. 1, p. 9, Apr. 2024, doi: 10.1186/s40708-024-00223-0.
(7) A.-M. G. de Lange et al., “Mind the gap: Performance metric evaluation in brain-age prediction,” Hum. Brain Mapp., vol. 43, no. 10, pp. 3113–3129, Jul. 2022, doi: 10.1002/hbm.25837.
(8) M. C. Holm et al., “Linking brain maturation and puberty during early adolescence using longitudinal brain age prediction in the ABCD cohort,” Dev. Cogn. Neurosci., vol. 60, p. 101220, Feb. 2023, doi: 10.1016/j.dcn.2023.101220.
(9) P. K. Crane et al., “Development and assessment of a composite score for memory in the Alzheimer’s Disease Neuroimaging Initiative (ADNI),” Brain Imaging Behav., vol. 6, no. 4, pp. 502–516, Dec. 2012, doi: 10.1007/s11682-012-9186-z.
(10) L. E. Gibbons et al., “A composite score for executive functioning, validated in Alzheimer’s Disease Neuroimaging Initiative (ADNI) participants with baseline mild cognitive impairment,” Brain Imaging Behav., vol. 6, no. 4, pp. 517–527, Dec. 2012, doi: 10.1007/s11682-012-9176-1.
(11) S.-E. Choi et al., “Development and validation of language and visuospatial composite scores in ADNI,” Alzheimers Dement. Transl. Res. Clin. Interv., vol. 6, no. 1, p. e12072, 2020, doi: 10.1002/trc2.12072.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This valuable study examines the variability in spacing and direction of entorhinal grid cells, providing convincing evidence that such variability helps disambiguate locations within an environment. This study will be of interest to neuroscientists working on spatial navigation and, more broadly, on neural coding.
-
Reviewer #1 (Public review):
Summary:
The present paper by Redman et al. investigated the variability of grid cell properties in the MEC by analyzing publicly available large-scale neural recording data. Although previous studies have proposed that grid spacing and orientation are homogeneous within the same grid module, the authors found a small but robust variability in grid spacing and orientation across grid cells in the same module. The authors also showed, through model simulations, that such variability is useful for decoding spatial position.
Strengths:
The results of this study provide novel and intriguing insights into how grid cells compose the cognitive map in the axis of the entorhinal cortex and hippocampus. This study analyzes large data sets in an appropriate manner and the results are convincing.
Comments on revisions:
In the revised version of the manuscript, the authors have addressed all the concerns I raised.
-
Reviewer #2 (Public review):
Summary:
This paper presents an interesting and useful analysis of grid cell heterogeneity, showing that the experimentally observed heterogeneity of spacing and orientation within a grid cell module can allow more accurate decoding of location from a single module.
Strengths:
(1) I found the statistical analysis of the grid cell variability to be very systematic and convincing. I also found the evidence for enhanced decoding of location based on between cell variability within a module to be convincing and important, supporting their conclusions.
(2) Theoreticians have developed models that focus on the use of grid cells that are highly regular in their parameters, and usually vary only in the spatial phase of cells within modules and the spacing and orientation between modules. This focus on consistency is partly to obtain the generalization of the grid cell code to a broad range of previously unvisited locations. In contrast, most experimentalists working with grid cells know that many if not most grid cells show high variability of firing fields, as demonstrated in the figures in experimental papers. The authors of this current paper have highlighted this discrepancy, and shown that the variability shown in the data could actually enhance decoding of location.
-
Reviewer #3 (Public review):
Summary:
Redman and colleagues analyze grid cell data obtained from public databases. They show that there is significant variability in spacing and orientation within a module. They show that the difference in spacing and orientation for a pair of cells is larger than the one obtained for two independent maps of the same cell. They speculate that this variability could be useful to disambiguate the rat position if only information from a single module is used by a decoder.
Strengths:
The strengths of this work lie in its conciseness, clarity, and the potential significance of its findings for the grid cell community, which has largely overlooked this issue for the past two decades. Their hypothesis is well stated and the analyses are solid.
Weaknesses:
Major weaknesses identified in the original version have been addressed.
The authors have addressed all of our concerns, providing control analyses that strengthen their claim.
-
Author response:
The following is the authors’ response to the original reviews.
We thank the reviewers for their time and thoughtful comments. We believe that the further analyses suggested have made the results clearer and more robust. Below, we briefly highlight the key points addressed in the revision and the new evidence supporting them. Then, we address each reviewer’s critiques point-by-point.
- Changes in variability with respect to time/experience
Both reviewers #1 and #3 asked whether the variability in grid properties observed was dependent on time or experience. This is an important point, given that such a dependence on time could lead to interesting hypotheses about the underlying dynamics of the grid code. However, in the new analyses we performed, we do not observe changes in grid variability within a session (Fig S5 of the revised manuscript), suggesting that the grid variability seen is constant within the timescale of the data set.
- The assumption of constant grid parameters in the literature
Reviewer #2 pointed out that it had been appreciated by experimentalists that grid properties are variable within a module. We agree that we may have overstated the universality of this assumption in the original manuscript, and we have toned down the language in the revision. However, we note that many previous theoretical studies assumed these properties to be constant within a given module. We provide some examples below, and have added evidence of this assertion, with citations to the theoretical literature, to the revised manuscript.
- Additional sources of variability
Reviewer #3 pointed out additional sources that might explain the variability observed in the paper (beyond time and experience). These sources include: field width, border location, and the impact of conjunctive cells. We have run additional analyses and have found no significant impact on the observed variability from any of these factors. We believe that these are important controls, and have added them to the manuscript (Fig S4-S7 of the revised manuscript).
- Analysis of computational models
Reviewer #3 noted that our results could be strengthened by performing similar analyses on the output of computational models of grid cells. This is a good idea. We have now measured the variability of grid properties in a recent normative recurrent neural network (RNN) model that develops grid cells when trained to perform path integration (Sorscher et al., 2019). This model has been shown to develop signatures of a 2D toroidal attractor (Sorscher et al., 2023) and achieves a high accuracy on a simple path integration task. Interestingly, the units with the greatest grid scores also exhibit a range of grid spacings and grid orientations (Fig S8 of the revised manuscript). Furthermore, by decreasing the amount of sparsity (through decreasing the weight decay regularization), we found an increase in the variability of the grid properties. This analysis demonstrates a heretofore unknown similarity between the RNN models trained to perform path integration and recorded grid cells from MEC. It additionally provides a framework for computational analysis of the emergence of grid property variability.
Reviewer #1:
(1) Is the variability in grid spacing and orientation that the authors found intrinsically organized or is it shaped by experience? Previous research has shown that grid representations can be modified through experience (e.g., Boccara et al., Science 2019). To understand the dynamics of the network, it would be important to investigate whether robust variability exists from the beginning of the task period (recording period) or whether variability emerges in an experience-dependent manner within a session.
This is an interesting question that was not addressed in the paper. To test this, we performed additional analysis to resolve whether the variability changes across a session.
Using a sliding window, we have measured changes in variability with respect to recording time (Fig S5A). To this end, we compute grid orientation and spacing over a time-window whose length is half the total length of the recording. From the population distribution of orientation and spacing values, we compute the standard deviation as a measure of variability. We repeat the same procedure, sliding the window forward until the variability for the second half of the recording is computed.
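As an illustration only, the sliding-window procedure described above could be sketched as follows. The function name, the `estimate_properties` callable, and the number of window positions (`n_steps`) are assumptions made for this sketch and are not taken from the manuscript; in particular, the per-window estimation of spacing and orientation from each cell's autocorrelogram is left to the caller.

```python
import numpy as np

def sliding_window_variability(duration, estimate_properties, n_steps=5):
    """Between-cell variability of grid properties in half-session windows.

    duration            : total recording length (seconds).
    estimate_properties : hypothetical callable (t_start, t_stop) ->
                          (spacings, orientations), one value per cell,
                          computed only from data inside the window.
    n_steps             : number of window positions (an assumption).

    Returns a list of (t_start, std_spacing, std_orientation).
    """
    window = duration / 2.0  # window length is half the recording
    # Slide from the first half (start = 0) to the second half
    # (start = duration / 2).
    starts = np.linspace(0.0, duration - window, n_steps)
    results = []
    for t0 in starts:
        spacings, orientations = estimate_properties(t0, t0 + window)
        # Standard deviation of the population distribution is the
        # variability measure named in the text.
        results.append((float(t0), float(np.std(spacings)),
                        float(np.std(orientations))))
    return results
```

If the variability were independent of time, the second and third entries of each returned tuple would stay roughly flat across window positions.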
We applied this approach to recording ID R12 (the same as in Figs 2-4) given that this recording session was significantly longer than the rest (nearly two hours). Results are shown in Fig S5B-C. For both orientation and spacing, no changes of variability with respect to time can be observed. Similar results were found for other modules (see caption of Fig S5 for statistics).
We also note that the rats were already familiarized with the environment for 10-20 sessions prior to the recordings, so there may not be further learning during the period of the grid cell recordings. No changes in variability can be seen in Rat R across days (e.g., in Fig 5B R12 and R22 have similar distributions of variability). However, we note that it may be possible that there are changes in grid properties at time-scales greater than the recordings.
(2) It is important to consider the optimal variability size. The larger the variability, the better it is for decoding. On the other hand, as the authors state in the Discussion, it is assumed that variability does not exist in the continuous attractor model. Although this study describes that it does not address how such variability fits the attractor theory, it would be better if more detailed ideas and suggestions were provided as to what direction the study could take to clarify the optimal size of variability.
We appreciate this suggestion and agree that more discussion is warranted on how our results can be reconciled with previously observed attractor dynamics. To explore this, we studied the recurrent neural network (RNN) model from Sorscher et al. (2019), which develops grid responses when trained on path integration. This network has previously been found to develop signatures of toroidal topology (Sorscher et al., 2023), yet we find its grid responses also contain heterogeneity in grid properties (Fig S8). By decreasing the strength of the weight decay regularization (which leads to denser connectivity in the recurrent layer), we find an increase in the grid property variability. Interestingly, decreasing the weight decay regularization has been previously found to lead to weaker grid responses and worse ability of the RNN to perform path integration on environments larger than it was trained on. This approach not only provides preliminary evidence for our claim that too much variability can lead to weaker continuous attractor structure, but also provides a modeling framework with which future work can explore this question in more detail. We have added discussion of this issue to the manuscript text (Discussion).
Reviewer #2:
(1) Even though theoreticians might have gotten the mistaken impression that grid cells are highly regular, this might be due to an overemphasis on regularity in a subset of papers. Most experimentalists working with grid cells know that many if not most grid cells show high variability of firing fields within a single neuron, though this analysis focuses on between neurons. In response to this comment, the reviewers should tone down and modify their statements about what are the current assumptions of the field (and if possible provide a short supplemental section with direct quotes from various papers that have made these assumptions).
We agree that some experimentalists are aware of variability in the recorded grid response patterns and that this work may not come as a complete surprise to them. We have toned down our language in the Introduction, changing “our results challenge a long-held assumption” to “our results challenge a frequently made assumption in the theoretical literature”. Additionally, we have added a caveat that “experimentalists have been aware” of the observed variability in grid properties.
We would like to emphasize that the lack of work carefully examining the robustness of this variability has prevented a firm understanding of whether this is an inherent property of grid cells or due to measurement noise. The impact of this can be seen in theoretical neuroscience work where a considerable number of articles (including recent publications) start with the assumption that all grid cells within a module have identical properties, with the exception of phase shift and noise. We have now cited a number of these papers in the Introduction, to provide specific references. To further illustrate the pervasiveness of this assumption being explicitly made in theoretical neuroscience, below we provide quotes from a few important papers:
“Cells with a common spatial period also share a common grid orientation; their responses differ only by spatial translations, or different preferred firing phases, with respect to their common response period” (Sreenivasan and Fiete, 2011)
“Grid cells are organized into discrete modules; within each module, the spatial scale and orientation of the grid lattice are the same, but the lattice for different cells is shifted in space.” (Stemmler et al., 2015)
“Recently, it was shown that grid cells are organized in discrete modules within which cells share the same orientation and periodicity but vary randomly in phase” (Wei et al., 2015)
“...cells within one module have receptive fields that are translated versions of one another, and different modules have firing lattices of different scales and orientations” (Dorrell et al., 2023)
In these works, this assumption is used to derive properties relating to the computational properties of grid cells (e.g., error correction, optimal scaling between grid spacings in different modules).
In addition, since grid cells are assumed to be identical in the computational neuroscience community, there has been little work on quantifying how much variability a given model produces. This makes it challenging to understand how consistent different models are with our observations. This is illustrated in our analysis of a recent recurrent neural network (RNN) model of grid cells (Fig S8), which does exhibit variability.
(2) The authors state that "no characterization of the degree and robustness of variability in grid properties within individual modules has been performed." It is always dangerous to speak in absolute terms about what has been done in scientific studies. It is true that few studies have had the number of grid cells necessary to make comparisons within and between modules, but many studies have clearly shown the distribution of spacing in neuronal data (e.g. Hafting et al., 2005; Barry et al., 2007; Stensola et al., 2012; Hardcastle et al., 2015) so the variability has been visible in the data presentations. Also, most researchers in the field are well aware that highly consistent grid cells are much rarer than messy grid cells that have unevenly spaced firing fields. This doesn't hurt the importance of the paper, but they need to tone down their statements about the lack of previous awareness of variability (specific locations are noted in the specific comments).
We have toned down our language in the Introduction. However, we note that our point that no detailed analysis had been done on measuring the robustness of this variability stands. Thus, for the general community, it has not been clear whether this previously observed variability is noise or a real feature of the grid code.
(3) The methods section needs to have a separate subheading entitled "How grid cells were assigned to modules" that clearly describes how the grid cells were assigned to a module (i.e. was this done by Gardner et al., or done as part of this paper's post-processing?).
We thank the reviewer for pointing out this missing information. We have added a new subsection in the Materials and Methods section, entitled “Grid module classification” to clarify how the grid cells are assigned to modules. In short, this was done by Gardner et al. (2022) using an unsupervised clustering approach that was viewed as enabling a less biased identification of modules. We did not perform any additional processing steps on module identity.
Reviewer #3:
(1) One possible explanation of the dispersion in lambda (not in theta) could be variability in the typical width of the field. For a fixed spacing, wider fields might push the six fields around the center of the autocorrelogram toward the outside, depending on the details of how exactly the position of these fields is calculated. We recommend authors show that lambda does not correlate with field width, or at least that the variability explained by field width is smaller than the overall lambda variability.
We agree that this option had not been carefully ruled out by our previous analyses. To tackle this question, we compute the field width of a given cell using the value at the minima of its spatial autocorrelogram (Fig S4A-B). For all cells in recording ID R12, there is a non-significant negative linear correlation between grid field width and between-cell variability (Fig S4C). The variability explained by the width of the field is 4% of the variability, as indicated by the R^2 value of the linear fit. Similar results were found for all other modules (see caption of Fig S4C for statistics). Therefore, we do not think that grid field width explains spacing variability.
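For readers unfamiliar with the statistic, a minimal sketch of how the fraction of variance explained by a linear fit (R^2) can be computed is given below. The function and variable names are illustrative assumptions, and this is not the analysis code used for Fig S4C; an R^2 of 0.04 corresponds to the 4% figure quoted above.

```python
import numpy as np

def linear_fit_r_squared(x, y):
    """Least-squares line y ~ slope * x + intercept and its R^2.

    x, y are 1D sequences of equal length (for example, a per-cell
    field width and a per-cell spacing measure; names are illustrative).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)   # degree-1 polynomial fit
    residuals = y - (slope * x + intercept)
    ss_res = np.sum(residuals ** 2)          # unexplained variation
    ss_tot = np.sum((y - y.mean()) ** 2)     # total variation
    return float(slope), float(1.0 - ss_res / ss_tot)
```

A negative slope with a small R^2 would indicate, as in the response above, a weak negative trend that accounts for little of the spread.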
(2) An alternative explanation could be related to what happens at the borders. The authors tackle this issue in Figure S2 but introduce a different way of measuring lambda based on three fields, which in our view is not optimal. We recommend showing that the dispersions in lambda and theta remain invariant as one removes the border-most part of the maps but estimating lambda through the autocorrelogram of the remaining part of the map. Of course, there is a limit to how much can be removed before measures of lambda and theta become very noisy.
We have performed additional analysis to explore the role of borders in grid property variability. To do so, we have followed the suggestion by the reviewer and have re-analyzed grid properties from the autocorrelogram when the border-most part of the maps are removed (Fig S6A-B). For all modules, we do not see any changes in variability (computed as the standard deviation of the population distribution) for either orientation or spacing. As predicted by the reviewer, after removing about 25% of the border-most part of the environment we start seeing changes in variability, as measures of theta and lambda become noisy and computed over a smaller spatial range. This result holds for all other modules (Fig S6C-D).
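The border-removal control can be pictured with the short sketch below, which trims an outer margin from a 2D rate map before any autocorrelogram is computed. How the removed percentage is defined here (as a share of each side length, split evenly between the two opposite edges) is an assumption of this sketch, as are the function and argument names; the manuscript may define the removed fraction differently.

```python
import numpy as np

def crop_border(rate_map, fraction):
    """Return the rate map with its border-most part removed.

    rate_map : 2D array of firing rates over spatial bins.
    fraction : share of each side length to remove in total
               (assumed definition), with 0 <= fraction < 1.
    """
    rate_map = np.asarray(rate_map)
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must be in [0, 1)")
    rows, cols = rate_map.shape
    # Number of bins trimmed from each edge.
    dr = int(round(rows * fraction / 2))
    dc = int(round(cols * fraction / 2))
    return rate_map[dr:rows - dr, dc:cols - dc]
```

Spacing and orientation would then be re-estimated from the autocorrelogram of the cropped map, and the population standard deviations compared across increasing values of `fraction`.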
(3) A third possibility is slightly more tricky. Some works (for example Kropff et al, 2015) have shown that fields anticipate the rat position, so every time the rat traverses them they appear slightly displaced opposite to the direction of movement. The amount of displacement depends on the velocity. Maps that we construct out of a whole session should be deformed in a perfectly symmetric way if rats traverse fields in all directions and speeds. However, if the cell is conjunctive, we would expect a deformation mainly along the cell's preferred head direction. Since conjunctive cells have all possible preferred directions, and many grid cells are not conjunctive at all, this phenomenon could create variability in theta and lambda that is not a legitimate one but rather associated with the way we pool data to construct maps. To rule out this possibility, we recommend the authors study the variability in theta and lambda of conjunctive vs non-conjunctive grid cells. If the authors suspect that this phenomenon could explain part of their results, they should also take into account the findings of Gerlei and colleagues (2020) from the Nolan lab, that add complexity to this issue.
We appreciate the reviewer pointing out the possible role conjunctive cells may play. To investigate how conjunctive cells may affect the observed grid property variability, we have performed additional analyses taking into account if the grid cells included in the study are conjunctive. Comparing within- and between-cell variability of conjunctive vs. non-conjunctive cells in recording R12, we do not see any qualitative differences for either orientation or spacing (Fig S7A-B). When excluding conjunctive cells from the between-variability comparison, we do not see any significant difference compared to when these cells are included (Fig S7C-D). As such, it does not appear that conjunctive cells are the source of variability in the population.
We further note that the number of putative conjunctive cells varied across modules and recordings. For instance, in recording Q1 and Q2, Gardner et al. (2022) reported 3 (out of 97) and 1 (out of 66) conjunctive cells, respectively. Given that we see variability robustly across recordings (Fig 5), we do not believe that conjunctive cells can explain the presence of variability we observe.
(4) The results in Figure 6 are correct, but we are not convinced by the argument. The fact that grid cells fire in the same way in different parts of the environment and in different environments is what gives them their appeal as a platform for path integration since displacement can be calculated independently of the location of the animal. Losing this universal platform is, in our view, too much of a price to pay when the only gain is the possibility of decoding position from a single module (or non-adjacent modules) which, as the authors discuss, is probably never the case. Besides, similar disambiguation of positions within the environment would come for free by adding to the decoding algorithm spatial cells (non-hexagonal but spatially stable), which are ubiquitous across the entorhinal cortex. Thus, it seems to us that - at least along this line of argumentation - with variability the network is losing a lot but not gaining much.
We agree that losing the continuous attractor network (CAN) structure and the ability to path integrate would be a very large loss. However, we do not believe that the variability we observe necessarily destroys either the CAN or path integration. We argue this for two reasons. First, the data we analyzed [from Gardner et al. (2022)] is exactly the data set that was found to have toroidal topology and is therefore viewed as consistent with a major prediction of CANs. Thus, the amount of variability in grid properties does not rule out the underlying presence of a continuous attractor. Second, path integration may still be possible with grid cells that have variable properties. To illustrate this, we analyzed data from the Sorscher et al. (2019) recurrent neural network (RNN) model, which was trained explicitly on path integration, and found that the grid representations that emerged had variability in spacing and orientation (see point #6 below).
(5) In Figure 4 one axis has markedly lower variability. Is this always the same axis? Can the authors comment more on this finding?
We agree that in Fig 4 the first axis has lower variability. We believe that this is specific to the module R12 and does not reflect any differences in axis or bias in the methods used to compute the axis metrics. To test this, we have performed the same analyses for other modules, finding that other recordings do not exhibit the same bias. Results for the modules with the most cells are shown below (Author response image 1).
Author response image 1.
Grid properties along Axis 1 are not less variable for many recorded grid modules. Same as Fig 4C-D, but for four other recorded modules. Note that the variability along each axis is similar.
(6) The paper would gain in depth if maps coming out of different computational models could be analyzed in the same way.
We agree with the reviewer that examining computational models using the same approach would strengthen our results and we appreciate the suggestion. To address this, we have analyzed the results from a previous normative model for grid cells [Sorscher et al. (2019)] that trained a recurrent neural network (RNN) model to perform path integration and found that units developed grid-cell-like responses. These models have been found to exhibit signatures of toroidal attractor dynamics [Sorscher et al. (2023)] and exhibit a diversity of responses beyond pure grid cells, making them a good starting point for understanding whether models of MEC may contain uncharacterized variability in grid properties.
We find that RNN units in these normative models exhibit similar amounts of variability in grid spacing and orientation as observed in the real grid cell recordings (Fig S8A-D). This provides additional evidence that this variability may be expected from a normative framework, and that the variability does not destroy the ability to path integrate (which the RNN is explicitly trained to perform).
The RNN model offers possibilities to assess what might cause this variability. While we leave a detailed investigation of this to future work, we varied the weight decay regularization hyper-parameter. This value controls how sparse the weights in the hidden recurrent layer are. Large weight decay regularization strength encourages sparser connectivity, while small weight decay regularization strength allows for denser connectivity. We find that increasing this penalty (and enforcing sparser connectivity) decreases the variability of grid properties (Fig S8E-F). This suggests that the observed variability in the Gardner et al. (2022) data set could be due to the fact that grid cells are synaptically connected to other, non-grid cells in MEC.
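To make the role of this hyper-parameter concrete, the sketch below shows one gradient-descent step under an L2 penalty on the recurrent weights. That the penalty has the form `weight_decay * sum(W**2)` is an assumption of this sketch (it is the usual meaning of weight decay), and the function and argument names are hypothetical rather than taken from the model's code.

```python
import numpy as np

def weight_decay_step(weights, task_grad, lr, weight_decay):
    """One gradient step on: task_loss + weight_decay * sum(W**2).

    weights      : recurrent weight matrix (any shape).
    task_grad    : gradient of the task loss w.r.t. the weights.
    lr           : learning rate.
    weight_decay : regularization strength (the hyper-parameter varied).
    """
    weights = np.asarray(weights, dtype=float)
    task_grad = np.asarray(task_grad, dtype=float)
    # The gradient of the penalty term is 2 * weight_decay * W, which
    # pulls every weight toward zero in proportion to its size.
    return weights - lr * (task_grad + 2.0 * weight_decay * weights)
```

With a larger `weight_decay`, each step shrinks the weights more strongly, which is the sense in which a stronger penalty favors fewer large connections in the recurrent layer.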
(7) Similarly, it would be very interesting to expand the study with some other data to understand if between-cell delta_theta and delta_lambda are invariant across environments. In a related matter, is there a correlation between delta_theta (delta_lambda) for the first vs for the second half of the session? We expect there should be a significant correlation, it would be nice to show it.
We agree this would be interesting to examine. For this analysis, it is essential to have a large number of grid cells, and we are not aware of other published data sets with comparable cell numbers using different environments.
Using a sliding window analysis, we have characterized changes in variability with respect to the recording time (Figure S5A). To do so, we compute grid orientation and spacing over a time-window whose length is half of the total length of the recording. From the population distribution of orientation and spacing values, we compute the standard deviation as a measure of between-cell variability. We repeat the same procedure, sliding the window forward until the variability for the second half of the recording is computed.
We applied this approach to recording ID R12 (the same as in Figs 2-4) given that this recording session was significantly longer than the rest (almost two hours). Results are shown in Fig S5 B-C. For both orientation and spacing, no systematic changes of variability with respect to time were observed. Similar results were found for other modules (see caption of Fig S5 for statistics).
We also note that the rats were already familiarized with the environment for 10-20 sessions prior to the recordings, so there may not be further learning during the period of the grid cell recordings. No changes in variability can be seen in Rat R across days (e.g., in Fig 5B R12 and R22 have similar distributions of variability). However, we note that it may be possible that there are changes in grid properties at time-scales greater than the recordings.
-
-
www.biorxiv.org
-
eLife Assessment
This important study reports a detailed quantification of the population dynamics of Salmonella enterica serovar Typhimurium in mice. Bacterial burden and founding population sizes across various organs were quantified, revealing pathways of dissemination and reseeding of the gastrointestinal tract from systemic organs. Using various techniques, including genetic distance measurements, the authors present compelling evidence to support their conclusions, thus presenting new knowledge that will be of broad interest to scientists focusing on infectious diseases.
-
Reviewer #1 (Public review):
Hotinger et al. explore the population dynamics of Salmonella enterica serovar Typhimurium in mice using genetically tagged bacteria. In addition to physiological observations, pathology assessments, and CFU measurements, the study emphasizes quantifying host bottleneck sizes that limit Salmonella colonization and dissemination. The authors also investigate the genetic distances between bacterial populations at various infection sites within the host.
Initially, the study confirms that pretreatment with the antibiotic streptomycin before inoculation via orogastric gavage increases the bacterial burden in the gastrointestinal (GI) tract, leading to more severe symptoms and heightened fecal shedding of bacteria. This pretreatment also significantly reduces between-animal variation in bacterial burden and fecal shedding. The authors then calculate founding population sizes across different organs, discovering a severe bottleneck in the intestine, with founding populations reduced by approximately 10^6-fold compared to the inoculum size. Streptomycin pretreatment increases the founding population size and bacterial replication in the GI tract. Moreover, by calculating genetic distances between populations, the authors demonstrate that, in untreated mice, Salmonella populations within the GI tract are genetically dissimilar, suggesting limited exchange between colonization sites. In contrast, streptomycin pretreatment reduces genetic distances, indicating increased exchange.
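As a rough illustration of what a genetic distance between two barcoded populations can look like, the sketch below computes the Cavalli-Sforza chord distance from two vectors of barcode counts. The summary above does not state which distance the authors used, so the choice of this metric is an assumption of the sketch, as are the function and argument names.

```python
import numpy as np

def chord_distance(counts_a, counts_b):
    """Cavalli-Sforza chord distance between two barcode populations.

    counts_a, counts_b : non-negative counts per barcode, listed in the
                         same barcode order for both samples.
    Returns 0 for identical frequency profiles and about 0.90 when the
    two samples share no barcodes.
    """
    p = np.asarray(counts_a, dtype=float)
    q = np.asarray(counts_b, dtype=float)
    p = p / p.sum()                      # barcode frequencies, sample A
    q = q / q.sum()                      # barcode frequencies, sample B
    cos_theta = np.sum(np.sqrt(p * q))   # overlap of the two profiles
    # Guard against tiny negative values caused by rounding.
    return float((2.0 * np.sqrt(2.0) / np.pi)
                 * np.sqrt(max(0.0, 1.0 - cos_theta)))
```

Under this metric, low values between two sites would be read as shared ancestry or exchange, and high values as independently founded populations.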
In extraintestinal organs, the bacterial burden is generally not substantially increased by streptomycin pretreatment, with significant differences observed only in the mesenteric lymph nodes and bile. However, the founding population sizes in these organs are increased. By comparing genetic distances between organs, the authors provide evidence that subpopulations colonizing extraintestinal organs diverge early after infection from those in the GI tract. This hypothesis is further tested by measuring bacterial burden and founding population sizes in the liver and GI tract at 5 and 120 hours post-infection. Additionally, they compare orogastric gavage infection with the less injurious method of infection via drinking, finding similar results for CFUs, founding populations, and genetic distances. These results argue against injuries during gavage as a route of direct infection.
To bypass bottlenecks associated with the GI tract, the authors compare intravenous (IV) and intraperitoneal (IP) routes of infection. They find approximately a 10-fold increase in bacterial burden and founding population size in immune-rich organs with IV/IP routes compared to orogastric gavage in streptomycin-pretreated animals. This difference is interpreted as a result of "extra steps required to reach systemic organs."
While IP and IV routes yield similar results in immune-rich organs, IP infections lead to higher bacterial burdens in nearby sites, such as the pancreas, adipose tissue, and intraperitoneal wash, as well as somewhat increased founding population sizes. The authors correlate these findings with the presence of white lesions in adipose tissue. Genetic distance comparisons reveal that, apart from the spleen and liver, IP infections lead to genetically distinct populations in infected organs, whereas IV infections generally result in higher genetic similarity.
Finally, the authors investigate GI tract reseeding, identifying two distinct routes. They observe that the GI tracts of IP/IV-infected mice are colonized either by a clonal or a diversely tagged bacterial population. In clonally reseeded animals, the genetic distance within the GI tract is very low (often zero) compared to the bile population, which is predominantly clonal or pauciclonal. These animals also display pathological signs, such as cloudy/hardened bile and increased bacterial burden, leading the authors to conclude that the GI tract was reseeded by bacteria from the gallbladder bile. In contrast, animals reseeded by more complex bacterial populations show that bile contributes only a minor fraction of the tags. Given the large founding population size in these animals' GI tracts, which is larger than in orogastrically infected animals, the authors suggest a highly permissive second reseeding route, largely independent of bile. They speculate that this route may involve a reversal of known mechanisms that the pathogen uses to escape from the intestine.
The manuscript presents a substantial body of work that offers a meticulously detailed understanding of the population dynamics of S. Typhimurium in mice. It quantifies the processes shaping the within-host dynamics of this pathogen and provides new insights into its spread, including previously unrecognized dissemination routes. The methodology is appropriate and carefully executed, and the manuscript is well-written, clearly presented, and concise. The authors' conclusions are well-supported by experimental results and thoroughly discussed. This work underscores the power of using highly diverse barcoded pathogens to uncover the within-host population dynamics of infections and will likely inspire further investigations into the molecular mechanisms underlying the bottlenecks and dissemination routes described here.
-
Reviewer #2 (Public review):
In this paper, Hotinger et al. propose an improved barcoded library system, called STAMPR, to study Salmonella population dynamics during infection. Using this system, the authors demonstrate significant diversity in the colonization of different Salmonella clones (defined by the presence of different barcodes) not only across different organs (liver, spleen, adipose tissues, pancreas and gall bladder) but also within different compartments of the same gastrointestinal tissue. Additionally, this system revealed that microbiota competition is the major bottleneck in Salmonella intestinal colonization, which can be mitigated by streptomycin treatment. However, this has been demonstrated previously in numerous publications. They also show that there was minimal sharing between populations found in the intestine and those in the other organs. Upon IV and IP infection to bypass the intestinal bottleneck, they were able to demonstrate, using this library, that Salmonella can re-enter the intestine through two possible routes. One route is essentially the reverse of the path used to escape the gut, leading to a diverse intestinal population, while the other, through the bile, typically results in a clonal population.
Comments on latest version:
The authors have addressed my concerns.
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
Hotinger et al. explore the population dynamics of Salmonella enterica serovar Typhimurium in mice using genetically tagged bacteria. In addition to physiological observations, pathology assessments, and CFU measurements, the study emphasizes quantifying host bottleneck sizes that limit Salmonella colonization and dissemination. The authors also investigate the genetic distances between bacterial populations at various infection sites within the host.
Initially, the study confirms that pretreatment with the antibiotic streptomycin before inoculation via orogastric gavage increases the bacterial burden in the gastrointestinal (GI) tract, leading to more severe symptoms and heightened fecal shedding of bacteria. This pretreatment also significantly reduces between-animal variation in bacterial burden and fecal shedding. The authors then calculate founding population sizes across different organs, discovering a severe bottleneck in the intestine, with founding populations reduced by approximately 10^6-fold compared to the inoculum size. Streptomycin pretreatment increases the founding population size and bacterial replication in the GI tract. Moreover, by calculating genetic distances between populations, the authors demonstrate that, in untreated mice, Salmonella populations within the GI tract are genetically dissimilar, suggesting limited exchange between colonization sites. In contrast, streptomycin pretreatment reduces genetic distances, indicating increased exchange.
In extraintestinal organs, the bacterial burden is generally not substantially increased by streptomycin pretreatment, with significant differences observed only in the mesenteric lymph nodes and bile. However, the founding population sizes in these organs are increased. By comparing genetic distances between organs, the authors provide evidence that subpopulations colonizing extraintestinal organs diverge early after infection from those in the GI tract. This hypothesis is further tested by measuring bacterial burden and founding population sizes in the liver and GI tract at 5 and 120 hours post-infection. Additionally, they compare orogastric gavage infection with the less injurious method of infection via drinking, finding similar results for CFUs, founding populations, and genetic distances. These results argue against injuries during gavage as a route of direct infection.
To bypass bottlenecks associated with the GI tract, the authors compare intravenous (IV) and intraperitoneal (IP) routes of infection. They find approximately a 10-fold increase in bacterial burden and founding population size in immune-rich organs with IV/IP routes compared to orogastric gavage in streptomycin-pretreated animals. This difference is interpreted as a result of "extra steps required to reach systemic organs."
While IP and IV routes yield similar results in immune-rich organs, IP infections lead to higher bacterial burdens in nearby sites, such as the pancreas, adipose tissue, and intraperitoneal wash, as well as somewhat increased founding population sizes. The authors correlate these findings with the presence of white lesions in adipose tissue. Genetic distance comparisons reveal that, apart from the spleen and liver, IP infections lead to genetically distinct populations in infected organs, whereas IV infections generally result in higher genetic similarity.
Finally, the authors investigate GI tract reseeding, identifying two distinct routes. They observe that the GI tracts of IP/IV-infected mice are colonized either by a clonal or a diversely tagged bacterial population. In clonally reseeded animals, the genetic distance within the GI tract is very low (often zero) compared to the bile population, which is predominantly clonal or pauciclonal. These animals also display pathological signs, such as cloudy/hardened bile and increased bacterial burden, leading the authors to conclude that the GI tract was reseeded by bacteria from the gallbladder bile. In contrast, animals reseeded by more complex bacterial populations show that bile contributes only a minor fraction of the tags. Given the large founding population size in these animals' GI tracts, which is larger than in orogastrically infected animals, the authors suggest a highly permissive second reseeding route, largely independent of bile. They speculate that this route may involve a reversal of known mechanisms that the pathogen uses to escape from the intestine.
The manuscript presents a substantial body of work that offers a meticulously detailed understanding of the population dynamics of S. Typhimurium in mice. It quantifies the processes shaping the within-host dynamics of this pathogen and provides new insights into its spread, including previously unrecognized dissemination routes. The methodology is appropriate and carefully executed, and the manuscript is well-written, clearly presented, and concise. The authors' conclusions are well-supported by experimental results and thoroughly discussed. This work underscores the power of using highly diverse barcoded pathogens to uncover the within-host population dynamics of infections and will likely inspire further investigations into the molecular mechanisms underlying the bottlenecks and dissemination routes described here.
Major point:
Substantial conclusions in the manuscript rely on genetic distance measurements using the Cavalli-Sforza chord distance. However, it is unclear whether these genetic distance measurements are independent of the founding population size. I would anticipate that in populations with larger founding population sizes, where the relative tag frequencies are closer to those in the inoculum, the genetic distances would appear smaller compared to populations with smaller founding sizes independent of their actual relatedness. This potential dependency could have implications for the interpretation of findings, such as those in Figures 2B and 2D, where antibiotic-pretreated animals consistently exhibit higher founding population sizes and smaller genetic distances compared to untreated animals.
Thank you for raising this important point regarding reliance on chord distances for gauging genetic distance in barcoded populations. The reviewer is correct that samples with more founders will be more similar to the inoculum and thus inherently more similar to other samples that also have more founders. However, creation of libraries containing very large numbers of unique barcodes can often circumvent this issue. In this case, the effect size of chance-based similarity is not large enough to change the interpretation of the data in Figures 2B and 2D. In our case, the library has ~6x10^4 barcodes, and the founding populations in Figure 2B are ~10^3. Randomly resampling to create two populations of 10^3 cells from an initial population with 6x10^4 barcodes is expected to yield largely distinct populations with very little similarity. Thus, the similarity between streptomycin-treated populations in Figure 2D is likely the result of biology rather than chance.
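The resampling argument above can be checked with a short simulation. The following is a minimal sketch, not the authors' STAMPR code: it assumes every barcode is equally frequent in the inoculum, draws founders with replacement, and uses one common normalization of the Cavalli-Sforza chord distance, (2*sqrt(2)/pi) * sqrt(1 - sum_i sqrt(p_i * q_i)). The function names `sample_founders` and `chord_distance` are hypothetical.

```python
import math
import random
from collections import Counter


def sample_founders(n_barcodes, n_founders, rng):
    """Draw founders with replacement from a library of equally frequent barcodes.

    Assumption: uniform barcode frequencies; a real library is somewhat uneven.
    """
    return Counter(rng.randrange(n_barcodes) for _ in range(n_founders))


def chord_distance(counts_a, counts_b):
    """Cavalli-Sforza chord distance between two barcode count tables.

    Assumed normalization: (2*sqrt(2)/pi) * sqrt(1 - sum_i sqrt(p_i * q_i)),
    which is 0 for identical populations and about 0.9 for disjoint ones.
    """
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    shared = counts_a.keys() & counts_b.keys()
    overlap = sum(
        math.sqrt((counts_a[k] / total_a) * (counts_b[k] / total_b))
        for k in shared
    )
    overlap = min(overlap, 1.0)  # guard against floating-point overshoot
    return (2 * math.sqrt(2) / math.pi) * math.sqrt(1.0 - overlap)


rng = random.Random(0)
pop_a = sample_founders(60_000, 1_000, rng)  # ~6x10^4 barcodes, ~10^3 founders
pop_b = sample_founders(60_000, 1_000, rng)

shared_barcodes = len(pop_a.keys() & pop_b.keys())
gd = chord_distance(pop_a, pop_b)
print(f"shared barcodes: {shared_barcodes}, chord distance: {gd:.3f}")
```

Under these assumptions only a small number of barcodes are shared by chance, so the distance stays close to its maximum; low distances between samples of this size would therefore not be expected from sampling alone.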
Reviewer #2 (Public review):
In this paper, Hotinger et al. propose an improved barcoded library system, called STAMPR, to study Salmonella population dynamics during infection. Using this system, the authors demonstrate significant diversity in the colonization of different Salmonella clones (defined by the presence of different barcodes) not only across different organs (liver, spleen, adipose tissues, pancreas, and gall bladder) but also within different compartments of the same gastrointestinal tissue. Additionally, this system revealed that microbiota competition is the major bottleneck in Salmonella intestinal colonization, which can be mitigated by streptomycin treatment. However, this has been demonstrated previously in numerous publications. They also show that there was minimal sharing between populations found in the intestine and those in the other organs. Upon IV and IP infection to bypass the intestinal bottleneck, they were able to demonstrate, using this library, that Salmonella can re-enter the intestine through two possible routes. One route is essentially the reverse path used to escape the gut, leading to a diverse intestinal population, while the other, through the bile, typically results in a clonal population. Although the authors showed that the STAMPR pipeline improved the ability to identify founder populations and their diversity within the same animal during infections, some of the conclusions appear speculative and not fully supported.
(1) It's particularly interesting how the authors, using this system, demonstrate the dominant role of the microbiota bottleneck in Salmonella colonization and how it is widened by antibiotic treatment (Figure 1). Additionally, the ability to track Salmonella reseeding of the gut from other organs starting with IV and IP injections of the pathogen provides a new tool to study population dynamics (Figure 5). However, I don't think it is possible to argue that the proximal and distal small intestine, Peyer's patches (PPs), cecum, colon, and feces have different founder populations for reasons other than stochastic variations. All the barcoded Salmonella clones have the same fitness and the fact that some are found or expanded in one region of the gastrointestinal tract rather than another likely results from random chance (such as being forced into a specific region of the gut for physical or spatial reasons) and subsequent expansion, rather than any inherent biological cause. For example, some bacteria may randomly adhere to the mucus, some may swim toward the epithelial layer, while others remain in the lumen; all will proliferate in those respective sites. In this way, different founder populations arise based on random localization during movement through the gastrointestinal tract, which is an observation, but it doesn't significantly contribute to understanding pathogen colonization dynamics or pathogenesis. Therefore, I would suggest placing less emphasis on describing these differences or better discussing this aspect, especially in the context of the gastrointestinal tract.
Thank you for helping us identify this area for further clarification. We agree with the reviewer’s interpretation that seeding of proximal and distal small intestine, Peyer's patches (PPs), cecum, colon, and feces with different founder populations is likely caused by stochastic variations, consistent with separate stochastic bottlenecks to establishing these separate niches. To clarify this point we have modified the text in the results section, “Streptomycin treatment decreases compartmentalization of S. Typhimurium populations within the intestine”.
Change to text:
“Except for the cecum and colon, in untreated animals the S. Typhimurium populations in different regions of the intestine were dissimilar (Avg. GD ranged from 0.369 to 0.729, 2D left); i.e., there is little sharing between populations in the intestine. These data suggest that there are separate bottlenecks in different regions of the intestine that cause stochastic differences in the identity of the founders. Interestingly, when these founders replicate, they do not mix, remaining compartmentalized with little sharing between populations throughout the intestinal tract (i.e., barcodes found in one region are not in other regions, Figure S3). This was surprising as the luminal contents, an environment presumably conducive to bacterial movement, were not removed from these samples.”
In this section we are interested in the underlying biology that occurs after the initial bottleneck to preserve this compartmentalization during outgrowth of the intestinal population. In other words, what prevents these separate populations from merging (e.g., what prevents the bacteria replicating in the proximal small intestine from traveling through the intestine and establishing a niche in the distal small intestine)? While we do not explore the mechanisms of compartmentalization, we observe that it is disrupted by streptomycin pretreatment, suggesting a microbiota-dependent biological cause.
(2) I do think that STAMPR is useful for studying the dynamics of pathogen spread to organs where Salmonella likely resides intracellularly (Figure 3). The observation that the liver is colonized by an early intestinal population, which continues to proliferate at a steady rate throughout the infection, is very interesting and may be due to the unique nature of the organ compared to the mucosal environment. What is the biological relevance during infection? Do the authors observe the same pattern (Figures 3C and G) when normalizing the population data for the spleen and mesenteric lymph nodes (mLN)? If not, what do the authors think is driving this different distribution?
Thank you for raising this interesting point. These data indicate that the liver is seeded from the intestine early during infection. The timing and source of dissemination have relevance for understanding how host and pathogen variables control the spread of bacteria to systemic sites. For example, our conclusion (early dissemination) indicates that the immune state of a host at the time of exposure to a pathogen, and for a short period thereafter, is what primarily influences the process of dissemination, not the later response to an active infection.
We observe that the liver and mucosal environments within the intestine have similar colonization behaviors. Both niches are seeded early during infection, followed by steady pathogen proliferation and compartmentalization that apparently inhibits further seeding. This results in the identity of barcodes in the liver population remaining distinct from the intestinal populations, and the intestinal populations remaining distinct from each other.
We observe a similar pattern to the liver in the spleen and MLN (the barcodes in the spleen and MLN are dissimilar to the population in the intestine). To clarify this point, we have modified the text (below) and added this analysis as a supplemental figure (S4).
Change to text:
Genetic distance comparison of liver samples to other sites revealed that, regardless of streptomycin treatment, there was very little sharing of barcodes between the intestine and extraintestinal sites (Avg. GD >0.75, Figure 3C). Furthermore, the MLN and spleen populations also lacked similarity with the intestine (Figure S4). These analyses strongly support the idea that S. Typhimurium disseminates to extraintestinal organs relatively early following inoculation, before it establishes a replicative niche in the intestine.
(3) Figure 6: Could the bile pathology be due to increased general bacterial translocation rather than Salmonella colonization specifically? Did the authors check for the presence of other bacteria (potentially also proliferating) in the bile? Do the authors know whether Salmonella's metabolic activity in the bile could be responsible for gallbladder pathology?
The reviewer raises interesting points for future work. We did not check whether other bacterial species are translocating during S. Typhimurium infection. The relevance of Salmonella’s metabolic activity is also very interesting, and we hope these questions will be answered by future studies.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
Minor points:
(1) P. 9/10 "... the marked delay in shedding after IP and IV relative to orogastric inoculation suggest that the S. Typhimurium population encounters substantial bottleneck(s) on the route(s) from extraintestinal sites back to the intestine.": Can you conclude that from the data? It could also be possible that there is a biological mechanism (other than chance events) that delays the re-entry to the intestine.
We propose that the delay in shedding indicates additional obstacles that bacteria face when re-entering the intestine, and that there are likely biological mechanisms that cause this delay. However, these unknown mechanisms effectively act as additional bottlenecks by causing a stochastic loss of population diversity.
(2) P. 11 "...both organs would likely contain all 10 barcodes. In contrast, a library with 10,000 barcodes can be used to distinguish between a bottleneck resulting in Ns = 1,000 and Ns = 10,000, since these bottlenecks result in a different number of barcodes in output samples. Furthermore, high diversity libraries reduce the likelihood that two tissue samples share the same barcode(s) due to random chance, enabling more accurate quantification of bacterial dissemination.": I agree with the general analysis, but I find it misleading to talk about the presence of barcodes when the analyses in this manuscript are based on the much more powerful comparison of relative abundance of individual tags instead of their presence or absence.
The reviewer raises an excellent point, and the distinction between relative abundance versus presence/absence is discussed extensively in the original STAMPR manuscript. Although relative abundance is powerful, the primary metric used in this study (Ns) is calculated principally from the number of barcodes, corrected (via simulations) for the probability of observing the same barcode across distinct founders. Although this correction procedure does rely on barcode abundance, the primary driver of founding population quantification is the number of barcodes.
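The dependence of barcode number on library size can be illustrated with a simple occupancy calculation. This is only a sketch under a strong assumption (all barcodes equally frequent, founders drawn independently); it is not the Ns algorithm of the STAMPR pipeline, which additionally corrects for uneven barcode abundance via simulations. The helper name `expected_unique_barcodes` is hypothetical.

```python
def expected_unique_barcodes(n_barcodes, n_founders):
    """Expected number of distinct barcodes among n_founders random founders.

    Assumption: a uniform library, so each barcode is missed by a single
    founder with probability (1 - 1/n_barcodes).
    """
    return n_barcodes * (1.0 - (1.0 - 1.0 / n_barcodes) ** n_founders)


# Compare the 10-barcode and 10,000-barcode libraries from the quoted passage.
for library_size in (10, 10_000):
    for founders in (1_000, 10_000):
        unique = expected_unique_barcodes(library_size, founders)
        print(f"library={library_size:>6}  founders={founders:>6}  "
              f"expected unique barcodes={unique:8.1f}")
```

In this simplified model a 10-barcode library is saturated at both bottleneck sizes, whereas a 10,000-barcode library yields clearly different barcode counts for 1,000 and 10,000 founders, which is why barcode number can drive the founding population estimate when the library is diverse.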
(3) P.14 "the library in LB supplemented with SM was not significantly different than the parent strain" and Figure 2C: How was significance tested? How many times were the growth curves recorded? On my print-out, the red color has different shades for different growth curves.
Significance was tested with a Mann-Whitney and growth curves were performed 5 times. Growth curves are displayed with 50% opacity, and as a result multiple curves directly on top of each other appear darker. The legend to S2 has been modified accordingly.
(4) P.16: close bracket in the equation for FRD calculation.
Done
(5) Figure 2C "Average CFU per founder": I found the wording confusing at first as I thought you divided the average bacterial burden per organ by Ns, instead of averaging the CFU/Ns calculated for each mouse.
The wording has been clarified.
(6) Figure 3B: It would be helpful to include expected genetic distances in the schematic as it is difficult to infer the genetic distance when only two of three, respectively, different "barcode colors" are used. While I find the explanation in the main text intuitive, a graphical representation would have helped me.
Thank you for the suggestion. Unfortunately, using colors to represent barcodes is imperfect and limits the diversity that can be depicted. We have modified Figure 3B to further clarify.
(7) Figure 3C: Why do you compare the genetic distance to the liver, when you discuss the genetic distance of the intestinal population? Is it not possible that the intestinal populations are similar to the extraintestinal organs except the liver?
For clarity, we chose to highlight exclusively the liver. However, we observed a similar pattern to the liver in other extraintestinal organs. To clarify the generalizability of this point we have added a supplemental figure with comparisons to MLN and Spleen (Supplemental figure S4) as well as further text.
(8) Figure 3C & S5A: I found "+SM" and "+SM, Drinking" confusing and would have preferred "+SM, Gavage" and "+SM, Drinking" for clarity.
Done, thank you for the suggestion.
(9) Figure 3G&H: I find it worthy of discussion that the bacterial burden increases over time, while the founding population decreases. Does that not indicate that replication only occurs at specific sites leading to the amplification of only a few barcodes and thereby a larger change of the relative barcode abundance compared to the inoculum?
From 5h to 120h the size of the founding population decreases in multiple intestinal sites. This likely indicates that the impact of the initial bottleneck is still ongoing at 5h, although further temporal analysis would be required to define the exact timing of the bottleneck. Notably, the passage time through the mouse intestine is ~5h. Many of the founders observed at 5h could belong to a population that will never establish a replicative niche and, failing to colonize, will be shed in the feces, bottlenecking the population between 5h and 120h. To clarify this point we have added the following text:
Section “S. Typhimurium disseminates out of the intestine before establishing an intestinal replicative niche”.
“In contrast to the liver, there were more founders present in samples from the intestine (particularly in the colon) at 5 hours versus 120 hours (Figure 3H). These data likely indicate that many of the founders observed in the intestine at 5 hours are shed in the feces prior to establishing a replicative niche, and demonstrates that the forces restricting the S. Typhimurium population in the intestine act over a period of > 5 hours.”
(10) Figure S2A: I do not understand this figure. Why are there more than 70,000 tags listed? I was under the impression the barcode library in S. Typhimurium had 55,000 tags while only the plasmid pSM1 had more than 70,000 (but the plasmid should not be relevant here). Why are there distinct lines at approximately 10^-5 and a bit lower? I would have expected continuously distributed barcode frequencies.
During barcode analysis, each library is mapped to the total barcode list in the barcode donor pSM1, which contains ~70,000 barcodes. This enables consistent analysis across different bacterial libraries. The designation “barcode number” refers to the barcode number in pSM1, meaning many of the barcodes in the Salmonella library are at zero reads. This graph type was chosen to show there was no bias toward a particular barcode, however there is significant overlap of the points, making individual barcode frequencies difficult to see. We have changed the x-axis to state “pSM1 Barcode Number” and clarified in the figure legend.
Since the y-axis on these graphs is on a log10 scale, the lines represent barcodes with 1 read, 2 reads, 3 reads, etc. As the number of reads per barcode increases linearly, the space between them decreases on a logarithmic axis.
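The banding described here follows directly from the arithmetic of small integer read counts. The sketch below assumes a hypothetical sequencing depth of 100,000 reads per sample (not a value from the manuscript), chosen so that a single read corresponds to a frequency of 10^-5.

```python
import math


def log10_frequencies(total_reads, max_reads_per_barcode):
    """log10 frequency of barcodes observed with 1, 2, 3, ... reads."""
    return [
        math.log10(reads / total_reads)
        for reads in range(1, max_reads_per_barcode + 1)
    ]


TOTAL_READS = 100_000  # assumed depth, for illustration only
levels = log10_frequencies(TOTAL_READS, 5)

# Spacing between consecutive bands: log10((k + 1) / k), which shrinks with k.
gaps = [upper - lower for lower, upper in zip(levels, levels[1:])]

for reads, level in enumerate(levels, start=1):
    print(f"{reads} read(s): log10 frequency = {level:.3f}")
print("gaps between bands:", [round(g, 3) for g in gaps])
```

Because the gap between bands k and k+1 is log10((k+1)/k), the lowest bands appear as distinct lines while higher read counts merge into an apparently continuous cloud.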
(11) There are a few typos in the figure legends of the supplementary material. For example, Figure S2: S. Typhimurium is not italicized, and ~7x105 has no superscript. Figs. S4 and S5: in ", Open circles" the "O" is capitalized.
Typos have been corrected.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
The current human tissue-based study provides compelling evidence correlating hippocampal expressions of RNA guanine-rich G-quadruplexes with aging and with Alzheimer's Disease presence and severity. The results are fundamental and will rejuvenate our understanding of aging and AD's pathogenesis.
[Editors' note: this paper was reviewed by Review Commons.]
-
Reviewer #1 (Public review):
This is an interesting manuscript where the authors systematically measure rG4 levels in brain samples at different ages of patients affected by AD. To the best of my knowledge this is the first time that BG4 staining is used in this context and the authors provide compelling evidence to show an association between BG4 staining and age or AD progression, which interestingly indicates that such RNA structure might play a role in regulating protein homeostasis as previously speculated. The methods used and the results reported seem robust and reproducible.
-
Reviewer #2 (Public review):
RNA guanine-rich G-quadruplexes (rG4s) are non-canonical higher order nucleic acid structures that can form under physiological conditions. Interestingly, cellular stress is positively correlated with rG4 induction.
In this study, the authors examined human hippocampal postmortem tissue for the formation of rG4s in aging and Alzheimer's Disease (AD). rG4 immunostaining strongly increased in the hippocampus with both age and with AD severity. 21 cases were used in this study (age range 30-92).
This immunostaining co-localized with hyper-phosphorylated tau immunostaining in neurons. The BG4 staining levels were also impacted by APOE status. rG4 structure was previously found to drive tau aggregation. Based on these observations, the authors propose a model of neurodegeneration in which chronic rG4 formation drives proteostasis collapse.
This model is interesting, and would explain different observations (e.g., RNA is present in AD aggregates and rG4s can enhance protein oligomerization and tau aggregation).
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
This is an interesting manuscript where the authors systematically measure rG4 levels in brain samples at different ages of patients affected by AD. To the best of my knowledge this is the first time that BG4 staining is used in this context and the authors provide compelling evidence to show an association between BG4 staining and age or AD progression, which interestingly indicates that such RNA structure might play a role in regulating protein homeostasis as previously speculated. The methods used and the results reported seem robust and reproducible. There were two main things that needed addressing:
(1) Usually in BG4 staining experiments to ensure that the signal detected is genuinely due to rG4 an RNase treatment experiment is performed. This does not have to be extended to all the samples presented but having a couple of controls where the authors observe loss of staining upon RNase treatment will be key to ensure with confidence that rG4s are detected under the experimental conditions. This is particularly relevant for this brain tissue samples where BG4 staining has never been performed before.
(2) The authors have an association between rG4 formation and age/disease progression. They also observe distribution dependency of this, which is great. However, this is still an association which does not allow the model to be supported. This is not something that can be fixed with an easy experiment and it is what it is, but my point is that the narrative of the manuscript should be more fair and reflect the fact that, although interesting, what the authors are observing is a simple correlation. They should still go ahead and propose a model for it, but they should be more balanced in the conclusion and not imply that this evidence is sufficient to demonstrate the proposed model. It is absolutely fine to refer to the literature and comment on the fact that similar observations have been reported and this is in line with those, but still this is not an ultimate demonstration.
Comments on current version:
The authors have now addressed my concerns.
We thank the reviewer for their support!
Reviewer #2 (Public review):
RNA guanine-rich G-quadruplexes (rG4s) are non-canonical higher order nucleic acid structures that can form under physiological conditions. Interestingly, cellular stress is positively correlated with rG4 induction.
In this study, the authors examined human hippocampal postmortem tissue for the formation of rG4s in aging and Alzheimer's Disease (AD). rG4 immunostaining strongly increased in the hippocampus with both age and with AD severity. 21 cases were used in this study (age range 30-92).
This immunostaining co-localized with hyper-phosphorylated tau immunostaining in neurons. The BG4 staining levels were also impacted by APOE status. rG4 structure was previously found to drive tau aggregation. Based on these observations, the authors propose a model of neurodegeneration in which chronic rG4 formation drives proteostasis collapse.
This model is interesting, and would explain different observations (e.g., RNA is present in AD aggregates and rG4s can enhance protein oligomerization and tau aggregation).
Main issue from the previous round of review:
There is indeed a positive correlation between Braak stage severity and BG4 staining, but this correlation is relatively weak and borderline significant (R = 0.52, p value = 0.028). This is probably the main limitation of this study, which should be clearly acknowledged (together with a reminder that "correlation is not causality"). Related to this, there is no clear justification to exclude the four individuals in Fig 1d (without them R increases to 0.78). Please remove this statement. On the other hand, the difference based on APOE status is more striking.
Comments on current version:
The authors have made laudable efforts to address the criticisms I made in my evaluation of the original manuscript.
We thank the reviewer for their support!
Recommendations for the authors:
Reviewing Editor:
I would suggest two minor edits:
- The findings are correlative and descriptive, but the title implies functionality (A New Role for RNA G-quadruplexes in Aging and Alzheimer's Disease). I would suggest toning down this title.
- While I understand the limitations in performing additional biochemical experiments to validate the immunofluorescence study, I think this is worth mentioning as a limitation in the text.
We have made these two changes as requested, altering the title to remove the word Role that may imply more meaning than intended, and adding a line to the discussion on the need for future additional biochemical experiments.
Reviewer #1 (Recommendations for the authors):
Thanks for addressing the concerns raised.
We thank the reviewer for their support!
Reviewer #2 (Recommendations for the authors):
Minor point:
Related to the "correlation is not causality" remark I made in my evaluation of the original manuscript: the authors' answer is reasonable. Still, I would suggest to modify the abstract: "we propose a model of neurodegeneration in which chronic rG4 formation drives proteostasis collapse" => "we propose a model of neurodegeneration in which chronic rG4 formation is linked to proteostasis collapse"
All other remarks I made have been answered properly.
We thank the reviewer for their support! We have made the change exactly as requested by the reviewer.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This important study provides information on the TMEM16 family of membrane proteins, which play roles in lipid scrambling and ion transport. By simulating 27 structures representing five distinct family members, the authors captured hundreds of lipid scrambling events, offering insights into the mechanisms of lipid translocation and the specific protein regions involved in these processes. However, while the data on groove dilation is compelling, the evidence for outside-the-groove scramblase activity without experimental validation is inadequate and is based on a limited set of observed events.
-
Reviewer #1 (Public review):
Summary:
The manuscript investigates lipid scrambling mechanisms across TMEM16 family members using coarse-grained molecular dynamics (MD) simulations. While the study presents a statistically rigorous analysis of lipid scrambling events across multiple structures and conformations, several critical issues undermine its novelty, impact, and alignment with experimental observations.
Critical issues:
(1) Lack of Novelty: The phenomenon of lipid scrambling via an open hydrophilic groove is already well-established in the literature, including through atomistic MD simulations. The authors themselves acknowledge this fact in their introduction and discussion. By employing coarse-grained simulations, the study essentially reiterates previously known findings with limited additional mechanistic insight. The repeated observation of scrambling occurring predominantly via the groove does not offer significant advancement beyond prior work.
(2) Redundancy Across Systems: The manuscript explores multiple TMEM16 family members in activating and non-activating conformations, but the conclusions remain largely confirmatory. The extensive dataset generated through coarse-grained MD simulations primarily reinforces established mechanistic models rather than uncovering fundamentally new insights. The effort, while statistically robust, feels excessive given the incremental nature of the findings.
(3) Discrepancy with Experimental Observations: The use of coarse-grained simulations introduces inherent limitations in accurately representing lipid scrambling dynamics at the atomistic level. Experimental studies have highlighted nuances in lipid permeation that are not fully captured by coarse-grained models. This discrepancy raises questions about the biological relevance of the reported scrambling events, especially those occurring outside the canonical groove.
(4) Alternative Scrambling Sites: The manuscript reports scrambling events at the dimer-dimer interface as a novel mechanism. While this observation is intriguing, it is not explored in sufficient detail to establish its functional significance. Furthermore, the low frequency of these events (relative to groove-mediated scrambling) suggests they may be artifacts of the simulation model rather than biologically meaningful pathways.
Conclusion:
Overall, while the study is technically sound and presents a large dataset of lipid scrambling events across multiple TMEM16 structures, it falls short in terms of novelty and mechanistic advancement. The findings are largely confirmatory and do not bridge the gap between coarse-grained simulations and experimental observations. Future efforts should focus on resolving these limitations, possibly through atomistic simulations or experimental validation of the alternative scrambling pathways.
-
Reviewer #2 (Public review):
Summary:
Stephens et al. present a comprehensive study of TMEM16-members via coarse-grained MD simulations (CGMD). They particularly focus on the scramblase ability of these proteins and aim to characterize the "energetics of scrambling". Through their simulations, the authors interestingly relate protein conformational states to the membrane's thickness and link those to the scrambling ability of TMEM members, measured as the trespassing tendency of lipids across leaflets. They validate their simulation with a direct qualitative comparison with Cryo-EM maps.
Strengths:
The study demonstrates an efficient use of CGMD simulations to explore lipid scrambling across various TMEM16 family members. By leveraging this approach, the authors are able to bypass some of the sampling limitations inherent in all-atom simulations, providing a more comprehensive and high-throughput analysis of lipid scrambling. Their comparison of different protein conformations, including open and closed groove states, presents a detailed exploration of how structural features influence scrambling activity, adding significant value to the field. A key contribution of this study is the finding that groove dilation plays a central role in lipid scrambling. The authors observe that for scrambling-competent TMEM16 structures, there is substantial membrane thinning and groove widening. The open Ca2+-bound nhTMEM16 structure (PDB ID 4WIS) was identified as the fastest scrambler in their simulations, with scrambling rates as high as 24.4 ± 5.2 events per μs. This structure also shows significant membrane thinning (up to 18 Å), which supports the hypothesis that groove dilation lowers the energetic barrier for lipid translocation, facilitating scrambling.
The study also establishes a correlation between structural features and scrambling competence, though analyses often lack statistical robustness and quantitative comparisons. The simulations differentiate between open and closed conformations of TMEM16 structures, with open-groove structures exhibiting increased scrambling activity, while closed-groove structures do not. This finding aligns with previous research suggesting that the structural dynamics of the groove are critical for scrambling. Furthermore, the authors explore how the physical dimensions of the groove qualitatively correlate with observed scrambling rates. For example, TMEM16K induces increased membrane thinning in its open form, suggesting that membrane properties, along with structural features, play a role in modulating scrambling activity.
Another significant finding is the concept of "out-of-the-groove" scrambling, where lipid translocation occurs outside the protein's groove. This observation introduces the possibility of alternate scrambling mechanisms that do not follow the traditional "credit-card model" of groove-mediated lipid scrambling. In their simulations, the authors note that these out-of-the-groove events predominantly occur at the dimer interface between TM3 and TM10, especially in mammalian TMEM16 structures. While these events were not observed in fungal TMEM16s, they may provide insight into Ca2+-independent scrambling mechanisms, as they do not require groove opening.
Weaknesses:
A significant challenge of the study is the discrepancy between the scrambling rates observed in CGMD simulations and those reported experimentally. Despite the authors' claim that the rates are in line with experiment, the observed differences can mean large energetic discrepancies in describing scrambling (a barrier difference larger than 1 kT in reality). For instance, the authors report scrambling rates of 10.7 events per μs for TMEM16F and 24.4 events per μs for nhTMEM16, which are several orders of magnitude faster than experimental rates. While the authors suggest that this discrepancy could be due to the Martini 3 force field's faster diffusion dynamics, this explanation does not fully account for the large difference in rates. A more thorough discussion of how the choice of force field and simulation parameters influences the results, and how these discrepancies can be reconciled with experimental data, would strengthen the conclusions. Likewise, rate calculations in the study are based on 10 μs simulations, while experimental scrambling occurs over seconds. This timescale discrepancy limits the study's accuracy, as the simulations may not capture rare or slow scrambling events that are observed experimentally and therefore might underestimate the kinetics of scrambling. It is, however, important to recognize that it is hard (borderline unachievable) to pinpoint reasonable kinetics for systems like this using the currently available computational power and force field accuracy. The faster diffusion in simulations may lead to overestimated scrambling rates, making the simulation results less comparable to real-world observations. I would therefore read the findings qualitatively rather than quantitatively. An interesting observation is the asymmetry observed in the scrambling rates of the two monomers.
Since MARTINI is known to be limited in correctly sampling protein dynamics, the authors - in order to preserve the fold - have applied a strong (500 kJ mol-1 nm-2) elastic network. However, I am wondering how the ENM applies across the dimer and if any asymmetry can be noticed in the application of restraints for each monomer and at the dimer interface. How can this have potentially biased the asymmetry in the scrambling rates observed between the monomers? Is this artificially obtained from restraining the initial structure, or is the asymmetry somehow gatekeeping the scrambling mechanism to occur majorly across a single monomer? Answering this question would have far-reaching implications to better describe the mechanism of scrambling.
Notably, the manuscript does not explore the impact of membrane composition on scrambling rates. While the authors use a specific lipid composition (DOPC) in their simulations, they acknowledge that membrane composition can influence scrambling activity. However, the study does not explore how different lipids, membrane environments, or varying membrane curvature and tension could alter scrambling behaviour. I appreciate that this might have been beyond the scope of this particular paper and that the authors plan to pursue these questions further, as this work sets a strong protocol for such studies. Placing scrambling in the context of membrane composition is particularly relevant since the authors note that TMEM16K's scrambling rate increases tenfold in thinner membranes, suggesting that lipid-specific or membrane-thickness-dependent effects could play a role.
-
Reviewer #3 (Public review):
Summary:
The paper investigates the TMEM16 family of membrane proteins, which play roles in lipid scrambling and ion transport. A total of 27 experimental structures from five TMEM16 family members were analyzed, including mammalian and fungal homologs (e.g., TMEM16A, TMEM16F, TMEM16K, nhTMEM16, afTMEM16). The selected structures covered both Ca²⁺-bound (open) and Ca²⁺-free (closed) states to allow comparison of conformations, and were preprocessed (e.g., modeling missing loops) and equilibrated. Coarse-grained simulations were performed in DOPC membranes for 10 microseconds to capture scrambling events. These events were identified by tracking lipids transitioning between the two membrane leaflets, and the authors analyzed how scrambling rates correlate with structural properties such as groove dilation and membrane thinning. They report 700 scrambling events across structures, and Figure 2 shows that open structures have higher activity, as expected. The authors also address whether scrambling requires open grooves; this and other mechanisms around scrambling are somewhat controversial in the field.
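To make the event-counting idea in this summary concrete, here is a minimal sketch of how leaflet-to-leaflet transitions could be counted from headgroup positions. It is an illustration under stated assumptions, not the authors' analysis pipeline: the function name `count_scrambling_events`, the array layout (frames by lipids, z measured from the bilayer midplane in nm), and the 1.0 nm assignment cutoff are all hypothetical.

```python
import numpy as np

def count_scrambling_events(z, cutoff=1.0):
    """Count leaflet-to-leaflet transitions for each lipid.

    z      : array of shape (n_frames, n_lipids); headgroup z coordinate
             relative to the bilayer midplane (assumed layout, in nm)
    cutoff : |z| beyond which a lipid is assigned to a leaflet
             (hypothetical value; lipids closer to the midplane are undecided)
    """
    z = np.asarray(z, dtype=float)
    # +1 = upper leaflet, -1 = lower leaflet, 0 = undecided (near midplane)
    state = np.where(z > cutoff, 1, np.where(z < -cutoff, -1, 0))
    events = np.zeros(z.shape[1], dtype=int)
    last = state[0].copy()  # last confirmed leaflet of each lipid
    for frame in state[1:]:
        decided = frame != 0
        # A transition counts only when a lipid is confirmed in the opposite leaflet
        flipped = decided & (last != 0) & (frame != last)
        events += flipped
        last = np.where(decided, frame, last)
    return events

# Toy trajectory: 5 frames, 3 lipids
z_toy = np.array([
    [ 2.0, 2.0, -2.0],
    [ 0.5, 0.5,  2.0],
    [-0.5, 2.0, -2.0],
    [-2.0, 0.5,  2.0],
    [-2.0, 2.0, -2.0],
])
events = count_scrambling_events(z_toy)
print(events.tolist())
```

The undecided zone around the midplane acts as a hysteresis band, so a lipid that dips toward the bilayer center and returns to its original leaflet (the second lipid in the toy data) is not counted. Dividing the summed events by the simulated time would give a rate in events per μs.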
Strengths:
The strength of this study emerges from a comparative analysis of multiple structural starting points and understanding global/local motions of the protein with respect to lipid movement. Although the protein is well-studied, both experimentally and computationally, the understanding of conformational events in different family members, especially membrane thickness less compared to fungal scramblases offers good insights.
Weaknesses:
A weakness of the work is that it does not fully reconcile its results with the experimental evidence of Ca²⁺-independent scrambling rates observed in prior studies, although this is challenging to do using coarse-grained molecular simulations. Previous reports have identified lipid crossing, packing defects, and other associated events, so it is difficult to place this paper in that context. However, the absence of validation leaves certain claims, like alternative scrambling pathways, speculative.
-
Author response:
Reviewer #1 (Public review):
Summary:
The manuscript investigates lipid scrambling mechanisms across TMEM16 family members using coarse-grained molecular dynamics (MD) simulations. While the study presents a statistically rigorous analysis of lipid scrambling events across multiple structures and conformations, several critical issues undermine its novelty, impact, and alignment with experimental observations.
Critical issues:
(1) Lack of Novelty:
The phenomenon of lipid scrambling via an open hydrophilic groove is already well-established in the literature, including through atomistic MD simulations. The authors themselves acknowledge this fact in their introduction and discussion. By employing coarse-grained simulations, the study essentially reiterates previously known findings with limited additional mechanistic insight. The repeated observation of scrambling occurring predominantly via the groove does not offer significant advancement beyond prior work.
We agree with the reviewer’s statement regarding the lack of novelty when it comes to our observations of scrambling in the groove of open Ca<sup>2+</sup>-bound TMEM16 structures. However, we feel that the inclusion of closed structures in this study, which attempts to address the as yet unanswered question of how scrambling by TMEM16s occurs in the absence of Ca<sup>2+</sup>, offers new observations for the field. In our study we specifically address to what extent the induced membrane deformation, which has been theorized to help lipids cross the bilayer, especially in the absence of Ca<sup>2+</sup>, contributes to the rate of scrambling (see references 36, 59, and 66). There are also several TMEM16F structures solved under activating conditions (bound to Ca<sup>2+</sup> and in the presence of PIP2) which feature structural rearrangements to TM6 that may be indicative of an open state (PDB 6P48) and had not been tested in simulations. We show that these structures do not scramble and thereby present evidence against an out-of-the-groove scrambling mechanism for these states. Although we find a handful of examples of lipids being scrambled by Ca<sup>2+</sup>-free structures of TMEM16 scramblases, none of our simulations suggest that these events are related to the degree of deformation.
(2) Redundancy Across Systems:
The manuscript explores multiple TMEM16 family members in activating and non-activating conformations, but the conclusions remain largely confirmatory. The extensive dataset generated through coarse-grained MD simulations primarily reinforces established mechanistic models rather than uncovering fundamentally new insights. The effort, while statistically robust, feels excessive given the incremental nature of the findings.
Again, we agree with the reviewer’s statement that our results largely confirm those published by other groups and our own. We think, however, that there is value in comparing the scrambling competence of these TMEM16 structures in a consistent manner in a single study, to reduce inconsistencies that may be introduced by the different simulation methods, parameters, and environmental variables (such as lipid composition) used in other published studies of single family members. The consistency across our simulations and the high number of observed scrambling events have allowed us to confirm that the mechanism of scrambling is shared by multiple family members and relies most obviously on groove dilation.
(3) Discrepancy with Experimental Observations:
The use of coarse-grained simulations introduces inherent limitations in accurately representing lipid scrambling dynamics at the atomistic level. Experimental studies have highlighted nuances in lipid permeation that are not fully captured by coarse-grained models. This discrepancy raises questions about the biological relevance of the reported scrambling events, especially those occurring outside the canonical groove.
We thank the reviewer for bringing up the possible inaccuracies introduced by coarse graining our simulations. This is also a concern for us, and we address this issue extensively in our discussion. As the reviewer pointed out above, our CG simulations have largely confirmed existing evidence in the field which we think speaks well to the transferability of observations from atomistic simulations to the coarse-grained level of detail. We have made both qualitative and quantitative comparisons between atomistic and coarse-grained simulations of nhTMEM16 and TMEM16F (Figure 1, Figure 4-figure supplement 1, Figure 4-figure supplement 5) showing the two methods give similar answers for where lipids interact with the protein, including outside of the canonical groove. We do not dispute the possible discrepancy between our simulations and experiment, but our goal is to share new nuanced ideas for the predicted TMEM16 scrambling mechanism that we hope will be tested by future experimental studies.
(4) Alternative Scrambling Sites:
The manuscript reports scrambling events at the dimer-dimer interface as a novel mechanism. While this observation is intriguing, it is not explored in sufficient detail to establish its functional significance. Furthermore, the low frequency of these events (relative to groove-mediated scrambling) suggests they may be artifacts of the simulation model rather than biologically meaningful pathways.
We agree with the reviewer that our observed number of scrambling events in the dimer interface is too low to present it as strong evidence for it being the alternative mechanism for Ca<sup>2+</sup>-independent scrambling. This will require additional experiments and computational studies which we plan to do in future research. However, we are less certain that these are artifacts of the coarse-grained simulation system as we observed a similar event in an atomistic simulation of TMEM16F.
Conclusion:
Overall, while the study is technically sound and presents a large dataset of lipid scrambling events across multiple TMEM16 structures, it falls short in terms of novelty and mechanistic advancement. The findings are largely confirmatory and do not bridge the gap between coarse-grained simulations and experimental observations. Future efforts should focus on resolving these limitations, possibly through atomistic simulations or experimental validation of the alternative scrambling pathways.
Reviewer #2 (Public review):
Summary:
Stephens et al. present a comprehensive study of TMEM16-members via coarse-grained MD simulations (CGMD). They particularly focus on the scramblase ability of these proteins and aim to characterize the "energetics of scrambling". Through their simulations, the authors interestingly relate protein conformational states to the membrane's thickness and link those to the scrambling ability of TMEM members, measured as the trespassing tendency of lipids across leaflets. They validate their simulation with a direct qualitative comparison with Cryo-EM maps.
Strengths:
The study demonstrates an efficient use of CGMD simulations to explore lipid scrambling across various TMEM16 family members. By leveraging this approach, the authors are able to bypass some of the sampling limitations inherent in all-atom simulations, providing a more comprehensive and high-throughput analysis of lipid scrambling. Their comparison of different protein conformations, including open and closed groove states, presents a detailed exploration of how structural features influence scrambling activity, adding significant value to the field. A key contribution of this study is the finding that groove dilation plays a central role in lipid scrambling. The authors observe that for scrambling-competent TMEM16 structures, there is substantial membrane thinning and groove widening. The open Ca<sup>2+</sup>-bound nhTMEM16 structure (PDB ID 4WIS) was identified as the fastest scrambler in their simulations, with scrambling rates as high as 24.4 ± 5.2 events per μs. This structure also shows significant membrane thinning (up to 18 Å), which supports the hypothesis that groove dilation lowers the energetic barrier for lipid translocation, facilitating scrambling.
The study also establishes a correlation between structural features and scrambling competence, though analyses often lack statistical robustness and quantitative comparisons. The simulations differentiate between open and closed conformations of TMEM16 structures, with open-groove structures exhibiting increased scrambling activity, while closed-groove structures do not. This finding aligns with previous research suggesting that the structural dynamics of the groove are critical for scrambling. Furthermore, the authors explore how the physical dimensions of the groove qualitatively correlate with observed scrambling rates. For example, TMEM16K induces increased membrane thinning in its open form, suggesting that membrane properties, along with structural features, play a role in modulating scrambling activity.
Another significant finding is the concept of "out-of-the-groove" scrambling, where lipid translocation occurs outside the protein's groove. This observation introduces the possibility of alternate scrambling mechanisms that do not follow the traditional "credit-card model" of groove-mediated lipid scrambling. In their simulations, the authors note that these out-of-the-groove events predominantly occur at the dimer interface between TM3 and TM10, especially in mammalian TMEM16 structures. While these events were not observed in fungal TMEM16s, they may provide insight into Ca<sup>2+</sup>-independent scrambling mechanisms, as they do not require groove opening.
Weaknesses:
A significant challenge of the study is the discrepancy between the scrambling rates observed in CGMD simulations and those reported experimentally. Despite the authors' claim that the rates are in line with experiment, the observed differences can mean large energetic discrepancies in describing scrambling (a barrier difference larger than 1 kT in reality). For instance, the authors report scrambling rates of 10.7 events per μs for TMEM16F and 24.4 events per μs for nhTMEM16, which are several orders of magnitude faster than experimental rates. While the authors suggest that this discrepancy could be due to the Martini 3 force field's faster diffusion dynamics, this explanation does not fully account for the large difference in rates. A more thorough discussion of how the choice of force field and simulation parameters influences the results, and how these discrepancies can be reconciled with experimental data, would strengthen the conclusions. Likewise, rate calculations in the study are based on 10 μs simulations, while experimental scrambling occurs over seconds. This timescale discrepancy limits the study's accuracy, as the simulations may not capture rare or slow scrambling events that are observed experimentally and therefore might underestimate the kinetics of scrambling. It is, however, important to recognize that it is hard (borderline unachievable) to pinpoint reasonable kinetics for systems like this using the currently available computational power and force field accuracy. The faster diffusion in simulations may lead to overestimated scrambling rates, making the simulation results less comparable to real-world observations. I would therefore read the findings qualitatively rather than quantitatively. An interesting observation is the asymmetry observed in the scrambling rates of the two monomers.
Since MARTINI is known to be limited in correctly sampling protein dynamics, the authors - in order to preserve the fold - have applied a strong (500 kJ mol-1 nm-2) elastic network. However, I am wondering how the ENM applies across the dimer and if any asymmetry can be noticed in the application of restraints for each monomer and at the dimer interface. How can this have potentially biased the asymmetry in the scrambling rates observed between the monomers? Is this artificially obtained from restraining the initial structure, or is the asymmetry somehow gatekeeping the scrambling mechanism to occur majorly across a single monomer? Answering this question would have far-reaching implications to better describe the mechanism of scrambling.
The main aim of our computational survey was to directly compare all relevant published TMEM16 structures in both open and closed states using the Martini 3 CGMD force field. Our standardized simulation and analysis protocol allowed us to quantitatively compare scrambling rates across the TMEM16 family, something that has never been done before. We do acknowledge that direct comparison between simulated and experimental scrambling rates is complicated and is best interpreted qualitatively. In line with other reports (e.g., Li et al, PNAS 2024), lipid scrambling in CGMD is 2-3 orders of magnitude faster than typical experimental findings. In the CG simulation field, these increased dynamics due to the smoother energy landscape are a well-known phenomenon. In our view, this is a valuable trade-off for being able to capture statistically robust scrambling dynamics and gain mechanistic understanding in the first place, since these are currently challenging to obtain otherwise. For example, with all-atom MD it would have been near-impossible to conclude that groove openness and high scrambling rates are closely related, simply because one would only measure a handful of scrambling events in (at most) a handful of structures.
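As a rough illustration of how the rate gap discussed here maps onto the energetic discrepancy raised by the reviewer, the sketch below converts a rate ratio into an apparent barrier difference in units of kT. It assumes a simple Arrhenius-type picture in which the simulated and experimental rates share the same prefactor, so that the barrier difference is ln(k_sim / k_exp); this assumption and the function name `apparent_barrier_shift_kT` are illustrative and are not taken from the manuscript.

```python
import math

def apparent_barrier_shift_kT(rate_sim, rate_exp):
    """Barrier difference, in units of kT, implied by a rate ratio.

    Assumes k = A * exp(-dG / kT) with the same prefactor A for both rates,
    so dG_exp - dG_sim = kT * ln(rate_sim / rate_exp).
    """
    if rate_sim <= 0 or rate_exp <= 0:
        raise ValueError("rates must be positive")
    return math.log(rate_sim / rate_exp)

# Simulated scrambling 2 or 3 orders of magnitude faster than experiment
for orders in (2, 3):
    shift = apparent_barrier_shift_kT(10.0 ** orders, 1.0)
    print(f"{orders} orders of magnitude faster: about {shift:.1f} kT lower apparent barrier")
```

Under this simplified picture, a 2-3 order of magnitude speed-up corresponds to an apparent barrier roughly 4.6 to 6.9 kT lower than in experiment, which is one way to read the recommendation to interpret the simulated rates qualitatively.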
Considering the elastic network: the reviewer is correct in that the elastic network restrains the overall structure to the experimental conformation. This is necessary because the Martini 3 force field does not accurately model changes in secondary (and tertiary) structure. In fact, by retaining the structural information from the experimental structures, we argue that the elastic network helped us arrive at the conclusion that groove openness is the major contributing factor in determining a protein’s scrambling rate. This is best exemplified by the asymmetric X-ray structure of TMEM16K (5OC9), in which the groove of one subunit is more dilated than the other. In our simulation, this information was stored in the elastic network, yielding a 4x higher rate in the open groove than in the closed groove, within the same trajectory.
Notably, the manuscript does not explore the impact of membrane composition on scrambling rates. While the authors use a specific lipid composition (DOPC) in their simulations, they acknowledge that membrane composition can influence scrambling activity. However, the study does not explore how different lipids, membrane environments, or varying membrane curvature and tension could alter scrambling behaviour. I appreciate that this might have been beyond the scope of this particular paper and that the authors plan to pursue these questions further, as this work sets a strong protocol for such studies. Placing scrambling in the context of membrane composition is particularly relevant since the authors note that TMEM16K's scrambling rate increases tenfold in thinner membranes, suggesting that lipid-specific or membrane-thickness-dependent effects could play a role.
Considering different membrane compositions: for this study, we chose to keep the membranes as simple as possible. We opted for pure DOPC membranes because DOPC (1) has negligible intrinsic curvature, (2) forms fluid membranes, and (3) was used previously by others (Li et al, PNAS 2024). As mentioned by the reviewer, we believe our current study defines a good standardized protocol and solid baseline for future efforts looking into the additional effects of membrane composition, tension, and curvature that could all affect TMEM16-mediated lipid scrambling.
Reviewer #3 (Public review):
Strengths:
The strength of this study emerges from a comparative analysis of multiple structural starting points and understanding global/local motions of the protein with respect to lipid movement. Although the protein is well-studied, both experimentally and computationally, the understanding of conformational events in different family members, especially membrane thickness less compared to fungal scramblases offers good insights.
We appreciate the reviewer recognizing the value of the comparative study. In addition to valuable insights from previous experimental and computational work, we hope to put forward a unifying framework that highlights various TMEM16 structural features and membrane properties that underlie scrambling function.
Weaknesses:
A weakness of the work is that it does not fully reconcile its results with the experimental evidence of Ca²⁺-independent scrambling rates observed in prior studies, although this is challenging to do using coarse-grained molecular simulations. Previous reports have identified lipid crossing, packing defects, and other associated events, so it is difficult to place this paper in that context. However, the absence of validation leaves certain claims, like alternative scrambling pathways, speculative.
It is generally difficult to quantitatively compare bulk measurements of scrambling phenomena with simulation results. The advantage of simulations is to directly observe the transient scrambling events at a spatial and temporal resolution that is currently unattainable for experiments. The current experimental evidence for the precise mechanism of Ca<sup>2+</sup>-independent scrambling is still under debate. We therefore hope to leverage the strength of MD and statistical rigor of coarse-grained simulations to generate testable hypotheses for further structural, biochemical, and computational studies.
-
-
www.biorxiv.org
-
eLife Assessment
This study presents valuable data on the increase in individual differences in functional connectivity with the auditory cortex in individuals with congenital/early-onset hearing loss compared to individuals with normal hearing. The evidence supporting the study's claims is convincing, although additional work using resting-state functional connectivity in the future could further strengthen the results. The work will be of interest to neuroscientists working on brain plasticity and may have implications for the design of interventions and compensatory strategies.
-
Reviewer #1 (Public review):
This experiment sought to determine what effect congenital/early-onset hearing loss (and associated delay in language onset) has on the degree of inter-individual variability in functional connectivity to the auditory cortex. Looking at differences in variability rather than group differences in mean connectivity itself represents an interesting addition to the existing literature. The sample of deaf individuals was large, and quite homogeneous in terms of age of hearing loss onset, which are considerable strengths of the work. The experiment appears well conducted and the results are certainly of interest.
Comment from Reviewing Editor: In the revised manuscript, the authors have addressed all concerns previously identified by reviewer 1.
-
Reviewer #3 (Public review):
Summary:
This study focuses on changes in brain organization associated with congenital deafness. The authors investigate differences in functional connectivity (FC) and differences in the variability of FC. By comparing congenitally deaf individuals to individuals with normal hearing, and by further separating congenitally deaf individuals into groups of early and late signers, the authors can distinguish between changes in FC due to auditory deprivation and changes in FC due to late language acquisition. They find larger FC variability in deaf than normal-hearing individuals in temporal, frontal, parietal, and midline brain structures, and that FC variability is largely driven by auditory deprivation. They suggest that the regions that show a greater FC difference between groups also show greater FC variability.
Strengths:
The manuscript is well-written, and the methods are clearly described and appropriate. Including the three different groups enables the critical contrasts distinguishing between different causes of FC variability changes. The results are interesting and novel.
Weaknesses:
Analyses were conducted for task-based data rather than resting-state data. The authors report behavioral differences between groups and include behavioral performance as a nuisance regressor in their analysis. This is a good approach to account for behavioral task differences, given the data. Nevertheless, additional work using resting-state functional connectivity could remove the potential confound fully.
Comment from Reviewing Editor: In the revised manuscript, the authors have addressed all concerns previously identified by reviewer 3, and the eLife assessment statement reflects the point by reviewer 3 that using resting-state functional connectivity in the future could further strengthen the results.
-
Author response:
The following is the authors’ response to the previous reviews.
Public Reviews:
Reviewer #1 (Public review):
This experiment sought to determine what effect congenital/early-onset hearing loss (and associated delay in language onset) has on the degree of inter-individual variability in functional connectivity to the auditory cortex. Looking at differences in variability rather than group differences in mean connectivity itself represents an interesting addition to the existing literature. The sample of deaf individuals was large, and quite homogeneous in terms of age of hearing loss onset, which are considerable strengths of the work. The experiment appears well conducted and the results are certainly of interest.
R: Thank you for your positive and thoughtful feedback.
Reviewer #3 (Public review):
Summary:
This study focuses on changes in brain organization associated with congenital deafness. The authors investigate differences in functional connectivity (FC) and differences in the variability of FC. By comparing congenitally deaf individuals to individuals with normal hearing, and by further separating congenitally deaf individuals into groups of early and late signers, the authors can distinguish between changes in FC due to auditory deprivation and changes in FC due to late language acquisition. They find larger FC variability in deaf than normal-hearing individuals in temporal, frontal, parietal, and midline brain structures, and that FC variability is largely driven by auditory deprivation. They suggest that the regions that show a greater FC difference between groups also show greater FC variability.
Strengths:
The manuscript is well-written, and the methods are clearly described and appropriate. Including the three different groups enables the critical contrasts distinguishing between different causes of FC variability changes. The results are interesting and novel.
Weaknesses:
Analyses were conducted for task-based data rather than resting-state data. The authors report behavioral differences between groups and include behavioral performance as a nuisance regressor in their analysis. This is a good approach to account for behavioral task differences, given the data. Nevertheless, additional work using resting-state functional connectivity could remove the potential confound fully.
The authors have addressed my concerns well.
Thank you for your thoughtful feedback. We appreciate your acknowledgment of the strengths of our study and the approaches taken to address potential confounds. As noted, we discuss the limitation of not including resting-state data in the manuscript, and we agree that this represents an important avenue for future research. We hope to address this question in future studies.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This fundamental study provides a critical challenge to a great many studies of the neural correlates of consciousness that were based on post hoc sorting of reported awareness experience. The evidence supporting this criticism is compelling, based on simulations and decoding analysis of EEG data. The results will be of interest not only to psychologists and neuroscientists but also to philosophers who work on addressing mind-body relationships.
-
Reviewer #1 (Public review):
The study aimed to investigate the significant impact of criterion placement on the validity of neural measures of consciousness, examining how different standards for classifying a stimulus as 'seen' or 'unseen' can influence the interpretation of neural data. They conducted simulations and EEG experiments to demonstrate that the Perceptual Awareness Scale, a widely used tool in consciousness research, may not effectively mitigate criterion-related confounds, suggesting that even with the PAS, neural measures can be compromised by how criteria are set. Their study challenged existing paradigms by showing that the construct validity of neural measures of conscious and unconscious processing is threatened by criterion placement, and they provided practical recommendations for improving experimental designs in the field. The authors' work contributes to a deeper understanding of the nature of conscious and unconscious processing and addresses methodological concerns by exploring the pervasive influence of criterion placement on neural measures of consciousness and discussing alternative paradigms that might offer solutions to the criterion problem.
The study effectively demonstrates that the placement of criteria for determining whether a stimulus is 'seen' or 'unseen' significantly impacts the validity of neural measures of consciousness. The authors found that conservative criteria tend to inflate effect sizes, while liberal criteria reduce them, leading to potentially misleading conclusions about conscious and unconscious processing. The authors employed robust simulations and EEG experiments to demonstrate the effects of criterion placement, ensuring that the findings are well-supported by empirical evidence. The results from both experiments confirm the predicted confounding effects of criterion placement on neural measures of unconscious and conscious processing.
The results are consistent with their hypotheses and contribute meaningfully to the field of consciousness research.
-
Reviewer #2 (Public review):
Summary:
The study investigates the potential influence of the response criterion on neural decoding accuracy for conscious and unconscious processing, using either simulated data or a reanalysis of experimental data with post hoc sorting.
Strengths:
When comparing the neural decoding performance of Target versus NonTarget with or without post-hoc sorting based on subject reports, it is evident that response criterion can influence the results. This was observed in simulated data as well as in two experiments that manipulated subject response criterion to be either more liberal or more conservative. One experiment involved a two-level response (seen vs unseen), while the other included a more detailed four-level response (ranging from 0 for no experience to 3 for a clear experience). The findings consistently indicated that adopting a more conservative response criterion could enhance neural decoding performance, whether in conscious or unconscious states, depending on the sensitivity or overall response threshold.
Weaknesses:
(1) In the realm of research methodology, conducting post-hoc sorting based on subject reports raises an issue. This operation leads to an imbalance in the number of trials between the two conditions (Target and NonTarget) during the decoding process. Such trial number disparity introduces bias during decoding, likely contributing to fluctuations in neural decoding performance. This potential confounding factor significantly impacts the interpretation of research findings. The trial number imbalance may cause models to exhibit a bias towards the category with more trials during the learning process, leading to misjudgments of neural signal differences between the two conditions and failing to accurately reflect the distinctions in brain neural activity between target and non-target states. Therefore, it is recommended that the authors extensively discuss this confounding factor in their paper. They should analyze in detail how this factor could influence the interpretation of results, such as potentially exaggerating or diminishing certain effects, and whether measures are necessary to correct the bias induced by this imbalance to ensure the reliability and validity of the research conclusions.
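The imbalance concern raised in this point can be made concrete with a small sketch. It is not the authors' decoding pipeline; the trial counts (80 vs 20) and the helper names `accuracy` and `balanced_accuracy` are assumptions chosen for illustration. It shows why plain accuracy can look well above chance under trial imbalance even when a classifier has learned nothing, and why a class-balanced metric does not share that problem.

```python
def accuracy(y_true, y_pred):
    """Fraction of trials labelled correctly (sensitive to class imbalance)."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        recalls.append(sum(y_pred[i] == cls for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical imbalanced trial counts after post hoc sorting.
labels = ["target"] * 80 + ["nontarget"] * 20

# A degenerate "classifier" that ignores the data and always says "target".
always_target = ["target"] * len(labels)

print("accuracy:", accuracy(labels, always_target))
print("balanced accuracy:", balanced_accuracy(labels, always_target))
```

Plain accuracy rewards the majority-class guess, whereas balanced accuracy stays at chance. Subsampling the larger class or using a class-balanced or rank-based metric are common ways to guard against this; which one, if any, was used in the manuscript is not stated in this review.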
-
Reviewer #3 (Public review):
Summary:
Fahrenfort et al. investigate how liberal or conservative criterion placement in a detection task affects the construct validity of neural measures of unconscious cognition and conscious processing. Participants identified instances of "seen" or "unseen" in a detection task, a method known as post hoc sorting. Simulation data convincingly demonstrate that, counterintuitively, a conservative criterion inflates effect sizes of neural measures compared to a liberal criterion. While the impact of criterion shifts on effect size is suggested by signal detection theory, this study is the first to address this explicitly within the consciousness literature. Decoding analysis of data from two EEG experiments further shows that different criteria lead to differential effects on classifier performance in post hoc sorting. The findings underscore the pervasive influence of experimental design and participant reports on neural measures of consciousness, revealing that criterion placement poses a critical challenge for researchers.
Strengths and Weaknesses
One of the strengths of this study is the inclusion of the Perceptual Awareness Scale (PAS), which allows participants to provide more nuanced responses regarding their perceptual experiences. This approach ensures that responses at the lowest awareness level (selection 0) are made only when trials are genuinely unseen. This methodological choice is important as it helps prevent the overestimation of unconscious processing, enhancing the validity of the findings.
The authors also do a commendable job in the discussion by addressing alternative paradigms, such as wagering paradigms, as a possible remedy to the criterion problem (Peters & Lau, 2015; Dienes & Seth, 2010). Their consideration of these alternatives provides a balanced view and strengthens the overall discussion.
Our initial review identified a lack of measures of variance as one potential weakness of this work. However, we agree with the authors' response that plotting individual datapoints for each condition is indeed a good visualization of variance within a dataset.
Impact of the Work:
This study effectively demonstrates a phenomenon that, while understood within the context of signal detection theory, has been largely unexplored within the consciousness literature. Subjective measures may not reliably capture the construct they aim to measure due to criterion confounds. Future research on neural measures of consciousness should account for this issue, and no-report measures may be necessary until the criterion problem is resolved.
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public review):
Summary:
The paper proposes that the placement of criteria for determining whether a stimulus is 'seen' or 'unseen' can significantly impact the validity of neural measures of consciousness. The authors found that conservative criteria, which require stronger evidence to classify a stimulus as 'seen,' tend to inflate effect sizes in neural measures, making conscious processing appear more pronounced than it is. Conversely, liberal criteria, which require less evidence, reduce these effect sizes, potentially underestimating conscious processing. This variability in effect sizes due to criterion placement can lead to misleading conclusions about the nature of conscious and unconscious processing.
Furthermore, the study highlights that the Perceptual Awareness Scale (PAS), a commonly used tool in consciousness research, does not effectively mitigate these criterion-related confounds. This means that even with PAS, the validity of neural measures can still be compromised by how criteria are set. The authors emphasize the need for careful consideration and standardization of criterion placement in experimental designs to ensure that neural measures accurately reflect the underlying cognitive processes. By addressing this issue, the paper aims to improve the reliability and validity of findings in the field of consciousness research.
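The criterion effect summarized above follows from standard equal-variance signal detection theory and can be sketched in a few lines. This is a minimal illustration, not the authors' simulation: it assumes internal evidence on target-present trials is normally distributed with mean d' and unit variance, that a trial is reported as 'seen' whenever the evidence exceeds the criterion, and that the neural measure simply tracks that evidence. The function name `sorted_means` and the parameter values are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for its pdf and cdf


def sorted_means(d_prime, criterion):
    """Mean evidence on target-present trials after post hoc sorting.

    Assumes evidence ~ N(d_prime, 1) on target trials and a 'seen' report
    whenever evidence exceeds `criterion`.
    Returns (mean of 'seen' trials, mean of 'unseen' trials).
    """
    a = criterion - d_prime  # criterion expressed relative to the target mean
    pdf, cdf = STD_NORMAL.pdf(a), STD_NORMAL.cdf(a)
    seen_mean = d_prime + pdf / (1.0 - cdf)  # mean of the part above the criterion
    unseen_mean = d_prime - pdf / cdf        # mean of the part below the criterion
    return seen_mean, unseen_mean


# Same sensitivity (d' = 1), two criterion placements.
liberal = sorted_means(d_prime=1.0, criterion=0.0)
conservative = sorted_means(d_prime=1.0, criterion=1.5)
print("liberal      (seen, unseen):", liberal)
print("conservative (seen, unseen):", conservative)
```

With sensitivity held fixed, both the 'seen' and the 'unseen' subsets have a higher mean evidence under the conservative criterion, so any measure that scales with that evidence would appear larger for both conscious and unconscious processing purely because of where the criterion sits.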
Strengths:
(1) This research provides a fresh perspective on how criterion placement can significantly impact the validity of neural measures in consciousness research.
(2) The study employs robust simulations and EEG experiments to demonstrate the effects of criterion placement, ensuring that the findings are well-supported by empirical evidence.
(3) By highlighting the limitations of the PAS and the impact of criterion placement, the study offers practical recommendations for improving experimental designs in consciousness research.
Weaknesses:
The study focuses primarily on the PAS, a commonly used tool, but other measures of consciousness were not evaluated and might be subject to similar or different criterion limitations. A simulation could be applied to these metrics to show how generalizable the conclusion of the study is.
We would like to thank reviewer 1 for their positive words and for taking the time to evaluate our manuscript. We agree that it would be important to gauge generalization to other metrics of consciousness. Note, however, that the most commonly used alternative methods are post-decision wagering and confidence, both of which are known to behave quite similarly to the PAS (Sandberg, Timmermans, Overgaard & Cleeremans, 2010). Indeed, we have confirmed in other work that confidence is also sensitive to criterion shifts (see https://osf.io/preprints/psyarxiv/xa4fj). Although it has been claimed that confidence-derived aggregate metrics like meta-d’ or metacognitive efficiency may overcome criterion shifts, it would require empirical data rather than simulation to settle whether this is true or not (also see the discussion in https://osf.io/preprints/psyarxiv/xa4fj). Furthermore, out of these metrics, the PAS seems to be the preferred one amongst consciousness researchers (see figure 4 in Francken, Beerendonk, Molenaar, Fahrenfort, Kiverstein, Seth & van Gaal, 2022; as well as https://osf.io/preprints/psyarxiv/bkxzh). Thus, given that other metrics are either expected to behave in similar ways and/or would require more empirical work to determine along which dimension(s) criterion shifts would operate, we see no clear path to implement the suggested simulations. We anticipate that aiming to do this would require a considerable amount of additional work, figuring out many things which we believe would better suit a future project. We would of course be open to doing this if the reviewer would have more specific suggestions for how to go about the proposed simulations.
Reviewer #2 (Public review):
Summary:
The study investigates the potential influence of the response criterion on neural decoding accuracy for conscious and unconscious processing, using either simulated data or a reanalysis of experimental data with post hoc sorting.
Strengths:
When comparing the neural decoding performance of Target versus NonTarget with or without post-hoc sorting based on subject reports, it is evident that response criterion can influence the results. This was observed in simulated data as well as in two experiments that manipulated the subject response criterion to be either more liberal or more conservative. One experiment involved a two-level response (seen vs unseen), while the other included a more detailed four-level response (ranging from 0 for no experience to 3 for a clear experience). The findings consistently indicated that adopting a more conservative response criterion could enhance neural decoding performance, whether in conscious or unconscious states, depending on the sensitivity or overall response threshold.
Weaknesses:
(1) The response criterion plays a crucial role in influencing neural decoding because a subject's report may not always align with the actual stimulus presented. This discrepancy can occur in cases of false alarms, where a subject reports seeing a target that was not actually there, or in cases where a target is present but not reported. Some may argue that only using data from consistent trials (those with correct responses) would not be affected by the response criterion. However, the authors' analysis suggests that a conservative response criterion not only reduces false alarms but also impacts hit rates. It is important for the authors to further investigate how the response criterion affects neural decoding even when considering only correct trials.
We would like to thank reviewer 2 for taking the time to evaluate our manuscript. We appreciate the suggestion to investigate neural decoding on only correct trials. What we in fact did is consider target trials that are 'correct' (hits = seen target present trials) and 'incorrect' (misses = unseen target present trials) separately, see figure 4A and figure 4B. This shows that the response criterion also affects the neural measure of consciousness when only considering correct target present trials. Note however, that one cannot decode 'unseen' (target present) trials if one only aims to decode 'correct' trials, because those are all incorrect by definition. We did not analyze false alarms (these would be the 'seen' trials on the noise distribution of Figure 1A), as there were not enough trials of those, especially in the conservative condition (see Figure 2C and 2D), making comparisons between conservative and liberal impossible. However, the predictions for false alarms are pretty straightforward, and follow directly from the framework in Figure 1.
(2) The author has utilized decoding target vs. nontarget as the neural measures of unconscious and/or conscious processing. However, it is important to note that this is just one of the many neural measures used in the field. There are an increasing number of studies that focus on decoding the conscious content, such as target location or target category. If the author were to include results on decoding target orientation and how it may be influenced by response criterion, the field would greatly benefit from this paper.
We thank the reviewer for the suggestion to decode orientation of the target. In our experiments, the target itself does not have an orientation, but the texture of which it is composed does. We used four orientations, which were balanced out within and across conditions such that presence-absence decoding is never driven by orientation, but rather by texture based figure-ground segregation (for similar logic, see for example Fahrenfort et al, 2007; 2008 etc). There are a couple of things to consider when wanting to apply a decoding analysis on the orientation of these textures:
(1) Our behavioral task was only on the presence or absence of the target, not on the orientation of the textures. This makes it impossible to draw any conclusions about the visibility of the orientation of the textures. Put differently: based on behavior there is no way of identifying seen or unseen orientations, correctly or incorrectly identified orientations etc. For example, it is easy to envision that an observer detects a target without knowing the orientation that defines it, or, vice versa, a situation in which an observer does not detect the target while still being aware of the orientation of a texture in the image (either of the figure, or of the background). The fact that we have no behavioral response to the orientation of the textures severely limits the usefulness of a hypothetical decoding effect on these orientations, as such results would be uninterpretable with respect to the relevant dimension in this experiment, which is visibility.
(2) This problem is further exacerbated by the fact that the orientation of the background is always orthogonal to the orientation of the target. Therefore, one would not only be decoding the orientation of the texture that constitutes the target itself, but also the texture that constitutes the background. Given that we also have no behavioral metric of how/whether the orientation of the background is perceived, it is similarly unclear how one would interpret any observed effect.
(3) Finally, it is important to note that – even when categorization/content is sometimes used as an auxiliary measure in consciousness research (often as a way to assay objective performance) - consciousness is most commonly conceptualized on the presence-absence dimension. A clear illustration of this is the concept of blindsight. Blindsight is the ability of observers to discriminate stimuli (i.e. identify content) without being able to detect them. Blindsight is often considered the bedrock of the cognitive neuroscience of consciousness as it acts as proof that one can dissociate between unconscious processing (the categorization of a stimulus, i.e. the content) and conscious processing of that stimulus (i.e. the ability to detect it).
Given the above, we do not see how the suggested analysis could contribute to the conclusions that the manuscript already establishes. We hope that the reviewer agrees with this assessment.
Reviewer #3 (Public review):
Summary:
Fahrenfort et al. investigate how liberal or conservative criterion placement in a detection task affects the construct validity of neural measures of unconscious cognition and conscious processing. Participants identified instances of "seen" or "unseen" in a detection task, a method known as post hoc sorting. Simulation data convincingly demonstrate that, counterintuitively, a conservative criterion inflates effect sizes of neural measures compared to a liberal criterion. While the impact of criterion shifts on effect size is suggested by signal detection theory, this study is the first to address this explicitly within the consciousness literature. Decoding analysis of data from two EEG experiments further shows that different criteria lead to differential effects on classifier performance in post hoc sorting. The findings underscore the pervasive influence of experimental design and participant reports on neural measures of consciousness, revealing that criterion placement poses a critical challenge for researchers.
Strengths and Weaknesses:
One of the strengths of this study is the inclusion of the Perceptual Awareness Scale (PAS), which allows participants to provide more nuanced responses regarding their perceptual experiences. This approach ensures that responses at the lowest awareness level (selection 0) are made only when trials are genuinely unseen. This methodological choice is important as it helps prevent the overestimation of unconscious processing, enhancing the validity of the findings.
A potential area for improvement in this study is the use of single time-points from peak decoding accuracy to generate current source density topography maps. While we recognize that the decoding analysis employed here differs from traditional ERP approaches, the robustness of the findings could be enhanced by exploring current source density over relevant time windows. Event-related peaks, both in terms of timing and amplitude, can sometimes be influenced by noise or variability in trial-averaged EEG data, and a time-window analysis might provide a more comprehensive and stable representation of the underlying neural dynamics.
We thank reviewer 3 for their positive words and for taking the time to evaluate our manuscript. If we understand the reviewer correctly, he/she suggests that the signal-to-noise ratio could be improved by averaging over time windows rather than taking the values at singular peaks in time. Before addressing this suggestion, we would like to point out that we plotted the relevant effects across time in Supplementary Figure S1A and S1B. These show that the observed effects were not somehow limited in time, i.e. only occurring around the peaks, but that they consistently occurred throughout the time course of the trial. In line with this observation one might argue that the results could be improved further by averaging across windows of interest rather than taking the peak moments alone, as the reviewer suggests. Although this might be true, there are many analysis choices that one can make, each of which could have a positive (or negative) effect on the signal-to-noise ratio. For example, when taking a window of interest, one is faced with a new choice to make, this time regarding the number of consecutive samples to average across (i.e. the size of the window), etc. More generally there is a long list of choices that may affect the precise outcome of analyses, either positively or negatively. Having analyzed the data in one way, the problem with adding new analysis approaches is that there is no objective criterion for deciding which analysis would be ‘best’, other than looking at the outcome of the statistical analyses themselves. Doing this would constitute an explorative double-dipping-like approach to analyzing the results, which – aside from potentially increasing the signal-to-noise ratio – is likely to also result in an increase of the type I error rate.
In the past, when the first author of this manuscript has attempted to minimize the number of statistical tests, he has lowered the number of EEG time points by simply taking the peaks (for example see https://doi.org/10.1073/pnas.1617268114), and that is the approach that was taken here as well. Given the above, we prefer not to further ‘try out’ additional analytical approaches on this dataset, simply to improve the results. We hope the reviewer sympathizes with our position that it is methodologically most sound to stick to the analyses we have already performed and reported, without further exploration.
It is helpful that the authors show the standard error of the mean for the classifier performance over time. A similar indication of a measure of variance in other figures could improve clarity and transparency.
That said, the paper appears solid regarding technical issues overall. The authors also do a commendable job in the discussion by addressing alternative paradigms, such as wagering paradigms, as a possible remedy to the criterion problem (Peters & Lau, 2015; Dienes & Seth, 2010). Their consideration of these alternatives provides a balanced view and strengthens the overall discussion.
We thank the reviewer for this suggestion. Note that we already have a measure of variance in the other figures too, namely showing the connected data points of individual participants. Individual data points are preferred by many journals as a visualization of variance (e.g., see https://www.nature.com/documents/cr-gta.pdf), and also show the spread of relevant differences when paired points are connected. For example, in Figures 2, 3 and 4, the relevant difference is between the liberal and conservative condition. When wanting to show the spread of the differences between these conditions, one option would be to first subtract the two measures in a pairwise fashion (e.g., liberal-conservative), and then plot the spread of those differences using some metric (e.g. standard error/CI of the mean difference). However, this has the disadvantage of no longer separately showing the raw scores on the conditions that are being compared. Showing conditions separately provides clarity to the reader about what is being compared to what. The most common approach to visualizing the variance of the relevant difference in such cases is to plot the connected individual data points of all participants in the same plot. The uniformity of the slope of these lines in such a visualization provides direct insight into the spread of the relevant difference. Plotting the standard error of the mean on the raw scores of the conditions in these plots would not help, because this would not visualize the spread of the relevant difference (liberal-conservative). We therefore opted in the manuscript to show the mean scores on the conditions that we compare, while also showing the connected raw data points of individual participants in the same plot. One might argue that we should then use that same visualization in figure 3A, but note that this figure is merely intended to identify the peaks, i.e. it does not compare liberal to conservative.
Furthermore, plotting the decoding time lines of individual participants would greatly diminish the clarity of this figure. Given our explanation, we hope the reviewer agrees with the approach that we chose, although we are of course open to modifying the figures if the reviewer has a suggestion for doing so while taking into account the points we raise here in our response.
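The argument about the spread of the paired difference can be illustrated with a toy calculation. The scores below are invented purely for illustration and are not data from the study; the helper `sem` is likewise an assumption. The sketch shows how participants can differ widely in their raw scores while the within-participant difference between conditions is very uniform, so that the standard error of the raw scores says little about the spread of the difference.

```python
from math import sqrt
from statistics import mean, stdev


def sem(values):
    """Standard error of the mean of a list of numbers."""
    return stdev(values) / sqrt(len(values))


# Hypothetical per-participant scores (NOT data from the manuscript).
liberal = [0.55, 0.62, 0.70, 0.78, 0.85]
conservative = [0.60, 0.66, 0.75, 0.82, 0.90]

# Pairwise (within-participant) differences between the two conditions.
differences = [c - l for c, l in zip(conservative, liberal)]

print("SEM of raw liberal scores:", sem(liberal))
print("SEM of raw conservative scores:", sem(conservative))
print("mean paired difference:", mean(differences))
print("SEM of paired differences:", sem(differences))
```

In this toy case the error bars on the raw condition means would be wide, while the connected lines of individual participants would be nearly parallel, which is the information a paired comparison depends on.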
Impact of the Work:
This study effectively demonstrates a phenomenon that has been largely unexplored within the consciousness literature. Subjective measures may not reliably capture the construct they aim to measure due to criterion confounds. Future research on neural measures of consciousness should account for this issue, and no-report measures may be necessary until the criterion problem is resolved.
Recommendations for the authors:
Reviewer #2 (Recommendations for the authors):
The authors could further elaborate on the results of the PAS to provide a clearer insight into the impact of response criteria, which is notably more complex than in other experiments. Specifically, the results demonstrate that conservative response criterion condition displays a considerably higher sensitivity compared to those with a liberal response criterion. It would be interesting to explore whether this shift in sensitivity suggests a correlation between changes in response criteria and conscious experiences, and how the interaction between sensitivity and response criteria can affect the neural measure of consciousness.
We thank the reviewer for this suggestion. Note that the change in sensitivity that we observed is minor compared to the change we observed in response criterion (Hedges' g criterion in exp 2 = 2.02, compared to Hedges' g sensitivity/d’ in exp 2 = 0.42). However, we do investigate the effect of sensitivity (disregarding response criterion) on decoding accuracy. To this end we devised Figure 3C (for the full decoding time course see Supplementary Figure S1B). These figures show that the small behavioral sensitivity effects observed in both experiments (Hedges' g sensitivity in exp 1 = 0.30, exp 2 = 0.42) did not translate into significant decoding differences between conservative and liberal in either experiment. This comes as no surprise given the small corresponding behavioral effects. Note that small sensitivity differences between liberal and conservative conditions are commonplace, plausibly driven by the fact that being liberal also involves being more noisy in one’s response tendencies (i.e. sometimes randomly indicating presence). Further, the reviewer suggests that we might correlate changes in response criteria to changes in conscious experience. The only relevant metric of conscious experience for which we have data in this manuscript is the Perceptual Awareness Scale (PAS), so we assume the reviewer asks for a correlation between experimentally induced changes in response criterion with the equivalent changes in d’. To this end we computed the difference in the PAS-based d’ metric between conservative and liberal, as well as the difference in the PAS-based criterion metric between conservative and liberal, and correlated these across subjects (N=26) using a Spearman rank correlation. The result shows that these metrics do not correlate, r(24)=0.04, p=0.85. Note however that small-N correlations like these are only somewhat reliable for large effect sizes.
With an N of 26 and a power of only 80%, an effect size of at least r=0.5 is required for a correlation to be detectable, so even if a correlation were to exist we may not have had enough power to detect it. Due to these caveats we opted to not report this null correlation in the manuscript, but we are of course willing to do so if the reviewer and/or editor disagree with this assessment.
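The power statement above can be checked with a short calculation. This is a sketch under stated assumptions, not the authors' computation: it uses the Fisher z approximation for a Pearson correlation with a two-sided alpha of 0.05, which is only approximate for the Spearman correlation reported here. The function name `min_detectable_r` is hypothetical.

```python
from math import sqrt, tanh
from statistics import NormalDist


def min_detectable_r(n, alpha=0.05, power=0.80):
    """Smallest correlation detectable with the given power (Fisher z approx.).

    Assumes a two-sided test of rho = 0 and a Pearson-type correlation;
    the standard error of Fisher's z is 1 / sqrt(n - 3).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    return tanh((z_alpha + z_power) / sqrt(n - 3))


print(min_detectable_r(26))
```

For N = 26 this approximation gives a minimum detectable correlation slightly above 0.5, consistent with the threshold quoted in the response.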
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
The authors investigated the mechanisms underlying the pause in striatal cholinergic interneurons (SCINs) induced by thalamic input, identifying that Kv1 channels play a key role in this burst-dependent pause. The valuable study provides mechanistic insights into how burst activity in SCINs leads to a subsequent pause, highlighting the involvement of D1/D5 receptors. The experimental evidence is solid; however, the reviewers suggest further clarifying the mechanism by which clozapine reduces D5R ligand-independent activity in the L-DOPA-off state.
-
Reviewer #1 (Public review):
Summary:
Tubert C. et al. investigated the role of dopamine D5 receptors (D5R) and their downstream potassium channel, Kv1, in the striatal cholinergic neuron pause response induced by thalamic excitatory input. Using slice electrophysiological analysis combined with pharmacological approaches, the authors tested which receptors and channels contribute to the cholinergic interneuron pause response in both control and dyskinetic mice (in the L-DOPA off state). They found that activation of Kv1 was necessary for the pause response, while activation of D5R blocked the pause response in control mice. Furthermore, in the L-DOPA off state of dyskinetic mice, the absence of the pause response was restored by the application of clozapine. The authors claimed that 1) the D5R-Kv1 pathway contributes to the cholinergic interneuron pause response in a phasic dopamine concentration-dependent manner, and 2) clozapine inhibits D5R in the L-DOPA off state, which restores the pause response.
Strengths:
The electrophysiological and pharmacological approaches used in this study are powerful tools for testing channel properties and functions. The authors' group has firmly established these methodologies and analysis pipelines. Indeed, the data presented were robust and reliable.
Weaknesses:
Although the paper has strengths in its methodological approaches, there is a significant gap between the presented data and the authors' claims.
The authors answered most of the concerns I raised. However, the critical issue remains unresolved.
I am still not convinced by the results presented in Fig. 6 and their interpretation. Since clozapine acts as an agonist in the absence of an endogenous agonist, it may stimulate the D5R-cAMP-Kv1 pathway. Stimulation of this pathway should abolish the pause response mediated by thalamic stimulation in SCINs, rather than restoring it. Clarification is needed regarding how clozapine reduces D5R ligand-independent activity in the absence of dopamine (the endogenous agonist). In addition, the authors argued that a D5R antagonist does not work in the absence of dopamine, and that the D5R antagonist alone therefore did not restore the pause response. However, if the D5R-cAMP-Kv1 pathway is already active in the L-DOPA off state, why did the D5R antagonist not contribute to inhibition of the D5R pathway?
Since clozapine is not D5 specific and the clozapine experiments were not conclusive, I recommend testing whether other receptors, such as the D2 receptor, contribute to the clozapine-induced pause response in the L-DOPA off state.
-
Reviewer #2 (Public review):
Summary:
This manuscript by Tubert et al. presents the role of D5 receptors (D5R) in regulating the striatal cholinergic interneuron (CIN) pause response through D5R-cAMP-Kv1 inhibitory signaling. Their findings provide a compelling model explaining the "on/off" switch of the CIN pause, driven by the distinct dopamine affinities of D2R and D5R. This mechanism, coupled with varying dopamine states, is likely critical for modulating synaptic plasticity in cortico-striatal circuits during motor learning and execution. Furthermore, the study bridges their previous finding of CIN hyperexcitability (Paz et al., Movement Disorders 2022) with the loss of the pause response in LID mice and demonstrates the restoration of the pause through D1/D5 inverse agonism.
Strengths:
The study presents solid findings, and the writing is logically structured and easy to follow. The experiments are well-designed, properly combining ex vivo electrophysiology recording, optogenetics, and pharmacological treatment to dissect / rule out most, if not all, alternative mechanisms in their model.
Weaknesses:
While the manuscript is overall satisfying, one conceptual gap needs to be further addressed or discussed: the potential "imbalance" between D2R and D5R signaling due to the ligand-independent activity of D5R in LID. Given that D2R and D5R oppositely regulate CIN pause responses through cAMP signaling, investigating the role of D2R under LID off L-DOPA (e.g., by applying D2 agonists or antagonists, even together with intracellular cAMP analogs or inhibitors) could provide critical insights. Addressing this aspect would strengthen the manuscript in understanding CIN pause loss under pathological conditions.
-
Reviewer #3 (Public review):
Summary:
Tubert et al. investigate the mechanisms underlying the pause response in striatal cholinergic interneurons (SCINs). The authors demonstrate that optogenetic activation of thalamic axons in the striatum induces burst activity in SCINs, followed by a brief pause in firing. They show that the duration of this pause correlates with the number of elicited action potentials, suggesting a burst-dependent pause mechanism. The authors demonstrated that this burst-dependent pause relied on Kv1 channels. The pause is blocked by SKF81297 and partially by sulpiride and mecamylamine, implicating D1/D5 receptor involvement. The study also shows that ZD7288 does not reduce the duration of the pause, and that lesioning dopamine neurons abolishes this response, which can be restored by clozapine.
Weaknesses:
While this study presents an interesting mechanism for SCIN pausing after burst activity, there are several major concerns that should be addressed:
(1) Scope of the Mechanism: It is important to clarify that the proposed mechanism may apply specifically to the pause in SCINs following burst activity. The manuscript does not provide clear evidence that this mechanism contributes to the pause response observed in behaving animals. While the thalamus is crucial for SCIN pauses in behavioral contexts, the exact mechanism remains unclear. Activating thalamic input triggers burst activity in SCINs, leading to a subsequent pause, but this mechanism may not be generalizable across different scenarios. For instance, approximately half of TANs do not exhibit initial excitation but still pause during behavior, suggesting that the burst-dependent pause mechanism is unlikely to explain this phenomenon. Furthermore, in behaving animals, the duration of the pause seems consistent, whereas the proposed mechanism suggests it depends on the prior burst, which is not aligned with in vivo observations. Additionally, many in vivo recordings show that the pause response is a reduction in firing rate, not complete silence, which the mechanism described here does not explain. Please address these in the manuscript.
(2) Terminology: The use of "pause response" throughout the manuscript is misleading. The pause induced by thalamic input in brain slices is distinct from the pause observed in behaving animals. Given the lack of a clear link between these two phenomena in the manuscript, it is essential to use more precise terminology throughout, including in the title, bullet points, and body of the manuscript.
(3) Kv1 Blocker Specificity: It is unclear how the authors ruled out the possibility that the Kv1 blocker did not act directly on SCINs. Could there be an indirect effect contributing to the burst-dependent pause? Clarification on this point would strengthen the interpretation of the results.
(4) Role of D1 Receptors: While it is well-established that activating thalamic input to SCINs triggers dopamine release, contributing to SCIN pausing (as shown in Figure 3), it would be helpful to assess the extent to which D1 receptors contribute to this burst-dependent pause. This could be achieved by applying the D1 agonist SKF81297 after blocking nAChRs and D2 receptors.
(5) Clozapine's Mechanism of Action: The restoration of the burst-dependent pause by clozapine following dopamine neuron lesioning is interesting, but clozapine acts on multiple receptors beyond D1 and D5. Although it may be challenging to find a specific D5 antagonist or inverse agonist, it would be more accurate to state that clozapine restores the burst-dependent pause without conclusively attributing this effect to D5 receptors.
Comments on revisions:
The authors have addressed many of my concerns. However, I remain unconvinced that adding an 'ex vivo' experiment fully resolves the fundamental differences between the burst-dependent pause observed in slices - defined by the duration of a single AHP - and the pause response in CHINs observed in vivo, which may involve contributions from more than one prolonged AHP. In vivo, neurons can still fire action potentials during the pause, albeit at a lower frequency. Moreover, in behaving animals, pause duration does not vary with or without initial excitation. The mechanism proposed demonstrates that the pause duration, defined by the length of a single AHP, is positively correlated with preceding burst activity.
To improve clarity, I recommend using the term 'SCIN pause' to describe the ex vivo findings, distinguishing them more explicitly from the 'pause response' observed in behaving animals. This distinction would help contextualize the ex vivo findings as potentially contributing to, but not fully representing, the pause response in vivo.
Again, it would be helpful to present raw data for pause durations rather than relying solely on ratios. This approach would provide the audience with a clearer understanding of the absolute duration of the burst-dependent pause and allow for better comparison to the ~200 ms pause observed in behaving animals.
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
Tubert C. et al. investigated the role of dopamine D5 receptors (D5R) and their downstream potassium channel, Kv1, in the striatal cholinergic neuron pause response induced by thalamic excitatory input. Using slice electrophysiological analysis combined with pharmacological approaches, the authors tested which receptors and channels contribute to the cholinergic interneuron pause response in both control and dyskinetic mice (in the L-DOPA off state). They found that activation of Kv1 was necessary for the pause response, while activation of D5R blocked the pause response in control mice. Furthermore, in the L-DOPA off state of dyskinetic mice, the absence of the pause response was restored by the application of clozapine. The authors claimed that (1) the D5R-Kv1 pathway contributes to the cholinergic interneuron pause response in a phasic dopamine concentration-dependent manner, and (2) clozapine inhibits D5R in the L-DOPA off state, which restores the pause response.
Strengths:
The electrophysiological and pharmacological approaches used in this study are powerful tools for testing channel properties and functions. The authors' group has firmly established these methodologies and analysis pipelines. Indeed, the data presented were robust and reliable.
Thank you for your comments.
Weaknesses:
Although the paper has strengths in its methodological approaches, there is a significant gap between the presented data and the authors' claims.
There was no direct demonstration that the D5R-Kv1 pathway is dominant when dopamine levels are high. The term 'high' is ambiguous, and it raises the question of whether the authors believe that dopamine levels do not reach the threshold required to activate D5R under physiological conditions.
We acknowledge that further work is necessary to clarify the role of the D5R in physiological conditions. While we haven’t found effects of the D1/D5 receptor antagonist SCH23390 on the pause response in control animals (Fig. 3), it is still possible that dopamine levels reach the threshold to stimulate D5R when burst firing of dopaminergic neurons contributes to dopamine release. We believe the pause response depends, among other factors, on the relative stimulation levels of SCIN D2 and D5 receptors, which is likely not an all-or-nothing phenomenon. To reduce ambiguity, we have eliminated the labels referring to dopamine levels in Figure 6F.
Furthermore, the data presented in Figure 6 are confusing. If clozapine inhibits active D5R and restores the pause response, the D5R antagonist SCH23390 should have the same effect. The data suggest that clozapine-induced restoration of the pause response might be mediated by other receptors, rather than D5R alone.
Thank you for letting us clarify this issue. Please note that the levels of endogenous dopamine 24 h after the last L-DOPA challenge in severe parkinsonian mice are expected to be very low. In the absence of an agonist, a pure D1/D5 antagonist would not exert an effect, as demonstrated with SCH23390 alone, which did not have an impact on the SCIN response to thalamic stimulation (Fig. 6). While clozapine can also act as a D1/D5 receptor antagonist, its D1/D5 effects in the absence of an agonist are attributed to its inverse agonist properties (PMID: 24931197). Notably, SCH23390 prevented the effect of clozapine, allowing us to conclude that ligand-independent D1/D5 receptor-mediated mechanisms are involved in suppressing the pause response in dyskinetic mice. We have now made this clearer in the third paragraph of the Discussion.
Reviewer #2 (Public review):
Summary:
This manuscript by Tubert et al. presents the role of the D5 receptor in modulating the striatal cholinergic interneuron (CIN) pause response through D5R-cAMP-Kv1 inhibitory signaling. Their model elucidates the on/off switch of the CIN pause, likely due to the different DA affinities of D2R and D5R. This machinery may be crucial in modulating synaptic plasticity in cortical-striatal circuits during motor learning and execution. Furthermore, the study bridges their previous finding of CIN hyperexcitability (Paz et al., Movement Disorders 2022) with the loss of pause response in LID mice.
Strengths:
The study had solid findings, and the writing was logically structured and easy to follow. The experiments are well-designed, and they properly combined electrophysiology recording, optogenetics, and pharmacological treatment to dissect/rule out most, if not all, possible mechanisms in their model.
Thank you for your comments.
Weaknesses:
The manuscript is overall satisfying with only some minor concerns that need to be addressed. Manipulation of intracellular cAMP (e.g. using pharmacological analogs or inhibitors) can add additional evidence to strengthen the conclusion.
Thank you for the suggestion. While we acknowledge that we are not providing direct evidence of the role of cAMP, we chose not to conduct these experiments because cAMP levels influence several intrinsic and synaptic currents beyond Kv1, significantly affecting membrane oscillations and spontaneous firing, as shown in Paz et al. 2021. However, we are modifying the fourth paragraph of the Discussion so there is no misinterpretation about our findings in the current work.
Reviewer #3 (Public review):
Summary:
Tubert et al. investigate the mechanisms underlying the pause response in striatal cholinergic interneurons (SCINs). The authors demonstrate that optogenetic activation of thalamic axons in the striatum induces burst activity in SCINs, followed by a brief pause in firing. They show that the duration of this pause correlates with the number of elicited action potentials, suggesting a burst-dependent pause mechanism. The authors demonstrated that this burst-dependent pause relied on Kv1 channels. The pause is blocked by SKF81297 and partially by sulpiride and mecamylamine, implicating D1/D5 receptor involvement. The study also shows that ZD7288 does not reduce the duration of the pause and that lesioning dopamine neurons abolishes this response, which can be restored by clozapine.
Weaknesses:
While this study presents an interesting mechanism for SCIN pausing after burst activity, there are several major concerns that should be addressed:
(1) Scope of the Mechanism:
It is important to clarify that the proposed mechanism may apply specifically to the pause in SCINs following burst activity. The manuscript does not provide clear evidence that this mechanism contributes to the pause response observed in behaving animals. While the thalamus is crucial for SCIN pauses in behavioral contexts, the exact mechanism remains unclear. Activating thalamic input triggers burst activity in SCINs, leading to a subsequent pause, but this mechanism may not be generalizable across different scenarios. For instance, approximately half of TANs do not exhibit initial excitation but still pause during behavior, suggesting that the burst-dependent pause mechanism is unlikely to explain this phenomenon. Furthermore, in behaving animals, the duration of the pause seems consistent, whereas the proposed mechanism suggests it depends on the prior burst, which is not aligned with in vivo observations. Additionally, many in vivo recordings show that the pause response is a reduction in firing rate, not complete silence, which the mechanism described here does not explain. Please address these in the manuscript.
Thank you for your valuable feedback. While the absence of an initial burst in some TANs in vivo may suggest the involvement of alternative or additional mechanisms, this does not exclude the participation of Kv1 currents. We have seen that subthreshold depolarizations induced by thalamic inputs are sufficient to produce an afterhyperpolarization (AHP) mediated by Kv1 channels (see Tubert et al., 2016, PMID: 27568555). Although such subthreshold depolarizations are not captured in current recordings from behaving animals, intracellular in vivo recordings have demonstrated an intrinsically generated AHP after subthreshold depolarization of SCIN caused by stimulation of excitatory afferents (PMID: 15525771). Additionally, when pause duration is plotted against the number of spikes elicited by thalamic input (Fig. 1G), we found that one elicited spike is followed by an interspike interval 1.4 times longer than the average spontaneous interspike interval. We acknowledge the potential involvement of additional factors, including a decrease of excitatory thalamic input coinciding with the pause, followed by a second volley of thalamic inputs (Fig. 1J-K, after observations by Matsumoto et al., 2001; PMID: 11160526), as well as the timing of elicited spikes relative to ongoing spontaneous firing (Fig. 1D-E). Dopaminergic modulation (Fig. 3) and regional differences among striatal regions (PMID: 24559678) may also contribute to the complexity of these dynamics.
(2) Terminology:
The use of "pause response" throughout the manuscript is misleading. The pause induced by thalamic input in brain slices is distinct from the pause observed in behaving animals. Given the lack of a clear link between these two phenomena in the manuscript, it is essential to use more precise terminology throughout, including in the title, bullet points, and body of the manuscript.
While we acknowledge that our study does not include in vivo evidence, we believe ex vivo preparations have been instrumental in elucidating the mechanisms underlying the responses observed in vivo. We also agree with previous ex vivo studies in using consistent terminology. However, we will clarify the ex vivo nature of our work in the abstract and bullet points for greater transparency.
(3) Kv1 Blocker Specificity:
It is unclear how the authors ruled out the possibility that the Kv1 blocker did not act directly on SCINs. Could there be an indirect effect contributing to the burst-dependent pause? Clarification on this point would strengthen the interpretation of the results.
Thank you for letting us clarify this issue. In our previous work (Tubert et al., 2016) we showed that the Kv1.3 and Kv1.1 subunits are selectively expressed in SCIN throughout the striatum. Moreover, GABAergic transmission is blocked in our preparations. We are including a phrase to make this clearer in the manuscript (Results section, subheading “The pause response to thalamic stimulation requires activation of Kv1 channels”).
(4) Role of D1 Receptors:
While it is well-established that activating thalamic input to SCINs triggers dopamine release, contributing to SCIN pausing (as shown in Figure 3), it would be helpful to assess the extent to which D1 receptors contribute to this burst-dependent pause. This could be achieved by applying the D1 agonist SKF81297 after blocking nAChRs and D2 receptors.
Thank you for letting us clarify this point. We show that blocking D2R or nAChR reduces the pause only for strong thalamic stimulation eliciting 4 SCIN spikes (Figure 3G), whereas the D1/D5 agonist SKF81297 is able to reduce the pause induced by weaker stimulation as well (Figure 3C). In addition, the D1/D5 receptor antagonist SCH23390 does not modify the pause response (Figure 3C). This may indicate that nAChR-mediated dopamine release induced by thalamic-induced bursts more efficiently activates D2R compared to D5R. We speculate that, in this context, lack of D5R activation may be necessary to keep normal levels of Kv1.3 currents necessary for SCIN pauses.
(5) Clozapine's Mechanism of Action:
The restoration of the burst-dependent pause by clozapine following dopamine neuron lesioning is interesting, but clozapine acts on multiple receptors beyond D1 and D5.
Although it may be challenging to find a specific D5 antagonist or inverse agonist, it would be more accurate to state that clozapine restores the burst-dependent pause without conclusively attributing this effect to D5 receptors.
Thank you for your insightful observation. We acknowledge the difficulty of targeting dopamine receptors pharmacologically due to the lack of highly selective D1/D5 inverse agonists. We used SCH23390, which is a highly selective D1/D5 receptor antagonist devoid of inverse agonist effects, to block clozapine’s ability to restore SCIN pauses (Figure 6C). This indicates that the restoration of SCIN pauses by clozapine depends on D1/D5 receptors. Furthermore, in a previous study, we demonstrated that clozapine’s effect on restoring SCIN excitability in dyskinetic mice (a phenomenon mediated by Kv1 channels in SCIN; Tubert et al., 2016) was not due to its action on serotonin receptors (Paz, Stahl et al., 2022). While our data do not rule out the potential contribution of other receptors, such as muscarinic acetylcholine receptors, we believe they strongly support the role of D1/D5 receptors. To reflect this, we added a statement discussing the potential contribution of receptors beyond D1/D5 in the last paragraph of the Discussion.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
(1) The effect of MgTx was not consistent with the previous study (Tubert, 2016). I expected MgTx to increase the basal firing rate of cholinergic interneurons.
Thank you for highlighting this. In our previous study we used ACSF in the recording pipette, instead of the intracellular solution -higher in potassium- used in the present study. This is likely related to the higher spontaneous firing rates of SCIN observed in the present study, which made the SCIN response stand out. In addition, our previous study analyzed the effect of MgTx on spontaneous firing frequency of SCIN isolated from major circuit regulation by adding CNQX and picrotoxin to the bath, while in this study we needed to preserve the thalamic input and only picrotoxin in the bath was used. Given these differences, the two conditions are not strictly comparable but rather give complementary information.
(2) In the text, the authors claim that "SCINs recorded in the parkinsonian OFF-L-DOPA condition show an increase in membrane excitability that mimics changes acutely induced by SKF81297 in SCINs from control mice." However, the data for SKF81297 do not support this claim.
We modified the text to make it clearer that the cited phrase refers to a previous publication (PMID: 35535012) in which SCIN intrinsic excitability was characterized via analysis of responses to somatic current injection in whole-cell recordings. In the present study Fig. 3D shows SKF81297 effects on interspike intervals during spontaneous activity with a trend towards increased firing, and Fig. 4E a lack of effect on “burst duration” for responses with different numbers of spikes elicited by thalamic afferent stimulation.
(3) I recommend testing whether other receptors, such as D2R, contribute to the clozapine-induced pause response in the L-DOPA off state.
Thank you for your suggestion. We acknowledge that studying the role of D2R is important. However, our preliminary data suggest that a comprehensive follow up study, which is beyond the scope of this manuscript, is necessary to understand their role.
Reviewer #2 (Recommendations for the authors):
(1) For Figure 1D-E, I understand that the authors are trying to state that the previous spontaneous spike contributes to a hyperpolarized window that induces a delay in the evoked spikes. However, it is almost impossible to discriminate between spontaneous and evoked spikes in this experiment. Furthermore, considering the tonic firing property, I highly suspect that even a sham control design (no optogenetic light) will give you a similar distribution as in Figure 1E (the longer in X1, the shorter in X2).
We agree that some spikes following stimulus onset may have occurred independently of the light stimulus, as is also possible during behavioral tasks. We used the baseline recordings to estimate the effects of a sham stimulus as requested and included the data in Fig. 1E-F. As expected, the sham stimulation data showed a similar inverse relationship with the time elapsed from the preceding spike, but latencies were longer than with the stimulus (except for times close to the average ISI), suggesting that the optical stimulation increased the probability of evoking a spike (Fig. 1F). Remarkably, the pause following this threshold stimulation was significantly longer than the baseline ISI, as reported in the main text (Results section, last sentence of first paragraph).
(2) The authors used optogenetics to induce thalamic inputs to induce the pause after bursts. Considering CINs also receive inputs from different brain regions, e.g. cortex, does this phenomenon/pause after bursts also exist following cortical inputs?
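The sham-stimulus control described above can be illustrated with a short sketch. It assumes that X1 denotes the time elapsed since the preceding spike and X2 the latency to the next spike, as implied by the reviewer's comment. The function name, the spike train, and the sham times are hypothetical and are not the authors' data or analysis code. For a regularly firing cell, X1 and X2 sum to the interspike interval, so the inverse relationship appears even without any stimulus.

```python
from bisect import bisect_right


def timing_around_event(spike_times_ms, event_ms):
    """Return (x1, x2) for one (sham) stimulus time.

    x1: time elapsed since the preceding spike
    x2: latency to the next spike
    spike_times_ms must be sorted. Returns None if the event is not
    flanked by spikes on both sides.
    """
    i = bisect_right(spike_times_ms, event_ms)
    if i == 0 or i == len(spike_times_ms):
        return None
    return event_ms - spike_times_ms[i - 1], spike_times_ms[i] - event_ms


# Hypothetical tonic firing with a constant 200 ms interspike interval.
spikes = list(range(0, 4000, 200))
# Hypothetical sham stimulus times placed at different phases of the ISI.
sham_times = [1050, 1100, 1150]

pairs = [timing_around_event(spikes, t) for t in sham_times]
for x1, x2 in pairs:
    print(f"X1 = {x1} ms, X2 = {x2} ms")
```

A real stimulus would be expected to shorten X2 relative to this sham baseline, which is the comparison the response describes for Fig. 1F.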
We did not study the SCIN response to cortical inputs, but both thalamic and cortical inputs seem to drive SCIN pause-responses as observed by others (PMID: 24553950).
(3) The effect of D5R inverse agonism, and its further combination with a D5R agonist and antagonist, faithfully reveals/confirms the increase of ligand-independent activity of D5R in LID reported previously. It would be ideal to also directly modulate intracellular cAMP (as in the 2022 paper) to confirm the rescue effects on the CIN pause response.
Please see our response in the public review.
(4) In healthy conditions, the balance between D2R and D5R signaling (shown in Figure 6F left) switches the pause and no pause modes which potentially contributes to cortical-striatal plasticity. How about in LID off L-DOPA condition? Is it possible to rescue/modulate the pause on/off mode by D2R agonism in LID?
We haven’t tested the effect of D2 agonists yet, but this is scheduled for follow-up studies.
Reviewer #3 (Recommendations for the authors):
(1) The authors use the ratio of pause duration to baseline ISI to describe the pause, which is useful for detecting significant differences. However, it would be beneficial to also report the actual duration of the burst-dependent pause to provide readers with a clearer understanding of the variation in pauses.
In all figures we report the average baseline ISI duration for each experiment / experimental condition, allowing readers to estimate actual pause durations. We added to the main text the actual average pause durations corresponding to Fig. 1H, which are representative of those observed throughout the study.
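As a worked illustration of how a reader could convert between the reported ratio and an absolute pause duration, the sketch below computes a pause-to-baseline ratio from spike times and then inverts it. The spike times, the 280 ms pause, and the function names are hypothetical values chosen for the example. Only the ratio of 1.4 echoes a figure quoted in the public response.

```python
from statistics import mean


def mean_isi(spike_times_ms):
    """Average interspike interval of a sorted baseline spike train."""
    return mean(b - a for a, b in zip(spike_times_ms, spike_times_ms[1:]))


def pause_ratio(pause_ms, baseline_spikes_ms):
    """Pause duration expressed relative to the baseline ISI."""
    return pause_ms / mean_isi(baseline_spikes_ms)


def pause_from_ratio(ratio, baseline_isi_ms):
    """Recover an absolute pause duration from a ratio and baseline ISI."""
    return ratio * baseline_isi_ms


# Hypothetical baseline spikes (ms) and a hypothetical 280 ms pause.
baseline = [0, 200, 400, 600]
ratio = pause_ratio(280, baseline)
print(f"pause / baseline ISI = {ratio:.2f}")
print(f"absolute pause = {pause_from_ratio(ratio, mean_isi(baseline)):.0f} ms")
```

Reporting both quantities side by side, as the reviewer requests, avoids requiring the reader to perform this conversion per condition.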
(2) In Figure 3D, a more detailed comparison would be helpful, as there appears to be a significant difference between the SKF81297 group and others.
We acknowledge that there might be a trend towards reduced ISIs; however, it was statistically non-significant (see legend of Figure 3). In addition, the effect of SKF81297 seems unrelated to this trend, as its effect is also seen under the effect of ZD7288, which substantially prolongs the baseline ISI (Fig. 4E-F).
-
-
www.researchsquare.com
-
Author response:
The following is the authors’ response to the current reviews.
Comments on revisions:
I thank the authors for addressing my comments.
- I believe that additional in vivo experiments, or the inclusion of controls for the specificity of the inhibitor, which the authors argue are beyond the scope of the current study, are essential to address the weaknesses and limitations stated in my current evaluation.
We respectfully acknowledge the reviewer's concern but would like to reiterate that demonstrating the specificity of the inhibitor is beyond the scope of this study. Alpelisib (BYL-719) is a clinically approved drug widely recognized as a specific inhibitor of p110α, primarily used in the treatment of breast cancer. Its selectivity for the p110α isoform has been extensively validated in the literature.
In our study, we used Alpelisib to assess whether pharmacological inhibition of p110α would produce effects similar to those observed in our genetic model, which is particularly relevant for the potential translational implications of our findings. Given the well-documented specificity of this inhibitor, we believe that additional controls to confirm its selectivity are unnecessary within the context of this study. Instead, our focus has been to investigate the functional role of p110α activity in macrophage-driven inflammation using the models described.
We appreciate the reviewer’s insight and hope this clarification addresses their concern.
- While the neutrophil depletion suggests neutrophils are not required for the phenotype, there are multiple other myeloid cells, in addition to macrophages, that could be contributing or accounting for the in vivo phenotype observed in the mutant strain (not macrophage specific).
We appreciate the reviewer's observation regarding the potential involvement of other myeloid cells. However, it is important to highlight that the inflammatory process follows a well-characterized sequential pattern. Our data clearly demonstrate that in the paw inflammation model:
· Neutrophils are effectively recruited, as evidenced by the inflammatory abscess filled with polymorphonuclear cells.
· However, macrophages fail to be recruited in the RBD model.
Given that this critical step is disrupted, it is reasonable to expect that any subsequent steps in the inflammatory cascade would also be affected. A precise dissection of the role of other myeloid populations would require additional lineage-specific models to selectively target each subset, which, as we have previously stated, would be the focus of an independent study.
While we cannot entirely exclude the contribution of other myeloid cells, our data strongly support the conclusion that macrophages are, at the very least, a key component of the observed phenotype. We explicitly address this point in the Discussion section, where we acknowledge the potential involvement of other myeloid populations.
- Inclusion of absolute cell numbers (in addition to the %) is essential. I do not understand why the authors are not including these data. Have they not counted the cells?
We appreciate the reviewer’s concern regarding the inclusion of absolute cell numbers. However, as stated in the Materials and Methods section, we analyzed 50,000 cells per sample, and the percentages reported in the manuscript are directly derived from this standardized count.
Our decision to present the data as percentages follows standard practices in flow cytometry-based analyses, as it allows for a clearer and more biologically relevant comparison of relative changes between conditions. This approach ensures consistency across samples and facilitates the interpretation of population dynamics during inflammation.
We would also like to clarify that all data are based on actual counts, and rigorous controls were implemented throughout the study to ensure accuracy and reproducibility. We hope this explanation addresses the reviewer’s concern and provides further clarity on our approach.
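To make the relationship between the reported percentages and event counts concrete, the sketch below converts a percentage into a number of events within the 50,000 cells analyzed per sample. The function name and the example percentage are hypothetical. Note the limitation this illustrates: the result is a count within the fixed analyzed sample, not an absolute number of cells in the tissue, which would additionally require the total cell yield per sample.

```python
CELLS_ANALYZED_PER_SAMPLE = 50_000  # stated in the response above


def events_from_percentage(pct, total_events=CELLS_ANALYZED_PER_SAMPLE):
    """Number of events a population represents within the analyzed sample."""
    return round(pct / 100 * total_events)


# Hypothetical example: a population reported as 12.5% of analyzed cells.
example_events = events_from_percentage(12.5)
print(example_events)
```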
- Lastly, inclusion of representative staining and gating strategies for all immune profiling measurements carried out by flow cytometry is important. This point has not been addressed, not even in writing.
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
This study by Alejandro Rosell et al. reveals the immunoregulatory role of the RAS-p110α pathway in macrophages, specifically in regulating monocyte extravasation and lysosomal digestion during inflammation. Disrupting this pathway, through genetic tools or pharmacological intervention in mice, impairs the inflammatory response, leading to delayed resolution and more severe acute inflammation. The authors suggest that activating p110α with small molecules could be a potential therapeutic strategy for treating chronic inflammation. These findings provide important insights into the mechanisms by which p110α regulates macrophage function and the overall inflammatory response.
The updates made by the authors in the revised version have addressed the main points raised in the initial review, further improving the strength of their findings.
Reviewer #2 (Public review):
Summary:
Cell intrinsic signaling pathways controlling the function of macrophages in inflammatory processes, including in response to infection, injury or in the resolution of inflammation are incompletely understood. In this study, Rosell et al. investigate the contribution of RAS-p110α signaling to macrophage activity. p110α is a ubiquitously expressed catalytic subunit of PI3K with previously described roles in multiple biological processes including in epithelial cell growth and survival, and carcinogenesis. While previous studies have already suggested a role for RAS-p110α signaling in macrophage function, the cell intrinsic impact of disrupting the interaction between RAS and p110α in this central myeloid cell subset is not known.
Strengths:
Exploiting a sound previously described genetically engineered mouse model that allows tamoxifen-inducible disruption of the RAS-p110α pathway and using different readouts of macrophage activity in vitro and in vivo, the authors provide data consistent with their conclusion that alteration in RAS-p110α signaling impairs various but selective aspects of macrophage function in a cell-intrinsic manner.
Weaknesses:
My main concern is that for various readouts, the difference between wild-type and mutant macrophages in vitro or between wild-type and Pik3caRBD mice in vivo is modest, even if statistically significant. To further substantiate the extent of macrophage function alteration upon disruption of RAS-p110α signaling and its impact on the initiation and resolution of inflammatory responses, the manuscript would benefit from a more extensive assessment of macrophage activity and inflammatory responses in vivo.
Thank you for raising this point. We understand the reviewer’s concern regarding the modest yet statistically significant differences observed between wild-type and mutant macrophages in vitro, as well as between wild-type and Pik3ca<sup>RBD</sup> mice in vivo. Our current study aimed to provide a foundational exploration of the role of RAS-p110α signaling in macrophage function and inflammatory response, focusing on a set of core readouts that demonstrate the physiological relevance of this pathway. While a more extensive in vivo assessment could offer additional insights into macrophage activity and the nuanced effects of RAS-p110α disruption, it would require an array of new experiments that are beyond the current scope.
However, we believe that the current data provide significant insights into the pathway’s role, highlighting important alterations in macrophage function and inflammatory processes due to RAS-p110α disruption. These findings lay the groundwork for future studies that can build upon our results with a more comprehensive analysis of macrophage activity in various inflammatory contexts.
In the in vivo model, all cells have disrupted RAS-p110α signaling, not only macrophages. Given that other myeloid cells besides macrophages contribute to the orchestration of inflammatory responses, it remains unclear whether the phenotype described in vivo results from impaired RAS-p110α signaling within macrophages or from defects in other haematopoietic cells such as neutrophils, dendritic cells, etc.
Thank you for raising this point. To address this, we have added a paragraph in the Discussion section acknowledging that RAS-p110α signaling disruption affects all hematopoietic cells (lines 461-470 in the discussion). However, we also provide several lines of evidence that support macrophages as the primary cell type involved in the observed phenotype. Specifically, we note that neutrophil depletion in chimera mice did not alter transendothelial extravasation, and that macrophages were the primary cell type showing significant functional defects in the paw edema model. These findings, combined with specific deficiencies in myeloid populations, suggest a predominant role of macrophages in the impaired inflammatory response, though we acknowledge the potential contributions of other myeloid cells.
Inclusion of information on the absolute number of macrophages, and total immune cells (e.g. for the spleen analysis) would help determine if the reduced frequency of macrophages represents an actual difference in their total number or rather reflects a relative decrease due to an increase in the number of other immune cells.
Thank you for this suggestion. We understand the value of presenting actual measurements; however, we opted to display normalized data to provide a clearer comparison between WT and RBD mice, as this approach highlights the relative differences in immune populations between the two groups. Normalizing data helps to focus on the specific impact of the RAS-p110α disruption by minimizing inter-sample variability that can obscure these differences.
To further address the reviewer’s concern regarding the interpretation of macrophage frequencies, we have included a pie chart that represents the relative proportions of the various immune cell populations studied within our dataset. Author response image 1 provides a visual overview of the immune cell distribution, enabling a clearer understanding of whether the observed decrease in macrophage frequency represents an actual reduction in total macrophage numbers or a shift in their relative abundance due to changes in other immune populations.
We hope this approach satisfactorily addresses the reviewer’s concerns by providing both a normalized dataset for clearer interpretation of genotype-specific effects and an overall immune profile that contextualizes macrophage frequency within the broader immune cell landscape.
Author response image 1.
Recommendations for the authors:
Reviewer #2 (Recommendations for the authors):
As proof of concept data that activation of RAS-p110α signaling constitutes indeed a putative approach for treating chronic inflammation is not included in the manuscript, I suggest removing this implication from the abstract.
Thank you for this suggestion. We have now removed this implication from the abstract to maintain clarity and to better reflect the scope of the data presented in the manuscript.
Inclusion of a control in which RBD/- cells are also treated with BYL719, across experiments in which the inhibitor is used, would be important to determine, among other things, the specificity of the inhibitor.
We appreciate the reviewer’s suggestion to include RBD/- cells treated with BYL719 as an additional control. However, we would like to clarify that this approach would raise a different biological question, as treating RBD mice with BYL719 would not only address the specificity of the inhibitor but also examine the combined effects of genetic and pharmacologic disruptions on PI3K pathway signaling. Investigating this dual disruption falls outside the scope of our current study, which is focused specifically on the effects of RAS-p110α disruption.
It is also important to note that our RBD mouse model selectively disrupts RAS-mediated activation of p110α, while PI3K activation can still occur through other pathways, such as receptor tyrosine kinases (RTKs) and G protein-coupled receptors (GPCRs). Thus, inhibiting p110α with BYL719 would produce broader effects beyond the inhibition of RAS-PI3K signaling, impacting PI3K activation regardless of its upstream source.
In addition, incorporating this control would require us to repeat nearly all experiments in the manuscript, as it would necessitate generating and analyzing new samples for each experimental condition. Given the scope and resources involved, we believe this approach is unfeasible at this stage of the revision process.
We hope this explanation is satisfactory and that the current data in the manuscript provide a rigorous assessment of the RAS-p110α signaling pathway within the defined experimental scope.
Figure 3I is missing the statistical analysis (this is mentioned in the legend though).
Thank you for pointing this out. We apologize for the oversight. The statistical analysis for Figure 3I has now been added.
Gating strategies and representative staining should be included more generally across the manuscript.
Thank you for this suggestion. To address this, we have added a new supplementary figure (Figure 2-Supplement Figure 2) that illustrates the gating strategy along with a representative dataset. Additionally, a brief summary of the gating strategy has been included in the main text to further clarify the methodology.
It is recommended that authors show actual measurements rather than only data normalized to the control (or arbitrary units).
Thank you for this suggestion. We understand the value of presenting actual measurements; however, we opted to display normalized data to provide a clearer comparison between WT and RBD mice, as this approach highlights the relative differences in immune populations between the two groups. Normalizing data helps to focus on the specific impact of the RAS-p110α disruption by minimizing inter-sample variability that can obscure these differences.
To further address the reviewer’s concern regarding the interpretation of macrophage frequencies, we have included a pie chart that represents the relative proportions of the various immune cell populations studied within our dataset. Author response image 1 provides a visual overview of the immune cell distribution, enabling a clearer understanding of whether the observed decrease in macrophage frequency represents an actual reduction in total macrophage numbers or a shift in their relative abundance due to changes in other immune populations.
We hope this approach satisfactorily addresses the reviewer’s concerns by providing both a normalized dataset for clearer interpretation of genotype-specific effects and an overall immune profile that contextualizes macrophage frequency within the broader immune cell landscape.
-
eLife Assessment
This useful study investigates the impact of disrupting the interaction of RAS with the PI3K subunit p110α on macrophage function in vitro and inflammatory responses in vivo. Solid data overall support a role for RAS-p110α signalling in regulating macrophage activity and so inflammation; however, for many of the readouts presented the magnitude of the phenotype is not particularly pronounced. Further analysis would be required to substantiate the claims that RAS-p110α signalling plays a key role in macrophage function. Of note, the molecular mechanisms of how exactly p110α regulates the functions in macrophages have not yet been established.
-
Reviewer #2 (Public review):
Summary:
Cell intrinsic signaling pathways controlling the function of macrophages in inflammatory processes, including in response to infection, injury or in the resolution of inflammation are incompletely understood. In this study, Rosell et al. investigate the contribution of RAS-p110α signaling to macrophage activity. p110α is a ubiquitously expressed catalytic subunit of PI3K with previously described roles in multiple biological processes including in epithelial cell growth and survival, and carcinogenesis. While previous studies have already suggested a role for RAS-p110α signaling in macrophage function, the cell intrinsic impact of disrupting the interaction between RAS and p110α in this central myeloid cell subset is not known.
Strengths:
Exploiting a sound previously described genetically engineered mouse model that allows tamoxifen-inducible disruption of the RAS-p110α pathway and using different readouts of macrophage activity in vitro and in vivo, the authors provide data consistent with their conclusion that alteration in RAS-p110α signaling impairs various but selective aspects of macrophage function in a cell-intrinsic manner.
Weaknesses:
My main concern is that for various readouts, the difference between wild-type and mutant macrophages in vitro or between wild-type and Pik3caRBD mice in vivo is modest, even if statistically significant. To further substantiate the extent of macrophage function alteration upon disruption of RAS-p110α signaling and its impact on the initiation and resolution of inflammatory responses, the manuscript would benefit from a more extensive assessment of macrophage activity and inflammatory responses in vivo.
In the in vivo model, all cells have disrupted RAS-p110α signaling, not only macrophages. Given that other myeloid cells besides macrophages contribute to the orchestration of inflammatory responses, it remains unclear whether the phenotype described in vivo results from impaired RAS-p110α signaling within macrophages or from defects in other haematopoietic cells such as neutrophils, dendritic cells, etc.
Inclusion of information on the absolute number of macrophages, and total immune cells (e.g. for the spleen analysis) would help determine if the reduced frequency of macrophages represents an actual difference in their total number or rather reflects a relative decrease due to an increase in the number of other immune cells.
Comments on revisions:
I thank the authors for addressing my comments.
- I believe that additional in vivo experiments, or the inclusion of controls for the specificity of the inhibitor, which the authors argue are beyond the scope of the current study, are essential to address the weaknesses and limitations stated in my current evaluation.
- While the neutrophil depletion suggests neutrophils are not required for the phenotype, there are multiple other myeloid cells, in addition to macrophages, that could be contributing or accounting for the in vivo phenotype observed in the mutant strain (not macrophage specific).
- Inclusion of absolute cell numbers (in addition to the %) is essential. I do not understand why the authors are not including these data. Have they not counted the cells?
- Lastly, inclusion of representative staining and gating strategies for all immune profiling measurements carried out by flow cytometry is important. This point has not been addressed, not even in writing.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This manuscript reports important findings that the methyltransferase METTL3 is involved in the repair of abasic sites and uracil in DNA, mediating resistance to floxuridine-driven cytotoxicity. The presented evidence is conclusive for the involvement of m6A in DNA involving single cell imaging and mass spectrometry data. The authors present convincing evidence that the m6A signal does not result from bacterial contamination or RNA.
-
Reviewer #1 (Public review):
Summary:
The authors sought to identify unknown factors involved in the repair of uracil in DNA through a CRISPR knockout screen.
Strengths:
The screen identified both known and unknown proteins involved in DNA repair resulting from uracil or modified uracil base incorporation into DNA. The conclusion is that the protein activity of METTL3, which converts A nucleotides to 6mA nucleotides, plays a role in the DNA damage/repair response. The importance of METTL3 in DNA repair, and its colocalization with a known DNA repair enzyme, UNG2, is well characterized.
-
Reviewer #2 (Public review):
Summary:
In this work, the authors performed a CRISPR knockout screen in the presence of floxuridine, a chemotherapeutic agent that incorporates uracil and fluoro-uracil into DNA, and identified unexpected factors, such as the RNA m6A methyltransferase METTL3, as required to overcome floxuridine-driven cytotoxicity in mammalian cells. Interestingly, the observed N6-methyladenosine was embedded in DNA, which has been reported as DNA 6mA in mammalian genomes and is confirmed here by mass spectrometry. This work therefore consolidates the functional role of mammalian genomic DNA 6mA and provides solid evidence for a METTL3-6mA-UNG2 axis in response to DNA base damage.
Strengths:
In this work, the authors took an unbiased, genome-wide CRISPR approach to identify novel factors involved in uracil repair with potential clinical interest.
The authors designed elegant experiments to confirm that METTL3 acts on genomic DNA, adding the methylation to DNA (6mA) rather than RNA (m6A), in this base damage repair context. The authors employ different enzymes, such as RNase A, RNase H, and DNase, together with liquid chromatography coupled to tandem mass spectrometry, to validate that METTL3 deposits 6mA in DNA in response to agents that increase genomic uracil.
The Mettl3-KO and METTL3 inhibition results further support their conclusion.
Weaknesses:
The authors used the METTL3 inhibitor and Mettl3-KO to validate the METTL3-6mA-UNG2 functional roles. While not an outright weakness, rescue experiments of the KO line with wild type and the METTL3 catalytic mutant would have further strengthened the evidence.
-
Author response:
The following is the authors’ response to the original reviews.
eLife Assessment
This manuscript reports important findings that the methyltransferase METTL3 is involved in the repair of abasic sites and uracil in DNA, mediating resistance to floxuridine-driven cytotoxicity. Convincing evidence shows the involvement of m6A in DNA based on single cell imaging and mass spec data. The authors present evidence that the m6A signal does not result from bacterial contamination or RNA, but the text does not make this overly clear.
We thank the editors for recognizing the importance of our work and the relevance of METTL3 and 6mA in DNA repair. We agree the evidence presented can be regarded as convincing, in that it includes validation with orthogonal approaches and excludes the source of 6mA being RNA or bacterial contamination.
To clarify, the identification of 6mA in DNA, upon DNA damage, is based first on immunofluorescence observations using an anti-m6A antibody. In this setting, removal of RNA with RNase treatment fails to reduce the 6mA signal, excluding the possibility that the source of signal is RNA. In contrast, removal of DNA with DNase treatment removes all 6mA signal, strongly suggesting that the species carrying the N6-methyladenosine modification is DNA (Figure 3D, E). Importantly, in Figure 3F, G, we provide orthogonal, quantitative mass spectrometry data that independently confirm this finding. Mass spectrometry-liquid chromatography of DNA analytes conclusively shows the presence of 6mA in DNA upon treatment with DNA damaging agents and, based on exact mass, excludes RNA as the source.
Cells only show the 6mA signal when treated with DNA damaging agents, and the 6mA is absent from untreated cells (Figure 3D, E, H, I). This provides strong evidence that the 6mA signal is not a result of bacterial contamination in our cell lines. Furthermore, our cell lines are routinely tested for mycoplasma contamination. It could be possible that stock solutions of DNA damaging agents may be contaminated, but this would need to be true for all individual drugs and stocks tested, which is highly unlikely. Moreover, the data showing 6mA signal is not significantly different from untreated cells when a DNA damaging agent is combined with a METTL3 inhibitor (Figure 3H, I) provides strong evidence against bacterial contamination in our stocks.
In summary, we provide conclusive evidence, based on orthogonal methods, that the METTL3-dependent N6-methyladenosine modification is deposited in DNA, not RNA, in response to DNA damage and have now clarified these points in the results and discussion.
Public Reviews:
Reviewer #1 (Public review):
Summary:
The authors sought to identify unknown factors involved in the repair of uracil in DNA through a CRISPR knockout screen.
Strengths:
The screen identified both known and unknown proteins involved in DNA repair resulting from uracil or modified uracil base incorporation into DNA. The conclusion is that the protein activity of METTL3, which converts A nucleotides to 6mA nucleotides, plays a role in the DNA damage/repair response. The importance of METTL3 in DNA repair, and its colocalization with a known DNA repair enzyme, UNG2, is well characterized.
Weaknesses:
This reviewer identified no major weaknesses in this study. The manuscript could be improved by tightening the text throughout, and more accurate and consistent word choice around the origin of U and 6mA in DNA. The dUTP nucleotide is misincorporated into DNA, and 6mA is formed by methylation of the A base present in DNA. Using words like 6mA "deposition in DNA" seems to imply it results from incorporation of a methylated dATP nucleotide during DNA synthesis.
The increased presence of 6mA during DNA damage could result from methylation at the A base itself (within DNA) or from incorporation of pre-modified 6mA during DNA synthesis. Our data do not directly discriminate between these two mechanisms, and we clarified this point in the discussion.
Reviewer #2 (Public review):
Summary:
In this work, the authors performed a CRISPR knockout screen in the presence of floxuridine, a chemotherapeutic agent that incorporates uracil and fluoro-uracil into DNA, and identified unexpected factors, such as the RNA m6A methyltransferase METTL3, as required to overcome floxuridine-driven cytotoxicity in mammalian cells. Interestingly, the observed N6-methyladenosine was embedded in DNA, which has been reported as DNA 6mA in mammalian genomes and is confirmed here by mass spectrometry. This work therefore consolidates the functional role of mammalian genomic DNA 6mA and provides solid evidence for a METTL3-6mA-UNG2 axis in response to DNA base damage.
Strengths:
In this work, the authors took an unbiased, genome-wide CRISPR approach to identify novel factors involved in uracil repair with potential clinical interest.
The authors designed elegant experiments to confirm that METTL3 acts on genomic DNA, adding the methylation to DNA (6mA) rather than RNA (m6A), in this base damage repair context. The authors employ different enzymes, such as RNase A, RNase H, and DNase, together with liquid chromatography coupled to tandem mass spectrometry, to validate that METTL3 deposits 6mA in DNA in response to agents that increase genomic uracil.
The Mettl3-KO and METTL3 inhibition results further support their conclusion.
Weaknesses:
Although this study demonstrates that METTL3-dependent 6mA deposition in DNA is functionally relevant to DNA damage repair in mammalian cells, there are still several concerns and issues that need to be improved to strengthen this research.
First, the authors never mention the results of contamination testing of their mammalian cell lines, which is a fundamental control for any study of DNA 6mA in mammalian cell lines.
Our cell lines are routinely tested for bacterial contamination, specifically mycoplasma, and we state this information in the revised manuscript.
Importantly, we do not observe 6mA in untreated cells, strongly suggesting that the 6mA signal observed is dependent on the presence of DNA damage and not caused by contamination in the cell lines (Figure 3D, E, H, I). While it could be possible that stock solutions of DNA damaging agents may be contaminated, this would need to be the case for all individual drugs and stocks tested that induce 6mA, which is very unlikely. Finally, the data showing 6mA signal is not significantly different from untreated cells when a DNA damaging agent is combined with a METTL3 inhibitor (Figure 3 H, I) provides strong evidence against bacterial contamination in our drug stocks.
Second, the authors have not supplied any genomic sequencing data to support their conclusions. Although sequencing DNA 6mA in mammalian models is challenging, recent advances in sequencing techniques, such as DR-Seq or NT/NAME-seq, have lowered the bar and substantially improved 6mA sequencing assays. The authors should therefore consider employing these sequencing methods to further confirm the functional role of 6mA in base repair.
While we agree that it could be important to understand the precise genomic location of 6mA in relation to DNA damage, this is outside the scope of the current study. Moreover, this exercise may prove unproductive. If 6mA is enriched in DNA at damage sites or as DNA is replicated, the genomic mapping of 6mA is likely to be stochastic. If stochastic, it would be impossible to obtain the read depth necessary to map 6mA accurately.
Third, the authors used the METTL3 inhibitor and Mettl3-KO to validate the METTL3-6mA-UNG2 functional roles. However, a catalytic mutant and rescue of Mettl3 would be further experiments to confirm the conclusion.
We believe this to be an excellent suggestion from Reviewer #2 but we are unable to perform the proposed experiment at this time. We encourage future studies to explore the rescue experiment.
Reviewer #3 (Public review):
Summary:
The authors are showing evidence that they claim establishes the controversial epigenetic mark, DNA 6mA, as promoting genome stability.
Strengths:
The identification of a poorly understood protein, METTL3, and its subsequent characterization in DDR is of high quality and interesting.
Weaknesses:
(1) The very presence of 6mA (DNA) in mammalian DNA is still highly controversial and numerous studies have been conclusively shown to have reported the presence of 6mA due to technical artifacts and bacterial contamination. Thus, to my knowledge there is no clear evidence for 6mA as an epigenetic mark in mammals, and consequently, no evidence of writers and readers of 6mA. None of this is mentioned in the introduction. Much of the introduction can be reduced, but a paragraph clearly stating the controversy and lack of evidence for 6mA in mammals needs to be added, otherwise, the reader is given an entirely distorted view of the field.
These concerns must also be clearly stated in the limitations section and even in the results section, which fails to nuance the authors' findings.
We agree with the reviewer that the presence and potential function of 6mA in mammalian DNA has been debated. Importantly, the debate regarding the presence and quantity of 6mA in DNA has been previously restricted to undamaged, baseline conditions. In complete agreement with this notion, we do not detect appreciable levels of 6mA in untreated cells. We revised the introduction section to present the debate about 6mA in DNA. We, however, want to highlight that our study provides, for the first time, convincing evidence (based on two orthogonal methods) that 6mA is present in DNA in response to a stimulus, DNA damage. We do not claim or provide any data that suggest 6mA is a baseline epigenetic mark.
(2) What is the motivation for using HT-29 cells? Moreover, the materials and methods do not state how the authors controlled for bacterial contamination, which has been the most common cause of erroneous 6mA signals to date. Did the authors routinely check for mycoplasma?
HT-29 is a cell line of colorectal origin and chemotherapeutic agents that introduce uracil and uracil derivatives in DNA, as those used in this study, are relevant for the treatment of colorectal cancer. As indicated above, we do not observe 6mA in untreated cells, strongly suggesting that the 6mA signal observed is dependent on DNA damage and not caused by a potential bacterial contamination (Figure 3D, E, H, I). Additionally, our cell lines are routinely tested for bacterial contamination, specifically mycoplasma.
(3) The single cell imaging of 6mA in various cells is nice. The results are confirmed by mass spec as an orthogonal approach. Another orthogonal and quantitative approach to assessing 6mA levels would be PacBio. Similarly, it is unclear why the authors have not performed dot-blots of 6mA for genomic DNA from the given cell lines.
We are confused by this point since an orthogonal approach to detect 6mA, mass spectrometry-liquid chromatography, was employed. This method does not use an antibody and confirms the increase of 6mA in DNA when cells were treated with DNA damaging agents. This data is presented in Figure 3F, G.
It is sensible to hypothesize that the localization of 6mA is consistent with DNA replication (like uracil deposition). In this event, the genomic mapping of 6mA is likely to be stochastic. This would make quantification with PacBio sequencing difficult because it would be very challenging to achieve the appropriate read depth to call a modified base.
Dot blots rely on an antibody and thus are not truly orthogonal to our immunofluorescence-based measurements. We preferred the mass spectrometry-liquid chromatography approach we took as a true orthogonal approach.
(4) The results of Figure 3 need further investigation and validation. If the results are correct, the authors are suggesting that the majority of 6mA in their cell lines is present in the DNA, and not the RNA, which is completely contrary to every other study of 6mA in mammalian cells that I am aware of. This could suggest that the antibody is not, in fact, binding to 6mA, but to unmodified adenine, which would explain why the signal disappears after DNase treatment. Indeed, binding to unmethylated DNA is a commonly known problem with most 6mA antibodies and is well described elsewhere.
Based on this and the following comment, we are convinced that Reviewer #3 has overlooked two critical elements of our study:
First, the immunofluorescence work presented in Figure 3, showing 6mA signal in response to DNA damage, uses cells that were pre-extracted to remove excess cytoplasmic RNA. This method is often used in immunofluorescence experiments of this kind. The pre-extraction method removes most of the cytoplasmic content, and the majority of the cytoplasmic m6A RNA signal. Supplementary Figure 3D shows cells that have not been pre-extracted prior to staining. These images show the cytoplasmic m6A signal is abundant if we do not perform the pre-extraction step.
If the antibody used to label 6mA significantly reacted with unmodified adenine, we would expect a large signal in untreated or untreated and denatured conditions. In contrast, an increase in 6mA is not observed in either case.
Second, the orthogonal approach we employed, mass spectrometry coupled with liquid chromatography, measures 6mA DNA analytes specifically by exact mass. This approach does not depend on an antibody and yields results consistent with those from the immunofluorescence experiments.
(5) Given the lack of orthogonal validation of the observed DNA 6mA and the lack of evidence supporting the presence of 6mA in mammalian DNA and consequently any functional role for 6mA in mammalian biology, the manuscript's conclusions need to be toned down significantly, and the inherent difficulty in assessing 6mA accurately in mammals acknowledged throughout.
As discussed in response to prior comments, Figure 3 does provide two independent and orthogonal methods that demonstrate 6mA presence in DNA specifically, and not RNA, in response to DNA damage. Complementary and orthogonal datasets are presented using either immunofluorescence microscopy or mass spectrometry-liquid chromatography of extracted DNA. The latter method does not rely on an antibody and can discriminate 6mA DNA versus RNA based on exact mass. We revised the text to clarify that Figure 3F, G is a completely orthogonal approach.
Recommendations for the authors:
Reviewer #2 (Recommendations for the authors):
The authors cited most of the related publications; however, the reviewer suggests that three 2015 papers in Cell (Dahua Chen's, Yang Shi's, and Chuan He's) and the 2016 Nature article (Andrew Xiao's) are worth citing here because they are the milestone works that first reported genomic DNA 6mA in eukaryotic and mammalian genomes.
Furthermore, Tao P. Wu and Andrew Z. Xiao's 2016 Nature article already emphasized that genomic DNA 6mA is enriched at H2A.X sites; that work therefore indicated a link between DNA damage and repair and a functional role for 6mA. The authors may add some comments or discussion on this point.
Last but not least, the authors may also need to discuss the reported evidence of DNA 6mA's function in mitochondria.
We thank the reviewer for these suggestions. We revised our introduction and include additional references and discussion points, as suggested by the reviewer.
Reviewer #3 (Recommendations for the authors):
Minor points:
(1) In general, the manuscript is too verbose, and the amount of text can be dramatically reduced/sharpened. The introduction in particular is too long.
We revised the manuscript and reduced text when appropriate.
(2) Each results section can also be condensed to improve clarity significantly. Indeed the results section reads like a 'Result & Discussion' section, which is then followed by a Discussion. Maybe the discussion section can be shortened to a 'conclusion'.
We revised the results section when appropriate and reworked the discussion.
Importantly, we revised the text related to Figure 3 as it does appear that Reviewer #3 did not appreciate key results present in this figure, specifically the orthogonal, mass spectrometry approach validating the discovery of 6mA DNA species (Figure 3F, G). We added a schematic as Figure 3F to further clarify this point as well.
(3) The accession number for sequencing data in GEO data should be provided.
The accession number is now provided in the manuscript: GSE282260.
(4) All figures are unnecessarily small and in some cases, supporting figures from the supplementary data should be moved into the main figure to improve clarity.
The figures are of high image quality and can be enlarged easily. If there are specific figures that the reviewer believes will improve clarity, we would be happy to move them.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This paper addresses the important question of quantifying epistasis patterns, which affect the predictability of evolution, by reanalyzing a recently published combinatorial deep mutational scan experiment. The findings are that epistasis is fluid, i.e. strongly background dependent, but that fitness effects of mutations are predictable based on the wild-type phenotype. However, these potentially interesting claims are inadequately supported by the analysis, because measurement noise is not accounted for, arbitrary cutoffs are used, and global nonlinearities are not sufficiently considered. If the results continue to hold after these major improvements in the analysis, they should be of interest to all biologists working in the field of fitness landscapes.
-
Reviewer #1 (Public review):
This paper describes a number of patterns of epistasis in a large fitness landscape dataset recently published by Papkou et al. The paper is motivated by an important goal in the field of evolutionary biology to understand the statistical structure of epistasis in protein fitness landscapes, and it capitalizes on the unique opportunities presented by this new dataset to address this problem.
The paper reports some interesting previously unobserved patterns that may have implications for our understanding of fitness landscapes and protein evolution. In particular, Figure 5 is very intriguing. However, I have two major concerns detailed below. First, I found the paper rather descriptive (it makes little attempt to gain deeper insights into the origins of the observed patterns) and unfocused (it reports what appears to be a disjointed collection of various statistics without a clear narrative). Second, I have concerns with the statistical rigor of the work.
(1) I think Figures 5 and 7 are the main, most interesting, and novel results of the paper. However, I don't think that the statement "Only a small fraction of mutations exhibit global epistasis" accurately describes what we see in Figure 5. To me, the most striking feature of this figure is that the effects of most mutations at all sites appear to be a mixture of three patterns. The most interesting pattern noted by the authors is of course the "strong" global epistasis, i.e., when the effect of a mutation is highly negatively correlated with the fitness of the background genotype. The second pattern is a "weak" global epistasis, where the correlation with background fitness is much weaker or non-existent. The third pattern is the vertically spread-out cluster at low-fitness backgrounds, i.e., a mutation has a wide range of mostly positive effects that are clearly not correlated with fitness. What is very interesting to me is that all background genotypes fall into these three groups with respect to almost every mutation, but the proportions of the three groups are different for different mutations. In contrast to the authors' statement, it seems to me that almost all mutations display strong global epistasis in at least a subset of backgrounds. A clear example is C>A mutation at site 3.
1a. I think the authors ought to try to dissect these patterns and investigate them separately rather than lumping them all together and declaring that global epistasis is rare. For example, I would like to know whether those backgrounds in which mutations exhibit strong global epistasis are the same for all mutations or whether they are mutation- or perhaps position-specific. Both answers could be potentially very interesting, either pointing to some specific site-site interactions or, alternatively, suggesting that the statistical patterns are conserved despite variation in the underlying interactions.
1b. Another rather remarkable feature of this plot is that the slopes of the strong global epistasis patterns seem to be very similar across mutations. Is this the case? Is there anything special about this slope? For example, does this slope simply reflect the fact that a given mutation becomes essentially lethal (i.e., produces the same minimal fitness) in a certain set of background genotypes?
1c. Finally, how consistent are these patterns with some null expectations? Specifically, would one expect the same distribution of global epistasis slopes on an uncorrelated landscape? Are the pivot points unusually clustered relative to an expectation on an uncorrelated landscape?
1d. The shapes of the DFE shown in Figure 7 are also quite interesting, particularly the bimodal nature of the DFE in high-fitness (HF) backgrounds. I think this bimodality must be a reflection of the clustering of mutation-background combinations mentioned above. I think the authors ought to draw this connection explicitly. Do all HF backgrounds have a bimodal DFE? What mutations occupy the "moving" peak?
1e. In several figures, the authors compare the patterns for HF and low-fitness (LF) genotypes. In some cases, there are some stark differences between these two groups, most notably in the shape of the DFE (Figure 7B, C). But there is no discussion about what could underlie these differences. Why are the statistics of epistasis different for HF and LF genotypes? Can the authors at least speculate about possible reasons? Why do HF and LF genotypes have qualitatively different DFEs? I actually don't quite understand why the transition between bimodal DFE in Figure 7B and unimodal DFE in Figure 7C is so abrupt. Is there something biologically special about the threshold that separates LF and HF genotypes? My understanding was that this was just a statistical cutoff. Perhaps the authors can plot the DFEs for all backgrounds on the same plot and just draw a line that separates HF and LF backgrounds so that the reader can better see whether the DFE shape changes gradually or abruptly.
1f. The analysis of the synonymous mutations is also interesting. However, I think a few additional analyses are necessary to clarify what is happening here. I would like to know the extent to which synonymous mutations are more often neutral compared to non-synonymous ones. Then, do synonymous pairs interact in the same way as non-synonymous pairs (i.e., plot Figure 1 for synonymous pairs)? Do synonymous or non-synonymous mutations that are neutral exhibit less epistasis than non-neutral ones? Finally, do non-synonymous mutations alter epistasis among other mutations more often than synonymous mutations do? What about synonymous-neutral versus synonymous-non-neutral? Basically, I'd like to understand the extent to which a mutation that is neutral in a given background is more or less likely to alter epistasis between other mutations than a non-neutral mutation in the same background.
(2) I have two related methodological concerns. First, in several analyses, the authors employ thresholds that appear to be arbitrary. And second, I did not see any account of measurement errors. For example, the authors chose the 0.05 threshold to distinguish between epistasis and no epistasis, but why this particular threshold was chosen is not justified. Another example: whether the product s12 × (s1 + s2) is greater or smaller than zero for any given mutation is uncertain due to measurement errors. Presumably, how to classify each pair of mutations should depend on the precision with which the fitness of mutants is measured. These thresholds could well be different across mutants. We know, for example, that low-fitness mutants typically have noisier fitness estimates than high-fitness mutants. I think the authors should use a statistically rigorous procedure to categorize mutations and their epistatic interactions. I think it is very important to address this issue. I got very concerned about it when I saw on LL 383-388 that synonymous stop codon mutations appear to modulate epistasis among other mutations. This seems very strange to me and makes me quite worried that this is a result of noise in LF genotypes.
-
Reviewer #2 (Public review):
Significance:
This paper reanalyzes an experimental fitness landscape generated by Papkou et al., who assayed the fitness of all possible combinations of 4 nucleotide states at 9 sites in the E. coli DHFR gene, which confers antibiotic resistance. The 9 nucleotide sites make up 3 amino acid sites in the protein, of which one was shown to be the primary determinant of fitness by Papkou et al. This paper sought to assess whether pairwise epistatic interactions differ among genetic backgrounds at other sites and whether there are major patterns in any such differences. They use a "double mutant cycle" approach to quantify pairwise epistasis, where the epistatic interaction between two mutations is the difference between the measured fitness of the double-mutant and its predicted fitness in the absence of epistasis (which equals the sum of individual effects of each mutation observed in the single mutants relative to the reference genotype). The paper claims that epistasis is "fluid," because pairwise epistatic effects often differ depending on the genetic state at the other site. It also claims that this fluidity is "binary," because pairwise effects depend strongly on the state at nucleotide positions 5 and 6 but weakly on those at other sites. Finally, they compare the distribution of fitness effects (DFE) of single mutations for starting genotypes with similar fitness and find that despite the apparent "fluidity" of interactions this distribution is well-predicted by the fitness of the starting genotype.
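The double mutant cycle described above can be sketched in a few lines. This is an illustrative sketch only, assuming fitness is measured on an additive scale; the function name and the fitness values are hypothetical and are not taken from the manuscript or from Papkou et al.

```python
def pairwise_epistasis(f_ref, f_a, f_b, f_ab):
    """Double-mutant-cycle epistasis on an additive fitness scale."""
    s_a = f_a - f_ref                # effect of mutation A alone
    s_b = f_b - f_ref                # effect of mutation B alone
    predicted = f_ref + s_a + s_b    # double-mutant fitness if effects simply add
    return f_ab - predicted          # deviation from additivity

# Hypothetical fitness values for the same mutation pair in two backgrounds.
eps_bg1 = pairwise_epistasis(f_ref=0.0, f_a=-0.2, f_b=-0.3, f_ab=-0.5)
eps_bg2 = pairwise_epistasis(f_ref=-0.1, f_a=-0.3, f_b=-0.2, f_ab=-0.1)
```

With these made-up numbers the pair is additive in the first background (epistasis close to 0) and positively epistatic in the second (close to 0.3), which is the kind of background dependence the paper calls "fluid".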
The paper addresses an important question for genetics and evolution: how complex and unpredictable are the effects and interactions among mutations in a protein? Epistasis can make the phenotype hard to predict from the genotype and also affect the evolutionary navigability of a genotype landscape. Whether pairwise epistatic interactions depend on genetic background - that is, whether there are important high-order interactions -- is important because interactions of order greater than pairwise would make phenotypes especially idiosyncratic and difficult to predict from the genotype (or by extrapolating from experimentally measured phenotypes of genotypes randomly sampled from the huge space of possible genotypes). Another interesting question is the sparsity of such high-order interactions: if they exist but mostly depend on a small number of identifiable sequence sites in the background, then this would drastically reduce the complexity and idiosyncrasy relative to a landscape on which "fluidity" involves interactions among groups of all sites in the protein. A number of papers in the recent literature have addressed the topics of high-order epistasis and sparsity and have come to conflicting conclusions. This paper contributes to that body of literature with a case study of one published experimental dataset of high quality. The findings are therefore potentially significant if convincingly supported.
Validity:
In my judgment, the major conclusions of this paper are not well supported by the data. There are three major problems with the analysis.
(1) Lack of statistical tests. The authors conclude that pairwise interactions differ among backgrounds, but no statistical analysis is provided to establish that the observed differences are statistically significant, rather than being attributable to error and noise in the assay measurements. It has been established previously that the methods the authors use to estimate high-order interactions can result in inflated inferences of epistasis because of the propagation of measurement noise (see PMID 31527666 and 39261454). Error propagation can be extreme because first-order mutation effects are calculated as the difference between the measured phenotype of a single-mutant variant and the reference genotype; pairwise effects are then calculated as the difference between the measured phenotype of a double mutant and the sum of the differences described above for the single mutants. This paper claims fluidity when this latter difference itself differs when assessed in two different backgrounds. At each step of these calculations, measurement noise propagates. Because no statistical analysis is provided to evaluate whether these observed differences are greater than expected because of propagated error, the paper has not convincingly established or quantified "fluidity" in epistatic effects.
(2) Arbitrary cutoffs. Many of the analyses involve assigning pairwise interactions into discrete categories, based on the magnitude and direction of the difference between the predicted and observed phenotypes for a pairwise mutant. For example, the authors categorize as a positive pairwise interaction if the apparent deviation of phenotype from prediction is >0.05, negative if the deviation is <-0.05, and no interaction if the deviation is between these cutoffs. Fluidity is diagnosed when the category for a pairwise interaction differs among backgrounds. These cutoffs are essentially arbitrary, and the effects are assigned to categories without assessing statistical significance. For example, an interaction of 0.06 in one background and 0.04 in another would be classified as fluid, but it is very plausible that such a difference would arise due to error alone. The frequency of epistatic interactions in each category as claimed in the paper, as well as the extent of fluidity across backgrounds, could therefore be systematically overestimated or underestimated, affecting the major conclusions of the study.
(3) Global nonlinearities. The analyses do not consider the fact that apparent fluidity could be attributable to the fact that fitness measurements are bounded by a minimum (the fitness of cells carrying proteins in which DHFR is essentially nonfunctional) and a maximum (the fitness of cells in which some biological factor other than DHFR function is limiting for fitness). The data are clearly bounded; the original Papkou et al. paper states that 93% of genotypes are at the low-fitness limit at which deleterious effects no longer influence fitness. Because of this bounding, mutations that are strongly deleterious to DHFR function will therefore have an apparently smaller effect when introduced in combination with other deleterious mutations, leading to apparent epistatic interactions; moreover, these apparent interactions will have different magnitudes if they are introduced into backgrounds that themselves differ in DHFR function/fitness, leading to apparent "fluidity" of these interactions. This is a well-established issue in the literature (see PMIDs 30037990, 28100592, 39261454). It is therefore important to adjust for these global nonlinearities before assessing interactions, but the authors have not done this.
This global nonlinearity could explain much of the fluidity claimed in this paper. It could explain the observation that epistasis does not seem to depend as much on genetic background for low-fitness backgrounds, and the latter is constant (Figure 2B and 2C): these patterns would arise simply because the effects of deleterious mutations are all epistatically masked in backgrounds that are already near the fitness minimum. It would also explain the observations in Figure 7. For background genotypes with relatively high fitness, there are two distinct peaks of fitness effects, which likely correspond to neutral mutations and deleterious mutations that bring fitness to the lower bound of measurement; as the fitness of the background declines, the deleterious mutations have a smaller effect, so the two peaks draw closer to each other, and in the lowest-fitness backgrounds, they collapse into a single unimodal distribution in which all mutations are approximately neutral (with the distribution reflecting only noise).
Global nonlinearity could also explain the apparent "binary" nature of epistasis. Sites 4 and 5 change the second amino acid, and the Papkou paper shows that only 3 amino acid states (C, D, and E) are compatible with function; all others abolish function and yield lower-bound fitness, while mutations at other sites have much weaker effects. The apparent binary nature of epistasis in Figure 5 corresponds to these effects given the nonlinearity of the fitness assay. Most mutations are close to neutral irrespective of the fitness of the background into which they are introduced: these are the "non-epistatic" mutations in the binary scheme. For the mutations at sites 4 and 5 that abolish one of the beneficial mutations, however, these have a strong background-dependence: they are very deleterious when introduced into a high-fitness background but their impact shrinks as they are introduced into backgrounds with progressively lower fitness.
The apparent "binary" nature of global epistasis is likely to be a simple artifact of bounding and the bimodal distribution of functional effects: neutral mutations are insensitive to background, while the magnitude of the fitness effect of deleterious mutations declines with background fitness because they are masked by the lower bound. The authors' statement is that "global epistasis often does not hold." This is not established. A more plausible conclusion is that global epistasis imposed by the phenotype limits affects all mutations, but it does so in a nonlinear fashion.
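The bounding argument above can be illustrated with a toy model. The sketch below assumes a latent phenotype on which mutations are perfectly additive, and an observed fitness that is simply the latent value clipped to a measurable range; the bounds, effect sizes, and function names are hypothetical and chosen only for illustration.

```python
def observed_fitness(latent, lower=-1.0, upper=0.0):
    """Clip an additive latent phenotype to the measurable fitness range."""
    return max(lower, min(upper, latent))

def apparent_epistasis(bg, effect_a, effect_b):
    """Double-mutant-cycle epistasis computed on observed (clipped) fitness."""
    f_ref = observed_fitness(bg)
    f_a = observed_fitness(bg + effect_a)
    f_b = observed_fitness(bg + effect_b)
    f_ab = observed_fitness(bg + effect_a + effect_b)
    return f_ab - f_a - f_b + f_ref

# Two deleterious mutations that are exactly additive on the latent scale,
# introduced into backgrounds of decreasing latent fitness.
high_bg = apparent_epistasis(bg=0.0, effect_a=-0.8, effect_b=-0.8)
mid_bg = apparent_epistasis(bg=-0.5, effect_a=-0.8, effect_b=-0.8)
low_bg = apparent_epistasis(bg=-1.0, effect_a=-0.8, effect_b=-0.8)
```

Under these assumed numbers the apparent epistasis is roughly 0.6, 0.5, and 0 in the three backgrounds, even though no specific interaction exists on the latent scale. Background-dependent epistasis that vanishes near the lower bound is therefore what bounding alone would predict.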
In conclusion, most of the major claims in the paper could be artifactual. Much of the claimed pairwise epistasis could be caused by measurement noise, the use of arbitrary cutoffs, and the lack of adjustment for global nonlinearity. Much of the fluidity or higher-order epistasis could be attributable to the same issues. And the apparently binary nature of global epistasis is also the expected result of this nonlinearity.
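The noise and cutoff concerns can be made concrete with a simple error-propagation sketch. It assumes that each genotype has an independent standard error and that a normal approximation is adequate; the standard error of 0.02 per genotype is hypothetical, and the 0.06 versus 0.04 comparison mirrors the example given under point (2).

```python
import math

def epistasis_se(se_ref, se_a, se_b, se_ab):
    """Standard error of f_ab - f_a - f_b + f_ref for independent measurements."""
    return math.sqrt(se_ref**2 + se_a**2 + se_b**2 + se_ab**2)

def fluidity_z(eps_1, se_1, eps_2, se_2):
    """z-score for the difference in epistasis between two backgrounds."""
    return (eps_1 - eps_2) / math.sqrt(se_1**2 + se_2**2)

# Hypothetical: every genotype is measured with a standard error of 0.02.
se_eps = epistasis_se(0.02, 0.02, 0.02, 0.02)
z = fluidity_z(0.06, se_eps, 0.04, se_eps)
```

Here the propagated standard error of each epistasis estimate is 0.04, and the z-score for the difference is about 0.35, far below conventional significance thresholds. A fixed cutoff of 0.05 would nevertheless place the two backgrounds in different categories.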
-
Reviewer #3 (Public review):
Summary:
The authors have studied a previously published large dataset on the fitness landscape of a 9 base-pair region of the folA gene. The objective of the paper is to understand various aspects of epistasis in this system, which the authors have achieved through detailed and computationally expensive exploration of the landscape. The authors describe epistasis in this system as "fluid", meaning that it depends sensitively on the genetic background, thereby reducing the predictability of evolution at the genetic level. However, the study also finds two robust patterns. The first is the existence of a "pivot point" for a majority of mutations, which is a fixed growth rate at which the effect of mutations switches from beneficial to deleterious (consistent with a previous study on the topic). The second is the observation that the distribution of fitness effects (DFE) of mutations is predicted quite well by the fitness of the genotype, especially for high-fitness genotypes. While the work does not offer a synthesis of the multitude of reported results, the information provided here raises interesting questions for future studies in this field.
Strengths:
A major strength of the study is its detailed and multifaceted approach, which has helped the authors tease out a number of interesting epistatic properties. The study makes a timely contribution by focusing on topical issues like the prevalence of global epistasis, the existence of pivot points, and the dependence of DFE on the background genotype and its fitness. The methodology is presented in a largely transparent manner, which makes it easy to interpret and evaluate the results.
The authors have classified pairwise epistasis into six types and found that the type of epistasis changes depending on background mutations. Switches happen more frequently for mutations at functionally important sites. Interestingly, the authors find that even synonymous mutations in stop codons can alter the epistatic interaction between mutations in other codons. Consistent with these observations of "fluidity", the study reports limited instances of global epistasis (which predicts a simple linear relationship between the size of a mutational effect and the fitness of the genetic background in which it occurs). Overall, the work presents some evidence for the genetic context-dependent nature of epistasis in this system.
Weaknesses:
Despite the wealth of information provided by the study, there are some shortcomings of the paper which must be mentioned.
(1) In the Significance Statement, the authors say that the "fluid" nature of epistasis is a previously unknown property. This is not accurate. What the authors describe as "fluidity" is essentially the prevalence of certain forms of higher-order epistasis (i.e., epistasis beyond pairwise mutational interactions). The existence of higher-order epistasis is a well-known feature of many landscapes. For example, in an early work, (Szendro et al., J. Stat. Mech., 2013), the presence of a significant degree of higher-order epistasis was reported for a number of empirical fitness landscapes. Likewise, (Weinreich et al., Curr. Opin. Genet. Dev., 2013) analysed several fitness landscapes and found that higher-order epistatic terms were on average larger than the pairwise term in nearly all cases. They further showed that ignoring higher-order epistasis leads to a significant overestimate of accessible evolutionary paths. The literature on higher-order epistasis has grown substantially since these early works. Any future versions of the present preprint will benefit from a more thorough contextual discussion of the literature on higher-order epistasis.
(2) In the paper, the term 'sign epistasis' is used in a way that is different from its well-established meaning. (Pairwise) sign epistasis, in its standard usage, is said to occur when the effect of a mutation switches from beneficial to deleterious (or vice versa) when a mutation occurs at a different locus. The authors require a stronger condition, namely that the sum of the individual effects of two mutations should have the opposite sign from their joint effect. This is a sufficient condition for sign epistasis, but not a necessary one. The property studied by the authors is important in its own right, but it is not equivalent to sign epistasis.
(3) The authors have looked for global epistasis in all 108 (9x12) mutations, out of which only 16 showed a correlation of R^2 > 0.4. 14 out of these 16 mutations were in the functionally important nucleotide positions. Based on this, the authors conclude that global epistasis is rare in this landscape, and further, that mutations in this landscape can be classified into one of two binary states - those that exhibit global epistasis (a small minority) and those that do not (the majority). I suspect, however, that a biologically significant binary classification based on these data may be premature. Unsurprisingly, mutational effects are stronger at the functional sites as seen in Figure 5 and Figure 2, which means that even if global epistasis is present for all mutations, a statistical signal will be more easily detected for the functionally important sites. Indeed, the authors show that the means of DFEs decrease linearly with background fitness, which hints at the possibility that a weak global epistatic effect may be present (though hard to detect) in the individual mutations. Given the high importance of the phenomenon of global epistasis, it pays to be cautious in interpreting these results.
(4) The study reports that synonymous mutations frequently change the nature of epistasis between mutations in other codons. However, it is unclear whether this should be surprising, because, as the authors have already noted, synonymous mutations can have an impact on cellular functions. The reader may wonder if the synonymous mutations that cause changes in epistatic interactions in a certain background also tend to be non-neutral in that background. Unfortunately, the fitness effect of synonymous mutations has not been reported in the paper.
(5) The authors find that DFEs of high-fitness genotypes tend to depend only on fitness and not on genetic composition. This is an intriguing observation, but unfortunately, the authors do not provide any possible explanation or connect it to theoretical literature. I am reminded of work by (Agarwala and Fisher, Theor. Popul. Biol., 2019) as well as (Reddy and Desai, eLife, 2023) where conditions under which the DFE depends only on the fitness have been derived. Any discussion of possible connections to these works could be a useful addition.
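The two criteria contrasted in weakness (2) can be written side by side. This is a sketch under assumptions: the first function follows the standard pairwise definition of sign epistasis, and the second is one reading of the sum-versus-joint-effect condition attributed to the manuscript in that point. The function names and fitness values are hypothetical.

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)

def standard_sign_epistasis(f_ref, f_a, f_b, f_ab):
    """True if either mutation changes sign depending on the other's presence."""
    a_flips = sign(f_a - f_ref) * sign(f_ab - f_b) < 0
    b_flips = sign(f_b - f_ref) * sign(f_ab - f_a) < 0
    return a_flips or b_flips

def sum_vs_joint_sign_flip(f_ref, f_a, f_b, f_ab):
    """True if the summed single effects and the joint effect have opposite signs."""
    s_a, s_b, s_ab = f_a - f_ref, f_b - f_ref, f_ab - f_ref
    return (s_a + s_b) * s_ab < 0

# Hypothetical genotype quartet: A is beneficial alone but deleterious after B.
example = dict(f_ref=0.0, f_a=1.0, f_b=2.0, f_ab=1.5)
standard = standard_sign_epistasis(**example)
summed = sum_vs_joint_sign_flip(**example)
```

In this example the standard criterion is met while the sum-based criterion is not, so counts of "sign epistasis" obtained under the two definitions need not agree.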
-
Author response:
Thank you for sharing a detailed review of our manuscript titled, Variations and predictability of epistasis on an intragenic fitness landscape. We have now carefully gone through the reviewers’ and the editor’s comments and have the following preliminary responses.
(1) Measurement noise in the folA fitness landscape. All three reviewers and the editors raise the important matter of incorporating measurement noise in the fitness landscape. The paper by Papkou and coworkers makes the fitness measurements of the landscape in six independent repeats. They show that the fitness data is highly correlated in each repeat, and use the weighted mean of the repeats to report their results. They do not study how measurement noise influences their findings. The results by Papkou and coworkers were our starting point, and hence, we built on the landscape properties reported in their study. As a result, we also analyse our results working with the same mean of the six independent measurements.
The main result of the work by Papkou and coworkers is that the largest subgraph in the landscape has 514 fitness peaks.
We revisit this result by quantifying how measurement noise changes this number. By doing this, we note the subgraph contains only 127 peaks which are statistically significant. We define a sequence as a peak when its corresponding fitness is greater than all its one-distance neighbours with a p-value < 0.05. This shows that, as pointed out in the reviews, incorporating noise in the landscape results significantly changes how we view the landscape – a facet not included in Papkou et al and the current version of our manuscript.
Not incorporating measurement noise means that the entire landscape has 4055 peaks. When measurement noise is included in the analysis, this number reduces to 137, out of which 136 are high fitness backgrounds (functional).
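The response does not state which statistical test was used to call peaks. As one possible reading of the definition given above, the sketch below assumes that each genotype has a mean fitness and a standard error across replicates, applies a one-sided normal-approximation test against every one-mutation neighbour, and makes no multiple-testing correction. All names and values are hypothetical.

```python
import math

def one_sided_p(mean_x, se_x, mean_y, se_y):
    """P-value for H0: x is not fitter than y (normal approximation)."""
    z = (mean_x - mean_y) / math.sqrt(se_x**2 + se_y**2)
    return 0.5 * math.erfc(z / math.sqrt(2))

def is_significant_peak(focal, neighbours, alpha=0.05):
    """focal and each neighbour are (mean, standard_error) tuples."""
    return all(one_sided_p(*focal, *nb) < alpha for nb in neighbours)

# Hypothetical genotype whose mean fitness exceeds all neighbours in both cases.
focal = (0.00, 0.03)
clear_neighbours = [(-0.50, 0.05), (-0.40, 0.04)]
marginal_neighbours = [(-0.50, 0.05), (-0.02, 0.03)]

clear_peak = is_significant_peak(focal, clear_neighbours)
marginal_peak = is_significant_peak(focal, marginal_neighbours)
```

The focal genotype is a peak by mean fitness in both cases, but only the first survives the significance requirement. A reduction in peak count of the kind reported above is the expected consequence of such a filter.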
In the revised version of our manuscript, we will incorporate measurement noise in our analysis. Through this, we will also address the concern regarding the use of an arbitrary cut-off to study “fluid” epistasis. However, we note that arbitrary cut-offs to define DFEs have been recently used (Sane et al., PNAS, 2023).
We also note that previous work with large scale landscapes (Wu et al, eLife, 2016) also reported a fitness landscape with a single experiment, with no repeats.
(2) Global nonlinearities and higher-order interactions leading to fluid epistasis. Attempts at building models for higher-order epistasis from empirical data have largely been confined to landscapes of a limited data size. For example, Sailer & Harms, Genetics, 2017 propose models for higher-order epistasis from seven empirical data sets, each with fewer than 100 data points. Another recent attempt (Park et al, Nat Comm, 2024) proposes rules for protein structure-function with 20 fitness landscapes. In this study, only one landscape which used fitness as a phenotype had ~160000 data points (of which only 42% were included for analysis). All other data sets which used fitness as a phenotype contained fewer than 10000 data points. While these statistical proposals of how higher-order epistasis operates exist, none of them are reliant on a large-scale, exhaustive network, like the one proposed by Papkou and coworkers.
In the edited manuscript, we will replace our arbitrary cut-off with results of statistical tests carried out based on measurement noise.
Global non-linearities shape evolutionary responses. We would like to emphasize that the goal of this work is to study and understand how these global non-linearities result in patterns on a large fitness landscape, by presenting the sum total of these fundamental factors in shaping statistical patterns.
While we understand that we may not have sufficiently explained the effects of global non-linearities on our results, we do not agree with the reviewer’s conclusion that our results are artifacts of these non-linearities. We will expand on the role of these nonlinearities on the patterns that we observe (like, fitness being bounded, as pointed out by reviewer 2, or differential impact of a mutation in functional vs. non-functional variants).
We also speculate that changing our arbitrary cut-off (selection coefficient of 0.05) to measurement noise will not alter our results qualitatively.
The question we address in our work is, therefore, how does the nature of epistasis change with genetic background over a large, exhaustive landscape. The nature of epistasis between two mutations is analysed in all 4^7 backgrounds. The causative agents for the change in epistasis will be context-dependent, depending on the precise nature of the two mutations and the background. For instance, a certain background might simply introduce a stop codon in the sequence. Notwithstanding these precise, local mechanistic explanations, we seek to answer how epistasis changes statistically in a sequence. Investigating statistical patterns which explain switches in the nature of epistasis in deep, exhaustive landscapes is a long-term goal of this research.
(3) Last, in our revised manuscript, we will address the reviewers’ other minor comments on the various aspects of the manuscript.
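The background-by-background analysis described in point (2) can be illustrated with a minimal sketch. It assumes a landscape stored as a dictionary from sequence to fitness, and it uses a common four-way classification (none, magnitude, sign, reciprocal sign) that is an assumption here, not necessarily the scheme used in the manuscript. The function names `classify_epistasis`, `build`, and `tally_over_backgrounds` are hypothetical; the default threshold of 0.05 is the cut-off mentioned in the response above.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"


def classify_epistasis(f_wt, f_a, f_b, f_ab, threshold=0.05):
    """Classify pairwise epistasis in one background from four fitness values.

    The categories are an assumed, commonly used convention.
    """
    eps = f_ab - f_a - f_b + f_wt  # deviation from additivity
    if abs(eps) < threshold:
        return "none"

    def sign(x):
        # Effects smaller than the threshold are treated as neutral.
        return 0 if abs(x) < threshold else (1 if x > 0 else -1)

    # Does the effect of each mutation change sign when the other is present?
    a_flips = sign(f_a - f_wt) * sign(f_ab - f_b) < 0
    b_flips = sign(f_b - f_wt) * sign(f_ab - f_a) < 0
    if a_flips and b_flips:
        return "reciprocal sign"
    if a_flips or b_flips:
        return "sign"
    return "magnitude"


def build(background, pos_a, base_a, pos_b, base_b):
    """Insert two bases into a background string; requires pos_a < pos_b."""
    seq = list(background)
    seq.insert(pos_a, base_a)
    seq.insert(pos_b, base_b)
    return "".join(seq)


def tally_over_backgrounds(fitness, pos_a, mut_a, pos_b, mut_b, length,
                           threshold=0.05):
    """Count epistasis classes for one mutation pair over all backgrounds.

    mut_a and mut_b are (from_base, to_base) tuples. For length 9 there are
    4**7 backgrounds, matching the number quoted in the response.
    """
    counts = Counter()
    for bg in product(BASES, repeat=length - 2):
        bg = "".join(bg)

        def f(x, y):
            return fitness[build(bg, pos_a, x, pos_b, y)]

        counts[classify_epistasis(f(mut_a[0], mut_b[0]), f(mut_a[1], mut_b[0]),
                                  f(mut_a[0], mut_b[1]), f(mut_a[1], mut_b[1]),
                                  threshold)] += 1
    return counts


if __name__ == "__main__":
    # Toy, purely additive 3-site landscape (fabricated for illustration only).
    value = {"A": 0.0, "C": 0.1, "G": 0.2, "T": 0.3}
    toy = {"".join(s): sum(value[b] for b in s) for s in product(BASES, repeat=3)}
    print(tally_over_backgrounds(toy, 0, ("A", "C"), 2, ("A", "G"), length=3))
```

Replacing the fixed threshold with a per-variant estimate of measurement noise, as the response proposes, would only change the `threshold` argument in this sketch.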
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This study makes a valuable contribution to understanding Bayesian inference in dynamic environments by demonstrating how humans integrate prior beliefs with sensory evidence, revealing an overestimation of environmental volatility while accurately tracking noise. The evidence is solid, supported by robust model fitting and principled factorial model set analyses, though limitations in sample size and inconclusive findings on memory capacity tradeoffs reduce the overall impact. Future work should expand validation across datasets, enhance model comparisons, and explore the generalizability of reduced Bayesian frameworks to strengthen the conclusions and broader relevance of the study.
-
Reviewer #1 (Public review):
Summary
Behavioural adjustments to different sources of uncertainty remain a hot topic in many fields including reinforcement learning. The authors present valuable findings suggesting that human participants integrate prior beliefs with sensory evidence to improve their predictions in dynamically changing environments involving perceptual decision-making, pointing to hallmarks of Bayesian inference. Fitting of a reduced Bayesian model to participant choice behaviour reveals that decision-makers overestimate environmental volatility, but are reasonably accurate in terms of tracking environmental noise.
Strengths
Using a perceptual decision-making task in which participants were presented with sequences of noisy observations in environments with constant volatility and variable noise, the authors demonstrate solid evidence in favour of reduced Bayesian models that can account for participant choice behaviour when their generative parameters are fitted freely. The work nicely complements recent work demonstrating the fitting of a full Bayesian model to human reinforcement learning. The authors' principled, exhaustive factorial approach to model fitting and comparison highlights the need for further work in evaluating the model's performance in environments outside of its generative parameters. Overall the work further highlights the utility of using perceptual decision-making for Bayesian inference questions.
Weaknesses
Although data sharing and reanalysis of data are extremely welcome, particularly considering their utility for open science, the small sample size (N = 29) of the original dataset somewhat restricts the authors' ability to show more conclusive findings when it comes to deciphering the optimal memory capacity of the fitted models. It is likely that the relatively small sample size also contributes to certain key hypotheses not being confirmed intuitively, for example, the expected negative relationship between hazard rates and log(noise). The notion that the participants rely on priors to a greater extent in low noise environments relative to high noise may also indicate that they might misattribute noise as volatility, as higher noise in the environment usually obscures the information content of outcomes, and in the case of pure random/noisy sequences, it should increase reliance on priors as new sensory evidence becomes unreliable.
-
Reviewer #2 (Public review):
Summary:
Meijer et al reanalyze behavioral data from a task in which people made predictions about the next in a sequence of localized sounds with the goal of understanding the computations through which people combine sensory experiences into a prior used for perception. The authors combine basic analyses of experimental data with model simulations and development and fitting of a factorial model set that includes a prominent model of change-point detection that has previously been shown to approximate Bayesian inference at a reduced computational cost and provide a good match to human prediction data (reduced Bayesian model). The authors present a number of findings, including a demonstration of key qualitative markers for Bayesian change-point detection, a tendency in humans to over-rely on recent observations, a lack of an inverse relationship between fit values of hazard rate and fit values of noise, support for a number of assumptions in the reduced Bayesian model, and a lack of evidence for reliance on memory systems beyond the extremely minimal requirements of that model.
Strengths:
The paper asks an important question and takes a number of useful steps toward answering it. In particular, the factorial model set constructed to examine a number of explicit assumptions in the models typically fit to change-point predictive inference task data was a very useful innovation, and in some cases showed clearly that assumptions in the model are necessary or at least better than the proposed alternatives. In particular, the paper develops a notion of memory capacity that allows for a continuum of models differing in their tradeoffs between computational cost and predictive precision. Another strength of the paper is that it relies on data that avoids sequential biases that can contaminate reported beliefs in more standard predictive inference tasks.
Weaknesses:
The primary weakness of the paper is that most of the definitive findings reported within it have already been reported elsewhere. That humans increase the influence of surprising outcomes indicative of change points, or to say this another way, decrease their reliance on prior information in such cases, has been fairly well established, as has the discovery that humans tend to overuse recent outcomes when making predictions. The most novel aspect of the paper, the exploration of reductions of the Bayesian ideal observer that rely on differing memory capacities, yielded results that are somewhat difficult to interpret, particularly because it is not clear that the task analyzed is diagnostic of the memory capacity term in the model, or if so, what the qualitative hallmarks of a high/low memory capacity model reduction might be.
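For readers unfamiliar with the class of model under discussion, a delta-rule approximation to Bayesian change-point detection can be sketched as follows. This is a simplified illustration in the spirit of published reduced Bayesian models, not the model set fitted in the manuscript. The function names and the `span` parameter (the width of the range over which a new source location is assumed to be uniformly distributed after a change point) are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def reduced_bayes_step(mu, run_length, x, hazard, noise_sd, span):
    """One trial of a simplified reduced Bayesian observer (assumed form).

    mu          current prediction of the generative mean
    run_length  expected number of observations since the last change point
    x           new observation
    hazard      assumed probability of a change point on each trial
    noise_sd    assumed standard deviation of observation noise
    span        assumed width of the uniform range for post-change means
    """
    # Predictive variance: observation noise plus uncertainty about the mean.
    pred_var = noise_sd ** 2 + noise_sd ** 2 / run_length

    # Change-point probability: uniform likelihood if the mean jumped,
    # Gaussian likelihood around the current prediction if it did not.
    like_change = hazard * (1.0 / span)
    like_stay = (1.0 - hazard) * gaussian_pdf(x, mu, pred_var)
    omega = like_change / (like_change + like_stay)

    # Learning rate approaches 1 after a likely change point and
    # 1 / (run_length + 1) when the environment looks stable.
    alpha = (1.0 + omega * run_length) / (run_length + 1.0)

    mu_new = mu + alpha * (x - mu)  # delta-rule update of the prediction
    run_length_new = (run_length + 1.0) * (1.0 - omega) + omega
    return mu_new, run_length_new, alpha, omega


if __name__ == "__main__":
    mu, r = 0.0, 1.0
    for obs in [1.0, -0.5, 0.3, 0.8, 60.0, 59.0, 61.5]:  # toy sequence
        mu, r, alpha, omega = reduced_bayes_step(mu, r, obs, hazard=0.1,
                                                 noise_sd=1.0, span=180.0)
        print(f"obs={obs:6.1f}  alpha={alpha:.3f}  omega={omega:.3f}  mu={mu:7.3f}")
```

In this sketch, a larger assumed `hazard` raises the learning rate on every trial, which is one way the overestimation of volatility and the over-reliance on recent observations noted by both reviewers could be expressed.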
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This manuscript presents a useful mean-field model for a network of Hodgkin-Huxley neurons retaining the equations for ion exchange between the intracellular and extracellular space. The mean-field model derived in this work relies on approximations and heuristic arguments that, on the one hand, allow a closed-form derivation of the mean-field equations, but also raise questions about their justifications and the degree to which the results agree with experiments as well as direct numerical simulations. Therefore, the evidence for the utility of this approach is at present incomplete.
-
Reviewer #1 (Public review):
Summary:
In this manuscript, the authors derive a mean-field model for a network of Hodgkin-Huxley neurons retaining the equations for ion exchange between the intracellular and extracellular space.
The mean-field model derived in this work relies on approximations and heuristic arguments that, on the one hand, allow a closed-form derivation of the mean-field equations, and on the other hand restrict its validity to a limited regime of activity corresponding to quasi-synchronous neuronal populations. Therefore, rather than an exact mean-field representation, the model provides a description of a mesoscopic population of connected neurons driven by ion exchange dynamics.
Strengths:
The idea of deriving a mean-field model that relates the slow-timescale biophysical mechanism of ion exchange and transportation in the brain to the fast-timescale electrical activities of large neuronal ensembles.
Weaknesses:
The idea underlying this work is not completely implemented in practice.
The derived mean-field model does not show a one-to-one correspondence with the neural network simulations, except in strongly synchronous regimes. The agreement with the in vitro experiment is hardly evident, both for the mean-field model and for the network model. The assumptions made to derive the closed-form equations of the mean-field model have not been justified by any biological reason; they just allow for the mathematical derivation. The final form of the mean-field equations does not clarify whether or not microscopic variables are used together with macroscopic variables in an inconsistent mixture.
-
Reviewer #2 (Public review):
Summary:
The authors aim to develop a neural mass model characterized by a few collective variables mimicking the dynamics of a network of Hodgkin-Huxley neurons encompassing ion-exchange mechanisms. They describe in detail the derivation of the mean-field model, then they compare experimental results obtained for the hippocampus of a mouse with the neural network simulations and the mean-field results. Furthermore, they report a bifurcation analysis of the developed model and simulation of a small network containing various coupled neural masses, somehow moving towards the simulation of an entire connectome.
Strengths:
The authors attempt to develop a mean-field model for a globally coupled network of heterogeneous Hodgkin-Huxley neurons with an explicit ion exchange mechanism between the cell interior and exterior.
Weaknesses:
(1) It seems that the reduction methodology that is employed is not the most suitable one for the single-neuron model they are considering.
(2) The authors' derivation of the neural mass model is based on several assumptions, not all of which are well justified.
(3) The formulation of the mean-field derivation is unnecessarily complicated. It could be heavily simplified by following previously published approaches to derive biologically realistic neural masses.
(4) The model seems to work only for highly synchronized situations and not for the standard asynchronous evolution usually observed in neural circuits.
General Statements:
The authors honestly declared the many limitations of their approach. It is assumed that the results of the mean-field are somehow inconsistent with the neural network simulations as expected.
The authors suggest employing this model for simulations on the whole connectome to follow seizure propagation; however, I believe that the Epileptor remains superior to this model in this respect. The present model does include biophysical parameters, but their correspondence with the ones employed in the network dynamics remains elusive, due to the many assumptions required to derive the mean-field model. Furthermore, because it is more complicated than the Epileptor, I do not think that the present model will be widely employed by the community.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This important study uses diffusion magnetic resonance imaging to non-invasively map the white matter fibres connecting the zona incerta and cortex in humans. The authors present convincing evidence to indicate that these connections are organized along a rostro-caudal axis. The findings will be of interest to researchers interested in neuroanatomy and cortico-subcortical connectivity.
-
Reviewer #1 (Public review):
Summary:
This is a study that used 7T diffusion MRI in subjects from a Human Connectome Project dataset to characterize the zona incerta, an area of gray matter whose involvement has been demonstrated in a broad range of behavioral and physiologic functions. The authors employ tractography to model white matter tracts that involve connections with the ZI and use clustering techniques to segment the ZI into distinct subregions based on similar patterns of connectivity. The authors report a rostral-caudal organization of the ZI's streamlines where rostrally-projecting tracts are rostrally-positioned in the ZI and caudally-projecting tracts are caudally-positioned in the ZI.
Strengths:
The paper presents robust findings that demonstrate subregions of the human ZI that appear to be structurally distinct using a combination of spectral clustering and diffusion map embedding methods. The results of this work can contribute to our understanding of the anatomy and structural connectivity of the ZI, allowing us to further explore its role as a neuromodulatory target for various neurological disorders.
Weaknesses:
There should be further discussion of the clustering methods employed and why they are appropriate for the pertinent data. Additionally, the limitations of analyzing solely the cortical connections of the zona incerta should be addressed, as anatomical studies of the ZI have shown significant involvement of the ZI in tracts projecting to deep brain regions.
-
Reviewer #2 (Public review):
Summary:
Haast et al. investigated the organization of the zona incerta (ZI) in the human brain based on its structural connectivity to the neocortex. They found that the ZI is organized according to a primary rostro-caudal gradient, where the rostral ZI is more strongly connected to the prefrontal cortex and the caudal ZI to the sensorimotor cortex. They also found that the central region of the ZI is differently connected to the neocortex compared with the rostral and caudal regions, and could be important as a deep brain stimulation target for the treatment of essential tremor.
Strengths:
I think the overall quality of this work is great, and the results are presented in a very clear and organized manner. I particularly appreciate the effort that the authors put into validating the results using 7T and 3T data, as well as test-retest data.
Weaknesses:
That being said, I was left with a couple of concerns after reading the paper.
(1) Although the authors discussed animal evidence for a dorsal-ventral organization of the ZI, I thought that the evidence they presented for it in this paper was not so convincing. In Figure S5, the second gradient (G2) shows a clear dorsoventral pattern, but this pattern seems to primarily separate the ZI and H fields rather than show an internal topology of the ZI. This is more likely the case given that there are two bands (superior and inferior) of high G2 values surrounding a single band (middle) of low G2 values. The evidence for the rostrocaudal gradient, on the other hand, is quite convincing.
(2) HCP data is still too advanced for clinical translation. Although 3T is becoming more and more prevalent for presurgical planning, the HCP 3T dataset is acquired with a voxel size of 1.25 mm, which is a far higher resolution than the typical clinical scan. It would be very useful for clinical readers to see what individual subject replicability looks like if the data were acquired at the more typical voxel size of 2 mm. This could be achieved by replicating the analysis on a downsampled version of the HCP data that more closely resembles clinical data. This is understandably a large undertaking, so it could be left to future validation work.
-
- Jan 2025
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This important work proposes a neural network model of interactions between the prefrontal cortex and basal ganglia to implement adaptive resource allocation in working memory, where the gating strategies for storage are adjusted by reinforcement learning. Numerical simulations provide convincing evidence for the superiority of the model in improving effective capacity, optimizing resource management, and reducing error rates, as well as for its human-like performance. This work will be of broad interest to computational and cognitive neuroscientists, and may also interest machine-learning researchers who seek to develop brain-inspired machine-learning algorithms for memory.
-
Reviewer #1 (Public review):
Summary:
In this research, Soni and Frank investigate the network mechanisms underlying capacity limitations in working memory from a new perspective, with a focus on Visual Working Memory (VWM). The authors have advanced beyond the classical neural network model, which incorporates the prefrontal cortex and basal ganglia (PBWM), by introducing an adaptive chunking variant. This model is trained using a biologically-plausible, dopaminergic reinforcement learning framework. The adaptive chunking mechanism is particularly well-suited to the VWM tasks involving continuous stimuli and elegantly integrates the 'slot' and 'resource' theories of working memory constraints. The chunk-augmented PBWM operates as a slot-like system with resource-like limitations.
Through numerical simulations under various conditions, Soni and Frank demonstrate that the performance of the chunk-augmented PBWM model surpasses that of the no-chunk control model. The improvements are evident in enhanced effective capacity, optimized resource management, and reduced error rates. The retention of these benefits, even with increased capacity allocation, suggests that working memory limitations are due to a combination of factors, including efficient credit assignment that is learned flexibly through reinforcement learning. In essence, this work addresses fundamental questions related to a computational working memory limitation using a biologically-inspired neural network, and thus has implications for conditions such as Parkinson's disease, ADHD and schizophrenia.
Strengths:
The integration of mechanistic flexibility, reconciling two theories for WM capacity into a single unified model, results in a neural network that is both more adaptive and human-like. Building on the PBWM framework ensures the robustness of the findings. The addition of the chunking mechanism tailors the original model for continuous visual stimuli. Chunk-stripe mechanisms contribute to the 'resource' aspect, while input-stripes contribute to the 'slot' aspect. This combined network architecture enables flexible and diverse computational functions, enhancing performance beyond that of the classical model.
Moreover, unlike previous studies that design networks for specific task demands, the proposed network model can dynamically adapt to varying task demands by optimizing the chunking gating policy through RL.
The implementation of a dopaminergic reinforcement learning protocol, as opposed to a hard-wired design, leads to the emergence of strategic gating mechanisms that enhance the network's computational flexibility and adaptability. These gating strategies are vital for VWM tasks and are developed in a manner consistent with ecological and evolutionary learning in humans. Further examination of how reward prediction error signals, both positive and negative, collaborate to refine gating strategies reveals the crucial role of reward feedback in fine-tuning the working memory computations and the model's behavior, aligning with the current neuroscientific understanding that reward matters.
Assessing the impact of a healthy balance of dopaminergic RPE signals on information manipulation holds implications for patients with altered striatal dopaminergic signaling.
Comments on revisions:
In the revised version, the authors have thoroughly addressed all the questions raised in my previous review. They have clarified the model architecture, provided detailed explanations of the training process, and elaborated on the convergence of the optimization.
Additionally, Reviewer 2 made a very constructive suggestion: Can related cognitive functions or phenomena emerge from the model? The newly added analysis and results highlighting the recency effect directly address this question and significantly strengthen the paper.
-
Reviewer #2 (Public review):
Summary:
This paper utilizes a neural network model to investigate how the brain employs an adaptive chunking strategy to effectively enhance working memory capacity, which is a classical and significant question in cognitive neuroscience. By integrating perspectives from both the 'slot model' and 'limited resource models,' the authors adopted a neural network model encompassing the prefrontal cortex and basal ganglia, introduced an adaptive chunking strategy, and proposed a novel hybrid model. The study demonstrates that the brain can adaptively bind various visual stimuli into a single chunk based on the similarity of color features (a continuous variable) among items in visual working memory, thereby improving working memory efficiency. Additionally, it suggests that the limited capacity of working memory arises from the computational characteristics of the neural system, rather than anatomical constraints.
Strengths:
The neural network model utilized in this paper effectively integrates perspectives from both slot models and resource models (i.e., resource-like constraints within a slot-like system). This methodological innovation provides a better explanation for the limited capacity of working memory. By simulating the neural networks of the prefrontal cortex and basal ganglia, the model demonstrates how to optimize working memory storage and retrieval strategies through reinforcement learning (i.e., the efficient management of access to and from working memory). This biological simulation offers a novel perspective on human working memory and provides new explanations for the working memory difficulties observed in patients with Parkinson's disease and other disorders. Furthermore, the effectiveness of the model has been validated through computational simulation experiments, yielding reliable and robust predictions.
Comments on revisions:
The authors have already answered all my questions.
-
Author response:
The following is the authors’ response to the original reviews.
eLife Assessment
This important work proposes a neural network model of interactions between the prefrontal cortex and basal ganglia to implement adaptive resource allocation in working memory, where the gating strategies for storage are adjusted by reinforcement learning. Numerical simulations provide convincing evidence for the superiority of the model in improving effective capacity, optimizing resource management, and reducing error rates, as well as solid evidence for its human-like performance. The paper could be strengthened further by a more thorough comparison of model predictions with human behavior and by improved clarity in presentation. This work will be of broad interest to computational and cognitive neuroscientists, and may also interest machine-learning researchers who seek to develop brain-inspired machine-learning algorithms for memory.
We thank the reviewers for their thorough and constructive comments, which have helped us clarify, augment and solidify our work. Regarding the suggestion to include a “more thorough comparison with human behavior”, we believe this comment reflects one of the reviewers’ suggestions to compare with sequential order effects. We now include a new section with simulations showing that the network exhibits clear recency effects in accordance with the literature, and where such recency effects are known to be related to WM interference and not due to passive decay. Overall our work makes substantial contact with human behavioral patterns that have been documented in the human literature (and which as far as we know have not been jointly captured by any one model), such as the shape of the error distributions, including probability of recall and variable precision; attraction to recently presented items, sensitivity to reinforcement history, set-size dependent chunking, recency effects, dopamine manipulation effects, as well as a range of human data linking capacity limitations to frontostriatal function. It also provides a theoretical proposal for the well established phenomenon of capacity limitations in humans, suggesting that they arise due to difficulty in WM management.
Below we address each reviewer individually, responding to each comment and providing the relevant location in the paper that the changes and additions were made. Reviewer responses are included in blue/bold for clarity.
Public Reviews:
Reviewer 1:
Thank you for your comments. We appreciate your statements of the strengths of this paper and your suggestions to improve this paper.
First, the method section appears somewhat challenging to follow. To enhance clarity, it might be beneficial to include a figure illustrating the overall model architecture. This visual aid could provide readers with a clearer understanding of the overall network model.
Additionally, the structure depicted in Figure 2 could be potentially confusing. Notably, the absence of an arrow pointing from the thalamus to the PFC and the apparent presence of two separate pathways, one from sensory input to the PFC and another from sensory input to the BG and then to the thalamus, may lead to confusion. While I recognize that Figure 2 aims to explain network gating, there is room for improvement in presenting the content accurately.
As suggested, we added a figure (new figure 2) illustrating the overall model architecture before expanding it to show the chunking circuitry. This figure also shows the projections from thalamus to PFC (we preserve the previous figure 2, now figure 3, as an example sequence of network gating decisions, in more abstract form to help facilitate a functional understanding of the sequence of events without too much clutter). We also made several other general clarifications to the methods sections to make it more transparent and easier to follow, as per your suggestions.
Still, for the method part, it would enhance clarity to explicitly differentiate between predesigned (fixed) components and trainable components. Specifically, does the supplementary material state that synaptic connection weights in striatal units (Go&NoGo) are trained using XCAL, while other components, such as those in the PFC and lateral inhibition, are not trained (I found some sentences in 'Limitations and Future Directions')?
We have now explicitly specified learned and fixed components. We have further explained the role of XCAL and how striatal Go/NoGo weights are trained. We have also added clarification on how gating policies are learned via eligibility traces and synaptic tags.
I'm not sure about the training process shown in Figure 8. It appears that the training may not have been completed, given that the blue line representing the chunk stripe is still ascending at the endpoint. The weights depicted in panel d) seem to correspond with those shown in panels b) and c), no? Then, how is the optimization process determined to be finished? Alternatively, could it be stated that these weight differences approach a certain value asymptotically? It would be better to clarify the convergence criteria of the optimization process.
The training process has been clarified and we specify (in the last paragraph of the Base PBWM Model) how we determine when training is complete. We also can confirm that the network behavior has stabilized in learning even if the Go/NoGo weights continue to grow over time for the chunked layer (due to imperfect performance and reinforcement of the chunk gating strategy).
Reviewer 2:
Thank you for your comments. We appreciate your notes on the strengths of the paper and your suggestions to help improve the paper.
The model employs a spiking neural network, which is relatively complex. Additionally, while this paper validates the effectiveness of chunking strategies used by the brain to enhance working memory efficiency through computational simulations, further comparison with related phenomena observed in cognitive neuroscience experiments on limited working memory capacity, such as the recency effect, is necessary to verify its generalizability.
Thank you for proposing we add in more connections with human WM. Based on your specific recommendation, we have included the section “Network recapitulates human sequential effects in working memory”, where we discuss recency effects in human working memory and how our model recapitulates this effect. We have also made the connections to human data and human work more explicit throughout the manuscript (Figure 4c). As noted in response to the assessment, we believe our model does make contact with a wide variety of cognitive neuroscience data in human WM, such as the shape of the error distributions, including probability of recall and variable precision; attraction to recently presented items, sensitivity to reinforcement history, set-size dependent chunking, recency effects, and dopamine manipulation effects, as well as a range of human data linking capacity limitations to frontostriatal function. It also provides a theoretical proposal for the well established phenomenon of capacity limitations in humans, suggesting that they arise due to difficulty in WM management.
Recommendations For The Authors:
Reviewer 1:
I appreciate the authors' clear discussion of the limitations of this work in the section "Limitations and Future Directions". The development of a comprehensive model framework to overcome these constraints should require a separate paper, though, I am curious if the authors have attempted any experiments, such as using two identically designed chunking layers, that could partially support the assumptions presented in the paper.
Expanding the number of chunking layers is a great future direction. We felt that it was most effective for this paper to begin with a minimal set up with proof of concept. We hypothesize that, given our results, a reinforcement learning algorithm would be able to learn to select the best level of abstraction (degree of chunking) in more continuous form, but would require more experience across a range of tasks to do so.
I'm not sure whether it's appropriate that "Frontostriatal Chunking Gating..." precedes "Dopamine Balance is...", maybe it would be better to reverse the order thus avoiding the need to mention the role of dopamine before delving into the details. Additionally, including a summary at the end of the Introduction, outlining how the paper is organized, could provide readers with a clear roadmap of the forthcoming content.
We appreciate this suggestion. After careful thought, we wanted to preserve the order because we felt it was important to make the direct connection between set size and stripe usage following the discussion on performance based on increasing stripes.
The authors could improve the overall polish of the paper. The equations in the Method section are somewhat confusing: Eq. (2) appears incorrect, as it lacks a weight w_i and n should presumably be in the denominator. For Eq. (3), the comma should be replaced with ']'... It would be advisable to cross-reference these equations with the original O'Reilly and Frank paper for consistency.
Thank you for pointing out the errors in the method equations; those equations were indeed rendering incorrectly. We have fixed this problem.
Additionally, there are frequent instances of missing figure and reference citations (many '?'s), and it would be beneficial to maintain consistent citation formatting throughout the paper: sometimes citations are presented as "key/query coding (Traylor, Merullo, Frank, and Pavlick, 2024; see also Swan and Wyble, 2014)", while other times they are written as "function (O'Reilly & Frank, 2006)"...
Lastly, there is an empty '3.1' section in the supplementary material that should be addressed.
The citation issues were fixed. The supplementary information was cleaned and the missing section was removed. Thank you for mentioning these errors.
Reviewer 2:
Thank you for the following recommendations and suggestions. We respond to each individual point based on the numbering system used in your review.
(1) This paper utilizes the experimental paradigm of visual working memory, in which different visual stimuli are sequentially loaded into the working memory system, and the accuracy of memory for these stimuli is calculated.
The authors could further plot the memory accuracy curve as the number of items (N) increases, under both chunking and non-chunking strategies. This would allow for the examination of whether memory accuracy suddenly declines at a specific value of N (denoted as Nc), thereby determining the limited capacity of working memory within this experimental framework, which is about 4 different items or chunks. Additionally, it could be investigated whether the value of Nc is larger when the chunking strategy is applied.
We have included an additional plot (Probability of Recall) as a supplemental figure to Figure 5 to explore the probability of recall as a function of set size for both chunking and no chunking models. This plot shows that the chunking model increases probability of recall when set size exceeds allocated capacity (but that nevertheless both models show decreases in recall with set size, consistent with the literature).
(2) The primacy effect or recency effect observed in the experiments and traditional working memory models, including the slot model and the limited resource model, should be examined to see if it also appears in this model.
The literature on human working memory shows a prevalent recency effect (but not a primacy effect, which is thought to be due to episodic memory, and which is not included in our model). We have added a section showing that our model demonstrates clear recency effects.
(3) The construction of the model and the single neuron dynamics involved need further refinement and optimization:
Model Description: The details of the model construction in the paper need to be further elaborated to help other researchers better understand and apply the model in reproducing or extending research. Specifically:
a) The construction details of different modules in the model (such as Input signal, BG, striatum, superficial PFC, deep PFC) and the projection relationships between different modules. Adding a diagram to illustrate the network construction would be beneficial.
To aid in the understanding of the model construction and model components, we have included an additional figure (Figure 1: Base Model) that explains the key layers and components of the model. We have also altered the overall model figures to show more clearly that the inputs project to both PFC and striatum, to highlight that information is temporarily represented in superficial PFC layers even before striatal gating, which is needed for storage after the input decays.
We have expanded the methods and equations and we also provide a link to the model github for purposes of reproducibility and sharing.
A base model figure was added to specify key connections.
a) The numbers of excitatory and inhibitory neurons within different modules and the connections between neurons.
We added clarification on the types of connections between layers (specifying which are fixed and which are learned). We have also added the sizes of the layers in a new appendix section “Layer Sizes and Inner Mechanics”.
b) The dynamics of neurons in different modules need to be elaborated, including the description of the dynamic equations of variables (such as x) involved in single neuron equations.
Single neuron dynamics are explained in equations 1-4. Equations 5-6 explain how activation travels between layers. The specific inhibitory dynamics in the chunking layer are elaborated in Figure 4. PBWM Model and Chunking Layer Details. The Appendix section “Neural model implementational details” states the key equations, neural information and connectivity. Since there is a large corpus of background information underlying these models, we have linked the Emergent GitHub and specifically the Computational Cognitive Neuroscience textbook, which has a detailed description of all equations. For the sake of paper length and understandability, we chose the most relevant equations that distinguish our model.
c) The selection of parameters in the model, especially those that significantly affect the model's performance.
The appendix section hyperparameter search details some of the key parameters and why those values were chosen.
d) The model employs a sequential working memory paradigm, the forms of external stimuli involved in the encoding and recalling phases (including their mathematical expressions, durations, strengths, and other parameters) need to be elaborated further.
We appreciate this comment. We have expanded the Appendix section “Continuous Stimuli” to include the details of stimuli presentation (including durations etc).
(4) The figures in the paper need optimization. For example, the size of the schematic diagram in Figure 2 needs to be enlarged, while the size of text such as "present stimulus 1, 2, recall stimulus 1" needs to be reduced. Additionally, the citation of figures in the main text needs to be standardized. For example, Figure 1b, Figure 1c, etc., are not cited in the main text.
The task sequence figure (original Figure 2) has been modified and, following your suggestions, the text sizes have been adjusted.
(5) Section 3.1 in the appendix is missing.
Supplemental section 3.1 is removed.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This report used a new double knockout mouse model to investigate the role of two neuropeptides, substance P and CGRPa, in pain signaling. There is convincing evidence that double knockout of these two molecules, both of which have historically been associated with pain, does not affect nociception or acute pain behaviors in males and females. This finding is fundamental, as it challenges the hypothesis that these peptides are essential for pain transmission, even when targeted together. This paper will be of interest to those interested in the neurobiology of pain and/or neuropeptide function.
-
Reviewer #2 (Public review):
Summary:
The paper aimed to examine the effect of co-ablating Substance P and CGRPα peptides on pain using Tac1 and Calca double knockout (DKO) mice. The authors observed no significant changes in acute, inflammatory, and neuropathic pain. These results suggest that Substance P and CGRPα peptides do not play a major role in mediating pain in mice. Moreover, they reveal that the lack of behavioral phenotype cannot be explained by the redundancy between the two peptides, which are often co-expressed in the same neuron.
Strengths:
The paper uses a straightforward approach to address a significant question in the field. The authors confirm the absence of Substance P and CGRPα peptides at the levels of DRG, spinal cord, and midbrain. Subsequently, they employ a comprehensive battery of behavioral tests to examine pain phenotypes, including acute, inflammatory, and neuropathic pain. Additionally, they evaluate neurogenic inflammation by measuring edema and extravasation, revealing no changes in DKO mice. The data are compelling, and the study's conclusions are well-supported by the results. The manuscript is succinct and well-presented.
-
Reviewer #3 (Public review):
In this study, the authors aimed to determine the role of a global double knockout (DKO) of substance P and CGRPα in modulating acute and chronic pain transmission. After successfully generating and validating the DKO mouse model, they conducted a series of behavioral pain assessments to evaluate the role of these neuropeptides in acute and chronic pain. Despite the well-established involvement of substance P and CGRPα in chronic pain, their findings revealed that the global loss of both neuropeptides did not affect the transmission of either acute or chronic pain.
A major strength of the paper is that they validated their double knockout mouse model before using a comprehensive array of both acute and chronic pain tests to reach their conclusions. One minor weakness is that their n numbers for some of the studies conducted are low.
The conclusions made by the authors are largely supported by their results and the authors successfully achieved their aim of investigating the role of simultaneous inhibition of substance P and CGRPα in pain transmission.
This study offers valuable insights into our understanding of the pain pathways. Both Substance P and CGRPα neuropeptides and their receptors were considered key players in pain signaling due to their high expression in pain-responsive neurons. However, targeting these peptides in clinical trials has not been successful. By investigating the simultaneous inhibition of substance P and CGRPα through the generation of Tac1 and Calca double knockout (DKO) mice, the authors addressed an important gap in the field. Their comprehensive assessment of pain behaviors across a range of acute and chronic pain models revealed an unexpected outcome: the absence of both neuropeptides did not significantly alter pain responses. This finding is pivotal, as it challenges the hypothesis that these peptides are essential for pain transmission, even when targeted together.
Comments on revisions:
All my previous concerns have been addressed.
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public Review):
MacDonald et al. investigated the consequence of double knockout of substance P and CGRPα on pain behaviors using a newly created mouse model. The investigators used two methods to confirm knockout of these neuropeptides: traditional immunolabeling and a neat in vitro assay where sensory neurons from either wildtype or double knockout mice are co-cultured with substance P "sniffer cells", HEK cells stably expressing NKR1 (a substance P receptor), GCaMP6s and Gα15. It should be noted that functional assays confirming CGRPα knockout were not performed. Subsequently, the authors assayed double knockout (DKO) and wildtype (WT) mice in numerous behavioral assays using different pain models, including acute pain and itch stimuli, intraplantar injection of Complete Freund's Adjuvant, prostaglandin E2, capsaicin, AITC, oxaliplatin, as well as the spared nerve injury model. Surprisingly, the authors found that pain behaviors did not differ between DKO and WT mice in any of the behavioral assays or pain paradigms. Importantly, female and male mice were included in all analyses. These data are important and significant, as both substance P and CGRPα have been implicated in pain signaling, though the magnitude of the effect of a single knockout of either gene has been variable and/or small between studies.
The conclusions of the study are largely supported by the data; however, additional experimental controls and analyses would strengthen the authors' claims.
We thank the reviewer for their insightful comments and have answered them below.
(1) The authors note that single knockout models of either substance P or CGRPα have produced variable effects on pain behaviors that are study-dependent. Therefore, it would have strengthened the study if the authors included these single knockout strains in a side-by-side analysis (in at least some of the behavioral assays), as has been done in prior studies in the field when using double- or triple-knockout mouse models (for example, see PMID: 33771873). If, in the authors' hands, single knockouts of either peptide also show no significant differences in pain behaviors, then the finding that double knockouts also do not show significant differences would be less surprising.
In our study, we found no phenotypic differences between WT and DKO mice, suggesting Substance P and CGRPα are largely dispensable for pain behavior. We agree that if we had observed significant changes in behavior, it would have been interesting to examine the effects of knocking out each gene individually to determine which peptide is responsible for the phenotype. However, given that the double deletion had no effect, we can predict that loss of each alone would have no or minor effects. In line with this, a more recent study that comprehensively phenotyped the Calca KO mouse found no deficits in a range of danger-related behaviors (PMID: 34376756). Overall, as we are reporting negative data about the double KO, we do not believe extensive studies of the single KOs are necessary to support the findings of our paper.
(2) It is unclear why the authors only show functional validation of substance P knockout using "sniffer" cells, but not CGRPα. Inclusion of this experiment would have added an additional layer of rigor to the study.
Imaging of CGRPα release is more challenging using the ‘sniffer’ approach because functional CGRP receptors require the expression of two genes: Calcrl (or Calcr) along with Ramp1. We have now succeeded in generating a new stable cell line expressing Calcrl and Ramp1, along with GCaMPs and human Gα15, and include new data in the revised Figure 1F-H and Figure Supplement 1B. These cells respond robustly to CGRPα, but not to SP. In contrast, the existing SP cell line responds to SP but not CGRPα. Capsaicin evokes a strong response in these cells in co-culture with DRGs. This response is dramatically reduced in the DKO. These data therefore confirm that our mice have a loss of CGRPα signaling, as indicated by IHC.
(3) The authors should be a bit more reserved in the claims made in the manuscript. The main claim of the study is that "CGRPα and substance P are not required for pain transmission." However, the authors also note that neuropeptides can have opposing effects that may produce a net effect of no change. In my view, the data presented show that double knockout of substance P and CGRPα does not affect somatic pain behaviors, but they do not preclude a role for either of these molecules in pain signaling more generally. Indeed, the authors also note that these neuropeptides could be involved in nociceptor crosstalk with the immune or vascular systems to promote headache. The authors only assayed pain responses to glabrous skin stimulation. How the DKO mice would behave in orofacial pain assays, migraine assays, visceral pain assays, or bone/joint pain assays, for example, was not tested. I do not suggest the authors include these experiments, only that they address the limitations/weaknesses of their study more thoroughly.
The reviewer makes an important point that we agree with. Our study assesses acute and chronic pain in peptide DKO mice lacking Substance P and CGRPα. Most of our data focuses on the hindpaw as pain in the paw is the gold-standard approach for phenotyping pain targets and numerous well-validated chronic pain models have been developed for this body site. However, to extend the conclusions to other tissues, we did also look at visceral pain and GI distress using acetic acid and LiCl models (Figure 2J and Figure 2 supplement). We agree with the reviewer that given the utility of CGRP monoclonal antibodies, migraine experiments would be interesting for future studies using these mice, a point we highlight in the discussion. Bone/joint pain is also clearly important from a translational perspective, but outside the scope of the current study.
(4) A more minor but important point: the authors do not describe the nature of the WT animals used. Are they littermates or a separately maintained colony of WT animals? The WT strain background should be included in the methods section.
The WT mice are C57BL/6J from Jackson Lab. This has been added to the methods.
Reviewer #2 (Public Review):
Summary:
The paper aimed to examine the effect of co-ablating Substance P and CGRPα peptides on pain using Tac1 and Calca double knockout (DKO) mice. The authors observed no significant changes in acute, inflammatory, and neuropathic pain. These results suggest that Substance P and CGRPα peptides do not play a major role in mediating pain in mice. Moreover, they reveal that the lack of behavioral phenotype cannot be explained by the redundancy between the two peptides, which are often co-expressed in the same neuron.
Strengths:
The paper uses a straightforward approach to address a significant question in the field. The authors confirm the absence of Substance P and CGRPα peptides at the levels of DRG, spinal cord, and midbrain. Subsequently, they employ a comprehensive battery of behavioral tests to examine pain phenotypes, including acute, inflammatory, and neuropathic pain. Additionally, they evaluate neurogenic inflammation by measuring edema and extravasation, revealing no changes in DKO mice. The data are compelling, and the study's conclusions are well-supported by the results. The manuscript is succinct and well-presented.
We thank the reviewer for their enthusiasm for the importance of our work.
Reviewer #3 (Public Review):
In this study, the authors were assessing the role of double global knockout of substance P and CGRPα on the transmission of acute and chronic pain. The authors first generated the double knockout (DKO) mice and validated their animal model. This is then followed by a series of acute and chronic pain assessments to evaluate if the global DKO of these neuropeptides is important in modulating acute and chronic pain behaviors. Using these DKO mice, the authors found that Substance P and CGRPα are not required for the transmission of acute and chronic pain, although both neuropeptides are strongly implicated in chronic pain. This study does provide more insight into the role of these neuropeptides on chronic pain processing; however, more work still needs to be done (see the comments below).
We thank the reviewer for their detailed and constructive feedback, and below outline the steps we have taken to answer their concerns.
(1) In assessing the double KO (result #1), why are different regions of the brains shown for substance P and CGRPα (for example, midbrain for substance P and amygdala for CGRPα)? Since the authors mentioned that these peptides are co-expressed in the brain (as in the introduction), shouldn't the same brain regions be shown for both IHC? It would be ideal if the authors could show both regions (midbrain and amygdala) in addition to the DRG and spinal cord for both peptides in their findings. In addition, since this is a double KO, the authors should show more representative IHC-stained brain regions (spanning from the anterior to posterior).
We could not co-stain both SP and CGRP in the same sections as the DKO mouse has endogenous GFP and RFP fluorescence, limiting us to one channel (far red). Specifically, we use a Calca KO that is a Cre:GFP knock-in/knockout (Chen et al 2018, PMID: 30344042) and the Tac1 KO is a tagRFP knock-in/knockout (Wu et al 2018, PMID: 29485996). This is why we show different brain sections.
(2) It is also unclear as to why the authors only assessed the loss of substance P signaling in the double KO mice. Shouldn't the same be done for CGRPα signaling? Either the authors assess this, or the authors have to provide clear explanations as to why only substance P signaling was assessed.
As noted in our response to Reviewer 1, imaging of CGRP release is more challenging using the ‘sniffer’ approach because functional CGRP receptors require the expression of two genes: Calcrl (or Calcr) along with Ramp1. We have now generated this cell line and performed the experiment (see revised Figure 1 and Figure 1 Supplement).
(3) Has these animals' naturalistic behavior been assessed after the double KO (food intake, sleep, locomotion for example)? I think this is important as changes to these naturalistic behaviors can affect pain processes or outcomes.
We agree that assessment of naturalistic behavior including food intake, sleep and locomotion would be interesting to look at in DKO mice. However, our study is focused on acute and chronic pain behavior of these animals, and therefore a comprehensive phenotypic assessment of naturalistic home-cage behavior is outside the scope of our study.
(4) Figure 2H: The authors acknowledge that there is a trend to decrease with capsaicin-evoked coping-like responses. However, a close look at the graph suggests that the lack of significance could be driven by 1 mouse. Have the authors run an outlier test? Alternatively, the authors should consider adding more n to these experiments to verify their conclusions.
We were reluctant to add more animals searching for significance. Instead, we investigated the potential phenotype further by looking at cFos staining in the cord and found no differences (Figure 2, supplement 1). This result suggests loss of the two peptides does not grossly disrupt capsaicin-evoked pain signal transmission between the nociceptor and post-synaptic dorsal neurons in the spinal cord.
(5) Similarly, the values for WT in the evoked cFos activity (Figure 2- Suppl Figure 1) are pretty variable. Considering that the n number is low (n = 5), authors should consider adding more n. Also, since the n number is low in this experiment (eg. 5 vs 4), does this pass the normality test to run a parametric unpaired t-test? Either the authors increase their n numbers or run the appropriate statistical test.
As described in the statistical tables, the Shapiro-Wilk test indicates these data do pass the normality test. Therefore, we retain the use of the unpaired t test, which demonstrates no significant difference between the groups.
(6) In most of the results, authors ran a parametric test despite the low n number. Authors have to ensure that they are carrying out the appropriate statistical test for their dataset and n number.
We now provide a table of the statistical results, which provides detailed information about all statistical tests performed in this study. For experiments where we make a single comparison between the two distributions (WT vs DKO), we have run a Shapiro-Wilk test. Where the data from both groups pass the normality test, we retain the use of the unpaired t test. Where the Shapiro-Wilk test indicates data from either group are unlikely to be normally distributed, we now use a Mann-Whitney U test to compare the groups, as this non-parametric test makes no assumptions about the underlying distribution.
Many experiments involved two factors (genotype, and e.g. temperature, drug, time-point). These data were analyzed in the original submission using 2-WAY ANOVA or Repeated Measures 2-WAY ANOVA, followed by post-hoc Sidak’s tests to compute p values adjusted for multiple comparisons. Because there is no widely agreed non-parametric alternative to 2-WAY ANOVA that handles two factors and lets us account for multiple comparisons, we used 2-WAY ANOVA, as is typical in the field for these kinds of experiments. We reasoned that sticking with the 2-WAY ANOVA was the best course of action based on information provided by the statistical software used for this study: https://www.graphpad.com/support/faq/with-two-way-anova-why-doesnt-prism-offer-a-nonparametric-alternative-test-for-normality-test-for-homogeneity-of-variances-test-for-outliers/
We note that, regardless of the test, our conclusion that there are no major changes in acute or chronic pain behaviors is clear and strongly supported.
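As an illustration only, the decision rule described in this response for single WT-versus-DKO comparisons could be sketched in Python with SciPy as follows. This is not the authors' analysis code (the response points to GraphPad documentation); the function name `compare_groups`, the 0.05 threshold applied to the Shapiro-Wilk p value, and the example values are all assumptions introduced here.

```python
from scipy import stats


def compare_groups(wt, dko, alpha=0.05):
    """Choose a two-group test from Shapiro-Wilk results (illustrative sketch)."""
    # Run Shapiro-Wilk on each group separately (needs at least 3 values each).
    wt_normal = stats.shapiro(wt).pvalue > alpha
    dko_normal = stats.shapiro(dko).pvalue > alpha

    if wt_normal and dko_normal:
        # Both groups pass the normality check: unpaired (Student's) t test.
        result = stats.ttest_ind(wt, dko)
        return "unpaired t test", result.statistic, result.pvalue

    # Either group fails the normality check: Mann-Whitney U test.
    result = stats.mannwhitneyu(wt, dko, alternative="two-sided")
    return "Mann-Whitney U test", result.statistic, result.pvalue


# Hypothetical measurements, not data from the study.
wt = [9.8, 10.4, 11.1, 10.0, 10.7, 9.5]
dko = [10.1, 9.9, 10.6, 11.0, 9.4]
name, statistic, p_value = compare_groups(wt, dko)
print(name, statistic, p_value)
```

With small groups the Shapiro-Wilk test has limited power to detect non-normality, so a rule of this kind is best read as a screening step rather than a guarantee that the parametric assumptions hold.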
(7) Along the same line of comment with the previous, authors should increase the n number for DKO for staining (Figure 4) as n number is only 3 and there is variability in the cFos quantification in the ipsilateral side.
We believe this is not necessary as the finding is clear that there is no difference.
(8) Authors should provide references for the statement made in Lines 319-321, as the authors mentioned that there is accumulating evidence indicating that secretion of these neuropeptides from nociceptor peripheral terminals modulates immune cells and the vasculature in diverse tissues.
We now provide several references to primary papers and reviews supporting this statement.
(9) Authors state that the sample size used was similar to those from previous studies, but no references were provided. Also, even though the sample sizes used were similar, I believe that the right statistical test should be used to analyze the data.
We have now cited several classic studies phenotyping mouse KOs in pain in the methods that used similar sample sizes. As detailed above, we have taken the reviewer’s feedback on board and performed normality testing to ensure the correct statistical test is used for each experiment.
(10) In the discussion, the authors noted that knocking out of a gene remains the strongest test of whether the molecule is essential for a biological phenomenon. At the same time, it was acknowledged that Substance P infusion into the spinal cord elicits pain, but it is analgesic in the brain. The authors might want to expand more on this discussion, including how we can selectively assess the role of these neuropeptides in areas of interest. For example, knocking out both Substance P and CGRPα in selected areas instead of the global KO since there are reported compensatory effects.
This is highlighted in the closing paragraph: “Emerging approaches to image and manipulate these molecules (Girven et al., 2022; Kim et al., 2023), as well as advances in quantitating pain behaviors (Bohic et al., 2023; MacDonald and Chesler, 2023), may ultimately reveal the fundamental roles of neuropeptides in generating our experience of pain.” The Kim preprint (now published, and so the citation has been updated in the text) describes a method of inactivating neuropeptide transmission in select brain regions in a cell-type specific manner.
Recommendations for the authors:
Reviewer #2 (Recommendations For The Authors):
I do not have any major comments. My minor comments are as follows:
(1) What was the control group for all behavioral studies? Was it WT from an independent colony or one of the littermates was used for generating controls?
We used C57BL/6J mice from Jax. This is now mentioned in the methods.
(2) In Fig. 2H, it seems that the effect will become significant if several mice are added.
We are reluctant to add mice searching for significance. Sample sizes were determined before we collected the data blind.
(3) There is no figure 3, but two figures 4.
Thank you. This has been corrected.
(4) Multiple typos in the legend for figure 4 (lines 234-254). Line 242 (& n=8 (3M, 3F)), line 243 (swelling and plasma), line 252 ((n=8 for) & n=6 for DKO (4M, 4F)).
Thank you. This has been corrected.
(5) In Figure 4 (lines 273-285), the contralateral side is mentioned in B but no images are shown.
Thank you. We removed the mention.
(6) Although ligand knockouts cannot be compared directly with receptor inhibition, the readers could benefit from discussing studies of receptor ablation and/or pharmacological inhibition.
We do discuss the classic studies of receptor KO, and the clinical data on receptor blockers here –
“However, selective antagonists of the Substance P receptor NKR1 failed to relieve chronic pain in human clinical trials (Hill, 2000). Although CGRP monoclonal antibodies and receptor blockers have proven effective for subsets of migraine patients, their usefulness for other types of pain in humans is unclear (De Matteis et al., 2020; Jin et al., 2018). In line with this, knockout mice deficient in Substance P, CGRPα or their receptors have been reported to display some pain deficits, but the analgesic effects are neither large nor consistent between studies (Cao et al., 1998; De Felipe et al., 1998; Guo et al., 2012; Salmon et al., 2001, 1999; Zimmer et al., 1998).”
Reviewer #3 (Recommendations For The Authors):
Minor comments:
(1) Figure 1E: What does chambers mean? Additionally, are the 12 chambers equally from the male and female samples (6 from male and 6 from female)?
We have changed this to well. Each replicate is an individual well from an 8-well chamber slide. In all these experiments, the wells are approximately evenly distributed by mouse, because from each mouse we cultured around 8 wells’ worth of DRGs.
(2) Figure 1D: What does low and high mean in the Hargreaves test?
These refer to low and high active intensities of the radiant heat stimulus: 40 and 55, respectively, in the intensity units used by the instrument. The values are now given in the methods.
(3) Figure 2-Suppl Figure 1: Authors should provide a bigger version of the image so that it is clearer to the readers.
We think the image is of a reasonable size and comparable to the images used elsewhere in the paper.
(4) Authors should consider labeling their supplementary figures in running numbers or combining supplementary figures together to avoid confusion. For example, Figure 2-Supplementary Figure 1 and Figure 2- Supplementary Figure 2 can be combined as just Supplementary Figure 2.
We agree with the reviewer this would be clearer, but we have followed eLife’s convention for labelling and numbering supplements.
(5) Figure 3 is mislabeled as Figure 4.
Thank you. We have corrected this.
(6) Only female mice were used in the CFA experiment, which does not go in line with the rest of the results which consist of both sexes.
We have repeated the experiment with additional male mice. To be consistent with the von Frey data, these were followed for 7 days, and so the figure now shows a 7-day time course.
(7) Typo in line 243. The word "and" is subscript.
Thank you. We have corrected this.
(8) There is a typo in the legend for Figure 4 where E is labeled I, G is labeled as F, and J is labeled as J.
Thank you. We have corrected this.
(9) Authors should specify what "several weeks" means (Line 263).
It means three weeks. We tested to 21 days. We will replace with three.
(10) Authors should specify what "one day" means (Line 267). For example, how many days after the intraplantar oxaliplatin treatment? Also, authors should justify why that specific time point was selected or have a reference for it.
This means one day (24 hours) after treatment. Please see PMID: 33693512. Two references are provided in the methods.
(11) Figure 4 legend: authors should again be specific on what "prolonged" entails (Line 277).
We have replaced prolonged with 30 minutes of brushing. Specifically, this was three 10-min stimulation periods, with 1 min of rest between them. It is in the methods.
(12) In the methods section, authors state that both male and female mice were used for all experiments. However, only female mice were used in the CFA experiment (see minor comment #6). Authors should verify and correct this.
This is correct. We only used female mice for one of the groups. We have since repeated with males, now included in the data.
(13) Authors should be more specific in the methods section on how long the habituation lasted per day, how many days it ran, and what the mice were habituated to (experimenter, room, chamber, etc).
As noted in the methods, mice are habituated for at least an hour to the chambers, and thus implicitly to the room. We do not perform explicit habituation to the investigator such as repeated handling.
(14) Authors need to provide more information on the semi-automated procedure they are referring to in Line 397. Also, authors should also provide the criteria for cFos quantification (eg. Intensity, etc). If this has been published before, they should provide the reference.
We have added this. We used the ‘Find maxima’ and ‘Analyze particles’ functions in FIJI, followed by a manual curation step.
(15) How much acetone was applied and how was it applied to the paw? (Line 495)
We used the same applicator (1 ml syringe with a well at the top) to generate a droplet of acetone that was used for all mice. This has been added to methods.
(16) Authors should specify the amount of capsaicin injected in μl (Line 500).
20 μl. We have added this.
(17) Authors should explain or reference why they are analyzing the 15 min interval between 5 and 20 minutes for injection (Lines 507-508).
Acetic acid behaviour lasts around 30 mins in our hands. We chose the 15 minute interval because it reduces burdensome hand scoring time by 50% versus doing the whole 30 mins. We reasoned that in the first 5 mins post injection the animal behaviour may be contaminated by stress related to handling, injection and return to chamber. Thus, the interval between 5 and 20 minutes provided a sensible time-frame for scoring the behavior when it is at its peak.
(18) Authors have to provide more information/explanation on how they decided on the conditioned taste aversion protocol. For example, why did they use 30 mins of exposure to a single water-containing bottle followed by 90 mins of exposure to both bottles? If this has been published before, they should provide the reference.
We read dozens of different published protocols in the literature, and piloted one that was something of an amalgam of some of them with various adaptations of convenience. Because it worked on our first attempt, we stuck to it. The advantage of the CTA assay is that it is incredibly robust to changes in the specifics of the paradigm, evincing the clear survival value of learning to avoid tastes that make you sick.
(19) Authors again should provide more detail in their methods section.
a. Specify the time frame that they are assessing here (Line 533).
This can be seen in the Figure. 0 to 120 mins. We have added it to the methods.
b. How long were the mice allowed to recover post-SNI before mechanical allodynia was assessed (Line 545)?
This is apparent in the figures. 2 days to 21 days. We have added it to the methods.
c. How much of the oxaliplatin was injected into the mice?
40 ug / 40 ul (see PMID:33693512)
Editors' note: Reviewers agreed that addressing the concerns about power, outliers, and statistics, as well as functional validation of CGRPα would raise the strength of evidence to compelling, and inclusion of comparison to single KO would raise it to exceptional.
Should you choose to revise your manuscript, please check to ensure full statistical reporting including exact p-values wherever possible alongside the summary statistics (test statistic and df) and 95% confidence intervals. These should be reported for all key questions and not only when the p-value is less than 0.05.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This important study provides convincing data from in vitro models and patient-derived samples to demonstrate how modulation of GSK3 activity can reprogram macrophages, revealing potential therapeutic applications in inflammatory diseases such as severe COVID-19. The study stands out for its clear and systematic presentation, strong experimental approach, and the relevance of its findings to the field of immunology.
-
Reviewer #1 (Public review):
The manuscript by Rios et al. investigates the potential of GSK3 inhibition to reprogram human macrophages, exploring its therapeutic implications in conditions like severe COVID-19. The authors present convincing evidence that GSK3 inhibition shifts macrophage phenotypes from pro-inflammatory to anti-inflammatory states, thus highlighting the GSK3-MAFB axis as a potential therapeutic target. Using both GM-CSF- and M-CSF-dependent monocyte-derived macrophages as model systems, the study provides extensive transcriptional, phenotypic, and functional characterizations of these reprogrammed cells. The authors further extend their findings to human alveolar macrophages derived from patient samples, demonstrating the clinical relevance of GSK3 inhibition in macrophage biology.
The experimental design is sound, leveraging techniques such as RNA-seq, flow cytometry, and bioenergetic profiling to generate a comprehensive dataset. The study's integration of multiple model systems and human samples strengthens its impact and relevance. The findings not only offer insights into macrophage plasticity but also propose novel therapeutic strategies for macrophage reprogramming in inflammatory diseases.
Strengths:
(1) Robust Experimental Design: The use of both in vitro and ex vivo models adds depth to the findings, making the conclusions applicable to both experimental and clinical settings.
(2) Thorough Data Analysis: The extensive use of RNA-seq and gene set enrichment analysis (GSEA) provides a clear transcriptional signature of the reprogrammed macrophages.
(3) Relevance to Severe COVID-19: The study's focus on macrophage reprogramming in the context of severe COVID-19 adds clinical significance, especially given the relevance of macrophage-driven inflammation in this disease.
Weaknesses:
There are no significant weaknesses in the study.
-
Reviewer #2 (Public review):
Summary:
The study by Rios and colleagues provides the scientific community with a compelling exploration of macrophage plasticity and its potential as a therapeutic target. By focusing on the GSK3-MAFB axis, the authors present a strong case for macrophage reprogramming as a strategy to combat inflammatory and fibrotic diseases, including severe COVID-19. Using a robust and comprehensive methodology, the study conducts broad transcriptomic and functional analyses and offers valuable mechanistic insights while highlighting their clinical relevance.
Strengths:
Well performed and analyzed
Weaknesses:
Additional analyses, including mechanistic studies, would increase the value of the study.
-
Author response:
Regarding a future revised version, we plan to:
-
refer to the "MoMac-VERSE" study according to the original report.
-
modify incorrectly formatted references.
-
modify the text to acknowledge the heterogeneity and variability in the response of primary cells to the GSK3 inhibitor.
-
improve the explanation of the reanalysis of single cell RNAseq data in Figure 7 (ref. 47, GSE120833), and re-adapt the graphs of the scRNA-Seq data using different plot parameters (e.g., reduction = "umap.scvi") to provide a more user-friendly visualization including bona fide macrophage markers for each subpopulation.
-
include statistical analyses in each one of the figure legends
-
perform additional analyses (e.g., dose-response and kinetics of CHIR-99021 effects) and mechanistic studies (e.g., role of proteasome) to further dissect the re-programming ability of the GSK3/MAFB axis.
-
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This study provides valuable insights into the behavioral, computational, and neural mechanisms of regime shift detection, by identifying distinct roles for the frontoparietal network and ventromedial prefrontal cortex in sensitivity to signal diagnosticity and transition probabilities, respectively. The findings are supported by solid evidence, including an innovative task design, robust behavioral modeling, and well-executed model-based fMRI analyses, though claims of neural selectivity would benefit from more rigorous statistical comparisons. Overall, this work advances our understanding of how humans adapt belief updating in dynamic environments and offers a framework for exploring biases in decision-making under uncertainty.
-
Reviewer #1 (Public review):
Summary:
The study examines human biases in a regime-change task, in which participants have to report the probability of a regime change in the face of noisy data. The behavioral results indicate that humans display systematic biases, in particular, overreaction in stable but noisy environments and underreaction in volatile settings with more certain signals. fMRI results suggest that a frontoparietal brain network is selectively involved in representing subjective sensitivity to noise, while the vmPFC selectively represents sensitivity to the rate of change.
Strengths:
(1) The study relies on a task that measures regime-change detection primarily based on descriptive information about the noisiness and rate of change. This distinguishes the study from prior work using reversal-learning or change-point tasks in which participants are required to learn these parameters from experiences. The authors discuss these differences comprehensively.
(2) The study uses a simple Bayes-optimal model combined with model fitting, which seems to describe the data well.
(3) The authors apply model-based fMRI analyses that provide a close link to behavioral results, offering an elegant way to examine individual biases.
Weaknesses:
My major concern is about the correlational analysis in the section "Under- and overreactions are associated with selectivity and sensitivity of neural responses to system parameters", shown in Figures 5c and d (and similarly in Figure 6). The authors argue that a frontoparietal network selectively represents sensitivity to signal diagnosticity, while the vmPFC selectively represents transition probabilities. This claim is based on separate correlational analyses for red and blue across different brain areas. The authors interpret the finding of a significant correlation in one case (blue) and an insignificant correlation (red) as evidence of a difference in correlations (between blue and red) but don't test this directly. This has been referred to as the "interaction fallacy" (Nieuwenhuis et al., 2011; Makin & Orban de Xivry 2019). Not directly testing the difference in correlations (but only the differences to zero for each case) can lead to wrong conclusions. For example, in Figure 5c, the correlation for red is r = 0.32 (not significantly different from zero) and the correlation for blue is r = 0.48 (different from zero). However, the difference between the two is 0.16, and it is likely that this difference itself is not significant. From a statistical perspective, this corresponds to an interaction effect that has to be tested directly. It is my understanding that analyses in Figure 6 follow the same approach.
Relevant literature on this point is:
Nieuwenhuis, S, Forstmann, B & Wagenmakers, EJ (2011). Erroneous analyses of interactions in neuroscience: a problem of significance. Nat Neurosci 14, 1105-1107. https://doi.org/10.1038/nn.2886
Makin TR, Orban de Xivry, JJ (2019). Science Forum: Ten common statistical mistakes to watch out for when writing or reviewing a manuscript. eLife 8:e48175. https://doi.org/10.7554/eLife.48175
There is also a blog post on simulation-based comparisons, which the authors could check out: https://garstats.wordpress.com/2017/03/01/comp2dcorr/
I recommend that the authors carefully consider what approach works best for their purposes. It is sometimes recommended to directly compare correlations based on Monte-Carlo simulations (cf Makin & Orban). It might also be appropriate to run a regression with the dependent variable brain activity (Y) and predictors brain area (X) and the model-based term of interest (Z). In this case, they could include an interaction term in the model:
Y = \beta_0 + \beta_1 \cdot X + \beta_2 \cdot Z + \beta_3 \cdot X \cdot Z
The interaction term reflects if the relationship between the model term Z and brain activity Y is conditional on the brain area of interest X.
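A minimal sketch of the interaction regression described above, in plain Python. The function names `solve_linear` and `fit_interaction` are hypothetical, X is assumed to be a 0/1 indicator for brain area, and ordinary least squares is used for illustration only: it ignores that the same subjects contribute data to both areas, so a repeated-measures model may be more appropriate in practice.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_interaction(y, x, z):
    """Least-squares fit of Y = b0 + b1*X + b2*Z + b3*X*Z.

    x: 0/1 brain-area indicator (assumption of this sketch)
    z: model-based term of interest
    Returns [b0, b1, b2, b3] from the normal equations.
    """
    rows = [[1.0, xi, zi, xi * zi] for xi, zi in zip(x, z)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(4)]
    return solve_linear(xtx, xty)
```

With a binary X, the fitted b3 equals the difference between the two areas' slopes of Y on Z, which is the quantity the interaction test targets.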
Another potential concern is that some important details about the parameter estimation for the system-neglect model are missing. In the respective section in the methods, the authors mention a nonlinear regression using Matlab's "fitnlm" function, but it remains unclear how the model was parameterized exactly. In particular, what are the properties of this nonlinear function, and what are the assumptions about the subject's motor noise? I could imagine that by using the built-in function, the assumption was that residuals are Gaussian and homoscedastic, but it is possible that the assumption of homoscedasticity is violated, and residuals are systematically larger around p=0.5 compared to p=0 and p=1.
Relatedly, in the parameter recovery analyses, the authors assume different levels of motor noise. Are these values representative of empirical values?
The main study is based on N=30 subjects, as are the two control studies. Since this work is about individual differences (in particular w.r.t. neural representations of noise and transition probabilities in the frontoparietal network and the vmPFC), I'm wondering how robust the results are. Is it likely that the results would replicate with a larger number of subjects? Can the two control studies be leveraged to address this concern to some extent?
It seems that the authors have not counterbalanced the colors and that subjects always reported the probability of the blue regime. If so, I'm wondering why this was not counterbalanced.
-
Reviewer #2 (Public review):
Summary:
This paper focuses on understanding the behavioral and neural basis of regime shift detection, a common yet hard problem that people encounter in an uncertain world. Using a regime-shift task, the authors examined cognitive factors influencing belief updates by manipulating signal diagnosticity and environmental volatility. Behaviorally, they have found that people demonstrate both over and under-reaction to changes given different combinations of task parameters, which can be explained by a unified system-neglect account. Neurally, the authors have found that the vmPFC-striatum network represents current belief as well as belief revision unique to the regime detection task. Meanwhile, the frontoparietal network represents cognitive factors influencing regime detection i.e., the strength of the evidence in support of the regime shift and the intertemporal belief probability. The authors further link behavioral signatures of system neglect with neural signals and have found dissociable patterns, with the frontoparietal network representing sensitivity to signal diagnosticity when the observation is consistent with regime shift and vmPFC representing environmental volatility, respectively. Together, these results shed light on the neural basis of regime shift detection especially the neural correlates of bias in belief update that can be observed behaviorally.
Strengths:
(1) The regime-shift detection task offers a solid ground to examine regime-shift detection without the potential confounding impact of learning and reward. Relatedly, the system-neglect modeling framework provides a unified account for both over or under-reacting to environmental changes, allowing researchers to extract a single parameter reflecting people's sensitivity to changes in decision variables and making it desirable for neuroimaging analysis to locate corresponding neural signals.
(2) The analysis for locating brain regions related to belief revision is solid. Within the current task, the authors look for brain regions whose activation covaries with both current belief and belief change. Furthermore, the authors have ruled out the possibility of representing mere current belief or motor signal by comparing the current study results with two other studies. This set of analyses is very convincing.
(3) The section on using neuroimaging findings (i.e., the frontoparietal network is sensitive to evidence that signals regime shift) to reveal nuances in behavioral data (i.e., belief revision is more sensitive to evidence consistent with change) is very intriguing. I like how the authors structure the flow of the results, offering this as an extra piece of behavioral findings instead of ad-hoc implanting that into the computational modeling.
Weaknesses:
(1) The authors have presented two sets of neuroimaging results, and it is unclear to me how to reason between these two sets of results, especially for the frontoparietal network. On one hand, the frontoparietal network represents belief revision but not variables influencing belief revision (i.e., signal diagnosticity and environmental volatility). On the other hand, when it comes to understanding individual differences in regime detection, the frontoparietal network is associated with sensitivity to change and consistent evidence strength. I understand that belief revision correlates with sensitivity to signals, but it can probably benefit from formally discussing and connecting these two sets of results in discussion. Relatedly, the whole section on behavioral vs. neural slope results was not sufficiently discussed and connected to the existing literature in the discussion section. For example, the authors could provide more context to reason through the finding that striatum (but not vmPFC) is not sensitive to volatility.
(2) More details are needed for behavioral modeling under the system-neglect framework, particularly results on model comparison. I understand that this model has been validated in previous publications, but it is unclear to me whether it provides a superior model fit in the current dataset compared to other models (e.g., a model without \alpha or \beta). Relatedly, I wonder whether the final result section can be incorporated into modeling as well - i.e., the authors could test a variant of the model with two \betas depending on whether the observation is consistent with a regime shift and conduct model comparison.
-
Author response:
eLife Assessment
This study provides valuable insights into the behavioral, computational, and neural mechanisms of regime shift detection, by identifying distinct roles for the frontoparietal network and ventromedial prefrontal cortex in sensitivity to signal diagnosticity and transition probabilities, respectively. The findings are supported by solid evidence, including an innovative task design, robust behavioral modeling, and well-executed model-based fMRI analyses, though claims of neural selectivity would benefit from more rigorous statistical comparisons. Overall, this work advances our understanding of how humans adapt belief updating in dynamic environments and offers a framework for exploring biases in decision-making under uncertainty.
Thank you for reviewing our manuscript. We appreciate the editors’ assessment and the reviewers’ constructive comments. Below we address the reviewers’ comments. In particular, we addressed Reviewer 1’s comments on (1) neural selectivity by performing statistical comparisons and (2) parameter estimation by providing more details on how the system-neglect model was parameterized. We addressed Reviewer 2’s comments on (1) our neuroimaging results regarding frontoparietal network and (2) model comparisons.
Public Reviews:
Reviewer #1 (Public review):
Summary:
The study examines human biases in a regime-change task, in which participants have to report the probability of a regime change in the face of noisy data. The behavioral results indicate that humans display systematic biases, in particular, overreaction in stable but noisy environments and underreaction in volatile settings with more certain signals. fMRI results suggest that a frontoparietal brain network is selectively involved in representing subjective sensitivity to noise, while the vmPFC selectively represents sensitivity to the rate of change.
Strengths:
(1) The study relies on a task that measures regime-change detection primarily based on descriptive information about the noisiness and rate of change. This distinguishes the study from prior work using reversal-learning or change-point tasks in which participants are required to learn these parameters from experiences. The authors discuss these differences comprehensively.
Thank you for recognizing our contribution to the regime-change detection literature and our effort in discussing our findings in relation to the experience-based paradigms.
(2) The study uses a simple Bayes-optimal model combined with model fitting, which seems to describe the data well.
Thank you for recognizing the contribution of our Bayesian framework and system-neglect model.
(3) The authors apply model-based fMRI analyses that provide a close link to behavioral results, offering an elegant way to examine individual biases.
Thank you for recognizing our execution of model-based fMRI analyses and effort in using those analyses to link with behavioral biases.
Weaknesses:
My major concern is about the correlational analysis in the section "Under- and overreactions are associated with selectivity and sensitivity of neural responses to system parameters", shown in Figures 5c and d (and similarly in Figure 6). The authors argue that a frontoparietal network selectively represents sensitivity to signal diagnosticity, while the vmPFC selectively represents transition probabilities. This claim is based on separate correlational analyses for red and blue across different brain areas. The authors interpret the finding of a significant correlation in one case (blue) and an insignificant correlation (red) as evidence of a difference in correlations (between blue and red) but don't test this directly. This has been referred to as the "interaction fallacy" (Nieuwenhuis et al., 2011; Makin & Orban de Xivry 2019). Not directly testing the difference in correlations (but only the differences to zero for each case) can lead to wrong conclusions. For example, in Figure 5c, the correlation for red is r = 0.32 (not significantly different from zero) and the correlation for blue is r = 0.48 (different from zero). However, the difference between the two is 0.16, and it is likely that this difference itself is not significant. From a statistical perspective, this corresponds to an interaction effect that has to be tested directly. It is my understanding that analyses in Figure 6 follow the same approach.
Relevant literature on this point is:
Nieuwenhuis, S, Forstmann, B & Wagenmakers, EJ (2011). Erroneous analyses of interactions in neuroscience: a problem of significance. Nat Neurosci 14, 1105-1107. https://doi.org/10.1038/nn.2886
Makin TR, Orban de Xivry, JJ (2019). Science Forum: Ten common statistical mistakes to watch out for when writing or reviewing a manuscript. eLife 8:e48175. https://doi.org/10.7554/eLife.48175
There is also a blog post on simulation-based comparisons, which the authors could check out: https://garstats.wordpress.com/2017/03/01/comp2dcorr/
I recommend that the authors carefully consider what approach works best for their purposes. It is sometimes recommended to directly compare correlations based on Monte-Carlo simulations (cf Makin & Orban). It might also be appropriate to run a regression with the dependent variable brain activity (Y) and predictors brain area (X) and the model-based term of interest (Z). In this case, they could include an interaction term in the model:
Y = \beta_0 + \beta_1 \cdot X + \beta_2 \cdot Z + \beta_3 \cdot X \cdot Z
The interaction term reflects if the relationship between the model term Z and brain activity Y is conditional on the brain area of interest X.
Thank you for this great suggestion. We tested the difference in correlation both parametrically and nonparametrically, and the results were consistent. In our parametric test, we used the Fisher z transformation to convert the difference in correlation coefficients to a z statistic (Fisher, 1921). That is, for two correlation coefficients, r<sub>blue</sub> (the correlation between behavioral slope and neural slope estimated at change-consistent signals; sample size n<sub>blue</sub>) and r<sub>red</sub> (the correlation between behavioral slope and neural slope estimated at change-inconsistent signals; sample size n<sub>red</sub>), the z statistic of the difference in correlation is given by
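Assuming the standard Fisher z test for two correlations treated as independent (the independence assumption and the intermediate symbols z<sub>blue</sub> and z<sub>red</sub> are introduced here for illustration; the remaining symbols follow the definitions in the text), the statistic can be written as:

```latex
z = \frac{z_{blue} - z_{red}}
         {\sqrt{\dfrac{1}{n_{blue}-3} + \dfrac{1}{n_{red}-3}}},
\qquad
z_{k} = \operatorname{arctanh}(r_{k})
      = \frac{1}{2}\ln\frac{1+r_{k}}{1-r_{k}},
\quad k \in \{blue, red\}
```

Under the null hypothesis of equal correlations, z is approximately standard normal, so a one-tailed p-value is the upper-tail probability of the observed z.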
We found that the difference in correlation was significant in two of the five ROIs in the frontoparietal network, namely the left IFG and left IPS (one-tailed z test; left IFG: z=1.8355, p=0.0332; left IPS: z=2.3782, p=0.0087). For the remaining three ROIs, the difference in correlation was not significant (dmPFC: z=0.7594, p=0.2238; right IFG: z=0.9068, p=0.1822; right IPS: z=1.3764, p=0.0843). We chose a one-tailed test because we already knew that the correlation under the blue signals was significantly greater than 0. Hence the alternative hypothesis is that r<sub>blue</sub>–r<sub>red</sub>>0.
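The parametric comparison can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code used for the analysis: the names `upper_tail_p` and `fisher_z_difference` are hypothetical, and the two correlations are treated as independent.

```python
import math


def upper_tail_p(z):
    """One-tailed p-value: P(Z >= z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def fisher_z_difference(r1, n1, r2, n2):
    """Fisher z test of H1: r1 > r2.

    Assumes the two correlations come from independent samples
    (an assumption of this sketch). Returns (z, one_tailed_p).
    """
    z1 = math.atanh(r1)  # Fisher z transform of r1
    z2 = math.atanh(r2)  # Fisher z transform of r2
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    return z, upper_tail_p(z)
```

Applied to the reported z values, `upper_tail_p(1.8355)` and `upper_tail_p(2.3782)` give approximately 0.033 and 0.009, in line with the one-tailed p-values quoted above.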
In our nonparametric test, we performed nonparametric bootstrapping to test for the difference in correlation. That is, we resampled the dataset with replacement (subject-wise) and used the resampled dataset to compute the difference in correlation. We then repeated this procedure 100,000 times to obtain the distribution of the correlation difference. We then tested for significance and estimated the p-value based on this distribution. Consistent with our parametric tests, here we also found that the difference in correlation was significant in left IFG and left IPS (left IFG: r<sub>blue</sub>–r<sub>red</sub>=0.46, p=0.0496; left IPS: r<sub>blue</sub>–r<sub>red</sub>=0.5306, p=0.0041), but was not significant in dmPFC, right IFG, and right IPS (dmPFC: r<sub>blue</sub>–r<sub>red</sub>=0.1634, p=0.1919; right IFG: r<sub>blue</sub>–r<sub>red</sub>=0.2123, p=0.1681; right IPS: r<sub>blue</sub>–r<sub>red</sub>=0.3434, p=0.0631).
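A minimal sketch of a subject-wise bootstrap for the difference in correlation, in plain Python. The function names and argument names are hypothetical. The response does not state how the p-value was derived from the bootstrap distribution; the sketch assumes a one-tailed p-value equal to the proportion of resampled differences at or below zero.

```python
import random


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if x or y is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def bootstrap_corr_difference(behavior, neural_blue, neural_red,
                              n_boot=10000, seed=0):
    """Bootstrap r_blue - r_red by resampling subjects with replacement.

    Each resample keeps a subject's three values together, so the
    dependence between the two correlations is preserved.
    Returns (mean_difference, one_tailed_p).
    """
    rng = random.Random(seed)
    n = len(behavior)
    diffs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        b = [behavior[i] for i in idx]
        try:
            r_blue = pearson(b, [neural_blue[i] for i in idx])
            r_red = pearson(b, [neural_red[i] for i in idx])
        except ZeroDivisionError:
            continue  # skip degenerate resamples with zero variance
        diffs.append(r_blue - r_red)
    mean_diff = sum(diffs) / len(diffs)
    # Assumed convention: share of resampled differences <= 0
    p_one_tailed = sum(d <= 0 for d in diffs) / len(diffs)
    return mean_diff, p_one_tailed
```

Because subjects are resampled as whole units, this approach does not require the two correlations to be independent, unlike the parametric z test.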
We will update these results in the revised manuscript. In summary, we found that the left IFG and left IPS in the frontoparietal network responded differentially to signals consistent with change (blue signals) compared with signals inconsistent with change (red signals). First, the neural sensitivity to signal diagnosticity measured when signals consistent with change appeared (blue signals) significantly correlated with individual subjects’ behavioral sensitivity to signal diagnosticity (r<sub>blue</sub>). By contrast, neural sensitivity to signal diagnosticity measured when signals inconsistent with change appeared (red signals) did not significantly correlate with behavioral sensitivity (r<sub>red</sub>). Second, the difference between these two correlations, r<sub>blue</sub>–r<sub>red</sub>, was statistically significant.
Another potential concern is that some important details about the parameter estimation for the system-neglect model are missing. In the respective section in the methods, the authors mention a nonlinear regression using Matlab's "fitnlm" function, but it remains unclear how the model was parameterized exactly. In particular, what are the properties of this nonlinear function, and what are the assumptions about the subject's motor noise? I could imagine that by using the built-in function, the assumption was that residuals are Gaussian and homoscedastic, but it is possible that the assumption of homoscedasticity is violated, and residuals are systematically larger around p=0.5 compared to p=0 and p=1. Relatedly, in the parameter recovery analyses, the authors assume different levels of motor noise. Are these values representative of empirical values?
We thank the reviewer for this excellent point. The reviewer touched on model parameterization, assumption of noise, and parameter recovery analysis, which we answered below.
On how our model was parameterized
We parameterized the model according to the system-neglect model in Eq. (2) and estimated the alpha parameter separately for each level of transition probability and the beta parameter separately for each level of signal diagnosticity. As a result, we had a total of 6 parameters (3 alpha and 3 beta parameters) in the model. The system-neglect model is then called by fitnlm so that these parameters can be estimated. The term ‘nonlinear’ regression in fitnlm refers to the fact that you can specify any model (in our case the system-neglect model) and estimate its parameters when calling this function. In our use of fitnlm, we assume that the noise is Gaussian and homoscedastic (the default option).
On the assumptions about subject’s motor noise
We wish to emphasize that we did not call the noise ‘motor’ because it can be estimation noise as well. Regardless, in the context of fitnlm, we assume that the noise is Gaussian and homoscedastic.
On the possibility that homoscedasticity is violated
In the revision, we plan to examine this possibility (residuals larger around p=0.5 compared with p=0 and p=1).
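One simple way to look for such a pattern is to compare the spread of residuals across ranges of the model-predicted probability. The sketch below is only an illustration of that idea: the function name `residual_sd_by_bin` and the bin edges are hypothetical choices, not taken from the manuscript.

```python
import math


def residual_sd_by_bin(observed, predicted, edges=(0.0, 0.25, 0.75, 1.0)):
    """Root-mean-square residual within bins of predicted probability.

    observed, predicted: per-period probability estimates in [0, 1]
    edges: bin boundaries (hypothetical; the last bin includes its upper edge)
    Returns one value per bin, or None for an empty bin.
    """
    n_bins = len(edges) - 1
    bins = [[] for _ in range(n_bins)]
    for o, p in zip(observed, predicted):
        for k in range(n_bins):
            is_last = k == n_bins - 1
            if edges[k] <= p < edges[k + 1] or (is_last and p == edges[-1]):
                bins[k].append(o - p)
                break
    out = []
    for res in bins:
        if not res:
            out.append(None)
        else:
            out.append(math.sqrt(sum(r * r for r in res) / len(res)))
    return out
```

A noticeably larger value in the middle bin than in the outer bins would be consistent with the heteroscedasticity the reviewer describes.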
On whether the noise levels in parameter recovery analysis are representative of empirical values
To address the reviewer’s question, we conducted a new analysis using maximum likelihood estimation to estimate the noise level of each individual subject. We proceeded in the following steps. First, for each subject separately, we used the parameter estimates of the system-neglect model to compute the period-wise probability estimates of regime shift. As a reminder, we referred to a ‘period’ as the time when a new signal appeared during a trial (for a given transition probability and signal diagnosticity). Each trial consisted of 10 successive periods. Second, we computed the period-wise likelihood, the probability of observing the subject’s actual probability estimate given the probability estimate predicted by the system-neglect model and the noise level. Here we define noise as the standard deviation of a Gaussian distribution centered at the model-predicted probability estimate. We then summed over all periods the negative logarithm of likelihood and used MATLAB’s minimization algorithm (the ‘fmincon’ function) to obtain the noise estimate that minimized the sum of negative log likelihood (thus the noise estimate that maximized the sum of log likelihood). Across subjects, we found that the mean noise estimate was 0.1480 and ranged from 0.0816 to 0.3239. The noise estimate of each subject can be seen in the figure below.
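The noise estimation step can be sketched as follows, assuming Gaussian noise with a single standard deviation centered on the model-predicted estimates. The function names are hypothetical, and a golden-section search stands in for MATLAB's `fmincon`; the search bounds `lo` and `hi` are arbitrary choices made for this sketch.

```python
import math


def gaussian_nll(sigma, observed, predicted):
    """Sum over periods of the negative log Gaussian likelihood."""
    n = len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return n * math.log(sigma) + 0.5 * n * math.log(2.0 * math.pi) \
        + sse / (2.0 * sigma ** 2)


def estimate_noise(observed, predicted, lo=1e-4, hi=1.0, tol=1e-8):
    """Golden-section search for the sigma that minimizes the NLL.

    The NLL is unimodal in sigma, so the search converges to the
    maximum likelihood estimate if it lies inside [lo, hi].
    """
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if gaussian_nll(c, observed, predicted) < gaussian_nll(d, observed, predicted):
            b = d
        else:
            a = c
    return (a + b) / 2.0


def closed_form_sigma(observed, predicted):
    """Analytic MLE for fixed predictions: root mean squared residual."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
```

When the model predictions are held fixed, the numerical estimate should agree with `closed_form_sigma`, which offers a quick sanity check on the optimizer.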
Author response image 1.
Compared with our original parameter recovery analysis where the maximum noise level was set at 0.1, our data indicated that some subjects’ noise was larger than this value. Therefore, we expanded our parameter recovery analysis to include noise levels beyond 0.1, up to 0.3. We found good parameter recovery across different levels of noise, with the Pearson correlation coefficient between the input parameter values used to simulate data and the estimated parameter values greater than 0.95 (Supplementary Fig. S3). The results will be updated in Supplementary Fig. S3.
Author response image 2.
Parameter recovery. We simulated probability estimates according to the system-neglect model. We used each subject’s parameter estimates as our choice of parameter values used in the simulation. Using simulated data, we estimated the parameters (𝛼 and 𝛽) in the system-neglect model. To examine parameter recovery, we plotted the parameter values we used to simulate the data against the parameter estimates we obtained based on simulated data and computed their Pearson correlation. Further, we added different levels of Gaussian white noise with standard deviation 𝜎 = 0.01, 0.05, 0.1, 0.2, 0.3 to the simulated data to examine parameter recovery and show the results in panels A, B, C, D, and E, respectively. For each noise level, we show the parameter estimates in the left two graphs. In the right two graphs, we plot the parameter estimates based on simulated data against the parameter values used to simulate the data. A. Noise 𝜎 = 0.01. B. Noise 𝜎 = 0.05. C. Noise 𝜎 = 0.1. D. Noise 𝜎 = 0.2. E. Noise 𝜎 = 0.3.
We will update the parameter recovery section (p. 44) and Supplementary Figure S3 to incorporate these new results:
“We implemented 5 levels of noise with σ={0.01,0.05,0.1,0.2,0.3} and examined the impact of noise on parameter recovery for each level of noise. These noise levels covered the range of empirical noise levels we estimated from the subjects. To estimate each subject’s noise level, we carried out maximum likelihood estimation in the following steps. First, for each subject separately, we used the parameter estimates of the system-neglect model to compute the period-wise probability estimates of regime shift. As a reminder, we referred to a ‘period’ as the time when a new signal appeared during a trial (for a given transition probability and signal diagnosticity). Each trial consisted of 10 successive periods. Second, we computed the period-wise likelihood, the probability of observing the subject’s actual probability estimate given the probability estimate predicted by the system-neglect model and the noise level. Here we define noise as the standard deviation of a Gaussian distribution centered at the model-predicted probability estimate. We then summed over all periods the negative natural logarithm of likelihood and used MATLAB’s minimization algorithm (the ‘fmincon’ function) to obtain the noise estimate that minimized the sum of negative log likelihood (thus the noise estimate that maximized the sum of log likelihood). Across subjects, we found that the mean noise estimate was 0.1480 and ranged from 0.0816 to 0.3239 (Supplementary Figure S3).”
The main study is based on N=30 subjects, as are the two control studies. Since this work is about individual differences (in particular w.r.t. neural representations of noise and transition probabilities in the frontoparietal network and the vmPFC), I'm wondering how robust the results are. Is it likely that the results would replicate with a larger number of subjects? Can the two control studies be leveraged to address this concern to some extent?
It would be challenging to use the control studies to address the robustness concern. The control studies were designed to address the motor confounds. They were less suitable, however, for addressing the individual difference issue raised by the reviewer. We discuss why this is the case below.
The two control studies did not allow us to examine individual differences, in particular with respect to neural selectivity of noise and transition probability, and we therefore think they are unlikely to help here. Having said that, it is possible to look at neural selectivity of noise (signal diagnosticity) in the first control experiment, where subjects estimated the probability of the blue regime in a task with no regime change (transition probability was 0). However, the absence of regime shifts in the first control experiment changed the nature of the task. Instead of always starting at the Red regime as in the main experiment, in the first control experiment we randomly picked the regime to draw the signals from. It also changed the meaning and the dynamics of the signals (red and blue) that would appear. In the main experiment the blue signal is a signal consistent with change, but in the control experiment this is no longer the case. In the main experiment, the frequency of blue signals is contingent upon both noise and transition probability, and blue signals are less frequent than red signals because of the small transition probabilities. But in the first control experiment, blue signals are not less frequent because the regime was blue in half of the trials. Due to these differences, we do not see how analyzing the control experiments could help in establishing robustness because we do not have a good prediction as to whether and how the neural selectivity would be impacted by these differences.
We can address the issue of robustness by looking at the effect size. For individual differences in neural sensitivity to transition probability and signal diagnosticity, the significant correlation coefficients between neural and behavioral sensitivity were between 0.4 and 0.58 for signal diagnosticity in the frontoparietal network (Fig. 5C), and -0.38 and -0.37 for transition probability in vmPFC (Fig. 5D). These correlations correspond to medium-to-large effect sizes (Cohen, 1992).
Cohen, J. (1992). A power primer. Psychological Bulletin, 112(1), 155-159.
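For readers who want to reproduce the effect-size reasoning, a minimal sketch follows. It labels a correlation using Cohen's (1992) conventional benchmarks for r (0.10 small, 0.30 medium, 0.50 large) and, as an additional step not taken from the response itself, computes an approximate 95% confidence interval with the Fisher z-transform. It assumes Pearson correlations and n = 30 (the main-study sample size); both are assumptions, and the function names are hypothetical.

```python
import math

def cohen_label(r):
    """Label |r| with Cohen's (1992) benchmarks: 0.10 small, 0.30 medium, 0.50 large."""
    size = abs(r)
    if size >= 0.5:
        return "large"
    if size >= 0.3:
        return "medium"
    if size >= 0.1:
        return "small"
    return "below small"

def fisher_ci(r, n, z_crit=1.959964):
    """Approximate 95% CI for a Pearson r using the Fisher z-transform (standard error 1/sqrt(n-3))."""
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

# Correlation values reported in the response; n = 30 is assumed here.
for r in (0.4, 0.58, -0.38, -0.37):
    low, high = fisher_ci(r, 30)
    print(f"r = {r:+.2f}: {cohen_label(r)}, approx. 95% CI [{low:.2f}, {high:.2f}]")
```

The interval width gives a rough sense of how precisely a correlation of a given size is estimated at this sample size, which complements the benchmark label.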
It seems that the authors have not counterbalanced the colors and that subjects always reported the probability of the blue regime. If so, I'm wondering why this was not counterbalanced.
We are aware of the reviewer’s concern. The first reason we did not counterbalance the colors or the reported regime (blue vs. red) was to avoid confusing the subjects in an already complicated task. The second reason was that balancing these two variables comes at the cost of sample size. Although we could have balanced them at the between-subject level to avoid adding to task complexity, doing so could have introduced another confound, namely individual differences in how people respond to these variables. This is the third reason we were hesitant to counterbalance.
Reviewer #2 (Public review):
Summary:
This paper focuses on understanding the behavioral and neural basis of regime shift detection, a common yet hard problem that people encounter in an uncertain world. Using a regime-shift task, the authors examined cognitive factors influencing belief updates by manipulating signal diagnosticity and environmental volatility. Behaviorally, they have found that people demonstrate both over and under-reaction to changes given different combinations of task parameters, which can be explained by a unified system-neglect account. Neurally, the authors have found that the vmPFC-striatum network represents current belief as well as belief revision unique to the regime detection task. Meanwhile, the frontoparietal network represents cognitive factors influencing regime detection i.e., the strength of the evidence in support of the regime shift and the intertemporal belief probability. The authors further link behavioral signatures of system neglect with neural signals and have found dissociable patterns, with the frontoparietal network representing sensitivity to signal diagnosticity when the observation is consistent with regime shift and vmPFC representing environmental volatility, respectively. Together, these results shed light on the neural basis of regime shift detection especially the neural correlates of bias in belief update that can be observed behaviorally.
Strengths:
(1) The regime-shift detection task offers a solid ground to examine regime-shift detection without the potential confounding impact of learning and reward. Relatedly, the system-neglect modeling framework provides a unified account for both over or under-reacting to environmental changes, allowing researchers to extract a single parameter reflecting people's sensitivity to changes in decision variables and making it desirable for neuroimaging analysis to locate corresponding neural signals.
Thank you for recognizing our task design and our system-neglect computational framework in understanding change detection.
(2) The analysis for locating brain regions related to belief revision is solid. Within the current task, the authors look for brain regions whose activation covary with both current belief and belief change. Furthermore, the authors have ruled out the possibility of representing mere current belief or motor signal by comparing the current study results with two other studies. This set of analyses is very convincing.
Thank you for recognizing our control studies in ruling out potential motor confounds in our neural findings on belief revision.
(3) The section on using neuroimaging findings (i.e., the frontoparietal network is sensitive to evidence that signals regime shift) to reveal nuances in behavioral data (i.e., belief revision is more sensitive to evidence consistent with change) is very intriguing. I like how the authors structure the flow of the results, offering this as an extra piece of behavioral findings instead of ad-hoc implanting that into the computational modeling.
Thank you for appreciating how we showed that neural insights can lead to new behavioral findings.
Weaknesses:
(1) The authors have presented two sets of neuroimaging results, and it is unclear to me how to reason between these two sets of results, especially for the frontoparietal network. On one hand, the frontoparietal network represents belief revision but not variables influencing belief revision (i.e., signal diagnosticity and environmental volatility). On the other hand, when it comes to understanding individual differences in regime detection, the frontoparietal network is associated with sensitivity to change and consistent evidence strength. I understand that belief revision correlates with sensitivity to signals, but it can probably benefit from formally discussing and connecting these two sets of results in discussion. Relatedly, the whole section on behavioral vs. neural slope results was not sufficiently discussed and connected to the existing literature in the discussion section. For example, the authors could provide more context to reason through the finding that striatum (but not vmPFC) is not sensitive to volatility.
We thank the reviewer for the valuable suggestions.
With regard to the first comment, we wish to clarify that we did not find that the frontoparietal network represents belief revision. It was the vmPFC and ventral striatum that we found to represent belief revision (Fig. 3). For the frontoparietal network, we identified its involvement in our task through finding that its activity correlated with strength of change evidence (Fig. 4) and individual subjects’ sensitivity to signal diagnosticity (Fig. 5). Conceptually, these two findings reflect how individuals interpret the signals (signals consistent or inconsistent with change) in light of signal diagnosticity. This is because (1) strength of change evidence is defined as signals (+1 for signal consistent with change, and -1 for signal inconsistent with change) multiplied by signal diagnosticity and (2) sensitivity to signal diagnosticity reflects how individuals subjectively evaluate signal diagnosticity. At the theoretical level, these two findings can be interpreted through our computational framework in that both the strength of change evidence and sensitivity to signal diagnosticity contribute to estimating the likelihood of change (Eqs. 1 and 2). We added a paragraph to the Discussion to address this.
We will add on p. 35:
“For the frontoparietal network, we identified its involvement in our task through finding that its activity correlated with strength of change evidence (Fig. 4) and individual subjects’ sensitivity to signal diagnosticity (Fig. 5). Conceptually, these two findings reflect how individuals interpret the signals (signals consistent or inconsistent with change) in light of signal diagnosticity. This is because (1) strength of change evidence is defined as signals (+1 for signal consistent with change, and -1 for signal inconsistent with change) multiplied by signal diagnosticity and (2) sensitivity to signal diagnosticity reflects how individuals subjectively evaluate signal diagnosticity. At the theoretical level, these two findings can be interpreted through our computational framework in that both the strength of change evidence and sensitivity to signal diagnosticity contribute to estimating the likelihood of change (Eqs. 1 and 2).”
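The definition of strength of change evidence given above (signal coded as +1 or -1, multiplied by signal diagnosticity) can be written as a one-line computation. The sketch below is illustrative: the function name is hypothetical and the diagnosticity levels are placeholder values, not the levels used in the experiment.

```python
def change_evidence_strength(consistent_with_change, diagnosticity):
    """Signed evidence: +diagnosticity if the signal is consistent with change, else -diagnosticity."""
    signal = 1 if consistent_with_change else -1
    return signal * diagnosticity

# Placeholder diagnosticity levels, chosen only for illustration.
for d in (1.5, 3.0, 9.0):
    print(d, change_evidence_strength(True, d), change_evidence_strength(False, d))
```

Under this coding, the same signal carries more evidence (in absolute value) when diagnosticity is higher, and the sign indicates whether the signal supports or contradicts a regime shift.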
With regard to the second comment, we added discussion on the behavioral and neural slope comparison. We pointed out previous papers conducting similar analysis (Vilares et al., 2012; Ting et al., 2015; Yang & Wu, 2020), their findings and how they relate to our results. Vilares et al. found that sensitivity to prior information (uncertainty in prior distribution) in the orbitofrontal cortex (OFC) and putamen correlated with behavioral measure of sensitivity to prior. In the current study, transition probability acts as prior in the system-neglect framework (Eq. 2) and we found that ventromedial prefrontal cortex represents subjects’ sensitivity to transition probability. Together, these results suggest that OFC and vmPFC are involved in the subjective evaluation of prior information in both static (Vilares et al., 2012) and dynamic environments (current study). In addition, we added to the literature by showing that distinct from vmPFC in representing sensitivity to transition probability or prior, the frontoparietal network represents how sensitive individual decision makers are to the diagnosticity of signals in revealing the true state (regime) of the environment.
We will add on p. 36:
“In the current study, our psychometric-neurometric analysis focused on comparing behavioral sensitivity with neural sensitivity to the system parameters (transition probability and signal diagnosticity). We measured sensitivity by estimating the slope of behavioral data (behavioral slope) and neural data (neural slope) in response to the system parameters. Previous studies had adopted a similar approach (Vilares et al., 2012; Ting et al., 2015; Yang & Wu, 2020). For example, Vilares et al. (2012) found that sensitivity to prior information (uncertainty in prior distribution) in the orbitofrontal cortex (OFC) and putamen correlated with behavioral measure of sensitivity to the prior. In the current study, transition probability acts as prior in the system-neglect framework (Eq. 2) and we found that ventromedial prefrontal cortex represents subjects’ sensitivity to transition probability. Together, these results suggest that OFC and vmPFC are involved in the subjective evaluation of prior information in both static (Vilares et al., 2012) and dynamic environments (current study). In addition, we added to the literature by showing that distinct from vmPFC in representing sensitivity to transition probability or prior, the frontoparietal network represents how sensitive individual decision makers are to the diagnosticity of signals in revealing the true state (regime) of the environment.”
(2) More details are needed for behavioral modeling under the system-neglect framework, particularly results on model comparison. I understand that this model has been validated in previous publications, but it is unclear to me whether it provides a superior model fit in the current dataset compared to other models (e.g., a model without \alpha or \beta). Relatedly, I wonder whether the final result section can be incorporated into modeling as well - i.e., the authors could test a variant of the model with two \betas depending on whether the observation is consistent with a regime shift and conduct model comparison.
Thank you for the great suggestion.
To address the reviewer’s question on model comparison, we tested 4 variants of the system-neglect model and incorporated them into the final result section. The original system-neglect model and its four variants are:
– Original system-neglect model: 6 total parameters, 3 beta parameters (one for each level of signal diagnosticity) and 3 alpha parameters (one for each level of transition probability).
– M1: System-neglect model with signal-dependent beta parameters (alpha parameters, and beta parameters separately estimated at change-consistent and change-inconsistent signals): 9 total parameters, 3 beta parameters for change-consistent signals, 3 beta parameters for change-inconsistent signals, and 3 alpha parameters.
– M2: System-neglect model with signal-dependent alpha parameters (alpha parameters separately estimated at change-consistent and change-inconsistent signals, and beta parameters): 9 total parameters, 3 alpha parameters for change-consistent signals, 3 alpha parameters for change-inconsistent signals, and 3 beta parameters.
– M3: System-neglect model without alpha parameters (only the beta parameters): 3 total parameters, all are beta parameters (one for each level of signal diagnosticity).
– M4: System-neglect model without beta parameters (only the alpha parameters): 3 total parameters, all are alpha parameters (one for each level of transition probability).
We compared these four models with the original system-neglect model. In the figure below, we plot ∆AIC, where ∆AIC is the Akaike Information Criterion (AIC) of one of the new models minus the AIC of the original model. ∆AIC<0 indicates that the new model is better than the original model. By contrast, ∆AIC>0 suggests that the new model is worse than the original model.
Author response image 3.
When we separately estimated the beta parameters for change-consistent and change-inconsistent signals (M1), we found that the model's AIC was significantly smaller than that of the original model (p<0.01). The same was found for the model where we separately estimated the alpha parameters for change-consistent and change-inconsistent signals (M2). When we took out either the alpha (M3) or the beta parameters (M4), we found that these models were worse than the original model (p<0.01). In summary, we found that models where we separately estimated the alpha/beta parameters for change-consistent and change-inconsistent signals were better than the original model. This is consistent with the insight the neural data provided.
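The ∆AIC comparison described above can be sketched as follows, using the standard definition AIC = 2k - 2 ln L, where k is the number of free parameters and ln L is the maximized log-likelihood. The parameter counts are those listed for each model; the log-likelihood values are hypothetical numbers for a single imaginary subject, included only to make the example run, and the function names are likewise illustrative.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood

# Number of free parameters per model, as listed in the response.
PARAM_COUNTS = {"original": 6, "M1": 9, "M2": 9, "M3": 3, "M4": 3}

def delta_aic(log_likelihoods):
    """AIC of each variant minus AIC of the original model (negative means the variant is better)."""
    base = aic(log_likelihoods["original"], PARAM_COUNTS["original"])
    return {
        name: aic(ll, PARAM_COUNTS[name]) - base
        for name, ll in log_likelihoods.items()
        if name != "original"
    }

# Hypothetical maximized log-likelihoods for one subject (not data from the study).
example_ll = {"original": -100.0, "M1": -90.0, "M2": -92.0, "M3": -130.0, "M4": -125.0}
print(delta_aic(example_ll))
```

In practice this would be computed per subject, with the resulting ∆AIC values then tested against zero across subjects; the specific test used is not stated in this passage.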
To show these results, we will add a new figure (Figure 7) in the revised manuscript.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This valuable study implicates a specific Wolbachia gene in driving the male-killing phenotype in a moth. This is a contribution to a growing body of literature from the authors, in which they have nicely teased apart the loci responsible for male killing across diverse insects. The conclusions are supported by solid evidence.
-
Reviewer #1 (Public review):
Summary:
Insects and their relatives are commonly infected with microbes that are transmitted from mothers to their offspring. A number of these microbes have independently evolved the ability to kill the sons of infected females very early in their development; this male killing strategy has evolved because males are transmission dead-ends for the microbe. A major question in the field has been to identify the genes that cause male killing and to understand how they work. This has been especially challenging because most male-killing microbes cannot be genetically manipulated. This study focuses on a male-killing bacterium called Wolbachia. Different Wolbachia strains kill male embryos in beetles, flies, moths, and other arthropods. This is remarkable because how sex is determined differs widely in these hosts. Two Wolbachia genes have been previously implicated in male-killing by Wolbachia: oscar (in moth male-killing) and wmk (in fly male-killing). The genomes of some male-killing Wolbachia contain both of these genes, so it is a challenge to disentangle the two.
This paper provides strong evidence that oscar is responsible for male-killing in moths. Here, the authors study a strain of Wolbachia that kills males in a pest of tea, Homona magnanima. Overexpressing oscar, but not wmk, kills male moth embryos. This is because oscar interferes with masculinizer, the master gene that controls sex determination in moths and butterflies. Interfering with the masculinizer gene in this way leads the (male) embryo down a path of female development, which causes problems in regulating the expression of genes that are found on the sex chromosomes.
Strengths:
The authors use a broad number of approaches to implicate oscar, and to dissect its mechanism of male lethality. These approaches include: a) overexpressing oscar (and wmk) by injecting RNA into moth eggs, b) determining the sex of embryos by staining female sex chromosomes, c) determining the consequences of oscar expression by assaying sex-specific splice variants of doublesex, a key sex determination gene, and by quantifying gene expression and dosage of sex chromosomes, using RNASeq, and d) expressing oscar along with masculinizer from various moth and butterfly species, in a silkmoth cell line. This extends recently published studies implicating oscar in male-killing by Wolbachia in Ostrinia corn borer moths, although the Homona and Ostrinia oscar proteins are quite divergent. Combined with other studies, there is now broad support for oscar as the male-killing gene in moths and butterflies (i.e. order Lepidoptera). So an outstanding question is to understand the role of wmk. Is it the master male-killing gene in insects other than Lepidoptera and if so, how does it operate?
Weaknesses:
I found the transfection assays of oscar and masculinizer in the silkworm cell line (Figure 4) to be difficult to follow. There are also places in the text where more explanation would be helpful for non-experts.
-
Reviewer #2 (Public review):
Summary:
Wolbachia are maternally transmitted bacteria that can manipulate host reproduction in various ways. Some Wolbachia induce male killing (MK), where the sons of infected mothers are killed during development. Several MK-associated genes have been identified in Homona magnanima, including Hm-oscar and wmk-1-4, but the mechanistic links between these Wolbachia genes and MK in the native host are still unclear.
In this manuscript, Arai et al. show that Hm-oscar is the gene responsible for Wolbachia-induced MK in Homona magnanima. They provide evidence that Hm-Oscar functions through interactions with the sex determination system. They also found that Hm-Oscar disrupts sex determination in male embryos by inducing female-type dsx splicing and impairing dosage compensation. Additionally, Hm-Oscar suppresses the function of Masc. The manuscript is well-written and presents intriguing findings. The results support their conclusions regarding the diversity and commonality of MK mechanisms, contributing to our understanding of the mechanisms and evolutionary aspects of Wolbachia-induced MK.
Comments on revisions:
The authors have already addressed the reviewer's concerns.
-
Reviewer #3 (Public review):
Summary:
Overall, this is a clearly written manuscript with nice hypothesis testing in a non-model organism that addresses the mechanism of Wolbachia-mediated male killing. The authors aim to determine how five previously identified male-killing genes (encoded in the prophage region of the wHm Wolbachia strain) impact the native host, Homona magnanima moths. This work builds on the authors' previous studies in which:
(1) They tested the impact of these same wHm genes via heterologous expression in Drosophila melanogaster.
(2) They also examined the activity of other male-killing genes (e.g., from the wFur Wolbachia strain in its native host: Ostrinia furnacalis moths).
Advances here include identifying which wHm gene most strongly recapitulates the male-killing phenotype in the native host (rather than in Drosophila), and the finding that the Hm-Oscar protein has the potential for male-killing in a diverse set of lepidopterans, as inferred by the cell-culture assays.
Strengths:
Strengths of the manuscript include the reverse genetics approaches to dissect the impact of specific male-killing loci, and use of a "masculinization" assay in Lepidopteran cell lines to determine the impact of interactions between specific masc and oscar homologs.
Weaknesses:
It is clear from Figure 1 that the combinations of wmk homologs do not cause male killing on their own here. While I largely agree with the authors' conclusion that oscar is the primary MK factor in this system, I don't think we can yet rule out that wmk(s) may work synergistically or interactively with oscar in vivo. This might be worth a small note in the discussion (e.g., at line 294, "indicating that wmk likely targets factors other than masc": this could be downstream of the impacts of oscar, perhaps dependent on oscar-mediated impacts on masc first).
Regarding the perceived male-bias in Figure 2a: I think readers might be interpreting "unhatched" as "total before hatching". You could eliminate ambiguity by perhaps splitting the bars into male and female, and then within a bar, coloring by hatched versus unhatched. But this is a minor point, and I think the updated text helps clarify this.
The new Figure 4b looks to be largely redundant with the oscar information in Figure 1a.
Updated statistical comparisons for the RNA-seq analysis are helpful. However, these analyses are based on single libraries (albeit each a pool of many individuals), so this is still a weaker aspect of the manuscript.
The new information on masc similarity is useful (Fig 4d) - if the authors could please include a heatmap legend for the colors, that would be helpful. Also, please avoid green and red in the same figure when key for interpretation.
Figure 1A "helix-turn-helix" is misspelled. ("tern").
-
Author response:
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer #1 (Public review):
Summary:
Insects and their relatives are commonly infected with microbes that are transmitted from mothers to their offspring. A number of these microbes have independently evolved the ability to kill the sons of infected females very early in their development; this male killing strategy has evolved because males are transmission dead-ends for the microbe. A major question in the field has been to identify the genes that cause male killing and to understand how they work. This has been especially challenging because most male-killing microbes cannot be genetically manipulated. This study focuses on a male-killing bacterium called Wolbachia. Different Wolbachia strains kill male embryos in beetles, flies, moths, and other arthropods. This is remarkable because how sex is determined differs widely in these hosts. Two Wolbachia genes have been previously implicated in male-killing by Wolbachia: oscar (in moth male-killing) and wmk (in fly male-killing). The genomes of some male-killing Wolbachia contain both of these genes, so it is a challenge to disentangle the two.
This paper provides strong evidence that oscar is responsible for male-killing in moths. Here, the authors study a strain of Wolbachia that kills males in a pest of tea, Homona magnanima. Overexpressing oscar, but not wmk, kills male moth embryos. This is because oscar interferes with masculinizer, the master gene that controls sex determination in moths and butterflies. Interfering with the masculinizer gene in this way leads the (male) embryo down a path of female development, which causes problems in regulating the expression of genes that are found on the sex chromosomes.
We would like to thank you for evaluating our manuscript.
Strengths:
The authors use a broad number of approaches to implicate oscar, and to dissect its mechanism of male lethality. These approaches include:
(1) Overexpressing oscar (and wmk) by injecting RNA into moth eggs.
(2) Determining the sex of embryos by staining female sex chromosomes.
(3) Determining the consequences of oscar expression by assaying sex-specific splice variants of doublesex, a key sex determination gene, and by quantifying gene expression and dosage of sex chromosomes, using RNASeq.
(4) Expressing oscar along with masculinizer from various moth and butterfly species, in a silkmoth cell line.
This extends recently published studies implicating oscar in male-killing by Wolbachia in Ostrinia corn borer moths, although the Homona and Ostrinia oscar proteins are quite divergent. Combined with other studies, there is now broad support for oscar as the male-killing gene in moths and butterflies (i.e. order Lepidoptera). So an outstanding question is to understand the role of wmk. Is it the master male-killing gene in insects other than Lepidoptera and if so, how does it operate?
Thank you for your comments. Wolbachia strains often carry wmk genes, but as observed in this study, the homologs in Homona showed no apparent MK ability. Although these homologs caused strong male lethality in D. melanogaster, it remains unclear whether they are the master male-killing genes in Diptera. It is also possible that the genes are toxic in other lepidopteran insects as well as in other insect taxa. Further functional validation assays in different insects are warranted to clarify this. We have also discussed the functions of wmk in the Discussion section (lines 301-306).
Weaknesses:
I found the transfection assays of oscar and masculinizer in the silkworm cell line (Figure 4) to be difficult to follow. There are also places in the text where more explanation would be helpful for non-experts (see recommendations).
Thank you for your suggestion. We have thoroughly revised the manuscript to address all the questions, comments and suggestions you raised in “recommendations”. In particular, we have revised the section on the transfection assays of Oscar and Masc in BmN-4 cells (results section “Hm-oscar suppresses the masculinizing functions of lepidopteran masc genes”, starting on line 214, and Fig. 4; materials and methods section “Transfection assays and quantification of BmIMP<sup>M</sup>”, starting on line 483). We have also provided more detailed explanations for non-experts in some contexts (in response to your recommendation). We believe that the resulting revisions have significantly improved the quality and comprehensiveness of our manuscript.
Reviewer #2 (Public review):
Summary:
Wolbachia are maternally transmitted bacteria that can manipulate host reproduction in various ways. Some Wolbachia induce male killing (MK), where the sons of infected mothers are killed during development. Several MK-associated genes have been identified in Homona magnanima, including Hm-oscar and wmk-1-4, but the mechanistic links between these Wolbachia genes and MK in the native host are still unclear.
In this manuscript, Arai et al. show that Hm-oscar is the gene responsible for Wolbachia-induced MK in Homona magnanima. They provide evidence that Hm-Oscar functions through interactions with the sex determination system. They also found that Hm-Oscar disrupts sex determination in male embryos by inducing female-type dsx splicing and impairing dosage compensation. Additionally, Hm-Oscar suppresses the function of Masc. The manuscript is well-written and presents intriguing findings. The results support their conclusions regarding the diversity and commonality of MK mechanisms, contributing to our understanding of the mechanisms and evolutionary aspects of Wolbachia-induced MK.
We would like to thank you for evaluating our manuscript.
Strengths/weaknesses:
(1) The authors found that transient overexpression of Hm-oscar, but not wmk-1-4, in Wolbachia-free H. magnanima embryos induces female-biased sex ratios. These results are striking and mirror the phenotype of the wHm-t infected line (WT12). However, Table 1 lists the "male ratio," while the text presents the "female ratio" with standard deviation. For consistency, the calculation term should be uniform, and the "ratio" should be listed for each replicate.
We have revised the first results section (Hm-oscar induces female-biased sex ratios, starting from line 147) to keep the calculation term consistent. In the revised manuscript, the 'male ratio' is now used throughout, in alignment with Fig. 1. In addition, we have included all sex ratio information (number of males and females) in the supplementary data file for transparency and clarity.
(2) The error bars in Figure 3 are quite large, and the figure lacks statistical significance labels. The authors should perform statistical analysis to demonstrate that Hm-oscar-overexpressed male embryos have higher levels of Z-linked gene expression.
The large error bars on each chromosome (Fig. 3a-d) likely reflect the overall variation in expression levels across different transcripts. Accordingly, we have included statistical data for Figure 3 based on the Steel-Dwass test for expression levels. However, displaying statistical significance directly on the whisker plots would make the figure too cluttered due to the numerous combinations. Instead, we have provided all the statistical data in the supplementary data file. To further support the claim that Z-linked genes are more highly expressed in wHm-t-infected/Hm-Oscar-injected embryos, we have included the expression data for a Z-linked gene, tpi, along with its statistical data in the revised manuscript (Fig. 3e, lines 210-212).
(3) The authors demonstrated that Hm-Oscar suppresses the masculinizing functions of lepidopteran Masc in BmN-4 cells derived from the female ovaries of Bombyx mori. They should clarify why this cell line was chosen and its biological relevance. Additionally, they should explain the rationale for evaluating the expression levels of the male-specific BmIMP variant and whether it is equivalent to dsx.
Thank you for your suggestion. We selected the BmN-4 cell line because previous studies have established it as a reliable model for investigating the functions of lepidopteran masc genes and the interactions between masc and Oscar genes (Katsuma et al., 2019; 2022). In addition, BmIMP<sup>M</sup> is a male-specific regulator of the male-type dsx, making it an ideal target for assessing the 'maleness' induced by transfection of the masc gene in female-derived BmN-4 cells (Suzuki et al., 2010; Katsuma et al., 2015). We have included more detailed background information in the revised manuscript and have thoroughly revised this section (Hm-oscar suppresses the masculinizing functions of lepidopteran masc genes, starting at line 214) and Figure 4 for better clarity.
(4) Although the authors show that Hm-oscar is involved in Wolbachia-induced MK in Homona magnanima and interacts with the sex determination system in lepidopteran insects, the precise molecular mechanism of Hm-oscar-induced MK remains unclear. Further studies are needed to elucidate how Hm-oscar regulates Homona magnanima genes to induce MK, though this may be beyond the scope of the current manuscript.
Based on our findings and previous studies in Homona, Ostrinia and Bombyx (Arai et al., 2023a; Katsuma et al., 2023; Kiuchi et al., 2014), we hypothesize that the molecular mechanisms underlying wHm-induced MK are likely linked to impaired dosage compensation caused by the inhibition of Masc function by the Hm-Oscar protein. While the precise mechanisms remain unclear, unbalanced Z-linked gene expression due to the impaired dosage compensation (i.e., 2-fold higher Z-linked gene expression compared to normal males) is known to be lethal for lepidopteran males (Kiuchi et al., 2014; Fukui et al., 2015; Visser et al., 2021). We have outlined this hypothesis in the Discussion section (lines 245-254).
Reviewer #3 (Public review):
Summary:
Overall, this is a clearly written manuscript with nice hypothesis testing in a non-model organism that addresses the mechanism of Wolbachia-mediated male killing. The authors aim to determine how five previously identified male-killing genes (encoded in the prophage region of the wHm Wolbachia strain) impact the native host, Homona magnanima moths. This work builds on the authors' previous studies in which:
(1) They tested the impact of these same wHm genes via heterologous expression in Drosophila melanogaster.
(2) They examined the activity of other male-killing genes (e.g., from the wFur Wolbachia strain in its native host: Ostrinia furnacalis moths).
Advances here include identifying which wHm gene most strongly recapitulates the male-killing phenotype in the native host (rather than in Drosophila), and the finding that the Hm-Oscar protein has the potential for male-killing in a diverse set of lepidopterans, as inferred by the cell-culture assays.
Strengths:
Strengths of the manuscript include the reverse genetics approaches to dissect the impact of specific male-killing loci, and the use of a "masculinization" assay in Lepidopteran cell lines to determine the impact of interactions between specific masc and oscar homologs.
We would like to thank you for evaluating our manuscript.
Weaknesses:
My major comments are related to the lack of statistics for several experiments (and the data normalization process), and opportunities to make the manuscript more broadly accessible.
Thank you for your suggestions. We have thoroughly revised the manuscript to provide clearer explanations for non-experts. In addition, we have included more detailed statistical data for Figure 3 and Figure 4 based on the Steel-Dwass tests. For Figure 3a-d, displaying statistical significance directly on the whisker plots would make the figure too cluttered due to the numerous combinations. Therefore, we have provided all the statistical data in the supplementary data file. To further support the claim that Z-linked genes are more highly expressed in wHm-t-infected/Hm-Oscar-injected embryos, we have included the expression data for a Z-linked gene, tpi, along with its statistical data in the revised manuscript (Fig. 3e, lines 210-212). Regarding Figure 4, we have revised the figure based on the reviewer’s suggestions and provided more detailed information on how the expression data were analyzed (Transfection assays and quantification of BmIMP<sup>M</sup>, lines 495-520). We have also included more detailed background information on the assay system (Hm-oscar suppresses the masculinizing functions of lepidopteran masc genes, lines 215-237). Although we did not observe statistical significance based on the Steel-Dwass test, likely due to limited replicates, the observed changes in the IMP gene expression remain clearly evident.
The manuscript I think would be much improved by providing more details regarding some of the genes and cross-lineage comparisons. I know some of this is reported in previous publications, but some summary and/or additional analysis would make this current manuscript much more approachable for a broader audience, and help guide readers to specific important findings. For example, a graphic and/or more detail on how the wmk/oscar homologs (within and across Wolbachia strains) differ (e.g., domains, percent divergence, etc) would be helpful for contextualizing some of the results. I recognize the authors discuss this in parts (e.g., lines 223-227), but it does require some bouncing between sections to follow. Similarly, the experiments presented in Figure 4 indicate that Hm-oscar has broad spectrum activity: how similar are the masc proteins from these various lepidopterans? Are they highly conserved? Rapidly evolving? Do the patterns of masc protein evolution provide any hints at how Oscar might be interacting with masc?
Thank you for your valuable suggestion. To address this, we have included a visualization of the structural differences between the Oscar and wmk homologs in Figure 1a of the revised manuscript. In addition, we have included more detailed information for these genes and revised the introduction (lines 110-114; 124-137) and discussion (lines 255-266) to provide a clearer and more comprehensive overview. We have also described the similarity of the Masc proteins and Oscar proteins that we used, which is now reflected in the revised Figure 4b and 4d. More detailed information on these proteins is available in the supplementary data. Notably, Masc proteins exhibit high sequence variability with conserved domains (Figure 4d). A previous study identified the N-terminal region of Masc as crucial for Oscar function (Katsuma et al., 2022). The wide spectrum of Hm-Oscar activity likely stems from these conserved structures of Masc, but the effects might have undergone evolutionary tuning through interactions with the native host, as discussed in lines 293-294.
It is clear from Figure 1 that the combinations of wmk homologs do not cause male killing on their own. Did the authors test if any of the wmk homologs impact the MK phenotype of oscar? It looks like a previous study tested this in wFur (noted in lines 250-252), but given that the authors also highlight the differences between the wFur-oscar and Hm-oscar proteins, this may be worth testing in this system. Related to this, what is the explanation for why there would be 4 copies of wmk in Hm?
Thank you for your valuable suggestion. Unfortunately, we have not yet tested the effects of co-expression of wmk and Oscar. Due to a technical issue, the mixing of multiple constructs results in a reduced amount of mRNA (i.e., mixing wmk-3 and Hm-Oscar at the same concentration results in a 2-fold lower concentration of mRNA for both genes compared to mono-injected groups). In addition, we have previously tested injecting mRNA at a twofold higher concentration (i.e., 2 µg/µl mRNA), which resulted in very low hatchability regardless of the genes. Katsuma et al. (2022) tested the effect of wmk on the sex determination system, but did not test the effect of co-injection/transfection of wmk and Oscar. Considering the results of this and previous studies (Katsuma et al., 2022; Arai et al., 2023), it is likely that the targets of the wmk and oscar genes are different (as discussed in lines 267-289). Co-injection of wmk and oscar may not produce additive effects. Nevertheless, we would like to test this in future studies using the Drosophila system as well.
As you point out, it is an interesting point that the moth-derived MK Wolbachia wHm-t encodes four wmk genes, although they have no apparent effect on host survival. The exact functional relevance of these wmk homologs remains unclear. However, they may play a role in Wolbachia biology as transcriptional regulators, given that they encode HTH domains. Wolbachia generally encode several wmk homologs in their genome, regardless of whether they induce MK. This suggests that the functions of the wmk genes may be 'suppressed' in certain Wolbachia-host systems. The wmk and Hm-oscar genes are located within a prophage region, and some wmk genes are tandemly arrayed (as described in Arai et al., 2023). These wmk homologs may have increased in number by horizontal phage transfer, and the region containing wmk and adjacent sequences may act as a genomic island for virulence. So far, the function of wmk homologs has only been tested in D. melanogaster and H. magnanima, and further studies in other Wolbachia-host systems are highly warranted to test whether wmk exerts MK effects in other insect models. These points have been briefly discussed in the revised manuscript (lines 301-306; 318-320).
Why are some of the broods male-biased (2/3) rather than ~50:50? (Lines 170-175, Figure 2a). For example, there is a strong male bias in un-hatched oscar-injected and naturally infected embryos, whereas the control uninfected embryos have normal 50:50 sex ratios. It is difficult to interpret the rate of male-killing given that the sex ratios of different sets of zygotes are quite variable.
The observed male-biased sex ratios in unhatched embryos are due to the occurrence of MK during embryogenesis. In the unhatched groups, the skew towards males reflects the fact that the male embryos were targeted and killed by Wolbachia/Oscar, leading to a surplus of unhatched male embryos. Conversely, hatched individuals show a higher proportion of females because many of the males were eliminated during embryogenesis. Thus, the unhatched embryos are more male-biased, while the hatched individuals are more female-biased in the Hm-oscar/wHm-t treated groups. We have revised the relevant section (Males are killed mainly at the embryonic stage, lines 179-186) and provided more detailed information to clarify this explanation.
Figure 2b - it appears there are both male and female bands in the HmOsc male lane. I think this makes sense (likely a partial phenotype due to the nature of the overexpression approach), but this is worth highlighting, especially in the context of trying to understand how much of the MK phenotype might be recapitulated through these methods. Related, there is no negative control for this PCR.
Thank you for your suggestion. As you noted, a faint dsx-M band is visible in the Hm-oscar treated group in Figure 2b. This is consistent with previous findings by Arai et al. (2023), which reported that male embryos with low-density wHm-t showed double bands of dsx-M and dsx-F, similar to what we observed in this study. This information has been included in the revised manuscript in lines 196-198, as follows:
“Notably, male embryos expressing Hm-oscar also exhibited weak male-type dsx splicing in addition to the female-type splicing, resembling the previously observed pattern in male embryos infected with low-titer wHm-t (Arai et al., 2023a).”
We also appreciate your comment regarding the missing negative control. We realised that the negative control lane had been lost during the preparation of the figure, and the figure has now been revised to restore it. We also included the relevant molecular marker information in both the figure legends and Figure 2b.
It appears the RNA-seq analysis (Figure 3) is based on a single biological replicate for each condition. And, there are no statistical comparisons that support the conclusions of a shift in dosage compensation. Finally, it is unclear what exactly is new data here: the authors note "The expression data of the wHm-t-infected and non-infected groups were also calculated based on the transcriptome data included in Arai et al. (2023a)" - So, are the data in Figure 3c and 3d a re-print of previous data? The level of dosage compensation inferred by visually comparing the control conditions in 3b and 3d does not appear consistent. With only one biological replicate library per condition, what looks like a re-print of previous data, and no statistical comparisons, this is a weakly supported conclusion.
Thank you for your suggestion. In this study, we generated the RNA-seq data for the Hm-oscar/GFP-injected groups, but did not sequence the wHm-t-infected/NSR lines. Instead, the previously generated RNA-seq data of wHm-t-infected/NSR (Arai et al., 2023) were re-analyzed (rather than simply reprinted) to evaluate whether the expression patterns of Hm-oscar-injected and wHm-t-infected groups are similar. We have revised the Results section (Hm-oscar impairs dosage compensation in male embryos, lines 200-212), the Materials and methods section (Quantification of Z chromosome-linked genes, lines 454-456), and the figure legends to provide more precise information about this analysis.
Although we did not perform replicates for the RNA-seq comparisons, it is important to note that each RNA-seq sample contains 50-60 male/female individuals. We believe the results are still robust and clearly indicative of the trends we observe. This was further supported by the quantification of Hmtpi gene expression, which we have visualized in Figure 3e (and lines 210-212). As you noted, the expression patterns in Figure 3b (GFP injected) and Figure 3d (NSR) are not completely identical. This discrepancy may be due to the differences between injection treatments and natural infections. Nevertheless, both treatments are consistent in showing that gene expressions on the Z chromosome (Chr01 and Chr15) are not upregulated.
We have also added more detailed statistical data for Figure 3 based on the Steel-Dwass tests. For Figure 3a-d, however, showing the statistical significance directly on the whisker plots would create excessive clutter due to the numerous combinations of chromosomes. Instead, we have provided the full statistical data in the supplementary data file. Furthermore, to strengthen our conclusion that Z-linked genes are highly expressed in wHm-t-infected/Hm-Oscar-injected embryos, we have included expression data for the Z-linked gene tpi, along with statistical data, in the revised manuscript (Fig. 3e, lines 210-212).
In Figure 4: There are no statistics to support the conclusions presented here. Additionally, the data have gone through a normalization process, but it is difficult to follow exactly how this was done. The control conditions appear to always be normalized to 100 ("The expression levels of BmImpM in the Masc and Hm-Oscar/Oscar co-transfected cells were normalized by setting each Masc-transfected cell as 100"). I see two problems with this approach:
(1) This has eliminated all of the natural variation in BmImpM expression, which is likely not always identical across cells/replicates.
(2) How then was the percentage of BmImpM calculated for each of the experimental conditions? Was each replicate sample arbitrarily paired with a control sample? This can lead to very different outcomes depending on which samples are paired with each other. The most appropriate way to calculate the change between experimental and control would be to take the difference between every single sample (6 total, 3 control, 3 experimental) and the mean of the control group. The mean of the control can then be set at 100 as the authors like, but this also maintains the variability in the dataset and then eliminates the issue of arbitrary pairings. This approach would also then facilitate statistical comparisons which is currently missing.
Thank you for your suggestion. As you pointed out in (1), the previous analysis did indeed eliminate the natural variation in BmIMP-M expression. In the revised manuscript and Figure 4, we have reanalyzed the data following your suggestion and have described the variation across replicates.
For (2), the data shown in the previous manuscript were normalized to 100 for each Masc-treated group. In doing so, each replicate sample was arbitrarily paired with a control sample from the same cell lot to account for variations that might occur due to differences in cell lots. However, following your recommendation, we have revised the figure to set the average of the Hm-masc treated group to 100, rather than using arbitrary pairings. More detailed normalization procedures have been provided in the section 'Transfection assays and quantification of BmIMP' (lines 483-520). Additionally, we have provided more detailed background information on the assay system in lines 218-223. Although we did not observe statistical significance based on the Steel-Dwass test, likely due to the limited number of replicates, the differences in IMP gene expression between the Masc-treated and Masc&Hm-oscar-treated groups remain evident.
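The normalization the reviewer recommends, scaling every replicate by the mean of the control group so that the control mean becomes 100 while replicate-to-replicate variation is preserved, can be sketched as follows. This is a minimal illustration only: the function name and the expression values are hypothetical and are not taken from the manuscript.

```python
from statistics import mean

def normalize_to_control_mean(control, treated):
    """Scale all replicates so that the control-group mean equals 100.

    Each replicate keeps its own value, so the spread within both groups
    survives and no control/treated pairing is needed.
    """
    control_mean = mean(control)
    if control_mean == 0:
        raise ValueError("control mean is zero; cannot normalize")
    scale = 100.0 / control_mean
    return [v * scale for v in control], [v * scale for v in treated]

# Hypothetical relative expression values, three replicates per group.
masc_only = [1.0, 1.5, 0.5]
masc_plus_oscar = [0.25, 0.5, 0.75]

control_pct, treated_pct = normalize_to_control_mean(masc_only, masc_plus_oscar)
print(control_pct)   # control replicates, mean fixed at 100
print(treated_pct)   # treated replicates on the same scale
```

Because the scaling factor is shared by all six samples, the result does not depend on which treated replicate is compared with which control replicate.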
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
Line 38: change to: 'Wolbachia are maternally transmitted'.
Revised accordingly (line 38).
Line 69: remove 'seemingly'.
Revised accordingly (line 69).
Paragraph starting line 123: I don't think this is so clear to a reader who is not familiar with the work and system. It would be helpful to more clearly explain that candidate male-killing genes from Wolbachia that infect Homona were inserted into Drosophila melanogaster, and that their expression was then induced, with interesting patterns (and that it can be a bit difficult to interpret the transgenic expression of genes from a moth male-killer that are inserted into a fly). Also, the sentence about the combined action of cifA and cifB in Drosophila cytoplasmic incompatibility is also confusing to a non-expert. I would suggest removing it.
Thank you for your suggestion. We have revised the paragraph (lines 124-139) to provide clearer background information, making it easier for non-experts to follow. We have also removed the sentence regarding the combined effect of cifA and cifB to improve the flow and overall clarity.
Line 170: what is the explanation for the male-biased sex ratio instead of 50-50?
The male-biased sex ratio occurs because MK happens during embryogenesis. Unhatched embryos include males that were killed by Wolbachia/Oscar, resulting in a higher proportion of unhatched male embryos. Conversely, the hatched individuals display a female bias, as most of the males were eliminated during embryogenesis. Thus, the unhatched embryos are more male-biased, while the hatched individuals are more female-biased in the Hm-oscar/wHm-t treated groups. We have revised the section “Males are killed mainly at the embryonic stage” (lines 170-186) to include more detailed information explaining this phenomenon.
Line 190: please explain what are the Z chromosomes in Bombyx and Homona and Lepidoptera in general (chromosomes 1 and 15?), as this is not so clear for a non-expert.
Thank you for your suggestion. I have revised the section (lines 200-212) to include more precise background information about the chromosome constitutions in lines 202-204 as follows:
“Unlike other lepidopteran species, Tortricidae, including H. magnanima, generally possess a large Z chromosome that is homologous to B. mori chromosomes 1 (Z) and 15 (autosome).”
Line 222: please explain oscar diversity and classification in more detail, as this is not so clear for a non-expert.
Thank you for your suggestion. We have revised the sentences to provide clearer background information on the diversity of oscar genes (lines 255-264).
Figure 4: I found this difficult to follow. Why are there 2 rows (HmOscar and Oscar)? Does oscar here refer to oscar from Ostrinia? I am also a bit confused about the baseline control of Masc in these cell lines. If I understand Lepidoptera sex determination, then these cell lines are expressing high levels of female-specific piRNAs that suppress Masc. How specific are these piRNAs (i.e. do Bombyx piRNAs suppress Mascs from other Lepidoptera)? How much extra Masc will override endogenous piRNA? Information is lost by setting Masc expression to 100% in each separate comparison.
Yes, the Oscar indicates the wFur-encoded oscar (Oscar from Ostrinia) that was tested to compare its function with that of the Homona-derived Hm-oscar gene. In addition, following the reviewer's suggestions, we have revised the figure and included more detailed information on how we adjusted the expressions in the M&M section.
A previous study (Shoji et al., 2017, RNA 23:86–97) demonstrated that the Fem piRNA (29 bp) in Bombyx mori requires a 17 bp complementary sequence from its 5' region for its function. However, in species other than B. mori, no significant homology (i.e., over 17 bp matches) was found between the B. mori Fem piRNA and the masc genes analyzed in this study. Therefore, it is likely that the Fem piRNA expressed in BmN-4 cells is unable to suppress the masculinizing function driven by masc genes in other lepidopteran species. In addition, we did not quantify the levels of piRNA in this system, but the expression levels of masc are probably too high to be suppressed.
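The homology argument above amounts to asking whether a target transcript contains a site complementary to the first 17 nucleotides of the piRNA. A minimal sketch of such a check is given below, under the simplifying assumption that only perfect, ungapped complementarity counts. The sequences are invented placeholders, not the real Fem piRNA or any masc sequence, and the function names are hypothetical.

```python
# Watson-Crick complement table for RNA.
COMPLEMENT = str.maketrans("ACGU", "UGCA")

def reverse_complement(rna):
    """Return the reverse complement of an RNA string (A, C, G, U only)."""
    return rna.translate(COMPLEMENT)[::-1]

def has_seed_match(pirna, target, seed_len=17):
    """True if `target` contains a site perfectly complementary to the
    first `seed_len` nucleotides (5' region) of `pirna`."""
    pirna = pirna.upper().replace("T", "U")
    target = target.upper().replace("T", "U")
    if len(pirna) < seed_len:
        raise ValueError("piRNA is shorter than the required seed length")
    site = reverse_complement(pirna[:seed_len])
    return site in target

# Invented 29-nt placeholder sequence, for illustration only.
pirna = "AUGGCUAACGUUAGCCAUGGAUCCGAUAC"

# A target that carries the complementary site, and one that does not.
matching_target = "GGG" + reverse_complement(pirna[:17]) + "CCC"
unrelated_target = "G" * 40

print(has_seed_match(pirna, matching_target))
print(has_seed_match(pirna, unrelated_target))
```

A real analysis would likely tolerate mismatches or use an alignment tool; exact substring matching is used here only to keep the idea visible.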
Figure 4 legend: spelling of Spodoptera.
Revised accordingly.
Reviewer #2 (Recommendations for the authors):
In Figure 2, what is the dsx splicing type for the hatched male in the Hm-oscar-injected group and the wHm-t infected line? Dsx-F or dsx-M?
Thank you for your suggestion. Unfortunately, we have not tested splicing in the hatched male neonates (1st instar larvae), partly due to difficulties in obtaining sufficient material for RNA extraction. Based on the previous publication in the Ostrinia system, where Oscar-bearing wSca induces MK, the hatched males (ZZ) exhibit female type dsx as observed in the male embryos (Herran et al., 2022). The hatched Homona males may show double bands for dsx-M and dsx-F as observed in this study.
The size of the markers (in kilobase pairs) should be indicated in Figure 2.
We have accordingly included the marker information in the revised Figure 2b and the figure legends.
In Figure 3, could the authors identify which genes exhibit higher expression levels in the Hm-oscar-injected group and the wHm-t infected line? Could they provide hints for the possible mechanism of male-killing?
In the RNA-seq data shown in Figure 3a-d, we observed that both the Hm-oscar-injected and wHm-infected groups generally exhibited upregulated expression of Z-linked genes. Rather than the upregulation or downregulation of a specific gene, we consider that global upregulation of Z-linked genes, caused by improper dosage compensation, is lethal for males. The Z chromosome contains various genes involved in key biological processes such as endocrine function and detoxification, and disruption of these processes may contribute to male lethality. Additionally, in this revised manuscript, we have provided more detailed information on the expression level of the Z-linked gene tpi. We have also discussed the potential mechanisms of MK in the Discussion section (lines 245-254).
The format of the references should be consistent. Gene and species names should be italicized.
We have formatted them accordingly.
Reviewer #3 (Recommendations for the authors):
The authors use the term "upstream" (e.g., Oscar suppressed the function of masculinizer, the upstream male sex determinant...), which was sometimes confusing. In many cases, it reads as though the masculinizer was upstream of oscar, but what I think the authors are trying to convey is that masculinizer is a primary sex-determining factor.
Thank you for your suggestion. We have accordingly revised the term.
Line 101: which insect is wFur from?
It is from Ostrinia furnacalis - line 104 has been revised.
Figure 1: it would be helpful to indicate the statistical results on the figure.
Accordingly, we have added statistical data (binomial test) for Figure 1. The data for the Steel-Dwass test have been included in the supplementary data.
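For readers unfamiliar with the test, a sex-ratio comparison of this kind can be sketched as an exact two-sided binomial test. The sketch assumes the null hypothesis is a 50:50 sex ratio, which the response does not state explicitly, and the counts are hypothetical rather than taken from Figure 1.

```python
from math import comb

def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value: the summed probability of all
    outcomes that are no more likely than the observed count k."""
    if not 0 <= k <= n:
        raise ValueError("k must lie between 0 and n")
    probs = [comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)]
    cutoff = probs[k] * (1 + 1e-7)  # small tolerance for floating-point ties
    return min(1.0, sum(q for q in probs if q <= cutoff))

# Hypothetical brood: 30 females and 10 males among hatched individuals.
females, males = 30, 10
p_value = binomial_two_sided_p(males, females + males)
print(f"two-sided p = {p_value:.4g}")
```

A small p-value here would indicate that the observed sex ratio is unlikely under an even ratio; it says nothing by itself about the stage at which males are lost.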
Figure 2b: please label the ladder on the gel.
Thank you for your suggestion. We have accordingly labeled the DNA ladder on the gel.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This study provides compelling data regarding the molecular characterization of a rare tumor type with few treatment options. This fundamental work significantly advances our mechanistic understanding of solitary fibrous tumours, a critical first step towards targeted precision medicine approaches. The results of this study will be of broad interest to cancer biologists and experimental oncologists.
-
Joint Public Review:
Solitary Fibrous Tumors (SFTs) are a rare malignancy defined by NAB2-STAT6 fusions. Because the molecular understanding of the disease is largely lacking, there are currently no targeted treatment approaches. Using primary tumor and adjacent normal tissue samples and cells inducibly expressing NAB2-STAT6, Hill et al. perform a detailed characterization of the transcriptomic and epigenomic NAB2-STAT6 SFT signatures. They identify enrichment of EGR1/NAB2 (but not STAT6) sites bound by the fusion protein and increased expression of EGR1 targets. Their studies indicate that NAB2-STAT6 fusion may direct the nuclear translocation of NAB2 and EGR1 proteins and potentially NAB1. Transcriptionally, NAB2-STAT6 SFTs most closely resemble neuroendocrine tumors.
This pioneering study provides critical insight into the molecular pathogenesis of SFTs, pivotal for the future development of mechanistically informed treatment approaches. The study is rigorously executed and well-written. This new knowledge is an important addition to the field.
-
Author response:
The following is the authors’ response to the original reviews.
Response to the Joint Public Review:
We are indebted to eLife’s reviewing process for helping us improve our manuscript and for highlighting that our study provides new molecular insights into SFT pathogenesis.
Response to Reviewers:
(1) The authors state that "NAB2-STAT6 localization is exclusively driven by EGR1 binding" yet WT1 motifs are also consistently enriched. Can you please touch upon the potential involvement of WT1 (or lack thereof, and why)?
Our data suggest that EGR1 is the primary driver of NAB2-STAT6 localization. In fact, EGR1 is the most significantly enriched motif (Fig. 4) at NAB2-STAT6 binding sites and we detect an interaction between the fusion protein and EGR1 (Fig. 5). Conversely, we did not identify an interaction between NAB2-STAT6 and WT1. However, WT1 also belongs to the C2H2 zinc finger subclass and recognizes a motif bearing striking similarities to the EGR1/2 consensus. EGR1 has been previously described to bind WT1 motifs and to function as an activator of WT1 targets (as opposed to WT1 repressive abilities). See https://www.jbc.org/article/S0021-9258(20)74720-4/fulltext and https://www.sciencedirect.com/science/article/pii/S0378111901005935.
(2) In the description of Figure 5C the authors observe nuclear staining of both NAB2 and STAT6 following NAB2-STAT6 fusion induction. They interpret this as the fusion stimulates nuclear translocation of endogenous NAB2. This statement can only be rigorously made if the authors can unequivocally demonstrate that their antibody exclusively detects endogenous NAB2 and not the NAB2 portion of the fusion. As presented, a more likely interpretation is that the NAB2 staining detects NAB2-STAT6 fusion protein. Since there is some cytoplasmic NAB2 signal still present, the findings in Figure 5c do not support nor disprove nuclear translocation of endogenous NAB2. It may be prudent to remove this section. Figure 5B is currently the best direct evidence of nuclear translocation.
We agree with the reviewer that Fig. 5C does not rigorously show that NAB2-STAT6 fusion proteins drag endogenous NAB2 into the nucleus. The immunostaining reveals that wt NAB2 localization is overwhelmingly cytoplasmic at steady-state conditions (and prior to expression of the fusion protein). Instead, Figure 5B shows that endogenous NAB2 translocates to the nucleus upon NAB2-STAT6 expression. Additionally, Figure 5A (along with Suppl. Fig. 5E-F) demonstrates that endogenous NAB2 co-precipitates with NAB2-STAT6 fusions in nuclear extracts of U2OS and HEK293T cells. We have rephrased the paragraph accordingly.
(3) Figure 5D: for the interpretation of the presented data to hold up, namely, NAB1 nuclear translocation upon NAB2-STAT6 expression, it is important to demonstrate that NAB1 antibodies do not cross-react with NAB2 given the similarity between NAB1 and NAB2. Without such control, another likely interpretation of the results in Figure 5D is that NAB1 antibody detects the NAB2 portion of the overexpressed fusion protein. This needs to be acknowledged in the text.
We had similar concerns and therefore confirmed by immunoblot that the NAB1 antibody does not cross-react with NAB2 (see figure below). We overexpressed FLAG-NAB2, HA-NAB1 and GFP constructs in HEK293T cells and performed immunoprecipitation with either HA or FLAG from whole cell extracts, followed by western blot using anti-NAB2 and anti-NAB1 polyclonal antibodies. We did not observe cross-reactivity of these antibodies. We have acknowledged this antibody validation in the revised text.
Author response image 1.
(4) Also, to support the notion that NAB2-STAT6 fusion promotes nuclear translocation of the entire complex, an imaging approach detecting EGR1 similar to Figure 5C-D would be helpful. EGR1 staining also avoids the potential pitfall of NAB1/2 antibodies detecting NAB2-STAT6 overexpressed fusion instead of endogenous proteins.
We agree with the reviewer that this would be a helpful approach. Unfortunately, none of the commercially available EGR1 antibodies that we tested were suitable for immunocytochemistry, as they either failed to show a proper signal or were marred by high nonspecific background signal.
(5) The authors found increased mRNA expression of certain cytokines and secreted neuropeptides in SFTs. While this may be consistent with a secretory phenotype, additional evidence such as detection of elevated levels of these proteins in tumor lysates or in culture media is necessary to formally make this claim. Please rephrase.
We have rephrased our claims as suggested. The revised text is now as follows: “We also identified a distinct secretory gene signature associated with SFTs. In fact, IGF2 is the most upregulated gene, via activation of an intronic enhancer by EGR1. IGF2 was pinpointed as the cause of hypoglycemia occurring in a very small subset of SFTs (Doege–Potter syndrome)(52). Our data suggest that IGF2 (and IGF1) upregulation is a common feature of all SFTs. In addition to insulin-like growth factors, SFTs may secrete a host of peptides with diverse functions in neuronal processes, chemotaxis, and growth stimulation. The previously unrecognized neuronal features and the putative secretory phenotype of SFTs set them apart from mesenchymal malignancies and relate them to neuroendocrine malignancies such as pheochromocytoma, oligodendroglioma and neuroblastoma.”
(6) GSEA with 500 randomly selected genes from target datasets needs a more detailed description to clarify the method.
To improve clarity, we added the following description: “Gene set enrichment analysis (GSEA) was done with 500 randomly selected genes from the given set of genes across the C2 collection of the human molecular signatures database or custom signatures using the GSEA function in the clusterProfiler package in R (v4.6.2).”
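The subsampling step in that description, drawing 500 genes at random from a larger gene set before enrichment testing, can be sketched as below. This illustrates only the sampling step and is not the authors' R code: the fixed seed, the de-duplication, and the handling of sets with 500 or fewer genes are all assumptions, and the gene identifiers are placeholders.

```python
import random

def subsample_genes(genes, n=500, seed=0):
    """Draw `n` distinct genes at random from `genes`, reproducibly.

    Sets with `n` or fewer unique genes are returned whole (an assumption;
    the quoted methods text does not say how small sets were handled).
    """
    unique = sorted(set(genes))          # de-duplicate, fix the input order
    if len(unique) <= n:
        return unique
    return random.Random(seed).sample(unique, n)

# Placeholder identifiers standing in for a target gene set.
target_genes = [f"GENE{i}" for i in range(2000)]
subset = subsample_genes(target_genes)
print(len(subset))
```

Fixing the seed makes the selected subset reproducible, which matters because a different random draw can change the enrichment result.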
(7) In the IP-MS description, please double check the NaCl concentration in the second extraction step - 0.5mM seems low. Also, in the IP part, a buffer recipe appears to have been incorrectly pasted.
We thank the reviewer for identifying this typo. Indeed, we used 0.5M NaCl instead of 0.5mM. We have corrected the co-IP buffer recipe accordingly.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This valuable contribution combines high-resolution histology with magnetic resonance imaging in a novel way to study the organisation of the human amygdala. The main findings convincingly show the axes of microstructural organisation within the amygdala and how they map onto the functional organisation. Overall, the approach taken in this paper showcases the utility of combining multiple modalities at different spatial scales to help understand brain organisation.
-
Reviewer #1 (Public review):
The paper by Auer et al. makes several contributions:
(1) The study developed a novel approach to map the microstructural organization of the human amygdala by applying radiomics and dimensionality reduction techniques to high-resolution histological data from the BigBrain dataset.
(2) The method identified two main axes of microstructural variation in the amygdala, which could be translated to in vivo 7 Tesla MRI data in individual subjects.
(3) Functional connectivity analysis using resting-state fMRI suggests that microstructurally defined amygdala subregions had distinct patterns of functional connectivity to cortical networks, particularly the limbic, frontoparietal, and default mode networks.
(4) Meta-analytic decoding was used to suggest that the superior amygdala subregion's connectivity is associated with autobiographical memory, while the inferior subregion was linked to emotional face processing.
(5) Overall, the data-driven, multimodal approach provides an account of amygdala microstructure and possibly function that can be applied at the individual subject level, potentially advancing research on amygdala organization.
-
Reviewer #2 (Public review):
Summary:
This study bridges a micro- to macroscale understanding of the organization of the amygdala. First, using a data-driven approach, the authors identify structural clusters in the human amygdala from high-resolution post-mortem histological data. Next, they use multimodal imaging data to identify structural subunits of the amygdala and the functional networks in which they are involved. This approach is exciting because it permits the identification of both structural amygdalar subunits, and their functional implications, in individual subjects. There are, however, some differences in the macro and microscale levels of organization that should be addressed.
Strengths:
The use of data-driven parcellation on a structure that is important for human emotion and cognition, and the combination of this with high-resolution individual imaging-based parcellation, is a powerful and exciting approach, addressing both the need for a template-level understanding of organization as well as a parcellation that is valid for individuals. The functional decoding of rsfMRI permits valuable insight into the functional role of structural subunits. Overall, the combination of micro to macro, structure, and function, and general organization to individual relevance is an impressive holistic approach to brain mapping.
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1:
The paper by Auer et al. makes several contributions: (1) The study developed a novel approach to map the microstructural organization of the human amygdala by applying radiomics and dimensionality reduction techniques to high-resolution histological data from the BigBrain dataset. (2) The method identified two main axes of microstructural variation in the amygdala, which could be translated to in vivo 7 Tesla MRI data in individual subjects. (3) Functional connectivity analysis using resting-state fMRI suggests that microstructurally defined amygdala subregions had distinct patterns of functional connectivity to cortical networks, particularly the limbic, frontoparietal, and default mode networks. (4) Meta-analytic decoding was used to suggest that the superior amygdala subregion's connectivity is associated with autobiographical memory, while the inferior subregion was linked to emotional face processing. (5) Overall, the data-driven, multimodal approach provides an account of amygdala microstructure and possibly function that can be applied at the individual subject level, potentially advancing research on amygdala organization.
We thank the Reviewer for the positive comments and insightful evaluation of the work.
(1.1) Although these are meritorious contributions there are some concerns that I will summarize below. The paper makes little-to-no contact with the monkey literature regarding the anatomy of amygdala subregions, their functionality, and their patterns of anatomical connectivity. This is surprising because such literature on non-human primates is a very important starting point for understanding the human amygdala. I recommend taking a careful look at the work by Helen Barbas, among others. There are too many papers to cite but a notable example is: Ghashghaei, H. T., Hilgetag, C. C., & Barbas, H. (2007). Sequence of information processing for emotions based on the anatomic dialogue between prefrontal cortex and amygdala. Neuroimage, 34(3), 905-923. The work of Amaral is also highly relevant.
As suggested, we included the important work of Amaral et al. and Ghashghaei et al., highlighting their contributions to mapping the intricate anatomy and function of the amygdala in non-human primates. We comment on this in the Introduction of the manuscript. Please see P.3.
“Early research on the amygdala in non-human primates has been instrumental in understanding its intricate structure, function and patterns of anatomical connectivity (Amaral and Price 1984; Ghashghaei et al. 2007). This foundational study highlights the amygdala’s different subdivisions, most notably the basomedial nucleus (BM), basolateral nucleus (BL), and central nucleus (Ce) (Amaral et al. 1992). Furthermore, this work describes a dense network between these subdivisions and the prefrontal cortex, most strongly found in the posterior orbitofrontal and anterior cingulate areas.”
(1.2) Furthermore, the authors subscribe to a model with LB, CM, and SF sectors. How does the SF sector relate to monkey anatomy?
The overall organization of these subregions is largely conserved between humans and monkeys, reflecting their evolutionary relationship. While the basic subregional organization is conserved, there are still some important structural and functional differences between human and monkey amygdalae. For example, the SF subregion, as often described in humans, includes parts of the cortical nuclei (VCo), anterior amygdaloid area (AAA), amygdalohippocampal transition area (AHi), amygdalopiriform transition area (APir) as well as the lateral olfactory tract (LOT). This remark was added in the Discussion, on P.12:
“However, this region has been previously described as consisting of three main subdivisions: LB, CM, and SF, each composed of smaller subnuclei with distinct connectivity patterns and functions (Amunts et al. 2005; Ball et al. 2007; Bzdok et al. 2013; de Olmos and Heimer 1999). These subregions are largely conserved between humans and monkeys, reflecting their evolutionary relationship. However, there are still some considerable differences such as in the SF subregion, where its description in monkeys additionally contains the lateral olfactory tract (LOT) (De Olmos 1990).”
(1.3) The authors use meta-analytical decoding via NeuroSynth. If the authors like those results of course they should keep them but the quality of coordinate reporting in the literature is insufficient to conclude much in the context of amygdala subregion function in my opinion. I believe the results reported are at most "somewhat suggestive".
We agree with the Reviewer that use of data from NeuroSynth poses unique challenges, particularly relating to investigations of a small structure such as the amygdala. However, to clarify, these analyses decode the cortex-wide functional connectivity patterns of amygdala subregions and not activations within subregions defined by our microanatomical analyses. Additionally, comments from Reviewer 2 suggested expanding the NeuroSynth decoding to the contralateral hemisphere. As such, we decided to keep this analysis in the main manuscript but rephrase the interpretation of these findings in the Discussion to emphasize their exploratory nature on P.13:
“Functional decoding of subregional functional connectivity patterns indicated possible dissociations in cognitive (e.g., memory) and affective (e.g., emotional face processing) functions of the amygdala, echoing previous accounts of this region’s involvement in associative processing of emotional stimuli. Notably, these findings link the functional connectivity profile of a subregion partially co-localizing with LB to emotional face processing. The LB subregion has been previously linked to associative processing related to the integration of sensory information (Bzdok et al. 2013; Ghods-Sharifi, St Onge, and Floresco 2009; Pessoa 2010; Winstanley et al. 2004; Boyer 2008), which is consistent with the association with visual emotional information processing identified in the present work.”
(1.4) Another significant concern has to do with the results in Figure 3. The red and yellow clusters identified are quite distinct but the differences in functional connectivity are very modest. Figure 3C reveals very similar functional connectivity with the networks investigated. This is very surprising, and the authors should include a careful comparison with related findings in the literature. Overall, there is limited comparison between the observed results and those obtained via other methods. On a more pessimistic note, the results of Figure 3 seem to question the validity of the general approach.
We agree with the Reviewer that we can indeed observe considerable overlap between functional connectivity profiles of amygdala subregions. The amygdala is a relatively small structure, leading to likely interconnectivity between its subregions (Bzdok et al. 2013), in addition to BOLD signal autocorrelation within this region. Functional signals in the amygdala are also affected by a relatively low signal-to-noise ratio (SNR), a limitation extending to temporobasal and mesiotemporal regions. Despite these challenges, our technique remained sensitive enough to detect subtle differences in connectivity patterns even in this small group of subjects and in this restricted subcortical territory.
In the revised manuscript, we further highlight these caveats in the Discussion (P.13):
“Although these findings are promising, we also observe considerable overlap between functional connectivity networks of both our defined subregions. Indeed, the amygdala is a relatively small structure, leading to likely interconnectivity between its subregions and locally high signal autocorrelation. Functional connectivity and microstructure in the amygdala are certainly related, however previous work suggests they do not perfectly overlap (Bzdok et al. 2013). In addition, this region is affected by relatively low signal-to-noise ratio (SNR), as is observed in broader temporobasal and mesiotemporal territories.”
(1.5) Some statements in the Discussion feel unwarranted. For example, "significant dissociation in functional connectivity to prefrontal structures that support self-referential, reward-related, and socio-affective processes." This feels way beyond what can be stated based on the analyses performed.
We agree that this interpretation may reach beyond the analyses performed and reported findings. We have adjusted this portion of the text accordingly in our Discussion on functional connectivity findings (P.13):
“Qualitatively, we found that the subregion defined by the highest 25% of U1 values mainly overlapped with what is commonly defined as the superficial and centromedial subregions, whereas the lowest 25% U1 values subregion overlapped mostly with the laterobasal division. Interestingly, CM and SF characterized subregions showed significantly stronger functional connectivity to prefrontal structures. This finding aligns with previous work demonstrating unique affiliations between the CM subregion and anterior cingulate and frontal cortices (Kapp, Supple, and Whalen 1994; Barbour et al. 2010), as well as between the SF subregion and the orbitofrontal cortex (Goossens et al. 2009; Caparelli et al. 2017; Pessoa 2010; Klein-Flügge et al. 2022).”
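The quartile-based definition of subregions in this passage can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the U1 values are synthetic, the function name `split_by_quartiles` is hypothetical, and the quartile cut points come from Python's `statistics.quantiles` with its default method, which may differ from the percentile convention used in the manuscript.

```python
import statistics


def split_by_quartiles(u1_values):
    """Return indices of voxels in the lowest and highest 25% of U1."""
    q1, _, q3 = statistics.quantiles(u1_values, n=4)  # three cut points
    lowest = [i for i, v in enumerate(u1_values) if v <= q1]
    highest = [i for i, v in enumerate(u1_values) if v >= q3]
    return lowest, highest


# Synthetic U1 values for 100 voxels (placeholder data only).
u1 = [float(i) for i in range(100)]
lowest, highest = split_by_quartiles(u1)
```

The two index lists are disjoint by construction, so each can serve as a separate seed mask when comparing connectivity profiles.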
Additionally, we have also edited our Discussion to ensure that our interpretations are grounded in the analyses conducted, while framing the findings as potential avenues for future work. Please see P.13.
“Functional decoding of functional connectivity results indicated possible dissociations in cognitive (e.g., memory) and affective (e.g., emotional face processing) functions of the amygdala, echoing previous accounts of this region’s functional specialization and subregional segregation of associative processing of emotional stimuli.”
Recommendations for the authors:
(1.6) Figure 1 has panels A-I but only A-D are discussed in the caption. The orientation of the slices is not indicated which makes it very hard to follow for most readers.
The subpanels are now referred to in the revised Results. We also added a notation on the orientation of the slices and described them accordingly in our Figure 1 description (P.5-6):
“(A) The amygdala was segmented from the 100-micron resolution BigBrain dataset using an existing subcortical parcellation (Xiao et al. 2019). Slice orientation is consistent across all panels in this figure.”
(1.7) Some figure references in the text seem to be incorrect; please check that the text refers to the correct figure number and panel.
We thank the Reviewer for pointing this out. We thoroughly revised the correspondence between figure panel labels and their referencing in the text.
Reviewer #2:
This study bridges a micro- to macroscale understanding of the organization of the amygdala. First, using a data-driven approach, the authors identify structural clusters in the human amygdala from high-resolution post-mortem histological data. Next, they use multimodal imaging data to identify structural subunits of the amygdala and the functional networks in which they are involved. This approach is exciting because it permits the identification of both structural amygdalar subunits, and their functional implications, in individual subjects. There are, however, some differences in the macro and microscale levels of organization that should be addressed.
Strengths:
The use of data-driven parcellation on a structure that is important for human emotion and cognition, and the combination of this with high-resolution individual imaging-based parcellation, is a powerful and exciting approach, addressing both the need for a template-level understanding of organization as well as a parcellation that is valid for individuals. The functional decoding of rsfMRI permits valuable insight into the functional role of structural subunits. Overall, the combination of micro to macro, structure, and function, and general organization to individual relevance is an impressive holistic approach to brain mapping.
We thank the Reviewer for their constructive and helpful feedback on our work.
Weaknesses:
(2.1) UMAP 1, as calculated from the histological data, appears to correlate well across individuals, and decently with the MRI data, although the medial-lateral coordinate axis is an outlier. UMAP 2, on the other hand, does not appear to correlate well with imaging data or across individuals. This does pose a problem with the claim that this paper bridges micro- and macroscale parcellations. One might certainly expect, however, that different levels of organization might parcellate differently, but the authors should address this in the discussion and offer ways forward.
Data-driven methods hold several advantages for the quantitative extraction of signal from the underlying data in an observer-independent manner. However, these techniques are also sensitive to potential idiosyncrasies in the data. In the present work, our main analyses rely on the processing of a histological dataset (BigBrain) providing a unique opportunity for high-resolution analysis of amygdala histology and in vivo translation of findings leveraging ultra-high field MRI (n=10). However, both datasets are limited by their small sample size (n=1 for BigBrain and n=10 for MICA-PNI). As a result, we speculate that signal variations captured by U2 may be sensitive to artifacts or subject-specific sources of variance. Moving forward, this hypothesis could be assessed in future work via the analysis of larger histological and neuroimaging datasets to better track recurring features picked up by U2 or the association of these unique topographies with behavioural markers.
As suggested, we included a section in our Discussion highlighting this shortcoming and the importance for larger datasets moving forward. Please see P.11-12.
“However, it is important to note that both datasets analyzed in this work are limited by their small sample size (n=1 for BigBrain and n=10 for MICA-PNI). We speculate that the signal variations captured by U2 may be sensitive to artifacts or subject-specific sources of variance, potentially explaining why it was not consistent between subjects and modalities. Moving forward, this hypothesis could be assessed in future work via the analysis of larger histological and neuroimaging datasets to better track recurring features picked up by U2 or the association of these unique topographies with behavioural markers.”
(2.2) It would be interesting to see functional decoding for the right amygdala. This could be included in the supplementary material. A discussion of differences in the results in the two hemispheres could be illuminating.
In accordance with the Reviewer’s suggestion, we added Supplementary figure S2 exploring the decoding of connectivity profiles of the right amygdala stratified by its cytoarchitectural embedding with UMAP.
Upon analysis, dissociation in functional connectivity patterns over the right amygdala was less evident, leading to overall similar functional decoding across the two clusters. We refer to this Supplementary Figure in our Discussion on P.13.
“For the right amygdala, dissociation in functional connectivity patterns was more subtle, leading to overall similar functional decoding across the two clusters (Figure S2).”
(2.3) The authors acknowledge that this mapping matches some but not all subunits that have been previously described in the amygdala. It would be helpful to neuroanatomists if the authors could discuss these differences in more detail in the discussion, to identify how this mapping differs and what the implications of this are.
In our work, we focus on mapping the three well-characterized amygdala subregions, specifically the superficial (SF), centromedial (CM) and laterobasal (LB) subdivisions. Qualitative histological accounts have indeed delineated multiple subunits within these subregions, which we now describe in the revised manuscript. Due to the lower resolution of in vivo MRI data used in this work relative to post mortem histology, we focused our analyses on larger subregions that could be more reliably mapped to native quantitative T1 spaces of each participant. We now overview this issue in the Discussion. Please see P.12.
“Although qualitative histological accounts have indeed delineated multiple subunits within these general regions, the present work focuses on three subdivisions (Amunts et al. 2005) to account for resolution disparities when translating our findings to in vivo MRI data. The LB subdivision includes the basomedial nucleus (Bm), basolateral nucleus (BL), lateral nucleus (LA) and paralaminar nucleus (PL). Moving medially, the CM subdivision includes the central (Ce) and medial nuclei (Me), while the SF subdivision includes the anterior amygdaloid area (AAA), amygdalohippocampal transition area (AHi), amygdalopiriform transition area (APir), and ventral cortical nucleus (VCo) (Heimer et al. 1999). However, disagreement on the precise attribution of nuclei to broader subdivisions motivated our investigations of probabilistic subunits of the amygdala (Kedo et al. 2018). The development of new tools to segment amygdala subnuclei in vivo opens opportunities for future work to further validate our framework at the precision of these nuclei within subjects (Saygin et al. 2017).”
(2.4) The acronym UMAP is not explained. A brief explanation and description would be useful to the reader.
We moved the expanded acronym from the Methods to the first instance of the term UMAP in our paper, found in the Introduction. As suggested, we also added a sentence describing the technique. Please see P.6.
“We then applied Uniform Manifold Approximation and Projection (UMAP), a non-linear dimensionality reduction technique that preserves the local and global structure of high-dimensional data by projecting it into a lower-dimensional space (Becht et al. 2018), to the resulting 20-feature matrix to derive a 2-dimensional embedding of amygdala cytoarchitecture (Figure 1D).”
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This important study provides insights into the role of the cerebellum in fear conditioning, addressing a key gap in the literature. The evidence presented is solid overall, although the theoretical framing and clarity of the results can be improved and some concerns remain about the reliability of results based on small numbers of trials. This work will be of interest to both the extinction learning and cerebellar research communities.
-
Reviewer #1 (Public review):
Nio and colleagues address an important question about how the cerebellum and ventral tegmental area (VTA) contribute to the extinction learning of conditioned fear associations. This work tackles a critical gap in the existing literature and provides new insights into this question in humans through the use of high-field neuroimaging with robust methodology. The presented results are novel and will broadly interest both the extinction learning and cerebellar research communities. As such, this is a very timely and impactful manuscript. However, there are several points that could be addressed during the review process to strengthen the claims and enhance their value for readers and the broader scientific community.
Points to Address:
(1) Reward Interpretation and Skin Conductance Responses (SCR):
A central premise of the manuscript is that 'unexpected omissions of expected aversive events' are rewarding, which plays a critical role in extinction learning. The authors also suggest that the cerebellum is involved in reward processing. However, it is unclear how this conclusion can be directly drawn from their task, which does not explicitly model 'reward.' Instead, the interpretation relies on SCR, which seems more indicative of association or prediction rather than reward per se. Is SCR a valid metric of reward experienced during the extinction of feared associations? Or could these findings reflect processes tied more closely to predictive learning? Please, discuss.
(2) Reinforcement Agent and SCR Modeling:
The modeling approach with the deep reinforcement agent treats SCR as a personalized expectation of shock for a given trial. However, this interpretation seems misaligned with participants' actual experience - they are aware of the shock but exhibit evolving responses to it over time. Why is this operationalization useful or valid? It would benefit the manuscript to provide a clearer justification for this approach.
(3) Clarity and Visualization of Results:
The results section is challenging to follow, and the visualization and quantification of findings could be significantly improved. Terms like 'trending' appear frequently - what does this mean, and is it worth reporting? Adding clear statistical quantifications alongside additional visualizations (e.g., bar or violin plots of group means within specific subregions within the cerebellum, or grouped mean activity in VTA and DCN) would enhance clarity and allow readers to better assess the distribution and systematicity of effects. Furthermore, the figures are overly complex and difficult to read due to the heavy use of abbreviations. Consider splitting figures by either phase of the experiment or regions, and move some details to the supplemental material for improved readability.
(4) Theoretical Context for Paradigm Phases:
The manuscript benefits from the comprehensive experimental paradigm, which includes multiple phases (acquisition, extinction, recall, reacquisition, re-extinction). This design has great potential for providing a more holistic view of conditioned fear learning and extinction. However, the manuscript lacks clarity on what insights can be drawn from these distinct phases. What theoretical framework underpins the different stages, and how should the results be interpreted in this context? At present, the findings seem like a display of similar patterns across phases without sufficient interpretation. Providing a stronger theoretical rationale and reorganizing the results by experimental phase could significantly improve readability and impact.
(5) Cerebellum-VTA Connectivity Analysis:
The authors argue that the cerebellum modulates VTA activity, yet they perform the PPI analysis in the reverse direction. Why does this make sense? In their DCM analysis, they found a bidirectional relationship (both cerebellum - VTA and VTA-cerebellum), yet the discussion focused on connectivity from the cerebellum to VTA. A more careful interpretation of the connectivity findings would be useful - especially the strong claims in the discussion on the cerebellum providing the reward signal to the VTA should be tempered.
-
Reviewer #2 (Public review):
Summary:
Building upon the group's previous work, this study used a 3-day threat acquisition, extinction, recall, re-extinction, and reacquisition paradigm with 7T imaging to probe the mechanism by which the cerebellum contributes to fear extinction learning. The authors hypothesise this may be via its connection to the VTA, a known modulator of fear extinction due to its role in reward processing. Using complementary analysis methods, the authors demonstrate that activity within the cerebellum, DCN, and VTA is modulated by predictions about the occurrence of the US, which shows regional specificity. They show trend-level evidence that there is increased functional connectivity between the cerebellum and VTA during all phases of the paradigm with unexpected omissions. They also present a DCM which indicates that the cerebellum could positively modulate VTA activity during extinction learning. This study adds to a growing literature supporting the role of the historically overlooked cerebellum in the control of emotions and suggests that an interaction between the cerebellum and VTA should be considered in the existing model of the fear extinction network.
Strengths:
The authors address their research question using a number of complementary methods, including parametric modulation by model-derived expectation parameters, PPI, and DCM, in a logical and easily understood way. I feel the authors provide a balanced interpretation of their findings, presenting numerous interpretations and offering insight with regard to reward vs attention or unsigned prediction errors and the directionality of the interaction they identify. The manuscript is a timely addition to growing literature highlighting the role of the cerebellum in fear conditioning, and emotion generation and regulation more generally.
Weaknesses:
Subjective and skin conductance responses do not completely support the success of the learning paradigm. For example, CS+/CS- differentiation in both domains persisted after extinction training. I do not feel that this negates the findings of this manuscript, though it raises questions about the parametric modulators used, and the interpretation of the neural mechanisms proposed if they do not strongly relate to updated subjective appraisals (the goal of extinction therapy). My interpretation of the manuscript suggests there are some key results based upon contrasts that have as few as three events; I am a little unsure about the power and reliability of these effects, though I await author clarification on this matter. There are a number of unaddressed deviations from the pre-registered protocol that I have asked the authors to elaborate upon.
-
Author response:
Reviewer 1:
(1) Reward Interpretation and Skin Conductance Responses (SCR):
The reviewer raises a valid point, as the model from which we derive prediction errors describes predictive learning—specifically, the occurrence of shock—without incorporating additional reward learning effects. SCRs are used to fit the model’s hyperparameters but do not directly measure reward; rather, they serve as a marker of arousal.
In our paradigm, SCRs are measured during CS presentation and primarily reflect predictive learning, as they are closely linked to contingency awareness. The association between estimated prediction errors during unexpected US omissions and reward remains reliant on existing literature.
In the revised manuscript, we will further elaborate on these points to clarify the distinction between predictive learning and direct reward processing, while contextualizing our findings within the broader literature on reward signaling and fear extinction.
(2) Reinforcement Agent and SCR Modeling:
Notably, we do not use SCR as a personalized expectation measure due to its limited reliability at the individual level; instead, the model's hyperparameters are fitted on the entire SCR dataset, yielding per-trial prediction and prediction error estimates for each CS sequence rather than for individual participants.
(3) Clarity and Visualization of Results:
We recognize that the presentation of our results can be improved and will take steps to enhance figure clarity, also ensuring that trend-level results are clearly distinguished.
(4) Theoretical Context for Paradigm Phases:
Regarding the differences across experimental phases, we recognize the theoretical significance of these distinctions. However, our primary focus is on identifying commonalities in unexpected US omission responses across phases rather than emphasizing phase-specific differences. Nevertheless, we will provide a brief clarification on phase differences to enhance the manuscript’s interpretability.
(5) Cerebellum-VTA Connectivity Analysis:
Furthermore, we acknowledge that our conclusion regarding the modulation of the dopaminergic system by the cerebellum should be framed more cautiously. We will temper our claims to better reflect the bidirectional and potentially indirect nature of cerebellum-VTA interactions. Additionally, we plan to include PPI results using a cerebellar seed showing the VTA, potentially in the supplementary material.
Reviewer 2:
(1) Success of extinction learning based on Self-reports and SCRs?
The reviewer points to a problem, which is inherent to extinction learning: The initial fear association is not erased, but merely inhibited, and is prone to return. Although the recall phase follows the extinction phase, we did not expect a complete inhibition of the conditioned response; instead, spontaneous recovery is expected. In fact, the spontaneous recovery observed in the recall phase provided us with an additional opportunity to investigate unexpected US omissions, which was our primary focus.
(2) Concerns on reliability of event-based contrasts using three events:
Regarding concerns about the reliability of analyses based on three events, we believe that the consistency of our parametric modulation analysis, which incorporates all events, combined with the three-event analysis results, provides further support for the observed patterns. We are currently discussing additional analyses to further verify the reliability of using three events.
(3) Deviations from preregistration:
Finally, we will carefully review all deviations from our preregistration to ensure transparency. Any methodological or analytical changes will be explicitly addressed in the revised manuscript.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This research addresses an important and timely topic in cancer treatment, as the authors present a novel computational tool, 'retriever,' which has the potential to revolutionize personalized cancer treatment strategies by predicting effective drug combinations for triple-negative breast cancer. The strength of the evidence presented is solid, as evidenced by the systematic testing of 152 drug response profiles and 11,476 drug combinations.
-
Reviewer #1 (Public review):
Summary:
Identifying drugs that target specific disease phenotypes remains a persistent challenge. Many current methods are only applicable to well-characterized small molecules, such as those with known structures. In contrast, methods based on transcriptional responses offer broader applicability because they do not require prior information about small molecules. Additionally, they can be rapidly applied to new small molecules. One of the most promising strategies involves the use of "drug response signatures": specific sets of genes whose differential expression can serve as markers for the response to a small molecule. By comparing drug response signatures with expression profiles characteristic of a disease, it is possible to identify drugs that modulate the disease profile, indicating a potential therapeutic connection.
This study aims to prioritize potential drug candidates and to forecast novel drug combinations that may be effective in treating triple-negative breast cancer (TNBC). Large consortia, such as the LINCS-L1000 project, offer transcriptional signatures across various time points after exposing numerous cell lines to hundreds of compounds at different concentrations. While this data is highly valuable, its direct applicability to pathophysiological contexts is constrained by the challenges in extracting consistent drug response profiles from these extensive datasets. The authors use their method to create drug response profiles for three different TNBC cell lines from LINCS.
To create a more precise, cancer-specific disease profile, the authors highlight the use of single-cell RNA sequencing (scRNA-seq) data. They focus on TNBC epithelial cells collected from 26 diseased individuals compared to epithelial cells collected from 10 healthy volunteers. The authors are further leveraging drug response data to develop inhibitor combinations.
Strengths:
The authors of this study contribute to an ongoing effort to develop automated, robust approaches that leverage gene expression similarities across various cell lines and different treatment regimens, aiming to predict drug response signatures more accurately. The authors are trying to address the gap that remains in computational methods for inferring drug responses at the cell subpopulation level.
Weaknesses:
One weakness is that the authors do not compare their method to previous studies. The authors develop a drug response profile by summarizing the time points, concentrations, and cell lines. The computational challenge of creating a single gene list that represents the transcriptional response to a drug across different cell lines and treatment protocols has been previously addressed. The Prototype Ranked List (PRL) procedure, developed by Iorio and co-authors (PNAS, 2010, doi:10.1073/pnas.1000138107), uses a hierarchical majority-voting scheme to rank genes. This method generates a list of genes that are consistently overexpressed or downregulated across individual conditions, which then hold top positions in the PRL. The PRL methodology was used by Aissa and co-authors (Nature Comm 2021, doi:10.1038/s41467-021-21884-z) to analyze drug effects on selective cell populations using scRNA-seq datasets. They combined PRL with Gene Set Enrichment Analysis (GSEA), a method that compares a ranked list of genes like PRL against a specific set of genes of interest. GSEA calculates a Normalized Enrichment Score (NES), which indicates how well the genes of interest are represented among the top genes in the PRL. Compared to the method described in the current manuscript, the PRL method allows for the identification of both upregulated and downregulated transcriptional signatures relevant to the drug's effects. It also gives equal weight to each cell line's contribution to the drug's overall response signature.
The authors performed experimental validation of the top two identified drugs; however, the effect was modest. In addition, the effect on TNBC cell lines was cell-line specific as the identified drugs were effective against BT20, whose transcriptional signatures from LINCS were used for drug identification, but not against the other two cell lines analyzed. An incorrect choice of genes for the signature may result in capturing similarities tied to experimental conditions (e.g., the same cell line) rather than the drug's actual effects. This reflects the challenges faced by drug response signature methods in both selecting the appropriate subset of genes that make up the signature and in managing the multiple expression profiles generated by treating different cell lines with the same drug.
-
Reviewer #2 (Public review):
Summary:
In their study, Osorio and colleagues present 'retriever,' an innovative computational tool designed to extract disease-specific transcriptional drug response profiles from the LINCS-L1000 project. This tool has been effectively applied to TNBC, leveraging single-cell RNA sequencing data to predict drug combinations that may effectively target the disease. The public review highlights the significant integration of extensive pharmacological data with high-resolution transcriptomic information, which enhances the potential for personalized therapeutic applications.
Strengths:
A key finding of the study is the prediction and validation of the drug combination QL-XII-47 and GSK-690693 for the treatment of TNBC. The methodology employed is robust, with a clear pathway from data analysis to experimental confirmation.
Weaknesses:
However, several issues need to be addressed. The predictive accuracy of 'retriever' is contingent upon the quality and comprehensiveness of the LINCS-L1000 and single-cell datasets utilized, which is an important caveat as these datasets may not fully capture the heterogeneity of patient responses to treatment. While the in vitro validation of the drug combinations is promising, further in vivo studies and clinical trials are necessary to establish their efficacy and safety. The applicability of these findings to other cancer types also warrants additional investigation. Expanding the application of 'retriever' to a broader range of cancer types and integrating it with clinical data will be crucial for realizing its potential in personalized medicine. Furthermore, as the study primarily focuses on kinase inhibitors, it remains to be seen how well these findings translate to other drug classes.
-
Author response:
Reviewer 1:
Summary:
Identifying drugs that target specific disease phenotypes remains a persistent challenge. Many current methods are only applicable to well-characterized small molecules, such as those with known structures. In contrast, methods based on transcriptional responses offer broader applicability because they do not require prior information about small molecules. Additionally, they can be rapidly applied to new small molecules. One of the most promising strategies involves the use of “drug response signatures”: specific sets of genes whose differential expression can serve as markers for the response to a small molecule. By comparing drug response signatures with expression profiles characteristic of a disease, it is possible to identify drugs that modulate the disease profile, indicating a potential therapeutic connection.
This study aims to prioritize potential drug candidates and to forecast novel drug combinations that may be effective in treating triple-negative breast cancer (TNBC). Large consortia, such as the LINCS-L1000 project, offer transcriptional signatures across various time points after exposing numerous cell lines to hundreds of compounds at different concentrations. While this data is highly valuable, its direct applicability to pathophysiological contexts is constrained by the challenges in extracting consistent drug response profiles from these extensive datasets. The authors use their method to create drug response profiles for three different TNBC cell lines from LINCS.
To create a more precise, cancer-specific disease profile, the authors highlight the use of single-cell RNA sequencing (scRNA-seq) data. They focus on TNBC epithelial cells collected from 26 diseased individuals compared to epithelial cells collected from 10 healthy volunteers. The authors are further leveraging drug response data to develop inhibitor combinations.
Strengths:
The authors of this study contribute to an ongoing effort to develop automated, robust approaches that leverage gene expression similarities across various cell lines and different treatment regimens, aiming to predict drug response signatures more accurately. The authors are trying to address the gap that remains in computational methods for inferring drug responses at the cell subpopulation level.
Weaknesses:
One weakness is that the authors do not compare their method to previous studies. The authors develop a drug response profile by summarizing the time points, concentrations, and cell lines. The computational challenge of creating a single gene list that represents the transcriptional response to a drug across different cell lines and treatment protocols has been previously addressed. The Prototype Ranked List (PRL) procedure, developed by Iorio and co-authors (PNAS, 2010, doi:10.1073/pnas.1000138107), uses a hierarchical majority-voting scheme to rank genes. This method generates a list of genes that are consistently overexpressed or downregulated across individual conditions, which then hold top positions in the PRL. The PRL methodology was used by Aissa and co-authors (Nature Comm 2021, doi:10.1038/s41467-021-21884-z) to analyze drug effects on selective cell populations using scRNA-seq datasets. They combined PRL with Gene Set Enrichment Analysis (GSEA), a method that compares a ranked list of genes like PRL against a specific set of genes of interest. GSEA calculates a Normalized Enrichment Score (NES), which indicates how well the genes of interest are represented among the top genes in the PRL. Compared to the method described in the current manuscript, the PRL method allows for the identification of both upregulated and downregulated transcriptional signatures relevant to the drug’s effects. It also gives equal weight to each cell line’s contribution to the drug’s overall response signature.
The authors performed experimental validation of the top two identified drugs; however, the effect was modest. In addition, the effect on TNBC cell lines was cell-line specific as the identified drugs were effective against BT20, whose transcriptional signatures from LINCS were used for drug identification, but not against the other two cell lines analyzed. An incorrect choice of genes for the signature may result in capturing similarities tied to experimental conditions (e.g., the same cell line) rather than the drug’s actual effects. This reflects the challenges faced by drug response signature methods in both selecting the appropriate subset of genes that make up the signature and managing the multiple expression profiles generated by treating different cell lines with the same drug.
We appreciate the reviewer’s thoughtful feedback and their suggestion to refer to the Prototype Ranked List (PRL) manuscript. Unfortunately, since the PRL methodology isn’t implemented in an open-source package, direct comparison with our approach is challenging. Nonetheless, we investigated whether using ranks would yield similar results for the most likely active drug pairs identified by retriever. To do this, we calculated and compared the rankings of the average effect sizes provided by retriever. Although the Spearman correlation coefficient was high (ρ = 0.98), we observed that key genes are disadvantaged when using ranks compared to effect sizes. This difference is particularly evident in the gene set enrichment analysis, where using average ranks identified only one pathway as statistically significantly enriched. The code to replicate these analyses is available at https://github.com/dosorio/L1000-TNBC/blob/main/Code/.
Author response image 1.
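The contrast between summarizing by effect size and summarizing by rank can be illustrated with a minimal sketch. The profiles below are invented for illustration, and the helper names (`average_ranks`, `spearman`) are hypothetical; this is neither the retriever implementation nor the PRL procedure, only a toy comparison of the two summaries.

```python
from statistics import mean

def average_ranks(values):
    """Rank values ascending (1 = smallest); tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical effect sizes: rows are conditions (cell line, dose, time point),
# columns are genes. Gene 0 responds far more strongly than the others.
profiles = [
    [2.5, 0.3, -0.1, -1.8],
    [3.1, 0.2, -0.2, -2.2],
    [2.8, 0.4, -0.3, -1.5],
]
n_genes = len(profiles[0])

# Summary 1: average effect size per gene across conditions.
mean_effect = [mean(p[g] for p in profiles) for g in range(n_genes)]

# Summary 2: rank genes within each condition, then average the ranks.
per_condition_ranks = [average_ranks(p) for p in profiles]
mean_rank = [mean(r[g] for r in per_condition_ranks) for g in range(n_genes)]

rho = spearman(mean_effect, mean_rank)
```

In this toy case the two summaries order the genes identically, so the Spearman correlation is maximal, yet the averaged ranks are evenly spaced and no longer show that gene 0 has a much larger effect than gene 1. That loss of magnitude is one way a high rank correlation can coexist with weaker downstream enrichment results.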
Given the similarity in purpose between retriever and the PRL approach, we have added the following statement to the introduction: “Previously, this goal was approached using a majority-voting scheme to rank genes across various cell types, concentrations, and time points. This approach generates a prototype ranked list (PRL) that represents the consistent ranks of genes across several cell lines in response to a specific drug.”
Regarding the experimental validation, we believe there is a misunderstanding about the evidence we provided. We would like to clarify that we used three different TNBC cell lines: CAL120, BT20, and DU4475. It’s important to note that CAL120 and DU4475 were not included in the signature generation process. Despite this, we observed effects that exceeded the expectation of additive effects, particularly in the CAL120 cell line (Figure 5, Panel F).
Reviewer 2:
Summary:
In their study, Osorio and colleagues present ‘retriever,’ an innovative computational tool designed to extract disease-specific transcriptional drug response profiles from the LINCS-L1000 project. This tool has been effectively applied to TNBC, leveraging single-cell RNA sequencing data to predict drug combinations that may effectively target the disease. The public review highlights the significant integration of extensive pharmacological data with high-resolution transcriptomic information, which enhances the potential for personalized therapeutic applications.
Strengths:
A key finding of the study is the prediction and validation of the drug combination QL-XII-47 and GSK-690693 for the treatment of TNBC. The methodology employed is robust, with a clear pathway from data analysis to experimental confirmation.
Weaknesses:
However, several issues need to be addressed. The predictive accuracy of ‘retriever’ is contingent upon the quality and comprehensiveness of the LINCS-L1000 and single-cell datasets utilized, which is an important caveat as these datasets may not fully capture the heterogeneity of patient responses to treatment. While the in vitro validation of the drug combinations is promising, further in vivo studies and clinical trials are necessary to establish their efficacy and safety. The applicability of these findings to other cancer types also warrants additional investigation. Expanding the application of ‘retriever’ to a broader range of cancer types and integrating it with clinical data will be crucial for realizing its potential in personalized medicine. Furthermore, as the study primarily focuses on kinase inhibitors, it remains to be seen how well these findings translate to other drug classes.
We thank the reviewer for their thoughtful and constructive feedback. We appreciate your insights and agree that several important considerations need to be addressed.
We recognize that the predictive accuracy of retriever depends on the LINCS-L1000 and single-cell datasets. These resources may not fully represent the complete range of transcriptional responses to disease and treatment across different patients. As you mentioned, this is an important limitation. However, we believe that by extrapolating the evaluation of the most likely active compound to each individual patient, we can help address this issue. This approach will provide valuable insights into which patients in the study are most likely to respond positively to treatment.
On the in vitro validation of drug combinations, we agree that while promising, these results are not sufficient on their own to establish clinical efficacy. Additional in vivo studies will be essential in assessing the therapeutic potential and safety of these combinations, and clinical trials will be an important next step to validate the translational impact of our findings.
Lastly, we appreciate the reviewer’s comment about the focus of our study on kinase inhibitors. This result was unexpected, as we tested the full set of compounds from the LINCS-L1000 project. We agree that exploring other top candidates, including different drug classes, will be important for assessing how broadly the retriever approach can be applied.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This important study provides new evidence on the role of norepinephrine (NE) release in the hippocampus in response to environmental transitions (event boundaries), providing a potential link between NE signaling and the segmentation of episodic memories. The work is solid, employing innovative techniques such as fiber photometry with the GRAB-NE sensor for NE measurement, the analysis of public electrophysiology hippocampal datasets, and well-controlled experiments. While further analysis could strengthen some claims, this work offers insights into memory, neuromodulation, and hippocampal function.
-
Reviewer #1 (Public review):
Summary:
This study investigates the role of norepinephrine (NE) signaling in the hippocampus during event transitions, positing that NE release serves as a mechanism for marking event boundaries to facilitate episodic memory segmentation. The authors use a genetically encoded fluorescent indicator (GRABNE) to measure NE release with high temporal precision, correlating these signals with changes in hippocampal firing dynamics. By integrating photometry data, behavioral analyses, and analysis of neuronal activity from publicly available datasets, the work addresses fundamental questions about the relationship between neuromodulatory signals and memory encoding.
Strengths:
The authors present a compelling framework linking NE signaling to event boundaries, offering insight into how episodic memory segmentation may occur in the brain. The writing is clear and the data are well-described. It is easy to follow. The pharmacological validation of the GRABNE sensor enhances confidence in their NE measurements, an important methodological strength given the potential limitations of fluorescence-based neuromodulatory indicators. Moreover, the authors carefully disentangle NE signals from confounding behavioral variables, providing evidence that NE release is time-locked to event boundaries rather than movement or arousal-related behaviors. This level of analytical rigor strengthens their central claims. Additionally, the observation of NE signal dynamics that decay over hundreds of seconds is interesting, as it aligns with timescales relevant to hippocampal plasticity reported in prior literature.
Weaknesses:
While the authors establish correlations between NE signaling and hippocampal activity changes, causation is not demonstrated. Future studies using perturbative approaches (e.g., optogenetic or chemogenetic manipulation of NE release) would be necessary to establish a direct causal link. Furthermore, the persistence of NE signals over long timescales (hundreds of seconds) raises questions about its role in encoding rapid event boundaries, as it is unclear how this prolonged signaling might affect memory encoding for closely spaced events. The lack of a discussion about how NE dynamics would operate in such scenarios weakens the proposed framework. Finally, while the authors acknowledge the limitations of the GRABNE sensor, a more detailed exploration of how sensor sensitivity might influence their results would enhance the interpretation of their findings.
-
Reviewer #2 (Public review):
Summary:
The authors use a genetically encoded fluorescent sensor, GRABNE, to measure NE dynamics in the dorsal hippocampus of mice in response to multiple behavioral manipulations. A non-linear model and regression were used to quantitatively assess the contribution of multiple behavioral covariates to changes in NE signaling, with the result that NE signal dynamics were best predicted by time from event transitions, with the signal exponentially decaying over a period of seconds to minutes after transitions. Event transitions were implemented as a transfer from a home cage to a novel arena, a transfer to a familiar linear track, and the introduction of novel objects. Additional experiments showed that spatial context transitions dominate NE signaling over novel object presentations, and experience accelerates the decay of the NE signal after spatial context transitions. Correspondingly, the hippocampal CA1 spatial code takes minutes to stabilize after context transition in both novel and familiar spaces.
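The exponential decay described above can be sketched as a simple fit. This is a minimal illustration under stated assumptions: a single exponential, a baseline-subtracted and strictly positive signal, and synthetic data. The function name `fit_exponential_decay` and all numbers are hypothetical, and the authors' non-linear model may differ.

```python
import math
from statistics import mean

def fit_exponential_decay(t, signal):
    """Fit signal ~ A * exp(-t / tau) by least squares on log(signal).

    Assumes every signal value is > 0 (e.g. after baseline subtraction).
    Returns (A, tau), with tau in the same time units as t.
    """
    log_s = [math.log(s) for s in signal]
    mt, ml = mean(t), mean(log_s)
    slope = (sum((a - mt) * (b - ml) for a, b in zip(t, log_s))
             / sum((a - mt) ** 2 for a in t))
    intercept = ml - slope * mt
    return math.exp(intercept), -1.0 / slope

# Synthetic, noise-free trace: time in seconds since a hypothetical context
# transition, amplitude 2.0 (arbitrary units), decay constant 120 s.
t = [float(s) for s in range(0, 301, 10)]
trace = [2.0 * math.exp(-ti / 120.0) for ti in t]

amplitude, tau = fit_exponential_decay(t, trace)
```

With real photometry data, noise near baseline makes the log transform unstable, so a direct non-linear fit with an explicit offset term would usually be preferable; the log-linear version is shown only because it is short and self-contained.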
Strengths:
A strength of the study is the use of the NE sensor with sub-second resolution, non-linear modeling, and regression to identify the prominent variable of interest as time from event transition, and multiple behavioral controls. The use of multiple behavioral designs to investigate the effect of familiarity, experience, and interaction of spatial context transitions and novel object introduction is a strength. Relating the dynamics of NE signal decay to the rate of CA1 spatial code changes is also a strength.
Weaknesses:
A minor weakness is that the concept of an event boundary needs to be more broadly discussed. The manuscript uses event transitions such as spatial context changes and novel object introduction to implement an event boundary. However, especially in episodic memory studies in humans, event structure and boundaries have also been shown to occur through the automatic segmentation of experiences into discrete events (Baldassano et al., Neuron, 2017; Radvansky and Zacks, Curr. Opin. Behav. Sci., 2017). The rodent experiments in the current manuscript explicitly introduce event boundaries through changes in context or objects, which can potentially be conflated with novelty. A discussion of these differences, and whether NE can also have a role in event boundary transitions based on automatic segmentation of experiences, will add to the impact of the manuscript.
-
Reviewer #3 (Public review):
Summary
The manuscript investigates the role of norepinephrine (NE) release in the rodent hippocampus during event boundaries, such as transitions between spatial contexts and the introduction of novel objects. It also explores how NE release is altered by experience and how novelty drives the amplitude and decay times of extracellular NE. By utilizing the GRABNE sensor for sub-second resolution measurement of NE, the authors demonstrate that NE release is driven primarily by the time elapsed since an event boundary and is independent of behaviors like movement or reward. The study further explores how hippocampal neural representations are altered over time, showing that these representations stabilize shortly after event transitions, potentially linking NE release to episodic memory encoding.
Strengths
Overall, the work provides novel insights into the interplay between NE signaling and hippocampal activity and presents an intriguing hypothesis on how NE release may help push hippocampal activity into unique attractor states to encode novel experiences. The experiments are well-controlled, and the analysis is well-presented, with a detailed and engaging discussion that points towards several new and exciting research directions. The use of several behavioral paradigms to demonstrate the strongest predictor of NE release is a strength, as well as the regression analysis to disambiguate the contribution of other correlated variables. The suggestion that NE does not select ensembles for subsequent replay is also an interesting result.
Weaknesses
The authors have not convincingly established a link between hippocampal neural activity and NE release, showing qualitative rather than quantitative correlations. Therefore, at this stage, the role of NE on hippocampal function remains speculative.
Another general concern is that the smoothing/kinetics of the sensor impacts the regression analyses. Most of the other variables, such as speed, acceleration, and even reward time points, are highly dynamic, and it is possible that the limitations of the sensor decorrelate the signal from (potentially) causal variables, therefore resulting in the time since the event start having the most explanatory power for most of the analyses.
More broadly, the figure legends should be expanded to better describe error bounds, mean vs median, sample sizes, and averaging choices for plots.
There are also some concerns regarding the nearest neighbor analysis and the reported differences in the rate of reactivations after familiar and novel environments, as outlined below.
(1) Lines 657-658. How far away in time can the top three nearest neighbor time points be? Must they lie in different trials, or can they also be within the same trial? Is there a systematic difference in the average time lags for the nearest neighbors over the course of the session?
The authors should only allow nearest neighbors to be in a different lap because systematic changes in behavior (running fast initially) might force earlier time bins in a certain location to match with a different trial, while the later time bins can be from within the same trial if the mice are moving slower and stay in the same spatial bin location longer. The authors should also provide information on how the averaging is performed because there are several axes of variability - spatial bin locations, sessions, different environments, and animals.
(2) Figure 8: These results are very interesting. However, I am confused by the differences between Figure 8B and D because the significant reactivations in A and C are very similar. The 1-minute and 10-minute windows seem somewhat arbitrary and prone to noise and variability. Perhaps the authors should fit a slope for the curves in A and C and compare whether the slope/intercept are significantly different between the novel and familiar environments.
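The slope comparison suggested in point (2) could be sketched as follows. This is a minimal illustration, assuming one reactivation-rate curve per session sampled at common time points, and using a label-permutation test as one possible way to assess the difference. All function names and data are hypothetical and do not come from the manuscript.

```python
import random
from statistics import mean

def fit_line(t, y):
    """Ordinary least-squares line; returns (slope, intercept)."""
    mt, my = mean(t), mean(y)
    slope = (sum((a - mt) * (b - my) for a, b in zip(t, y))
             / sum((a - mt) ** 2 for a in t))
    return slope, my - slope * mt

def slope_difference(group_a, group_b, t):
    """Difference between the mean per-session slopes of two groups."""
    slopes_a = [fit_line(t, y)[0] for y in group_a]
    slopes_b = [fit_line(t, y)[0] for y in group_b]
    return mean(slopes_a) - mean(slopes_b)

def permutation_p_value(group_a, group_b, t, n_perm=2000, seed=0):
    """Two-sided p-value obtained by shuffling session labels."""
    rng = random.Random(seed)
    observed = abs(slope_difference(group_a, group_b, t))
    pooled = group_a + group_b
    hits = 0
    for _ in range(n_perm):
        shuffled = pooled[:]
        rng.shuffle(shuffled)
        perm_a = shuffled[:len(group_a)]
        perm_b = shuffled[len(group_a):]
        if abs(slope_difference(perm_a, perm_b, t)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Hypothetical reactivation rates over 10 one-minute bins, three sessions each.
t = list(range(1, 11))
novel = [[5.0 - 0.40 * ti + off for ti in t] for off in (0.0, 0.2, -0.1)]
familiar = [[5.0 - 0.10 * ti + off for ti in t] for off in (0.1, -0.2, 0.0)]

diff = slope_difference(novel, familiar, t)
p = permutation_p_value(novel, familiar, t)
```

With only three sessions per group, the smallest attainable permutation p-value is limited by the number of distinct label assignments, so the sketch is meant to show the procedure rather than to suggest an adequate sample size. The same approach applies to intercepts by returning `fit_line(t, y)[1]` instead of the slope.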
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This study examined the important question of how neurons code temporal information across the hippocampus, dorsal striatum, and orbitofrontal cortex. Using a behavioral task in the rat that requires discrimination between short and long time intervals, the authors conclude that time intervals are represented in all three regions and that synchronized activity of time-coding cells across the brain regions is coordinated by theta rhythms. However, several weaknesses are noted, and in its current form, the study provides incomplete evidence for understanding how temporal information is processed and coordinated throughout these brain networks.
-
Reviewer #1 (Public review):
Summary:
It is known that neuronal activity in several brain regions encodes interval time. However, how interval time is encoded across distributed brain regions remains unclear. By simultaneously recording neuronal activity from the hippocampal CA1, dorsal striatum, and orbitofrontal cortex during a temporal bisection task, the authors showed that elapsed time during the interval period is encoded similarly across these regions and that the neuronal activity of time cells across these regions tends to be synchronized within 100 ms. Using Bayesian decoding, they demonstrated that the interval time decoded from the firing activity of time cells in these regions correlated with the rats' decisions and that the times decoded from the neuronal activity of different brain regions were correlated. The sound experiments and analyses support most of the main conclusions of this paper.
Strengths:
They used a temporal bisection task in which the effects of time and distance can be dissociated. The test trials successfully revealed the relationship between the interval time estimated by Bayesian decoding and the animal's judgment of long versus short interval times. Simultaneous recording of neuronal activity from the hippocampal CA1, dorsal striatum, and orbitofrontal cortex, which is technically challenging, allowed comparison of interval time encoding across brain regions and the degree of synchrony between neurons from different brain regions.
Weaknesses:
Some analyses were not explained in detail, making it difficult to assess whether their results support the authors' conclusions.
-
Reviewer #2 (Public review):
Summary:
In this work, the authors examined how neural activity related to temporal information is distributed and coordinated throughout the hippocampus, dorsal striatum, and orbitofrontal cortex. Rats were forced to run for fixed time intervals on a treadmill and make a decision based on whether the interval was long (10s) or short (5s). Under these conditions time cells were observed across all examined brain regions. The primary finding of the authors is that synchronized activity between time cells across brain regions is entrained into the theta cycle. This observation is used to support the central claim that the sharing of temporal information is mediated by the theta oscillation.
Strengths:
By simultaneously recording several brain regions in an interval discrimination task, the authors provide a valuable dataset for understanding how temporal information is processed and distributed throughout relevant networks.
Weaknesses:
Several methodological concerns should be addressed and a more focused analysis should be performed to strengthen the central claims of this work.
Major Concerns
(1) The restriction to only use time cells to understand temporal information processing. Other mechanisms of encoding time, like population clocks and ramping, have been characterized in the striatum and frontal cortex, and these dynamics might contain more temporal information than the subset of cells that meet the statistical criteria for being a time cell. Furthermore, time cells in the OFC, and DS in particular, appear to be heavily biased towards the beginning of treadmill running. This raises the question of whether temporal information can be encoded by neurons other than time cells in these two regions.
(2) The results of the Bayesian decoding analysis should be expanded on. In particular, the performance of each decoder above the chance level is not quantified. Comparing the performance of decoders trained on all cells to the performance of decoders trained on time cells alone would partially address the question of whether or not time cells are the only cells that can encode temporal information in the DS and OFC.
(3) The decoding results for the test trials appear different from the results in the authors' previous publication (Shimbo et al., 2021). There, differences in decoded time between the selected-long and selected-short trials emerged after 5s, the duration of the short trials. This was to be expected given the following two reasons. First, from the task design, it is unclear that the animal can distinguish trial types (long, short, or test) until after the first 5 seconds of treadmill running, making it logical for differences in decoded time to emerge only after this point. Second, time cell activity was identical in the first 5s of the long and short trials as shown in Figure 2A. Here, however, the differences in decoded time during the selected-long and selected-short test trials emerge within the first 2s of treadmill running. Could the authors explain this discrepancy?
Furthermore, in Figure 6B, at 3 seconds of running time, the decoded time for selected-long and selected-short trials shows a difference of nearly 2 seconds, with no further increase as running time progresses. In contrast, at 2 seconds of running time, there is no significant difference in decoded time for DS and OFC, while CA1 shows a slight increase in the decoded time for selected-long trials. This pattern suggests a sudden jump in the encoded time for selected-long trials between 2 and 3 seconds. However, without explicitly showing the raw data, it is difficult to interpret this result and other results from the decoding analysis.
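The request in point (2) to quantify decoder performance against chance can be sketched with a toy Poisson Bayesian decoder and a label-shuffle baseline. This is an illustration under stated assumptions: independent Poisson firing, a flat prior over time bins, and invented tuning curves. It is not the authors' decoder, and every name and value is hypothetical.

```python
import math
import random

def decode_time_bin(spike_counts, tuning, dt):
    """Most probable time bin given spike counts in one decoding window.

    tuning[cell][bin] is the mean firing rate (Hz) of a cell in a time bin.
    Assumes independent Poisson spiking and a flat prior over bins.
    """
    n_bins = len(tuning[0])
    best_bin, best_ll = 0, -math.inf
    for b in range(n_bins):
        ll = 0.0
        for cell, n in enumerate(spike_counts):
            lam = max(tuning[cell][b] * dt, 1e-9)  # avoid log(0)
            ll += n * math.log(lam) - lam  # log(n!) is constant across bins
        if ll > best_ll:
            best_bin, best_ll = b, ll
    return best_bin

def mean_abs_error(samples, true_bins, tuning, dt):
    """Mean absolute decoding error, in units of time bins."""
    errors = [abs(decode_time_bin(s, tuning, dt) - b)
              for s, b in zip(samples, true_bins)]
    return sum(errors) / len(errors)

def shuffled_errors(samples, true_bins, tuning, dt, n_shuffles=200, seed=0):
    """Chance-level errors obtained by shuffling the true time-bin labels."""
    rng = random.Random(seed)
    out = []
    for _ in range(n_shuffles):
        labels = true_bins[:]
        rng.shuffle(labels)
        out.append(mean_abs_error(samples, labels, tuning, dt))
    return out

# Invented tuning curves: four cells, each peaking in a different time bin.
tuning = [
    [20.0, 2.0, 2.0, 2.0],
    [2.0, 20.0, 2.0, 2.0],
    [2.0, 2.0, 20.0, 2.0],
    [2.0, 2.0, 2.0, 20.0],
]
dt = 1.0  # decoding window in seconds
# Idealized test samples: spike counts equal to the expected counts per bin.
samples = [[20, 2, 2, 2], [2, 20, 2, 2], [2, 2, 20, 2], [2, 2, 2, 20]]
true_bins = [0, 1, 2, 3]

observed = mean_abs_error(samples, true_bins, tuning, dt)
null = shuffled_errors(samples, true_bins, tuning, dt)
chance = sum(null) / len(null)
```

Running the same comparison once with all recorded cells and once with time cells only would give the two error distributions the reviewer asks for. In practice the tuning curves should be estimated from training trials that are held out from the decoded trials.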
Minor Concerns
(1) It is not clear how the Bayes decoder was trained. Does the training data come entirely from the long trials?
(2) For Figure 5D, even if only one of two neurons in a pair has its spike rate modulated by theta, wouldn't the expectation be that synchronous spike events between these two neurons would be modulated by theta as well? This analysis might benefit from shuffling methods to determine if the mean resultant length of synchronous spike events is larger than the chance level.
(3) In Figure 5A, the authors suggest that 'the synchronization of time cells was modulated by theta oscillation.' However, it is unclear whether the population exhibits a preferred theta phase or the phase preference only occurs at the individual cell level. If there is no preference on the population level, how would the authors interpret this result?
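The shuffling control suggested in minor concern (2) could take several forms. The following is a minimal sketch of one of them, a jitter-based null distribution for the mean resultant length, and not the authors' analysis. The function names, the inputs (`event_times`, `theta_phase`, `fs`) and the default jitter window are all assumptions made for illustration. A null that preserves each neuron's own theta modulation would need a different shuffle, for example jittering the spikes of only one neuron in the pair before detecting synchronous events.

```python
import numpy as np

def mean_resultant_length(phases):
    """Length of the mean unit vector of a set of phases (radians), in [0, 1]."""
    phases = np.asarray(phases, dtype=float)
    return float(np.abs(np.mean(np.exp(1j * phases))))

def jitter_shuffle_test(event_times, theta_phase, fs,
                        jitter=0.125, n_shuffles=500, seed=0):
    """Compare the observed MRL of event phases with a jittered null.

    event_times : times of synchronous spike events, in seconds (hypothetical input)
    theta_phase : instantaneous theta phase of each LFP sample, in radians
    fs          : LFP sampling rate in Hz
    jitter      : each event is moved independently by up to +/- jitter seconds
    """
    rng = np.random.default_rng(seed)
    theta_phase = np.asarray(theta_phase, dtype=float)
    event_times = np.asarray(event_times, dtype=float)
    n = theta_phase.size

    def phases_at(times):
        # Nearest LFP sample for each event, kept inside the recording
        idx = np.clip(np.round(times * fs).astype(int), 0, n - 1)
        return theta_phase[idx]

    observed = mean_resultant_length(phases_at(event_times))
    null = np.empty(n_shuffles)
    for k in range(n_shuffles):
        shifted = event_times + rng.uniform(-jitter, jitter, size=event_times.size)
        null[k] = mean_resultant_length(phases_at(shifted))

    # One-sided p-value with the usual +1 correction
    p_value = (1 + np.sum(null >= observed)) / (1 + n_shuffles)
    return observed, null, p_value
```

The jitter window should span at least one full theta cycle so that jittered events lose their phase relationship; the default of 0.125 s assumes theta near 8 Hz.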
-
Reviewer #3 (Public review):
Summary:
This study examines neural activity recorded simultaneously in the hippocampus, dorsal striatum, and orbitofrontal cortex as rats performed an interval timing task. The analyses primarily focus on the activity of "time cells" which are neurons that fire at specific moments during the intervals. In this experiment, the intervals consist of periods when animals are running on a treadmill before selecting the arm associated with the interval duration. The results show that the theta oscillations induced by this running behavior were observed across the three regions and that this strong oscillation modulated the activity of neurons across regions. While these findings are correlative in nature, they provide an important characterization of activity patterns across regions during complex behavior. However, more research is needed to determine whether these activity patterns specifically contribute to temporal coding.
Strengths:
(1) Overall, the paper is very well written. Although I have specific concerns about the review of the relevant literature and the interpretation of the results (see below), I do want to commend the authors for their efforts toward presenting this complex work in an accessible manner.
(2) The study is well designed and the quality of the electrophysiological data collected from multiple brain regions in such a challenging behavioral experiment is impressive. This work is a technical tour de force.
(3) The analyses are very thorough, statistically rigorous, and clearly explained and visualized. The authors provide a thoughtful mixture of example data (at the level of individual cells or animals) and aggregated data (at the group or session level) to properly explain and quantify the activity patterns of interest.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This is an important study providing convincing evidence that increased blood pressure variability impairs myogenic tone and diminishes baroreceptor reflex. The study also provides evidence that blood pressure variability blunts functional hyperemia and contributes to cognitive decline. The authors use appropriate and validated methodology in line with the current state-of-the-art.
-
Reviewer #1 (Public review):
This study examined the effect of blood pressure variability on brain microvascular function and cognitive performance. By implementing a model of blood pressure variability using an intermittent infusion of AngII for 25 days, the authors examined different cardiovascular variables, cerebral blood flow, and cognitive function during midlife (12-15-month-old mice). Key findings from this study demonstrate that blood pressure variability impairs baroreceptor reflex and impairs myogenic tone in brain arterioles, particularly at higher blood pressure. They also provide evidence that blood pressure variability blunts functional hyperemia and impairs cognitive function and activity. Simultaneous monitoring of cardiovascular parameters, in vivo imaging recordings, and the combination of physiological and behavioral studies reflect rigor in addressing the hypothesis. The experiments are well-designed, and the data generated are clear. I list below a number of suggestions to enhance this important work:
(1) Figure 1B: It is surprising that the BP circadian rhythm is not distinguishable in either group. Figure 2, however, shows differences in circadian rhythm at different timepoints during infusion. Could the authors explain the lack of circadian effect in the 24-h traces?
(2) While saline infusion does not result in elevation of BP when compared to Ang II, there is evident, and huge, BP variability in the saline group: at least 40 mmHg within 1 hour. This is a significant physiological effect to take into consideration, and therefore it warrants discussion.
(3) The decrease in DBP in the BPV group is very interesting. It is known that chronic Ang II increases cardiac hypertrophy; are there any changes to heart morphology, mass, and/or function during BPV? Can the decrease in DBP in BPV be attributed to preload dysfunction? This observation should be discussed.
(4) Examining the baroreceptor reflex during the early and late phases of BPV is quite compelling. Figures 3D and 3E clearly delineate the differences between the two phases. For clarity, I would recommend plotting the data as is shown in panels D and E, rather than showing the mathematical ratio. Alternatively, plotting the correlation of ∆HR to ∆SBP and analyzing the slopes might be more digestible to the reader. The impairment in baroreceptor reflex in the BPV group during high BP is clear. Is there any indication whether this response might be due to loss of sympathetic or gain of parasympathetic response based on the model used?
(5) Figure 3B shows a drop in HR when the pump is ON irrespective of treatment (i.e., independent of BP changes). What is the underlying mechanism?
(6) The correlation of ∆diameter vs MAP during low and high BP is compelling, and the shift in the cerebral autoregulation curve is also a good observation. I would strongly recommend that the authors include a schematic showing the working hypothesis that depicts the shift of the curve during BPV.
(7) Functional hyperemia impairment in the BPV group is clear and well-described. Pairing this response with the kinetics of the recovery phase is an interesting observation. I suggest elaborating on why the BPV group shows lower responses and how this links to the rapid decline during recovery.
(8) The experimental design for the cognitive/behavioral assessment is clear and it is a reasonable experiment based on previous results. However, the discussion associated with these results falls short. I recommend that the authors describe the rationale to assess recognition memory, short-term spatial memory, and mouse activity, and explain why these outcomes are relevant in the BPV context. Are there other studies that support these findings? The authors discussed that no changes in alternation might be due to the age of the mice, which could already exhibit cognitive deficits. In this line of thought, what is the primary contributor to behavioral impairment? I think that this sentence weakens the conclusion on BPV impairing cognitive function and might even imply that age per se might be the factor that modulates the various physiological outcomes observed here. I recommend clarifying this section in the discussion.
(9) Why were only male mice used?
(10) In the results for Figure 3: "Ang II evoked significant increases in SBP in both control and BPV groups;...". Also, in the figure legend: "B. Five-minute average HR when the pump is OFF or ON (infusing Ang II) for control and BPV groups...." The authors should clarify this as the methods do not state a control group that receives Ang II.
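The slope analysis proposed in suggestion (4) can be sketched as an ordinary least-squares fit of ∆HR against ∆SBP, computed separately for each group or phase. This is only an illustration of the idea under simple assumptions (a linear relationship, paired measurements), not the authors' method; the function name and argument names are hypothetical.

```python
import numpy as np

def baroreflex_slope(delta_sbp, delta_hr):
    """Least-squares fit of delta-HR (bpm) against delta-SBP (mmHg).

    Returns (slope in bpm per mmHg, intercept in bpm, Pearson r).
    A slope closer to zero would indicate a weaker bradycardic response
    to a given pressure increase.
    """
    x = np.asarray(delta_sbp, dtype=float)
    y = np.asarray(delta_hr, dtype=float)
    if x.size != y.size or x.size < 3:
        raise ValueError("need at least three paired measurements")
    slope, intercept = np.polyfit(x, y, 1)   # degree-1 polynomial: y = slope*x + intercept
    r = np.corrcoef(x, y)[0, 1]
    return float(slope), float(intercept), float(r)
```

Slopes obtained this way for the early and late phases could then be compared directly, which avoids the instability of a ratio when ∆SBP is small.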
-
Reviewer #2 (Public review):
Summary:
Blood pressure variability has been identified as an important risk factor for dementia. However, there are no established animal models to study the molecular mechanisms of increased blood pressure variability. In this manuscript, the authors present a novel mouse model of elevated BPV produced by pulsatile infusions of high-dose angiotensin II (3.1 µg/hour) in middle-aged male mice. Using elegant methodology, including direct blood pressure measurement by telemetry, programmable infusion pumps, in vivo two-photon microscopy, and neurobehavioral tests, the authors show that this BPV model resulted in a blunted bradycardic response and cognitive deficits, enhanced myogenic response in parenchymal arterioles, and a loss of the pressure-evoked increase in functional hyperemia to whisker stimulation.
Strengths:
As the presentation of the first model of increased blood pressure variability, this manuscript establishes a method for assessing molecular mechanisms. The state-of-the-art methodology and robust data analysis provide convincing evidence that increased blood pressure variability impacts brain health.
Weaknesses:
One major drawback is that there is no comparison with another pressor agent (such as phenylephrine); therefore, it is not possible to conclude whether the observed effects are a result of increased blood pressure variability or caused by direct actions of Ang II. Ang II is known to have direct actions on cerebrovascular reactivity, neuronal function, and learning and memory. Given that Ang II is increased in only 15% of human hypertensive patients (and in an even lower percentage of non-hypertensive individuals), the clinical relevance is diminished. Nonetheless, this is an important study establishing the first mouse model of increased BPV.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
The authors show that: 1) following brief peripheral optogenetic stimulation of forepaw proprioceptors in mice, sensory-evoked responses in primary motor cortex (M1) are delayed relative to primary somatosensory cortex (S1); 2) the responses in both cortical areas follow a triphasic pattern of activation-suppression-activation; 3) directly activating cortical parvalbumin-positive (PV) inhibitory interneurons mimicked both the suppression and rebound components of the sensory-evoked response; and 4) partially suppressing activity in S1 reduces the sensory-evoked response in M1. The conclusions are convincing and build on prior work on cortical circuits related to the mouse forelimb from this group (Yamawaki et al., 2021, eLife, doi:10.7554/eLife.66836). More rigorously determining whether the peripheral stimulation approach used evokes movements would strengthen the conclusions. It is also possible that these effects would differ for peripheral mechanoreceptor stimulation. Overall, this in vivo work assessing sensory responses in forepaw-related cortical circuits represents a valuable comparison to previously published work.
-
Reviewer #1 (Public review):
Summary:
Building on previous in vitro synaptic circuit work (Yamawaki et al., eLife 10, 2021), Piña Novo et al. utilize an in vivo optogenetic-electrophysiological approach to characterize sensory-evoked spiking activity in the mouse's forelimb primary somatosensory (S1) and motor (M1) areas. Using a combination of novel "phototactile" somatosensory stimuli delivered to the mouse's hand and simultaneous high-density linear array recordings in both S1 and M1, the authors report in awake mice that evoked cortical responses follow a triphasic peak-suppression-rebound pattern. They also find that M1 responses are delayed and attenuated relative to S1. Further analysis revealed a 20-fold difference in subcortical versus corticocortical propagation speeds. They also report that PV interneurons in S1 are strongly recruited by hand stimulation. Furthermore, they report that selective activation of PV cells can produce a suppression and rebound response similar to that evoked by "phototactile" stimuli. Lastly, the authors demonstrate that silencing S1 through local PV cell activation reduces the M1 response to hand stimulation, suggesting S1 may directly drive M1 responses.
Strengths:
The study was technically well done, with convincing results. The data presented are appropriately analyzed. The authors' findings build on a growing body of both in vitro and in vivo work examining the synaptic circuits underlying the interactions between S1 and M1. The paper is well-written and illustrated. Overall, the study will be useful to those interested in forelimb S1-M1 interactions.
Weaknesses:
Although the results are clear and convincing, one weakness is that many results are consistent with previous studies in other sensorimotor systems, and thus not all that surprising. For example, the findings that sensory stimulation results in delayed and attenuated responses in M1 relative to S1 and that PV inhibitory cells in S1 are strongly recruited by sensory stimulation are not novel (e.g., Bruno et al., J Neurosci 22, 10966-10975, 2002; Swadlow, Philos Trans R Soc Lond B Biol Sci 357, 1717-1727, 2002; Gabernet et al., Neuron 48, 315-327, 2005; Cruikshank et al., Nat Neurosci 10, 462-468, 2007; Ferezou et al., Neuron 56, 907-923, 2007; Sreenivasan et al., Neuron 92, 1368-1382, 2016; Yu et al., Neuron 104, 412-427 e414, 2019). Furthermore, the observation that sensory processing in M1 depends upon activity in S1 is also not novel (e.g., Ferezou et al., Neuron 56, 907-923, 2007; Sreenivasan et al., Neuron 92, 1368-1382, 2016). The authors do a good job highlighting how their results are consistent with these previous studies.
Perhaps a more significant weakness, in my opinion, was the missing analyses given the rich dataset collected. For example, why lump all responsive units and not break them down based on their depth? Given superficial and deep layers respond at different latencies and have different response magnitudes and durations to sensory stimuli (e.g., L2/3 is much more sparse) (e.g., Constantinople et al., Science 340, 1591-1594, 2013; Manita et al., Neuron 86, 1304-1316, 2015; Petersen, Nat Rev Neurosci 20, 533-546, 2019; Yu et al., Neuron 104, 412-427 e414, 2019), their conclusions could be biased toward more active layers (e.g., L4 and L5). These additional analyses could reveal interesting similarities or important differences, increasing the manuscript's impact. Given the authors use high-density linear arrays, they should have this data.
Similarly, why not isolate and compare PV versus non-PV units in M1? They did the photostimulation experiments and presumably have the data. Recent in vitro work suggests PV neurons in the upper layers (L2/3) of M1 are strongly recruited by S1 (e.g., Okoro et al., J Neurosci 42, 8095-8112, 2022; Martinetti et al., Cerebral cortex 32, 1932-1949, 2022). Do the authors' data support these in vitro observations?
It would have also been interesting to suppress M1 while stimulating the hand to determine if any part of the S1 triphasic response depends on M1 feedback. I appreciate the control experiment showing that optical hand stimulation did not evoke forelimb movement. However, this appears to be an N=1. How consistent was this result across animals, and how was this monitored in those animals? Can the authors say anything about digit movement? A light intensity of 5 mW was used to stimulate the hand, but it is unclear how or why the authors chose this intensity. Did S1 and M1 responses (e.g., amplitude and latency) change with lower or higher intensities? Was the triphasic response dependent on the intensity of the "phototactile" stimuli?
-
Reviewer #2 (Public review):
Summary:
Communication between sensory and motor cortices is likely to be important for many aspects of behavior, and in this study, the authors carefully analyse neuronal spiking activity in S1 and M1 evoked by peripheral paw stimulation, finding clear evidence for sensory responses in both cortical regions.
Strengths:
The experiments and data analyses appear to have been carefully carried out and clearly represented.
Weaknesses:
(1) Some studies have found evidence for excitatory projection neurons expressing PV and in particular some excitatory pyramidal cells can be labelled in PV-Cre mice. The authors might want to check if this is the case in their study, and if so, whether that might impact any conclusions.
(2) I think the analysis shown in Figure S1 apparently reporting the absence of movements evoked by the forepaw stimulation could be strengthened. It is unclear what is shown in the various panels. I would imagine that an average of many stimulus repetitions would be needed to indicate whether there is an evoked movement or not. This could also be state-dependent and perhaps more likely to happen early in a recording session. Videography could also be helpful.
(3) Some similar aspects of the evoked responses, including triphasic dynamics, have been reported in whisker S1 and M1, and the authors might want to cite Sreenivasan et al., 2016.
-
Reviewer #3 (Public review):
Summary:
This is a solid study of stimulus-evoked neural activity dynamics in the feedforward pathway from mouse hand/forelimb mechanoreceptor afferents to S1 and M1 cortex. The conclusions are generally well supported, and match expectations from previous studies of hand/forelimb circuits by this same group (Yamawaki et al., 2021), from the well-studied whisker tactile pathway to whisker S1 and M1, and from the corresponding pathway in primates. The study uses the novel approach of optogenetic stimulation of PV afferents in the periphery, which provides an impulse-like volley of peripheral spikes, which is useful for studying feedforward circuit dynamics. These are primarily proprioceptors, so results could differ for specific mechanoreceptor populations, but this is a reasonable tool to probe basic circuit activation. Mice are awake but not engaged in a somatosensory task, which is sufficient for the study goals.
The main results are:
(1) brief peripheral activation drives brief sensory-evoked responses at ~15 ms latency in S1 and ~25 ms latency in M1, which is consistent with classical fast propagation on the subcortical pathway to S1, followed by slow propagation on the polysynaptic, non-myelinated pathway from S1 to M1;
(2) each peripheral impulse evokes a triphasic activation-suppression-rebound response in both S1 and M1;
(3) PV interneurons carry the major component of spike modulation for each of these phases;
(4) activation of PV neurons in each area (M1 or S1) drives suppression and rebound both in the local area and in the other downstream area;
(5) peripheral-evoked neural activity in M1 is at least partially dependent on transmission through S1.
All conclusions are well-supported and reasonably interpreted. There are no major new findings that were not expected from standard models of somatosensory pathways or from prior work in the whisker system.
Strengths:
This is a well-conducted and analyzed study in which the findings are clearly presented. This will provide important baseline knowledge from which studies of more complex sensorimotor processing can build.
Weaknesses:
A few minor issues should be addressed to improve clarity of presentation and interpretation:
(1) It is critical for interpretation that the stimulus does not evoke a motor response, which could induce reafference-based activity that could drive, or mask, some of the triphasic response. Figure S1 shows that no motor response is evoked for one example session, but this would be stronger if results were analyzed over several mice.
(2) The recordings combine single and multi-units, which is fine for measures of response modulation, but not for absolute evoked firing rate, which is only interpretable for single units. For example, evoked firing rate in S1 could be higher than M1, if spike sorting were more difficult in S1, resulting in a higher fraction of multi-units relative to M1. Because of this, if reporting of absolute firing rates is an essential component of the paper, Figs 3D and 4E should be recalculated just for single units.
(3) In Figure 5B, the average light-evoked firing rate of PV neurons seems to come up before time 0, unlike the single-trial rasters above it. Presumably, this reflects binning for firing rate calculation. This should be corrected to avoid confusion.
(4) In Figure 6A bottom, please clarify what legends "W. suppression" and "W. rebound" mean.
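The binning artifact described in point (3) can be reproduced with a toy example. The sketch below is an illustration under assumed conventions, not a reconstruction of the authors' pipeline: `psth` builds non-overlapping bins with an edge exactly at stimulus onset, and the two smoothing helpers contrast a symmetric window, which pulls post-stimulus spikes into pre-stimulus bins, with a one-sided (causal) window, which cannot. All function names and parameters are hypothetical.

```python
import numpy as np

def psth(spike_times, t_start, t_stop, bin_width, n_trials):
    """Firing rate (Hz) in non-overlapping bins with an edge exactly at t = 0.

    spike_times are pooled across trials, in seconds relative to stimulus onset.
    """
    k0 = int(round(t_start / bin_width))
    k1 = int(round(t_stop / bin_width))
    edges = bin_width * np.arange(k0, k1 + 1)   # integer multiples, so 0 is an exact edge
    counts, _ = np.histogram(spike_times, bins=edges)
    return edges, counts / (n_trials * bin_width)

def smooth_centered(rate, width):
    """Symmetric boxcar (odd width): averages past and future bins,
    so a response starting at t = 0 appears to rise before t = 0."""
    kernel = np.ones(width) / width
    return np.convolve(rate, kernel, mode="same")

def smooth_causal(rate, width):
    """One-sided boxcar: averages the current and preceding bins only,
    so no post-stimulus spike can contribute to a pre-stimulus bin."""
    kernel = np.ones(width) / width
    return np.convolve(rate, kernel, mode="full")[: len(rate)]
```

The causal window shifts the apparent response later by up to its own width, so either choice should be stated in the figure legend.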
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This important study examines heterochromatin domain dynamics using a model system that allows reversible transition from an embryonic stem cell to a 2-cell-like state. The authors present a solid resource to the research community that will further the understanding of changes in the chromatin-bound proteome during the 2C-to-ESC transition. However, conclusions related to the functional roles of the interaction between the SWI/SNF complex component SMARCAD1 and the DNA Topoisomerase II Binding protein (TOPBP1) remain incomplete.
-
Reviewer #1 (Public review):
In this study, the authors investigate the molecular mechanisms driving the establishment of constitutive heterochromatin during embryonic development. The experiments have been meticulously conducted and effectively address the proposed hypotheses.
The methodology stands out for its robustness, utilizing:
i) an efficient system for converting ESCs to 2C-like cells via Dux overexpression;
ii) a global approach through iPOTD, which unveils the chromatome at distinct developmental stages; and
iii) STORM technology, enabling high-resolution visualization of DNA decompaction.
These tools collectively provide clear and comprehensive insights that support the study's conclusions.
The work makes a significant contribution to the field, offering valuable insights into chromatin-bound proteins at critical stages of embryonic development. These findings may also inform our understanding of processes beyond heterochromatin maintenance.
The revised manuscript shows improvement, particularly through enhanced discussion and the addition of new references addressing the cooperation of SMARCAD1 and TOPBP1. All my previous concerns have been thoroughly addressed by the authors. However, I believe that, as this reviewer suggested, the inclusion of a model that summarizes the main findings of the study and discusses the potential mechanisms involved, would enhance the clarity and understanding of the message the manuscript aims to convey.
-
Reviewer #2 (Public review):
As noted in the original review, the study by Sebastian-Perez addresses an important research question using a tractable model system to examine the earliest drivers of heterochromatin formation during embryogenesis. Moreover, the proteomic analyses provide a valuable resource to the research community to understand changes in the chromatin-bound proteome during the 2C-to-ESC transition. From there, they carry out more detailed analyses of TOPBP1, which shows substantive changes in chromatin association in 2C-like cells, and a potential interacting protein SMARCAD1, which shows only modest changes in chromatin association. While I appreciate that the authors have revised the manuscript to some extent to address the minor points raised, the major over-arching issue of how TOPBP1 and SMARCAD1 function in the 2C-like state is still a concern.
-
Reviewer #3 (Public review):
The manuscript entitled "SMARCAD1 and TOPBP1 contribute to heterochromatin maintenance at the transition from the 2C-like to the pluripotent state" by Sebastian-Perez et al. adopted the iPOTD method to compare the chromatin-bound proteome in ESCs and 2CLCs induced by Dux overexpression. The authors identified 397 chromatin-bound proteins enriched specifically in non-2CLCs, among which they further investigated TOPBP1 due to its potential role in chromocenter reorganization. SMARCAD1, a known interacting protein of TOPBP1, was also investigated in parallel. The authors report increased size and decreased number of H3K9me3-heterochromatin foci in Dux-induced 2CLCs. Remarkably, depletion of either TOPBP1 or SMARCAD1 resulted in similar phenotypes. However, the absence of these proteins did not affect the entry into or exit from the 2C-like state. The authors further showed that both TOPBP1 and SMARCAD1 are essential for early embryonic development.
This manuscript provides valuable insights into the features of 2CLCs regarding H3K9me3-heterochromatin reorganization. However, the findings are largely descriptive. Mechanistic studies are required in future work, such as: 1) how SMARCAD1 associates with H3K9me3 and contributes to heterochromatin maintenance, 2) how TOPBP1 regulates the expression of SMARCAD1 and facilitates its localization in heterochromatin foci, 3) whether chromocenter remodelling directly influences the transitions between ESCs and 2CLCs.
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public Review):
In the present work the authors explore the molecular driving events involved in the establishment of constitutive heterochromatin during embryo development. The experiments have been carried out in a very accurate manner and clearly fulfill the proposed hypotheses.
Regarding the methodology, the use of: i) an efficient system for conversion of ESCs to 2C-like cells by Dux overexpression; ii) a global approach through iPOTD that reveals the chromatome at each stage of development and iii) the STORM technology that allows visualization of DNA decompaction at high resolution, helps to provide clear and comprehensive answers to the questions raised.
The contribution of the present work to the field is very important as it provides valuable information on chromatin-bound proteins at key stages of embryonic development that may help to understand other relevant processes beyond heterochromatin maintenance.
The study could be improved through a more mechanistic approach that focuses on how SMARCAD1 and TOPBP1 cooperate and how they functionally connect with H3K9me3, HP1b and heterochromatin regulation during embryonic development. For example, addressing why topoisomerase activity is required or whether it connects (or not) to SWI/SNF function and the latter to heterochromatin establishment, are questions that would help to understand more deeply how SMARCAD1 and TOPBP1 operate in embryonic development.
We would like to thank the reviewer for the positive evaluation of our work and the methodology we employed. We greatly appreciated the reviewer’s recognition of our study to “provide valuable information on chromatin-bound proteins at key stages of embryonic development that may help to understand other relevant processes beyond heterochromatin maintenance”. While we acknowledge the value of including mechanistic studies, such an addition would require a substantial amount of experimental work that exceeds our current resources.
Reviewer #1 (Recommendations For The Authors):
In my opinion, the authors could improve the study by deciphering, to a certain extent, the possible mechanism by which SMARCAD1 and TOPBP1 cooperate in their system to establish H3K9me3 and consequently heterochromatin, and whether it is different (or not) from that already reported in yeast (ref 27). In fact, is it only SMARCAD1 that participates in this process or the whole SWI/SNF complex? Could the lack of SMARCAD1 compromise the proper assembly of the SWI/SNF complex? In this regard, a model describing the main findings of the study and a discussion of the possible mechanisms involved, based on the current literature, would be appreciated. This, although speculative, would illustrate the range of possibilities that could be operating in the maintenance of heterochromatin during embryonic development. In conclusion, it would be great if the authors could mechanistically link the dots connecting SMARCAD1, TOPBP1, and H3K9me3/HP1/heterochromatin.
As suggested by the reviewer and to enrich the discussion, we have included some additional sentences and references in the revised discussion section.
As a minor point, in Figure 3A (left panel), it appears that the protein precipitating with H3K9me3 reacts with the TOPBP1 antibody, but its molecular weight does not exactly match that of the TOPBP1 band found in the input. The authors should clarify this point, and it is also recommended that IPs and inputs are run in the same gel. Please replace Figure 3A right panel.
Following the reviewer’s suggestion and to improve the reading flow, we have restructured the order of the figures and removed the original Figure 3A. The revised Figure 3A-C panel illustrates the SMARCAD1 association with H3K9me3 in ESCs and 2C- cells, while capturing the reduced SMARCAD1-H3K9me3 association in 2C<sup>+</sup> cells.
Reviewer #2 (Public Review):
The manuscript by Sebastian-Perez et al. describes determinants of heterochromatin domain formation (chromocenters) at the 2-cell stage of mouse embryonic development. They implement an inducible system for transition from ESC to 2C-like cells (referred to as 2C<sup>+</sup>) together with proteomic approaches to identify temporal changes in associated proteins. The conversion of ESCs to 2C<sup>+</sup> is accompanied by dissolution of chromocenter domains marked by HP1b and H3K9me3, which reform upon transition back to the ESC-like state. The innovation in this study is the incorporation of proteomic analysis to identify chromatin-associated proteins, which revealed SMARCAD1 and TOPBP1 as key regulators of chromocenter formation.
In the model system used, doxycycline induction of DUX leads to activation of EGFP reporter regulated by the MERVL-LTR in 2C<sup>+</sup> cells that can be sorted for further analysis. A doxycycline-inducible luciferase cell line is used as a control and does not activate the MERVL-LTR GFP reporter. The authors do see groups of proteins anticipated for each developmental stage that suggest the overall strategy is effective.
The major strengths of the paper involve the proteomic screen and initial validation. From there, however, the focus on TOPBP1 and SMARCAD1 is not well justified. In addition, how data is presented in the results section does not follow a logical flow. Overall, my suggestion is that these structural issues need to be resolved before engaging in comprehensive review of the submission. This may be best achieved by separating the proteomic/morphological analyses from the characterization of TOPBP1 and SMARCAD1.
We appreciate the reviewer’s positive evaluation of our inducible system to trigger the transition from ESCs to 2C-like cells, and the strength of the chromatin proteomics we conducted. In response to the reviewer’s suggestion, we have reorganized the order of the figures, particularly Figure 1 and Figure 2, and revised the text to improve readability and flow.
Reviewer #2 (Recommendations For The Authors):
There are some very interesting components to the study but, as noted, the narrative requires changes and the rationale for focusing on TOPBP1 and SMARCAD1 is not strong at present. Specific comments are noted below.
(1) Inclusion of authentic 2C cells for comparative chromocenter analysis (or at least a more fulsome discussion of how the system has been benchmarked in previous studies).
We have included more detail in the revised methods section, in the “Cell lines and culture conditions” paragraph. We have added: “The Dux overexpression system was benchmarked according to previously reported features. Dux overexpression resulted in the loss of DAPI-dense chromocenters and the loss of the pluripotency transcription factor OCT4 (fig. S1E) (6, 7), upregulation of specific genes of the 2-cell transcriptional program such as endogenous Dux, MERVL, and major satellites (MajSat) (fig. S1F) (6, 7, 11, 26, 58), and accumulation in the G2/M cell cycle phase (fig. S1G), with a reduced S phase consistent in several clonal lines (fig. S1H) (15).”
(2) In Figure 1A, the text indicates a loss of chromocenters, but it may be better described as decompaction because the DAPI/H3K9me3 staining shows diffuse/expanded structures (this is in fact how it is described in relation to Figure 2).
We have changed the text accordingly, now describing it as “decompaction”.
(3) Table S1 has 6 separate tabs but these are not specified in the text. It would be useful to separate the 397 proteins unique to Luc and 2C- cells since they form much of the basis for the remaining analysis. This approach also assumes it is the absence of a protein in the 2C<sup>+</sup> that accounts for the lack of chromocenters (noting there are 510 proteins unique to the 2C<sup>+</sup> state that are not discussed).
We have referenced the supplementary table as Table S1 in the text for simplicity. It includes: Table S1A - List of Protein Groups identified by mass spectrometry in -EdU, Luc, 2C- and 2C<sup>+</sup> cells; Table S1B - Input data for SAINT analysis; Table S1C - SAINT results of the comparison 2C- vs Luc and 2C<sup>+</sup> vs Luc; Table S1D - SAINT results of the comparison Luc vs 2C- and 2C<sup>+</sup> vs 2C-; Table S1E - SAINT results of the comparison Luc vs 2C<sup>+</sup> and 2C- vs 2C<sup>+</sup>; and Table S1F - Total number of PSM per protein in the different cells and conditions tested.
(4) Since there is no change in H3K9me3 levels, loss of SUV420H2 from 2C<sup>+</sup> chromatin (figure 1G) coupled with potential changes in H4K20me3 could contribute to the morphological differences. SUV420H2 is known to regulate chromocenter clustering in a way that requires H4K20me3, but this is not addressed or cited (PUBMED: 23599346).
As suggested by the reviewer, we have added additional sentences and references in the revised manuscript.
(5) In Figure 1C, there does appear to be overlap between the 2C<sup>+</sup> and 2C- populations (while the Luc population is distinct) even though they are morphologically distinct when imaged in Figure 2A. The 2C- cells are thought to be an intermediate, low Dux expressing population.
Chromatome profiling through genome capture provides a snapshot of the chromatin-bound proteome in the analyzed samples (shown in revised Fig. 2B). As indicated by the reviewer and previously reported in the literature, 2C- cells are an intermediate population before reaching 2C<sup>+</sup> cells. For this study, we have focused on H3K9me3 morphological changes. Even though 2C- and 2C<sup>+</sup> cells are distinct with respect to H3K9me3 morphology (shown in revised Fig. 1B), analysis of the chromatome data from hundreds of chromatin-bound proteins revealed some overlap between these two populations. However, replicates from the same population tend to cluster together, for example, 2C<sup>+</sup> rep1 and 2C<sup>+</sup> rep3, and 2C- rep1 and 2C- rep2. Collectively, these data suggest that a defined subset of coordinated changes in the chromatome likely triggers the transition from 2C- to 2C<sup>+</sup> cells. Further experimental investigation of the chromatome dataset during the 2C-like transition would be interesting; however, we believe it is beyond the scope of this study.
(6) Data with SUV39H1 and 2 is difficult to accommodate; what about other H3K9 methyltransferases or proteins such as TRIM28 (KAP1) and SETDB1 (this comes up in the discussion but is not assessed in the results section).
We agree that investigating the role of TRIM28 (KAP1) and SETDB1 in this experimental setting could be of interest; however, we believe that these experiments go beyond the scope of the presented study.
(7) Rationale for choosing TOPBP1 needs to be improved. How do TOPBP1 levels relate to TOPI/TOP2A/TOP2B levels across the 3 cell populations? By what criteria does topoisomerase inhibitor treatment increase 2C<sup>+</sup> like cells? Moreover, to what extent will inhibiting topoisomerases lead to global heterochromatin and cell cycle changes regardless of cell type.
Following the reviewer’s suggestion, we have included some additional references throughout the text to strengthen our rationale for selecting TOPBP1, given its well-established critical role in DNA replication and repair. Additionally, we have revised the results and discussion sections to include new sentences that propose a potential mechanism by which topoisomerase inhibitors may indirectly recruit TOPBP1 to facilitate DNA repair, ultimately leading to an increase in 2C<sup>+</sup> cells.
(8) Likewise, the decision to look at SMARCAD1 based solely on its interaction with TOPBP1 seems somewhat arbitrary and it did not seem to come up as of interest in the iPOTD analysis. Moreover, they were not able to validate the interaction with their own analyses.
We have revised the text to clarify the connection further.
(9) The flow of results is confusing. The first section concludes with a focus on TOPBP1 and SMARCAD1, then progresses to morphological characterization of heterochromatin regions in the next two sections before returning to TOPBP1 and SMARCAD1. It seems like it would make more sense to describe the model system and morphological characterization at the beginning of the results section and then transition to the proteomic analysis and characterization of TOPBP1 and SMARCAD1 (with the expectation that the rationale be improved).
As suggested by the reviewer, we have reordered the figures, particularly Figure 1 and Figure 2, and rephrased the text to improve the overall reading flow.
(10) There has been considerable work done on characterizing chromatin structure, epigenetic changes, and morphology during early embryonic development. It is therefore difficult to see how validating some of these changes in the inducible model adds much in the way of new knowledge. It may, but this is not articulated in the current text.
As detailed before, we have rephrased the text to improve the overall reading flow, which we hope has improved the understanding of the impact of our results.
(11) It is difficult to disentangle broader effects of both TOPBP1 and SMARCAD1 from those described here; they may induce phenotypes, but these may not be unique to this model system.
We agree with the reviewer, but addressing this point would require additional experiments that go beyond the scope of the presented study.
(12) One of the issues with this assay is global chromatin recovery; it is not focused on heterochromatin compartments. The statement "We identified a total of 2396 proteins, suggesting an efficient pull-down of chromatin-associated factors (fig. S2D and Table S1)" does not demonstrate efficiency. Additional functional annotation would be required to establish this claim, including what fraction are known chromatin-associated proteins (with a focus on the heterochromatin compartment).
We have changed the text accordingly. The resulting statement reads as: “We identified a total of 2396 proteins, suggesting an effective pull-down of putative chromatin-associated factors (fig. S2D and Table S1)”.
Reviewer #3 (Public Review):
The manuscript entitled "SMARCAD1 and TOPBP1 contribute to heterochromatin maintenance at the transition from the 2C-like to the pluripotent state" by Sebastian-Perez et al. adopted the iPOTD method to compare the chromatin-bound proteome in ESCs and 2C-like cells generated by Dux overexpression. The authors identified 397 chromatin-bound proteins enriched only in ESC and 2C- cells, among which they further investigated TOPBP1 due to its potential role in controlling chromocenter reorganization. SMARCAD1, a known interacting protein of TOPBP1, was also investigated in parallel. The authors observed increased size and decreased number of H3K9me3-heterochromatin foci in Dux-induced 2C<sup>+</sup> cells. Interestingly, depletion of TOPBP1 or SMARCAD1 also led to increased size and decreased number of H3K9me3 foci. However, depletion of these proteins did not affect entry into or exit from the 2C-like state. Nevertheless, the authors showed that both TOPBP1 and SMARCAD1 are required for early embryonic development.
Although this manuscript provides new insights into the features of 2C-like cells regarding H3K9me3-heterochromatin reorganization, it remains largely descriptive at this stage. It does not provide new insights into the following important aspects: 1) how SMARCAD1 associates with H3K9me3 and contributes to heterochromatin maintenance, 2) how TOPBP1 regulates the expression of SMARCAD1 and facilitates its localization in heterochromatin foci, 3) whether the remodelling of chromocenters is causally related to the mutual transitions between ESCs and 2C-like cells. Furthermore, some results are over-interpreted. Additional experiments and analyses are needed to increase the strength of mechanistic insights and to support all claims in the manuscript.
We would like to thank the reviewer for their positive and thorough evaluation of our manuscript. We have revised the text and hope that the overall flow is now clearer. Moreover, while we acknowledge the value of including mechanistic studies, such an addition would require a substantial amount of experimental work that exceeds our current resources.
Reviewer #3 (Recommendations For The Authors):
Major points:
(1) Fig.2: the DNA decompaction of the chromatin fibers shown in 2C<sup>+</sup> cells may be more related to a relaxed 3D chromatin conformation (Zhu, NAR 2021; Olbrich, Nat Commun 2021) than chromatin accessibility. The authors should discuss this point.
As suggested by the reviewer, we have included some additional sentences and references in the revised manuscript to address this concern.
(2) Chemical inhibition of topoisomerases resulted in an increase in the percentage of 2C<sup>+</sup> cells. Does depletion of TOPBP1 also result in an increased percentage of 2C<sup>+</sup> cells? Please include this result in Fig. 3E. Additionally, it should be noted that DDR and p53 have been reported to activate Dux (Stashpaz, eLife 2020; Grow, Nat Genet 2021), and thus, may contribute to the increased percentage of 2C<sup>+</sup> cells observed upon topoisomerase inhibition. This point should be discussed in the manuscript.
To address this concern, we have included some additional sentences and references in the revised manuscript.
(3) Fig 3A: the TOPBP1 band in the IP sample is questionable, and therefore the conclusion that TOPBP1 is associated with H3K9me3 is difficult to draw from Fig 3A. Additionally, the authors mentioned that association of TOPBP1 and SMARCAD1 is undetected in ESCs, likely due to the suboptimal efficiency of available antibodies. As these are key conclusions in this study, the authors are encouraged to try other commercially available TOPBP1 antibodies (e.g., Abcam #ab-105109, used by ElInati, PNAS 2017) or knock-in tags to perform the co-IP experiment.
Following the reviewer’s suggestion and to improve the reading flow, we have restructured the order of figures and removed the original Figure 3A. The revised Figure 3A-C panel illustrates the SMARCAD1 association with H3K9me3 in ESCs and 2C- cells, while capturing the reduced SMARCAD1-H3K9me3 association in 2C<sup>+</sup> cells.
(4) Fig. 3C-D, Fig. S3D: the authors claimed reduction of both SMARCAD1 expression and its co-localization with H3K9me3 foci in 2C<sup>+</sup> cells, but did not perform mechanistic studies. It is important to know if TOPBP1 expression also decreases in 2C<sup>+</sup> cells. Additionally, it is unclear if the reduced co-localization of SMARCAD1 with H3K9me3 foci results from its altered nuclear localization or simply from reduced expression level? In either case, please provide some mechanistic insights.
While we acknowledge the value of including mechanistic studies, such an addition would require a substantial amount of experimental work that exceeds our current resources.
(5) Fig. 3K, Fig. S4D-E: does SMARCAD1 expression decrease upon TOPBP1 depletion? Statistical analysis of SMARCAD1 intensity in Fig. S4E is needed, and a Western blot analysis is strongly suggested. Additionally, it is unclear if the reduced co-localization of SMARCAD1 with H3K9me3 foci results from its altered nuclear localization or simply from reduced expression level? In Fig. 3K, TOPBP1-depleted cells appear to show decreased size and increased number of H3K9me3 foci, which is inconsistent with Fig. S4B-C. The authors should clarify this discrepancy. Furthermore, statistics should be performed to determine whether Smarcad1/Topbp1 knockdown could further increase the size and decrease the number of H3K9me3 foci in 2C<sup>+</sup> cells. This would provide additional evidence for the involvement of these proteins in heterochromatin maintenance.
We did not observe Smarcad1 downregulation after Topbp1 knockdown (shown in fig. S4A). In Figs. S4B and S4C, we observed that the number of H3K9me3 foci decreased, and their area became larger after knocking down either Smarcad1 or Topbp1, compared to scramble controls. These results align with the reviewer’s comment. Additionally, it should be noted that these findings were derived from the quantification of tens of cells and hundreds of foci, as indicated in the figure legend. This resulted in statistical significance after applying the test indicated in the figure legend.
(6) We suggest moving Fig. 3J to Fig. 4. Additionally, performing immunostaining of SMARCAD1, TOPBP1, and H3K9me3 during pre-implantation development would provide valuable information on their protein-level dynamics, interactions, and functions in early embryos. This would further strengthen the conclusions drawn in the manuscript.
We agree that performing these additional experiments would provide additional valuable information; however, this would require a substantial amount of experimental work that exceeds our current resources.
(7) Fig. 4 and Fig. S5: the authors observed reduced H3K9me3 signal in the Smarcad1 MO embryos at the 8-cell stage, but claim that they failed to examine Topbp1 MO embryos at the 8-cell stage due to their developmental arrest at the 4-cell stage. However, based on Fig. 4A, not all Topbp1 MO embryos were arrested at the 4-cell stage, and it is still possible to examine the H3K9me3 signal in 8-cell Topbp1 MO embryos, which is critical for demonstrating its function in early embryos. Also, how should the increased HP1b signal in Topbp1 MO embryos be interpreted?
For Topbp1 silencing, we observed an even more severe phenotype compared to Smarcad1 MO. All the Topbp1 MO-injected embryos (100%) arrested at the 4-cell stage and did not develop further (shown in Fig. 4A and 4B). Therefore, the severity of the Topbp1 morpholino phenotype posed a technical challenge in evaluating the H3K9me3 signal in 8-cell Topbp1 MO embryos, as none of the injected embryos developed beyond the 4-cell stage.
We believe the increased HP1b signal in Topbp1 MO embryos could indicate potential alterations in chromatin organization and heterochromatin stability. Specifically, we observed remodeling of heterochromatin in both 2-cell and 4-cell Topbp1 MO arrested embryos compared to controls, as evidenced by the spreading and increased HP1b signal (shown in fig. S5F-S5I). Further investigations could enhance our understanding of the underlying defects in Topbp1 knockdown embryos, extending beyond heterochromatin-related errors.
Minor points:
(1) Page 4, the third row from the bottom: please revise the sentence.
We have reviewed the text and it now reads correctly in the revised manuscript.
(2) Fig. 1C: The authors claimed "Luc replicates clustered separately from 2C<sup>+</sup> and 2C- conditions", however, Luc rep3 is apparently clustered with 2C conditions.
(3) The GFP signal in Fig. S1E is confusing.
(4) Please include ESC in Fig. 2D-E. Also label the colors in Fig. 2E.
As indicated in the figure legend of the revised Fig. 1F: “Cells with a GFP intensity score > 0.2 are colored in green. Black dots indicate 2C- cells and green dots indicate 2C<sup>+</sup> cells.”
(5) Fig. 2G: Transposition of the heatmap (show genes in rows) is suggested to improve readability.
(6) Page 7, the third row from the bottom: incorrect citation of Fig. 1K.
Thank you for spotting this incorrect citation. We have corrected it in the revised manuscript.
(7) Page 8, row 15, Fig. S3D should be cited to support the decreased expression of SMARCAD1 in 2C<sup>+</sup> cells.
We have cited the corresponding supplementary figure S3D in the mentioned sentence.
(8) Fig. 2H: what is the difference between "2C-" and "ESC-like"?
We use “2C-” to refer to cells that do not express the GFP reporter during the transition from ESCs to 2C<sup>+</sup> cells. We use “ESC-like” to refer to cells that do not express the GFP reporter during exit, that is, cells that have gone from a sorted and purified 2C<sup>+</sup> state to a GFP-negative state.
(9) Fig. S4A-C: compared with shTopbp1#2, shTopbp1#1 appears to be slightly more effective in knockdown, but produces less dramatic changes in the size/number of H3K9me3 foci.
(10) Fig. 4: please show the effectiveness of Topbp1 MO by immunostaining of TOPBP1.
(11) Fig. 4C: please label the developmental stage as in Fig. 4E and 4G.
We have added an “8-cell” label in Figure 4C, as suggested by the reviewer.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This important study shows that Type 3 secretion translocons in Edwardsiella tarda and other bacteria activate the NAIP-NLRC4 inflammasome. The data from cellular and biochemical experiments showing that EseB is required for activation of the NLRC4 inflammasome are convincing. This paper is broadly relevant to those investigating host-pathogen interactions in diverse organisms.
-
Reviewer #1 (Public review):
Summary:
In this study, Zhao and colleagues investigate inflammasome activation by E. tarda infections. They show that E. tarda induces the activation of the NLRC4 inflammasome as well as the non-canonical pathway in human THP1 macrophages. Further dissecting NLRC4 activation, they find that T3SS translocon components eseB, eseC and eseD are necessary for NLRC4 activation, and that delivery of purified eseB is sufficient to trigger NAIP-dependent NLRC4 activation. Sequence analysis reveals that eseB shares homology within the C-terminus with T3SS needle and rod proteins, leading the authors to test if this region is necessary for inflammasome activation. They show that the eseB CT is required and that it mediates interaction with NAIP. Finally, they show that homologs of eseB in other bacteria also share the same sequence and that they can activate NLRC4 in a HEK293T cell overexpression system.
Strengths:
This is a very nice study that convincingly shows that eseB and its homologs can be recognized by the human NAIP/NLRC4 inflammasome. The experiments are well-designed, controlled and described, and the paper is convincing as a whole.
Weaknesses:
The authors need to discuss their study in the context of previous papers that have shown an important role for E. tarda flagellin in inflammasome activation and test whether flagellin and/or E. tarda T3SSs needle or rod can activate NLRC4.
The authors show that eseB and its homologs can activate NLRC4, but there are also other translocon proteins, such as YopB or PopB, that are very different and share little homology with eseB. It would be nice to include a section comparing the different type 3 secretion systems. Are there two different families of T3SSs, those that feature translocon components that are recognized by NAIP-NLRC4 and those that cannot be recognized?
Comments on revisions:
The authors have addressed my concern with additional experiments, which strengthen the authors' conclusions.
-
Reviewer #2 (Public review):
Summary:
This work by Zhao et al. demonstrates the role of the Edwardsiella tarda type 3 secretion system translocon in activating human macrophage inflammation and pyroptosis. The authors show the requirement of both the bacterial translocon proteins and particular host inflammasome components for E. tarda-induced pyroptosis. In addition, the authors show that the C-terminal region of the translocon protein, EseB, is both necessary and sufficient to induce pyroptosis when present in the cytoplasm. The most terminal region of EseB was determined to be highly conserved among other T3SS-encoding pathogenic bacteria and a subset of these exhibited functionally similar effects on inflammasome activation. Overall, the data support the conclusions and interpretations and provide valuable insights into interactions between bacterial T3SS components and the host immune system, thereby expanding our understanding of E. tarda pathogenesis.
Strengths:
The authors use established and reliable molecular biology and bacterial genetics strategies to characterize the roles of the bacterial T3SS translocon and host inflammasome pathways in E. tarda-induced pyroptosis in human macrophages. These observations are naturally expanded upon by demonstrating the specific regions of EseB that are required for inflammasome activation and the conservation of this sequence and function among other pathogenic bacteria.
-
Author response:
The following is the authors’ response to the original reviews.
Reviewer #1 (Public Review):
Summary:
In this study, Zhao and colleagues investigate inflammasome activation by E. tarda infections. They show that E. tarda induces the activation of the NLRC4 inflammasome as well as the non-canonical pathway in human THP1 macrophages. Further dissecting NLRC4 activation, they find that T3SS translocon components eseB, eseC and eseD are necessary for NLRC4 activation and that delivery of purified eseB is sufficient to trigger NAIP-dependent NLRC4 activation. Sequence analysis reveals that eseB shares homology within the C-terminus with T3SS needle and rod proteins, leading the authors to test if this region is necessary for inflammasome activation. They show that the eseB CT is required and that it mediates interaction with NAIP. Finally, they show that homologs of eseB in other bacteria also share the same sequence and that they can activate NLRC4 in a HEK293T cell overexpression system.
Strengths:
This is a very nice study that convincingly shows that eseB and its homologs can be recognized by the human NAIP/NLRC4 inflammasome. The experiments are well designed, controlled and described, and the paper is convincing as a whole.
Weaknesses:
The authors need to discuss their study in the context of previous papers that have shown an important role for E. tarda flagellin in inflammasome activation and test whether flagellin and/or E. tarda T3SSs needle or rod can activate NLRC4.
The authors show that eseB and its homologs can activate NLRC4, but there are also other translocon proteins, such as YopB or PopB, that are very different and share little homology with eseB. It would be nice to include a section comparing the different type 3 secretion systems. Are there two different families of T3SSs, those that feature translocon components that are recognized by NAIP-NLRC4 and those that cannot be recognized?
(1) The authors need to discuss their study in the context of previous papers that have shown an important role for E. tarda flagellin in inflammasome activation and test whether flagellin and/or E. tarda T3SSs needle or rod can activate NLRC4.
According to the reviewer’s suggestion, we added the relevant discussion (lines 326-334) and carried out additional experiments to examine whether E. tarda flagellin, needle, and rod could activate NLRC4. The relevant results are shown in Figure S3, Figure S5, and lines 226-230 and 269-274.
(2) The authors show that eseB and its homologs can activate NLRC4, but there are also other translocon proteins, such as YopB or PopB, that are very different and share little homology with eseB. It would be nice to include a section comparing the different type 3 secretion systems. Are there two different families of T3SSs, those that feature translocon components that are recognized by NAIP-NLRC4 and those that cannot be recognized?
According to the reviewer’s suggestion, additional experiments were performed to examine the NLRC4-activating potentials of 14 translocator proteins that share low sequence identities with EseB. The relevant results and discussion are shown in Figure S8 and lines 289-301, 364-372, and 377-379.
Reviewer #2 (Public Review):
Summary:
This work by Zhao et al. demonstrates the role of the Edwardsiella tarda type 3 secretion system translocon in activating human macrophage inflammation and pyroptosis. The authors show the requirement of both the bacterial translocon proteins and particular host inflammasome components for E. tarda-induced pyroptosis. In addition, the authors show that the C-terminal region of the translocon protein, EseB, is both necessary and sufficient to induce pyroptosis when present in the cytoplasm. The most terminal region of EseB was determined to be highly conserved among other T3SS-encoding pathogenic bacteria and a subset of these exhibited functionally similar effects on inflammasome activation. Overall, the data support the conclusions and interpretations and provide interesting insights into interactions between bacterial T3SS components and the host immune system.
Strengths:
The authors use established and reliable molecular biology and bacterial genetics strategies to characterize the roles of the bacterial T3SS translocon and host inflammasome pathways in E. tarda-induced pyroptosis in human macrophages. These observations are naturally expanded upon by demonstrating the specific regions of EseB that are required for inflammasome activation and the conservation of this sequence among other pathogenic bacteria.
Weaknesses:
The functional assessment of EseB homologues is limited to inflammasome activation at the protein level but does not include the effects on cell viability as shown for E. tarda EseB. Confirmation that EseB homologues have similar effects on cell death would strengthen this portion of the manuscript.
According to the reviewer’s suggestion, the effects of representative EseB homologs on cell death were examined in the revised manuscript (Figure 5D, Figure S7 and line 289).
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
I only have a few suggestions on how to improve the study:
Activation of caspase-4 requires entry into the host cytosol. Can this be observed with E. tarda and is it T3SS dependent? The fact that deleting the translocon components abrogates all GSDMD activation (see Fig. 2D) suggests that Casp4 activation also requires an active T3SS. It would be useful for the reader to include some more information on the cellular biology of E. tarda.
In our study, we found that E. tarda could enter THP-1 cells (Figure S1), and host cell entry was not affected by deletion of eseB-D (Δ_eseB-D_) in the T3SS system (Figure 2B, C). Additional experiments showed that Δ_eseB-D_ abolished the ability of E. tarda to activate Casp4 (Figure S2), implying that Casp4 activation required an active T3SS. Relevant changes in the revised manuscript: lines 223-224 and 341-342.
The data presented by the authors suggest that eseB is sensed by NLRC4 when overexpressed. However, they do not prove that eseB is the main factor that drives NLRC4 activation during an infection, since deficiency in eseB also abrogated translocation of other potential activators of NLRC4, e.g. flagellin and T3SS needle and rod subunits. I would thus find it essential to properly test if E. tarda flagellin can activate NLRC4 by comparing a WT and flagellin deficient strain, and/or by transfecting or expressing E.t. flagellin in these cells, as well as testing whether E.t. rod and needle subunits act as NLRC4 activators. This is important as previous studies suggested that flagellin is the main activator of cytotoxicity during E. tarda infection.
Previous studies have shown that flagellin is required for E. tarda-induced macrophage death in fish [1] but not in mice [2]. In the revised manuscript, we performed additional experiments to examine whether E. tarda flagellin, needle, and rod could activate NLRC4. The relevant results are shown in Figure S3, Figure S5, and lines 226-230, 269-274, and 326-334.
References
(1) Xie HX, Lu JF, Rolhion N, Holden DW, Nie P, Zhou Y, et al. Edwardsiella tarda-induced cytotoxicity depends on its type III secretion system and flagellin. Infect Immun. 2014;82(8):3436-45. doi: 10.1128/IAI.01065-13.
(2) Chen H, Yang D, Han F, Tan J, Zhang L, Xiao J, et al. The bacterial T6SS effector EvpP prevents NLRP3 inflammasome activation by inhibiting the Ca<sup>2+</sup>-dependent MAPK-JNK pathway. Cell Host Microbe. 2017;21(1):47-58. doi: 10.1016/j.chom.2016.12.004.
Figure 5/S4, please list the names of the eseB homologs. It is cumbersome to have to access GenBank with the accession number to be able to understand what proteins the authors define as homologs of eseB.
The names were added to the revised Table S2, Figure 5 and Figure S6 (the original Figure S4).
The authors mention that other translocon proteins, such as YopB/D and PopB/D, were suggested to cause inflammasome activation. How do these compare to eseB and its homologs? Do they share the CT motif?
Additional experiments were performed to compare the inflammasome activation abilities of EseB and other translocator proteins including YopD and PopD. The relevant results and discussion are shown in Figure S8 and lines 289-301, 364-372, and 377-379.
It would be nice to show that there are potentially two groups of translocon proteins, one group sharing homology to needle subunits within the CT region and another that is different. A quick look at the sequence of these proteins suggests that they are quite different and much larger than eseB.
In our study, additional experiments with more translocator proteins indicated that possessing EseB T6R-like terminal residues does not necessarily guarantee that a protein activates the NLRC4 inflammasome. Relevant results and discussion are shown in lines 289-301, 364-372, and 377-379.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This paper reports important findings on giant organelle complexes containing endosomes and lysosomes (termed endosomal-lysosomal organelles form assembly structures [ELYSAs]) present in mouse oocytes and 1- to 2-cell embryos. The data showing the localization and dynamics of ELYSAs during oocyte/embryo maturation are convincing. This work will be of interest to general cell biologists and developmental biologists.
-